Updated Results section: hypothesis 1

\begin{figure}[htbp]

\begin{center}


\includegraphics[width=\halfWidth]{Figs/averageSeqLength.pdf}

\vspace{0cm}

\caption{Average sequence length for every game session, not considering the super circles and considering the super circles (e.g. a super circle

405 | - of level 2 in a sequence counts for 2 circles). 'All' and 'All (2)' | |

406 | - represent the two tests with all the features on, 'No skills' represents the test without the skills, 'No market' represents the test without the | |

407 | - market and 'No chal.' represents the test without the challenges. | |

406 | + of level 2 in a sequence represents 10 circles in the solution). A, A-2 and A-3 represent the tests with all the features on; NS, NS-2 and NS-3 represent the | |

407 | + tests without the skills; NM, NM-2 and NM-3 represent the tests without the market; NC, NC-2 and NC-3 represent the tests without the challenges. | |

}\label{fig_averageSeqLength}

\end{center}

410 | -\end{figure*} | |

\end{figure}

411 | 411 | |

412 | -As shown in Figure~\ref{fig_averageSeqLength}, the game session in which we had the lowest average of sequence lengths (for all the sequences sold by | |

413 | -all the players) is the one that was played without the market, with an average of $4.40$. Even if we consider the super circles (the special circles that count | |

414 | -for more than one in the scoring function), the average sequence length is still the lowest for that session, with a value of $4.90$. Notice that even in | |

As shown in Figure~\ref{fig_averageSeqLength}, the three game sessions with the lowest average sequence lengths (over all the sequences sold by
all the players) are the ones that were played without the market, with averages of $4.40$ for NM, $4.19$ for NM-2 and $4.63$ for NM-3.
Even when we count the super circles (special circles that each combine 10 circles into one), the average sequence lengths for those
three sessions remain the lowest, with values of $4.90$ for both NM and NM-2, and $5.40$ for NM-3.

416 | + | |

417 | +%We performed statistical tests to make sure that the observed differences in the means are statistically significant. | |

Since the lengths of the sequences sold to the system during a game session do not follow a normal distribution, we used a non-parametric test (Kruskal-Wallis) to
test whether the sequence lengths of the different game sessions come from the same distribution.

The Kruskal-Wallis test revealed a significant effect of the game conditions on the sequence lengths without considering super circles

(${\chi}^2(2) = 1391.7$, $p < \num{2.2e-16}$) and also when considering super circles (${\chi}^2(2) = 1388.4$, $p < \num{2.2e-16}$).
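For reference, the statistic behind these values can be sketched as follows. The notation is introduced here for illustration only: assume $k$ groups, $n_i$ sequences in group $i$, $N = \sum_i n_i$ sequences in total, and $R_i$ the sum of the ranks of group $i$ once all $N$ lengths are ranked together (tied lengths receive their average rank). The Kruskal-Wallis statistic is then
\[
H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1),
\]
which is approximately $\chi^2$-distributed with $k-1$ degrees of freedom when all groups come from the same distribution. Because sequence lengths are integers, ties are frequent, and statistical packages usually divide $H$ by the correction factor
\[
C = 1 - \frac{\sum_{j} (t_j^3 - t_j)}{N^3 - N},
\]
where $t_j$ is the number of sequences sharing the $j$-th distinct length. Whether the reported values include this correction depends on the software used.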

423 | + | |

We then ran a post hoc test (Dunn's test) to perform pairwise comparisons between all the groups. With or without considering super circles, all pairs of groups
were shown to be significantly different ($p < 0.01$), except for the few pairs shown in Table~\ref{tab_Dunn}. Note that the strongest similarities are found among
the three 'All' groups and among the three 'No market' groups. Some of the 'No skills' experiments are found to be similar to the 'All' groups, which could indicate
that the presence of the skills has a very limited effect on the sequence length. The NC experiment is found to be similar to two 'No market' groups, but that
can be explained by the fact that the players in the NC experiment were very weak (see Section~\ref{sect_hyp4}).
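As a sketch of how such pairwise comparisons are computed, with notation introduced here for illustration only, let $N$ be the total number of sequences, $n_i$ the number of sequences in group $i$, $\bar{R}_i$ the mean rank of group $i$ when all $N$ lengths are ranked together, and $t_s$ the number of sequences sharing the $s$-th distinct length. Dunn's test compares groups $i$ and $j$ through
\[
z_{ij} = \frac{\bar{R}_i - \bar{R}_j}{\sqrt{\left(\frac{N(N+1)}{12} - \frac{\sum_{s} (t_s^3 - t_s)}{12(N-1)}\right)\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}},
\]
and refers $z_{ij}$ to the standard normal distribution. A pair is reported as similar when its $p$-value, usually adjusted for the number of comparisons, is not below the chosen threshold ($0.01$ here). The adjustment method is not stated in the text, so none is assumed in this sketch.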

429 | + | |

430 | +\begin{table}[h] | |

431 | + \caption{Similar groups of sequence length distributions, as reported by Dunn's test. An 'n' in the table represents a similar pair when not considering | |

432 | + super circles, and an 's' in the table represents a similar pair when considering super circles.}\label{tab_Dunn} | |

433 | +\begin{center} | |

434 | +\begin{tabular}{c|cccccccc}\hline | |

435 | + & A-2 & A-3 & NS & NS-2 & NS-3 & NM & NM-2 & NM-3\\\hline | |

436 | + A & n/s & n & & n/s & & & & \\ | |

437 | + A-2 & & n & & n/s & & & & \\ | |

438 | + A-3 & & & n/s & & & & & \\ | |

439 | + NC & & & & & & n & & n/s \\ | |

440 | + NC-3 & & & & & n/s & & & \\ | |

441 | + NM & & & & & & & n/s & n \\\hline | |

442 | +\end{tabular} | |

443 | +\end{center} | |

444 | +\end{table} | |

445 | + | |

446 | +%WILL HAVE TO MOVE THE FOLLOWING SENTENCES TO HYPOTHESIS 4 SECTION | |

Notice that even in
the two sessions for which we had the smallest total experience (see Figure~\ref{fig_totalXP}), both averages of sequence lengths were larger than the averages
of the game session without the market. Those observations confirm that the market helps the players create longer sequences.

417 | 450 | |

doing actions that are not specific to a certain subset of colors. Even if the market should be helpful in finding circles with the required

subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and never tried to complete it.

635 | 668 | |

636 | -\subsection{Testing hypothesis 4: relationship between total experience and percentage solved} | |

\subsection{Testing hypothesis 4: relationship between total experience and percentage solved}\label{sect_hyp4}

637 | 670 | %Coming back on the 4 tests, total game xp vs percentage of problem solved |

As mentioned in the Experiments section, the initial plan was to measure the impact of each feature by analyzing how much of the problem can be solved

by the players in each of the game sessions. Interestingly, we observed a larger-than-expected variance in the participants' skills, which made it practically