Authored by Olivier
1 parent a0ca0a31b8

### Results updated: section Testing hypothesis 3 (challenges) completed

Showing 5 changed files with 64 additions and 54 deletions

CHIpaper/Figs/minNbCols.pdf


CHIpaper/Figs/minSeqLength.pdf


@@ -419,7 +419,7 @@
 lengths for all the sequences sold to the system during a game session do not follow a normal distribution, we used a non-parametric test (Kruskal-Wallis) to
 verify if the sequence lengths of the different game sessions seem to come from the same distributions.
 The Kruskal-Wallis test revealed a significant effect of the game conditions on the sequence lengths without considering super circles
-(${\chi}^2(2) = 1391.7$, $p < 2.2E-16$) and also when considering super circles (${\chi}^2(2) = 1388.4$, $p < 2.2E-16$).
+(${\chi}^2(11) = 1391.7$, $p < 2.2E-16$) and also when considering super circles (${\chi}^2(11) = 1388.4$, $p < 2.2E-16$).

 We then made a post hoc test (Dunn's test) to do pairwise comparisons between all the groups. With or without considering super circles, all the game conditions
 were shown to be significantly different ($p < 0.01$), except a few shown in table~\ref{tab_Dunn}. Note that the strongest similarities are found between
@@ -578,86 +578,96 @@
 The challenge system was implemented to analyze the current state of the game and guide the players towards doing actions that are currently needed. As mentionned
 previously, five different challenge types were implemented in the game (see Section Challenge system for the complete list). In order to analyze the effect
 of the challenges on the way the participants were playing, for each challenge type, we compared the relevant statistics of the game during the challenge
-with the rest of the game session (when a different challenge was on).
+with the rest of the game session (when a different challenge was available).

-Note that we are considering only the four sessions in which the challenges were on and that the Sell/buy and Buyout challenges were disabled during the
+Note that we are considering here only the nine sessions in which the challenges were present and that the Sell/buy and Buyout challenges were disabled during the
 session without the market.

-\subsubsection{Sell/buy challenge}
-
-For the Sell/buy challenge, we were interested in comparing the number of individual circles sold on the market per minute when the challenge was active
-and when it was not. The results, presented in Figure~\ref{fig_sellBuySNP}, show that players were selling more circles during the challenge in all
-the experiments except the second one with all the features. However, the surprisingly high rate of circles sold during the time when the challenge
-was not active for the experiment 'All (2)' can be easily explained. During that test, the Sell/buy challenge appeared only twice (in the first
-25 minutes of the game session) and we had a player who put six skill points in the {\em Master Trader} skill and was selling more and more circles
-as the session went on (selling an impressive total of 288 circles during the session).
-
-\begin{figure*}[htbp]
-  \begin{center}
-    \includegraphics[width=\halfWidth]{Figs/sellBuySNP.pdf}
-    \vspace{0cm}
-    \caption{Number of individual circles sold on the market per minute with and without the Sell/buy challenge active. 'All' and 'All (2)'
-    represent the two tests with all the features on, and 'No skills' represents the test without the skills.
-    }\label{fig_sellBuySNP}
-  \end{center}
-\end{figure*}
-
 \subsubsection{Minimum number of colors challenge}

-To measure the effect of the Minimum number of colors challenge on the game, we compared the average number of colors of the sequences sold by the players
-when the challenge was active and when it was not. The results are presented in Figure~\ref{fig_minNbCols}. In all the game sessions except the one
-without the market, the average number of colors in common is higher when the challenge is active. Interestingly, the biggest difference
-in the averages occurred during the test with no skills. One possible explanation could be that without the skills, the players have to rely more
-on completing the challenges to get bonuses and go up in the leaderboard. In the session without the market, the challenge did not make a significant
-difference on the average number of colors in common. This tends to confirm that the market is a tool that can help the players acquiring
-circles with more colors.
+To measure the effect of the {\em Minimum number of colors challenge} on the game, we compared the average number of colors of the sequences built by the players
+when the challenge was active and when it was not. The different averages for each game session are presented in Figure~\ref{fig_minNbCols}.
+In all the game sessions except A-3 and NM, the average number of colors in common is higher when the challenge is active.

-\begin{figure*}[htbp]
+\begin{figure}[htbp]
   \begin{center}
     \includegraphics[width=\halfWidth]{Figs/minNbCols.pdf}
     \vspace{0cm}
-    \caption{Average number of colors in the sequences with and without the Minimum number of colors challenge active. 'All' and 'All (2)'
-    represent the two tests with all the features on, 'No skills' represents the test without the skills, and 'No market' represents the test without the
-    market.
+    \caption{Average number of colors in the sequences with and without the {\em Minimum number of colors challenge} active. 'A', 'A-2' and 'A-3'
+    represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represents
+    the test without the market.
     }\label{fig_minNbCols}
   \end{center}
-\end{figure*}
+\end{figure}

+The distribution of the averages of the number of colors in common for all the game sessions considered here is normal (Shapiro-Wilk $p = 0.79$),
+allowing us to use a Welch's t-test to compare the means for both groups, {\em i.e.} 1.96 colors in common during the challenge and 1.76 during
+the rest of the time. The test confirmed a significant effect of the presence of the challenge on the average number of colors in common
+($t(16)=2.19$, $p=0.04$, Cohen's $d = 1.03$).
+
 \subsubsection{Minimum sequence length challenge}

-In order to analyze the effect that the Minimum sequence length challenge had on the game, we compared the average sequence length during the challenge
-and when a different challenge was active. As shown in Figure~\ref{fig_minSeqLength}, this challenge is the one for which we observe the smallest effect.
-The Minimum sequence length challenge does not seem to significantly change the players' game plan, except in the experiment without the skills. As we mentionned
-in the analysis of the previous challenge, it seems that when the skills are not present, the players give a lot more attention to the challenges.
-%As for the two experiments with all the features on, the average sequence length is a little bit lower during the challenge, which is surprising and hard to explain.
-As with the previous challenge, we can observe that the Minimum sequence length challenge does not seem to have affected the session without the market.
-This seems to show that the market can help the players to build longer sequences, but in the two experiments with all the features on, the average
-sequence length is a little bit lower during the challenge, which is contradictory.
-%%%WHAT ELSE CAN WE SAY???
+In order to analyze the effect that the {\em Minimum sequence length challenge} had on the game, we compared the average sequence length during the challenge
+and when a different challenge was active for all the game sessions. As shown in Figure~\ref{fig_minSeqLength}, the presence of this challenge increased
+the average sequence length in all the game sessions except the three sessions with all the features.

-\begin{figure*}[htbp]
+\begin{figure}[htbp]
   \begin{center}
     \includegraphics[width=\halfWidth]{Figs/minSeqLength.pdf}
     \vspace{0cm}
-    \caption{Average sequence length with and without the Minimum sequence length challenge active. 'All' and 'All (2)'
-    represent the two tests with all the features on, 'No skills' represents the test without the skills, and 'No market' represents the test without the
-    market.
+    \caption{Average sequence length with and without the {\em Minimum sequence length challenge active}. 'A', 'A-2' and 'A-3'
+    represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represents
+    the test without the market.
     }\label{fig_minSeqLength}
   \end{center}
-\end{figure*}
+\end{figure}

+The means of the average sequence lengths during the challenge and for the rest of the time are 5.38 and 5.08 respectively. Since the distribution
+of the averages of sequence lengths is normal (Shapiro-Wilk $p = 0.27$), we used a Welch's t-test to compare those means, but the test wasn't able
+to prove that those means are significantly different ($t(16)=0.79$, $p = 0.44$).
+
+Although there is not a statistically significant difference between the two groups, we can generally see a small effect for six of the nine groups with
+challenges. The fact that we observe the opposite effect in the three game sessions with all the features is very surprising, but hard to explain. One possible
+explanation could be that when all the features are present, the players have more to think about and check the challenges a little bit less.
+
+\subsubsection{Sell/buy challenge}
+
+For the {\em Sell/buy challenge}, we were interested in comparing the number of individual circles sold on the market per minute when the challenge was active
+and when it was not. The results, presented in Figure~\ref{fig_sellBuySNP}, don't show a clear trend. Indeed, in half of the game sessions, the
+number of circles sold per minute is higher during the challenge, while it's the opposite for the other half of the game sessions.
+
+\begin{figure}[htbp]
+  \begin{center}
+    \includegraphics[width=\halfWidth]{Figs/sellBuySNP.pdf}
+    \vspace{0cm}
+    \caption{Number of individual circles sold on the market per minute with and without the Sell/buy challenge active. 'A', 'A-2' and 'A-3'
+    represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represents
+    the test without the market.
+    }\label{fig_sellBuySNP}
+  \end{center}
+\end{figure}
+
+Once again, the numbers of circles sold per minute in the six different game sessions follow a normal distribution (Shapiro-Wilk $p = 0.26$), so we
+used a Welch's t-test to compare the means of both groups, which are 13.18 during the challenge and 12.73 during the rest of the time. The t-test
+failed to reject the null hypothesis that both means are the same ($t(10)=0.11$, $p = 0.91$).
+
+We believe that the main reason why there doesn't seem to be any difference between the two groups is that most people were able to
+complete this type of challenge without really changing anything to their normal behavior. This challenge was simply too easy, because most of the players
+are always selling or buying (through the bids) at least 2 or 4 circles every five minutes (the length of a challenge).
+
 \subsubsection{Buyout}

-The buyout challenge appeared only once in total in all the three gaming session with challenges and with the market. Thus, we don't have a significant
+The {\em Buyout challenge} appeared only once in total in all the three gaming session with challenges and with the market. Thus, we don't have a significant
 amount of data to analyze the effect of this challenge. The reason why this challenge almost never appeared is because players were always using the
-buyout, which greatly reduced the probability of showing this challenge.
+buyout, which greatly reduced the probability of showing this challenge.

 \subsubsection{Specific colors in common}

-This challenge also cannot be analyzed because it was never completed by any player, despite appearing a total of five times in all the game sessions.
+The {\em Specific colors in common challenge} is also difficult to analyze because it was completed only 8 times in total during the nine sessions with challenges, despite
+appearing 11 times throughout those nine experiments.
 This can be explained by the fact that it was the hardest challenge. All the other challenges are more general and can be completed by
 doing actions that are not specific to a certain subset of colors. Even if the market should be helpful in finding circles with the required
-subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and never tried to complete it.
+subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and almost never tried to complete it.

 \subsection{Testing hypothesis 4: relationship between total experience and percentage solved}
 %Coming back on the 4 tests, total game xp vs percentage of problem solved
@@ -666,7 +676,7 @@
 impossible to compare one game session with an other. Indeed, some players quickly understood all the rules of the game and how to maximize their score,
 while others struggled to make points during the whole session, even with the help of the authors who were monitoring the session.

-\begin{figure*}[htbp]
+\begin{figure}[htbp]
   \begin{center}
     \includegraphics[width=\halfWidth]{Figs/totalXP_session.pdf}
     \vspace{0cm}
@@ -675,7 +685,7 @@
     market and 'No chal.' represents the test without the challenges.
     }\label{fig_totalXP}
   \end{center}
-\end{figure*}
+\end{figure}

 As shown in Figure~\ref{fig_totalXP}, the percentage of the problem that was solved is nearly identical for all the tests (around $60\%$) except for the
 first test with all the features and the test with no challenges, in which the players in general performed worse (as indicated by the total experience