Commit 88459a0a863522c92c9511aeb5148e46bd39b861

Authored by Olivier
1 parent a0ca0a31b8
Exists in master

Results updated: section Testing hypothesis 3 (challenges) completed

Showing 5 changed files with 64 additions and 54 deletions

CHIpaper/Figs/minNbCols.pdf

No preview for this file type

CHIpaper/Figs/minSeqLength.pdf

No preview for this file type

CHIpaper/Figs/sellBuySNP.pdf

No preview for this file type

CHIpaper/MarketPaper.pdf

No preview for this file type

CHIpaper/MarketPaper.tex
... ... @@ -419,7 +419,7 @@
419 419 lengths for all the sequences sold to the system during a game session do not follow a normal distribution, we used a non-parametric test (Kruskal-Wallis) to
420 420 verify if the sequence lengths of the different game sessions seem to come from the same distributions.
421 421 The Kruskal-Wallis test revealed a significant effect of the game conditions on the sequence lengths without considering super circles
422   -(${\chi}^2(2) = 1391.7$, $p < 2.2E-16$) and also when considering super circles (${\chi}^2(2) = 1388.4$, $p < 2.2E-16$).
  422 +(${\chi}^2(11) = 1391.7$, $p < 2.2 \times 10^{-16}$) and also when considering super circles (${\chi}^2(11) = 1388.4$, $p < 2.2 \times 10^{-16}$).
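As a quick consistency check on the degrees of freedom, the Kruskal-Wallis statistic can be sketched as follows. This assumes the standard tie-free form of the test; the symbols $k$ (number of groups), $N$ (total number of sequences), $n_i$ (number of sequences in group $i$) and $\bar{R}_i$ (mean rank of group $i$) are notation introduced here for illustration and do not appear in the paper.
\[
  H = \frac{12}{N(N+1)} \sum_{i=1}^{k} n_i \bar{R}_i^{2} - 3(N+1),
  \qquad \mathit{df} = k - 1 .
\]
Under the null hypothesis, $H$ is compared against a ${\chi}^2$ distribution with $k - 1$ degrees of freedom, so the reported ${\chi}^2(11)$ would correspond to $k = 12$ game conditions being compared.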
423 423  
424 424 We then ran a post hoc test (Dunn's test) to perform pairwise comparisons between all the groups. With or without considering super circles, all the game conditions
425 425 were shown to be significantly different ($p < 0.01$), except a few shown in Table~\ref{tab_Dunn}. Note that the strongest similarities are found between
426 426  
427 427  
428 428  
429 429  
430 430  
431 431  
432 432  
433 433  
434 434  
435 435  
436 436  
437 437  
438 438  
439 439  
440 440  
441 441  
... ... @@ -578,86 +578,96 @@
578 578 The challenge system was implemented to analyze the current state of the game and guide the players towards doing actions that are currently needed. As mentioned
579 579 previously, five different challenge types were implemented in the game (see Section Challenge system for the complete list). In order to analyze the effect
580 580 of the challenges on the way the participants were playing, for each challenge type, we compared the relevant statistics of the game during the challenge
581   -with the rest of the game session (when a different challenge was on).
  581 +with the rest of the game session (when a different challenge was available).
582 582  
583   -Note that we are considering only the four sessions in which the challenges were on and that the Sell/buy and Buyout challenges were disabled during the
  583 +Note that we are considering here only the nine sessions in which the challenges were present and that the Sell/buy and Buyout challenges were disabled during the
584 584 sessions without the market.
585 585  
586   -\subsubsection{Sell/buy challenge}
587   -
588   -For the Sell/buy challenge, we were interested in comparing the number of individual circles sold on the market per minute when the challenge was active
589   -and when it was not. The results, presented in Figure~\ref{fig_sellBuySNP}, show that players were selling more circles during the challenge in all
590   -the experiments except the second one with all the features. However, the surprisingly high rate of circles sold during the time when the challenge
591   -was not active for the experiment 'All (2)' can be easily explained. During that test, the Sell/buy challenge appeared only twice (in the first
592   -25 minutes of the game session) and we had a player who put six skill points in the {\em Master Trader} skill and was selling more and more circles
593   -as the session went on (selling an impressive total of 288 circles during the session).
594   -
595   -\begin{figure*}[htbp]
596   - \begin{center}
597   - \includegraphics[width=\halfWidth]{Figs/sellBuySNP.pdf}
598   - \vspace{0cm}
599   - \caption{Number of individual circles sold on the market per minute with and without the Sell/buy challenge active. 'All' and 'All (2)'
600   - represent the two tests with all the features on, and 'No skills' represents the test without the skills.
601   - }\label{fig_sellBuySNP}
602   - \end{center}
603   -\end{figure*}
604   -
605 586 \subsubsection{Minimum number of colors challenge}
606 587  
607   -To measure the effect of the Minimum number of colors challenge on the game, we compared the average number of colors of the sequences sold by the players
608   -when the challenge was active and when it was not. The results are presented in Figure~\ref{fig_minNbCols}. In all the game sessions except the one
609   -without the market, the average number of colors in common is higher when the challenge is active. Interestingly, the biggest difference
610   -in the averages occurred during the test with no skills. One possible explanation could be that without the skills, the players have to rely more
611   -on completing the challenges to get bonuses and go up in the leaderboard. In the session without the market, the challenge did not make a significant
612   -difference on the average number of colors in common. This tends to confirm that the market is a tool that can help the players acquiring
613   -circles with more colors.
  588 +To measure the effect of the {\em Minimum number of colors challenge} on the game, we compared the average number of colors of the sequences built by the players
  589 +when the challenge was active and when it was not. The different averages for each game session are presented in Figure~\ref{fig_minNbCols}.
  590 +In all the game sessions except A-3 and NM, the average number of colors in common is higher when the challenge is active.
614 591  
615   -\begin{figure*}[htbp]
  592 +\begin{figure}[htbp]
616 593 \begin{center}
617 594 \includegraphics[width=\halfWidth]{Figs/minNbCols.pdf}
618 595 \vspace{0cm}
619   - \caption{Average number of colors in the sequences with and without the Minimum number of colors challenge active. 'All' and 'All (2)'
620   - represent the two tests with all the features on, 'No skills' represents the test without the skills, and 'No market' represents the test without the
621   - market.
  596 + \caption{Average number of colors in the sequences with and without the {\em Minimum number of colors challenge} active. 'A', 'A-2' and 'A-3'
  597 + represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represent
  598 + the tests without the market.
622 599 }\label{fig_minNbCols}
623 600 \end{center}
624   -\end{figure*}
  601 +\end{figure}
625 602  
  603 +The distribution of the averages of the number of colors in common for all the game sessions considered here is normal (Shapiro-Wilk $p = 0.79$),
  604 +allowing us to use Welch's t-test to compare the means of the two groups, {\em i.e.} 1.96 colors in common during the challenge and 1.76 during
  605 +the rest of the time. The test confirmed a significant effect of the presence of the challenge on the average number of colors in common
  606 +($t(16)=2.19$, $p=0.04$, Cohen's $d = 1.03$).
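The reported effect size can be checked against the t statistic with a short calculation. This is only a sketch: it assumes two groups of equal size $n = 9$ (one average per session in each group) and a Cohen's $d$ computed with the pooled standard deviation $s_p$; the symbols $\bar{x}_1$, $\bar{x}_2$, $s_1$, $s_2$ and $s_p$ are notation introduced here, not taken from the paper.
\[
  t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n + s_2^2/n}}
    = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{2/n}},
  \qquad
  s_p = \sqrt{\frac{s_1^2 + s_2^2}{2}},
  \qquad
  d = \frac{\bar{x}_1 - \bar{x}_2}{s_p} = t \sqrt{\frac{2}{n}} .
\]
Under these assumptions, $d \approx 2.19 \times \sqrt{2/9} \approx 1.03$, which agrees with the value reported above.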
  607 +
626 608 \subsubsection{Minimum sequence length challenge}
627 609  
628   -In order to analyze the effect that the Minimum sequence length challenge had on the game, we compared the average sequence length during the challenge
629   -and when a different challenge was active. As shown in Figure~\ref{fig_minSeqLength}, this challenge is the one for which we observe the smallest effect.
630   -The Minimum sequence length challenge does not seem to significantly change the players' game plan, except in the experiment without the skills. As we mentionned
631   -in the analysis of the previous challenge, it seems that when the skills are not present, the players give a lot more attention to the challenges.
632   -%As for the two experiments with all the features on, the average sequence length is a little bit lower during the challenge, which is surprising and hard to explain.
633   -As with the previous challenge, we can observe that the Minimum sequence length challenge does not seem to have affected the session without the market.
634   -This seems to show that the market can help the players to build longer sequences, but in the two experiments with all the features on, the average
635   -sequence length is a little bit lower during the challenge, which is contradictory.
636   -%%%WHAT ELSE CAN WE SAY???
  610 +In order to analyze the effect that the {\em Minimum sequence length challenge} had on the game, we compared the average sequence length during the challenge
  611 +and when a different challenge was active for all the game sessions. As shown in Figure~\ref{fig_minSeqLength}, the presence of this challenge increased
  612 +the average sequence length in all the game sessions except the three sessions with all the features.
637 613  
638   -\begin{figure*}[htbp]
  614 +\begin{figure}[htbp]
639 615 \begin{center}
640 616 \includegraphics[width=\halfWidth]{Figs/minSeqLength.pdf}
641 617 \vspace{0cm}
642   - \caption{Average sequence length with and without the Minimum sequence length challenge active. 'All' and 'All (2)'
643   - represent the two tests with all the features on, 'No skills' represents the test without the skills, and 'No market' represents the test without the
644   - market.
  618 + \caption{Average sequence length with and without the {\em Minimum sequence length challenge} active. 'A', 'A-2' and 'A-3'
  619 + represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represent
  620 + the tests without the market.
645 621 }\label{fig_minSeqLength}
646 622 \end{center}
647   -\end{figure*}
  623 +\end{figure}
648 624  
  625 +The means of the average sequence lengths during the challenge and for the rest of the time are 5.38 and 5.08 respectively. Since the distribution
  626 +of the averages of sequence lengths is normal (Shapiro-Wilk $p = 0.27$), we used Welch's t-test to compare those means, but the test did not
  627 +show a significant difference between them ($t(16)=0.79$, $p = 0.44$).
  628 +
  629 +Although there is not a statistically significant difference between the two groups, we can generally see a small effect for six of the nine groups with
  630 +challenges. The fact that we observe the opposite effect in the three game sessions with all the features is surprising and hard to explain. One possible
  631 +explanation could be that when all the features are present, the players have more to think about and pay slightly less attention to the challenges.
  632 +
  633 +\subsubsection{Sell/buy challenge}
  634 +
  635 +For the {\em Sell/buy challenge}, we were interested in comparing the number of individual circles sold on the market per minute when the challenge was active
  636 +and when it was not. The results, presented in Figure~\ref{fig_sellBuySNP}, do not show a clear trend. Indeed, in half of the game sessions, the
  637 +number of circles sold per minute is higher during the challenge, while the opposite holds for the other half.
  638 +
  639 +\begin{figure}[htbp]
  640 + \begin{center}
  641 + \includegraphics[width=\halfWidth]{Figs/sellBuySNP.pdf}
  642 + \vspace{0cm}
  643 + \caption{Number of individual circles sold on the market per minute with and without the {\em Sell/buy challenge} active. 'A', 'A-2' and 'A-3'
  644 + represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represent
  645 + the tests without the market.
  646 + }\label{fig_sellBuySNP}
  647 + \end{center}
  648 +\end{figure}
  649 +
  650 +Once again, the numbers of circles sold per minute in the six different game sessions follow a normal distribution (Shapiro-Wilk $p = 0.26$), so we
  651 +used Welch's t-test to compare the means of both groups, which are 13.18 during the challenge and 12.73 during the rest of the time. The t-test
  652 +failed to reject the null hypothesis that both means are the same ($t(10)=0.11$, $p = 0.91$).
  653 +
  654 +We believe that the main reason why there does not seem to be any difference between the two groups is that most people were able to
  655 +complete this type of challenge without really changing their normal behavior. This challenge was simply too easy, because most of the players
  656 +are always selling or buying (through the bids) at least 2 or 4 circles every five minutes (the length of a challenge).
  657 +
649 658 \subsubsection{Buyout}
650 659  
651   -The buyout challenge appeared only once in total in all the three gaming session with challenges and with the market. Thus, we don't have a significant
  660 +The {\em Buyout challenge} appeared only once in total across all three gaming sessions with challenges and with the market. Thus, we do not have enough
652 661 data to analyze the effect of this challenge. The reason why this challenge almost never appeared is that players were always using the
653   -buyout, which greatly reduced the probability of showing this challenge.
  662 +buyout, which greatly reduced the probability of showing this challenge.
654 663  
655 664 \subsubsection{Specific colors in common}
656 665  
657   -This challenge also cannot be analyzed because it was never completed by any player, despite appearing a total of five times in all the game sessions.
  666 +The {\em Specific colors in common challenge} is also difficult to analyze because it was completed only 8 times in total during the nine sessions with challenges, despite
  667 +appearing 11 times throughout those nine experiments.
658 668 This can be explained by the fact that it was the hardest challenge. All the other challenges are more general and can be completed by
659 669 doing actions that are not specific to a certain subset of colors. Even if the market should be helpful in finding circles with the required
660   -subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and never tried to complete it.
  670 +subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and almost never tried to complete it.
661 671  
662 672 \subsection{Testing hypothesis 4: relationship between total experience and percentage solved}
663 673 %Coming back on the 4 tests, total game xp vs percentage of problem solved
... ... @@ -666,7 +676,7 @@
666 676 impossible to compare one game session with another. Indeed, some players quickly understood all the rules of the game and how to maximize their score,
667 677 while others struggled to make points during the whole session, even with the help of the authors who were monitoring the session.
668 678  
669   -\begin{figure*}[htbp]
  679 +\begin{figure}[htbp]
670 680 \begin{center}
671 681 \includegraphics[width=\halfWidth]{Figs/totalXP_session.pdf}
672 682 \vspace{0cm}
... ... @@ -675,7 +685,7 @@
675 685 market and 'No chal.' represents the test without the challenges.
676 686 }\label{fig_totalXP}
677 687 \end{center}
678   -\end{figure*}
  688 +\end{figure}
679 689  
680 690 As shown in Figure~\ref{fig_totalXP}, the percentage of the problem that was solved is nearly identical for all the tests (around $60\%$) except for the
681 691 first test with all the features and the test with no challenges, in which the players in general performed worse (as indicated by the total experience