Commit a0585b6c90804f77900c70efad80809001a1e9f0

1 parent
88459a0a86

Exists in
master

### Results updated: section Hypothesis 4 (challenges) completed. Section What makes…

… a good player updated. Conclusion updated. Acknowledgments updated

Showing **3 changed files** with **65 additions** and **41 deletions**

CHIpaper/Figs/totalXP_session.pdf

CHIpaper/MarketPaper.pdf

CHIpaper/MarketPaper.tex

... | ... | @@ -403,8 +403,8 @@ |

403 | 403 | \includegraphics[width=\halfWidth]{Figs/averageSeqLength.pdf} |

404 | 404 | \vspace{0cm} |

405 | 405 | \caption{Average sequence length for every game session, not considering the super circles and considering the super circles (e.g. a super circle |

406 | - of level 2 in a sequence represents 10 circles in the solution). A, A-2 and A-3 represent the tests with all the features on; NS, NS-2 and NS-3 represent the | |

407 | - tests without the skills; NM, NM-2 and NM-3 represent the tests without the market; NC, NC-2 and NC-3 represent the tests without the challenges. | |

406 | + of level 2 in a sequence represents 10 circles in the solution). 'A', 'A-2' and 'A-3' represent the tests with all the features on; 'NS', 'NS-2' and 'NS-3' represent the | |

407 | + tests without the skills; 'NM', 'NM-2' and 'NM-3' represent the tests without the market; 'NC', 'NC-2' and 'NC-3' represent the tests without the challenges. | |

408 | 408 | }\label{fig_averageSeqLength} |

409 | 409 | \end{center} |

410 | 410 | \end{figure} |

411 | 411 | |

... | ... | @@ -655,13 +655,13 @@ |

655 | 655 | complete this type of challenge without really changing anything to their normal behavior. This challenge was simply too easy, because most of the players |

656 | 656 | are always selling or buying (through the bids) at least 2 or 4 circles every five minutes (the length of a challenge). |

657 | 657 | |

658 | -\subsubsection{Buyout} | |

658 | +\subsubsection{Buyout challenge} | |

659 | 659 | |

660 | 660 | The {\em Buyout challenge} appeared only once in total across all three gaming sessions with challenges and with the market. Thus, we do not have a significant |

661 | 661 | amount of data to analyze the effect of this challenge. This challenge almost never appeared because players were always using the |

662 | 662 | buyout, which greatly reduced the probability of showing this challenge. |

663 | 663 | |

664 | -\subsubsection{Specific colors in common} | |

664 | +\subsubsection{Specific colors in common challenge} | |

665 | 665 | |

666 | 666 | The {\em Specific colors in common challenge} is also difficult to analyze because it was completed only 8 times in total during the nine sessions with challenges, despite |

667 | 667 | appearing 11 times throughout those nine experiments. |


... | ... | @@ -669,58 +669,66 @@ |

669 | 669 | doing actions that are not specific to a certain subset of colors. Even if the market should be helpful in finding circles with the required |

670 | 670 | subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and almost never tried to complete it. |

671 | 671 | |

672 | -\subsection{Testing hypothesis 4: relationship between total experience and percentage solved} | |

673 | -%Coming back on the 4 tests, total game xp vs percentage of problem solved | |

674 | -As mentioned in the Experiments section, the initial plan was to measure the impact of each feature by analyzing how much of the problem can be solved | |

675 | -by the players in each of the game sessions. Interestingly, we observed a larger than expected variance in the participants' skills which made it practically | |

676 | -impossible to compare one game session with an other. Indeed, some players quickly understood all the rules of the game and how to maximize their score, | |

677 | -while others struggled to make points during the whole session, even with the help of the authors who were monitoring the session. | |

672 | +\subsection{Testing hypothesis 4: percentage of the problem solved as a measure of the importance of different game features} | |

673 | +One of the research goals was to measure the impact of each feature by analyzing how much of the problem can be solved | |

674 | +by the players in each of the game sessions. Our initial hypothesis was that players who have access to all the game features should | |

675 | +be able to solve more of the problem. | |

678 | 676 | |

677 | + Interestingly, we observed a larger-than-expected variance in the participants' skills, which sometimes made it | |

678 | + difficult to compare one game session with another in terms of the percentage of the problem that was solved. | |

679 | +Indeed, some players quickly understood all the rules of the game and how to maximize their score, | |

680 | +while others struggled to make points during the whole session, even with our help. | |

681 | + | |

679 | 682 | \begin{figure}[htbp] |

680 | 683 | \begin{center} |

681 | 684 | \includegraphics[width=\halfWidth]{Figs/totalXP_session.pdf} |

682 | 685 | \vspace{0cm} |

683 | - \caption{Total game experience and percentage of the problem solved for each of the 5 game sessions. 'XP' represents experience points. 'All' and 'All (2)' | |

684 | - represent the two tests with all the features on, 'No skills' represents the test without the skills, 'No market' represents the test without the | |

685 | - market and 'No chal.' represents the test without the challenges. | |

686 | + \caption{Total game experience and percentage of the problem solved for each of the 12 game sessions. 'XP' represents experience points. | |

687 | + 'A', 'A-2' and 'A-3' represent the tests with all the features on; 'NS', 'NS-2' and 'NS-3' represent the | |

688 | + tests without the skills; 'NM', 'NM-2' and 'NM-3' represent the tests without the market; 'NC', 'NC-2' and 'NC-3' represent the tests without the challenges. | |

686 | 689 | }\label{fig_totalXP} |

687 | 690 | \end{center} |

688 | 691 | \end{figure} |

689 | 692 | |

690 | -As shown in Figure~\ref{fig_totalXP}, the percentage of the problem that was solved is nearly identical for all the tests (around $60\%$) except for the | |

691 | -first test with all the features and the test with no challenges, in which the players in general performed worse (as indicated by the total experience | |

692 | -points for those game sessions). In particular, the comparison of the first game session with all the features with the second one ('All' and 'All (2)') | |

693 | -demonstrates that we cannot simply use the percentage of the solution found as a way to measure the impact of a feature. Even with the exact same game | |

694 | -conditions, there is a big difference in the total experience and percentage of solutions found. | |

693 | +As shown in Figure~\ref{fig_totalXP}, the percentage of the problem that was solved varies from 48\% to 75\% in all the different experiments. | |

694 | + In particular, the differences observed for experiments with the exact same game conditions (sometimes up to an 18\% difference) | |

695 | + demonstrate that we cannot simply use the percentage of the exact solution found as a way to measure the impact of a feature. | |

696 | +Moreover, the top five sessions in terms of percentage solved (all sessions with more than 65\%) come from the four different game conditions. | |

695 | 697 | |

696 | -Notice that a game session with many good players combining for a high total of experience points does not guarantee that | |

697 | -a bigger percentage of the solution will be found by the players. This is due to the fact that, in the current state of the game, players can be selling | |

698 | +We used linear regression to test if the percentage of the problem solved is, to some extent, directly proportional to the total experience points accumulated | |

699 | +by all the players during a session (graph not shown). The linear function obtained had a coefficient of correlation $r = 0.89$ and a coefficient of determination | |

700 | + $r^2=0.79$, which indicates a fairly strong positive correlation. The different game conditions likely account for some of the observed variance. | |

701 | +%Notice that a game session with many good players combining for a high total of experience points does not guarantee that | |

702 | +%a bigger percentage of the solution will be found by the players. | |

703 | +Another reason for the variance is the fact that, in the current state of the game, players can be selling | |

698 | 704 | sequences that correspond to a solution that was already found earlier. While it would be possible to lower the score of a solution (sequence) that already |

699 | 705 | exists, it would be hard to explain to inexperienced players why one sequence is worth less than another with exactly the same length and number of colors |

700 | 706 | in common. That is why we decided not to take into account the existing ({\em i.e.} already found) solutions in the scoring function. |
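
The regression check described in this hunk (total experience points per session against percentage of the problem solved) can be illustrated with a minimal sketch. The function names `pearson_r` and `fit_line` are hypothetical and are not taken from the paper; the per-session data points are not given in the diff, so none are reproduced here.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (mean_x, mean_y, sxx, syy, sxy) for two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    _, _, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one sample is constant")
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mean_x, mean_y, sxx, _, sxy = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Here `xs` would hold the total experience points of each session and `ys` the corresponding percentage solved. For a simple linear fit, the coefficient of determination is the square of `r`, which is consistent with the reported values: $0.89^2 \approx 0.79$.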

701 | 707 | |

702 | -In the following sections, we show the impact of each feature based on different metrics. | |

708 | +%In the following sections, we show the impact of each feature based on different metrics. | |

703 | 709 | |

704 | 710 | \subsection{Understanding what makes a good player} |

705 | 711 | |

706 | 712 | Based on the questionnaire filled out by the players before playing the game, and the global leaderboard of all the players from all the sessions put together, |

707 | -we tried to find similarities between the top players. Table~\ref{tab_playerStats} shows the most interesting differences between the top six players | |

713 | +we tried to find similarities between the top players. Table~\ref{tab_playerStats} shows the most interesting differences between the top 12 players | |

708 | 714 | and the rest of the players. In the questionnaires, players had to indicate their age category (between 21 and 25 for example), their own evaluation |

709 | -of their puzzle solving abilities and a range of hours of time spent playing video games every week. The mean age of the two groups of players | |

710 | -was calculated by taking the middle point of the age categories. The average age of the top 6 players was about 5 years younger than the one of | |

715 | +of their puzzle solving abilities and a range of hours of time spent playing video games every week. | |

716 | + | |

717 | +The average age of the two groups of players | |

718 | + was calculated by taking the middle point of the age categories. The average age of the top 12 players was about $2.5$ years younger than that of | |

711 | 719 | the other players. For the puzzle solving self evaluation, the players could choose a level between 1 and 5 (5 being the strongest). The average |

712 | -level of the top 6 players was 3.83, compared to 2.81 for the others. As with the age categories, we computed averages of time spent playing | |

713 | -video games every week using the middle point of the categories. The top six players were playing roughly 3 times more every week than the | |

720 | +level of the top 12 players was 3.67, compared to 2.90 for the others. As we did with the age categories, we computed averages of time spent playing | |

721 | +video games every week using the middle point of the categories. The top 12 players were playing roughly $2.5$ times more every week than the | |

714 | 722 | rest of the players. |
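
The midpoint averaging used for the age and weekly game time categories can be sketched as follows. This is only an illustration: the function name `midpoint_mean` is hypothetical, and apart from the 21 to 25 age bracket mentioned in the text, the category bounds are assumptions.

```python
def midpoint_mean(categories):
    """Average of category midpoints.

    `categories` holds one (low, high) pair per respondent, e.g. (21, 25)
    for the age bracket "between 21 and 25". Each respondent is represented
    by the midpoint of the chosen bracket.
    """
    if not categories:
        raise ValueError("at least one respondent is required")
    return sum((low + high) / 2 for low, high in categories) / len(categories)


# Two respondents: one in the 21-25 bracket (from the text) and one in a
# hypothetical 26-30 bracket. Midpoints are 23 and 28, so the mean is 25.5.
example_mean = midpoint_mean([(21, 25), (26, 30)])
```

Because every respondent is replaced by a bracket midpoint, the resulting averages are approximations whose precision depends on how wide the categories are.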

715 | 723 | |

716 | 724 | \begin{table}[h] |

717 | -\caption{Average statistics on the top six players vs the others}\label{tab_playerStats} | |

725 | +\caption{Average statistics on the top 12 players vs the others}\label{tab_playerStats} | |

718 | 726 | \begin{center} |

719 | 727 | \begin{tabular}{ccc}\hline |

720 | - & Top 6 players & Others\\ | |

721 | -Age & 25.50 & 30.33\\ | |

722 | -Self evaluation & 3.83 & 2.81\\ | |

723 | -Game time & 10.42 & 3.20\\\hline | |

728 | + & Top 12 players & Others\\ | |

729 | +Age & 23.42 & 25.99\\ | |

730 | +Self evaluation & 3.67 & 2.90\\ | |

731 | +Game time & 10.00 & 4.11\\\hline | |

724 | 732 | \end{tabular} |

725 | 733 | \end{center} |

726 | 734 | \end{table} |

727 | 735 | |

... | ... | @@ -729,19 +737,35 @@ |

729 | 737 | |

730 | 738 | We implemented a human computing game that uses a market, skills and challenges in order to solve a problem collaboratively. The problem that is solved |

731 | 739 | by the players in our game is a graph problem that can be easily translated into a color matching game. The total number of colors used in the tests was small |

732 | -enough so that we were able to compute an exact solution and evaluate the performance of the players. We organized five game sessions of 10 players with | |

733 | -different game conditions and to our surprise, the great variability in the participants' skills made it impossible to make direct comparisons between the tests | |

734 | -in regards to the percentage of the solutions found. However, our tests showed that the market is a useful tool to help players build better solutions | |

735 | -(longer sequences, in our case). Our | |

736 | -results also show that skills and challenges systems are helpful tools to inform, influence and guide the players in doing specific actions that are | |

737 | -beneficial to the system and other players. | |

738 | -Finally, based on the game sessions that we organized, it seems that younger players who play video games on a regular basis are able to understand the rules | |

740 | +enough so that we were able to compute an exact solution and evaluate the performance of the players. We organized 12 game sessions of 10 players with | |

741 | +four different game conditions (3 times each). | |

742 | + | |

743 | +Our tests showed without a doubt that the market is a useful tool to help players build longer solutions (sequences, in our case). In addition, | |

744 | +it also makes the game a lot more dynamic and players mentioned that they really enjoyed this aspect of the game. | |

745 | + | |

746 | +Our results also showed that skills in general are helpful to influence and guide the players into doing specific actions that are | |

747 | +beneficial to the system and other players. We have found that skills are more efficient in their role of guiding the players if | |

748 | +they are not directly related to the main goal of the game: the {\em Color Expert} skill for example did not affect the proportion of | |

749 | +multicolored sequences built by the players. | |

750 | + | |

751 | +The results on the challenges indicate that they can be useful to promote an action in the game ({\em Minimum number of colors in common} for example), but | |

752 | +in order to be effective, the difficulty needs to be well-balanced. Challenges that are too easy ({\em Sell/buy challenge} for example) or | |

753 | +too hard ({\em Specific colors in common challenge} for example) do not affect the game significantly. | |

754 | + | |

755 | +Although the great variability in the participants' skills made it very difficult to make direct comparisons between the different game conditions | |

756 | + with regard to the percentage of the solutions found, we showed that the percentage solved is to a certain extent proportional to the total experience gained | |

757 | +by all players during a game session. | |

758 | + | |

759 | +Finally, it seems that younger players who play video games on a regular basis and | |

760 | +have a strong self evaluation of their puzzle solving skills are able to understand the rules | |

739 | 761 | of the game and find winning strategies faster than the average participant. |

740 | 762 | |

741 | 763 | \section{Acknowledgments} |

742 | 764 | |

743 | -The authors would like to thank Jean-Fran\c{c}ois Bourbeau, Mathieu Blanchette, Derek Ruths and Edward Newell for their help with the initial design of the game. | |

744 | -The authors would also like to thank Silvia Juliana Leon Mantilla for her help with the organization of the game sessions and the recruitment of participants. | |

765 | +First and foremost, the authors wish to thank all the players who made this study possible. | |

766 | +The authors would also like to thank Jean-Fran\c{c}ois Bourbeau, Mathieu Blanchette, Derek Ruths and Edward Newell for their help with the initial design of the game, | |

767 | +and Alexandre Leblanc for his helpful advice on the statistical tests. | |

768 | +Finally, the authors wish to thank Silvia Juliana Leon Mantilla and Shu Hayakawa for their help with the organization of the game sessions and the recruitment of participants. | |

745 | 769 | |

746 | 770 | % REFERENCES FORMAT |

747 | 771 | % References must be the same font size as other body text. |