Commit f87cc31eebf387a3fac665e6f6692126967b4e80

Authored by Olivier
1 parent 4cc4c0cd7b
Exists in master

New folder with paper in the Nature Scientific Report format. Also added gameplay loop diagram.

Showing 15 changed files with 1229 additions and 0 deletions

New binary files under NatureSR/Figs/ (no text diff): averageSeqLength.pdf, boxplot_BK.pdf, boxplot_CE.pdf, boxplot_MT.pdf, boxplot_SC_nbCols.pdf, boxplot_SC_seqLength.pdf, interface.png (43.3 KB), interface_mod.png (52.9 KB), minNbCols.pdf, minSeqLength.pdf, sellBuySNP.pdf, totalXP_session.pdf

NatureSR/MarketPaper.tex
  1 +\documentclass[fleqn,10pt]{wlscirep}
  2 +
  3 +% Load basic packages
  4 +\usepackage{balance} % to better equalize the last page
  5 +\usepackage{graphics} % for EPS, load graphicx instead
  6 +\usepackage[T1]{fontenc}
  7 +\usepackage{txfonts}
  8 +\usepackage{mathptmx}
  9 +%\usepackage[pdftex]{hyperref}
  10 +%\usepackage{color}
  11 +%\usepackage{booktabs}
  12 +%\usepackage{textcomp}
  13 +
  14 +\usepackage{tikz}
  15 +\usetikzlibrary{arrows.meta, positioning}
  16 +\usepackage{smartdiagram}
  17 +\usesmartdiagramlibrary{additions} % required in the preamble
  18 +\usetikzlibrary{arrows} % required in the preamble
  19 +
  20 +\def \halfWidth {0.5\textwidth}
  21 +
  22 +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  23 +
  24 +
  25 +
  26 +\title{Collaborative solving in a human computing game using a market, skills and challenges}
  27 +
  28 +\author[1,*]{Alice Author}
  29 +\author[2]{Bob Author}
  30 +\author[1,2,+]{Christine Author}
  31 +\author[2,+]{Derek Author}
  32 +\affil[1]{Affiliation, department, city, postcode, country}
  33 +\affil[2]{Affiliation, department, city, postcode, country}
  34 +
  35 +\affil[*]{corresponding.author@email.example}
  36 +
  37 +\affil[+]{these authors contributed equally to this work}
  38 +
  39 +%\keywords{Keyword1, Keyword2, Keyword3}
  40 +
  41 +\begin{abstract}
  42 +Using a human computing game to solve a problem that has a large search space is not straightforward. The difficulty of such an approach
  43 +comes from two conflicting facts: (i) showing a single player the complete search space would be overwhelming, yet
  44 +(ii) an optimal solution cannot be found without considering all the available data. In this paper, we present a human computing
  45 +game that uses a market, skills and a challenge system to help the players solve a graph problem collaboratively. The results obtained
  46 +during 12 game sessions of 10 players each show that the market helps players build larger solutions. We also show that a skill system and, to a lesser extent, a
  47 +challenge system can be used to influence and guide the players towards producing better solutions.
  48 +\end{abstract}
  49 +\begin{document}
  50 +
  51 +\flushbottom
  52 +\maketitle
  57 +\thispagestyle{empty}
  58 +
  59 +\section{Introduction}
  60 +Human computation and crowdsourcing are now perceived as valuable techniques to help solve difficult computational problems. In order to make the best use of human skills in these systems, it is important to be able to characterize the expertise and performance of humans as individuals and, even more importantly, as groups.
  61 +
  62 +Currently, popular crowd-computing platforms such as Amazon Mechanical Turk (AMT) \cite{Buhrmester01012011, Paolacci} or Crowdcrafting \cite{Crowdcrafting} are based on similar divide-and-conquer architectures, where the initial problem is decomposed into smaller sub-tasks that are distributed to individual workers and then aggregated to build a solution. In particular, these systems preclude any interaction between workers in order to avoid groupthink phenomena and bias in the solution \cite{Lorenz:2011aa}.
  63 +
  64 +However, such constraints necessarily limit the capacity of the system to harness the cognitive power of crowds and take full advantage of collective intelligence. For instance, iterative combinations of crowdsourced contributions can help enhance creativity \cite{DBLP:conf/chi/YuN11}. The usefulness of parallelizing workflows has also been suggested for tasks accepting broad varieties of answers \cite{DBLP:conf/chi/Little10}.
  65 +
  66 +The benefits of developing recommendation systems or coordination methods in collaborative environments have been demonstrated \cite{DBLP:conf/cscw/KitturK08,DBLP:conf/cscw/DowKKH12,DBLP:conf/chi/ZhangLMGPH12}. Therefore, in order to gain expressivity and improve their performance, the next generation of human-computation systems will certainly need to implement mechanisms to promote and control collaboration between workers. Nonetheless, before transitioning to this model, it is important to first estimate the potential gains in productivity, and to quantify the usefulness of the mechanisms and incentives that promote collaborative solving while preventing groupthink.
  67 +
  68 +Historically, computation on graphs has proven to be a good model for studying the performance of humans at solving complex combinatorial problems \cite{Kearns:2006aa}. Experiments have been conducted to evaluate the dynamics of crowds collaborating to solve graph problems \cite{DBLP:journals/cacm/Kearns12}, but little is known about the efficiency of the various modes of interaction.
  69 +
  70 +In this paper, we propose a formal framework to study human collaborative solving. We embed a combinatorial graph problem into a novel multiplayer game-with-a-purpose \cite{DBLP:conf/chi/AhnD04,DBLP:conf/aaai/HoCH07}, which we use to engage participants and analyze collective performance. More precisely, we design a market game in which players can sell and buy solutions or bits of information, and couple this platform with (i) a skills system that enhances the efficiency of specific gaming strategies and (ii) a challenge system that guides the work of the crowd. We use this game to investigate the validity of the following hypotheses.
  71 +
  72 +\subsection{Hypotheses}
  73 +
  74 +The development of the game with its three main features, {\em i.e.} the market, the skills and the challenge system, was based on the following four hypotheses:
  75 +\begin{enumerate}
  76 +\setlength{\itemsep}{0em}
  77 + \item A market system will help the players build better solutions.
  78 + \item A skill system is useful to steer the players towards specific actions that are beneficial to the game and to other players.
  79 + \item A challenge system is effective in encouraging the players to do a specific action in the game.
  80 + \item The collected solutions are better when all three features are present in a game session, independently of the players' personal skills.
  81 +\end{enumerate}
  82 +
  83 +To answer these questions, we conducted a study with 120 participants using different variants of our market game. Our results confirm the benefits of using a trading platform to produce better solutions. Interestingly, we also found that a skills system helps to promote actions that are favorable to the collective solving process,
  84 +%(e.g. increasing the diversity of intermediate solutions)
  85 +but that the efficiency of a skill is reduced if it is designed to help solve one of the primary objectives of the game. Finally, we observed that a precise parametrization of the challenges ({\em i.e.} finding an appropriate difficulty, neither too easy nor too difficult) is required for them to improve the quality of the collective work.
  86 +
  87 +Our game is freely available at \texttt{URL:TBA}, and can be used as a platform for further independent studies.
  88 +
  89 +\section{Problem}
  90 +
  91 +The game was implemented to solve a graph problem: finding maximal cliques in a multi-colored graph.
  92 +Let $G(V,E)$ be a multi-colored graph, where each vertex $v \in V$ has a set of colors $c(v)$. There is a colored edge $e=(v,u) \in E$
  93 +between the vertices $v$ and $u$ for every color in $c(v) \cap c(u)$ ({\em i.e.}, one for every color that they have in common). In other words, there is
  94 +no colored edge between two vertices $v$ and $u$ for which $c(v) \cap c(u) = \emptyset$. Let $|C|$ be the total number of colors in the graph.
  95 +The problem is then to find maximal cliques for each possible number of colors $n$ (where $1 \leq n \leq |C|$), {\em i.e.} cliques in which all
  96 +the edges (and vertices) share the same $n$ colors.
  97 +%This problem has a worst time complexity of $O(|V|2^{|C|})$.
  98 +A simple exact algorithm can solve the problem in $O(|V|2^{|C|})$ time. We conjecture that this is also the worst-case time complexity of the problem.
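
Because the edges are implicit, the vertices whose color sets contain a given subset $S$ of colors always form a clique, so the exact algorithm reduces to enumerating all $2^{|C|}$ color subsets and keeping, for each number of colors $n$, the largest such clique. A minimal sketch of this enumeration (in Python; the data layout is an illustrative assumption, not our Java implementation):
\begin{verbatim}
from itertools import combinations

def maximal_cliques(colors_of, num_colors):
    # colors_of: dict mapping vertex -> frozenset of color indices.
    # For each n, keep the largest clique whose members share the
    # same n colors. O(|V| * 2^|C|) overall.
    best = {}
    for n in range(1, num_colors + 1):
        for subset in combinations(range(num_colors), n):
            s = frozenset(subset)
            members = [v for v, cols in colors_of.items() if s <= cols]
            if n not in best or len(members) > len(best[n][1]):
                best[n] = (s, members)
    return best
\end{verbatim}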
  99 +
  100 +This problem was chosen for two reasons. First, it can be solved quickly by a computer when the number of colors is small, thus making it possible to compute the exact
  101 +solution and measure the percentage of the solution that is found by the players in a game session. Second, this problem can easily be translated into a color
 102 +matching game, which takes advantage of human perceptual abilities. Indeed, since the colored edges between the vertices are given implicitly by the
 103 +colors of the vertices, it is possible to show the players only the colored vertices. To solve the problem, the players have to find the largest sets
 104 +of circles with colors in common, for all possible subsets of colors.
  105 +
 106 +Note that it is not our goal to compare the performance of players with that of computers on this problem. With a limited number of colors
 107 +(six in our tests), the exact algorithm can solve the problem in seconds. For this study, we required a
 108 +problem that was structured enough that we could
 109 +easily calculate the optimal solution and evaluate both the performance of the players depending on which features were on or off and the effect of the
 110 +different features on the quality of the solutions.
  111 +
  112 +
  113 +\section{Presentation of the game}
  114 +
  115 +\subsection{Goal of the game}
  116 +
 117 +The main objective of the game is to build {\em sequences} ({\em i.e.} sets) of circles (circles represent vertices of the graph) that (i) are as long as possible
 118 +and (ii) have as many colors in common
 119 +as possible. Circles used by the players to build the sequences either come from random packages bought from the system or from other players
 120 +through the market. The sequences can then be sold to the system for an amount of game money determined by a scoring function that takes into
 121 +account the length of the sequence and its number of colors in common.
  122 +
  123 +%\begin{minipage}[c][8cm]{\textwidth/2}
  124 +%\begin{figure}
  125 +% \centering
  126 +%\smartdiagramset{
  127 +%uniform color list=teal!40!yellow for 4 items,
  128 +%circular final arrow disabled=true,
  129 +%circular distance=2.25cm,
  130 +%arrow tip=to,
  131 +%arrow line width=2pt,
  132 +%additions={
  133 +%additional item bottom color=orange!60!yellow,
  134 +%additional item border color=gray,
  135 +%additional item shadow=drop shadow,
  136 +%additional item offset=0.9cm,
  137 +%additional item font=\small,
  138 +%additional arrow line width=2pt,
  139 +%additional arrow tip=stealth,
  140 +%additional arrow color=orange!60!yellow,
  141 +%}
  142 +%}
  143 +%\smartdiagramadd[circular diagram]{Buy random bags, Build sequences, Sell sequences to the system, Receive in-game money}{
  144 +%left of module1/Buy single circles,left of module3/Buyout sequences, right of module1/Sell single circles, right of module3/Complete challenges
  145 +%}
  146 +%\smartdiagramconnect{stealth-}{module2/additional-module1}
  147 +%\smartdiagramconnect{stealth-}{module2/additional-module2}
  148 +%\smartdiagramconnect{-stealth}{module1/additional-module3}
  149 +%\smartdiagramconnect{stealth-}{module4/additional-module3}
  150 +%\smartdiagramconnect{stealth-}{module4/additional-module4}
  151 +
  152 +
  153 +%\caption{test}
  154 +%\end{figure}
  155 +%\end{minipage}
  156 +
  157 +
  158 +\begin{figure}
  159 + \centering
  160 +\begin{tikzpicture}[auto]
  161 +\tikzset{
  162 + normalNode/.style={rectangle,rounded corners, draw=gray, top color=white!90!gray, font=\small,
  163 + bottom color=teal!40!yellow,thick, inner sep=0.7em,
  164 + minimum size=1em, text centered, minimum width=2cm,
  165 + drop shadow, text width=1.75cm},
  166 + marketNode/.style={rectangle,rounded corners, draw=gray, top color=white!90!gray, font=\small,
  167 + bottom color=orange!60!yellow, thick, inner sep=0.7em,
  168 + minimum size=1em, text centered, minimum width=2cm,
  169 + drop shadow, text width=1.75cm},
  170 + chalNode/.style={rectangle,rounded corners, draw=gray, top color=white!90!gray, font=\small,
  171 + bottom color=blue!30!teal, thick, inner sep=0.7em,
  172 + minimum size=1em, text centered, minimum width=2cm,
  173 + drop shadow, text width=1.75cm},
  174 + myright/.style={-{Stealth[length=4mm]}, color=gray, line width=0.1cm,
  175 + draw, shorten <=0.2cm,shorten >=0.2cm, bend right},
  176 + myleft/.style={-{Stealth[length=4mm]}, color=gray, line width=0.1cm,
  177 + draw, shorten <=0.2cm,shorten >=0.2cm, bend left},
  178 +}
  179 +
  180 +\node[normalNode] (rdmBag) {Buy random bags};
  181 +\node[normalNode] at ([yshift=-2.5cm] 0:2.5cm) (money) {Receive in-game money};
  182 +\node[normalNode] at ([yshift=-2.5cm] 180:2.5cm) (build) {Build sequences};
  183 +\node[normalNode] at ([yshift=-2.5cm] 270:2.5cm) (sellSeq) {Sell sequences to the system};
  184 +\node[marketNode] at (-3, 0) (buy) {Buy single circles};
  185 +\node[marketNode] at (-3, -5) (buyout) {Buyout sequences};
  186 +\node[marketNode] at (3, 0) (sellSNP) {Sell single circle};
  187 +\node[chalNode] at (3, -5) (chal) {Complete challenges};
  188 +
  189 +\path[myright] (rdmBag) to (build);
  190 +\path[myright] (build) to (sellSeq);
  191 +\path[myright] (sellSeq) to (money);
  192 +\path[myright] (money) to (rdmBag);
  193 +\path[myright] (buy) to (build);
  194 +\path[myleft] (buyout) to (build);
  195 +\path[myleft] (sellSNP) to (money);
  196 +%\path[myleft] (rdmBag) to (sellSNP);
  197 +\path[myright] (chal) to (money);
  198 +\end{tikzpicture}
 199 +\caption{Gameplay loop diagram. The green boxes represent the actions that the players can take when there are no challenges and no market. The orange (resp. blue) boxes
 200 +represent the actions that become available when the market (resp. the challenge system) is present, and how they interact with the gameplay.}
  201 +\end{figure}
  202 +
  203 +
  204 +\subsection{Scoring function}
  205 +
 206 +The score of a sequence sold to the system is equal to $baseScore_{n} \times seqLength^2$, where $baseScore_{n}$ is a base score depending on the number
 207 +of colors in common (see Table~\ref{tab_baseScore}) and $seqLength$ is the length of the sequence. The base scores were calculated from the exact solution
 208 +of the graph that was generated for the tests (see section Generating the graph for a description of the graph that was used) in such a way as to give a reward
 209 +that is proportional to the difficulty of building the sequence. More precisely,
 210 +we calculated the average length $L_n$ of all solutions for each number of colors $n$. The base score is simply the reciprocal of this average ($1/L_n$) multiplied
 211 +by a balancing factor (505 in our case). The balancing factor was chosen so that a sequence of length 10 with only one color in common is worth 500,
 212 +which is exactly the price of two random packages of circles. Also notice that the value of a sequence grows quadratically with its length, which
 213 +encourages players to build the longest possible sequences.
  214 +
  215 +\begin{table}[h]
  216 +\begin{center}
  217 +\begin{tabular}{cc}\hline
 218 +Number of colors & Base score\\\hline
  219 +0 & 0\\
  220 +1 & 5\\
  221 +2 & 14\\
  222 +3 & 26\\
  223 +4 & 40\\
  224 +5 & 55\\
  225 +6 & 72\\\hline
  226 +\end{tabular}
  227 +\end{center}
  228 +\caption{Value of the base score depending on the number of colors in common}\label{tab_baseScore}
  229 +\end{table}
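
As a sanity check, the scoring function is small enough to be expressed directly; a sketch in Python using the values of Table~\ref{tab_baseScore}:
\begin{verbatim}
BASE_SCORE = {0: 0, 1: 5, 2: 14, 3: 26, 4: 40, 5: 55, 6: 72}

def sequence_score(n_colors_in_common, seq_length):
    # The score grows quadratically with the sequence length.
    return BASE_SCORE[n_colors_in_common] * seq_length ** 2

# A length-10 sequence with one color in common is worth
# 5 * 10^2 = 500, i.e. the price of two random bags.
assert sequence_score(1, 10) == 500
\end{verbatim}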
  230 +
  231 +\subsection{Game interface}
  232 +
  233 +The game client and the server were built in Java 1.7.
 234 +As shown in Figure~\ref{fig_interface}, the game interface can be divided into three parts: the player information panel, the game panel and the market panel.
  235 +
  236 +\subsubsection{A: Player information panel}
  237 +
 238 +This panel contains information on the player's wallet and current level, and has three buttons allowing the player to open
 239 +dialogs showing information on the current challenge, the skills (see section Skills for a description of the available skills) and the leaderboard.
 240 +One experience point is given to the player for each game dollar that he or she wins. The player can lose game money, but cannot lose experience points
 241 +(experience points only go up). Every time a player levels up, he or she gets a skill point that can be used to improve any of the skills.
  242 +
  243 +\subsubsection{B: Game panel}
  244 +
  245 +The first component of the game panel is the 'My sequence' panel, which shows the current sequence that is being built by the player. The maximum size
  246 +of a sequence is 10. Colors in common in the
  247 +sequence are indicated by a thick black border surrounding the colors in the circles. Players can use the arrows to switch between the different sequence slots
  248 +(2 sequence slots are available at the start of the game). The current value of the sequence is shown at the right, and the price for adding one more circle
  249 +with the same colors in common is shown right below in gray. Finally, the sell button allows the player to sell the current sequence to the system: the sequence then
  250 +disappears and the money is given to the player. Selling a sequence is equivalent to submitting a solution to the system.
  251 +
  252 +The second component is the 'My hand' panel, which can contain up to 20 circles. Players can add a circle to the sequence by clicking on it. Circles are
  253 +represented by their colors and by a price label (in a black box). The price corresponds to the current value of the circle on the market. Clicking on
  254 +the price label sells the circle to the highest bidder on the market. Circles that are bought from a random package or from other players are sent to
  255 +the hand.
  256 +
 257 +The 'Awaiting to get sold' panel is where circles are sent just before being sold to the highest bidder. If the bid disappears before the transaction is completed,
 258 +the 'sold' circle stays there. The player can then click on it to cancel the sale and put the circle back in the hand.
  259 +
 260 +Finally, the bottom panel is a news feed showing information on the game state, such as the remaining time to complete the challenge and the last transactions
 261 +completed by the player.
  262 +
  263 +\subsubsection{C: Market panel}
  264 +
  265 +At the top of the market panel, buttons allow the player to create bids for circles or to buy random packages (or bags) of circles.
 266 +The 'Random bag' costs \$250 and contains five circles that tend to have few colors. The 'Premium bag' costs \$500 and contains five circles
 267 +with a higher chance of having many colors.
  268 +
 269 +Right below the buttons is the 'Automatic bids' panel, which allows the player to place automatic bids for circles matching the sequences that
 270 +he or she is building. The percentage of profit used to price the automatic bids can be set with the slider.
 271 +The profit is defined as the money the player would make by adding one more circle with the same colors to the current sequence (the difference between the gray and black prices
 272 +above the Sell button).
  273 +
  274 +The 'My bids' panel shows all the bids that the player currently has on the market. The bid price is shown below the circle (in the black box). On the right side
  275 +of the circle is the number of sequences with the same colors that the player can buy from other players (in the blue box).
  276 +Clicking on the blue box opens a window showing the list of sequences that can be bought. Buying a sequence from another player is called a 'buyout' (see
  277 +the following subsection for a more detailed description of buyouts).
  278 +
  279 +Finally, the last panel at the bottom shows the last circle or sequence that was bought by the player.
  280 +
  281 +\begin{figure*}[htbp]
  282 + \begin{center}
  283 + \includegraphics[width=\textwidth]{Figs/interface_mod.png}
  284 + \vspace{0cm}
  285 + \caption{The game interface, separated in three panels. Panel A (inside the red box) is the player information panel.
  286 + Panel B (inside the green box) is the game panel. Panel C (inside the orange box) is the market panel.
  287 + }\label{fig_interface}
  288 + \end{center}
  289 +\end{figure*}
  290 +
  291 +\subsection{Market}
  292 +
 293 +The market has three functions: (i) allowing the players to exchange circles through a bidding system, (ii) allowing players to buy sequences built and sold by other
 294 +players so that they can be improved, and (iii) merging sequences of length 10 into super circles that are then put back in the game.
  295 +
 296 +For every subset of colors, the server keeps a list of all the bids that are currently on the market. The value of
 297 +the highest bid on the market is shown below every circle in a player's possession. When a circle is sold by a player, it is sent through
 298 +the server to the highest bidder.
  299 +
 300 +Buyouts work differently. Players cannot bid on sequences, but the server holds for two minutes all the sequences that have been sold by the players.
 301 +During those two minutes, other players can buy a sequence for a price equal to 150\% of its initial score. When a buyout
 302 +is made, the bonus game money is sent to the player who initially sold the sequence to the system.
  303 +
 304 +Finally, the game system creates a super circle every time a sequence of length 10 is sold by a player. A super circle of level 2 (representing 10 circles)
 305 +counts as two circles when put in a sequence. Super circles can be of any level (a sequence of 10 super circles of level 2 forms a super circle of level 3, and so on).
 306 +The idea behind the super circles was to remove the limitation on the maximum sequence size imposed by the game interface.
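
Under this scheme, a super circle of level $k$ stands for $10^{k-1}$ ordinary circles (a plain circle being level 1). The effective length of a sequence, as used later when reporting sequence lengths ``considering super circles'', can be sketched as follows (Python; the list-of-levels representation is a hypothetical illustration):
\begin{verbatim}
def effective_length(levels):
    # levels: one entry per slot in the sequence; 1 for an
    # ordinary circle, k for a super circle of level k.
    # A level-k super circle stands for 10**(k-1) base circles.
    return sum(10 ** (k - 1) for k in levels)

# Ten level-2 super circles represent 100 circles and are merged
# by the system into a single level-3 super circle.
assert effective_length([2] * 10) == 100
\end{verbatim}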
  307 +
  308 +\subsection{Skills}
  309 +
 310 +Four different skills were implemented in the game. One skill point is awarded to a player when he or she levels up, and it can then be put into any
 311 +of the four skills. The maximum level of each skill is six (there are six levels of bonuses). Each skill was put in the game as a way to
 312 +guide the players towards actions that are beneficial to the system or to the other players:
  313 +
  314 +\begin{itemize}
  315 +\item {\em Buyout King}: lowers the price of buying a sequence from another player (goal: encourage buyouts);
  316 +\item {\em Color Expert}: gives a bonus to selling sequences that have more than one color in common (goal: push players to build more multicolored sequences);
  317 +\item {\em Sequence Collector}: gives an additional sequence slot (goal: give more space to encourage the creation of longer sequences with more colors in common);
  318 +\item {\em Master Trader}: gives a bonus to selling circles to other players (goal: promote the selling of individual circles).
  319 +\end{itemize}
  320 +
  321 +\subsection{Challenge system}
  322 +
  323 +We implemented a challenge system that analyzes the recent actions of the players and creates a new challenge every five minutes. The five challenge
  324 +types are:
  325 +
  326 +\begin{itemize}
  327 +\item {\em Sell/buy circles}: requires the players to sell or buy circles;
  328 +\item {\em Buyout sequences}: requires the players to buy sequences from other players;
  329 +\item {\em Minimum number of colors}: requires the players to sell sequences with at least a certain number of colors in common;
  330 +\item {\em Minimum sequence length}: requires the players to sell sequences with a minimum sequence length;
  331 +\item {\em Specific colors in common}: requires the players to sell sequences with a specific subset of colors in common.
  332 +\end{itemize}
  333 +
 334 +Basically, the system continuously monitors the activities of the players and decreases or increases the probability of each challenge type.
 335 +The next challenge is then selected by multinomial sampling over these probabilities. The number of times $T$ that the challenge-related action must be
 336 +completed is selected randomly between 2 and 4. The prize awarded for completing the challenge is equal to \$$1500 \times T$.
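
A minimal sketch of this selection step (Python; the weight vector stands in for the monitored probabilities, and all names are hypothetical):
\begin{verbatim}
import random

CHALLENGE_TYPES = ["sell_buy", "buyout", "min_colors",
                   "min_length", "specific_colors"]

def next_challenge(weights):
    # weights: one non-negative weight per challenge type,
    # continuously adjusted from the players' recent actions.
    kind = random.choices(CHALLENGE_TYPES, weights=weights, k=1)[0]
    t = random.randint(2, 4)   # times the action must be completed
    prize = 1500 * t           # prize in game dollars
    return kind, t, prize
\end{verbatim}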
  337 +
  338 +\section{Experiments}
  339 +
  340 +\subsection{Independent and dependent variables}
  341 +
  342 +In the context of this study, there were three independent variables: the market (present; not present), the skills (present; not present) and the
  343 +challenges (present; not present). Instead of trying all 8 possible combinations of independent variables, we decided to focus on four game conditions:
  344 +\begin{enumerate}
  345 +\item All features present (or A)
  346 +\item Everything except the market, hereafter referred to as ``No Market'' (or NM)
  347 +\item Everything except the skills, hereafter referred to as ``No Skills'' (or NS)
 348 +\item Everything except the challenges, hereafter referred to as ``No Challenges'' (or NC)
  349 +\end{enumerate}
 350 +Focusing on these four playing conditions allowed us to repeat each experiment more times than if we had tested all eight combinations, with different groups of players.
 351 +Moreover, the goal was to evaluate the importance of each game feature by removing the features one at a time and measuring
 352 +the effect on the results obtained by the players.
  353 +
  354 +As for the dependent variables, we were interested in measuring the following:
  355 +\begin{enumerate}
  356 +\item Percentage of the problem solved
  357 +\item Total experience points earned by the players
  358 +\item Average sequence length of the sequences created by the players
  359 +\item Average number of colors in common of the sequences created by the players
  360 +\item Proportion of sequences of more than one color in common created by the players
  361 +\item Number of circles sold individually to another player
  362 +\item Number of sequences bought from other players (buyouts)
  363 +\end{enumerate}
  364 +
  365 +\subsection{Game sessions}
  366 +
 367 +We recruited 120 people in total to test our game. We divided the participants into groups of 10 and repeated each of the four
 368 +game conditions presented in the previous subsection three times.
  369 +%Note that for every test session, we had to deal with one or two (maximum) last minute cancellation(s).
  370 +%In those cases, we replaced the missing player(s) by a lab member, who had played the game before.
 371 +Each participant was playing the game for the first time, except for a few people who were invited as replacements to deal with last-minute cancellations.
  372 +%(except for the replacement(s)).
  373 +%Having mostly unexperienced players was important in order to make sure that there was no bias
  374 +%coming from the experience gained by the players if they played a second time.
 375 +Before starting each game session, the players were shown a document explaining
 376 +the rules of the game and the interface. They were also asked to fill in a questionnaire so that we could collect information on the participants, such as their age,
 377 +their puzzle-solving abilities and their experience with video games. For all the experiments, the game session lasted 45 minutes.
  378 +
  379 +\subsection{Generating the graph}
 380 +We generated one random colored multigraph that we used for all 12 tests. Since the edges in the graph depend entirely on the colors of the vertices, it is
 381 +sufficient to generate only the colored vertices. For the tests, a graph containing 300 vertices and 6 different colors was generated. To randomly select the number
 382 +of colors for each vertex, a geometric distribution with parameter $p = 0.5$ was used, so that vertices with many colors are rarer. Once the number of colors was
 383 +selected for a vertex, the set of colors was selected uniformly.
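
The generation procedure can be sketched as follows (Python; capping the geometric draw at the total number of colors is our assumption, since a vertex cannot have more than $|C|$ distinct colors):
\begin{verbatim}
import random

def generate_vertices(n_vertices=300, n_colors=6, p=0.5):
    # One color set per vertex; the edges are implicit.
    vertices = []
    for _ in range(n_vertices):
        # Geometric number of colors: each additional color is
        # added with probability 1 - p, so many-colored vertices
        # are rare.
        k = 1
        while random.random() > p and k < n_colors:
            k += 1
        vertices.append(frozenset(random.sample(range(n_colors), k)))
    return vertices
\end{verbatim}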
  384 +
  385 +%\subsection{Experiment 1: all features}
  386 +
  387 +%In the first experiment, the participants played the game with all the feature available to them, {\em i.e.} the market, the challenge system and the skills. This experiment
  388 +%serves as the control.
  389 +
  390 +%\subsection{Experiment 2: no market}
  391 +
  392 +%In order to evaluate the effect of the market on the quality of the solutions produced by the players, the market was completely removed for this experiment. The players were not
  393 +%able to trade SNPs nor sequences of SNPs. The other features (the skills and challenges) were available.
  394 +
  395 +%\subsection{Experiment 3: no challenges}
  396 +
  397 +%For this experiment, the challenge system was removed from the game to evaluate its usefulness in guiding the players. The other features (the market and the skills) were available.
  398 +
  399 +%\subsection{Experiment 4: no skills}
  400 +
  401 +%In the fourth experiment, the skills were completely removed during the game sessions. The goal of this last experiment was to analyze the effect on the results when the players
  402 +%did not have the ability to choose one or many specializations and the bonuses attached to them.
  403 +
  404 +\section{Results and Discussion}
  405 +
  406 +\subsection{Testing hypothesis 1: the efficiency of the market}
  407 +
  408 +The market system we implemented in the game allows the players to exchange circles and partial solutions (in the form of buyouts). The main goal of the market
  409 +is to help the players in building longer sequences.
  410 +
  411 +%\begin{tabular}{ccc}\hline
  412 +%Test & Average sequence length & Average sequence length (super)\\
  413 +%All & 5.62 & 6.56\\
  414 +%All (2) & 5.40 & 6.23\\
  415 +%No skills & 6.08 & 7.66\\
  416 +%No challenges & 4.86 & 6.04\\
  417 +%No market & 4.40 & 4.90\\
  418 +%\end{tabular}
  419 +
  420 +\begin{figure}[htbp]
  421 + \begin{center}
  422 + \includegraphics[width=\halfWidth]{Figs/averageSeqLength.pdf}
  423 + \vspace{0cm}
 424 + \caption{Average sequence length for every game session, both without and with the super circles taken into account (e.g. a super circle
 425 + of level 2 in a sequence represents 10 circles in the solution). 'A', 'A-2' and 'A-3' represent the tests with all the features on; 'NS', 'NS-2' and 'NS-3' represent the
 426 + tests without the skills; 'NM', 'NM-2' and 'NM-3' represent the tests without the market; 'NC', 'NC-2' and 'NC-3' represent the tests without the challenges.
  427 + }\label{fig_averageSeqLength}
  428 + \end{center}
  429 +\end{figure}
  430 +
 431 +As shown in Figure~\ref{fig_averageSeqLength}, the three game sessions with the lowest average sequence length (over all the sequences sold by
 432 +all the players) are the ones that were played without the market, with averages of $4.40$ for NM, $4.19$ for NM-2 and $4.63$ for NM-3.
 433 +Even when we consider the super circles (the special circles that actually stand for 10 circles combined into one), the average sequence lengths for those
 434 +three sessions are still the lowest, with values of $4.90$ for both NM and NM-2, and $5.40$ for NM-3.
  435 +
  436 +%We performed statistical tests to make sure that the observed differences in the means are statistically significant.
 437 +Since the distribution of the
 438 +lengths of all the sequences sold to the system during a game session does not follow a normal distribution, we used a non-parametric test (Kruskal-Wallis) to
 439 +verify whether the sequence lengths of the different game sessions appear to come from the same distribution.
 440 +The Kruskal-Wallis test revealed a significant effect of the game conditions on the sequence lengths, both without considering super circles
 441 +(${\chi}^2(11) = 1391.7$, $p < 2.2E-16$) and when considering them (${\chi}^2(11) = 1388.4$, $p < 2.2E-16$).
  442 +
 443 +We then performed a post hoc test (Dunn's test) for pairwise comparisons between all the groups. With or without considering super circles, all the game conditions
 444 +were shown to be significantly different ($p < 0.01$), except the few shown in Table~\ref{tab_Dunn}. Note that the strongest similarities are found among
 445 +the three 'All' groups and among the three 'No market' groups. Some of the 'No skills' experiments are similar to the 'All' groups, which could indicate
 446 +that the presence of the skills has a very limited effect on the sequence length. The NC experiment is similar to two 'No market' groups, but this
 447 +can be explained by the fact that the players in the NC experiment were very weak (as can be seen from the total experience gained during that session in
 448 +Figure~\ref{fig_totalXP}).
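
This analysis can be reproduced along the following lines (Python sketch; we assume one list of sequence lengths per session, and use the Dunn post hoc test from the scikit-posthocs package):
\begin{verbatim}
from scipy import stats
import scikit_posthocs as sp

def compare_sessions(lengths_by_session):
    # lengths_by_session: dict mapping a session name to the list
    # of lengths of all sequences sold during that session.
    groups = list(lengths_by_session.values())
    h, p = stats.kruskal(*groups)     # omnibus test
    dunn = sp.posthoc_dunn(groups)    # pairwise p-value matrix
    return h, p, dunn
\end{verbatim}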
  449 +
  450 +\setlength{\tabcolsep}{4pt}
  451 +\begin{table}[h]
  452 +\begin{center}
  453 +\begin{tabular}{c|cccccccc}\hline
  454 + & A-2 & A-3 & NS & NS-2 & NS-3 & NM & NM-2 & NM-3\\\hline
  455 + A & n/s & n & & n/s & & & & \\
  456 + A-2 & & n & & n/s & & & & \\
  457 + A-3 & & & n/s & & & & & \\
  458 + NC & & & & & & n & & n/s \\
  459 + NC-3 & & & & & n/s & & & \\
  460 + NM & & & & & & & n/s & n \\\hline
  461 +\end{tabular}
  462 +\end{center}
  463 +\caption{Similar groups of sequence length distributions, as reported by Dunn's test. An 'n' in the table represents a similar pair when not considering
  464 + super circles, and an 's' in the table represents a similar pair when considering super circles.}\label{tab_Dunn}
  465 +\end{table}
  466 +
  467 +%WILL HAVE TO MOVE THE FOLLOWING SENTENCES TO HYPOTHESIS 4 SECTION
  468 +%Notice that even in
  469 +%the two sessions for which we had the smallest total experience (see Figure~\ref{fig_totalXP}), both averages of sequence lengths were larger than the averages
  470 +%of the game session without the market. Those observations confirm that the market is helping the players in the creation of longer sequences.
  471 +
  472 +\subsection{Testing hypothesis 2: the benefits of a skill system}
  473 +
 474 +We implemented the skill system for two reasons: (i) to give the players more incentive to accumulate experience points as quickly as possible, because
 475 +the reward for leveling up is an additional skill point, and (ii) to indirectly influence
 476 +the players towards actions that either improve the solutions collected by the system or help the other players (which in the end also
 477 +improves the solutions). In our game, two skills were related to the market ({\em Buyout King} and {\em Master Trader}) and two skills were related to building
 478 +sequences ({\em Color Expert} and {\em Sequence Collector}). In the following paragraphs, we analyze how those four skills affected the strategies and actions
 479 +of the players. Note that when players lost all their money in the game, they had to start a new game. In our results, we count both
 480 +games as if they were played by different players, since a player who restarts might choose a different set of skills the second time. This explains
 481 +why the total number of players is larger than 120. Players of the 'No skills' game condition were automatically counted in the without-skill
 482 +group.
  483 +
  484 +\subsubsection{Buyout King}
  485 +
  486 +The {\em Buyout King} skill allows the players to reduce the price of buying a sequence from another player (which we call a buyout). The idea behind this skill
  487 +was to encourage the players to buy small sequences built by other players so that they could improve them before selling them back to the system.
  488 +In other words, a buyout is the action of buying a partial solution made by another player in order to improve it.
  489 +
  490 +\begin{figure}[htbp]
  491 + \begin{center}
 492 + \includegraphics[width=\dimexpr\halfWidth-1in\relax]{Figs/boxplot_BK.pdf}
  493 + \vspace{0cm}
  494 + \caption{Boxplot of the number of buyouts made by players with (37 players) and without (66 players) the {\em Buyout King} skill.
  495 + }\label{fig_boxplotBK}
  496 + \end{center}
  497 +\end{figure}
  498 +
 499 +Figure~\ref{fig_boxplotBK} shows statistics for the players who put at least one skill point in the {\em Buyout King} skill and the players who did not
 500 +use the skill at all. We were interested in the number of buyouts that the players with the skill were making compared to the rest of the players. Note that
 501 +since this skill is related to the market, we did not consider the 'No market' sessions for these results.
  502 +
 503 +The median value for the players who spent a skill point on the {\em Buyout King} skill is 15, while the median value for
 504 +the players without the skill is 1.5, indicating that half of the players without the skill did not use the
 505 +buyout at all or used it only once. Since the number of buyouts does not follow a normal distribution (the Shapiro-Wilk test rejected the null hypothesis
 506 +with $p = 2.6E-10$), we used a Mann-Whitney U test to compare the medians of the two groups. We found a significant effect of the presence
 507 +of this skill on the medians ($U = 1629.5$, $p = 0.004$, effect size $r = 0.28$).
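
The same recipe is used for all the pairwise skill comparisons below; a sketch (Python), where the effect size $r = |z|/\sqrt{N}$ is recovered from the normal approximation of $U$ (ties are ignored for simplicity):
\begin{verbatim}
import math
from scipy import stats

def mann_whitney_with_effect(a, b):
    u, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    n1, n2 = len(a), len(b)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    r = abs(z) / math.sqrt(n1 + n2)   # effect size
    return u, p, r
\end{verbatim}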
  508 +
  509 +\subsubsection{Master Trader}
  510 +
 511 +The {\em Master Trader} skill allows the players to get bonus money, in addition to the regular market price, for each circle they sell individually. This skill was put
 512 +in the game to increase the activity on the market by encouraging players to send the circles that they do not need to the players who need them the most.
  513 +
  514 +\begin{figure}[htbp]
  515 + \begin{center}
 516 + \includegraphics[width=\dimexpr\halfWidth-1in\relax]{Figs/boxplot_MT.pdf}
  517 + \vspace{0cm}
  518 + \caption{Boxplot of the number of circles sold individually by players with (33 players) and without (70 players) the {\em Master Trader} skill.
  519 + }\label{fig_boxplotMT}
  520 + \end{center}
  521 +\end{figure}
  522 +
  523 +Figure~\ref{fig_boxplotMT} shows statistics for the players who have put at least one skill point in the {\em Master Trader} skill and all the other players.
  524 +We were interested in comparing the number of individual circles that were sold to another player for the two different categories. Once again, since this
  525 +skill depends on the presence of the market, we did not consider the 'No market' experiments in the results shown.
  526 +
 527 +The median value for the players who had selected the {\em Master Trader} skill (73) is more than three times larger than the one for the rest of the players (21.5).
 528 +Since the number of circles sold individually does not follow a normal distribution (the Shapiro-Wilk test rejected the null hypothesis
 529 +with $p = 5.3E-13$), we used a Mann-Whitney U test to compare the medians of the two groups. We found a significant effect of the presence
 530 +of the {\em Master Trader} skill on the medians ($U = 1633.5$, $p = 7.2E-4$, effect size $r = 0.33$).
  531 +
  532 +\subsubsection{Color Expert}
  533 +
  534 +The {\em Color Expert} skill gives a bonus multiplier to the scoring function for sequences with more than one color in common. This skill was implemented in
  535 +order to give extra motivation to build sequences with many colors in common, since they are harder to build. Indeed, more focus is needed from the player to match
  536 +many circles with more than one color in common.
  537 +
  538 +\begin{figure}[htbp]
  539 + \begin{center}
 540 + \includegraphics[width=\dimexpr\halfWidth-1in\relax]{Figs/boxplot_CE.pdf}
  541 + \vspace{0cm}
  542 + \caption{Boxplot of the proportion of sequences with more than one color in common sold by players with (94 players) and without (49 players) the {\em Color Expert} skill.
  543 + }\label{fig_boxplotCE}
  544 + \end{center}
  545 +\end{figure}
  546 +
 547 +In Figure~\ref{fig_boxplotCE}, we show the comparison of the proportion of multicolored sequences sold by players with the {\em Color Expert} skill
 548 +and players without it. Interestingly, the median values for both groups are almost identical: 0.317 (or 31.7\%) for the players with the skill and
 549 +0.313 (or 31.3\%) for the players without the skill. The distribution of the proportion of multicolored sequences was not normal (the Shapiro-Wilk
 550 +test rejected the null hypothesis with $p = 3.0E-5$), so we used a Mann-Whitney U test to compare the medians. As expected, the test failed to reject
 551 +the null hypothesis that the values were sampled from the same distribution ($p = 0.89$).
  552 +
 553 +We conclude that the {\em Color Expert} skill does not appear to affect the behavior of the players. This can be explained by the fact that one of the main goals of
 554 +the game is to create sequences with as many colors in common as possible, whether or not the player selects this skill.
  555 +
  556 +\subsubsection{Sequence Collector}
  557 +
 558 +Every point in the {\em Sequence Collector} skill gives an additional slot to build a sequence. Because of the limited size of the player's hand and the
 559 +limited number of sequence slots, it is hard to build long sequences with many colors in common. We added the {\em Sequence Collector} skill to the game
 560 +to help with both the sequence length and the number of colors in common.
  561 +
  562 +\begin{figure}[htbp]
  563 + \begin{center}
 564 + \includegraphics[width=\dimexpr\halfWidth-1in\relax]{Figs/boxplot_SC_seqLength.pdf}
  565 + \vspace{0cm}
  566 + \caption{Boxplot of the average sequence length of sequences built by players with (60 players) and without (83 players) the {\em Sequence Collector} skill.
  567 + }\label{fig_boxplotSC_seqLength}
  568 + \end{center}
  569 +\end{figure}
  570 +
 571 +We first compared the average sequence length of sequences built by players with the {\em Sequence Collector} skill and those built by the rest of the
 572 +players (see Figure~\ref{fig_boxplotSC_seqLength}). While the median value for the players without the skill (5.63) is slightly larger than the one for the players
 573 +with the skill (5.12), the averages of both groups are actually similar (5.61 and 5.56, in the same order). Since the distribution of the average
 574 +sequence lengths was not normal (the Shapiro-Wilk test rejected the null hypothesis with $p = 0.0057$), we used a Mann-Whitney U test to compare the
 575 +medians of both groups. The test failed to reject the null hypothesis that the values were sampled from the same distribution ($p = 0.69$). Thus,
 576 +there is no evidence that the {\em Sequence Collector} skill helps players build longer sequences. This tends to confirm what we mentioned earlier (in Section
 577 +Testing hypothesis 1): the presence of the skills in general does not seem to affect the length of the sequences built by players.
 578 +Once again, this can be explained by the fact that selling long sequences is one of the two main goals of the game, and sequence length is one of the main
 579 +components of the scoring function.
  580 +
  581 +\begin{figure}[htbp]
  582 + \begin{center}
 583 + \includegraphics[width=\dimexpr\halfWidth-1in\relax]{Figs/boxplot_SC_nbCols.pdf}
  584 + \vspace{0cm}
  585 + \caption{Boxplot of the average number of colors in common of sequences built by players with (60 players) and without (83 players) the {\em Sequence Collector} skill.
  586 + }\label{fig_boxplotSC_nbCols}
  587 + \end{center}
  588 +\end{figure}
  589 +
 590 +We then compared the average number of colors in common of the sequences built by players with and without the {\em Sequence Collector} skill (see Figure~\ref{fig_boxplotSC_nbCols}).
 591 +The median value for the players without the skill (1.58) is 12\% lower than the one for the players with the skill (1.80). Since the average
 592 +number of colors in common does not follow a normal distribution (the Shapiro-Wilk test rejected the null hypothesis
 593 +with $p = 1.2E-7$), we used a Mann-Whitney U test to compare the medians of the two groups, and we found a significant effect of the presence
 594 +of this skill on the medians ($U = 3113$, $p = 0.01$, effect size $r = 0.21$). The {\em Sequence Collector} skill thus helps players
 595 +build sequences with more colors, by allowing them to store unfinished multicolored sequences in the additional slots until they are able to complete them.
  596 +
  597 +\subsection{Testing hypothesis 3: the usefulness of the challenge system}
  598 +
 599 +The challenge system was implemented to analyze the current state of the game and guide the players towards actions that are currently needed. As mentioned
 600 +previously, five different challenge types were implemented in the game (see Section Challenge system for the complete list). To analyze the effect
 601 +of the challenges on the way the participants played, we compared, for each challenge type, the relevant game statistics during the challenge
 602 +with those from the rest of the game session (when a different challenge was available).
  603 +
 604 +Note that we only consider here the nine sessions in which the challenges were present, and that the Sell/buy and Buyout challenges were disabled during the
 605 +sessions without the market.
  606 +
  607 +\subsubsection{Minimum number of colors challenge}
  608 +
  609 +To measure the effect of the {\em Minimum number of colors challenge} on the game, we compared the average number of colors of the sequences built by the players
  610 +when the challenge was active and when it was not. The different averages for each game session are presented in Figure~\ref{fig_minNbCols}.
  611 +In all the game sessions except A-3 and NM, the average number of colors in common is higher when the challenge is active.
  612 +
  613 +\begin{figure}[htbp]
  614 + \begin{center}
  615 + \includegraphics[width=\halfWidth]{Figs/minNbCols.pdf}
  616 + \vspace{0cm}
  617 + \caption{Average number of colors in the sequences with and without the {\em Minimum number of colors challenge} active. 'A', 'A-2' and 'A-3'
 618 + represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represent
  619 + the tests without the market.
  620 + }\label{fig_minNbCols}
  621 + \end{center}
  622 +\end{figure}
  623 +
 624 +The distribution of the averages of the number of colors in common across all the game sessions considered here is normal (Shapiro-Wilk $p = 0.79$),
 625 +allowing us to use a Welch's t-test to compare the means of the two groups, {\em i.e.} 1.96 colors in common during the challenge versus 1.76 during
 626 +the rest of the time. The test confirmed a significant effect of the presence of the challenge on the average number of colors in common
 627 +($t(16)=2.19$, $p=0.04$, Cohen's $d = 1.03$).
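
The Welch comparisons in this section follow the same pattern; a sketch (Python), with Cohen's $d$ computed from a pooled standard deviation (our assumption, since several variants of $d$ exist):
\begin{verbatim}
import math
from scipy import stats

def welch_with_cohens_d(a, b):
    t, p = stats.ttest_ind(a, b, equal_var=False)  # Welch's t-test
    n1, n2 = len(a), len(b)
    s1, s2 = stats.tstd(a), stats.tstd(b)
    pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2)
                       / (n1 + n2 - 2))
    d = (stats.tmean(a) - stats.tmean(b)) / pooled  # Cohen's d
    return t, p, d
\end{verbatim}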
  628 +
  629 +\subsubsection{Minimum sequence length challenge}
  630 +
  631 +In order to analyze the effect that the {\em Minimum sequence length challenge} had on the game, we compared the average sequence length during the challenge
  632 +and when a different challenge was active for all the game sessions. As shown in Figure~\ref{fig_minSeqLength}, the presence of this challenge increased
  633 +the average sequence length in all the game sessions except the three sessions with all the features.
  634 +
  635 +\begin{figure}[htbp]
  636 + \begin{center}
  637 + \includegraphics[width=\halfWidth]{Figs/minSeqLength.pdf}
  638 + \vspace{0cm}
  639 + \caption{Average sequence length with and without the {\em Minimum sequence length challenge} active. 'A', 'A-2' and 'A-3'
  640 + represent the tests with all the features present, 'NS', 'NS-2' and 'NS-3' represent the tests without the skills, and 'NM', 'NM-2' and 'NM-3' represents
  641 + the tests without the market.
  642 + }\label{fig_minSeqLength}
  643 + \end{center}
  644 +\end{figure}
  645 +
 646 +The means of the average sequence lengths during the challenge and during the rest of the time are 5.38 and 5.08, respectively. Since the distribution
 647 +of the averages of sequence lengths is normal (Shapiro-Wilk $p = 0.27$), we used a Welch's t-test to compare those means, but the test could not
 648 +show that those means are significantly different ($t(16)=0.79$, $p = 0.44$).
  649 +
 650 +Although there is no statistically significant difference between the two groups, we can generally see a small effect for six of the nine groups with
 651 +challenges. The fact that we observe the opposite effect in the three game sessions with all the features is surprising and hard to explain. One possible
 652 +explanation is that when all the features are present, the players have more to think about and pay a little less attention to the challenges.
  653 +
  654 +\subsubsection{Sell/buy challenge}
  655 +
 656 +For the {\em Sell/buy challenge}, we were interested in comparing the number of individual circles sold on the market per minute when the challenge was active
 657 +and when it was not. The results, presented in Figure~\ref{fig_sellBuySNP}, do not show a clear trend. Indeed, in half of the game sessions, the
 658 +number of circles sold per minute is higher during the challenge, while the opposite holds for the other half.
  659 +
  660 +\begin{figure}[htbp]
  661 + \begin{center}
  662 + \includegraphics[width=\halfWidth]{Figs/sellBuySNP.pdf}
  663 + \vspace{0cm}
  664 + \caption{Number of individual circles sold on the market per minute with and without the {\em Sell/buy challenge} active. 'A', 'A-2' and 'A-3'
  665 + represent the tests with all the features present, and 'NS', 'NS-2' and 'NS-3' represent the tests without the skills.
  666 + }\label{fig_sellBuySNP}
  667 + \end{center}
  668 +\end{figure}
  669 +
 670 +Once again, the numbers of circles sold per minute in the six game sessions follow a normal distribution (Shapiro-Wilk $p = 0.26$), so we
 671 +used a Welch's t-test to compare the means of the two groups, which are 13.18 during the challenge and 12.73 during the rest of the time. The t-test
 672 +failed to reject the null hypothesis that both means are the same ($t(10)=0.11$, $p = 0.91$).
  673 +
 674 +We believe that the main reason why there does not seem to be any difference between the two groups is that most players were able to
 675 +complete this type of challenge without really changing their normal behavior. This challenge was simply too easy, because most players
 676 +already sell or buy (through the bids) the required 2 to 4 circles every five minutes (the duration of a challenge).
  677 +
  678 +\subsubsection{Buyout challenge}
  679 +
 680 +The {\em Buyout challenge} appeared only once in total in the three gaming sessions with both challenges and the market. Thus, we do not have a significant
 681 +amount of data to analyze the effect of this challenge. The reason this challenge almost never appeared is that players were constantly using the
 682 +buyout, which greatly reduced the probability of this challenge being selected.
  683 +
  684 +\subsubsection{Specific colors in common challenge}
  685 +
 686 +The {\em Specific colors in common challenge} is also difficult to analyze, because it was completed only 8 times in total during the nine sessions with challenges, despite
 687 +appearing 11 times throughout those nine experiments.
 688 +This can be explained by the fact that it was the hardest challenge: all the other challenges are more general and can be completed by
 689 +doing actions that are not specific to a given subset of colors. Even though the market should be helpful in finding circles with the required
 690 +subset of colors, it seems highly probable that the players felt that this type of challenge was too hard and rarely tried to complete it.
  691 +
  692 +\subsection{Testing hypothesis 4: percentage of the problem solved as a measure of the importance of different game features}
 693 +One of the research goals was to measure the impact of each feature by analyzing how much of the problem the players could solve
 694 +in each of the game sessions. Our initial hypothesis was that players who have access to all the game features should
 695 +be able to solve more of the problem.
  696 +
 697 +Interestingly, we observed a larger than expected variance in the participants' personal skills, which sometimes made it
 698 +difficult to compare one game session with another in terms of the percentage of the problem solved.
 699 +Indeed, some players quickly understood all the rules of the game and how to maximize their score,
 700 +while others struggled to score points during the whole session, even with our help.
  701 +
  702 +\begin{figure}[htbp]
  703 + \begin{center}
  704 + \includegraphics[width=\halfWidth]{Figs/totalXP_session.pdf}
  705 + \vspace{0cm}
  706 + \caption{Total game experience and percentage of the problem solved for each of the 12 game sessions. 'XP' represents experience points.
  707 + 'A', 'A-2' and 'A-3' represent the tests with all the features on; 'NS', 'NS-2' and 'NS-3' represent the
  708 + tests without the skills; 'NM', 'NM-2' and 'NM-3' represent the tests without the market; 'NC', 'NC-2' and 'NC-3' represent the tests without the challenges.
  709 + }\label{fig_totalXP}
  710 + \end{center}
  711 +\end{figure}
  712 +
 713 +As shown in Figure~\ref{fig_totalXP}, the percentage of the problem that was solved varies from 48\% to 75\% across the different experiments.
 714 +In particular, the differences observed between experiments with the exact same game conditions (sometimes up to an 18\% difference)
 715 +demonstrate that we cannot simply use the percentage of the exact solution found as a way to measure the impact of a feature.
 716 +Moreover, the top five sessions in terms of percentage solved (all sessions above 65\%) come from the four different game conditions.
  717 +
 718 +We used linear regression to test whether the percentage of the problem solved is, to some extent, linearly related to the total experience points accumulated
 719 +by all the players during a session, which is a good proxy for the skills of the players in each session.
 720 +The fitted line (graph not shown) had a correlation coefficient of $r = 0.89$ and a coefficient of determination of
 721 +$r^2=0.79$, which shows a substantial correlation. The different game conditions obviously account for some of the observed variance.
 722 +%Notice that a game session with many good players combining for a high total of experience points does not guarantee that
 723 +%a bigger percentage of the solution will be found by the players.
 724 +Another source of variance is the fact that, in the current state of the game, players can sell
 725 +sequences that correspond to a solution that was already found earlier. While it would be possible to lower the score of a solution (sequence) that already
 726 +exists, it would be hard to explain to inexperienced players why one sequence is worth less than another with exactly the same length and number of colors
 727 +in common. That is why we decided not to take the existing ({\em i.e.} already found) solutions into account in the scoring function.
  728 +
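  +For completeness, here is a compact restatement of the quantities involved (assuming an ordinary least-squares fit, with $x_i$ the total
  +experience points of session $i$ and $y_i$ the percentage of the problem solved in that session):
  +\[
  +\hat{y} = a x + b, \qquad
  +r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^{2}}\sqrt{\sum_{i}(y_i - \bar{y})^{2}}}, \qquad
  +r^{2} = 1 - \frac{\sum_{i}(y_i - \hat{y}_i)^{2}}{\sum_{i}(y_i - \bar{y})^{2}}.
  +\]
  +For a simple linear fit, the coefficient of determination equals the square of the correlation coefficient, which is consistent with the
  +values reported above ($0.89^{2} \approx 0.79$).
  +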
  729 +%In the following sections, we show the impact of each feature based on different metrics.
  730 +
  731 +\subsection{Understanding what makes a good player}
  732 +
  733 +Based on the questionnaire filled out by the players before playing the game, and on the global leaderboard of all the players from all the sessions put together,
  734 +we tried to find similarities between the top players. Table~\ref{tab_playerStats} shows the most interesting differences between the top 12 players
  735 +and the rest of the players. In the questionnaires, players had to indicate their age category (between 21 and 25, for example), a self-evaluation
  736 +of their puzzle-solving abilities and a range of hours spent playing video games every week.
  737 +
  738 +The average age of the two groups of players
  739 +was calculated by taking the midpoint of the age categories (for example, the 21--25 category counts as 23). The average age of the top 12 players was about $2.5$ years younger than that of
  740 +the other players. For the puzzle-solving self-evaluation, the players could choose a level between 1 and 5 (5 being the strongest). The average
  741 +level of the top 12 players was 3.67, compared to 2.90 for the others. As we did with the age categories, we computed averages of time spent playing
  742 +video games every week using the midpoints of the categories. The top 12 players played roughly $2.5$ times more hours every week than the
  743 +rest of the players.
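  +Formally, this amounts to the following (a compact restatement, with $\ell_i$ and $u_i$ the lower and upper bounds of the category
  +selected by player $i$, and $n$ the size of the group):
  +\[
  +\bar{v} = \frac{1}{n}\sum_{i=1}^{n} \frac{\ell_i + u_i}{2},
  +\]
  +computed separately for the top 12 players and the others, and for both the age and the weekly play-time categories.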
  744 +
  745 +\begin{table}[h]
  746 +\begin{center}
  747 +\begin{tabular}{lcc}\hline
  748 + & Top 12 players & Others\\\hline
  749 +Age (years) & 23.42 & 25.99\\
  750 +Self-evaluation (1--5) & 3.67 & 2.90\\
  751 +Game time (hours/week) & 10.00 & 4.11\\\hline
  752 +\end{tabular}
  753 +\end{center}
  754 +\caption{Average statistics for the top 12 players versus the others.}\label{tab_playerStats}
  755 +\end{table}
  756 +
  757 +\section{Conclusion}
  758 +
  759 +We implemented a human computing game that uses a market, skills and challenges in order to solve a problem collaboratively. The problem solved
  760 +by the players in our game is a graph problem that can easily be translated into a color-matching game. The total number of colors used in the tests was small
  761 +enough that we were able to compute an exact solution and evaluate the performance of the players. We organized 12 game sessions of 10 players with
  762 +four different game conditions (three sessions each).
  763 +
  764 +Our tests clearly showed that the market is a useful tool to help players build longer solutions (sequences, in our case). It also
  765 +makes the game much more dynamic, and players mentioned that they really enjoyed this aspect of the game.
  766 +
  767 +Our results also showed that skills in general are helpful for influencing and guiding the players toward specific actions that are
  768 +beneficial to the system and to other players. We found that skills are more effective at guiding the players when
  769 +they are not directly related to the main goal of the game: the {\em Color Expert} skill, for example, did not affect the proportion of
  770 +multicolored sequences built by the players.
  771 +
  772 +The results on the challenges indicate that they can be useful for promoting an action in the game (the {\em Minimum number of colors in common} challenge, for example), but
  773 +to be effective, their difficulty needs to be well balanced. Challenges that are too easy (the {\em Sell/buy challenge}, for example) or
  774 +too hard (the {\em Specific colors in common challenge}, for example) do not affect the game significantly.
  775 +
  776 +Although the great variability in the participants' personal skills made it very difficult to draw direct comparisons between the different game conditions
  777 +with regard to the percentage of the solution found, we showed that the percentage solved is, to a certain extent, proportional to the total experience gained
  778 +by all players during a game session. Therefore, the percentage of the problem solved clearly depends not only on the features present in the game, but
  779 +also on the participants' aptitude for the game.
  780 +
  781 +Finally, it seems that younger players who play video games on a regular basis and
  782 +rate their own puzzle-solving skills highly are able to understand the rules
  783 +of the game and find winning strategies faster than the average participant.
  784 +
  785 +\section*{Acknowledgments}
  786 +
  787 +First and foremost, the authors wish to thank all the players who made this study possible.
  788 +The authors would also like to thank Jean-Fran\c{c}ois Bourbeau, Mathieu Blanchette, Derek Ruths and Edward Newell for their help with the initial design of the game,
  789 +and Alexandre Leblanc for his helpful advice on the statistical tests.
  790 +Finally, the authors wish to thank Silvia Juliana Leon Mantilla and Shu Hayakawa for their help with the organization of the game sessions and the recruitment of participants.
  791 +
  792 +% REFERENCES FORMAT
  793 +% References must be the same font size as other body text.
  794 +%\bibliographystyle{SIGCHI-Reference-Format}
  795 +\bibliography{references}
  796 +
  797 +\end{document}
NatureSR/references.bib View file @ f87cc31
  1 +%% This BibTeX bibliography file was created using BibDesk.
  2 +%% http://bibdesk.sourceforge.net/
  3 +
  4 +
  5 +%% Created for Jerome Waldispuhl at 2015-09-25 01:30:22 -0400
  6 +
  7 +
  8 +%% Saved with string encoding Unicode (UTF-8)
  9 +
  10 +
  11 +
  12 +@inproceedings{DBLP:conf/aaai/HoCH07,
  13 + Author = {Chien{-}Ju Ho and Tsung{-}Hsiang Chang and Jane Yung{-}jen Hsu},
  14 + Bibsource = {dblp computer science bibliography, http://dblp.org},
  15 + Biburl = {http://dblp.uni-trier.de/rec/bib/conf/aaai/HoCH07},
  16 + Booktitle = {Proceedings of the Twenty-Second {AAAI} Conference on Artificial Intelligence, July 22-26, 2007, Vancouver, British Columbia, Canada},
  17 + Date-Added = {2015-09-25 05:30:10 +0000},
  18 + Date-Modified = {2015-09-25 05:30:21 +0000},
  19 + Pages = {1359--1364},
  20 + Timestamp = {Mon, 10 Dec 2012 15:34:43 +0100},
  21 + Title = {PhotoSlap: {A} Multi-player Online Game for Semantic Annotation},
  22 + Url = {http://www.aaai.org/Library/AAAI/2007/aaai07-215.php},
  23 + Year = {2007},
  24 + Bdsk-Url-1 = {http://www.aaai.org/Library/AAAI/2007/aaai07-215.php}}
  25 +
  26 +@inproceedings{DBLP:conf/chi/AhnD04,
  27 + Abstract = {We introduce a new interactive system: a game that is fun and can be used to create valuable output. When people play the game they help determine the contents of images by providing meaningful labels for them. If the game is played as much as popular online games, we estimate that most images on the Web can be labeled in a few months. Having proper labels associated with each image on the Web would allow for more accurate image search, improve the accessibility of sites (by providing descriptions of images to visually impaired individuals), and help users block inappropriate images. Our system makes a significant contribution because of its valuable output and because of the way it addresses the image-labeling problem. Rather than using computer vision techniques, which don't work well enough, we encourage people to do the work by taking advantage of their desire to be entertained.},
  28 + Author = {Luis von Ahn and Laura Dabbish},
  29 + Bibsource = {dblp computer science bibliography, http://dblp.org},
  30 + Biburl = {http://dblp.uni-trier.de/rec/bib/conf/chi/AhnD04},
  31 + Booktitle = {Proceedings of the 2004 Conference on Human Factors in Computing Systems, {CHI} 2004, Vienna, Austria, April 24 - 29, 2004},
  32 + Date-Added = {2015-09-25 05:13:24 +0000},
  33 + Date-Modified = {2015-09-25 05:13:38 +0000},
  34 + Doi = {10.1145/985692.985733},
  35 + Pages = {319--326},
  36 + Timestamp = {Fri, 10 Feb 2006 16:00:37 +0100},
  37 + Title = {Labeling images with a computer game},
  38 + Url = {http://doi.acm.org/10.1145/985692.985733},
  39 + Year = {2004},
  40 + Bdsk-Url-1 = {http://doi.acm.org/10.1145/985692.985733},
  41 + Bdsk-Url-2 = {http://dx.doi.org/10.1145/985692.985733}}
  42 +
  43 +@inproceedings{DBLP:conf/chi/YuN11,
  44 + Author = {Lixiu Yu and Jeffrey V. Nickerson},
  45 + Bibsource = {dblp computer science bibliography, http://dblp.org},
  46 + Biburl = {http://dblp.uni-trier.de/rec/bib/conf/chi/YuN11},
  47 + Booktitle = {Proceedings of the International Conference on Human Factors in Computing Systems, {CHI} 2011, Vancouver, BC, Canada, May 7-12, 2011},
  48 + Date-Added = {2015-09-25 04:13:04 +0000},
  49 + Date-Modified = {2015-09-25 04:13:48 +0000},
  50 + Doi = {10.1145/1978942.1979147},
  51 + Pages = {1393--1402},
  52 + Timestamp = {Wed, 11 May 2011 10:43:31 +0200},
  53 + Title = {Cooks or cobblers?: crowd creativity through combination},
  54 + Url = {http://doi.acm.org/10.1145/1978942.1979147},
  55 + Year = {2011},
  56 + Bdsk-Url-1 = {http://doi.acm.org/10.1145/1978942.1979147},
  57 + Bdsk-Url-2 = {http://dx.doi.org/10.1145/1978942.1979147}}
  58 +
  59 +@inproceedings{DBLP:conf/cscw/KitturK08,
  60 + Abstract = {Wikipedia's success is often attributed to the large numbers of contributors who improve the accuracy, completeness and clarity of articles while reducing bias. However, because of the coordination needed to write an article collaboratively, adding contributors is costly. We examined how the number of editors in Wikipedia and the coordination methods they use affect article quality. We distinguish between explicit coordination, in which editors plan the article through communication, and implicit coordination, in which a subset of editors structure the work by doing the majority of it. Adding more editors to an article improved article quality only when they used appropriate coordination techniques and was harmful when they did not. Implicit coordination through concentrating the work was more helpful when many editors contributed, but explicit coordination through communication was not. Both types of coordination improved quality more when an article was in a formative stage. These results demonstrate the critical importance of coordination in effectively harnessing the wisdom of the crowd in online production environments.},
  61 + Author = {Aniket Kittur and Robert E. Kraut},
  62 + Bibsource = {dblp computer science bibliography, http://dblp.org},
  63 + Biburl = {http://dblp.uni-trier.de/rec/bib/conf/cscw/KitturK08},
  64 + Booktitle = {Proceedings of the 2008 {ACM} Conference on Computer Supported Cooperative Work, {CSCW} 2008, San Diego, CA, USA, November 8-12, 2008},
  65 + Date-Added = {2015-09-25 03:07:08 +0000},
  66 + Date-Modified = {2015-09-25 03:07:27 +0000},
  67 + Doi = {10.1145/1460563.1460572},
  68 + Pages = {37--46},
  69 + Timestamp = {Mon, 24 Nov 2008 10:58:36 +0100},
  70 + Title = {Harnessing the wisdom of crowds in wikipedia: quality through coordination},
  71 + Url = {http://doi.acm.org/10.1145/1460563.1460572},
  72 + Year = {2008},
  73 + Bdsk-Url-1 = {http://doi.acm.org/10.1145/1460563.1460572},
  74 + Bdsk-Url-2 = {http://dx.doi.org/10.1145/1460563.1460572}}
  75 +