Effects of Reward and Punishment on Conflict Processing: Same or Different?

While reward and punishment are commonly known to be effective motivators of behavior, little is known about the mechanisms by which they act on conflict processing. Here, we examined the roles that reward and punishment play in this cognitive process by using a revised version of the Stroop task. Within incongruent trials, an explicit reward association in the task-relevant dimension obstructed the processing of conflict information in Experiment 1, whereas an explicit punishment association in the task-relevant dimension enhanced conflict processing relative to the no-punishment condition in Experiment 2. This suggests that the mechanisms of reward and punishment differ from each other, possibly because participants adopt a particular strategy. Additionally, both reward associations and punishment associations to the task-irrelevant dimension produced longer response times in conflict processing, which likely reflects that reward and punishment play the same role when they are implicitly related to conflict processing. These results indicate that the effects of reward and punishment on conflict processing are modulated by the involvement of consciousness, supporting the flexible roles of reward and punishment in conflict processing.

However, the view that reward and punishment are two effective motivators of behavior has recently been challenged by findings revealing that they do not always improve performance and can instead cause behavioral costs (Pessoa, 2009). For example, Hupp et al. (2002) showed that a delayed reward condition by itself did not increase sportsmanlike behavior, whereas adding tokens (and praise) to the delayed reward increased sportsmanlike behavior in children diagnosed with attention deficit/hyperactivity disorder. Isham and Geng (2011) examined whether rewarding performance feedback (even when false) altered the reported time of action, and found that when a speeded response and reward were independently manipulated to decouple the cognitive and reward components of the feedback signal, neither variable affected the judged time of action. Thus, they concluded that reward could not independently alter the reported time of action; only when combined with meaningful feedback (fast or slow) could it modulate the judged time of an action. Moreover, Smith (2005) located seven studies linking aspects of children's cognitive development to family discipline, and found that all seven showed a correlation between harsh discipline (i.e., punishment) and poorer academic achievement or lower cognitive development across a range of ages and ethnic groups. In Cameron, Banko, and Pierce's study, negative effects were shown in high-interest tasks when the rewards were tangible, expected (offered beforehand), and loosely tied to the level of performance (Cameron et al., 2001). Balsam and Bondy (1983) also argued that the symmetry between aversive and appetitive control in basic experimental research implied that parallel negative side effects of reward did exist. Therefore, the effectiveness of reward and punishment remains unclear.
While numerous studies have examined the mechanism of cognitive conflict, few have taken into account the potential roles of reward and punishment in this process. Furthermore, although reward and punishment have been found to share the same brain region, namely the cingulate cortex, which is critically involved in processing the expectation and occurrence of reward and punishment (Stürmer, Nigbur, Schacht, & Sommer, 2011), they have rarely been studied simultaneously, probably because of their opposite affective value. For example, Krebs, Boehler, and Woldorff (2010) mainly investigated how reward associations influenced the processing of conflicting information with monetary incentives, and they found that color-naming performance in a Stroop task was enhanced on trials with potential reward versus those without. Recently, neuropsychological evidence has demonstrated that brain regions implicated in human cognitive control are also critically involved in reward-based learning (Miller & Cohen, 2001; Ridderinkhof, Ullsperger, Crone, & Nieuwenhuis, 2004; Ridderinkhof, van den Wildenberg, Segalowitz, & Carter, 2004; Ullsperger & von Cramon, 2003), supporting the interaction between reward and conflict processing.
The primary goal of the present study was to investigate the effects of reward and punishment on conflict processing within the same paradigm. In typical interference tasks such as the Stroop, Flanker, and Simon tasks, a set of multidimensional stimuli is assigned to a set of responses, with only one stimulus dimension being task-relevant; the other stimulus dimensions are task-irrelevant, but at least one of them shares features with the relevant dimension (Lu & Proctor, 1995; Stürmer et al., 2007). In the Stroop task, for example, subjects are asked to respond to the ink color of a color word (e.g., "RED") as quickly as possible while ignoring its semantic meaning. Typically, responding to the ink color is facilitated when the ink color (task-relevant dimension) and the word's meaning (task-irrelevant dimension) are compatible (e.g., the word "RED" displayed in red ink), but impeded when they are incompatible. Generally, incompatible trials provoke conflicts in information processing, which cause Stroop response time interference. The present study used the Stroop task as a tool to detect the effects of reward and punishment on conflict processing.
Additionally, it has been suggested that explicit and implicit manipulations differ in that the former entails more conscious processing, while the latter functions in a spontaneous, automatic, or unconscious manner. A number of studies have shown that they have different underlying cognitive mechanisms (e.g., Dillon, Ritchey, Johnson, & LaBar, 2007). Thus, the second goal of the current study was to investigate what specific effects reward and punishment have on conflict processing when they are explicitly associated with the task-relevant dimension (ink color) and when they are implicitly associated with the task-irrelevant dimension (the word's semantic meaning). We expected the effects of reward and punishment on conflict processing to differ as a function of explicit versus implicit manipulation.
To ensure that the manipulation was implicit, the current study used a revised version of the Stroop task. We used object nouns that have a typical color (e.g., grass) to replace color words (e.g., green). Naor-Raz and Tarr (2003) suggested that words naming color-diagnostic objects automatically trigger processing of shape information first, and thereby also activate the associated color information at some level while the object representation is processed. The activation of the object's typical color thus seems indirect and occurs much later than the activation of word meaning in the classical Stroop task (e.g., the word meaning of "RED"). Two experiments were conducted using this variant of the Stroop task, to detect the effect of reward on conflict processing in Experiment 1 and the effect of punishment in Experiment 2.
It should be noted that the monetary incentives in both experiments depended exclusively on the ink-color dimension, producing two kinds of association: one explicitly related to monetary incentives in the task-relevant dimension (e.g., the ink color green), and the other implicitly "related" to monetary incentives in the task-irrelevant dimension (e.g., the object noun "Grass"). This manipulation allowed us to investigate the explicit effect of incentives in the relevant dimension, as well as the implicit effect of incentive associations that were entirely irrelevant to the task.

Method
Participants. Twenty-three undergraduates (16 females, mean age = 20.04 ± 1.2 years, age range = 18-21 years) participated in this experiment for monetary compensation (depending on their performance on the task, the maximum possible payment was ￥11 and the minimum was ￥9). All were right-handed native speakers of Mandarin Chinese with normal or corrected-to-normal vision, and normal color vision as assessed by the City University Color Vision Test (Fletcher, 1980) and the Ishihara Color Test. Written informed consent was obtained from each participant following a research protocol approved by the Institutional Review Board of the South China Normal University (Guangzhou, China).
Materials. The stimuli consisted of 4 object nouns with a typical color (i.e., grass, ocean, gold, and blood) and 1 neutral noun that did not imply any color (i.e., time). Diagnostic color ratings were obtained from 34 participants from the same subject pool who did not contribute to the test data. They were asked to write down the typical color of each noun and to rate the canonicality of that color on a 7-point Likert scale, with 1 for very low canonicality and 7 for very high canonicality. All subjects consistently reported the intended color for each of the four color-diagnostic nouns, with the rating of each noun higher than 5 (M = 6.11, SD = 1.02), while none of them reported any particular color for the neutral word.
Procedure. The procedure closely followed that of Krebs, Boehler, and Woldorff (2010). All participants were tested individually in a dimly lit room. Following task instructions, they performed a revised version of the classic color-naming Stroop task in which they responded to the ink color of the nouns (Color, task-relevant dimension) while ignoring their semantic meaning (Word, task-irrelevant dimension; see Fig. 1). Each trial began with a small gray fixation square (visual angle 0.3°), which remained in the center of a black screen. After 800 ms, a colored Chinese noun was presented above the fixation square for up to 1500 ms, pseudorandomly chosen from the following set: "小草(grass)", "金子(gold)", "鲜血(blood)", "大海(ocean)", or "时间(time)" (vertical 0.8°, horizontal ranging from 2° to 4.5°). The words were written in one of four ink colors (red, yellow, blue, or green). Participants were instructed to respond to the ink color as quickly as possible by pressing the corresponding button with the index or middle finger of either hand. The colored noun was turned off after the response or at the end of the response window. If participants failed to respond within the 1500 ms time window, the trial was scored as an error. The interval between trials varied from 400 to 800 ms in steps of 100 ms. Before the test phase, all participants completed training in order to learn the designated color-button combinations. Once their accuracy reached 85% or above, the program asked them to press the "P" key to enter the test phase; otherwise, they kept practicing until they reached this standard. The color-button combinations were counterbalanced across subjects.
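The trial timeline described above can be summarized in a short sketch. This is an illustration only, not the presentation software used in the study: the function name `run_trial`, the uniform sampling of the inter-trial interval, and the treatment of a response at exactly 1500 ms as in time are assumptions.

```python
import random

FIXATION_MS = 800                          # fixation-only period before the noun appears
RESPONSE_WINDOW_MS = 1500                  # maximum stimulus duration and response deadline
ITI_STEPS_MS = [400, 500, 600, 700, 800]   # inter-trial interval in 100 ms steps


def run_trial(rt_ms, correct_key_pressed, rng):
    """Return the timeline and outcome of one trial (hypothetical helper).

    rt_ms is the response time in ms, or None if no key was pressed.
    """
    responded_in_time = rt_ms is not None and rt_ms <= RESPONSE_WINDOW_MS
    # The noun is turned off at the response or at the end of the window.
    stimulus_ms = rt_ms if responded_in_time else RESPONSE_WINDOW_MS
    # A missing or late response is scored as an error.
    correct = responded_in_time and correct_key_pressed
    return {
        "fixation_ms": FIXATION_MS,
        "stimulus_ms": stimulus_ms,
        "iti_ms": rng.choice(ITI_STEPS_MS),  # assumed uniform over the five steps
        "correct": correct,
    }
```

For example, `run_trial(620, True, random.Random(0))` describes a correct response made 620 ms after noun onset, while `run_trial(None, True, random.Random(0))` describes a timeout that counts as an error.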

Figure 1. Stimuli and experimental conditions
Description: A subset of ink colors was associated with the potential of reward (potential-reward; i.e., green and blue), while the remaining ink colors were not (no-reward; i.e., red and yellow). The word meaning (irrelevant dimension) could be congruent, incongruent reward-related, incongruent reward-unrelated, or neutral with respect to the ink color.
Green and blue were assigned as the colors associated with the potential monetary reward, with congruent (e.g., "Grass" written in green), incongruent (e.g., "Blood" written in yellow), and neutral (e.g., "Time" written in red) trials intermixed. To ensure that all participants obtained a similar reward (ranging from ￥9 to ￥11), the reward standard was adjusted dynamically based on individual performance. There were 320 potential-reward and 320 no-reward trials in total, presented in four blocks of 160 trials. A fast and correct response in potential-reward trials generated a small monetary gain, but an incorrect or slow response did not result in any monetary penalty. Participants were given feedback on the amount of money earned at the end of each block, which was the only feedback they received. There was a 1-min break between blocks.
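The way each trial maps onto the Color and Word factors can be sketched as follows. This is a minimal illustration under stated assumptions: the noun-to-color mapping (ocean to blue, gold to yellow, blood to red) is inferred from the usual typical colors of those objects rather than listed explicitly in the text, and the names `TYPICAL_COLOR`, `INCENTIVE_COLORS`, and `classify_trial` are hypothetical.

```python
# Typical color of each noun; None marks the neutral noun.
# Only grass -> green is stated directly in the text; the rest is assumed.
TYPICAL_COLOR = {
    "grass": "green",
    "ocean": "blue",
    "gold": "yellow",
    "blood": "red",
    "time": None,
}

# Ink colors tied to the potential reward in Experiment 1.
INCENTIVE_COLORS = {"green", "blue"}


def classify_trial(noun, ink):
    """Return (Color level, Word level) for one noun/ink combination."""
    typical = TYPICAL_COLOR[noun]
    color_level = "potential-reward" if ink in INCENTIVE_COLORS else "no-reward"
    if typical is None:
        word_level = "neutral"
    elif typical == ink:
        word_level = "congruent"
    elif typical in INCENTIVE_COLORS:
        # The ignored word points to a color that carries the incentive.
        word_level = "incongruent-reward-related"
    else:
        word_level = "incongruent-reward-unrelated"
    return color_level, word_level
```

Under these assumptions, "Grass" in green is a potential-reward congruent trial, "Blood" in yellow is a no-reward incongruent-reward-unrelated trial, and "Grass" in red is a no-reward incongruent-reward-related trial.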

Results
In the training phase of the experiment, the mean number of times participants studied the 4 color-button combinations was 2.15 (SE = .86). Response latencies and mean accuracy rates were submitted to a 2 (Color: potential-reward color vs. no-reward color) × 3 (Word: congruent, incongruent, and neutral) repeated-measures analysis of variance (ANOVA). To examine the differential effects of reward-related and reward-unrelated irrelevant information, we conducted an additional 2 (Color: potential-reward color vs. no-reward color) × 2 (Incongruency: incongruent-reward-related vs. incongruent-reward-unrelated) repeated-measures ANOVA focusing on the incongruent trials. All reaction time analyses were performed on correct responses only.
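The aggregation step that precedes such an ANOVA, namely one mean response time per participant and design cell computed from correct responses only, might look like the sketch below. The trial record layout (`subject`, `color`, `word`, `rt_ms`, `correct`) and the function name `cell_means` are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def cell_means(trials):
    """Mean RT per (subject, Color, Word) cell, using correct trials only.

    trials: iterable of dicts with keys
    'subject', 'color', 'word', 'rt_ms', and 'correct'.
    """
    cells = defaultdict(list)
    for trial in trials:
        if trial["correct"]:  # error trials are excluded from RT analyses
            key = (trial["subject"], trial["color"], trial["word"])
            cells[key].append(trial["rt_ms"])
    return {key: mean(rts) for key, rts in cells.items()}
```

The resulting table of cell means is the input that a repeated-measures ANOVA would operate on; cells with no correct responses are simply absent from the result.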

Discussion
In contrast to Krebs et al. (2010), our results indicated a cost of reward in potential-reward trials, as well as poor performance in incongruent-reward-related trials, as evidenced by the longer response times found in these two types of conditions. In other words, reward did not facilitate color-naming performance but instead hindered it. The results suggest that, regardless of whether the ink color was explicitly related to reward or the typical color of the object noun was implicitly related to reward (i.e., reward was implicitly transferred to the irrelevant dimension), reward led attention astray and resulted in a behavioral detriment in conflict processing.
Additionally, our results showed that the congruent condition yielded longer RTs than the neutral condition. Using a similar revised Stroop paradigm, Naor-Raz and Tarr (2003) showed that naming the colors of words specifying color-diagnostic objects reversed the Stroop effect, with the congruent color condition (the word "Grass" printed in green) producing longer response times than the incongruent color condition (the word "Grass" printed in red). They argued that color-shape associations arising during the lexical access of object nouns would produce competition between lexical entries for the name of the object and the name of the color, thereby slowing color name access. Such competition was much stronger in the congruent condition than in the incongruent condition. However, there was no significant difference between the congruent and incongruent conditions in the current study. It should be noted that the incongruent condition here included both incongruent-reward-related and incongruent-reward-unrelated trials. The incongruent-reward-related trials elicited longer RTs because reward distracted attention, making the response time of the combined incongruent condition longer and comparable to that of the congruent condition. To test this interpretation, two one-way repeated-measures ANOVAs were conducted, one comparing the congruent and incongruent-reward-related conditions and one comparing the congruent and incongruent-reward-unrelated conditions. The results showed no significant difference between the congruent and incongruent-reward-related conditions, F < 1, but a significant difference between the congruent and incongruent-reward-unrelated conditions, F (1, 22) = 4.41, p < .05. That is, our results were consistent with Naor-Raz and Tarr's (2003) finding. In Experiment 2, we took a close look at the role of punishment in conflict processing: was it the same as or different from that of reward?
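A one-way repeated-measures ANOVA with only two levels, as used for the follow-up comparisons above, is equivalent to a paired t-test, with F = t² on (1, n − 1) degrees of freedom. The sketch below computes that F statistic from per-participant condition means. It is an illustration of the standard identity, not the authors' analysis code, and the function name is hypothetical.

```python
from statistics import mean, variance


def rm_anova_two_levels(cond_a, cond_b):
    """F statistic and degrees of freedom for a two-level within-subject factor.

    cond_a and cond_b hold one mean RT per participant, in the same order.
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need paired scores from at least two participants")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    var_d = variance(diffs)  # sample variance of the paired differences
    if var_d == 0:
        raise ValueError("differences have zero variance; F is undefined")
    # F = t^2, where t = mean(d) / (sd(d) / sqrt(n))
    f_value = n * mean(diffs) ** 2 / var_d
    return f_value, (1, n - 1)
```

With 23 participants the function returns degrees of freedom (1, 22), which matches the form of the F (1, 22) values reported in the text.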

Method
Participants. Twenty-three new undergraduates (12 females, mean age = 19.43 ± .9 years, age range = 18-21 years) were recruited from the same subject pool as in Experiment 1, using the same inclusion criteria. None of them had participated in Experiment 1.
Materials and Procedure. The materials and procedure were identical to those used in Experiment 1, with two exceptions. First, green and blue were now potential-punishment colors, and yellow and red were no-punishment colors. Second, penalty was now emphasized. That is, an incorrect or slow response in potential-punishment trials resulted in a small monetary penalty, but a fast and correct response in potential-punishment trials did not produce a monetary gain.

Results
In the training phase of the experiment, the mean number of times participants studied the 4 color-button combinations was 2.36 (SE = .57).
The 2 × 3 ANOVA revealed a significant main effect of Color (F (1, 22) = 19.93, p < .001). Interestingly, the potential-punishment trials showed shorter response latencies (615 ms) than the no-punishment trials (661 ms). Similar to Experiment 1, there was a significant main effect of Word (F (2, 44) = 19.35, p < .001), reflecting longer response latencies in the congruent (644 ms) and incongruent (643 ms) conditions than in the neutral condition (627 ms; F (1, 22) = 24.32, p < .001, and F (1, 22) = 32, p < .001, respectively). However, there was no significant interaction between these two factors. Accuracy was uniformly high across all conditions (M = 94%), and the analysis of accuracy revealed no significant Color effect, no significant Word effect, and no significant interaction between the two factors (all ps > .05).

Discussion
Similar to Experiment 1, the results showed longer response times in the congruent and incongruent conditions than in the neutral condition. Interestingly, the response latency in potential-punishment trials was significantly shorter than in no-punishment trials, the reverse of the pattern found for reward in Experiment 1. Within incongruent trials, consistent with the finding for reward, incongruent-punishment-related trials showed longer response times than incongruent-punishment-unrelated trials. That is, when reward or punishment was explicitly related to the task-relevant dimension, reward and punishment played different roles in conflict processing, whereas when they were implicitly related to the task-irrelevant dimension, reward and punishment had the same effect on conflict processing.
Many neurophysiological experiments have shown a clear association between attention and reward (e.g., Peck, Jangraw, Suzuki, Efem, & Gottlieb, 2009), which sheds light on how reward may thwart conflict processing by distracting attention through external reinforcement. The current study further revealed that the cost related to reward in conflict processing occurred in both explicit and implicit associations. For punishment, however, such a cost occurred only in implicit association trials, whereas an "enhancement" was observed in explicit association trials. We speculate that the mechanisms of reward and punishment differ in explicit association. Our behavior is often guided by the desire to obtain positive outcomes and avoid negative consequences. Thus, it is reasonable that one may be attracted by reward (a hedonically pleasing incentive) but avoid the non-reward incentive (a relatively discomforting incentive). Similarly, one may avoid punishment (a discomforting incentive) and be attracted by the no-punishment incentive (a relatively hedonically pleasing incentive), which is in line with the idea that punishment by nature reinforces, or directs attention to, the non-punished behavior. It is not surprising that, under conscious control, one can use such a strategy to obtain gains and avoid losses.
In implicit association, reward and punishment both showed a cost in conflict processing, producing substantially stronger interference from irrelevant information that was implicitly linked to reward or punishment than from incongruent information that was entirely unrelated to reward or punishment. Thus, we speculate that, outside of consciousness, the irrelevant information was made salient by reward and punishment. That is, through such salience, reward and punishment serve as implicit attention distracters, prolonging conflict processing.

General Discussion
In the present study, we examined the specific effects of reward and punishment on conflict processing and addressed two issues: whether and how reward and punishment affect conflict processing during a variant of the Stroop task. The results showed that, within incongruent trials with explicit association, longer response latencies were observed in potential-reward trials relative to no-reward trials (Experiment 1), whereas shorter response latencies were observed in potential-punishment trials relative to no-punishment trials (Experiment 2), suggesting that the underlying mechanisms of reward and punishment differ when reward or punishment is explicitly associated with conflict processing. Specifically, reward led attention astray and lengthened responses in potential-reward trials, whereas with punishment, attention was distracted in no-punishment trials but not in potential-punishment trials, thus shortening responses in potential-punishment trials. It is possible that when conflict was explicitly associated with reward or punishment, subjects used a particular strategy to control their attention in conflict processing in order to obtain gains or avoid losses. However, within incongruent trials with implicit association, the results showed that incongruent-reward-related (Experiment 1) and incongruent-punishment-related (Experiment 2) trials both yielded longer RTs than incongruent-reward-unrelated or incongruent-punishment-unrelated trials. This suggests that reward and punishment play the same role in conflict processing when conflict information is implicitly associated with them. We speculate that reward and punishment can implicitly distract attention by making the reward- or punishment-related trials more significant to the participant.
However, the behavioral detriment (longer response times) found in potential-reward trials in the current study differed from the performance facilitation (shorter response times) shown in Krebs et al.'s study. This difference is likely due to the different materials used in the two studies, and it supports the notion that reward is not always beneficial for behavioral performance (Padmala & Pessoa, 2010; Pessoa, 2009). Moreover, although Krebs et al. (2010) showed a facilitation effect of reward on conflict processing in explicit association, they agreed that reward could concomitantly induce behavioral costs, which was confirmed by their finding of behavioral detriments when the task-irrelevant dimension (i.e., word meaning) implicitly referred to reward-predictive colors. In addition, they argued that the effect of reward on conflict processing was possibly due to attention being distracted by reward.
The close relationship between attention distraction and reward has been demonstrated in previous studies. For example, reward information has been found to have the potential to disrupt the behavioral adjustments that are typically observed after incongruent trials in a Flanker task. The strength of this gain-induced modulation was found to depend on subjects' motivation to pursue reward, which distracts participants' attention (van Steenbergen, Band, & Hommel, 2009). A study conducted by Blaukopf and DiGirolamo (2006) also showed that saccades in high-reward trials were slowed compared to saccades in low-reward trials, suggesting that reward attracts attention and influences the programming of conscious movements. Moreover, Blaukopf and DiGirolamo found the same pattern in high-punishment trials relative to low-punishment trials, which differs from our finding for the explicit punishment association. This difference might be related to the simultaneous use of rewards and punishments within one experimental setting in Blaukopf and DiGirolamo's study, whereas the current study used only one type of incentive (reward or punishment) in each of the two experiments. Thus, the effects of reward and punishment in Blaukopf and DiGirolamo's study could not be clearly teased apart.
It is well known that incentive stimuli may be hedonically pleasing or discomforting. Accordingly, it has been hypothesized that two separate motivation systems are triggered by positive and negative incentive stimuli, respectively (e.g., Cacioppo & Gardner, 1999; Davidson, 2000; Carver, 2001). Positive incentive stimuli are associated with appetitive motivation, promoting positive feelings and inducing an action tendency of approach. Unpleasant stimuli activate a defensive motivation system, which comes with unpleasant feelings and therefore makes an organism escape from or avoid the situation (Davidson, 2000). Reward corresponds to the appetitive motivation system, while punishment corresponds to the defensive motivation system. From this perspective, the underlying mechanisms of reward and punishment are different, which is in line with the current finding that reward and punishment have different impacts on conflict processing at the explicit level.
Supporting evidence also comes from other previous studies. For example, Gomez and McLaren (1996) examined the effects of reward and punishment on response disinhibition, happy and nervous moods, heart rates, and skin conductance levels during performance of an instrumental learning task. For one group of subjects (the reward group), correct responses were reinforced with a small monetary reward, while for another group (the punishment group), incorrect responses led to a small loss of money. Results indicated that subjects in the punishment group made fewer disinhibitory responses, were more nervous and less happy, and had a higher skin conductance level compared with subjects in the reward group (Gomez & McLaren, 1997).
Additionally, Mulder (2008) found that in social decision making, the concept of punishment increased cooperation while the concept of reward did not, and that there was more disapproval towards an offender when there was a punishment for non-compliance than when there was a reward for compliance. These findings suggest that punishing non-cooperation, which signals that non-cooperation is socially disapproved of, fosters moral concerns regarding cooperation more strongly than rewarding cooperation, which signals that cooperation is socially approved of. In animal studies, punishment memory was found to decay much faster than reward memory in olfactory learning and visual pattern learning in crickets, and the neurotransmitters conveying punishment and reward signals were found to differ in crickets. The authors proposed that the faster decay of punishment memory than reward memory observed in insects and humans reflects different cellular and biochemical processes after activation of receptors for the amines conveying punishment and reward signals (Nakatani et al., 2009).
Interestingly, at the implicit level, a simple monetary reward scheme and a simple monetary punishment scheme can both significantly affect the processing of conflict. It has been argued that implicit incentive associations with task-irrelevant stimulus properties, namely the word meaning, might increase the salience of these properties. This increased salience may disrupt task performance by enhancing the incorrect stimulus-response mapping or by drawing some attentional resources away from the processing of the relevant dimension (Krebs et al., 2010; Pessoa, 2009). In an animal study, Lin and Nicolelis (2008) also demonstrated that both reward- and punishment-related stimuli are by nature motivationally salient and attract attention. Zink et al. (2003) found that activity in the striatum during probabilistic reversal learning reflects the salience of the critical punishment event leading to behavioral adjustment. Reward-related mesolimbic dopamine steers animal behavior, creating automatic approach toward reward-associated objects and avoidance of objects unlikely to be beneficial by influencing the strategic establishment of endogenous attention (Hickey, Chelazzi, & Theeuwes, 2010). Zink et al. (2004) provided evidence that the striatum's role in reward processing depends on the saliency of the reward.