Race is Still Black and White: Voluntary Racial Phenotypic Change Elicits Meaning Threat and Backlash

We offer evidence that a target who voluntarily changes his/her racial phenotypic features causes perceivers to engage in two-pronged social policing of racial group boundaries: (a) vilifying and disliking the target (cognitive and affective backlash; external policing) (Experiments 1a-1b, 2, & 3) and (b) increasing own racial essentialism, in response to a meaning threat (internal policing) (Experiment 3). In all experiments, participants received a vignette of a protagonist that underwent non-elective surgery (white/Asian, Experiments 1a-1b; white/Black, Experiments 2-3). In the voluntary change condition, the protagonist asks that the surgeon change his/her racial features to resemble that of a different race whereas, in the involuntary change condition the protagonist asks that the surgeon keep his/her racial features intact (Experiment 1: eye shape, Experiment 2: Afrocentric features). Findings supported the predictions and showed a dissociation between similarity and categorization judgments, underscoring the essentialized versus socially constructed nature of beliefs about race.


Introduction
People who undergo ethnic cosmetic procedures opt to change their racial phenotypic features. In Asian, Latino/a and Black populations, these procedures often include creasing eyelids, sharpening noses, thinning lips, and bleaching skin to achieve a more European/Anglo appearance (Davis, 2003;Hunter, 2005;Hunter, 2011). Given a social reality in which racial phenotypic features are perceived as "diagnostic" of hidden psychological properties -such that non-white (versus white) phenotypes are often associated with unwholesome and negative characteristics (for racial phenotypicality bias, see Maddox, 2004) (for colorism, see Hunter, 2007) -it may not be surprising, therefore, that the majority of ethnic cosmetic procedures are geared towards and undertaken by people of color. Hunter (2011) argued that people of color who opt to change their racial phenotypes to appear whiter tend to maintain their racial identities as intact: "The racial capital of whiteness is now something consumers can buy. It is not necessarily the case that consumers of skin-whitening products want to be white per se, but the huge demand for these products suggests that many people want to look white, or at least light, relative to other people in their racial or ethnic group" (Hunter, 2011, p. 149).
Skin bleaching, in particular, has grown into a multibillion-dollar industry despite the fact that many products contain chemical compounds (e.g., mercury, hydroquinone, and corticosteroids) linked to adrenal gland damage, kidney and liver failure as well as cancer and have been therefore prohibited for sale (but not for manufacturing) in the United States (see, Hunter, 2011). Advertisements for these products emphasize notions of "white beauty," privilege a whiter over darker appearance (e.g., "dark out, white in"), and urge users to "Click LIKE for fairer skin" on social media, such as on Facebook (Doland, 2014). Although less frequently, white individuals have also been documented to voluntarily alter racial phenotypic features via procedures such as filling lips, augmenting buttocks, and darkening skin tone to look more "exotic" (see Dreisinger, 2008).
The present set of studies is centered on the following question: Do people who voluntarily change their racial phenotypic features to appear more similar to a different race's become subject to social backlash, that is, to negative attitudes and dislike in the ultimate service of maintaining essentialist status quo beliefs about race? (For backlash, see Rudman & Fairchild, 2004; also see Rudman & Phelan, 2008; Rudman, Moss-Racusin, Phelan, & Nauts, 2012.)
Backlash entails punishing norm violators via social sanctions, while aiming to deter others, in an attempt to maintain existing beliefs and social hierarchies (Rudman, 1998; Rudman & Fairchild, 2004; Rudman & Phelan, 2008; Rudman, Moss-Racusin, Phelan, & Nauts, 2012). When perceivers encounter targets that violate social expectations (e.g., agentic women who defy gender-based stereotypes), perceivers tend to experience dislike towards these targets (e.g., Rudman et al., 2012). Dislike is often a manifestation of prejudice (Fiske, 2002) and is thus not a trivial social sanction. Being disliked causes people to evaluate themselves more poorly (Srivastava & Beer, 2005) and to experience discrimination in the workplace (Rudman & Glick, 2001), among other adverse outcomes. In the context of race, it follows that a target who opts to change his or her racial phenotypic features to appear more similar to a different race's, and thus transgresses race, would be similarly met with dislike.
Moreover, we contend that a target that transgresses race will incur an additional backlash in the form of vilification. We situate the construct of vilification in theorizing and empirical findings on humanness and subtle and everyday forms of dehumanization (Haslam, 2006). In particular, Haslam and colleagues (Bastian & Haslam, 2010;Haslam, 2006) have shown that there exists a set of positively and negatively valenced (e.g., conscientiousness and stinginess) human uniqueness traits, which are conceived as separating humans from other animals (Bastian & Haslam, 2010) (also see Leyens et al., 2001 on primary and secondary emotions). Vilification is a judgment of 'core badness' and is thus conceived herein as a simultaneous exaggeration of a person's negative human uniqueness traits and underestimation of his/her positive human uniqueness traits (versus infrahumanization in which judgments of both positive and negative human uniqueness traits are reduced) (for perceptual dehumanization, in which norm violators' faces are processed less holistically, see Fincher & Tetlock, 2016).
The mere existence of a target that voluntarily changes their racial phenotypic features calls into question the immutability or fixity beliefs about race. Thus, punishing the transgressor via backlash ("you are bad and I don't like you") might not completely resolve the uncertainty about the state of the world that perceivers often experience when they encounter norm transgressions (i.e., meaning threat, see Heine, Proulx, and Vohs, 2006), and would therefore necessitate an additional internal resolution, such as in the form of strengthening status quo essentialist beliefs about race. A meaning threat can be illustrated metaphorically by Yoko Ono's classic art piece, entitled White Chess Set (1966), which consists of an all-white board with all-white pieces. Given that both players' pieces cannot be distinguished from each other's -even if one set of pieces were conceived of as black pieces that had been painted over in white -this game will inevitably lead to a confusion between 'us' and 'them' pieces, rendering the distinction of opposing groups and the game itself meaningless.
We thus hypothesize that beyond backlash, perceivers will react to a person who opts to change his/her racial phenotypic features by experiencing a meaning threat, and in particular, feelings of uncertainty about the fixity and predictability of 'race.' Social policing of a voluntary racial transgressor is aligned with people's documented desire to restore certitude in the status quo and is a form of "fluid compensation" (see Heine, Proulx, & Vohs, 2006), an attempt to resolve the need for cognitive closure (Kruglanski & Webster, 1996; also see Jost & Banaji, 1994; Wakslak, Jost, & Bauer, 2011, for a system justification perspective). The target's volition is central to these predictions because accidental (versus voluntary) change is less likely to threaten status quo beliefs given that people who incur it still abide by social norms (i.e., are victims of circumstance).

Experimental Paradigm and Predictions
In all experiments, participants received a vignette of a protagonist that underwent non-elective surgery (white/Asian, Experiments 1a-1b; white/Black, Experiments 2-3). In the voluntary change condition, the protagonist asks that the surgeon change his/her racial features to resemble that of a different race whereas, in the involuntary change condition, the protagonist asks that the surgeon keep his/her racial features intact (Experiment 1: eye shape, Experiment 2: 'Afrocentric' features and skin tone). In the control condition, the protagonist did not incur any change. We predicted that a voluntary (versus accidental) change to the protagonist's racial phenotypic features would invoke social policing, that is, cognitive and affective backlash in the form of vilification and dislike.
Essentialism is predicated on a binary structure: the essence is assumed to be present or absent (Medin & Ortony, 1989). Thus, we predict that volition would have an effect on backlash but not on categorization judgments. That is, the 'transgressor' will still be considered to be a member of the racial category of origin, albeit, a more poorly functioning exemplar that is vilified and disliked. Thus, regardless of whether phenotypic change is voluntary or accidental, we expect to find a dissociation between similarity and categorization judgments, such that post-phenotypic change, the protagonist will be judged as more similar to the new racial category but as belonging to the original racial category. So far, this form of "origin essentialism" has only been shown with fictitious, non-human exemplars that did not possess volition (Rips, 1989).
Taken together, if predictions are corroborated, the findings would help shed light on an under-researched topic: the nature of societal reactions to individuals who voluntarily change their racial phenotypic features. In particular, the social pressure placed on people of color to appear whiter -and a multibillion-dollar industry that profits from it (in the face of documented health risks, see Hunter 2011) -might lead to a "damned if you do; damned if you don't" consequence for these individuals, concomitant with heightened cultural conceptions that reify race, serve to police racial boundaries, and maintain the status quo. Such findings would add to the extant literature on the entrenched nature of racial essentialism, a belief system which is at odds with a reality in which race is socially constructed (e.g., Bodmer & Cavalli-Sforza, 1976;Molnar, 1992;Gould, 1981;Tate & Audette, 2001) and that has been linked to stereotyping and social inequities (e.g., Bastian & Haslam, 2006;Prentice & Miller, 2007).

Experiment 1a
Experiment 1a was designed to explore whether a voluntary (versus accidental) change to the protagonist's racial phenotypic features would invoke cognitive and affective backlash in the form of vilification and dislike (external policing). European and Asian American participants received vignettes that depicted an Asian/white man/woman. In the voluntary and involuntary change conditions, the protagonist's eye shape was changed (on the racial diagnosticity of eye shape, see Brown, Dane, & Durham, 1998), whereas in the control condition, the protagonist incurred no change. We predicted that a voluntary (versus accidental) racial phenotypic change would invoke cognitive and affective backlash in the form of vilification and dislike. Our investigation was designed to explore the global nature of social policing of voluntary phenotypic transgressors given that racial essentialism is foundational to human categorization (Prentice & Miller, 2007). We deemed it important, however, to select a similar number of Asian- and white-identified participants to prevent the findings from being biased by an unequal representation of participant race.
Across all experiments, based on a prospective power analysis (Faul, Erdfelder, Lang, & Buchner, 2007), we determined that for a sample of 93 participants per condition (α = .05), we could detect an effect size of η² = .10 with a probability of .80, in a 2 (protagonist's race of origin: Black/Asian versus white) x 3 (Condition: control, voluntary, and involuntary) between-subjects design. However, the current paradigm is novel and, therefore, a true effect size is unknown. Thus, we aimed to include 200 participants (and up to a maximum of 257 participants) per experiment. Data for all experiments can be found on the Open Science Framework (OSF): https://osf.io/wh6fr/?view_only=d2c5cf953b8a4784bb57c45faa2fafb0.
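As an illustration of the effect size named above, the following minimal sketch converts η² into Cohen's f, the effect size metric that power analysis software for ANOVA designs commonly takes as input. The function name is hypothetical, and the sketch assumes the standard conversion f = sqrt(η² / (1 − η²)); it does not reproduce the power computation itself.

```python
import math


def eta_squared_to_cohens_f(eta_sq: float) -> float:
    """Convert eta squared to Cohen's f (assumed standard conversion)."""
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta squared must lie in [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


# Effect size targeted in the prospective power analysis (eta squared = .10)
f_value = eta_squared_to_cohens_f(0.10)
print(round(f_value, 3))  # 0.333
```

Under this conversion, η² = .10 corresponds to f ≈ .33.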

Participants
ijps.ccsenet.org International Journal of Psychological Studies Vol. 12, No. 4
Two hundred seventy-five participants were recruited via TurkPrime, a crowd-sourcing website that uses Amazon.com's MTurk. All participants had an MTurk approval rating of 90% or higher and lived in the United States. Participants received $.50 for their participation in the study. Twenty-nine participants were excluded (14 voluntary condition, 8 involuntary condition, 7 control condition) for either incorrectly answering manipulation checks (described below), missing data, not identifying as Asian or white, or for not allowing their data to be used, leaving 246 participants (111 male, 135 female; 128 Asian-identified, 118 white-identified; M Age = 34.9, SD Age = 10.8) in the final analyses. Removing the 29 participants did not influence the results, so they were excluded from the final analysis.

Design
We employed a 2 (protagonist's race of origin: Asian versus white) x 3 (condition: control, voluntary, and involuntary) between-subjects factorial design.

Control.
John/Jane is a health-conscious Asian American/White man/woman in his/her early 30s with a successful career, a bright outlook on life, and no family history of serious illness.
Involuntary versus voluntary change. The first part was identical to the control vignette. The second part was as follows: One day, John/Jane contracted an eye infection and surgery was required to preserve his/her vision. He/she underwent surgery by an ophthalmologist who was also a plastic surgeon. The surgeon gave John/Jane the option of keeping his/her eyes the same or changing them.
Involuntary change. John/Jane requested that his/her eyes stay the same. The surgeon tried to comply with his/her request, but due to a surgical error his/her eyes ended up looking like a White/Asian person's.
Voluntary change. John/Jane requested that his/her eyes be changed to look like a White/Asian person's. The surgeon complied with his/her request and his/her eyes ended up looking like a White/Asian person's.
Categorization vs. similarity dissociation measure. Two items were adapted from Rips (1989) to measure categorization, "How would you categorize John/Jane's race?" and similarity, "What race would you physically describe John/Jane as being most similar to?" using four-point Likert-type scales (1 = Asian; 2 = More Asian than White; 3 = More White than Asian; 4 = White). Ratings were recoded such that lower values corresponded to the protagonist's race of origin. The order of items and scale points were counterbalanced across all participants.
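The recoding step described above can be sketched as follows. This is an illustrative sketch only: it assumes raw responses are stored on the scale as printed (1 = Asian through 4 = White) after undoing any counterbalancing, and the function and constant names are hypothetical.

```python
SCALE_MAX = 4  # four-point scale: 1 = Asian ... 4 = White


def recode_rating(raw_rating: int, race_of_origin: str) -> int:
    """Recode so that lower values correspond to the protagonist's race of origin."""
    if raw_rating not in (1, 2, 3, 4):
        raise ValueError("rating must be an integer from 1 to 4")
    if race_of_origin == "Asian":
        # The raw scale already anchors 1 at the race of origin.
        return raw_rating
    if race_of_origin == "White":
        # Reverse the scale so that 4 (White) becomes 1.
        return SCALE_MAX + 1 - raw_rating
    raise ValueError("race_of_origin must be 'Asian' or 'White'")


# A White protagonist rated 3 ("More White than Asian") is recoded to 2.
print(recode_rating(3, "White"))  # 2
```

After recoding, a higher score on either item indicates a judgment that leans towards the new racial category, regardless of the protagonist's race of origin.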
Vilification. This measure consisted of ten items adapted from Bastian and Haslam (2010). Participants were asked to rate the extent to which Jane/John possessed the following traits: broadminded, conscientious, humble, polite, and thorough (five positive uniquely human traits; α = .864) as well as, disorganized, hard-hearted, ignorant, rude, and stingy (five negative uniquely human traits; α = .889) on seven-point Likert scales (1 = not at all; 7 = very much so).
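For readers unfamiliar with the reliability coefficient reported for these subscales, the sketch below computes Cronbach's α from item-level ratings using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The ratings shown are hypothetical values invented for illustration, not the study data, and the function name is an assumption.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item ratings per participant."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    # Transpose participant rows into per-item columns.
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical seven-point ratings on five traits from five participants
hypothetical_ratings = [
    [5, 6, 5, 6, 5],
    [3, 3, 4, 3, 3],
    [6, 7, 6, 6, 7],
    [2, 3, 2, 2, 3],
    [4, 4, 5, 4, 4],
]
alpha = cronbach_alpha(hypothetical_ratings)
print(round(alpha, 3))
```

The same computation would be run separately for the five positive and the five negative uniquely human traits, since the two subscales are reported with separate coefficients.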
Likability. Three items were adapted from Rudman et al., (2012) to measure likability. Participants rated: "How much do you like John/Jane?"; "Is John/Jane someone you want to get to know better?"; and "Would John/Jane be liked by people he/she associates with?" on seven-point Likert scales (1 = not at all; 7 = very much), α = .890.
Manipulation checks. This measure was designed to capture whether participants paid attention to the voluntary/involuntary manipulation. The first item consisted of the question: "Which of the following best describes John/Jane's procedure?" followed by four options: (1) John/Jane accidentally had his/her eye shape changed, (2) John/Jane chose to have his/her eye shape changed, (3) John/Jane accidentally had the shape of his/her nose changed, (4) John/Jane chose to have the shape of his/her nose changed. The second question was: "After the surgery, what eye shape did John/Jane have?" followed by two options: (1) Asian and, (2) White.

Cognitive and Affective Backlash
We predicted that a voluntary (versus accidental) change to the protagonist's racial phenotypic features would elicit cognitive and affective backlash in the form of vilification (i.e., a simultaneous exaggeration of a person's negative human uniqueness traits and underestimation of his/her positive human uniqueness traits) and dislike.
Vilification. We conducted a 2 (protagonist's race of origin: Asian versus white) by 3 (condition: control, voluntary change, and involuntary change) by 2 (valence: positive versus negative) mixed-factorial ANOVA on the uniquely human trait ratings. Overall, protagonists were perceived more positively than negatively, that is, there was a main effect of trait valence such that participants found all protagonists to possess higher levels of
Taken together, these findings indicate that participants in the voluntary condition vilified the protagonist to a larger extent than counterparts in the involuntary and control conditions.
Likability. A 2 (protagonist's race of origin: Asian versus white) by 3 (condition: control, voluntary change, and involuntary change) between-subjects ANOVA was conducted on the 3-item composite measure of likability. As predicted, there was a significant effect of condition, F(2, 240) = 24.091, p < .001, η²p = .167 (Figure 2).
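The partial eta squared reported for the likability effect can be recovered from the F statistic and its degrees of freedom. The following minimal sketch assumes the standard identity η²p = (F × df_effect) / (F × df_effect + df_error); the function name is hypothetical.

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    if f_stat < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    effect_term = f_stat * df_effect
    return effect_term / (effect_term + df_error)


# Effect of condition on likability in Experiment 1a: F(2, 240) = 24.091
print(round(partial_eta_squared(24.091, 2, 240), 3))  # 0.167
```

The result agrees with the reported value of .167, which offers a quick consistency check on the reported statistics.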

Experiment 1b
It is possible that the Experiment 1a findings (that voluntary racial phenotypic change leads to backlash) resulted from the fact that the protagonist desired a phenotypic change, racial or otherwise. Thus, in Experiment 1b, participants were given vignettes of white/Asian men/women who voluntarily changed a racially diagnostic feature (i.e., eye shape) or a race-neutral feature (i.e., chin shape). We predicted that a voluntary change to a racially diagnostic (versus a race-neutral) feature would invoke greater cognitive and affective backlash in the form of vilification and dislike. Our investigation was designed to explore the global nature of social policing of voluntary phenotypic transgressors given that racial essentialism is foundational to human categorization (Prentice & Miller, 2007). We deemed it important, however, to select a similar number of Asian- and white-identified participants to prevent the findings from being biased by an unequal representation of participant race.

Participants
One hundred participants were recruited via TurkPrime, a crowd-sourcing website that uses Amazon.com's MTurk. All participants had an MTurk approval rating of 90% or higher and lived in the United States. Participants received $.50 for their participation in the study. Seventeen participants were excluded (11 race-neutral condition, 6 racially diagnostic condition) for either incorrectly answering manipulation checks (described below), missing data, not identifying as Asian or white, or for not allowing their data to be used, leaving 83 participants (44 female, 39 male; 45 Asian-identified, 33 white-identified; M Age = 33.2, SD Age = 10.8) in the final analyses. Removing the 17 participants did not influence the results, so they were excluded from the final analysis.

Design
We employed a 2 (protagonist's race of origin: Asian versus white) x 2 (condition: racially diagnostic change versus race-neutral change) between-subjects factorial design.

Materials
Vignettes. The racially diagnostic change vignette was identical to the one used in Experiment 1a. The race-neutral change vignette was as follows: John/Jane is a health conscious, Asian American/White man/woman in his/her early 30s with a successful career, a bright outlook on life, and no family history of serious illness. One day, John/Jane was in an accident that left a large, infected gash on his/her chin. He/She underwent surgery by an oral and maxillofacial doctor who was also a plastic surgeon. The surgeon gave John/Jane the option of keeping his/her chin the same or changing it. John/Jane requested that his/her chin be changed to appear narrower. The surgeon complied with his/her request and his/her chin ended up looking narrower.
Categorization vs. similarity dissociation measure. Identical to that used in Experiment 1a.

Manipulation checks.
In all conditions, participants were asked, "Which of the following best describes John/Jane's procedure?" followed by four options: (1) John/Jane accidentally had his/her eye shape changed, (2) John/Jane chose to have his/her eye shape changed, (3) John/Jane accidentally had the shape of his/her chin changed, (4) John/Jane chose to have the shape of his/her chin changed. In the race-neutral change condition, participants were asked: "After the surgery, what was the shape of John/Jane's chin?" followed by two options: (1) Wider, and (2) Narrower. Participants in the racially diagnostic change condition were given a second manipulation check identical to Experiment 1a's.

Procedure
After following the consent process approved by San Francisco State University's Institutional Review Board and agreeing to participate, participants were randomly assigned to one of the following experimental conditions: racially diagnostic change or race-neutral change. After reading the given vignette, participants completed the categorization versus similarity dissociation measure, provided vilification and likability ratings, and, finally, answered the manipulation checks and a basic demographics questionnaire.

Results and Discussion
Cognitive and Affective Backlash.
We predicted that a voluntary change to the protagonist's racial phenotypic features (versus a race-neutral feature) would elicit greater cognitive and affective backlash in the form of vilification (i.e., a simultaneous exaggeration of a person's negative human uniqueness traits and underestimation of his/her positive human uniqueness traits) and dislike.
Vilification. We conducted a 2 (protagonist's race of origin: Asian versus white) by 2 (condition: racially diagnostic change versus race-neutral change) by 2 (valence: positive versus negative) mixed-factorial ANOVA on the uniquely human trait ratings. Overall, protagonists were perceived more positively than negatively, that is, there was a main effect of trait valence such that participants found all protagonists to possess higher levels

Likability.
A 2 (protagonist's race of origin: Asian versus white) by 2 (condition: racially diagnostic change versus race-neutral change) between-subjects ANOVA was conducted on the 3-item composite measure of likability.

Dissociation between Categorization and Similarity
We expected to find evidence for "origin essentialism" (Rips, 1989), such that after altering a racially diagnostic feature, the protagonist will be judged as more similar to the new racial category but as belonging to the original racial category. To this end, we conducted a 2 (protagonist's race of origin: Asian versus white) by 2 (condition: racially diagnostic change versus race-neutral change) by 2 (rating type: categorization versus similarity) mixed-factorial ANOVA.
As predicted, there was a significant condition by rating type interaction, F(1, 79) = 10.753, p = .002, η²p = .120. In the racially diagnostic change condition, Bonferroni adjusted simple effects analyses showed that participants rated the protagonist's appearance as being more similar to the new racial category (M = 1.931, SE = .124), but that they categorized the protagonist's race as being more aligned with the protagonist's race of origin.
Experiment 1b provides some evidence that while there may be backlash towards targets who change racially diagnostic or neutral features, perceivers tend to vilify targets who opt to change a racially diagnostic feature to a significantly larger extent than targets who choose to change a race-neutral feature.

Experiment 2
It is possible that findings from the previous experiments were stimulus-driven, specifically by the given protagonist's race (i.e., Asian or white). Experiment 2 was therefore designed to examine whether Experiment 1a's findings would replicate in a different ethnic/racial context. To this end, we adapted the vignettes to depict Black or white protagonists who underwent surgery to alter their Afrocentric features (i.e., nose, lips, and skin tone, see Brown, Dane, & Durham, 1998). Similar to Experiment 1a, we predicted that a voluntary (versus accidental) change to the protagonist's features would invoke cognitive and affective backlash in the form of vilification and dislike (social policing). Our investigation was designed to explore the global nature of social policing of voluntary phenotypic transgressors given that racial essentialism is foundational to human categorization (Prentice & Miller, 2007). We deemed it important, however, to select a similar number of Black- and white-identified participants to prevent the findings from being biased by an unequal representation of participant race.

Participants
Two hundred and thirty-four participants were recruited via TurkPrime, a crowd-sourcing website that uses Amazon.com's MTurk. All participants had an MTurk approval rating of 90% or higher and lived in the United States. Participants received $.50 for their participation in the study. Thirty-four participants were excluded (20 involuntary condition, 14 voluntary condition, 0 control condition) for either incorrectly answering manipulation checks (described below), missing data, or for not allowing their data to be used, leaving 200 participants (141 female, 59 male; 105 Black-identified, 95 white-identified; M Age = 33.6, SD Age = 10.9) in the final analyses. Removing the 34 participants did not influence the results, so they were excluded from the final analysis.

Design
We employed a 2 (protagonist's race of origin: Black versus white) x 3 (Condition: control, voluntary, and involuntary) between-subjects factorial design.

Materials
Vignettes. Vignettes were adapted from Experiment 1a to depict Black and white protagonists.

Control.
John/Jane is a health-conscious, African American/White man/woman in his/her early 30s with a successful career, a bright outlook on life, and no family history of serious illness.
Involuntary versus voluntary change. The first part was identical to the control vignette. The second part was as follows: John/Jane contracted meningitis and ended up 'losing his/her face,' such that his/her facial features became disfigured with severe damage to the skin. He/She underwent surgery by an oral and maxillofacial doctor who was also a plastic surgeon. The surgeon gave John/Jane the option of keeping his/her appearance the same or changing it.
Involuntary change. John/Jane requested that his/her facial features and skin tone stay the same. The surgeon tried to comply with his/her request, but due to a surgical error his/her facial features and skin tone ended up looking like a White/Black person's.
Voluntary change. John/Jane requested that his/her facial features and skin tone be changed to look like a White/Black person's. The surgeon complied with his/her request and his/her facial features and skin tone ended up looking like a White/Black person's.
Categorization vs. similarity dissociation measure. Identical to that used in Experiments 1a and 1b. However, scale options were adapted for the current vignettes (1 = Black; 2 = More Black than White; 3 = More White than Black; 4 = White).
Manipulation checks. The first item consisted of the question: "Which of the following best describes John/Jane's procedure?" followed by four options: (1) John/Jane accidentally had his/her facial features and skin tone changed, (2) John/Jane chose to have his/her facial features and skin tone changed, (3) John's/Jane's procedure only changed his/her facial features, (4) John's/Jane's procedure only changed his/her skin tone. The second question was: "After the surgery, what did John/Jane's facial features and skin tone look like?" followed by two options: (1) A Black person's, (2) A White Person's.

Procedure
After following the consent process approved by San Francisco State University's Institutional Review Board and agreeing to participate, participants were randomly assigned to one of the experimental conditions: control, voluntary change, and involuntary change. After reading the given vignette, participants completed the categorization versus similarity dissociation measure, provided vilification and likability ratings, and, finally, answered the manipulation checks and a basic demographics questionnaire.

Cognitive and Affective Backlash
We predicted that a voluntary (versus accidental) change to the protagonist's racial phenotypic features would elicit cognitive and affective backlash in the form of vilification (i.e., a simultaneous exaggeration of a person's negative human uniqueness traits and underestimation of his/her positive human uniqueness traits) and dislike.
Vilification. We conducted a 2 (protagonist's race of origin: Black versus white) by 3 (condition: control, voluntary change, and involuntary change) by 2 (valence: positive versus negative) mixed-factorial ANOVA on the uniquely human trait ratings. Overall, protagonists were perceived more positively than negatively, that is, there was a main effect of trait valence such that participants found all protagonists to possess higher levels
In sum, we found evidence (this time employing vignettes depicting Black and white protagonists) that a voluntary (versus accidental) change to the protagonist's racial phenotypic features invokes cognitive and affective backlash in the form of vilification and dislike (external policing).

Dissociation between Categorization and Similarity
We expected to find evidence for "origin essentialism" (Rips, 1989), such that post-phenotypic change, the protagonist will be judged as more similar to the new racial category but as belonging to the original racial category. To this end, we conducted a 2 (protagonist's race of origin: Black versus white) by 3 (condition: control, voluntary change, and involuntary change) by 2 (rating type: categorization versus similarity) mixed-factorial ANOVA.

Experiment 3
In alignment with the meaning maintenance model (see Heine, Proulx, & Vohs, 2006), we have shown evidence that perceivers experience a meaning threat in response to learning that a target has chosen to undergo racial phenotypic change. In Experiment 3, we test this hypothesis directly by operationalizing a meaning threat as feelings of uncertainty about the fixity and predictability of 'race.' Our investigation was designed to explore the global nature of social policing of voluntary phenotypic transgressors given that racial essentialism is foundational to human categorization (Prentice & Miller, 2007). We deemed it important, however, to select a similar number of Black- and white-identified participants to prevent the findings from being biased by an unequal representation of participant race.

Method

Participants
Two hundred and ninety-five participants were recruited via TurkPrime, a crowd-sourcing website that uses Amazon.com's MTurk. All participants had an MTurk approval rating of 90% or higher and lived in the United States. Participants received $.50 for their participation in the study. Thirty-eight participants were excluded (23 involuntary condition, 11 voluntary condition, 4 control condition) for either incorrectly answering manipulation checks (described below), missing data, not identifying as either Black or white or for not allowing their data to be used, leaving 257 participants (179 female, 78 male; 131 Black-identified, 126 white-identified; M Age = 35.1, SD Age = 11.9) in the final analyses. Removing the 38 participants did not influence the results, so they were excluded from the final analysis.

Design
We employed a 2 (protagonist's race of origin: Black versus white) x 3 (Condition: control, voluntary, and involuntary) between-subjects factorial design.

Materials
Vignettes. Identical to those used in Experiment 2.
Categorization vs. similarity dissociation measure. Identical to that used in previous experiments.
Likability. Identical to that used in previous experiments, α = .882.
Uncertainty. This eight-item measure of uncertainty, which served as a proxy for a meaning threat, was designed to measure both the specific uncertainty elicited by John's/Jane's situation (e.g., "I dislike that John/Jane's situation could mean many different things") as well as a more general uncertainty about not knowing a target's race (e.g., "Not being able to identify one's race/ethnicity makes me feel anxious") on seven-point Likert scales (1 = absolutely untrue; 7 = absolutely true), α = .901.

Manipulation checks. Identical to those used in Experiment 2.
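The reliability coefficients reported for the likability and uncertainty measures (α = .882 and α = .901) are presumably Cronbach's alpha, although the text does not name the statistic. Under that assumption, the sketch below shows how such a coefficient is commonly computed from item-level responses: α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the toy response matrix are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of participant rows of item scores.

    Assumes every row has the same number of items (k >= 2), at least two
    participants, and non-zero variance in the total scores.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))                  # one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings (1-7 scale) from three participants on two items.
toy = [[1, 2], [2, 4], [3, 6]]
print(round(cronbach_alpha(toy), 3))  # 0.889
```

Values near 1 indicate that the items vary together, which is the sense in which the eight uncertainty items are treated as a single proxy for meaning threat.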

Procedure
After following the consent process approved by San Francisco State University's Institutional Review Board and agreeing to participate, participants were randomly assigned to one of the experimental conditions: control, voluntary change, or involuntary change. After reading the given vignette, participants completed the categorization versus similarity dissociation measure and provided vilification and likability ratings. Participants in the involuntary and voluntary change conditions also completed the uncertainty measure and answered the manipulation checks. Finally, all participants completed a basic demographics questionnaire.

Cognitive and Affective Backlash
We predicted that a voluntary (versus accidental) change to the protagonist's racial phenotypic features would elicit cognitive and affective backlash in the form of vilification (i.e., a simultaneous exaggeration of a person's negative human uniqueness traits and underestimation of his/her positive human uniqueness traits) and dislike.
Vilification. We conducted a 2 (protagonist's race of origin: Black versus white) by 3 (condition: control, voluntary change, and involuntary change) by 2 (valence: positive versus negative) mixed-factorial ANOVA on the uniquely human trait ratings. Overall, protagonists were perceived more positively than negatively, that is, there was a main effect of trait valence such that participants found all protagonists to possess higher levels [...] There was no difference between the involuntary and control conditions on the negative uniquely human trait ratings, t(251) [...], p = .499, d = .185, .186].

Dissociation Between Categorization and Similarity
We expected to find evidence for "origin essentialism" (Rips, 1989), such that post-phenotypic change, the protagonist will be judged as more similar to the new racial category but as belonging to the original racial category. To this end, we conducted a 2 (protagonist's race of origin: Black versus white) by 3 (condition: control, voluntary change, and involuntary change) by 2 (rating type: categorization versus similarity) mixed-factorial ANOVA.

General Discussion
The current set of studies provides evidence that perceivers react to a person who opts to change skin-deep but socially-laden racial phenotypic features by engaging in social policing of racial group boundaries: directing cognitive and affective backlash at the target (i.e., vilification and dislike, respectively). Maintaining 'race' as essentialized, versus conceiving of it as a socially constructed category (see Bodmer & Cavalli-Sforza, 1976; Molnar, 1992; Gould, 1981; Tate & Audette, 2001), provides cognitive economy (see Medin, 1989) and a sense of meaning, order, and predictability (see Haslam et al., 2000), and it reinforces stereotypes (see Bastian & Haslam, 2006), which in turn helps to keep this framework intact: a catch-22.
It is not the case, however, that people are born with an intuitive sense of 'race' (Hirschfeld, 1988; 1995). Instead, race becomes imbued with an alleged natural essence, in retrospect, when children learn that race matters in society (see Hirschfeld, 1995; Prentice & Miller, 2007; Rothbart & Taylor, 1992). Specifically, children begin sorting people into race categories by three to four years of age based on surface structural cues, such as skin tone (Katz, 1982; Davey, Mullin, Norburn & Pushkin, 1983; Ramsey, 1987). Around mid-childhood, racial thinking becomes theory-driven (Katz, 1982), and at this point and into adulthood, racial categories transform from being surface structural, or feature-based, into explanation-based concepts that entail causal thinking (Yuill, 1992). Skin tone, for example, is no longer just a cue that differentiates between people who belong to different social groups, but becomes "diagnostic" of a person's aggression and criminality, a window into an alleged causal essence (for racial phenotypicality bias, see Maddox, 2004; Maddox & Gray, 2002; for a skin tone memory bias, such that a Black man appears lighter in the mind's eye following a counter-stereotypic prime, such as "educated," see Ben-Zeev et al., 2014). In sum, children notice racial surface-structural phenotypic differences early on, but the causal meaning that these features become imbued with is a result of learned social beliefs.
Perceptions of causality between an alleged hidden binary and discrete essence (i.e., one that is or is not presumed to exist) and observable characteristics are foundational to inferences about the extent to which an exemplar is deemed to be a well-functioning category member. Rehder and Burnett (2005) demonstrated that when participants were introduced to a novel category with a causal structure (that is, one in which a single feature was shown to cause all other features), participants perceived exemplars that possessed the causal feature but not any of its associated surface structural features as poor category exemplars (i.e., as less well-functioning). Our findings can be situated in a causal framework as follows: protagonists who opted to change their racial phenotypic features were likely subject to backlash because they chose to sever the "diagnostic" association between their perceived racial essence (i.e., causal feature) and their surface-structural phenotypic features, rendering themselves lesser category members. The fact that protagonists in the voluntary conditions were judged as belonging to their racial category of origin, and thus as possessing a racial essence, was evidenced by the dissociation between similarity and categorization, which replicated across all three experiments.
The current study offers a foray into understanding the effects of voluntary racial phenotypic change as a window into how social policing of racial group boundaries serves to maintain essentialist beliefs about race. As such, it leaves some questions unresolved, which call for future investigations. First, given that racial essentialism is foundational to human categorization (e.g., Medin & Ortony, 1989; Prentice & Miller, 2007), the current study was designed to explore the global nature of social policing of voluntary racial phenotypic transgressors. Thus, we cared more about the "how," that is, the nature of backlash (i.e., vilification and dislike) directed at voluntary transgressors, than the "why," or the content of perceivers' inferences regarding voluntary transgressors' motivations and perceivers' rationales for backlash and increased racial essentialism. As a result, the reasoning behind perceivers' judgments remains unclear. It is possible that perceivers viewed protagonists of color who opted to change their racial phenotypic features to appear whiter as assimilating to Eurocentric norms (see Davis, 2003; Haiken, 1997), whereas they viewed white protagonists who desired to appear more of color as appropriating marginalized outgroup norms (see Brubaker, 2016). In any case, we advocate for future explorations designed to shed a more nuanced light on perceivers' specific rationales for engaging in social policing of racial phenotypic transgressors, including an examination of any intersectionality effects (e.g., potentially differential reasoning underlying punishing ingroup versus outgroup racial transgressors).
Second, one might argue that perhaps protagonists in the voluntary conditions were perceived as seeking to alter their racial identities, and if so, that this perceived desire for identity change (versus for racial cosmetic change alone) was responsible for backlash and increased essentialism. Inferences about appearance- versus identity-driven motivations are not likely to be perceived as orthogonal, however. Consider Davis's (2003, p. 74) assertion that cosmetic surgery (racial or otherwise) can be seen as "…an intervention in identity rather than 'just' a beauty practice." That is, changing one's racial phenotypic features results, de facto, in some degree of social passing, and thus in a somewhat novel social identity (see Ginsberg, 1996), regardless of whether one's motivation was more aesthetically than identity driven to begin with. Thus, we contend that even if a protagonist were to explicitly maintain their racial identity as intact, backlash would likely still occur.
Theory-based speculation aside, consider reactions to a celebrity of color who appears to have lightened his or her complexion substantially while still explicitly identifying as a person of color, such as the former baseball player Sammy Sosa. Perceivers' reactions tend to range from accusing the celebrity of self-loathing to accusing the celebrity of a complete rejection of his or her Black identity (Hall, 2018). Thus, from both theory-based and cultural perspectives, it is reasonable to predict that even those individuals who choose to change their racial phenotypic features while explicitly maintaining their racial identities of origin would be subject to social policing. Nevertheless, the issue of whether and how appearance-motivated and identity-motivated phenotypic change are linked and predict backlash provides rich fodder for future empirical investigations.
Third, the current voluntary and involuntary condition vignettes were designed to possess as much of an analogous structure as possible, which resulted in passive depictions of voluntary change. That is, in the voluntary condition, a protagonist did not actively seek racial phenotypic change but was offered this possibility as a secondary cosmetic procedure. In the future, it would be useful to explore whether employing vignettes that illustrate a protagonist's more active and direct voluntary pursuit of racial phenotypic change might elicit the same or perhaps even a greater degree of dislike and vilification.
Finally, the question of whether racial ingroups vs. outgroups might hold different reasons for vilifying and disliking a racial transgressor is an open one and ripe for future investigations. Vilification and dislike might occur for different reasons and as a function of different moderators [e.g., social status, age, the extent to which a phenotypic change is deemed acceptable (e.g., straightening one's hair)], including whether a protagonist is a racial ingroup or outgroup member. Consider, yet again, the case of Sammy Sosa, who has been vilified in the media by both Black and white people, but for seemingly different reasons. For Black perceivers (racial ingroups), Sammy Sosa's skin bleaching, and ensuing white appearance, seem to invoke judgments of racial betrayal, such as that of a "race traitor" or a person who experiences self-hatred and internalized racism (see Hall, 2018; Nittle, 2018). For example, Wendy Williams, a popular African American talk show host, exclaimed, "Sammy Sosa, …Wow, he really hated himself." (Nittle, 2018). In an op-ed, Hall (2018) argued that when famous Black people whiten their racial phenotypic features, including the "king of pop" Michael Jackson, Sammy Sosa, and rapper Nicki Minaj, these celebrities are met with accusations of not being "black enough." White perceivers' (racial outgroups) vilification and dislike narratives seem to be characterized by negative judgments of a different flavor. Some white perceivers, for example, deem people of color's attempts to whiten their looks, including Sammy Sosa's, to be indicative of poor mental health (see Williams, 2017). The similarities and differences between ingroup and outgroup members' vilification and dislike narratives lie outside the scope of the current work and constitute a worthwhile endeavor for future investigations.
A fruitful direction might be to situate such investigations in the growing literature on "acting white" accusations and cultural invalidations (purposeful or inadvertent threats to an individual's social identity) by racial ingroups and outgroups (see Durkee, Gazley, Hope & Keels, 2019).
Overall, the present findings offer some evidence that perceivers respond to a target's voluntary racial phenotypic change by policing racial group boundaries, that is, by vilifying and disliking the target (cognitive and affective backlash). This essentialized view of race offers a way to make sense of the world while buffering existential angst (see Solomon, Greenberg & Pyszczynski, 1991). However, racial essentialism is not only counter-scientific but has also been documented to cause detrimental outcomes, such as stereotyping and discrimination (Bastian & Haslam, 2006), and the maintenance of a hierarchical racial structure that privileges certain groups and discriminates against others (Smedley & Smedley, 2005). It thus behooves us as a society to heed Hirschfeld's (1998) exhortation, "It is precisely because race is essentialized that it serves systems of power and authority" (p. 73), by helping to shift social views to embrace race as a social construction, advancing multiculturalism (e.g., Plaut, Thomas & Goren, 2009) while eschewing colorblindness (Neville, Lilly, Duran, Lee, & Browne, 2000), in a way that celebrates racial group differences, and without racism.