Measurement Invariance of the Difficulties in Emotion Regulation Scale in India and the United States

Measurement invariance testing is considered essential in determining whether a measure can be meaningfully used across cultural groups, though establishing such invariance is relatively rare in cross-national studies. The present study investigated measurement invariance of a widely used measure of emotion dysregulation, the Difficulties in Emotion Regulation Scale (DERS; Gratz & Roemer, 2004), in a sample of college students in India (n = 198) and the United States (US; n = 295). Results demonstrated that the item-level six-factor model for the DERS did not fit the data well in either the US or Indian samples. A scale score six-factor model without the item-level information fit the data well in both samples, and a scale score five-factor model (without the Lack of Emotional Awareness subscale) fit the data better in both samples. Using the five-factor scale score models, configural invariance testing indicated that the model varies across the two cultural groups. Overall, our findings failed to demonstrate measurement invariance of the DERS, suggesting that the DERS functions differently in the two cultural groups. Further research is needed to examine cross-national differences in the conceptualization and measurement of emotion regulation.


Introduction
Emotion regulation (ER) involves awareness, understanding, and acceptance of emotions, the ability to control impulsive behaviors related to negative emotions, and the ability to flexibly use emotion regulation strategies in order to meet individual goals and situational demands. The absence of any of these components would signify the presence of difficulties in ER, or emotion dysregulation (Gratz & Roemer, 2004). As such, these dysregulated processes may not be optimal in meeting long-term goals and environmental demands (e.g., underregulation or insufficient regulation of the amount and intensity of expressed emotion, and overregulation or suppression of emotion; Cole, Michel, & Teti, 1994). Cultural processes, systems, and ideals impact the ways in which emotions are regulated; these factors increase the likelihood of an emotional expression when it is culturally appropriate and decrease the likelihood of such expressions when they are not considered culturally appropriate. Further, cultural factors prepare individuals for the affectively laden situations they will encounter, such that situations likely to elicit emotions that are inconsistent with prevailing cultural scripts are kept to a minimum (Mesquita & Albert, 2007). Thus, people can approach (or avoid) certain people, places, or things in order to regulate emotional experiences and expressions that are acceptable (or unacceptable) to the dominant culture within a country.
The cross-national literature on emotion dysregulation has focused on early and middle childhood (e.g., Morris, Silk, Steinberg, Myers, & Robinson, 2007; Raval, Martini, & Raval, 2007; Wilson, Raval, Raval, Salvina, & Panchal, 2012), and cross-national studies that examine emotion dysregulation in college students are rare. Nonetheless, emotion dysregulation is linked to negative developmental outcomes (Tull, Barrett, McMillan, & Roemer, 2007), and cross-national investigations of emotion dysregulation processes across the range of development are needed. Given the many intra- and interpersonal transitions associated with attending college, this developmental stage provides fertile ground for investigations of emotion dysregulation. Toward this aim, one of the most commonly used measures of emotion dysregulation in adolescents and emerging adults is the Difficulties in Emotion Regulation Scale (DERS; Gratz & Roemer, 2004), although there is limited evidence supporting its use in cross-national studies of emotion dysregulation.
Evaluating the cross-national functioning of the DERS is important in order for researchers and clinicians to draw appropriate inferences and conclusions. The American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1999) note the importance of providing theoretically and empirically supported assessments of the reliability and validity of measures. Peng and colleagues (1997), however, report that North American samples tend to have higher reliabilities than other cultural groups, suggesting potential measurement bias. It is possible to obtain group mean differences that are spurious and actually a result of the measures having different psychometric properties in each culture (Chen, 2008). Chen (2008) therefore suggests that, before mean differences are examined, the functioning of the measure be assessed across the cultural groups.
Hui and Triandis (1985) suggest several methods for examining whether a measure can be used across cultures. Measurement invariance, however, is a more recent and desirable means of establishing the equivalence of a measure across groups; it is gaining use in cross-cultural research, but it is still rare (Chen, 2008). Measurement invariance or equivalence determines whether a construct or measure corresponds across groups (Meredith, 1993; Vandenberg & Lance, 2000). Measurement invariance is crucial as it establishes that any obtained mean differences reflect true latent differences rather than differences in measurement (Meredith, 1993). Using measures that lack invariance across cultures can lead to biases in cross-cultural statistical results; without invariance, group comparisons should be avoided (Chen, 2008). Therefore, the purpose of the present study was to examine measurement invariance of the DERS (Gratz & Roemer, 2004) in a sample of college students in India and the United States (US).
Broad cultural norms within a country, among other factors, play a substantial role in how individuals regulate their emotions, including which emotions will be controlled and which will be expressed (Morris, Silk, Steinberg, Myers, & Robinson, 2007; Raval, Martini, & Raval, 2007). In particular, cultural differences in construals of self are implicated in emotion processes, although few studies have taken a developmental approach to understanding self-construals among groups such as college students (see Becker, Natarajan, & Raval, 2012). Differences in interdependent (i.e., connection to others is prioritized) and independent self-construals (i.e., separation from others is prioritized) have been proposed as a way of understanding differences across countries such as India and the US, respectively (Markus & Kitayama, 1991; Triandis, 1995). This distinction has been very influential in cross-national conceptualizations of culture, although it must also be acknowledged that both interdependence and independence fall along a continuum and, as such, show within-culture variability and are not considered to be polar opposite cultural models (i.e., individuals within one country cannot be described as valuing only relatedness or only autonomy, and there is no 1:1 correspondence between country and self-construal typology; see, e.g., Keller, 2003; Tamis-LeMonda et al., 2008).
Of relevance to our examination of college students in India and the US, Kağıtçıbaşı (1996, 2005) suggested a third cultural model which combines autonomy and relatedness. This model of integrating perspectives of autonomy and relatedness is particularly pertinent to our study given the increasing "Westernization" among the urban educated middle class in countries such as India. With these considerations in mind, it is also noted that the dimensions of independence and interdependence may be differentially prioritized in countries such as the US and India. For example, an analysis conducted by Hofstede (2001) found the US to be ranked number one on the independent dimension, whereas Japan ranked number 22, indicating less of a focus on independence and more on interdependence.
Relatedly, while acknowledging intra-country differences across individuals, the broad goals of expressing or regulating emotion may be different in countries that tend to prioritize interdependence or independence. For instance, maintaining social relationships may be a primary goal in countries that foster a predominantly interdependent self-construal, whereas countries that foster a predominantly independent self-construal may prioritize the assertion of individuality (Markus & Kitayama, 1991). Second, the likelihood of affective expression or dysregulation may differ between individuals from countries that foster these differing self-construals. Asians, in general, are less likely than European Americans to express emotion, in order to maintain group well-being, a characteristic of interdependence (Kuppersbusch et al., 1999; Matsumoto, 1991; Wilson, Raval, Raval, Salvina, & Panchal, 2012). In contrast, European Americans more regularly express a wider range of emotions, including negative emotions, which is characteristic of individual uniqueness and independence (Keller & Otto, 2009). Third, specific expressive and regulatory behaviors may also vary between individuals from countries that promote predominantly independent or interdependent self-construals. For instance, Wilson and colleagues (2012) found that school-age children from the US reported using more direct, verbal methods of communicating their emotions (e.g., stating explicitly that one was angry or sad) compared to children from India. Fourth, certain types of emotions are more or less likely to be experienced and expressed in certain countries (Markus & Kitayama, 1991). For instance, Keller and Otto (2009) suggested that withholding the expression of emotion, especially negative emotions, is more beneficial for the community in countries where interdependence or relatedness is emphasized, whereas expressing both positive and negative emotions seems to be more beneficial in countries where independence is particularly salient. Consistent with this claim, Safdar and colleagues (2009) found that Japanese college students were less likely to express either negative (e.g., contempt, anger, disgust) or positive (e.g., happiness, surprise) emotions compared to North American college students.
Given that the broad goals of expressing or regulating emotion may be different in countries that tend to prioritize interdependence or independence, it is important to examine whether the measures that are being used to assess emotion regulation adequately capture the same targeted construct across those countries in order to prevent biases in cross-country statistical results. As shown above, India and the US represent countries where there may be differences in the expression and regulation of emotions. Therefore, the purpose of the present study was to examine measurement invariance of the DERS (Gratz & Roemer, 2004) in a sample of college students in India and the US.

Participants
Participants included 198 (101 male, 97 female) college students in India and 295 (116 male, 179 female) college students in the US. Among the US participants, 88.1% were Caucasian, 1.4% African American, 7.8% Asian American, and 2.4% other. Participants in the US ranged in age from 18 to 24 years, and those in India ranged from 18 to 22 years. Participants from India were recruited through announcements in first-, second-, and third-year classes of psychology, sociology, and business majors at two colleges in the northwestern state of Gujarat. Participants from the US were recruited through the psychology department undergraduate participant pool at a university in southwestern Ohio and included students with a wide range of undergraduate majors.

Sample Demographics
Several demographic variables varied between Indian and US college students. First, the majority of Indian participants reported living at home (84.8%), with the remainder living in a hostel, residence hall, or other form of residence. In contrast, the majority (98.0%) of US participants reported living somewhere other than home (e.g., residence hall, off-campus housing), with the remainder living at home, χ²(1) = 355.80, p < .001. Second, a majority of the Indian sample (57.1%) reported growing up in a joint family household (i.e., with parents, siblings, and extended family members), whereas it is presumed that almost all US family households were nuclear, including only parents and siblings. Third, the majority of the Indian sample reported their religion as Hinduism (80.8%), with the remainder reporting Jainism, Christianity, Islam, or other. In contrast, the majority of the US sample was Christian (82.0%), with the remainder reporting Buddhism, Hinduism, Islam, Judaism, or other, χ²(1) = 80.40, p < .001.
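As an illustration of the kind of test reported for living arrangements, the following minimal Python sketch computes a Pearson chi-square statistic for a 2 × 2 table. The cell counts are not the raw data: they are reconstructed here from the rounded percentages and sample sizes given above, which is an assumption, and the function name is hypothetical. Under that assumption the statistic comes out close to the reported value.

```python
# Illustrative sketch only. The cell counts are reconstructed from the rounded
# percentages reported in the text (an assumption), not taken from the raw data.

def pearson_chi_square(table):
    """Pearson chi-square statistic (no continuity correction) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

n_india, n_us = 198, 295
india_home = round(0.848 * n_india)   # 84.8% of the Indian sample living at home
us_away = round(0.980 * n_us)         # 98.0% of the US sample living away from home

# Rows: India, US. Columns: living at home, living elsewhere.
living_table = [
    [india_home, n_india - india_home],
    [n_us - us_away, us_away],
]
living_chi_square = pearson_chi_square(living_table)
print(f"chi-square(1) = {living_chi_square:.2f}")
```

A 2 × 2 table has (2 - 1)(2 - 1) = 1 degree of freedom, which is consistent with the χ²(1) notation used above.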

Procedure
The present study was part of a larger study examining college students' functioning in India and the US and was approved by the research ethics board at the first author's institution. In both countries, written consent was obtained from each participant in their native language prior to measure administration. In the US, participants completed questionnaires in English using online survey software in a computer lab at the first author's academic institution in groups of 9-10 individuals. In India, participants completed questionnaires in Gujarati in a paper-and-pencil format in groups of approximately 30-40 people. The DERS was translated from English to Gujarati and back-translated to ensure linguistic equivalence.

Measures
Difficulties in Emotion Regulation Scale (DERS; Gratz & Roemer, 2004). The DERS is a 36-item self-report questionnaire that assesses clinically relevant difficulties in ER with an emphasis on negative emotions. Items are scored on six scales, labeled Lack of Emotional Awareness (6 items), Lack of Emotional Clarity (5 items), Difficulties Controlling Impulsive Behaviors When Distressed (6 items), Difficulties Engaging in Goal-Directed Behavior When Distressed (5 items), Nonacceptance of Negative Emotional Responses (6 items), and Limited Access to Effective ER Strategies (8 items). Items are scored on a 5-point scale (1 = almost never, 5 = almost always). Subscale scores are obtained by summing the corresponding items. Evidence has been provided in support of the reliability of DERS scores. Specifically, DERS scores have been found to demonstrate good test-retest reliability over a period of 4 to 8 weeks in a sample of US college students (r = .88; Gratz & Roemer, 2004), and both the overall DERS score and subscale scores have been found to have high internal consistency within both clinical (e.g., Fox, Axelrod, Paliwal, Sleeper, & Sinha, 2007; Gratz, Tull, Baruch, Bornovalova, & Lejuez, 2008) and nonclinical populations in the US (e.g., Gratz & Roemer, 2004). Support for the construct and predictive validity of DERS scores within clinical and nonclinical populations in the US has also been found (Fox et al., 2007; Gratz & Roemer, 2004, 2008; Gratz, Rosenthal, Tull, Lejuez, & Gunderson, 2006). In the present study, internal consistencies for the DERS Total score were high (India sample α = .88, US sample α = .92). See Table 1 for descriptive statistics, and Table 2 for internal consistencies of all DERS subscales in India and US samples. All analyses were conducted with the full US sample, and internal consistencies were then re-run to include only those US college students who indicated being of European-American background. Internal consistencies did not differ when non-Caucasian students were removed from the US sample, and so results are presented with the full US sample in order to be as representative of US college students as possible. In addition, including all US participants regardless of ethnicity is consistent with the college samples used in the original DERS validation (Gratz & Roemer, 2004).
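For readers less familiar with the internal consistency coefficient reported above, the following Python sketch shows how Cronbach's α can be computed from an item-response matrix: α = k/(k - 1) × (1 - Σ item variances / variance of the summed score). The response matrix is a made-up toy example, not data from the present study, and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance(item_scores) for item_scores in zip(*responses)]
    # Sample variance of the summed (subscale) score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses on a 5-point scale (rows = respondents, columns = items).
toy_responses = [
    [1, 2, 1, 2, 1],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [4, 4, 5, 4, 4],
    [5, 4, 5, 5, 5],
]
toy_alpha = cronbach_alpha(toy_responses)
print(f"alpha = {toy_alpha:.2f}")
```

Because α depends on the number of items as well as their intercorrelations, coefficients for subscales of different lengths are not directly comparable.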

Data Analytic Strategy
The goal of this study was to examine the measurement equivalence or invariance of the DERS across the US and Indian samples. Therefore, multiple group confirmatory factor analyses (CFA) were the primary mode of assessment. Following the steps outlined by Chen (2008), the first step was to examine configural invariance, or form invariance, of the DERS. Each additional step in the process examined the invariant and variant items for differences. Mplus version 5.21 (Muthén & Muthén, 1998-2007) was used to examine the relations among the DERS constructs using maximum likelihood estimation. The measurement models were based upon theoretical predictions and examined using the following criteria: (1) theoretical salience, (2) incremental fit indices (Comparative Fit Index [CFI], Tucker-Lewis Index [TLI]), and (3) an absolute fit index (Root Mean Squared Error of Approximation [RMSEA]). Chen (2007) determined that CFI and RMSEA were sensitive to lack of invariance. Therefore, CFI and RMSEA were considered primary indicators of fit of the measurement models.
To meet criteria for theoretical fit, the model must be predicted from documented theory and previous research. For incremental fit indices (i.e., CFI and TLI), values above 0.90 indicate a well-fitting model (Hu & Bentler, 1999). For RMSEA, a value of less than 0.05 indicates a well-fitting model (Browne & Cudeck, 1992).
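The cutoffs described above can be expressed as a short Python sketch. The RMSEA function uses a common single-group point estimate, sqrt(max(χ² - df, 0) / (df × (n - 1))); whether the software used here divides by n or n - 1 is not stated in the text, so that choice is an assumption, as are the function names. The example values are those reported for the six-factor scale score model in the US sample in the results that follow.

```python
import math

def rmsea(chi_square, df, n):
    """Point estimate of RMSEA, assuming the (n - 1) convention in the denominator."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def meets_cutoffs(cfi, tli, rmsea_value):
    """Apply the cutoffs stated above: CFI and TLI above 0.90, RMSEA below 0.05."""
    return {
        "CFI": cfi > 0.90,
        "TLI": tli > 0.90,
        "RMSEA": rmsea_value < 0.05,
    }

# Reported values for the six-factor scale score model, US sample.
scale_model_rmsea = rmsea(chi_square=30.83, df=9, n=296)
scale_model_fit = meets_cutoffs(cfi=0.95, tli=0.92, rmsea_value=scale_model_rmsea)
print(round(scale_model_rmsea, 2), scale_model_fit)
```

In this example the incremental indices pass their cutoff while RMSEA does not, which illustrates why the indices are best read together rather than individually.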

Factor Structure of the DERS in the US and Indian Samples
The baseline item-level model for the DERS in the US sample did not fit the data well, χ²(579, n = 296) = 1803.99, CFI = .80, TLI = .78, RMSEA = .09. See Table 2 for the factor loadings of the six-factor item-level model. Based on the recommendations of Bardeen, Fergus, and Orcutt (2012), we removed the items that comprise the Lack of Emotional Awareness factor from the analysis. The modified item-level model also did not fit the data well, χ²(395, n = 296) = 1319.47, CFI = .83, TLI = .81, RMSEA = .09. To determine if the higher-order structure fit the data, a scale score model examined the data without the item-level information. The simplified six-factor model fit the data, χ²(9, n = 296) = 30.83, CFI = .95, TLI = .92, RMSEA = .09 (see Table 3 for the factor loadings of the six-factor scale score model). Again, utilizing the recommendations of Bardeen et al. (2012), the Lack of Emotional Awareness subscale was removed, and the model fit improved, χ²(5, n = 296) = 20.01, p = .001, CFI = .97, TLI = .93, RMSEA = .10, Δχ²(4) = 10.82, p = .03. See Table 4 for the factor loadings of the five-factor baseline (scale score) model.
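The chi-square difference reported for the six- versus five-factor scale score models can be retraced with a few lines of Python. This sketch only reproduces the arithmetic from the reported statistics; it uses a closed-form upper-tail probability that is valid only when the degrees of freedom are even, and the function name is hypothetical.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    # P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Reported statistics for the six- and five-factor scale score models (US sample).
delta_chi_square = 30.83 - 20.01   # difference in chi-square
delta_df = 9 - 5                   # difference in degrees of freedom
p_value = chi_square_sf_even_df(delta_chi_square, delta_df)
print(f"delta chi-square({delta_df}) = {delta_chi_square:.2f}, p = {p_value:.3f}")
```

For odd degrees of freedom a general routine (for example, the survival function of a chi-square distribution in a statistics library) would be needed instead of this closed form.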

Invariance of the DERS across the US and Indian Samples
Using the five-factor baseline models, configural invariance (i.e., restricting the factor structure across groups) resulted in a model that did not fit the data well, χ²(18) = 116.66, CFI = .86, TLI = .84, RMSEA = .13. This result indicated that the structure of the DERS varies across the groups. The lack of invariance at the configural stage precludes further invariance testing and mean comparisons of the scales across samples (Chen, 2008).

Discussion
Recently there has been heightened interest in the importance of measurement invariance in cross-group comparisons. This interest seems to be particularly prominent in cross-cultural research, where socialization factors have a well-known impact. Researchers such as Hambleton (2005) have demonstrated the extent to which measures and the constructs measured may vary across cultures. This discrepancy in measures and the constructs measured has been found when studying various cultures, and demonstrates the need to test measurement invariance prior to conducting cross-cultural comparisons.
The DERS (Gratz & Roemer, 2004) is a commonly used self-report measure of emotion dysregulation. The purpose of the present study was to examine measurement invariance of the DERS in a sample of college students in India and the US. The findings demonstrated that the item-level six-factor model for the DERS did not fit the data well in either the US or Indian samples. A scale score six-factor model without the item-level information fit the data in both samples, and a scale score five-factor model (without the Lack of Emotional Awareness subscale) fit the data better in both samples. Using the five-factor scale score models, configural invariance testing indicated that the model varies across the two cultural groups. Overall, our findings failed to demonstrate measurement invariance of the DERS, suggesting that the DERS functions differently in the two cultural groups.
With the widespread use of the DERS, an extensive amount of data has shown that at least one subscale (Lack of Emotional Awareness) consistently demonstrates weak to moderate intercorrelations with the other subscales of the DERS and may not capture the same construct of emotion regulation as the other five subscales (Bardeen et al., 2012). It has been suggested that awareness of negative emotional states may not be essential or adequate for adaptive emotion regulation (Tull, Barrett, McMillan, & Roemer, 2007). Tull and colleagues (2007) posited that some forms of emotional awareness may be maladaptive (e.g., rumination on negative emotion) while other forms may be adaptive (e.g., nonjudgmental acceptance). Thus, the Lack of Emotional Awareness subscale may adequately measure emotional awareness, but it may be that emotional awareness is not necessarily related to adaptive emotion regulation (Bardeen et al., 2012).
Similar to past studies that have found the Lack of Emotional Awareness subscale to be distinct from other dimensions of the DERS (e.g., Bardeen et al., 2012; Tull et al., 2007), the present study suggests that Lack of Emotional Awareness may not belong to the same domain as the other dimensions of the DERS: the subscale showed a much lower contribution to the DERS factor relative to the five other subscales in both the Indian and US samples.
While there may be multiple explanations for variance of a measure across cultures, many problems may be connected to bias or to translation into different languages (Byrne & Watkins, 2003). Translation may play a particular role in the failure to find measurement invariance with the DERS in the present study, since the measure was translated into Gujarati for the Indian sample. Even though standard procedures for translation and back-translation were followed in the present study, it remains possible that translating the measure changed the meaning of words, items, or even constructs.
Problems with bias can be due to three primary sources: the construct of interest, the methodological procedure, and the item content. Construct bias is based on the idea that a construct may not hold the same degree of meaningfulness for both groups being studied. The behaviors that the measure is examining may be "differentially appropriate across cultural groups" (Byrne & Watkins, 2003, p. 157). For example, constructs of emotion dysregulation may not mean the same in both countries. Behaviors and emotions reflecting positive emotion regulation may vary in different countries. Kitayama, Mesquita, and Karasawa (2006) found that positive socially engaging emotions (e.g., friendly feelings, feelings of respect) predicted well-being among Japanese college students. On the other hand, positive disengaging emotions (e.g., pride, self-esteem) were more likely to predict, though not as strongly, well-being among college students from the US. Therefore, in Eastern cultures such as India, social appraisals are correlated with one's well-being, whereas the appraisal of one's personal happiness is correlated with well-being in Western cultures such as the US (Mesquita & Albert, 2007). This may indicate that certain aspects of emotion regulation may not be equally beneficial or disruptive for individuals in different countries, which could contribute to the lack of measurement invariance found in this study. The present study found relatively low internal consistencies of the DERS subscales for the Indian sample.
Internal consistencies for the Indian sample were lower than the internal consistencies for the US sample for all DERS subscales. This finding indicates that the subscales may not be functioning coherently in the Indian sample, further questioning the relevance of these constructs. This finding may also reflect the tendency for North American samples to have higher reliabilities than other cultural groups (Peng et al., 1997).
Problems with method bias can be broken down into sample bias, instrument bias, and administration bias. Sample bias relates to how comparable the samples being studied are in terms of variables other than the target variable being examined (Byrne & Watkins, 2003). For example, the two samples differed in some basic demographic information (e.g., living arrangements) that may influence their comparability. For instance, the majority of Indian participants reported living at home (84.5%) while the majority of US participants were not living at home (77.3%). Differences in the comparability of factors such as this may contribute to the failure to find measurement invariance. Additionally, the developmental period of emerging adulthood may be different for college students in India versus college students in the US (Arnett, 2004; Saraswathi, Manjrekar, & Pant, 2003). This is a crucial area for future research.
With regard to instrument bias (Byrne & Watkins, 2003), it is possible that the two samples are not equally familiar with the Likert-type scaling format used in the DERS, which could bias the item scores. Response set, which involves either consciously or unconsciously selecting scale points in a way that conveys a positive impression of oneself, may also be an issue when trying to establish measurement invariance across countries (Byrne & Watkins, 2003). Individuals from a country supporting a predominantly interdependent self-construal, such as India, may be more likely to engage in acquiescence bias, partly due to the desire to maintain group harmony and to convey agreeableness (Matsumoto & Juang, 2007), which may contribute to our failure to find measurement invariance.
Item bias can also impact measurement invariance in cross-country samples. Items may be biased if they draw out a differential meaning of their content across groups (Byrne & Watkins, 2003). This may be of particular importance for measures where socialization practices relate to the construct being examined. Indian college students may report more emotion dysregulation (e.g., nonacceptance of negative emotions) than their US counterparts because previous research has found that the expression of negative emotions may have lower acceptability in India than in the US (Wilson et al., 2012), and that negative emotions have lower acceptability than physical pain in India (Raval, Martini, & Raval, 2009; Raval et al., 2007). Negative emotions such as anger or sadness may be experienced but not expressed in family interactions, as they are typically associated with an individual's own needs, and may convey a discomfort with the social world that is harmful to one's relations with others and to family harmony (Raval, Raval, & Becker, 2012; Kitayama et al., 2006). Items such as "When I'm upset, I feel guilty for feeling that way" may have more relevance in India because of cultural norms related to lower acceptability of negative emotions. Thus, due to these differences in socialization practices, it is reasonable to conceive that there may be different sets of criteria by which emotion dysregulation is judged in different countries.
Several limitations of the present study should be noted. Both India and the US are diverse countries with significant regional and socio-demographic differences in lifestyles and belief systems. The current sample in the US was primarily drawn from the Midwest and the Indian sample was recruited from the northwestern state of Gujarat. Specifically, the US sample included a limited number of non-Caucasian participants, which does not reflect the actual representation of these groups in the US. Thus, caution is warranted in generalizing to all Indian or to all US college students. Researchers should investigate whether the present findings can be replicated with emerging adults outside of college and with youth across the developmental spectrum. Similarly, this study is limited by mono-informant and mono-measure methods. It will be important for future work in this area to include multiple measures of emotion processes, as well as to collect responses from multiple informants (e.g., parents, romantic partners). The administration procedures also differed, with the Indian sample completing the measure by paper and pencil and the US sample completing the measure on the computer. Future studies should examine the effects of methods of administration on participant responses, and administer the measure in a similar manner to rule out procedural differences.
Despite the limitations, the present study makes an important contribution to the evaluation of the cross-national functioning of the DERS. Given that difficulties in emotion regulation have been linked to a variety of negative outcomes, it is necessary to have a psychometrically sound self-report measure of emotion regulation that is equivalent across cultural groups. Though the DERS has helped researchers further examine difficulties with emotion regulation and their effects on a number of phenomena, the present study suggests that researchers conduct further testing of this measure and consider creating a new measure of emotion regulation from the ground up that can be employed in a cultural context such as India. Toward this aim, qualitative methods may be used to examine the relevance of emotion regulation and dysregulation in order to further explore the components that make up emotion regulation in India and, in turn, to define a culturally nuanced construct and create a related set of response items.

Table 1.
Means and standard deviations of participant responses to Difficulties in Emotion Regulation Scale in India and United States

Table 2.
Factor loadings of six-factor item-level baseline models in India (N = 198) and United States (N = 295)

Table 3.
Factor loadings of the six-factor baseline and configural models of DERS in India and the United States

Table 4.
Factor loadings of the five-factor baseline and configural models of DERS in India and the United States
Note: *** p < .001; ns = non-significant