An Evaluation of Items and Dimensional Structure of a Scale to Measure Teachers' Professional Commitment in Thailand

The purpose of this study was to examine the items and the dimensionality of a scale to measure professional commitment of teachers in Thailand. Data were collected by in-depth interviews and a questionnaire. Content analysis was conducted on the qualitative data for the purpose of item writing. Differential item functioning (DIF), exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and bi-factor CFA model analyses were conducted to establish validity evidence. The EFA and CFA results showed that professional commitment of teachers can be measured effectively by a three-dimensional, 18-item scale. A polytomous item response theory model was also fitted to estimate item parameters and to examine the test information function.

In addition, some authors have concluded that the construct can be measured by a unidimensional scale, while others claim that a multidimensional scale is necessary. Researchers who treated commitment as a unidimensional construct focused on either the attitudinal (Blau, 1988) or the behavioral (Benkhoff, 1997) aspect of professional commitment. When both attitude and behavior were considered, professional commitment was typically defined as a two-dimensional construct (Aranya et al., 1981). Some researchers defined professional commitment as a three-dimensional construct. For example, Meyer et al. (1993) defined professional commitment with affective, normative, and continuance factors, whereas London (1983) defined the construct by identity, insights, and resilience. Furthermore, some researchers defined the construct by four factors: affective, normative, accumulated cost, and limited alternative (Blau, 2003, 2008).
When professional commitment scales are developed, some are intended for a specific occupation, whereas others are intended for any occupation. For example, Meyer and Herscovitch (2001) believed that their three-dimensional commitment scale could be used for any occupation. Although the three-dimensional commitment scale (Meyer et al., 1993; Meyer & Herscovitch, 2001) is popular among practitioners, some researchers have pointed out that the scale may not be applicable across all occupations (Culpepper, 2000; Blau, 2001, 2008; Bergman, 2006; Jaros, 2007; Xu & Bassham, 2010). Other researchers developed professional commitment scales intended for specific occupations (Blau, 1988; Morrow & Wirth, 1989; Grzeda & Prince, 1997; Bagraim, 2003).
In summary, commitment has been defined and measured in many different ways, and there is a lack of consensus on the definition of the construct. As a result, the number of items and the dimensional structure of the construct have been derived differently depending on how the construct is defined and who developed the scale. Thus, some parts of the existing commitment scales are quite different from each other, while other parts overlap. This weakness in measuring professional commitment may diminish validity: a scale may be only weakly related to the construct, depending on how well the definition of the construct matches the user's intention in measuring it (Jaros, 2007; Blau, 2008; Xu & Bassham, 2010).
The purpose of this study was to develop and investigate the scale items and the factor structure of professional commitment specifically for teachers in Thailand. Teacher commitment has been suggested as one critical element in the success of school education. It is associated with teachers' work performance, absenteeism, turnover, and attitude toward school, as well as with students' academic achievement (Crosswell & Elliott, 2004; Firestone & Pennell, 1993; Park, 2005). Data were analyzed both qualitatively and quantitatively to develop a scale for measuring professional commitment of teachers.

Method
All data collection was conducted in the Thai language, including the interviews, the item development process, and the measurement instruments. The procedures, measures, and results reported in this paper are English translations.

Participants and Measures
The participants in this study were teachers in Thailand. The study consisted of two phases. In Phase 1, in-depth interviews were conducted with 23 teachers for the purpose of analyzing the concept of professional commitment as perceived by teachers. As a result, 19 questionnaire items were developed. In Phase 2, a total of 600 teachers were sampled, and questionnaire responses from a total of 471 teachers were analyzed for the purpose of refinement and validation of the scale. Data were collected in two separate steps in Phase 2. In the first step, the 19-item questionnaire developed in Phase 1 was distributed to a total of 300 teachers, and 269 were returned completed, representing an acceptable response rate of 89.67%. The demographics of the sample were as follows: 50.6% taught in public schools and 49.4% in private schools; 61.7% had majored in education in college and 23.4% in another field; and 74.3% were teaching the same subject area as their college major, whereas 9.3% were not.
In the second step of Phase 2, two questionnaires were distributed to a total of 300 teachers. The first questionnaire was the one developed in Phase 1, with one item removed based on the analysis in the first step of Phase 2, resulting in an 18-item questionnaire. The second questionnaire was an 18-item questionnaire developed by Meyer et al. (1993), which is intended to measure organizational commitment. This questionnaire was translated into Thai for the purpose of this study. Meyer et al. originally used a 7-point scale for their items. However, a 5-point Likert-type scale was adopted for the items in this study, running from 1 (strongly disagree) to 5 (strongly agree). Of the 300 questionnaires distributed, 202 (67.33%) were returned. The demographics of the sample were as follows: 65.8% taught in public schools and 34.2% in private schools; 62.7% had majored in education in college and 22.3% in another field; and 73.3% were teaching the same subject area as their college major, whereas 15.8% were not.

Procedures
In-depth interviews were conducted in Phase 1. The interview was structured, with the same set of questions for all teachers, and took approximately 30 minutes per teacher. The interview questions were as follows.
(1) Why did you choose to become a teacher? What motivated you to become a teacher?
(2) Describe what you think about teaching as a profession.
(3) Describe what you think about your responsibility as a teacher.
(4) Describe what you think about your teacher colleagues.
(5) Describe what you think about your organization.
(6) Have you ever attended professional development seminars for teachers? If so, what were they?
(7) What is your ultimate goal in this career?
(8) If you were offered another job with a higher salary and better benefits, would you keep your career as a teacher?
(9) Describe what you think are important qualifications to become a teacher.
The MAXQDA software program was used to analyze the data qualitatively. The data from the interviews were used to develop questionnaire items.
In Phase 2, the SPSS and Mplus software programs were used to analyze the data quantitatively. In Step 1 of Phase 2, item-total correlations were examined for the 19 items. At this point, one item was removed from the questionnaire due to its near-zero item-total correlation. Then, differential item functioning (DIF) analyses were conducted based on confirmatory factor analysis (CFA) with covariate modeling to compare the conditional item performance difference for each of three DIF factors (type = public vs. private schools, major = education major in college vs. other majors, and experience = teaching same subject area as college major vs. not). Also, exploratory factor analysis (EFA) was conducted to investigate the dimensionality of the 18-item questionnaire.
In Step 2 of Phase 2, a series of CFAs was conducted to cross-validate the factor structure identified in the previous step. Furthermore, a bi-factor CFA model analysis was conducted to compare the 18-item questionnaire developed in this study to the organizational commitment scale developed by Meyer et al. (1993). For model comparison purposes in these analyses, the chi-square (χ²) statistic, root mean square error of approximation (RMSEA), comparative fit index (CFI), and standardized root mean square residual (SRMR) were used as model fit indices. Lastly, the graded response model (GRM) was fitted to a combined sample from Step 1 and Step 2 of Phase 2 to estimate item parameters for the 18 items. The test information function was also examined.
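The item-total screening step described above can be illustrated with a minimal Python sketch. It assumes the corrected form of the statistic (each item is correlated with the sum of the remaining items); the paper does not state which form was used. The response matrix and the 0.2 flagging threshold are hypothetical and are not taken from the study data.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corrected_item_total(responses):
    """Correlate each item with the sum of all other items (assumed form)."""
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical 5-point responses: 6 respondents (rows) by 4 items (columns).
toy_responses = [
    [5, 4, 5, 3],
    [4, 4, 4, 1],
    [3, 3, 2, 4],
    [2, 2, 3, 2],
    [1, 2, 1, 5],
    [4, 5, 4, 3],
]

item_total = corrected_item_total(toy_responses)
# Hypothetical screening threshold; the study only reports a near-zero value.
flagged_items = [j for j, r in enumerate(item_total, start=1) if r < 0.2]
```

In this toy matrix the fourth item runs against the other three, so it is the one flagged for review, which mirrors the way a near-zero item would be singled out.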

Results
In this study, six keywords (sub-concepts) were selected for the construct of teachers' professional commitment, based on experts' judgment of the transcribed interviews. The group of experts consisted of the first author of this paper and four experienced teachers. The selected keywords were (a) value, (b) dignity, (c) responsibility, (d) willingness, (e) benefit, and (f) alternative. The group of experts agreed that these six keywords sufficiently cover the construct of professional commitment of teachers.
Then, a code matrix browser analysis was run in the MAXQDA software to verify that the six selected keywords appeared sufficiently often in the interview data. A code matrix browser analysis involves coding and categorizing the data, with systematic searches to find usages of given keywords. The resulting keyword frequencies were: responsibility 45 (25.00%), value 41 (22.78%), dignity 29 (16.11%), alternative 27 (15.00%), benefit 20 (11.11%), and willingness 18 (10.00%). Furthermore, all six keywords were mentioned by a majority of teachers (see Table 1). Also, a majority of teachers mentioned 5 or 6 keywords; only 7 teachers mentioned 3 or 4 keywords (see Table 2). These results confirmed that the six selected keywords were sufficiently and frequently mentioned in the interviews, and the group of experts agreed to retain all six keywords for writing questionnaire items.
Next, the group of experts worked on operational definitions of the six keywords. The operational definitions were entered into the MAXQDA software, and the software extracted a collection of sentences from the interview data that used either the target keyword or the definition of the keyword. Then, the experts examined the list for each keyword to further elaborate and/or refine the operational definition. Based on the finalized operational definitions of the six keywords, the experts wrote questionnaire items. It was determined that 3 items per keyword would sufficiently cover each sub-construct, except for the keyword alternative, for which 4 items were written. As a result, a total of 19 questionnaire items were developed.
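The reported keyword percentages follow directly from the frequency counts. As a quick check, the short Python sketch below recomputes each keyword's share of all coded mentions from the counts given above; the variable names are illustrative only.

```python
# Keyword frequencies as reported from the code matrix browser analysis.
keyword_counts = {
    "responsibility": 45,
    "value": 41,
    "dignity": 29,
    "alternative": 27,
    "benefit": 20,
    "willingness": 18,
}

total_mentions = sum(keyword_counts.values())

# Share of all coded mentions, in percent, rounded to two decimals.
keyword_shares = {
    keyword: round(100 * count / total_mentions, 2)
    for keyword, count in keyword_counts.items()
}

for keyword, share in keyword_shares.items():
    print(f"{keyword:15s} {keyword_counts[keyword]:3d}  {share:6.2f}%")
```

The counts total 180 coded mentions, and the recomputed shares agree with the percentages reported in the text.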
For example, the operational definition of responsibility was determined by the experts to be "Working hard and taking responsibility in teaching. Being aware of students' achievement and teachers' ethics." This keyword was mentioned by 3 teachers among 21 teachers in their interviews. They provided comments as follows.
- I don't expect to work on my promotion. I would prefer to work for my students.
- I feel I give good knowledge to my students. I want them to be a good person.
- My career goal is for my students.
Then, based on the operational definition and the extracted interview data, the experts developed an item phrased "I have worked very hard to help my students to be successful" (item 8).
In Step 1 of Phase 2 of this study, item-total correlations were first computed for the 19-item scale with item response data collected from 269 teachers. One item showed a very low item-total correlation (0.059). It was the item "I already had a career choice as a teacher in my mind when I was in college", which was related to the keyword alternative. Therefore, this item was dropped from the scale, and the remaining 18 items were used in the subsequent analyses of this study. Item-total correlations for the retained 18 items are summarized in Table 3. A list of the 18 translated items is presented later in Table 4 along with the results of the exploratory factor analysis. The actual 18 items in the Thai language are presented in the Appendix.
Next, differential item functioning (DIF) was examined for three demographic variables (type = public vs. private schools, major = education major in college vs. other majors, and experience = teaching same subject area as college major vs. not) by CFA with covariate modeling. The results revealed statistically significant DIF for one item for each DIF factor: Item 2 for experience (test statistic = 5.993), Item 7 for major (test statistic = 5.407), and Item 11 for type (test statistic = 3.495). The estimated mean item performance differences between groups were 0.129, 0.246, and 0.104 for Items 2, 7, and 11, respectively. These values are on the 5-point item scale and are not large enough to affect observed total scale scores by more than 1 point on average. Also, the group of experts examined these items and concluded that there was no obvious reason to explain the observed conditional mean item-score differences. Therefore, the experts decided to retain these items, and the original content coverage of the construct was maintained.
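The claim that the DIF effects cannot shift total scores by more than 1 point can be checked with a simple upper bound. The worked arithmetic below assumes the most pessimistic case, in which all three estimated mean differences act in the same direction for the same respondent. This is an illustrative bound under that assumption, not a quantity reported by the authors, since the three DIF factors are separate grouping variables.

```latex
\[
  \underbrace{0.129}_{\text{Item 2}}
  + \underbrace{0.246}_{\text{Item 7}}
  + \underbrace{0.104}_{\text{Item 11}}
  = 0.479 < 1
\]
```

Even under this worst-case accumulation, the expected shift in the 18-item total score stays below half a scale point.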
Next, exploratory factor analysis (EFA) models were tested. The three-factor solution was supported by the Kaiser criterion, as well as by theory. Although the four-factor solution was supported theoretically, it was not supported by the Kaiser criterion. Also, the three-factor solution would combine the 6 items from the keywords benefit and alternative, which could be interpreted as one common sub-construct, the economy aspect of professional commitment. In addition, the three-factor solution would evenly place 6 items per factor, which would be practically convenient. Therefore, among solutions with different numbers of factors, we determined that a three-factor solution best represented the 18 items in the scale. Varimax rotation was used, and the three factors accounted for 62.398% of the total variance: 23.434% by Factor 1, 19.494% by Factor 2, and 19.470% by Factor 3. The factor loadings ranged from 0.480 to 0.794. The results are shown in Table 4, and the highest factor loading for each item is bolded. We considered each factor to consist of the items with bolded factor loading values. Accordingly, each factor consisted of 6 items associated with 2 keywords. Factor 1 consisted of items 1 to 6 (keywords value and dignity), Factor 2 consisted of items 7 to 12 (keywords responsibility and willingness), and Factor 3 consisted of items 13 to 18 (keywords benefit and alternative). Also, our results displayed three components similar to the ones that Meyer and Herscovitch (2001) and Blau (2001) defined. Factor 1 is similar to their emotional/affective factor, Factor 2 is similar to their obligational/normative factor, and Factor 3 is similar to their economy/continuance factor. Also, we computed Cronbach's alpha (α) for each factor, and the values were sufficiently high (.901, .870, and .844 for Factors 1, 2, and 3, respectively).
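For readers who want to reproduce the reliability step, the following is a minimal Python sketch of Cronbach's alpha using the standard formula based on item variances and total-score variance. The study computed alpha in statistical software; the function name and the small response matrix here are hypothetical and serve only to show the calculation.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix (list of rows)."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Sample variance of the summed subscale score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical 5-point responses: 5 respondents (rows) by 3 items (columns).
toy_subscale = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]

alpha_toy = cronbach_alpha(toy_subscale)

# Perfectly parallel items give alpha = 1 as a sanity check.
alpha_parallel = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Applied to each 6-item subscale separately, the same function would yield one alpha per factor, as in the values reported above.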
Next, we conducted a series of CFAs with the Step 2 sample of 202 teachers. We fitted single-factor, two-factor, three-factor, and four-factor models. The single-factor model was supported by Blau (1988, 1989), Morrow (1993), Morrow and Wirth (1989), and Wallace (1993). The two-factor model combined the 6 items from the benefit and alternative keywords into one factor and the 12 items from the value, dignity, responsibility, and willingness keywords into the other. The two-factor model is supported by Bergman (2006) and Meyer et al. (2002). The three-factor model combined 6 items from the value and dignity keywords, 6 items from the responsibility and willingness keywords, and 6 items from the benefit and alternative keywords, as supported by our Step 1 EFA results, as well as by Irving et al. (1997), Meyer et al. (1993), Snape and Redman (2003), Bagraim (2003), and Xu and Bassham (2010). The four-factor model comprised 6 items from the value and dignity keywords, 6 items from the responsibility and willingness keywords, 3 items from the benefit keyword, and 3 items from the alternative keyword, as supported by Carson, Carson, and Bedeian (1995), as well as by Blau (2003, 2008).
Results are summarized in Table 5. Models with more factors fit the data better. The four-factor model was statistically better than the three-factor model by the chi-square difference test. However, their fit index values (CFI, RMSEA, and SRMR) were nearly identical. In addition, the correlation between Factors 3 and 4 (i.e., the benefit and alternative keywords) in the four-factor model was 0.831, indicating that the two factors were potentially measuring very similar or the same sub-construct. Thus, we concluded that the three-factor model would be the final model. Results for the three-factor model are summarized in Table 6.
[...] score for 6 items for each one of the 3 subscales. Second, a bi-factor model was set up such that each one of the six observed variables was predicted by two latent factors. One latent factor was defined based on which scale the observed variable came from (either this study or Meyer et al.). The other factor was defined based on which sub-scale the observed variable measures. We considered that PCEm and OCA were measuring a similar sub-scale. Similarly, we considered that PCO and OCN, as well as PCEc and OCC, were measuring a similar sub-scale.
The analyzed bi-factor model is depicted in Figure 1 along with the results. The factor loadings for the factors that represented the scales from this study and from Meyer et al. (1993) were reasonably high, indicating that the subscales represented the target constructs well for both scales. Also, the factors that represented subscales had reasonable magnitudes of correlations. However, the factor loadings for the three sub-scale factors were not consistent between the two scales. In addition, the correlation between the two scale factors was near zero (r = -0.035). These results potentially indicate that the two constructs measured by the two scales are not likely the same. This is probably due to the fact that this study focused on professional commitment, while Meyer et al. (1993) focused on organizational commitment.
Estimated item parameters based on the GRM are reported in Table 4. The slope parameter had values ranging from 1.420 to 2.827. The highest slope was for item 2, "I believe and respect in profession as a teacher", indicating that this item discriminated strongly between teachers with high commitment and teachers with low commitment. Furthermore, the test information function showed that the 18-item scale can measure the construct over the standardized theta range from -2.91 to 2.16 with a reasonable amount of information, 4.94 or higher. This information value is equivalent to a reliability of 0.80 or higher, indicating that a wide range of trait levels can be measured with reasonably high reliability by this 18-item scale.
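The link between test information and reliability can be made explicit with a short Python sketch. It assumes the common approximation reliability ≈ 1 − 1/I(θ), which holds when theta is on a standardized metric with unit variance; the paper does not state the exact conversion the authors applied, and the function names are illustrative.

```python
def reliability_from_information(information):
    """Approximate reliability at a trait level, assuming var(theta) = 1."""
    return 1.0 - 1.0 / information

def information_for_reliability(reliability):
    """Test information needed to reach a target reliability (same assumption)."""
    return 1.0 / (1.0 - reliability)

# Minimum test information reported over the theta range -2.91 to 2.16.
min_information = 4.94
min_reliability = reliability_from_information(min_information)

# Information that corresponds exactly to a reliability of 0.80.
target_information = information_for_reliability(0.80)

print(f"reliability at I = {min_information}: {min_reliability:.3f}")
print(f"information needed for 0.80: {target_information:.2f}")
```

Under this approximation, an information value of 4.94 corresponds to a reliability that rounds to 0.80, consistent with the statement above.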

Conclusion
In this study, we demonstrated item development through a combination of qualitative data analyses and scale refinement through quantitative data analyses to measure professional commitment of teachers in Thailand.
Overall, the 18-item scale developed in this study seems to function properly in measuring the target construct. We also believe the 18-item scale has potential utility and impact. For example, the scale can be used with prospective teachers, as well as beginning in-service teachers, to help them become aware of their own levels of professional commitment and ultimately be successful in their careers.
Both the literature and the results from this study indicated that professional commitment and organizational commitment are each well represented by three factors (e.g., affective commitment, normative commitment, and continuance commitment). However, our bi-factor CFA results indicated that the factor loadings for the three sub-scale factors were not consistent between the scale measuring organizational commitment and the one measuring professional commitment, and that the correlation between the two commitment factors was near zero. These results indicate that the construct of commitment may have two distinct interpretations, and that the two constructs (professional commitment and organizational commitment) are not likely the same. Nonetheless, we found that 6 items developed in this study (Items 4, 5, 6, 16, 17, and 18) were very similar to some of the items developed by Meyer et al. (1993) and Blau (2003). This is an indication that there may be one or more sub-constructs common to organizational commitment and professional commitment. We believe further study is needed to clarify this speculation. Also, Meyer et al.'s scale to measure organizational commitment was developed for use with any profession, whereas the scale in this study was constructed specifically for use with teachers. Since it is not clear whether this difference had a dominant effect on the results, we also urge a future study to clarify this matter.
One limitation of this study is that it was not based on a large sample. For example, Phase 1 of this study may have been biased due to the limited number of teachers from whom the qualitative data were collected. Also, the level of professional commitment is likely correlated with years of experience. Although it is not a known limitation at this point, the factor structure may differ depending on years of experience, which would be an interesting topic for future investigation. Finally, the scale was developed in the Thai context and in the Thai language. The scale should be validated in different languages and in different cultural contexts.

Table 1.
The number of teachers who mentioned each of the six keywords in their interviews

Table 2.
The number of keywords mentioned by teachers in their interviews

Table 4.
Standardized factor loadings for the 18 items for 3-factor EFA model

Table 5.
Comparisons for four confirmatory factor analysis models
Note. Δχ² = change in chi-square relative to the chi-square for the preceding model; CFI = Comparative Fit Index; RMSEA = Root Mean Square Error of Approximation; SRMR = Standardized Root Mean Square Residual; *** p < 0.001

Table 6.
Standardized factor loadings for three-factor CFA model

Meyer et al. focused on organizational commitment. Also, Meyer et al.'s scale was developed for use with any profession, whereas the scale in this study was constructed specifically for use with teachers. Lastly, the graded response model (GRM) was fitted to estimate item parameters with a combined sample of n = 471 from Steps 1 and 2 in Phase 2. First, we examined two models: the reduced GRM with a common slope for all items and the fully specified GRM with a unique slope for each item. The model estimation of the reduced GRM resulted in a -2LL value of 1949.974, whereas the fully specified GRM yielded a value of 2123.413. The difference between these two values (173.439) is distributed as chi-square with 17 degrees of freedom, with p < .0001. Therefore, we chose the model with unique slopes for the 18 items for this study.
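The nested-model comparison described above can be sketched in Python as a likelihood-ratio test. The sketch takes the two -2LL values exactly as reported and uses their absolute difference. The helper `chi2_sf` is a hypothetical standard-library implementation of the chi-square upper-tail probability for integer degrees of freedom, written here only to keep the example self-contained; it is not the routine used by the authors' software.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series.
    tail = math.erfc(math.sqrt(x / 2))
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        # Series term is x^(r - 1/2) / (1 * 3 * ... * (2r - 1)).
        term = math.sqrt(x) if r == 1 else term * x / (2 * r - 1)
        total += term
    normal_density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return tail + 2 * normal_density * total

# -2 log-likelihood values as reported in the text.
neg2ll_common_slope = 1949.974
neg2ll_unique_slopes = 2123.413

# 18 unique slopes versus 1 common slope gives 17 extra parameters.
df_difference = 18 - 1
lr_statistic = abs(neg2ll_unique_slopes - neg2ll_common_slope)
p_value = chi2_sf(lr_statistic, df_difference)

print(f"LR statistic = {lr_statistic:.3f}, df = {df_difference}, p = {p_value:.3g}")
```

With a statistic of this size on 17 degrees of freedom, the p-value is far below .0001, which agrees with the decision to retain item-specific slopes.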