Psychometric Properties on Lecturers’ Beliefs on Teaching Function: Rasch Model Analysis

This paper presents a psychometric analysis of the Lecturers’ Beliefs on Teaching Function (LBTF) survey using Rasch Model analysis. The sample comprised 34 community college lecturers. The Rasch Model was applied to produce precise measurements of lecturers’ beliefs on teaching function that can support generalization of results and inferential studies. The items proved to measure a single dimension: lecturers’ beliefs and how they influence the teaching function. The developed instrument, termed LBTF, covers six sub-dimensions. Both construct and content validity were established through Rasch Model analysis using the dimensionality, item fit, and item polarity parameters. The reliability of the instrument was established through person and item separation analysis, Cronbach’s alpha, and person and item reliability estimates. The results of the Rasch Model analysis show that the LBTF items fit the model appropriately.


Introduction
Lecturers' beliefs are developed throughout their lives and are influenced by a variety of factors, including events, experiences, and other people in their lives (Knowles, 1992). While some beliefs are taken directly from the culture, others are culturally framed. These experiences shape their beliefs about students, curriculum development, and the overall process of schooling (McGillicuddy-De Lisi & Subramanian, 1996; Aikenhead, 2005). Shulman (1986) concluded that lecturers' beliefs come from four sources: accumulated content knowledge, educational materials and structures, the lecturer's formal education, and "practical wisdom", that is, practical experience. Furthermore, the development of thinking and reasoning processes is emphasized more than the acquisition of specific knowledge (OECD, 2009). Beliefs influence people's knowledge acquisition and interpretation, task selection and organization, and ways of understanding (Mansour, 2010). Lecturers construct their own meaning of any curriculum as they negotiate an orientation towards it and decide what, if anything, to implement in their classroom (Aikenhead, 2005). Thus, lecturer participation in the curriculum planning process is considered essential, whether in defining problems or in presenting concrete solutions in the form of programs of study. The failure of much curriculum innovation has been attributed to innovators' neglect of teachers' perceptions (Sutherland, 1981), and it seems that teachers' own interests and concerns are only rarely allowed to influence or direct the choices made by curriculum developers (Ben-Peretz, 1980).
Although many studies have investigated the effects of teacher beliefs on instructional practices (Ball, 1996; Beswick, 2005, 2007; Handal & Herrington, 2003; McLeod & McLeod, 2002), little research has examined precisely what happens when there is a conflict between these beliefs and current practices. For example, although several studies show that beliefs have a strong impact on action, research conducted in Turkey by Karaagaç and Threlfall (2004) shows that lecturers' goals, particularly when these are "imposed" on lecturers, can lead to classroom practices that conflict with their beliefs. Lecturers need significant help in identifying the difference between their espoused beliefs and their practices in order to think their way through to the culture inherent in a new initiative (Standen, 2002). Lecturers update their beliefs in accordance with new forms of instruction, but they do not necessarily change their current teaching methods (Quinn & Wilson, 1997). Mansour (2010) supported the idea that teachers are essential change agents for educational development and that teachers' beliefs are precursors to change. Consequently, it is essential to take lecturers' beliefs and practices into account, as well as the factors that form or influence those beliefs and practices (Mansour, 2010).
Developing a researcher's own instrument requires knowledge about item or question construction, scale development, format and length, and the validity and reliability of the instrument and its scores (Sekaran, 2003; Creswell, 2012; Johnson & Christensen, 2012). Measuring lecturers' beliefs requires an instrument to be developed and tested in order to clarify how beliefs shape lecturers' teaching functions. Thus, this study aims to use Rasch Model analysis as a powerful tool for evaluating the construct validity and reliability of the instrument so that it can be used to interpret lecturers' beliefs on teaching functions. The research questions in this paper are: (1) Do the items of the lecturers' beliefs on teaching functions scale have adequate fit statistics, showing that each item relates to the variable and the measurement tool in a meaningful way? (2) Does the lecturers' beliefs on teaching functions scale demonstrate high separation and good reliability for both persons and items?

Objective
The Rasch Model (RM) is a model in the sense that it represents the structure that data should exhibit in order to yield measurements; that is, it provides a criterion for successful measurement. Accordingly, measuring lecturers' beliefs requires an instrument to be developed and tested in order to clarify how beliefs shape lecturers' teaching functions. Thus, this study aims to use RM analysis as a powerful tool for evaluating the construct validity and reliability of the instrument so that it can be used to interpret lecturers' beliefs on teaching functions.

Beliefs and Teaching Functions
Learning is seen as the active construction of knowledge through the gradual expansion of networks of ideas, via interaction with other people and materials in the environment (Marshall, 1992). Constructivism places primary emphasis on individuals' independent interpretation of their own experience and on the ways they build their lives and ideas (Newbrough, 1995; Roth, 1994; Gil-Perez et al., 2002; Mansour, 2009). However, before lecturers can be expected to change their beliefs, they first need to be aware of them, since beliefs may be held unconsciously (Crandall, 2000). Many researchers have pointed out that lecturers must rethink their teaching role in order to facilitate communication appropriate to the nature of the various inter-relationships (between lecturer and student, between students, and between lecturers, students, and content) (Goodyear, Spector, Steeples, & Tickner, 2001; Coppola, Hiltz, & Rotter, 2002; Williams, 2003; Akdere & Marshall, 2005). There is a variety of interpretations of the terms "lecturer's functions" and "competencies" (Gonczi et al., 1993; Eraut, 1998; Salmon, 2000; Goodyear et al., 2001; Westera, 2001; Anderson et al., 2003). The concept of competency is therefore used in various ways. Two approaches to defining competency are clear: one treats competency as an observable set of personal skills or abilities related to effective behavior, and the other includes skills such as strategic behavior, related to the ability to adjust performance to the requirements of the context (Eraut, 1998). Coppola et al. (2002) describe the changes that lecturers perceive as required for teaching in virtual environments.

Basic Principles of Rasch Measurement Model
Developing valid measures of abstract constructs is essential to the advancement of psychological and educational research. The most common practice in scale development consists of administering a group of items intended to measure the same construct and then aggregating the responses to form a total scale value. The items are weighted equally in the summation and treated as if they all fall on an interval scale (Kindlon et al., 1996). Weighting items equally implies that all items are of identical importance in assessing the construct. In addition, treating items as linear (equal interval) assumes that the psychological distance between scale points (such as strongly agree, agree, neutral, disagree, and strongly disagree) is the same throughout the item (Kindlon et al., 1996). Moreover, alongside the reliability of scores and the number of underlying constructs, scale construction practice should empirically test whether the assumptions of equal item weighting and linear treatment of scales hold, so that scale values can represent the quantity of a trait possessed by an individual. The Rasch Model (RM) is a model in the sense that it represents the structure that data should exhibit in order to yield measurements; that is, it provides a criterion for successful measurement. The RM, one of a group of models originating from item response theory, was initially developed in connection with the construction of ability tests. The model expresses Guttman's basic ideas in a probabilistic manner, as follows: (a) given any item, a person of higher ability should have a higher probability of getting the item right than a person of lower ability, and (b) given any person, an item of lower difficulty should be solved (gotten right) with a higher probability than an item of higher difficulty. In the RM, the probability of a specified response (e.g. a right or wrong answer) is modeled as a function of person and item parameters (Bond & Fox, 2007). Specifically, in the original RM, the probability of a correct response is modeled as a logistic function of the difference between the person parameter and the item parameter. RM analysis is a powerful tool for evaluating construct validity. Rasch fit statistics indicate construct-irrelevant variance, and gaps on the Rasch item-person map indicate construct under-representation. Several aspects of RM measurement should be considered in order to understand how its results are interpreted.
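The logistic form described above can be illustrated with a minimal sketch. It assumes the standard dichotomous Rasch model, P(correct) = exp(θ − b) / (1 + exp(θ − b)), where θ is the person measure and b the item difficulty, both in logits; the function name and the example values are illustrative and do not come from the study.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct (or endorsed) response in the dichotomous Rasch model.

    theta: person measure in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Property (a): for a fixed item, the more able person has the higher probability.
assert rasch_probability(1.0, 0.0) > rasch_probability(-1.0, 0.0)
# Property (b): for a fixed person, the easier item has the higher probability.
assert rasch_probability(0.0, -1.0) > rasch_probability(0.0, 1.0)

# When the person measure equals the item difficulty, the probability is 0.5.
print(rasch_probability(0.0, 0.0))  # 0.5
```

Because only the difference θ − b enters the formula, persons and items are placed on the same logit scale, which is what allows them to be shown together on an item-person map.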

Item Polarity
Analysis of item polarity, or item consistency, indicates whether the items move in the same direction as the construct being measured. A positive value shows that an item works in parallel with the other items to measure the construct. If an item has a negative value, the data should be re-examined and the item improved or removed, because a negative value indicates that the item, or some respondents, respond in a direction that contradicts the construct (Linacre, 2003). Item polarity, or the point-measure correlation (PTMEA Corr.), serves as an early check of construct validity (Bond & Fox, 2007).
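As a rough illustration of the polarity check, the sketch below computes a point-measure correlation as the Pearson correlation between the ratings on one item and the person measures. The data are hypothetical, and the person measures are taken as given rather than estimated; this is not the Winsteps implementation.

```python
import math

def point_measure_correlation(item_scores, person_measures):
    """Pearson correlation between the scores on one item and the person measures."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(person_measures) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, person_measures))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in person_measures)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: logit measures of five persons and their ratings on two items.
measures = [-1.2, -0.4, 0.3, 1.1, 2.0]
aligned_item = [1, 2, 3, 4, 5]    # ratings rise with the person measure
reversed_item = [5, 4, 3, 2, 1]   # ratings fall with the measure (possible miscoding)

print(point_measure_correlation(aligned_item, measures) > 0)   # True
print(point_measure_correlation(reversed_item, measures) < 0)  # True
```

An item behaving like reversed_item would be flagged for re-examination, for example to check whether it was negatively worded and not reverse-scored.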

Dimensionality
Dimensionality is important for determining whether the instrument measures in one direction and along one dimension (Linacre, 2003; Bond & Fox, 2007). Unidimensionality is one of the conditions for analysis using the RM and helps to ensure the content validity and construct validity of the instrument (Wu & Adams, 2007). Dimensionality refers to focusing on one attribute or dimension at a time. The criterion for unidimensionality is that the raw variance explained by measures exceeds 40% (Linacre, 2003; Bond & Fox, 2007).

Rating Scale Analysis
One of the significant aspects of RM is determining whether the probabilities of participants' responses are spread evenly across the scale categories. RM can differentiate among the categories of an instrument's scale on the basis of the data gathered. Polytomous RM analysis is used to determine whether the scale corresponds to the model. The polytomous RM can also test the hypothesis that each scale category adds value in terms of agreement or disagreement as responses move from one point on the continuum to the next (Linacre, 2003; Bond & Fox, 2007). Not all scales can be used with RM. If adjacent structure calibrations advance by less than 1.40 logits or by more than 5 logits, the scale categories should be collapsed (Linacre, 2003; Bond & Fox, 2007).
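A minimal sketch of the polytomous case follows, assuming the Andrich rating scale model, in which the probability of category k is proportional to exp of the sum over the first k steps of (θ − b − τ_j). The threshold values are hypothetical, and the check reads the 1.40 and 5 logit limits as bounds on the advance between adjacent structure calibrations, which is an interpretation of the criterion rather than something the text states explicitly.

```python
import math

def category_probabilities(theta, b, thresholds):
    """Rating scale model: probabilities of categories 0..m for a person at
    theta on an item at b, given m step thresholds (tau), all in logits."""
    cumulative = [0.0]  # the lowest category has an empty sum
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - b - tau))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def threshold_advances(thresholds):
    """Distances between adjacent structure calibrations."""
    return [later - earlier for earlier, later in zip(thresholds, thresholds[1:])]

# Hypothetical thresholds for a five-category scale (four steps).
taus = [-2.4, -0.8, 0.7, 2.5]
probs = category_probabilities(theta=0.0, b=0.0, thresholds=taus)
gaps = threshold_advances(taus)

print([round(p, 3) for p in probs])       # one probability per category
print([1.4 <= g <= 5.0 for g in gaps])    # True where the advance is within limits
```

Categories whose thresholds advance too little are rarely the most probable response at any point on the continuum, which is the usual motivation for collapsing them.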

Item Separation
Item separation indicates how well the items are spread across levels of difficulty, so that participants can be distinguished on the construct being measured. The usefulness of an instrument is judged by its item separation index (Linacre, 2007). A higher separation value means a greater spread of items and persons along the continuum. Lower separation values indicate redundancy among the items and less variability of persons on the trait.

Item and Person Reliability
Item reliability refers to the consistency of item placement along the pathway if the same items were given to another sample of the same size that behaved in the same way. Person reliability refers to the consistency of person ordering that could be expected if this sample of persons were given a parallel set of items measuring the same construct (Wright & Masters, 1982). The criterion for accepting reliability in RM is a value exceeding 0.50 (Linacre, 2007; Bond & Fox, 2007).
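Separation and reliability are two expressions of the same information. The sketch below uses the standard relationship R = G² / (1 + G²), where G is the separation index and R the reliability; the function names are illustrative.

```python
import math

def reliability_from_separation(separation: float) -> float:
    """Reliability R implied by a separation index G: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)

def separation_from_reliability(reliability: float) -> float:
    """Separation index G implied by a reliability R: G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))

# A separation of 2 corresponds to a reliability of 0.8.
print(reliability_from_separation(2.0))  # 0.8
```

Under this relationship, the person separation of 4.68 reported in the Results corresponds to a reliability of about 0.96, which agrees with the reported person reliability.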

Infit and Misfit
Infit refers to the degree of fit of an item or a person. The infit mean square is a transformation of the residuals, that is, the differences between the predicted and observed responses, for ease of interpretation. Its expected value is 1. As a rule of thumb, values between 0.70 and 1.30 are generally regarded as acceptable. Values greater than 1.30 are termed misfitting and those less than 0.70 overfitting (Bond & Fox, 2007). Linacre (2005) suggests an alternative range of 0.5 < x < 1.5.
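To make the mean square statistics concrete, the sketch below computes infit and outfit for a single item. It assumes dichotomous (0/1) responses and known person measures, which is a simplification of the five-category survey analysed in this paper; the data are hypothetical. Infit is the sum of squared residuals divided by the sum of the model variances, and outfit is the mean of the squared standardized residuals.

```python
import math

def rasch_probability(theta, b):
    """Dichotomous Rasch probability of a response of 1."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals, variances = [], []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    infit = sum(squared_residuals) / sum(variances)   # information-weighted
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    return infit, outfit

def classify(mnsq, low=0.7, high=1.3):
    """Label a mean square using the 0.70 to 1.30 rule of thumb."""
    if mnsq > high:
        return "misfit"
    if mnsq < low:
        return "overfit"
    return "acceptable"

# Hypothetical person measures and two response patterns to an item with b = 0.
thetas = [-2.0, -1.0, 1.0, 2.0]
too_predictable = [0, 0, 1, 1]   # perfectly ordered by ability
contradictory = [1, 1, 0, 0]     # the opposite of what the model expects

print(classify(item_fit(too_predictable, thetas, 0.0)[0]))  # overfit
print(classify(item_fit(contradictory, thetas, 0.0)[0]))    # misfit
```

The two patterns show both directions of departure from the expected value of 1: responses that are more deterministic than the model predicts, and responses that contradict it.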

Method
This pilot study aims to test the validity and reliability of the instrument used to measure lecturers' beliefs on teaching function. A survey design using a questionnaire was adopted. The sample of this pilot study was 34 respondents who were lecturers at community colleges (CC). The questionnaire includes 68 questions divided into 9 constructs, which include classroom management (3 questions), curriculum knowledge (4 questions), lesson plan and presentation (5 questions), teaching strategies (15 questions), communication (4 questions), assessing students' learning (11 questions), prior knowledge (2 questions), and professionalism (20 questions). The RM analysis investigated the validity and reliability of the LBTF instrument. The questionnaire was developed on the basis of the standard criteria of RM analysis, namely item dimensionality, item polarity, and item fit. Scale calibration was undertaken during the pilot test to assess the suitability of the LBTF rating scale. A progressive scale of five categories was used: 1 = never, 2 = rarely, 3 = frequently, 4 = very frequently, and 5 = always. Some researchers agree that the optimal length of a scale should be determined by the nature of what is being examined and the extent to which respondents can discriminate among levels (Light et al., 1990).

Results
Quantitative data from the developed LBTF instrument were analyzed using Winsteps version 3.68.2 in order to test the validity and reliability of the questionnaire items. A summary of the RM analysis results for the developed LBTF questionnaire follows.

Dimensionality Analysis
Dimensionality is important for determining whether the instrument measures in one direction and along one dimension (Linacre, 2003; Bond & Fox, 2007). In Rasch analysis, satisfactory dimensionality is determined by the raw variance explained by measures, which should be more than 40%, and the unexplained variance in the 1st contrast, which should be ≤ 15%. Table 1 shows that the raw variance explained by measures was 53.0% and the unexplained variance in the 1st contrast was 6.1%. Thus, the dimensionality results indicate that the LBTF data fit the RM, as illustrated in Table 1.

Reliability Analysis
The pilot test was conducted with 76 items for the LBTF instrument and 36 items for the LTPM instrument among 34 lecturers at CC in Yemen. The criterion for accepting reliability in RM is a value exceeding 0.50 (Linacre, 2007; Bond & Fox, 2007). In addition, acceptable separation should be more than 2 (Fisher, 2007). Reliability reports on the consistency of a respondent's answers to the items in the scale (Pedhazur & Schmelkin, 1991). RM analysis measures reliability with person separation reliability. This statistic shows the ability of the items to separate persons with different levels of the concept measured. The Rasch reliability of the items was compared with Cronbach's alpha (CA). CA is a measure of internal consistency and estimates the reliability of the scale from the variances and covariances of all possible pairs of items. Tables 2 and 3 show the person and item reliability and the person and item separation for the LBTF instrument. As shown in Table 2, the person reliability was very high at 0.96, and the person separation was 4.68. Table 3 shows that the item reliability was 0.93 and the item separation was 3.59, both of which are acceptable. Therefore, the person and item reliability and the person and item separation indicate satisfactory reliability for the LBTF instrument. The analysis showed that the reliability for 34 respondents with 67 items in these constructs was high for measuring the LBTF at community colleges in Yemen. Thus, the item and person reliability values for the LBTF instrument are fairly close together, and both present a strong, acceptable level.
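For comparison with the Rasch reliability estimates, Cronbach's alpha can be sketched as follows, using the usual formula α = k / (k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The ratings below are hypothetical and are not the study data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item ratings per respondent)."""
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)

# Hypothetical ratings: four respondents, three items on a 1-5 scale.
ratings = [
    [1, 2, 1],
    [2, 3, 3],
    [4, 4, 3],
    [5, 5, 5],
]

print(round(cronbach_alpha(ratings), 3))  # 0.969
```

Alpha is computed from raw scores, whereas the Rasch person reliability is computed from logit measures and their standard errors, so the two values are expected to be similar but not identical.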

Item Polarity and Item Fit Analysis
Item polarity analysis is an essential step in assessing the validity of the constructs. Item polarity, or the point-measure correlation (PTMEA Corr.), serves as an early check of construct validity (Bond & Fox, 2007). This analysis has the same function as factor analysis: it assesses the relationship between the items in measuring the required constructs. Acceptable item correlation values should be ≥ .20. As shown in Table 4, there are no negative values and the PTMEA of every item is > .20. Thus, the item correlations indicate that there are no data entry mistakes or miscoded items and that the fit statistics are within acceptable limits.
In the analysis of item fit and misfit for the LBTF, RM analyzed each construct separately. The item measure table lists the logit measurement information for each item. The item fit table also reports the mean square (MNSQ), which makes it easier to detect outliers or misfit. Table 4 lists the LBTF items for the constructs of Classroom Management (CM), Curriculum Knowledge (CK), Lesson Plan Presentation (LPP), Teaching Strategies (TS), Communication and Relation with Students (CRS), Assessing Students' Learning (ASL), Enhancing Professional Performance (EPP), and Integration (I). For the items of these constructs, the expected infit MNSQ value should be 0.4 < x < 1.5, and the PTMEA value should be +0.2 < x < 1 (Bond & Fox, 2007). Table 4 shows that all items fit. Therefore, the data are deemed acceptable for this study.

Calibration Scaling Analysis
RM analysis can also help to determine the validity of the rating scale by calibrating its zero setting and the subsequent scale gradations. Rasch analysis determines whether the response probabilities are spread fairly across the scale categories. Table 6 presents the rating scale calibration analysis for the LBTF survey.
Table 5 and Figure 1 summarize the category structure of the scale gradation and the size of the structure intersections. The observed count column shows the respondents' answers for each scale category. As shown in Table 5, the most frequent answer was category 5, with 983 responses (39%). The next most frequent was category 4, with 870 responses (35%). Category 3 received 430 responses (17%). The least frequent were category 2, with 118 responses (5%), and category 1, with 107 responses (4%).
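The reported percentages can be reproduced directly from the observed counts. The short sketch below uses only the counts given in the text; the function name is illustrative.

```python
# Observed counts per rating scale category, as reported in the text.
observed_counts = {1: 107, 2: 118, 3: 430, 4: 870, 5: 983}

def category_percentages(counts):
    """Share of all observed responses falling in each category, in whole percent."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}

print(sum(observed_counts.values()))          # 2508
print(category_percentages(observed_counts))  # {1: 4, 2: 5, 3: 17, 4: 35, 5: 39}
```

The counts sum to 2508 observed responses, so each count refers to item responses across the sample and not to individual respondents.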
The observed averages show the pattern of responses. A fairly normal pattern, in which the observed averages advance systematically from negative to positive, is expected. As illustrated in Table 7, the response pattern started at -1.65 logits and rose monotonically to +2.62 logits, signifying that the pattern of respondents' answers is fairly normal.

Conclusion
This paper uses RM analysis to evaluate the usefulness of the items used in the LBTF as a scale for measuring lecturers' beliefs on teaching functions. The items were identified according to theory and evaluated according to the Rasch measurement model using Winsteps software. As a psychometric study, it tested the validity and reliability of the LBTF in order to develop the instrument. Based on the results of the Rasch analysis, item reliability (0.87 > 0.50), item separation (2.58 > 2.0), dimensionality, and RM fit (infit < 1.5) indicate that the LBTF has good psychometric properties.

Table 1. Item dimensionality of LBTF

Table 2. Person separation and reliability

Table 4. Item fit analysis

Table 5. Calibration scaling analysis

Figure 1. Summary of the category structure on a scale gradation