Application of Grounded Theory Methodology Using CEFR in the Field of Language Testing



Introduction
Grounded theory (GT) is a qualitative and inductive research approach designed to explore, analyse and generate concepts about individual and collective actions and social processes (Arthur, 2012, p. 85). It is an approach to research and a set of procedures for developing theory through analysis of data (Goundar, 2023). Grounded theory research begins with a general field of study and allows the theory to emerge from the data (Halim & Rouyan, n.d.; Pace, 2012, p. 7). Earlier seminal longitudinal studies such as Caroll (2002), Haswell (2000), Herrington and Curtis (2000), McCarthy (1987), Rogers (2008), Sommers (2002), Sternglass (1997) and Watzke (2007) employed interviews, observations and assessments but did not use a grounded theory approach. Against this background, Rogers (2009) supports longitudinal studies in writing research, positing that their emphasis on change over time and across contexts has proven particularly appropriate for understanding writing development (p. 365). In brief, a longitudinal study using the lens of grounded theory was lacking in the context of Fiji's higher education sector (Goundar, 2020b; Goundar, 2023).
At present, language testing studies such as Haswell (2000), Hopf et al. (2019), McCarthy (1987), Rogers (2008), Sternglass (1997), and Valmori and De Costa (2016) have not produced a theory that could explain why there are differences in university students' writing levels. Flick (2018, p. 14) explains that grounded theory as a research approach is ideal for a field in which a problem exists for which an explanation is missing, and also for an area in which not much research and theorizing has been done before, so that there is space left for new insights and perspectives to be developed. Charmaz (2003) attests to this by adding that grounded theory provides a flexible and practical approach to interpreting complex social phenomena. Compared with other methodologies in qualitative research, grounded theory places its emphasis on theory development (Halim & Rouyan, n.d.; Strauss & Corbin, 1994, p. 274). The other reason for employing grounded theory is its systematic approach to data analysis. Myers (2009) states that other qualitative research methods often depend on the use of broad principles rather than a systematic approach, which hinders their application and interpretation. Therefore, employing grounded theory provided this study with a fresh perspective to create novel categories and concepts. Consequently, this study is a fresh contribution to the literature on longitudinal and grounded theory research, as it fills a notable gap in the context of language testing (Goundar, 2023). Indeed, it can be argued that the absence of grounded theories in the domain of language testing is a crucial gap, as it leaves teachers without a link between practice and theory (Halim & Rouyan, n.d., p. 4683), and could affect the effectiveness of the teaching and learning process.
Grounded theory research is also useful in educational research, as it is conducted in close conjunction with people and practice, which is the niche of this study. Further, Arthur (2012, p. 92) explains that grounded theory helps us to develop middle-range theories that have great potential to explain relevant behaviour in the educational setting, which has relevance for teachers and other professionals in that setting. I employed a grounded theory research design to evaluate undergraduate students' progress (or lack thereof) in academic written English over the course of their first-year university program. In addition, the study used the Common European Framework of Reference for Languages (CEFR) writing language guidelines (Goundar, 2023).
The CEFR is one of the most comprehensive frameworks for language evaluation around the world and it has been adopted by language testing organisations worldwide (Taylor & Jones, 2006, p. 1). Language tests such as Cambridge ESOL, International Legal English Certificate (ILEC), Asset Languages and IELTS (Taylor & Jones, 2006) are aligned to the CEFR framework. Syllabus designers or language test providers are inclined to align their exam design to CEFR due to its transparency and coherence (Taylor & Jones, 2006). The CEFR has also been applied to several European languages. Therefore, even though the CEFR may not be perfect, it is probably one of the most comprehensive frameworks for language evaluation currently around (Goundar, 2020b;Goundar, 2023). In addition, the global adoption of the CEFR framework in academic language testing assessments made it the ideal choice for this study (Goundar, 2023).
Database searches (Google Scholar, JSTOR, EBSCO) using the two phrases 'longitudinal studies' and 'grounded theory' did not return any relevant results for language testing and showed a notable scarcity in second language research (Goundar, 2023; Halim & Rouyan, n.d., p. 4682). This supports the use of grounded theory methodology in the study. This study offers insights that may be useful in other multilingual contexts where English is a second language, informing medium of instruction policies so that universities in those locations can train students to an appropriate level of written language proficiency for life beyond their time at the university (Goundar, 2023).

The Common European Framework of Reference for Languages (CEFR)
One of the most dominant frameworks in the field of language testing is the Common European Framework of Reference for Languages (CEFR). There is evidence that Cambridge examinations such as IELTS embody the CEFR as an important feature: they include the CEFR as part of their structure and represent it in a variety of ways (Taylor & Jones, 2006). In Europe, the CEFR is used to serve policy agendas of fostering linguistic diversity, transparency of qualifications, mobility of labour and lifelong language learning (Goundar, 2023; Taylor & Jones, 2006, p. 4). Beyond Europe, the CEFR has largely been adopted for describing language proficiency levels, with resulting implications for local pedagogy and assessment (Taylor & Jones, 2006, p. 4). This established it as the ideal choice for this study, as it provided the proficiency levels for rating students' language tests. The CEFR has also been recognized for contributing towards quality assurance matters, not just to improve systems and procedures but to support the growing professionalization of personnel and institutions involved in language learning, teaching and assessment (North, 2006; Taylor & Jones, 2006, p. 4). Taylor and Jones (2006, p. 4), in highlighting quality assurance matters, noted that the CEFR's Code of Practice offers the practitioner community a common frame of reference and a shared meta-language for reflecting on and evaluating policy and practice, ensuring the door is always open for improvement.
However, there have been queries about the CEFR's application. Within the language testing community there are reservations about the use of the CEFR as an instrument for harmonisation of policy/practice (Taylor & Jones, 2006, p. 4). Critics have questioned to what extent the CEFR provides a suitable instrument for operational test development (Fulcher, 2004; Goundar, 2023; Weir, 2004). For example, Nagai et al. (2020, p. 8) pointed out that the Malaysian Education Plan needs to aspire to a fully proficient English teaching force in order to implement the CEFR, and claimed that without instructors' proficiency in English the CEFR cannot be implemented. Most of the teachers in Malaysia are still not aware of the framework or show a lack of interest in learning and adopting it (Nagai et al., 2020, p. 8). One of the authors of the CEFR, Coste (2007), pointed to a trade-off between the greater convenience of generic level descriptors (a B2 level learner) and the greater precision but more limited generalizability of scales focused on specific activities (a learner who is judged to be B2 in goal-orientated co-operation but B1 in addressing audiences). Coste (2007) suggested that to understand the restrictions on the CEFR's scales, they might prefer to regard it as "a measuring instrument which can define proficiency level, calibrating them as precisely as the graduations on a medical thermometer" (p. 39). Green (2018, p. 61) explained that unless given clear guidance, "users may over-interpret a B2 test score as indicating ability in all aspects of the CEFR descriptive scheme rather than just the restricted subset of abilities actually addressed by the test". Another argument put forward by Green (2018) concerns test validation. Green (2018, p. 61) observed that instead of building a validity argument justifying the use of a test for a specific purpose (e.g., demonstrating that a potential student will be able to cope with the linguistic demands of university study), there is a risk that a testing agency need only show that a test is linked to the CEFR to persuade users that it is suitable for almost any purpose. According to Fulcher (2004), both critics and defenders of the CEFR have been concerned that score users may interpret results from tests that have been linked to the framework as interchangeable: that "a score of 'X' on a UK test is equivalent in meaning to a score of 'Y' on a US test, and 'Z' on a EU test" (p. 260). Kolen and Brennan (2014, p. 3) clarified that if scores on two or more tests are to be equated (i.e., if results are to be treated as interchangeable), the tests "must be as similar as possible in content and statistical characteristics". The multidimensional and contested nature of language abilities means that it cannot be assumed that any two tests measure the same construct (Green, 2018, p. 61). The authors of the CEFR have responded to these criticisms and limitations by stating that the initial intention of the framework was to provide a means of valuing and encouraging diversity (Goundar, 2020a, 2020b; Goundar, 2023).
Professor John Trim, one of the authors of the CEFR, mentioned in an interview that the aim in working on the framework was to have a common reference point that individuals working in different fields, and people using it for entirely different things and in very different ways, could refer to in order to feel that they were part of the universe (Saville, 2005, p. 281). Further, responding to criticism that the CEFR lacks the level of detail required to build test specifications, North (2014) stresses that the framework is not itself a content standard but a generative "apparatus to develop a differentiated standard appropriate to the context" (p. 62). More countries outside of Europe have adopted the CEFR framework. Nagai et al. (2020, p. 1) reported that countries such as Vietnam, Japan, Indonesia and Malaysia have embraced the CEFR. In Japan, Osaka University's Foreign Languages Department carried out a study to rationalise the curricula for more than 20 languages taught there (Fennelly, 2016; North, 2009, p. 361).
In Malaysia, the framework was officially introduced in 2013 and has been included in the Malaysian Education Blueprint 2013-2025 and the English Language Education Reforms 2015-2025, which indicates that the government has agreed not only to incorporate and align the framework into the present education system but also to accelerate its implementation (Nagai et al., 2020, p. 1). Apart from these countries, the CEFR has also appealed to the USA (Goundar, 2020b; Goundar, 2023). A modified version of the European Language Portfolio (ELP), called 'Linguafolio', has been set up in the United States (Goundar, 2023). The nature of Fiji's linguistic background is similar to that of most other countries where English is used as a second language (Goundar & Bogitini, 2019). In Fiji, speakers use English as a second language and their L1 is iTaukei, Hindi, Rotuman, Japanese, Chinese or Korean, among many others (Goundar, 2019). This provided scope for the use of the CEFR in the study, because the framework is adaptable to countries with different linguistic backgrounds and language needs (Goundar, 2023).

Relevant Literature
The impact of the CEFR extends to areas beyond language education, such as the use of tests in granting permanent residency and citizenship to immigrants (Green, 2018; McNamara, 2011; Van Avermaet, 2009). It is considered a useful tool in the development of language tests and has been accepted as the most significant recent event on the language education scene in Europe (Kantarcioglu & Papageorgiou, 2012, p. 82). The CEFR has been used as a conceptual framework in numerous language testing studies such as Byram and Parmenter (2012). Fulcher (2004, p. 262) argues that over the years language testing has become a significant component in many parts of Europe, such as the U.K., and that the political power of the CEFR has increased as it is used as a framework for language testing across governments and educational institutions.
In language testing research, the Common European Framework of Reference for Languages (Council of Europe, 2009) has been influential in defining proficiency levels (A1-Breakthrough, A2-Waystage, B1-Threshold, B2-Vantage, C1-Effective Operational Proficiency, and C2-Mastery) since its inception in 2001 (Goundar, 2023). The journal Language Testing provided significant evidence in its special issue (2005) of how influential the CEFR is in the field of language testing (Goundar, 2023).
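The six proficiency levels listed above form an ordered scale, which can be sketched as a small lookup structure. The following is only an illustrative sketch: the names `CEFRLevel`, `LABELS` and `band` are hypothetical and do not come from the study, and the grouping of levels into basic (A), independent (B) and proficient (C) user follows the standard CEFR global scale.

```python
from enum import IntEnum


class CEFRLevel(IntEnum):
    """CEFR levels as an ordered scale (hypothetical helper, not from the study)."""
    A1 = 1
    A2 = 2
    B1 = 3
    B2 = 4
    C1 = 5
    C2 = 6


# Level labels as listed in the text.
LABELS = {
    CEFRLevel.A1: "Breakthrough",
    CEFRLevel.A2: "Waystage",
    CEFRLevel.B1: "Threshold",
    CEFRLevel.B2: "Vantage",
    CEFRLevel.C1: "Effective Operational Proficiency",
    CEFRLevel.C2: "Mastery",
}

# Broad user bands of the CEFR global scale, keyed by the level's letter.
BANDS = {"A": "Basic user", "B": "Independent user", "C": "Proficient user"}


def band(level: CEFRLevel) -> str:
    """Return the broad user band for a CEFR level."""
    return BANDS[level.name[0]]


if __name__ == "__main__":
    for level in CEFRLevel:
        print(level.name, LABELS[level], band(level), sep=" | ")
```

Because the levels are ordered, a comparison such as `CEFRLevel.A2 < CEFRLevel.B2` can express whether a student has moved up the scale between two tests.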
Díez-Bedmar (2012) carried out a study in Spain to investigate the various proficiency levels within the same institutional groups and the nature of negative linguistic properties. This research used the CEFR proficiency levels and a computer-aided error analysis (Punch & Oancea, 2014) to examine the written essays of the English section of the University Entrance Examination. A total of 302 participants took part in the study and were required to write an essay on the topic 'Where outside Spain would you like to go on a short pleasure trip?' (Callies et al., 2014, p. 79). The findings revealed that the majority of secondary school leavers performed at the same B1 proficiency level on the CEFR. The study did not analyse errors at each proficiency level but selected only the A2 level, which showed that students made frequent errors with the use of modal auxiliary verbs. This provided a useful framework for the present study, in which the CEFR proficiency levels were used to classify the level at which the first year and final year undergraduate students stand in the longitudinal study.

Review of European Studies, Vol. 15, No. 1; 2023 (res.ccsenet.org)

The historical review comprised scrutinising the literature on grounded theory methodology and longitudinal studies. An essential advantage of grounded theory is that it is directly rooted in the problems and issues faced by a discipline (Somekh & Lewin, 2011, p. 113), in this case language testing. The method has been criticised by scholars who have stated that its notions of 'grounded', 'theory' and 'discovery' potentially give researchers a false sense of epistemic security while at the same time undermining the significance of interpretation, narrative and reflection (Punch & Oancea, 2014, p. 170; Thomas & James, 2006, p. 767). These scholars have argued that grounded theory's emphasis on order and procedural machinery was very much a product of its time: it served the qualitative research communities well in the past but could now be deemed as missing the best of qualitative inquiry (pp. 790-791). Despite these criticisms, the methodological process outlined in grounded theory methodology has been clearly demonstrated to be rigorous and insightful (Arthur, 2012; Punch & Oancea, 2014; Valmori & De Costa, 2016). According to Punch and Oancea (2014, p. 164), a researcher is [gradually] able to see the analytic story unfold. On this basis, they highlight that grounded theory is currently one of the most widely used and popular qualitative research methods.

Application of Grounded Theory Methodology
Arthur (2012, p. 86) points out that grounded theorists use data collection methods that best suit the research problem. Some of the frequently used methods in grounded theory research are interviews, field observations and different forms of written reports (Goundar, 2023). The coding of the data takes place in three stages: the first is open coding, followed by selective coding and finally theoretical coding (Goundar, 2023). In the open coding phase, the researcher stays close to the data and remains open in exploring the pattern that is going on with the set of data. Making comparisons and asking questions are the two main activities used as a guide to labelling in open coding (Punch & Oancea, 2014, p. 233). Subsequently, in selective coding, the most significant code from open coding is used as a guide for further data coding (Goundar, 2023). Finally, theoretical coding allows the researcher to use the data to generate or integrate a theory or a set of themes (Glaser, 1978).
The end product of grounded theory is not a set of findings or a few themes but a set of grounded concepts integrated around a central category/theme to form a theoretical framework that explains why and how persons, organisations, communities or countries experience and respond to challenges or problematic situations (Somekh & Lewin, 2011, p. 113). In other words, it provides a stepping stone upon which to build knowledge and frameworks to guide practice.

Stages and Data Collection
Using the CEFR as the metric, this study evaluated the academic English written proficiency levels of a cohort of first year undergraduate students at the beginning of the university program and at the end of the first year of study. The first test was in two parts. Writing Task One was to summarise information from a table of statistics provided. The second writing task required students to provide reasons for arguments on the given topic. In the second test, the first writing task was based on a company's gross profit graph and the second writing task was for the learners to argue their opinion on the topic with relevant examples. This provided insight into the differences as well as the achievements in writing proficiency of undergraduate students in their first year of the three-year university program. Thus, the analysis provided insights into whether the university students in question are on the appropriate path to completing their degree with adequate academic English language proficiency. In addition, results of the language test can be used to identify those students requiring extra support in written language skills.
In this one-year longitudinal study, three writing interventions were administered. After the first language test and evaluation using the CEFR, the cohort was given writing interventions to assist with their writing abilities. To gauge whether the writing interventions were successful, the second language test was administered at the end of their first year of university. The project was therefore divided into four stages, as indicated in Figure 1. In the first stage, students were given the language test and their performance was evaluated using the CEFR to ascertain language abilities and levels of English language proficiency skills. The writing interventions then became the second stage of the project. The aim of Stage 2 was to improve the writing skills of students through various writing tasks. In the third stage, the second language test and CEFR evaluation took place in order to assess whether the writing interventions were successful. Finally, Stage 4 was an overall evaluation of the first three stages in order to draw insights that became part of the conclusions and recommendations to inform policies on addressing medium of instruction and epistemic access at the undergraduate level of study.
The coding followed three phases. In the first phase, open coding, the researcher stays close to the data and remains open in exploring the pattern that is going on with the set of data. The second phase is selective coding, where the most significant codes from open coding are used as a guide for further data coding. Last is the theoretical coding phase, which allows the researcher to use the data to generate or integrate a theory or a set of themes.
Presently, Fiji does not have a standard regulation on the desirable CEFR proficiency level for university graduates to operate effectively in employment and society at large (Goundar, 2023). The advertisements for new recruitments in the media only state 'should be competent in the English language' (Goundar, 2023). The IELTS framework is only used for migration purposes and not in the education system (Narayan, 2023).

Use of Memos in the Study
In grounded theory methodology, memos play a critical role throughout the research, and this study also utilised them. Birks and Mills (2015, p. 41) recommend that memos should be used as soon as a study is conceptualized. The grounded theorist Charmaz (2014) recognises the writing of memos as a technique for initiating and maintaining productivity. There are generic or technical elements written in memos. Some of the things that can be included in memos are:
1. Your feelings and assumptions about your research
2. Your philosophical position in relation to your research
3. Potential issues, problems and concerns in relation to your study design
4. Reflections on the research process, including factors that influence quality in the study
5. Procedural and analytical decision making
6. Codes, categories and your developing theory
Adapted from Birks and Mills (2015, p. 42)
Theoretical memos are different from the memos that Birks and Mills (2015) discuss above, as they primarily concern the empirical data. The discussion of what is going on with the data forms the basis of theoretical memos, specifically during the coding process. Urquhart (2013, p. 71) suggests that "regardless of whether you are engaging in a theory building design or not-adopt this practice when coding your data". Memoing is a vital tool for theorising (Glaser, 1978; Urquhart, 2013, p. 110). During the data analysis process, theoretical memos are beneficial for finding the relationships between codes and categories. The guidelines on memoing outlined by Birks and Mills (2015, p. 42) were employed in understanding generic components of data for this study. The rules provided by Glaser (1978) were used for writing theoretical memos. Scholars (Birks & Mills, 2015; Charmaz, 2014; Glaser, 1978; Urquhart, 2013) claim that there is no standard format for writing memos, as memo writing is flexible and depends on the researcher. Thus, I have used a simple format that gives the title of what the memo is about, indicates whether it is a generic or a theoretical memo, and includes the date. Here is an example of a memo from the study:

Memo -Presentation of the Gender divide data 7 November 2022
In the third draft of my thesis, the supervisors pointed out to include gender divide data. I got so focused on looking at only the academic tests from linguistics perceptive that I missed out on including the gender divide data which I had kept in the master excel sheet. For continuation and flow, I presented the total number of students' statistics first at the beginning of the year or in Test One, then followed by the gender divide data at the beginning of the year. After this comes the total number of students' data at the end of the year or in Test Two followed by the gender divide data at the end of the year. This makes it easier for the readers to follow and allows the discussion to transit smoothly.
In the gender divide data tables, I put the categories that the students were grouped into according to gender, the total number of each gender and the percentage of that particular category. I believe it has been presented in such a way that the reader will be able to find exact information within seconds reach.
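
The simple memo format described above (a title, an indication of whether the memo is generic or theoretical, and a date) could be kept as a small structured record so that memos can later be sorted or filtered. This is a minimal sketch under stated assumptions: the `Memo` and `MemoKind` names are hypothetical, the study does not describe any software for memo keeping, and labelling the example memo as generic is an assumption made here because it records a procedural decision.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class MemoKind(Enum):
    """The two memo types distinguished in the text."""
    GENERIC = "generic"
    THEORETICAL = "theoretical"


@dataclass(frozen=True)
class Memo:
    """One memo: title, type, date and free-text body (hypothetical structure)."""
    title: str
    kind: MemoKind
    written_on: date
    body: str

    def header(self) -> str:
        """Build a one-line header from the title, memo type and ISO date."""
        return f"Memo - {self.title} ({self.kind.value}), {self.written_on.isoformat()}"


if __name__ == "__main__":
    memo = Memo(
        title="Presentation of the gender divide data",
        kind=MemoKind.GENERIC,  # assumption: the source does not label this memo
        written_on=date(2022, 11, 7),
        body="Supervisors asked for the gender divide data to be included.",
    )
    print(memo.header())
```

Keeping the memo type as an explicit field makes it straightforward to pull out only the theoretical memos when relating codes and categories during analysis.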

Findings and Discussions
As mentioned earlier, this paper looks at the study from a methodological standpoint; the actual results of the study can therefore be found in Goundar (2023), and I limit the discussion here to the use of grounded theory methodology (GTM) and the CEFR. At the beginning of the year, out of 120 students, 62 were at A1 and 49 were at A2, both of which are classified as basic user (Goundar, 2023). There was an almost equal divide among students who achieved the A1 level: 27% females and 25% males were recorded at this level. At the A2 level, more females (25%) than males (16%) attained this level. Females also outnumbered males at the B1 level, at 4% against 3%. However, after the one-year study, in which writing interventions were administered, there was notable progression. A total of 90 students moved up to B2 and 8 students moved up to C1. B2 is classified as independent user and C1 as proficient user. From the gender divide data, it can be concluded that the females progressed better by the end of the year when compared to males. Female students at B2 level were 41% while males were 34% at the same level. However, at C1 there was an equal number of males and females, at 3% each. This implies that students will likely be successful in their subsequent years of study at the university (Goundar, 2023).
The findings from the first Academic English language writing test, administered at the beginning of the year, and the second Academic English language writing test, administered at the end of the year, were categorised using GTM. The resulting sections were Academic English writing conventions, inconsistencies with grammatical rules, and vocabulary and sentence structure usage. Within these themes, other categories were formed to facilitate a better interpretation of the empirical data.
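The relationship between the head counts and the percentages reported above can be checked with simple arithmetic. The sketch below uses only the counts given in the text (120 students; 62 at A1 and 49 at A2 at the start of the year; 90 at B2 and 8 at C1 at the end); the function and variable names are hypothetical, and the levels of the remaining students are not assumed.

```python
COHORT_SIZE = 120  # total number of students reported in the text

# Head counts reported in the text for each test.
start_of_year = {"A1": 62, "A2": 49}
end_of_year = {"B2": 90, "C1": 8}


def share(count: int, total: int = COHORT_SIZE) -> float:
    """Percentage of the cohort, rounded to one decimal place."""
    return round(100 * count / total, 1)


def unreported(counts: dict, total: int = COHORT_SIZE) -> int:
    """Students not covered by the reported counts."""
    return total - sum(counts.values())


if __name__ == "__main__":
    for label, counts in (("Test One", start_of_year), ("Test Two", end_of_year)):
        for level, n in counts.items():
            print(f"{label}: {level} = {n} students ({share(n)}%)")
        print(f"{label}: other levels = {unreported(counts)} students")
```

Under these counts, A1 accounts for 51.7% and A2 for 40.8% of the cohort at the start of the year, which agrees with the combined gender percentages reported (27% + 25% and 25% + 16%), and B2 accounts for 75.0% at the end of the year (41% + 34%).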
With the Academic English writing conventions, there were discussions on writing style, omission from content and redundancy in writing. In examining the section on inconsistencies with grammatical rules, three themes were created which included use of tenses, unorthodox use of verbs, and lexical categories. Subsequently, with the vocabulary and sentence structure usage section, homophones in sentences and oversight of grammar rules were derived as themes. Based on the findings, it was noted that more writing tasks related to grammar rules need to be implemented in both secondary schools and academic English language courses in the university (Goundar, 2023).
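The sections and themes named above form a two-level coding frame, which can be written down as a plain mapping. This is only an illustrative sketch: the section and theme names are taken from the text, while the `coding_frame` structure and the `section_of` helper are hypothetical and are not part of the study's procedure.

```python
from typing import Optional

# Sections (top level) and the themes derived within each, as named in the text.
coding_frame = {
    "Academic English writing conventions": [
        "writing style",
        "omission from content",
        "redundancy in writing",
    ],
    "Inconsistencies with grammatical rules": [
        "use of tenses",
        "unorthodox use of verbs",
        "lexical categories",
    ],
    "Vocabulary and sentence structure usage": [
        "homophones in sentences",
        "oversight of grammar rules",
    ],
}


def section_of(theme: str) -> Optional[str]:
    """Return the section a theme belongs to, or None if it is not in the frame."""
    for section, themes in coding_frame.items():
        if theme in themes:
            return section
    return None


if __name__ == "__main__":
    for section, themes in coding_frame.items():
        print(f"{section}: {', '.join(themes)}")
```

Writing the frame out in this way makes it easy to see that two of the three sections concern grammar-related themes, which is in line with the recommendation for more writing tasks on grammar rules.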
The final section of the study looked at the interview data on the Academic English language tests, the feedback on the two tests and the writing interventions. The data from the 30 interviews were coded to illustrate the voice of participants. Four themes emerged from students' feedback on the Academic English language tests. These were: gained confidence from Academic English tests, tests were beneficial, improved on writing skills, and Academic English tests were challenging. This affirms that the use of tests was favourable for the students, as they found that the tests eventually improved their standard of Academic English writing skills (Goundar, 2023). In addition, three key themes emerged from students' feedback on the writing intervention tasks. These were: writing tasks made improvements, learnt new skills, and tasks created awareness. Higher education institutions can adapt these writing tasks in their first year Academic English courses, as students find them helpful and are able to build their academic English writing skills. These skills are crucial for their progression in the three-year university program. It is pivotal to understand the academic tests and writing interventions from the students' perspective. Doing so can inform new policies that can be implemented in higher education to promote educational equality and provide epistemic access (Goundar, 2023).

Conclusions
This study is significant in the field of language testing because of the methodological approaches it utilised. The use of a grounded theory approach in a longitudinal study is rare in language testing. This will augment the way language testing research is currently conducted. Additionally, the pedagogical benefit of putting in place a system of language evaluation after the first year will help universities design first year courses accordingly. Notably, this study attempted to establish the connection between formalisation of language proficiency testing and the higher education system in Fiji. Presently, language proficiency testing is voluntary and mainly for purposes of international migration. Thus, this paper discussed the application of the CEFR to the study of undergraduate students' writing skills in order to gauge its relevance and usefulness in a non-European multilingual context. The paper began by discussing the relevance of the CEFR; it also provided a brief critique of the CEFR, the scales and how the students rated on the scales. The use of grounded theory methodology (GTM) is rare in the field of language testing, as indicated by the literature survey. However, this paper illustrates how GTM can be used in a language testing study by setting out the necessary steps, the data collection tools, the data analysis, and the presentation structure.
Further, this paper will be useful to engage novice users of GTM to apply it to the field of linguistics, specifically to language testing research. The methodological contributions and the unique data set of the study will advance scholarly and social policy conversations on this topic. The study makes an original contribution to the body of knowledge on how grounded theory research methodologies can be applied to a longitudinal language testing research context. The paper will benefit novice researchers such as Masters and PhD scholars as well as policy makers in applying the methodology to their studies.