A Theoretical Review on the Need to Use Standardized Oral Assessment Rubrics for ESL Learners in Saudi Arabia

There is a growing need for standardized oral assessment rubrics in learning institutions, linked to the growing number of ESL learners in Saudi Arabia and other parts of the world. To assess this need, this study explores peer-reviewed articles that support the use of standardized rubrics for assessing oral skills among ESL learners in Saudi Arabia. Standardized rubrics give students a reference point regarding what is expected of them while learning oral skills; as a result, students are able to work towards improving their skills to meet the standards of the rubric. Various scholars have given different definitions of the term rubric, but grading criteria are a common feature of all of them. Some experts have stated that modern rubrics should go beyond grading to guiding students in understanding what is expected of them in oral tests. When developing standardized rubrics, teachers should ensure that the rubrics meet the required levels of validity and reliability to assist ESL learners in meeting their goals. The literature shows that there is a gap in the current oral assessment rubrics in Saudi Arabia that requires a prompt review. Developing a standardized rubric should therefore take a multidisciplinary approach: scholars and experts who teach ESL students must be consulted to ensure that all important factors are considered and incorporated in the standardized rubric.


Introduction
Over the past few decades, the focus of foreign language teaching has shifted from enhancing learners' linguistic competence to developing their communicative competence. Consequently, assessing oral skills has received more attention in second language acquisition (SLA) research. In the English as a foreign language (EFL) context, learners are expected to speak fluently and communicate effectively. However, speaking is considered the most difficult skill to evaluate in comparison with other skills. One reason for this challenge is that speaking involves cognitive processes that are not equally evaluated in other skills. Additionally, learning to speak requires more effort than skills such as listening, reading, and writing. For most ESL learners, including those in Saudi Arabia, the most stressful exams are those that test speaking skills. Notably, traditional assessment tools lack the ability to evaluate certain aspects of speaking skills (Oosterhof, 2001), and their unclear assessment procedures make learners anxious. Unlike other skill-based exams, which clearly state the evaluation methods, the nature of oral exams requires teachers to carefully plan the execution stage of the exam in order to accurately assess a learner's skills. Both teachers and learners believe that assessing oral exams is a challenge that adversely affects the intended learning outcomes. Therefore, assessing oral skills requires special rubrics or criteria to ensure the validity and reliability of oral tests for ESL learners, and this study can help stakeholders make policies that will enhance the implementation of such rubrics. Additionally, the study forms the basis for further research on the development of standardized oral assessment rubrics that will improve the quality of education in Saudi Arabia.

Critical Analysis
Assessing speaking skills is challenging for many reasons (Navidinia et al., 2019). First, the nature of speaking requires an examiner to listen carefully to the speaker, yet internal and external factors may hinder the evaluation process. Second, speaking is a complex skill, which often requires assessors to clearly define the assessment benchmark with examples; hence, the criteria for distinguishing a good speaker from a poor one may differ greatly. Additionally, assessing speaking skills is a difficult task since it involves evaluating individual learners separately (Luoma, 2004). Due to these challenges, universities in Saudi Arabia and other countries around the globe have not given speaking assessments enough attention (Egan, 1999). By developing standardized oral rubrics, more learning institutions will focus on improving their learners' speaking skills.
There are numerous benefits to presenting well-designed rubrics. First, they stimulate feedback from students when applied in the classroom as a formative assessment method. Additionally, they can be used in a summative way to help students compare their self-assessment with a grader's judgment. Oral assessments aim at measuring language proficiency, identifying learners' strengths and weaknesses, evaluating the course objectives, and placing learners in a teaching program. Therefore, in assessing the ability to communicate, a test must be designed in a way that shows learners' ability to demonstrate their interactional competence in using the target language. Traditionally, a test should have three qualities: practicality, validity, and reliability (Nodoushan, 2009a). More recently, assessing learners' performance on an assigned task has been preferred (Nodoushan, 2008a). To assess students' speaking abilities, teachers should test their interactive and communicative ability by encouraging them to participate in authentic speaking activities. In such tasks, learners' level of mastery of speech aspects such as pronunciation, intonation, and pragmatic characteristics is measured.

Table 1. Benefits of a well-designed rubric
1. Acts as a summative tool that helps students compare their self-assessment with a grader's judgment
2. Stimulates feedback from students
3. Identifies strengths and weaknesses of learners
4. Helps in evaluating the course objectives
5. Assists in placing learners in a teaching program

In Saudi Arabia, oral assessments in the EFL classroom take different forms, including interviews, pair and group tasks, and traditional scoring rubrics. Interviews are used to test learners' speaking skills through a face-to-face interaction between the learner and the interviewer. However, such interviews are criticized for not representing real-life situations. Similarly, pair and group tasks are used to assess learners' speaking skills (Egyud & Glover, 2001; Nodoushan, 2002). They are advantageous since they save time by assessing more than one learner at a time. According to Ussama's (2013) study, a traditional interview scoring rubric based on O'Loughlin's (2001) guidelines is also used to assess learners' language speaking proficiency.
Teachers may not have the required knowledge to assess students' speaking ability. Therefore, there is a need to develop an authentic oral assessment rubric that measures students' oral performance; this is essential to ensure the validity of the assessment process. For a speaking assessment tool to be valid, it must have the following characteristics: first, it should reflect communicative competence; second, it should contain a valid representation of the measured constructs in the scoring rubric; and third, it must have a reliable scoring procedure. Teachers must take into consideration the purposes and contexts of assessment when they develop their scoring rubrics. An example is the ETS TOEFL scoring rubric: the ETS TOEFL (2004) rubric emphasizes the relationship between pronunciation and fluency, as well as the grammar and vocabulary that are linked to a learner's communicative competence.
There is an urgent need to design a standard speaking assessment rubric that helps in identifying the strengths and weaknesses of EFL learners in Saudi Arabia (Nodoushan, 2008a). In assessing the speaking skills of Saudi undergraduates, teachers commonly develop or use traditional rubrics, and most often feel frustrated and lost while assessing students' speaking performance in oral tests. In other words, at Saudi universities, ESL courses do not have standardized rubrics developed by experts, and where any exist, they lack an accurate scoring system. Even though these rubrics provide teachers with the needed scoring criteria, they may not give reliable feedback to improve learners' performance.
In Saudi Arabia, most studies address assessment rubrics for skills other than speaking. A study by Alahmadi et al. (2019) investigated the impact of a formative speaking assessment on Saudi undergraduates' performance in a summative test. Aldukhayel (2017) explored the clarity and familiarity of three scoring rubrics that were used to assess students' writing achievement in the preparatory year program (PYP) at a Saudi university. Another study by Alshakhi (2019) aimed at exploring the weaknesses of the Saudi writing assessment methods employed at an English Language Institute; the results showed that replacing the existing holistic rubric with an analytic one can enhance contextual-based learning and eliminate cross-grading, improving the student-teacher relationship. In the same vein, Al-Serhani's (2007) study investigated the effect of the portfolio assessment method on students' writing performance in EFL classrooms. It is clear from these findings that most of the research conducted in the Saudi EFL context relates to writing assessment; hence, more research is required to develop speaking assessment rubrics.
Example of a holistic rubric for writing assessment

In higher education contexts around the world, methods of assessing oral communication have received special attention (Panadero & Jonsson, 2013). Specifically, most learning institutions have narrowed their focus to techniques that can achieve the best outcomes in oral assessments (Campbell et al., 2001; Schwartz & Arena, 2013; Stoynoff, 2013). In Saudi Arabia, English language education is highly test-oriented; therefore, learners use memorization strategies to cope with the requirements of the tests. Since more emphasis is placed on reading and test-oriented writing skills, most ESL learners in the Kingdom of Saudi Arabia (KSA) are reluctant to speak during tests. As a result, such students face challenges in exams that assess oral skills, due to their lack of familiarity with oral assessment methods. In my personal experience as a non-native speaker of English and an EFL teacher, I have not only experienced difficulties in assessing students' speaking skills but have also noted how current assessment procedures adversely affect students' oral performance.
In the process of learning and teaching, assessments aim at collecting data about a process in order to make decisions about improving learners' outcomes and the goals of a program. As the literature suggests, the focus of learner assessment has been on traditional written exams, while oral assessment methods have received little attention due to the lack of a standardized oral grading rubric. Notably, rubrics are designed to identify the characteristics of communicative competence and the essential features of performance, such as strategic competence and pragmatic competence. When evaluating students' oral performance, language teachers tend to have the same evaluation criteria, but they may assess the same student differently. Therefore, standardized rubrics can be utilized to maintain consistency among teachers, which will enhance effective assessment processes and results (Dunbar et al., 2006).
Scholars have identified various features of a standardized rubric that can be employed for precise and consistent learning outcomes. For example, Saeed et al. (2019) consider a rubric "a grading tool which consists of a set of explicit criteria that are used to assess a specific task performance" (p. 1060). Additionally, rubrics provide "descriptive statements of behaviors that candidates may exhibit in a particular sub-component" (Angelelli, 2009, p. 39, as cited in Samir & Tabatabaee-Yazdi, 2020, p. 102). According to Sadler (2009a), rubrics consist of a detailed grading system with either numbers or quality levels. While some rubrics contain standard quality words, such as good or below average, others may explain the quality in detail. Examiners use rubrics to reduce grading time, decrease subjectivity, and give learners appropriate feedback to overcome deficiencies in their performance (Stevens & Levi, 2005).
Allen and Tanner (2006) define a rubric as "a type of matrix that provides scaled levels of achievement or understanding, for a set of criteria or dimensions of quality for a given type of performance..." (p. 197). In other words, a grading rubric consists of scores, performance levels, criteria, and descriptors. It also shows what is expected of learners in an assignment or a test by stating the criteria and deciding on the levels of quality, ranging from excellent to poor. Teachers can use assessment rubrics to score learners' performance or work. Due to their effectiveness, rubrics are crucial instruments in assessing EFL learners' oral skills. To effectively assess learners' communicative competence, teachers need to understand the relationship between the content of speech courses and their objectives. Designing and developing rubrics is mostly done by publishers and researchers (Timmerman et al., 2010). In addition, rubric banks are available from online sources (Dornisch & McLoughlin, 2006), making it easier for teachers to develop rubrics to assess their students (Andrade & Du, 2005; Boud & Soler, 2015).
The term 'assessment rubric' has long been used in books and research papers. According to Dawson (2017, p. 348), a rubric must have certain elements: "evaluative criteria, quality definitions for the criteria at particular level and a scoring strategy". Various studies have attempted to evaluate rubrics; however, they may not present functional definitions of how such rubrics are used or what they intend to achieve.
Rubrics have different characteristics. First, rubrics can be specific or generic (Tierney & Simon, 2004). Second, there is a stark distinction between rubrics created by teachers and those created in collaboration with students (Reddy & Andrade, 2010). Third, a rubric can be used by students in self- and peer-assessment, as well as in questioning a task's requirements (Andrade & Du, 2005; Panadero & Romero, 2014). In terms of the scoring system, some rubrics use holistic strategies while others use analytic scoring strategies (Sadler, 2009a); since multiple scores may be produced, procedures must be included to resolve disputes. There are two methods for scoring a test: the holistic (impressionistic) method, which gives an overall score to a learner's performance (Luoma, 2004), and the analytic method, which gives one score for each criterion, such as vocabulary, communicative ability, fluency, and pronunciation. Analytic rating scores increase the reliability of the assessment process and the assessment tool (Strikaew et al., 2015); therefore, they are preferred in highly sensitive assessments. Using holistic rubrics to evaluate students may be disadvantageous, since the performance of some students may not suit a certain category. Therefore, combining both holistic and analytic rubrics is preferable for assessing students in an EFL context. Although analytic rubrics are "complicated and time consuming" (Taufiqulloh, 2009, p. 187), they are very effective in assessing communication skills. According to Han (2017), holistic assessment methods are largely valid in the context of research, placement, or certification. In contrast, analytic scales are less biased and assess specific details when employed for the formative assessment of learners. In some rubrics, software can be employed to make judgments about learners' performance (Dimopoulos et al., 2013).
Nevertheless, many rubrics lack clear evaluative criteria related to "organization, mechanics, word choice and supporting details".
Quality descriptors or definitions are used in rubrics to represent an evaluative criterion at a certain level (Sadler, 2009b). These quality definitions require examiners or assessors to give judgments in line with the scoring strategy. The levels of quality in a rubric may be based on learning-outcome taxonomies or grade levels (e.g., Timmerman et al., 2010). According to Biggs and Tang (2007), it is acceptable to combine both: for example, their rubric includes A-D grade descriptors and four levels of the SOLO taxonomy. However, some researchers recommend the use of a single-point rubric, since it demonstrates only the standard performance expected of learners (Fluckiger, 2010). Firoozi (2019) states that choosing valid and reliable assessment methods is essential to achieving valid results. Several processes can be used to ensure the validity and reliability of rubrics. For instance, Timmerman et al. (2010) used a quality process in which they performed statistical tests to guarantee the reliability of a grading rubric. From the tests, they concluded that the validity of rubrics can be achieved by consulting experts in a particular field and comparing criteria to those used in similar tasks. According to a study by Dunbar et al. (2006), a rubric for a competent speaker should assess the ability to identify a suitable time for speaking, to speak in a clear and expressive way, to present organized and understandable ideas, to listen attentively, to use an effective medium for communication, to structure an appropriate message, to recognize other people's level of receptivity to the spoken message, and to support given information with examples. In another study, Saeed et al. (2019) developed an oral rubric based on Canale and Swain's communicative approach to L2 teaching and testing, which consists of four competencies: grammatical, strategic, discourse, and sociolinguistic.
The content of the rubric was validated by experts in the field, and the four competencies were employed to develop effective assessment criteria. For ease of understanding, rubrics can be presented using a table, grid, or matrix (Sadler, 2009a); however, they can also take other forms of presentation, such as images or other non-text representations.

Recommendations
Although the scope of this paper is limited to literature on using rubrics in assessing students' oral performance, rubrics can be designed and used to achieve other goals in other contexts or domains, such as in peer review of language teaching (Magno, 2012) and research evaluation (Wong et al., 2013). Thus, conducting more studies on the design, implementation and effectiveness of speaking rubrics can improve educational outcomes.

Conclusion
The review of the literature has shown that assessment rubrics play a pivotal role in improving learners' performance, whereas inaccurate speech assessment rubrics may lead to poor evaluation of learners' skills.
Since most speaking/oral tests in the Saudi context lack a standardized, validated, and reliable scoring rubric with which teachers can assess EFL learners objectively, there is a need to develop standardized rubrics for evaluating undergraduate students in Saudi universities and other higher learning institutions. In addition, empirical research shows that there is a need for more applied studies that address the issue of assessment criteria and performance bands. According to Dawson (2017), "Practitioners designing a rubric could also fruitfully reflect on the design elements as a way of revealing assumptions about rubrics" (p. 358). Therefore, designing a standardized scoring rubric is essential to provide learners with feedback tied to the goals of the course they are taking. Consequently, students' outcomes will improve and more learners will be encouraged to participate in oral assessment, thereby improving their speaking skills.