Assessment Practices of Preparatory Year English Program (PYEP): Investigating Student Advancement through Third and Fourth Levels

This small-scale mixed-methods study investigates how female students in the Preparatory Year English Program (PYEP) at a Saudi tertiary-level institution are assessed and how they advance from level three (Pre-intermediate) and level four (Intermediate). A four-point agreement scale survey was conducted with fifteen English as a Foreign Language (EFL) teachers in the PYEP to critically investigate the issue from their own perspective. Furthermore, semi-structured interviews were conducted with eight EFL students studying in the third and fourth levels of the PYEP. The analysis of the data indicated that teachers were lenient in grading students. They tended to adjust grading practices to the benefit of the students, so that students were allowed to pass and progress to the next level. Additionally, the interviewed students argued that teachers could help them by granting them up to five grace marks to pass their exams. The study also showed that teachers possessed sufficient power in assigning grades following assessment and used this power to advance their students to the following level. The study concludes with suggestions and recommendations for further research.


Introduction
Nowadays, teaching in many parts of the world is going through a great transformation as teachers' expectations of bringing their students to high standards of performance and ensuring their learning are continually increasing (Hargreaves, 2000). This is more apparent in the context of Saudi Arabia when it comes to EFL (e.g., Elyas, 2008; Mahboob & Elyas, 2014; Elyas & Picard, 2010, 2012, 2013; Elyas & Basalamah, 2012; Basalamah & Elyas, 2013a, 2014b). One of the most important aspects of the teaching process is arguably assessment, as 250 studies found that the use of assessment to promote learning in the classroom improved student achievement (Earl & Katz, 2006). Therefore, evaluating students' progress is considered a major part of a teacher's job (Brumfit & Johnson, 1979).
In my study, I want to show how female students in the Preparatory Year English Program (PYEP) at a tertiary-level institute in Saudi Arabia are assessed in proficiency levels three and four. In particular, I will show that students advance from levels three and four not because of their results in formative and summative exams, but rather for other reasons. Additionally, I want to critically investigate the way students advance from proficiency levels three and four.

Definition of the Preparatory Year English Program (PYEP) in Saudi Arabia
The preparatory year in Saudi Arabia is designed to improve students' proficiency in English before they undertake any undergraduate studies. It is also designed to develop and improve students' knowledge of mathematical and analytic techniques through the medium of the English language. Once the students are accepted in the PYP, they must take the Online Oxford Placement Test to be placed in the Preparatory Year English Program (PYEP). This program aims to further advance the English proficiency of Saudi students moving into the university system. Students receive eighteen hours of English language instruction per week for a full academic year. The duration of the PYEP is one year, divided into two main semesters. Each semester (eighteen weeks) is further divided into two modular semesters, making a total of four quarter semesters or modules (nine weeks each). Each module consists of four levels of English, as follows. Level one is the beginner and below-beginner level; it generally aims to provide students with a foundation from which they can advance from A1 Breakthrough to A2 Waystage on the Common European Framework of Reference (CEFR). Level two is the elementary level; it aims to build and further develop language proficiency at A2 Waystage, moving towards a higher level of proficiency. Level three is the pre-intermediate level; it aims to build and further improve language proficiency at A2 Waystage, moving into the B1 Threshold level. Finally, level four is the intermediate level; it aims to build and further improve language proficiency at B1 Threshold.

Theoretical Framework and Literature Review
Troudi, Coombe, and Al-Hamly (2009) stress that "Assessment continues to play a major role in learning and teaching and is extensively and intensively addressed in research studies and theoretical articles both in mainstream education and TESOL/TEFL literature" (p.546). Philosophers and educationalists have been considering tests as powerful devices in society for some time (Shohamy, 1998). For Madaus (991), tests represent a social technology deeply embedded in education, government, and business; as such, they provide the mechanism for enforcing power and control. In Shohamy's (1998) view, tests are most powerful as they are often the single indicators for determining the future of individuals; as criteria for acceptance and rejection, they dominate other educational devices such as curriculum, textbook, and teaching. Noam (1996) views tests as tools for imposing implicit ideas about success, knowledge, and ability. He notes that "How we assess can support careers and make people successful, but it can also destroy people's careers, place unfair burden on individuals' self-perception and unnecessary hurdles in the path of their achievement" (p.9).
In the Arab world, very few studies have been carried out in the area of testing and assessment. Alabdelwahab (2002) conducted an exploratory qualitative case study in Saudi Arabia examining the introduction of the self-assessment portfolio as a method of assessment in English as a Foreign Language (EFL) classes; its purpose was to examine EFL students', teachers', and school administrators' reactions to the use of a non-indigenous assessment methodology (i.e., the self-assessment portfolio). Alsheraiqi (2010) studied the washback effect of the CEPA English test on teaching in an educational zone in the United Arab Emirates (UAE), aiming to discover the dimensions of that washback effect if such an effect existed. However, the only study in the Gulf region that has looked into teachers' assessment practices, focusing on the beliefs and knowledge that affect teachers' decision-making in classroom-based assessment, was conducted by Troudi et al. (2009) on EFL teachers' views of English language assessment in higher education in the UAE and Kuwait; it was a qualitative study of the assessment roles and philosophies of a group of teachers of English as a foreign language.
In this research, I argue that students in levels three and four are granted grace marks and thus advance from these levels not necessarily because of their actual achievements on exams but because of the leniency of many of the teachers who teach them. That practice contradicts the main purpose of assessment: measuring the development and accomplishments of students and giving an accurate, authentic measure of their achievement.

Research Questions
This study hinges on one research question and two sub-questions: 1) How do students progress from levels three and four? a) Does the rating of teachers affect the evaluation of students? b) Do students have strategies for passing exams and assessments?

Methodology
The study adopted a small-scale mixed-methods approach, employing semi-structured interviews with students and a four-point agreement scale survey of teachers. The semi-structured interview contained ten main questions, some with one sub-question, constructed as follows: questions 1-3 related to the levels of the students; questions 4 & 5 related to how students approach exams; and questions 6-10 related to students' assumptions on grading. The survey was constructed as follows: the first four statements were concerned with the grading scales at the institute and their measurement; statements 5 & 6 were concerned with the way students dealt with their exams; and the last set of statements was concerned with the teachers' grading policy and reasons. The study was conducted with eight Saudi female EFL students studying in levels three and four in the Preparatory Year English Program (PYEP) and with fifteen EFL teachers of different nationalities teaching in the same program. The study lasted six weeks.

Ethics
All participants in the study were informed about the research procedure, purpose, and time. The students were notified of the time, aim, and process of the interviews; they were also told that the interviews would be audio-recorded, since recording the voices of female participants is a sensitive issue in Saudi culture. Students volunteered to participate and signed consent forms guaranteeing their right to withdraw from the research at any point they wished. Additionally, students were chosen randomly, and they were assigned pseudonyms to protect their privacy and confidentiality.
The teachers were also chosen randomly, and they also volunteered to complete the survey of their own accord and with their consent. In addition, the survey was anonymous to protect the teachers' privacy and confidentiality as well as to eliminate any harm or threat that might arise during or after the process.
As for the approval of the study, the researcher obtained approval from the head of the institute to conduct it.

Participants
The participants in this research were eight female students and fifteen teachers, all in the PYEP. The students were 18-20 years of age and homogeneous with respect to nationality, mother tongue, and both cultural and educational backgrounds. They had all studied in public schools and completed six years of English as a foreign language. Also, all students had taken the Oxford Online Placement Test prior to their admission to college. The students chosen to participate in this study were those in proficiency levels three and four. The sample for the study was drawn by random sampling, as Ng (2012) believes that individual researchers can get reliable results using various non-probability sampling strategies like random sampling. The teachers in this study were heterogeneous with respect to age, nationality, mother tongue, and both cultural and educational backgrounds. Their teaching experience ranged from three to twenty years.

Interview Questions
In this study, the researcher chose interviews with the students as a data collection tool to gain deeper insight into how students progress from proficiency levels three and four. Furthermore, as expressed by Canh (2012), interviewing is increasingly used in present-day qualitative research as an important part of triangulated data collection along with observations and questionnaires. The semi-structured interview contained ten main questions, some with one sub-question, constructed as follows: questions 1-3 related to the levels of the students; questions 4 & 5 related to how students approach exams; and questions 6-10 related to students' assumptions on grading. The questions set out to explore the students' levels in the PYEP, the number of times they had repeated that level, the way they approached their exams, and whether they depended on the leniency of teachers' grading to pass their exams. All ten questions were constructed by the researcher to address the research question and sub-questions, to highlight the issues of the research, and to achieve its objectives. They were a mix of closed- and open-ended items to allow for explanation and elaboration (see appendix A). The researcher talked to her participants about her research and its purpose. The questions were then administered to the students, and the interviews were audio-recorded for more accurate analysis. Because the researcher was the participants' English teacher, she did not conduct the interviews herself but asked two experienced, well-trusted colleagues to carry them out. She believed that this would reduce the chance of bias and minimize any stress or embarrassment that might lead participants to change their responses or withhold authentic ones. The questions were written and asked in Arabic, the native language of the students, to ensure clarity and precision; the researcher wanted to ensure that her participants did not encounter any misunderstanding or vagueness in answering the questions, since accuracy was crucial to the research results. Finally, the interviews were transcribed verbatim and then translated into English (see appendix B). The results were analyzed manually.

Survey
The survey was based on a four-point agreement scale and consisted of nine statements and an open-ended question. The researcher chose the questionnaire as the data collection tool for teachers. In her view, it was the most suitable tool because the topic under research was a sensitive one, concerning teachers' grading as well as leniency and power. Such a matter was almost impossible to discuss with teachers openly or through interviews, and the subject matter required anonymity and discretion. Brown (2001) defines questionnaires as "Any written instruments that present respondents with a series of questions or statements to which they are to react either by writing out their answers or selecting from among existing answers" (p.6), and he points out that questionnaires can elicit individuals' reactions to, perceptions of, and opinions of an issue or issues. The questionnaire in this research was divided into two major parts: statements 1 & 2 were concerned with the way students dealt with their exams, and the last set of statements was concerned with the teachers' grading policy and reasons. The questionnaire was intended to elicit teachers' perceptions of and attitudes towards both the grading policy of the institute in which they work and their own grading policy. It contained personal information items (gender and years of experience), nine closed-ended statements on a four-point agreement scale relating to teachers' perceptions in grading the students, and one open-ended question for further elaboration on the reasons for grading. As Ng (2012) states, "Most questionnaires contain both closed-and open-ended items" (p.30). The researcher believed that this combination of items would be informative and enriching for the study (see appendix C). The researcher also planned for the questionnaire as a whole to take no more than five minutes to complete, so that teachers would be encouraged to attempt it and she could obtain as many responses as possible. Finally, in order to get a high return rate, the questionnaire was administered one-to-one: the researcher delivered the questionnaires to the respondents and collected them at the same time and place.

Pilot Study
Both the interview questions and the questionnaire were tested in an initial pilot study. The interview questions were piloted on two students from levels three and four to check the questions' clarity and precision as well as to eliminate any chance of ambiguity or misunderstanding. Piloting the questions also served to generate feedback on leading questions, on their ordering and difficulty, and on the time needed for interviewing. In addition to piloting the interview questions with students, the researcher piloted her questionnaire with one teacher to ensure that the statements were not ambiguous and to check the feasibility of the procedure. A few modifications were required in both data collection tools, such as using more specific expressions in the interview questions and changing "You think that" to "I think that" in the opening statements of the questionnaire to make it more authentic. When the results proved satisfactory with respect to clarity, the researcher went ahead and conducted her questionnaire and interview questions with the participants.

Data Analysis
Data collected from the survey were analysed manually: the researcher counted the answers for each of the four scale points and organised them in a frequency table (see Table 1). For the interviews, the researcher followed the steps of qualitative data analysis described by Creswell (2009): she transcribed the audio-recorded interviews verbatim, read through the data to get a general sense of the overall meaning, generated codes and themes from segments of the data, and finally interpreted the meaning of the themes.
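The manual tally described above can also be reproduced programmatically. The following is a minimal sketch only, not the procedure used in this study: the scale labels, the function name `frequency_table`, and the example answers are all assumptions introduced for illustration and are not the study's data.

```python
from collections import Counter

# Labels for the four scale points; assumed here for illustration only.
SCALE = ["Strongly agree", "Agree", "Disagree", "Strongly disagree"]


def frequency_table(responses):
    """Count how often each scale point was chosen for each statement.

    `responses` maps a statement label to the list of answers it received.
    Returns a dict mapping each statement to counts ordered as in SCALE.
    """
    table = {}
    for statement, answers in responses.items():
        counts = Counter(answers)
        table[statement] = [counts.get(point, 0) for point in SCALE]
    return table


# Hypothetical answers from four respondents (not the study's data).
example = {
    "Statement 1": ["Agree", "Agree", "Disagree", "Strongly agree"],
    "Statement 2": ["Disagree", "Strongly disagree", "Agree", "Agree"],
}

table = frequency_table(example)
for statement, row in table.items():
    print(statement, dict(zip(SCALE, row)))
```

Each row of the resulting table corresponds to one questionnaire statement and each column to one scale point, which is the general layout of a frequency table.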

Results
The results of this research were divided into two parts: one concerned the teachers' questionnaire and the other concerned the students' interviews.
Results of the Teachers' Questionnaire

The Way Students Dealt with Their Exams
The first set of statements related to whether students depended on teachers' grading leniency or on chance in passing the summative exams. Eight teachers thought that students did not take their exams seriously and that students depended highly on the teachers' leniency in grading. As for the students' dependence on luck in Multiple Choice (MC) questions, 80 percent of teachers believed that students answered MC questions randomly and depended on luck in passing exams.

Teachers' Grading Policy and Reasons
The final set contained three closed-ended statements related to teachers' evaluation policy and one open-ended question concerning the reasons for which teachers might push a student's mark to the passing level. In response to the first statement, nine teachers stated that they would award marks to students who did not complete their coursework whereas six teachers said they would not. In response to the second statement, ten teachers stated that they would grade students on the effort they put into the task rather than on the actual production of the task while five teachers disagreed. In response to the third statement, eleven teachers agreed that they would accept informal excuses for students' absences in summative as well as Multiple Choice Questions (MCQ) exams and would allow students to take the make-up exams. Lastly, the answers to the open-ended question were varied yet interesting. Four teachers gave the rationale that the student had displayed an interactional attitude throughout the academic module. Three teachers cited a clear progression in the student's performance. One teacher cited the student being sick during the exam while another cited the student having personal problems in her social life and deserving compassion. Three teachers cited the practice of upgrading students who were very close to the 'pass' mark. Finally, one teacher insisted she would not upgrade a student's grade under any circumstances.

Level of the Students
This set contained three questions. In response to the first question, three students said they were studying in level 3 while five students were in level 4. When asked the second question, four students admitted repeating the level they were currently studying in whereas the other four were new to the proficiency level. In response to question three, two students admitted repeating level 4, four students had repeated level 3, and one student had repeated level 2 while the last student was not a repeater at all.

Approach of Exams
This set contained two questions. In response to the fourth question, seven students replied that they answered the MCQs selectively: they chose the answer they thought was appropriate, but when they did not know the answer or did not understand the question, they answered randomly. One student stated that she answered all the questions randomly. In response to the fifth question, six students said that they depended in their studies on their own efforts and not on the help of their teachers while two students said they depended on both their own efforts and the help of their teachers in achieving the pass mark.

Assumption of Students
This set contained five questions. Question six concerned the students' assumptions about the leniency of the grading system at the institute. Three students stated that it was possible for the institute to upgrade a student's 'F' grade to the passing mark while five students said it was not possible. In response to the seventh question, three students said that it was possible for their teachers to help them and upgrade their mark to the passing grade whereas five students said that it was not possible. Question eight asked students to speculate about the lowest failing mark they could get on an exam and still pass; four students speculated it was the 55% grade, two said it was the 58% grade, and the other two said it was the 59% grade. In response to the ninth question, one student admitted that she had learned her final grade informally, and it was an 'F'; she then discovered that she had been upgraded with three grace marks and had passed the course. The other seven students denied that such a practice had occurred with them. The last question concerned the students' intuition: whether they had ever been certain they had failed an exam and then discovered that they had actually passed. Five students said that they had come out of exams certain that they had failed and were surprised that they had passed while three students said they had not experienced such a situation.

Discussion
Teachers' Questionnaire (Teachers' Perceptions)
In this part, one main theme emerged from the analysis: teachers possessed sufficient power in assessing their students. That was significant in this study, and it was displayed in many ways.
Fourteen of the fifteen teachers in this study had, according to their beliefs, good reasons to upgrade students' grades, and they also had the capability to do so; this is significant evidence of teacher power. It accords with what Rea-Dickins (2004) stated: "Teaching involves assessment; in making decisions… teachers have to determine the strengths and weaknesses of the alternatives available to them. They make selections based on their experience, their understandings of learning, language development, and language proficiency itself together with what they consider to be most appropriate and in the best interests of those they teach" (p.249). However, there is no evidence in the literature about teachers possessing this kind of power in assessing their students. I came across one study, conducted by Edelenbos and Kubanek-German (2004) in the Netherlands and Germany, which reflected the growing interest in teacher-based assessment and in understanding the means by which teachers assess the English language development of their students (Breen, Barrett-Pugh, Derewianka, House, Hudson, Lumley, & Rohl, 1997). As for the Arab world, to the best of the researcher's knowledge, no studies other than this one have tackled teachers possessing power in assessment. On the contrary, the few existing studies show teachers being deprived of power. In a study by Troudi et al. (2009) on EFL teachers' views of English language assessment in higher education in the UAE and Kuwait, the authors concluded that teachers were not involved in assessment-related decision-making processes; they had very little voice in this important element of the curriculum and were marginalized within a top-down managerial approach to assessment. Another study, by Farah (2007), examined the effects of a high-stakes international test on students' access to a field of study of their choice; it was the only research with a critical agenda aimed at questioning certain assessment practices in the Gulf region. One of its main conclusions was that teachers did not have voice and choice in assessing their students; in fact, the assessment policy was imposed on the teachers.
In my opinion, according to this study, teachers displayed power in assessing their students in a number of ways. Firstly, teachers believed that students deserved extra grades for extra work. That relates to what Rea-Dickins (2004) states: "Teacher assessment relates to the agent of the assessment while the formative/summative distinction refers to the purpose of the assessment" (p.252). Secondly, teachers would give their students grace marks when students showed effort in their learning and displayed interaction in the classroom, in order to encourage students to work harder and build their self-confidence. That relates to Messick (1999), who argues that in order to increase both access and success, assessment should foster student development, improve teaching practices, and recognize learning that happens in informal learning environments such as home and work. Thus, teachers followed up on students and monitored their progression and development.
Additionally, teachers believed that grading students on interactional attitudes and effort in class would make them more motivated and encourage them to learn. Therefore, students progressed to the following level not because of the formative and summative exams but rather because they were interactive and put more effort into learning. Thirdly, teachers believed that students deserved to retake exams they had missed, and they accepted students' informal excuses; an example of an informal excuse would be a sentence or two written on a plain piece of paper by the student's parent stating the reasons for which his or her daughter was absent from the main exam. Finally, teachers sympathized with students and granted them grace marks when it came to the students' personal problems at home, such as dealing with a vicious step-mother or a rigid father or brother; teachers also granted students extra marks if a student showed up sick for the exam.
To sum up, teachers definitely exercised power in assessing their students by giving them extra marks to advance them from proficiency levels three and four.

Students' Interviews (Students' Perceptions)
In this part, one main theme emerged from the analysis: students depended on their teachers to move to the following level. This relates to Rea-Dickins's (2001) observation that teachers find themselves at the confluence of different assessment cultures and are faced with significant dilemmas in their assessment practices, sometimes torn between their role as facilitators and monitors of language development and that of assessors and judges of language performance as achievement.
According to this study, 80 percent of the students repeated the advanced levels. The students gave the same reason for repeating them: difficult exam questions. Ghadeer (the pseudonym of a student participant) said: "I repeated level 3 three times, and I do not want to repeat it again; I am tired of repeating it." However, when teachers awarded students grace marks, students moved to the next level. Students had managed to reach the advanced levels because of extra marks given by the teachers in the previous levels; as a result, they struggled with, and repeated, the advanced levels, which required more work, effort, and time. This had two outcomes. First, students depended more on the grace marks from their teachers than on studying and doing their coursework. Second, students became more reluctant, idle, and indifferent in the upper levels, where hard work and care were most needed. Students acted carelessly in their studies just as they did in their exams; Haneen said: "I choose selectively when I know the answer but I choose randomly when I don't." Finally, students' dependence on the grace marks given by their teachers was displayed in their views and assumptions. Fifty percent of the students believed that they could be upgraded by up to five marks to reach the passing grade. Hala said: "I could still pass if I get 55 out of sixty; teachers could help me with the remaining 5 marks and make me pass to the next level." Having almost half of the students believe that the teachers were lenient in grading would make the students take their exams less seriously and depend on teachers to upgrade their final grades. Sara said: "I knew through my teacher before the official grades were out that I hadn't passed, but then I was surprised that I had actually passed." The interviews showed that many of the students were certain that their teachers could help them by giving them extra grades so that they could pass to the following level. Many students also believed that they could still pass if they did not reach the passing grade, and that they could be granted up to five grace marks and move to the next level. One conclusion was certain from the findings: students believed that their teachers could help them to advance to the next level; in fact, students depended on that belief and acted upon it.

Reflection and Conclusion
In this research, the researcher critically investigated and problematized the way PYEP female EFL students were evaluated and how they progressed from proficiency levels three and four. The four-point agreement scale survey conducted with fifteen EFL teachers showed that 93 percent of the teachers awarded students grace marks for various reasons to help them reach the passing grade. In addition, 73 percent of teachers accepted informal excuses from students and allowed them to take the make-up exams. Interestingly, 80 percent of teachers assumed that students answered their MCQ exams randomly and passed the exams by chance. While the participating teachers had their own reasons for upgrading students' grades to the passing grade, and the reasons were all related to the students, teachers had the power and capability to assess their students the way they saw fit and to grant them grace marks to advance them to the next level.
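The percentages above follow from simple counts out of the fifteen surveyed teachers. As a rough check, the arithmetic can be sketched as below; the counts of 14 and 11 are those reported in this paper, while the count of 12 is an assumption inferred from the reported 80 percent, and the names `percent` and `counts` are hypothetical.

```python
TOTAL_TEACHERS = 15  # number of teachers surveyed, as reported in the study


def percent(count, total=TOTAL_TEACHERS):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


counts = {
    "had reasons to upgrade grades": 14,         # reported: 14 of 15 teachers
    "accepted informal excuses": 11,             # reported: eleven teachers
    "believed MCQs were answered randomly": 12,  # assumption: inferred from 80 percent
}

percentages = {label: percent(n) for label, n in counts.items()}
for label, value in percentages.items():
    print(f"{label}: {value} percent")
```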
The results of the semi-structured interviews, which were conducted with eight students in the advanced proficiency levels three and four, were very similar. All the students stated that the MCQs were difficult, indirect, and different from the material they studied. Also, all students believed that they could be upgraded to the passing mark, and they actually set minimum grades from which they could be upgraded; these grades ranged from 55 to 59 out of 60. This assumption on the part of the students carried some truth because some students had actually gone through similar situations in which they were upgraded and advanced to the following level after learning informally that they had failed. Although students claimed the exams were difficult, they depended on their teachers to help them pass by granting them extra grades and advancing them to the next level. The fact that teachers could upgrade students' grades indicated that teachers exercised sufficient power in assessing their students. This study shows that many teachers have certain statutory powers, and that teachers can control a student's grade to a great extent and make her pass and progress to the next level.

Further Research
The researcher intends to critically investigate the institute's grading system, the nature of the exams and assessments conducted, and the students' perceptions of the exams they have been taking. This will shed more light on the current research and build on the point at which this study began.

o Displaying a sense of an interactional attitude throughout the academic module. (4)
o Noticing a clear progression in her performance. (3)
o Being sick during the exam. (2)
o Having personal problems in her social life. (2)
o Following the general policy of upgrading students who are very close to the 'pass' mark. (3)
o Never upgrade. (1)

Table 1. Results of the four-point agreement scale teachers' questionnaire