Benchmarking Year Five Students’ Reading Abilities

Reading and understanding a written text is one of the most important skills in English learning. This study attempts to benchmark the reading abilities of Year Five students in fifteen rural schools in a district in Malaysia. The objectives of this study are to develop a standardised written reading comprehension test and a set of indicators to inform ESL teachers about the exact ability of the students. A sample of 788 primary school students from rural areas was involved in this study. The instrument utilised was a standardised written reading comprehension test developed in line with the Malaysian English Language Syllabus (2003), the revised Barrett’s Taxonomy of Reading Comprehension (Day & Park, 2005) and the revised Bloom’s Taxonomy (Anderson et al., 2001). The test consists of 50 multiple-choice questions at elementary, intermediate and advanced levels. The findings show that many Malay respondents were categorised as ‘below expectations’ and that female students performed better than male students. Finally, the researcher offers several recommendations.


Introduction
According to the current Malaysian English language assessment in the Primary School Evaluation Test, students are assessed on various language skills including vocabulary, grammar, reading comprehension and writing. Students' performance is reported generally by using composite grades. In reading comprehension, students are only required to answer ten multiple-choice questions based on a non-linear text and a linear text (Malaysian Education Syndicate, 2008). Efforts have been made by the Malaysian Ministry of Education to improve the assessment system (Faizah, 2011) by combining centralised and school-based assessment. However, students are still assessed in general terms and the system does not specifically assess students' reading abilities. Even though descriptors are provided, they only report students' general achievement; they do not state the strengths and weaknesses of each student or suggest what students need to improve. The terms used in describing students' ability lack precision. For example, "Apply knowledge obtained through listening, speaking, reading and writing in various situations using good manners" (taken from Document of Standard Performance, Ministry of Education Malaysia, 2013).
Based on the Malaysia Education Blueprint (2013-2025), among the 11 shifts identified by the Malaysian Ministry of Education, the first shift is to benchmark the learning of languages, Mathematics, and Science. As stated in the latest education blueprint, every student should have received a strong grounding in literacy and numeracy, which serve as the fundamental skills for all further learning. Therefore, a suitable reading assessment should be carefully designed and administered in primary schools.

Problem Statement
According to the Malaysian Examination Syndicate (2008), reading comprehension is part of the assessment. The Malaysian English language public examination assesses students' reading, writing, grammar and literature in a single paper. There is no specific assessment to gauge the students' reading ability. As remarked by Abdul Rashid Mohamed et al. (2010), the current assessment system is disadvantageous because the grades or test scores obtained constitute the only source of information that teachers have concerning the reading abilities of their learners. The grades obtained by the students are merely composite grades: each grade is a summary of all skills tested in a single paper. English teachers are only provided with students' grades from public examinations, and such results do not depict students' reading abilities any further. Kubiszyn and Borich (2003) pointed out that grades often result in the loss of information as well as misinterpretation of the students' actual achievement; thus, grades are meaningless to students and parents as they are unable to provide a detailed description of learners' strengths and weaknesses. The main shortcoming of grades is that they provide ambiguous and superficial descriptions of reading capabilities, so teachers are not able to identify what the learners can and cannot do in reading.
School-based assessment was introduced in 2010 by the Malaysian Ministry of Education; Hwa and Lim (2008) noted that the aim of school-based assessment is to improve teaching, learning and assessment. Students' achievement is assessed and graded based on the criteria and standards specified in the subject syllabus. Unfortunately, only an overall result of students' English performance is reported, and such a result is still unable to pinpoint the strengths and weaknesses of the students in reading, even though assessment plays an essential role in the teaching and learning process (Brown, 2004). Furthermore, teachers need all available information on students' performance to aid their work (Carey, 2001). In other words, teachers should know how a student comprehends what he or she reads so that they can address the problems instructionally if a student is having certain difficulties (Popham, 1999).
Since there is no standardised assessment to gauge the specific reading ability of Year Five students, this study was conducted to develop a standardised written reading comprehension test and a set of indicators to benchmark Year Five students' reading ability in rural schools.

Reading Comprehension
The ultimate reason for reading is to comprehend the information in the text or the meaning which the author intends to convey. A child must be able to understand the smaller word units first before being able to comprehend larger units of text such as paragraphs or stories. As cited by Morales (2010), Wallace (1992) stated that reading is a tool for survival, a medium for social interaction and a means to access general knowledge of the world.

Reading Assessment
The Progress in International Reading Literacy Study (PIRLS) was developed with the purpose of improving the teaching of reading and the acquisition of reading skills around the world (Mullis et al., 2009). According to the PIRLS (2011) Assessment Framework, teachers use informal and formal assessment to monitor students' progress and achievement. As cited by Mullis et al. (2009), Lipson and Wixson (1997) pointed out that teachers use informal assessment to identify the needs of particular individuals, or to evaluate the pace at which concepts and materials are presented to students. As cited by Mullis et al. (2009), Kennedy et al. (2007) stated that teachers carry out formal tests, both teacher-made and standardised assessments, in order to make important decisions about the students; these decisions include grades or marks, promotion, or tracking. PIRLS (2011) provides a comprehensive picture of the reading literacy achievement of students in each participating country. The achievement reported includes reading purpose and comprehension process as well as overall reading achievement. Another study of students' reading comprehension skills was carried out in Finland in recent years. Merisuo-Storm and Soininen (2012) attempted to measure how well sixth-grade students aged 12 to 13 years understand a newspaper text and whether they are able to derive the meanings of certain words in it from the context.

Indicators of Reading Abilities
In the United States of America, the education department has been conducting a reading-related programme entitled the National Assessment of Educational Progress (NAEP). The programme was first administered in 1969. As reported by the U.S. Department of Education (2009), NAEP provides results which are currently used for three main purposes: (i) monitoring trends in students' achievement; (ii) providing evaluative statements based on the level of students' achievement; and (iii) making interstate comparisons. Evaluating students' level of achievement requires standards of performance: the levels of student performance (basic, proficient, and advanced) are defined and cut scores are established along the score scale. Setting the achievement levels requires evaluative judgments about the meaning of different levels of achievement, which moves the reporting from descriptive statements about students' achievements to evaluative statements about them. In addition, according to Broeder and Fu (2009), descriptors are used to promote transparency and coherence in language learning. The Common European Framework of Reference (CEFR) and the European Language Portfolio (ELP) have been the most influential documents in the fields of language learning and teaching in Europe and elsewhere over the last decade (Broeder & Fu, 2009). The CEFR adopts an action-oriented approach towards the use of language. A descriptive scheme is used to focus on the actions performed by persons to develop a range of general and communicative language competences.

Benchmarking
A benchmark refers to what students are expected to achieve at a given grade level (Airasian, 2001). According to Gronlund (2006), content standards consist of statements that specify in a general way what students should learn. Every standard is followed by a number of benchmarks. The benchmarks clarify what students who have achieved the content standards know or can do. Therefore, as Airasian (2001) pointed out, benchmarks are more specific than standards. According to Torrance (1995), benchmarking is being developed in many countries. Benchmarks and verbal descriptions are used as the basis for performance assessment. The approach can be found in the Toronto 'benchmark' Standards of Student Achievement in Canada. The main purpose of benchmarking is to provide descriptors in curriculum areas. Teachers can standardise their reporting of students' achievement by using the descriptors to gauge how well their students are doing.

Research Objectives
The objective of the study is to benchmark Year Five students' reading abilities in rural schools using a standardised written reading comprehension test consisting of questions at elementary, intermediate and advanced levels. In addition, this study attempts to develop indicators of students' reading ability based on the cut scores obtained from the pilot study, with the aim of providing teachers with specific information about what students can and cannot do in reading comprehension.

Sample
A total of 788 Year Five students from 15 primary schools located in rural areas were involved in this study.

Instrument
A standardised written reading comprehension test consisting of 50 multiple-choice questions was developed. The test consists of three sections: elementary level (12 questions), intermediate level (24 questions) and advanced level (14 questions). Each section contains questions at six levels of comprehension: literal, reorganisation, inferential, analysis, application and evaluation. The first three levels are based on Barrett's Taxonomy of Reading Comprehension (Literal, Reorganisation and Inferential) and the remaining three are higher-order thinking skills based on Bloom's Taxonomy (Analysis, Application and Evaluation). Following Mok (2000), as cited by Abdul Rashid et al. (2010), the proportion of the test questions was based on a distribution of difficulty levels of 25% easy, 50% average and 25% difficult. The test was developed based on linear and non-linear texts. The non-linear texts consist of different genres such as "birthday card" and "advertisement", whereas the linear texts comprise an article, a dialogue, an e-mail, an informal letter, and a story.
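As a quick arithmetic check, the section sizes given above can be compared with the 25/50/25 difficulty distribution. The sketch below assumes that the elementary, intermediate and advanced sections correspond to the easy, average and difficult categories respectively; the text does not state this mapping explicitly.

```python
# Section sizes as reported for the 50-item test.
sections = {"elementary": 12, "intermediate": 24, "advanced": 14}

# Target distribution attributed to Mok (2000); the mapping of sections
# to "easy/average/difficult" is an assumption made for this sketch.
target = {"elementary": 25.0, "intermediate": 50.0, "advanced": 25.0}

total_items = sum(sections.values())
shares = {name: 100 * n / total_items for name, n in sections.items()}

for name in sections:
    print(f"{name}: {shares[name]:.0f}% of items (target {target[name]:.0f}%)")
```

Under that assumption the actual split is close to, but not exactly, the target proportions.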

Piloting the Prototype Reading Comprehension Test
A total of 299 respondents from Year 4 (76 students), Year 5 (107 students), and Year 6 (116 students) of a selected school were involved in the pilot study. The pilot study allowed the researcher to establish the validity and reliability of the test. Hanna (1993) claimed that the reliability of a device is the extent to which its scores are consistent. Hanna (1993), citing Stanley (1971, p. 356), noted that the result contains a certain amount of error whenever anything is measured. As stated by Kubiszyn and Borich (2003), the reliability of a test refers to the consistency with which it yields the same rank for respondents taking the test more than once. Brown (2004) pointed out that a reliable test is consistent and dependable. As cited by Abdul Rashid et al. (2010), Popham (1999) noted that the most commonly used internal consistency procedure for a test consisting of multiple-choice items is the Kuder-Richardson method. The reliability of this test was found to be 0.85.
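To illustrate how a Kuder-Richardson coefficient is obtained from right/wrong item scores, the following is a minimal sketch of the KR-20 formula, KR-20 = k/(k-1) × (1 − Σpq / σ²), where k is the number of items, p is the proportion answering an item correctly, q = 1 − p, and σ² is the variance of total scores. The study does not say which Kuder-Richardson formula was applied, so KR-20 is an assumption here, as is the use of the population variance; the function name `kr20` and the toy data are hypothetical and are not the study's data.

```python
from statistics import pvariance


def kr20(responses):
    """Return KR-20 for a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])

    # Variance of the students' total scores (population variance assumed).
    totals = [sum(student) for student in responses]
    total_variance = pvariance(totals)

    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(student[item] for student in responses) / n_students
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Hypothetical toy data: four students, three items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(toy), 2))
```

A value closer to 1 indicates more internally consistent items; the coefficient of 0.85 reported for the pilot test would be read on the same scale.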

Developing Cut Scores for Bands
The scores obtained from the pilot study were used to categorise the respondents in order to determine the reading proficiency of the students. The respondents were categorised into six bands based on the revised Barrett's Taxonomy of Reading Comprehension (Day & Park, 2005), the revised Bloom's Taxonomy (Anderson et al., 2001) and the Malaysian English Language Syllabus (2003). Cut scores were used to categorise the respondents into the six bands (Band 1, 2, 3, 4, 5 and 6) as they could determine students' reading ability, and z-scores were used to develop the range of scores between bands. According to Carey (2001), a z-score determines how much a point deviates from the mean. Gronlund (2006) stated that z-scores express scores in standard deviation units; a z-score indicates how far a given raw score is above or below the mean.
From the findings of the pilot study, the mean and standard deviation were calculated. The mean was 23.0 and the standard deviation was 8.0. A raw score of 23, being equal to the mean, would be assigned a z-score of 0. The distance of one standard deviation was 8 raw score points everywhere along the baseline. The raw score of 31 (23 + 8) was the point one standard deviation above the mean. Table 1 shows the cut scores for the bands.
This study required quantitative data, which were used to develop the bands based on the scores gained from the test. Students' ESL reading proficiency was indicated by the different bands (Band 1 to Band 6). The data gathered were analysed using the following procedures. First, the scores obtained from the standardised written reading comprehension test were keyed into the computer. The Statistical Package for the Social Sciences (SPSS) version 20 was used to generate the statistical calculations. The results of the study were presented in the form of frequencies and percentages.
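The z-score arithmetic behind the cut scores can be sketched as follows, using the pilot mean of 23.0 and standard deviation of 8.0. The band boundaries in the sketch are an assumption: they place one cut at every whole standard deviation from −2 to +2, which reproduces the reported Band 5 range of 32-39, but the other ranges are illustrative only and should be checked against Table 1. The names `z_score`, `CUTS` and `band` are hypothetical.

```python
from bisect import bisect_left

MEAN = 23.0  # pilot-study mean
SD = 8.0     # pilot-study standard deviation


def z_score(raw):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - MEAN) / SD


# Assumed upper limits of Bands 1-5: raw scores at z = -2, -1, 0, +1, +2.
CUTS = [MEAN + z * SD for z in (-2, -1, 0, 1, 2)]  # 7, 15, 23, 31, 39


def band(raw):
    """Map a raw score (0-50) to Band 1-6 under the assumed cut scores."""
    return bisect_left(CUTS, raw) + 1


for raw in (23, 31, 32, 39, 40):
    print(raw, z_score(raw), band(raw))
```

A score exactly on a cut (for example 31 or 39) stays in the lower band, so Band 5 covers 32 to 39 as stated in the results.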

Developing the Students' Reading Indicators
To develop the students' reading indicators, the standardised written reading comprehension test was first constructed based on the Malaysian English Language Syllabus (2003), Barrett's Taxonomy of Reading Comprehension (Day & Park, 2005) and Bloom's Taxonomy (Anderson et al., 2001). After the test was administered, the results were analysed to benchmark Year 5 students' reading abilities. The students' reading abilities were reported using the performance bands, and a set of indicators of students' reading abilities was developed.
Indicators of reading ability were formed based on the students' reading proficiency (refer to Appendix A). By referring to each of the bands, teachers can gain a clear idea of which sub-skills of reading comprehension students have and have not mastered. The indicators of reading ability serve as a handy and practical diagnostic tool for determining ESL students' reading abilities as they clearly identify students' strengths and weaknesses. Teachers are provided with a reference for the students' progress and achievement when the descriptor is used horizontally. The respondents' performances in the reading comprehension test were described in terms of their ability to answer comprehension, application, analysis, synthesis, and evaluation questions, and were presented as Band 1 to Band 6. Therefore, each respondent is provided with a result ranging from Band 1 to Band 6.

Results and Discussion
The results of the standardised written reading comprehension test for the fifteen rural schools are shown in Tables 3, 4 and 5.

Year 5 Students' Reading Performance in Rural Schools
Albertson (2010) claimed that reading performance level descriptors are designed to define what a student knows and can do at a specific grade and to help parents, educators, and students understand the performance level scores a student receives. For this study, the scale of reading performance (Table 2) was developed based on the British Columbia Performance Standards: Reading for Information (Province of British Columbia, 2013) to suit Malaysian learners in Year 5. The three levels of reading performance of the Year 5 respondents were developed based on the Texas Assessment of Knowledge and Skills Performance Level Descriptors: Reading (Texas Education Agency, 2006). Table 3 illustrates the performance standards of Year 5 students from the fifteen rural schools. Of the 788 participants, only one respondent (0.1%) exceeded the expectations, 2.2 percent met the expectations and 97.7 percent were below the expectations. In terms of bands, 15 respondents were categorised in Band 1, 373 in Band 2 and 328 in Band 3. A further 54 students were categorised in Band 4 and only 15 students in Band 5 (scores of 32-39). The remaining student was categorised in Band 6. It can be concluded that the Year 5 students were unable to perform well. Research by Geske and Ozola (2008) on factors influencing reading literacy at the primary school level unambiguously showed notable literacy problems in rural schools. Likewise, the PISA (2009) results showed that in Turkey, the Slovak Republic, Chile, Mexico and Italy, as well as the partner countries Peru, Tunisia, Albania, Argentina and Romania, the performance gap between students in urban schools and those in rural schools is more than 45 score points.
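The percentages reported for the three performance levels can be reproduced from head counts. The sketch below assumes 1 student exceeding and 17 students meeting the expectations (the counts given in the gender and ethnicity results), with the remainder of the 788 participants falling below the expectations; the variable names are illustrative.

```python
TOTAL = 788  # Year 5 participants

# Head counts per performance level; "below" is derived as the remainder.
counts = {"exceeds": 1, "meets": 17}
counts["below"] = TOTAL - sum(counts.values())

# Percentage of all participants, rounded to one decimal place.
percentages = {level: round(100 * n / TOTAL, 1) for level, n in counts.items()}

for level, pct in percentages.items():
    print(f"{level}: {counts[level]} students ({pct}%)")
```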

Year 5 Students' Reading Performance by Gender
The study revealed that 48.2% of male students were categorised as "below expectations" whereas 48.2% of female students were categorised as "below expectations". Only one female student out of the 788 students exceeded the standard. Furthermore, 8 male students (1%) and 9 female students (1.2%) met the expectations. It can be concluded that female students performed better than male students.
The result is similar to the study conducted by Langen et al. (2006), which showed that female students have consistently outperformed male students in most countries.

Conclusion
The findings from the analysis of the data collected from participating students revealed that Year 5 students in rural schools did not perform well in the standardised written reading comprehension test. Further action has to be taken to address the situation, especially for students who are at "Below Expectations". A minority of the students managed to achieve "Meets Expectations" and "Exceeds Expectations", and these students' performance should be maintained. With this information, ESL teachers can tailor their instruction to meet the needs of students. At the same time, the state or district education department can also organise reading programmes to raise the standard of reading in rural schools.

Appendix A

II: Can use words, phrases, sentences and paragraphs to determine the meaning of unfamiliar words satisfactorily
I: Can use words, phrases, sentences and paragraphs to determine the meaning of unfamiliar words well

Summarising (R2)
III: Can summarise from a section of text as a whole satisfactorily
II: Can summarise from a section of text as a whole well
I: Can summarise from a section of text as a whole excellently

Synthesising (R3)
III: Can hardly synthesise by gathering information from at least two sources from the texts
II: Can synthesise by gathering information from at least two sources from the texts satisfactorily
I: Can synthesise by gathering information from at least two sources from [...]

[...] facts from opinions in implicit texts
II: Can distinguish facts from opinions in implicit texts poorly
I: Can distinguish facts from opinions in implicit texts well

Application
Making Application
III: Can hardly apply basic concepts of argumentation
II: Can apply basic concepts of argumentation poorly
I: Can apply basic concepts of argumentation satisfactorily

Table 1. Cut score

Table 2. Scale of reading performance

Table 3. Reading performance of Year 5 students in rural schools

Table 4 contains the frequency and percentage of Year 5 students' performance by ethnicity. The findings show that 0.5% of non-Malay students and 97.2% of Malay students were categorised as 'below expectations' in reading. Only 17 Malay students met the expectations and only one Malay student exceeded the expectations. This information will enable ESL teachers to prepare their instruction to meet the needs of the students.