On the Impact of Teacher-Made Vocabulary Tests vs. Standardized Vocabulary Tests on Reading Comprehension Performance of Iraqi Intermediate EFL Learners

This study investigates the impact of Teacher-Made Vocabulary Tests (TMVT) vs. Standardized Vocabulary Tests (STVT) on the reading comprehension performance of Iraqi intermediate EFL learners. After analyzing the data collected through treatment, assessment, and instrumentation, the researcher evaluated and interpreted the scores obtained by 66 young Iraqi female students in two language schools in Iraq. The interpretation shows how the participants responded to teacher-made vocabulary tests as compared with standardized vocabulary tests. The central question was the degree to which such vocabulary tests affect EFL learners' performance on reading comprehension tests. The researcher found that the use of standardized vocabulary tests plays a significantly positive role in promoting the reading comprehension skills of Iraqi EFL college students. She also found that teacher-made vocabulary tests play a significantly positive role in promoting these skills. Finally, the t test value of 2.635 indicated a significant difference between the STVT and TMVT groups. This means that the use of TMVT is more effective than the use of STVT in helping Iraqi EFL learners read more successfully.


Problem Statement
Standardized tests, the internationally recognized exams, are designed primarily to measure the English language proficiency levels of EFL learners worldwide. Owing to confidentiality policies, the details of such tests have always remained undisclosed to the public, language instructors, language assessors, and language learners. Teacher-made tests, by contrast, have always been open to scrutiny by other teachers, official testing institutions, families, experts, and students. Researchers have held that reading texts followed by questions are better received by students than most other types of reading passages. More recently, it has been suggested that passages followed by teacher-made vocabulary questions or tests are more effective in improving learners' vocabulary development and reading comprehension skills. Recent research on L2 vocabulary learning and reading comprehension has shown that readers need to develop essential processes and abilities such as rapid word recognition, vocabulary development, text structure awareness, and strategic reading. Yet all researchers recognize that the actual ability to comprehend a text comes about through reading, and doing a great deal of it, as the core of reading instruction (Marianne, C. & Muricia, E. 2011).
Although different strategies have been used to improve vocabulary and reading comprehension, none of them fosters a fun learning environment or reduces anxiety towards test-taking. Allowing students to make their own tests while they are reading the text will give them an opportunity to have more authority over their own learning and to feel more autonomous.
If teachers make their own tests from the text, they interact with the text more than usual while teaching it. Many EFL instructors feel that their reading comprehension is better than usual when they are able to demonstrate it on comprehension tests. Although researchers have done considerable work on other approaches, they have not studied these approaches comparatively.
Accordingly, this study attempted to test the effect of teacher-made vs. standardized vocabulary tests on the vocabulary development and reading comprehension performance of Iraqi college EFL students.

Significance of the Study
This study is expected to carry considerable theoretical and practical significance for foreign language assessment, teaching, and learning. Based on the statements above, the present research aims to show the effect of teacher-made vocabulary tests vs. standardized vocabulary tests on the reading comprehension of Iraqi EFL college students. According to the questionnaire that Patrick Smith (University of American Puebla) reported in his article "Learner Self-Assessment in Reading Comprehension: The case for teacher-constructed tests" (1994), 76% of teachers felt that the process of writing a test would help them to perform better in teaching vocabulary and reading comprehension passages.
Teacher-made vocabulary test questions allow the faculty to assess at least three aspects of student learning. In these questions, teachers see what their students consider the most important or memorable content, what they understand as fair and useful test questions, and how well they can answer the questions they have posed. This information not only provides direction for teaching but can also alert the teacher when students have inaccurate expectations about upcoming exams. Teacher-made vocabulary test questions help students assess how well they know the material, and faculty feedback can refocus their studying.

Relevant Scholarship (Related Research Results)
Tests are designed to measure the extent to which an individual has "achieved" something, acquired specific information, or mastered a specific skill, usually as a result of planned instruction or training. It is the task of education to bring about deep learning and desirable changes in those who experience the educational process. For this task to be performed effectively and efficiently, educators must have a useful means of assessing the initial state of a particular individual or group and the changes that have occurred as a result of instruction. In addition to gauging the full extent or direction of change, it is often important for teachers to form a qualitative assessment of the desirability of that change (Charles & Richard, 1990). However, student performance on standardized exams administered by public examination boards has been unimpressive over the years, and this is a concern for everyone involved in the education sector. Olatoye (2014) found that Nigerian science students consistently performed poorly on internal and external exams. The observed decline in student performance on public standardized tests raises the question of whether high failure rates reflect the quality of test use in schools. In other words, the tests used by teachers as a means of classroom assessment in the teaching and learning process may explain the observed poor academic performance of students (Charles and Richard, 2010). Standardized tests can be used to supplement classroom tests created by teachers. Like teacher-made tests, standardized tests help measure student progress and achievement and determine how well students have acquired knowledge and skills.
Combining teacher-administered testing and observation with carefully selected standardized tests provides a comprehensive program of student assessment and program evaluation in education (Okpala, Onocha, and Oyediji, 2012). The process of developing standardized tests provides insight into their intended function. Test standardization is the process of administering a test to a representative sample of test takers in order to establish norms and clearly defined procedures for administration and evaluation, including normative data. A test that has such procedures and normative data is considered standardized.
Three definitions of reading comprehension have been prevalent in literacy programs in the United States of America, as reported by Foertsch (1998). According to the first definition, learning to read means learning to pronounce words. There are a number of other definitions in this area that need to be studied for a fuller understanding of the concept of reading. Such definitions recognize the importance of teaching skills as part of the reading process (Allington & Cunningham, 1996; International Reading Association and National Early Childhood Education Association, 1998; Maryland State Department of Education, n.d.; Snow, Burns, & Griffin, 1998). They also support balanced reading instruction for all students (Allington & Cunningham, 1996; Au, 1993; Foertsch, 1998; International Reading Association and National Association for the Education of Young Children, 1998; Snow, Burns, & Griffin, 1998). Comprehension is an important goal for most language learners. As defined by the Partnership for Reading (2005), reading comprehension is the understanding of the text read or the process of creating meaning from the text. Comprehension is a "construction process" because it involves all the components of the reading process working together, as a text is read, to create a representation of the passage in the reader's mind.
Success in all academic areas depends on the ability to read and comprehend. Reading comprehension is a dynamic process that requires interaction between the reader and the text as a quintessential part of the activity. Mastery of the other components (phonemic awareness, phonics, vocabulary development, and fluency) facilitates reading comprehension. The act of comprehension is so sophisticated that no single instructional approach can meet the needs of all readers with all texts in all learning situations (Snow, Burns, & Griffin, 1998).

Research Questions and Formulated Hypotheses
To frame the study and its assumptions, the researcher poses the following research questions:
RQ1. Does the use of teacher-made vocabulary tests significantly promote the reading comprehension skills of Iraqi EFL college students?
RQ2. Does the use of standardized vocabulary tests significantly promote the reading comprehension skills of Iraqi EFL college students?
RQ3. Is there any significant difference between the impact of teacher-made vocabulary tests and that of standardized vocabulary tests in promoting the reading comprehension skills of Iraqi EFL college students?
The corresponding null hypotheses are:
H01. The use of teacher-made vocabulary tests (TMVT) does not reliably promote the reading comprehension performance of Iraqi EFL college students.
H02. The use of standardized vocabulary tests (STVT) does not reliably promote the reading comprehension performance of Iraqi EFL college students.
H03. There is no reliable difference between the impact of teacher-made vocabulary tests (TMVT) and that of standardized vocabulary tests (STVT) in promoting the reading comprehension performance of Iraqi EFL college students.

Participants
For the purposes of this study, the researcher recruited 60 homogeneous participants from a population of 120 Iraqi EFL students. The participants were all female Iraqi EFL learners studying English at an Iraqi college in Baghdad. They were all intermediate EFL learners with a largely homogeneous background in English in Iraq. They had studied English as a compulsory course in high school and had had no exposure to English in any language school apart from their state school. In the term of the study, the participants were studying the book "Developing Reading Proficiency 2" as a four-credit course, having passed "Developing Reading Proficiency 1" in the first term. The participants' ages ranged from 19 to 25. To make sure that the participants were at the same proficiency level, the researcher administered a TOEFL Reading Proficiency test. Having confirmed that the 60 participants were homogeneous, the researcher divided them at random (by flipping a coin) into two classes of thirty students each. One class was called the control group and the other the experimental group.

Apparatus and Instruments
In this research, three instruments were used. Standardized tests are constructed by qualified, professional test designers, mostly in England and the USA, where ESOL and ETS are hugely popular. IELTS, TOEFL, CELPIP, CAE, FCE, CAEL, and GESE are some of these tests. A teacher-made test, as the name suggests, is a test constructed by a teacher. Ahman and Glock (1971) argue that a standardized test is one that is almost always carefully designed by groups of people rather than by a single individual, as teacher-made tests are. In this study, teacher-made vocabulary tests are those vocabulary tests developed by the teacher and used experimentally in the classroom in Iraq, where the study took place, to measure and assess students' vocabulary knowledge.
elt.ccsenet.org English Language Teaching Vol. 16, No. 7; 2023

Standardized Vocabulary Tests (STVT)
Regardless of format, vocabulary tests are generally considered interchangeable indicators of vocabulary knowledge, a notion that Spearman (1927) termed the "indifference of the indicator." In his extensive analyses of the factor structure of human abilities, Carroll (1993) concluded, "The exact form in which vocabulary knowledge is measured does not usually change the factor composition of the variables so much that the main characteristic being measured is native vocabulary" (p. 158). The standardized vocabulary tests in this study are those vocabulary tests traditionally used in Iraqi language schools and colleges to measure and evaluate students' vocabulary skills in class.

TOEFL (English Language Proficiency Test)
A TOEFL Reading Proficiency test was administered to check the homogeneity of the participants. The original test is available at www.iielatinamerica.org. The reliability of this test was 0.93 (93%). The researcher calculated this reliability using the KR-21 formula.

Pretest
After dividing the homogeneous participants into a control group and an experimental group and piloting the test with a group of 20, the researcher administered a pretest of reading comprehension to the students in both classes. Five reading texts followed by 20 questions in multiple-choice and WH-question format were chosen for the pretest from the book "Developing Reading Proficiency 2" by Dr. M. H. Tahririan (2010), published by Payam Noor University.
To construct the pretest, the researcher determined the readability of the texts using the Flesch readability formula. This was done with Word 2007, and the mean score was calculated. The texts chosen for the comprehension test had readability scores between 59.77 and 73.81. Then the test was scored.
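For readers who want to see what a Flesch score of this kind involves, here is a minimal sketch of the Flesch Reading Ease formula, 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words). The function name and the counts are hypothetical, and the sketch takes the counts as given; it does not reproduce the way Word counts words, sentences, or syllables.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease score: higher values mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical counts for one passage, NOT taken from the study's texts.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 2))
```

Longer sentences and longer words both lower the score, which is consistent with the scale described in the paper, where a lower score indicates a more difficult text.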

Post-test
The researcher administered a post-test of reading comprehension to the students in both classes to measure the result of the treatment. The post-test consisted of the same five texts that were used in the pretest. The texts were followed by 20 questions in multiple-choice and WH-question format, but these questions were neither the author-made questions used with the control group nor the student-made questions produced in the treatment group. The post-test was therefore different from the tests used with both the control and treatment groups. A piloted teacher-made reading comprehension test was used for the post-test, because the aim of the post-test was to find out the effect of the treatment in this study.

Study Design
Among the available research designs, the one that best fits the purpose of the present study is the quasi-experimental pretest-post-test control group design (Seliger & Shohamy, 1989). It is illustrated in Table 1.

Procedure
The study was carried out through a comparative-correlative analysis of the scores obtained by participants who experienced either teacher-made tests or standardized tests of vocabulary. Sample size is normally not less than 25. For this study, the researcher chose 60 homogeneous participants from a population of 120 college students by administering a TOEFL test. Participants of both genders, all English majors, were recruited. They were from Baghdad University and were studying in their second term. In that term, the participants were studying the book "Developing Reading Proficiency 2" as a four-credit course, having passed "Developing Reading Proficiency 1" in the first term. The participants' ages ranged from 19 to 25. To make sure that the participants were at the same proficiency level, the researcher administered a TOEFL Reading Proficiency test. The scores were ordered from low to high, and the 60 participants who scored between 60 and 80 were chosen as participants of the study.
Having confirmed that the 60 participants were homogeneous, the researcher divided them by odd and even numbers into two groups of 30 and then named the two groups at random (by flipping a coin). One group served as the control group (taking standardized vocabulary tests in class) and the other as the experimental group (exposed to teacher-made tests in class). After dividing the participants into two classes, the researcher administered a pretest of reading comprehension to the students in both classes. The pretest was first piloted with 20 subjects who were at the same level as the participants, and the reliability of the pilot test was then calculated with the split-half method, using the Spearman-Brown formula.
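The split-half procedure with the Spearman-Brown correction can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say how the test was split, so an odd/even item split is assumed here, and the function names and the small response matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per test taker, one 0/1 entry per item.

    Assumption: the halves are the odd-numbered and even-numbered items.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))


# Hypothetical responses of four pilot test takers to six items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 3))
```

The correction is needed because the correlation between two half-length tests underestimates the reliability of the full-length test.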

Results
This study was conducted to investigate the effect of using teacher-made vocabulary tests vs. standardized vocabulary tests on the reading comprehension of Iraqi EFL college students. The researcher initially assumed that standardized vocabulary tests are used conventionally in college reading comprehension classes to promote reading comprehension ability. By the same token, the researcher asked the teacher to construct some vocabulary tests to use in the experimental group's class to promote reading comprehension. The purpose of this research was to see whether using these tests had any significant effect on improving the students' ability to deal with reading comprehension. After testing, scoring, and tabulating the results, the researcher carried out a series of statistical tests and analyses to test the research hypotheses.

Results on TOEFL Test
To make sure that the participants to be recruited as the control and experimental groups from a population of 120 were homogeneous, the researcher administered a TOEFL proficiency test. She then ordered the scores from low to high and chose the 60 participants whose scores fell between 60 and 80. Having confirmed that the 60 participants were homogeneous, the researcher divided them by odd and even numbers into two groups of 30 and then chose at random (by flipping a coin) one group as the control group and the other as the experimental group.
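The selection and assignment steps just described (keep scores in the 60 to 80 band, order them, split by odd and even positions, and flip a coin to label the groups) can be sketched as below. The function name, the participant IDs, and the scores are hypothetical; the sketch simply keeps everyone in the band and does not force the count to 60.

```python
import random


def select_and_assign(scores, low=60, high=80, seed=None):
    """scores: dict mapping participant ID to proficiency score.

    Returns a dict with 'control' and 'experimental' lists of IDs.
    """
    rng = random.Random(seed)
    # Keep only participants inside the score band, ordered low to high.
    eligible = sorted(
        (pid for pid, score in scores.items() if low <= score <= high),
        key=lambda pid: (scores[pid], pid),
    )
    # Odd/even split on the ordered list.
    group_a, group_b = eligible[0::2], eligible[1::2]
    # Coin flip decides which group is labelled control.
    if rng.random() < 0.5:
        return {"control": group_a, "experimental": group_b}
    return {"control": group_b, "experimental": group_a}


# Hypothetical population of 120 scores, NOT the study's data.
population = {f"S{i:03d}": 40 + (i % 60) for i in range(120)}
groups = select_and_assign(population, seed=1)
print(len(groups["control"]), len(groups["experimental"]))
```

Splitting an ordered list by alternating positions keeps the two groups close in score distribution, while the coin flip only decides which group receives which label.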
The reliability of the TOEFL test, which was used to gauge the homogeneity of the participants, was obtained with the KR-21 formula and was 0.93. Notably, this index, reported on the basis of ETS standards, is a high reliability index.

Readability of Texts for the Study
Mousavi (1991) defines readability as "a measure of understandability of written text as given by an analysis of a variety of factors including syntax complexity, vocabulary, thematic expression and continuity of themes" (p. 310). In this study, the readability indexes of 14 reading texts from the book "Developing Reading Proficiency 2" by Dr. M. H. Tahririan (2010), published by Payam Noor University, were calculated with the Flesch readability formula. The score in this formula is on a scale of 0 to 100; the lower the score, the more difficult the writing is to read. Table 4 shows the readability of the texts.
Readability and text selection for pretest and post-test
Five texts were used for the pretest and post-test. Table 6 shows the readability of each text (valid N, listwise = 5).

Pilot of Pretest
Initially, to determine the reliability of the reading comprehension test, the test was given to a group of 20 people in a pilot study, after which the researcher estimated its reliability using the split-half method and the Spearman-Brown correlation coefficient. The correlation coefficient is a measure of the strength of the relationship between two sets of data. It takes a value between -1.00 and 1.00; the closer the value is to 1.00, the more reliable the test. Levene's test was employed to check the equality of variances of the pretest scores in the two groups and to indicate that the scores are consistent. The t test was then used to determine the equality of means.
H0: The variances are equal.
H1: The variances are not equal.
Table 12 indicates equal variances because the p value is 0.73, which is higher than α = 0.05. The p value of the t test, .50, is also greater than .05. In addition, the observed t value of .66 is less than the critical t value of 1.67 at the 0.05 level of significance. Therefore, we can safely claim that the mean pretest scores of the control and experimental groups are not different.
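The two statistics used in this comparison, Levene's statistic for equality of variances and the pooled-variance independent-samples t statistic, can be sketched with the standard library alone. This is an illustration, not the software procedure used in the study: the function names and score lists are hypothetical, Levene's statistic is computed in its original mean-based form, and no p values are computed. With two groups of 30, the degrees of freedom would be 58.

```python
from math import sqrt
from statistics import mean, variance


def levene_statistic(a, b):
    """Levene's W for two groups, using absolute deviations from group means."""
    mean_a, mean_b = mean(a), mean(b)
    za = [abs(x - mean_a) for x in a]
    zb = [abs(x - mean_b) for x in b]
    grand = mean(za + zb)
    between = len(za) * (mean(za) - grand) ** 2 + len(zb) * (mean(zb) - grand) ** 2
    within = sum((z - mean(za)) ** 2 for z in za) + sum((z - mean(zb)) ** 2 for z in zb)
    # With two groups, (N - k) / (k - 1) reduces to N - 2.
    return (len(a) + len(b) - 2) * between / within


def pooled_t(a, b):
    """Independent-samples t statistic assuming equal variances; returns (t, df)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical pretest scores, NOT the study's data.
control = [62, 65, 70, 68, 71, 66]
experimental = [64, 69, 72, 67, 73, 70]
print(round(levene_statistic(control, experimental), 3))
print(pooled_t(control, experimental))
```

Levene's test is run first because the pooled-variance t statistic is appropriate only when the two group variances can be treated as equal.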

Pilot of Post-test
First, to determine the reliability and validity of the reading comprehension test, the test was administered to a group of 20 as a pilot study, and the researcher then examined its reliability using the split-half method and the Spearman-Brown correlation coefficient. The correlation coefficient is a measure of the strength of the relationship between two sets of data. It takes a value between -1.00 and 1.00; the closer the value is to 1.00, the more reliable the test.
The researcher administered a post-test of reading comprehension to the students of both classes to obtain the result of the treatment. The post-test used the same five texts as the pretest. The texts were followed by 20 questions in multiple-choice and WH-question format, but these questions were neither the questions used with the control group nor the questions used in the treatment group. The post-test questions were therefore different for both the control and treatment groups. A piloted teacher-made comprehension test was used for the post-test, because the purpose of the post-test was to find out the effect of the treatment in this study.
Table 17 indicates equal variances because the p value is .29, which is higher than α = 0.05 (* p < .05 indicates a significant difference). Table 17 also provides sufficient evidence to reject the third null hypothesis of the study (there is no significant difference between the effects of teacher-made vocabulary tests and standardized vocabulary tests in promoting Iraqi EFL students' reading skills), because the p value, 0.011, is less than 0.05. The observed t value of 2.63 is also greater than the critical t value of 1.67 at the 0.05 level of significance.
Therefore, we can safely say that the use of teacher-made vocabulary tests in the classroom is more effective than the use of standardized vocabulary tests in promoting Iraqi EFL learners' reading comprehension skills.

Discussion
In this section, the researcher compares some findings of the present research with the findings of other researchers. In recent years, a great deal of research in the L1 and L2 fields has been conducted on reading strategy training via vocabulary test exposure. Strategy training rests on the assumption that success in learning mainly depends on appropriate strategy use and that unsuccessful learners can improve their learning by being trained to use effective strategies (Dansereau, 1985; Weinstein & Underwood, 1985). Many studies have shown that vocabulary strategies can be taught to students and that, when taught, they help improve student performance on tests of reading comprehension and recall (Carrell, 1985; Brown & Palincsar, 1989; Carrell, Pharis, & Liberto, 1989). We can consider comprehension strategies as plans or procedures that readers use and apply when they hear text read aloud, when they read text with a teacher, and when they read independently. According to Iwasaki, I. (TESL Journal, 2008), "using student-made quizzes as a method of learning in the EFL classroom, instead of as a mere method of testing proficiency in the L2, may be one way to foster a fun learning environment and reduce anxiety toward test taking." Using teacher-made vocabulary tests as well as standardized vocabulary tests can therefore be one way to help EFL students comprehend the text better, foster a fun learning environment, and reduce anxiety toward test-taking. Many studies have shown that students who answer questions while studying a text perform better than students who do not (see Hamaker, 1986; Hamilton, 1985). Teacher-made vocabulary tests are a simple but productive way to support reader engagement with the text. According to Poway Unified District (PUSD, 2005), while reading a chunk of text, teachers write down vocabulary questions they have about what they teach and what will happen next in the reading comprehension classroom.
As mentioned earlier, this study investigated the effect of teacher-made vocabulary tests vs. standardized vocabulary tests on the reading comprehension of Iraqi college EFL students. Using teacher-made vocabulary tests in class in the experimental group led to better performance in reading comprehension on the post-test than using standardized vocabulary tests.

Conclusion
Although different techniques and strategies have been used to improve reading comprehension, few of them foster a fun learning environment and reduce anxiety towards test-taking. Allowing teachers to make their own vocabulary tests, while using standardized vocabulary tests for convenience, will give them an opportunity to exercise more authority over their own teaching plans. If teachers make their own tests from the texts, they interact with the texts more than usual while teaching them to the students. Many students of EFL feel that their reading comprehension is better than usual when they are able to demonstrate it on comprehension tests (Patrick . Although researchers have done considerable work on other approaches, they have not studied these approaches comparatively. As a result, this study attempted to show the effect of teacher-made vocabulary tests vs. standardized vocabulary tests on how Iraqi EFL students deal with reading comprehension passages.
The main concern of this study was to examine whether or not using teacher-made tests can have any positive effect on the ability of Iraqi college EFL students to deal with reading comprehension passages.
To determine whether there were any significant changes in the participants, the pretest results of each group were compared with its post-test results using the t test. The statistics show reliable growth in the experimental group; students in this group benefited highly from the treatment. Moreover, the t test indices allowed the researcher to reject the null hypotheses and thus to answer the research questions. This study is by nature twofold: it covers both a macro study of teaching professionalism and a micro study of learning development. To support her claims, the researcher piloted the research questions and tests, followed the situation-specific statistical procedures, and arrived at the statistical results elaborated above. In this section, the researcher also summarizes the investigation procedure along with the findings and then mentions the pedagogical implications of the study. In the final section, she provides some suggestions for further study. Based on the t test tables presented earlier and the results obtained, the researcher arrived at the following findings:
The use of standardized vocabulary tests plays a significantly positive role in promoting the reading comprehension skills of Iraqi EFL college students. Based on the t test results for the pretest and post-test data, the mean score of the control group was 67.67 in the pretest and increased to 72 after the STVT treatment.
The use of teacher-made vocabulary tests plays a significantly positive role in promoting the reading comprehension skills of Iraqi EFL college students. Based on the t test results for the pretest and post-test data, the mean score of the experimental group was 69.83 in the pretest and increased to 80.17 after the TMVT treatment.
Comparing the t and F indices of the independent-samples t test analyses for the pretest and the post-test, we can see that the t value increased from .66 to 2.635 and the F value increased from .113 to 1.139. The results obtained reject the first, second, and third null hypotheses.
The third null hypothesis is rejected reliably, since the t test value of 2.635 indicated a significant difference between STVT and TMVT. This means that the use of TMVT is more effective than the use of STVT in a reading comprehension class.
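To make the within-group comparisons concrete, the sketch below computes the mean gains from the pretest and post-test means reported above and shows how a paired-samples t statistic would be computed from individual scores. Two assumptions apply: the 69.83 to 80.17 pair is read as the experimental (TMVT) group's, since TMVT was the experimental treatment, and a paired t test is assumed for the pretest/post-test comparison, which the paper does not specify. The individual scores passed to paired_t are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Group means as reported in the Conclusion: (pretest, post-test).
reported_means = {
    "control (STVT)": (67.67, 72.0),
    "experimental (TMVT)": (69.83, 80.17),
}


def mean_gain(pre_mean, post_mean):
    """Difference between post-test and pretest group means."""
    return post_mean - pre_mean


def paired_t(pre, post):
    """Paired-samples t statistic from matched pretest/post-test scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


for group, (pre, post) in reported_means.items():
    print(group, round(mean_gain(pre, post), 2))

# Hypothetical individual scores, NOT the study's data.
print(round(paired_t([60, 65, 70, 72], [66, 70, 78, 75]), 3))
```

On the reported means, the gain for the TMVT group is more than twice the gain for the STVT group, which is in line with the direction of the between-group difference reported for the post-test.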