Distribution of Articles in Malaysian Secondary School English Language Textbooks

This paper reports the results of a corpus-based study on English grammar articles presented in the Malaysian Form 1 to Form 5 English Language textbooks. The study aimed to find out the distribution patterns of the articles and the distributions of their colligation patterns in the secondary school English Language textbooks. The findings showed that all the three articles (a, an, the) are presented in all the five English Language textbooks and that their frequency of occurrences has an increasing trend from Form 1 to Form 5. However, the distributions of the colligation patterns of the articles showed inconsistency from one form to another. Some colligation patterns were over-emphasized while others were neglected in the English language textbooks. This study indicates that a textbook corpus can be useful in analyzing the presentation of grammatical structures (articles, in the case of this research). The findings can provide guidance to teachers to improve their pedagogical practices in the teaching of articles and to cater to the weaknesses of the presentation of articles in the textbooks.


Introduction
The English article system is one of the most commonly used aspects of grammar in the English language. An article is used as a sentence element to determine nouns and to specify the definiteness of nouns. Article usage is very important as it contributes to the correct use of English grammar. The English article system is said to be "one of the most difficult grammatical items for the non-native English speakers to learn" (Matsuura & Yamada, 1982, p. 50), and it is also "one of the latest to be fully acquired" (Master, 1990, p. 461). According to Matsuura and Yamada (1982, p. 50), although the article is "one of the smallest grammatical items", the difficulties in learning the articles are undeniable. Celce-Murcia and Larsen-Freeman (1999) argue that articles are a well-known problem area of English grammar. Learners face difficulties in learning how to use articles because of their complicated usage (DeCapua, 2008).
All three articles (a, an, the) are listed in the Secondary School English Language Curriculum in Malaysia (Ministry of Education, 2003). Although Form 1 to Form 5 learners are exposed to articles, they still commit frequent errors in using them in sentences (Suhaila Haji Mokhtar, 2002). Examining the articles' distribution and colligation patterns in the Form 1 to Form 5 English Language textbooks would give stakeholders an understanding of the content of these books and may help explain why Malaysian learners commit errors in using the articles.
The textbook is the main material used by teachers and learners in the classroom. The Textbook Bureau of Malaysia's Ministry of Education provides textbooks for Malaysian schools to be used in the English language classrooms. According to Nooreen Noordin and Arshad Abdul Samad (2005, p. 1), the textbook is an essential part of an English class as "the major source of contact they [learners] have with the language apart from the input provided by the teacher". The textbook enables the learner to revise and work independently inside and outside the English language classroom (Mukundan, 2004). Although these textbooks are prepared according to the guidelines of the English language syllabus set by the Ministry of Education (Murugesan, 2003), they are also said to be written based on writers' intuition rather than on empirical studies (Mukundan, 2004). As textbooks are used as the core material in Malaysian classrooms, it is crucial to look at how important grammatical structures like articles are presented in them.

Objective of the Study
This study seeks to investigate the distribution and colligation patterns of articles across and within the Malaysian Form 1 to Form 5 English Language textbooks.

Research Questions
In order to meet the aforementioned objective, the following research questions were posed:
1) What are the distribution patterns of the articles used across and within the Malaysian Form 1 to Form 5 English Language textbooks?
2) What are the distributions of word classes that colligate with the articles used across and within the Malaysian Form 1 to Form 5 English Language textbooks?

Literature Review
The main aspect of language that is analyzed in textbook corpus research is the recurrence of vocabulary in the textbook corpus. The more the words are repeated and recycled in the textbooks, the more effective is the language learning process of the learners. Kachru (1962) as well as Crothers and Suppes (1967) claim that vocabulary acquisition depends highly on words that are repeated and recycled more than seven times in the textbooks that are still in use in the language classrooms. This has also been supported by Thornbury (2002, p. 24) who posits, "it has been estimated that, when reading, words stand a good chance of being remembered if they have been met at least seven times". The textbook is one of the core materials that should provide a chance for the learner to repeat the new vocabulary items or grammatical elements efficiently. Evaluation of the distribution patterns of these linguistic units can provide language teachers and textbook writers with useful information to improve the quality of students' learning.
Several studies have been carried out on the presentation of linguistic elements in English language textbooks in Malaysia. Singh and Mukundan (2005) conducted a study on the Malaysian Form 2 English Language textbook. They investigated the discrepancies between the distribution of verbs in the textbook and the presentation of verbs in the Malaysian Form 2 English Language syllabus. They reported that the textbook failed to fulfill the conditions on verb teaching set by the syllabus and that the way verbs were taught in the textbook made verb acquisition difficult for the learners. Hence, the inappropriate verb teaching in the textbook may be one of the reasons for most Malaysian school leavers' low level of English Language ability (Singh & Mukundan, 2005).
In another textbook corpus research, Mukundan and Roslim (2009) examined the presentation of the prepositions in the Malaysian Form 1 to Form 3 English Language textbooks. They reported that the frequency order of the prepositions in the textbooks does not match with the frequency order of the prepositions in the British National Corpus (BNC). Moreover, they reported that there are several prepositions which are used differently in different categories as these prepositions have different functions. This, according to Mukundan and Roslim (2009), might lead to learners' confusion.
Furthermore, Mukundan and Khojasteh (2011) compared the presentation of modal auxiliary verbs in the Malaysian textbooks and those presented in the BNC. They reported that the modal auxiliary verbs presented in the Malaysian lower secondary school textbooks are not identical to the modal auxiliary verbs presented in the BNC. This indicates that the Malaysian Form 1 to Form 3 textbooks fail to present the more common and important aspects of grammar use in the real language (Mukundan & Khojasteh, 2011).

Methodology
This research is a corpus-based study which utilizes the data from the Malaysian Secondary School English Language textbooks corpus compiled by Mukundan and Anealka Aziz (2007). A corpus-based approach is crucial in this study as the researchers aim to analyze empirical data in order to conduct language-based research. It has been claimed that corpora are the most suitable material for research on curricula and textbooks (Super, 2004). Hence, the Malaysian Secondary School English Language textbooks corpus is used in this study.
This study employs content analysis to analyze the use of English grammar articles in the textbooks. Krippendorff (2004) defines content analysis as an analysis of the whole texts, images and symbols that occur in a textbook. It is a quantitative analysis, summarizing the whole text (Neuendorf, 2004, in Menon, 2009). Ary, Jacobs and Sorenson (2010) state that content analysis is the method used in analyzing a specific characteristic found in a text, whether it is a written text or a visual text, such as the textbook corpus. In order to observe the occurrence and distribution patterns and the colligation patterns of the three articles presented in the Malaysian Secondary School English Language textbooks, a computer-aided content analysis method was used.

Sample
The sample that was used for this study was the Textbooks Corpus compiled by Mukundan and Anealka Aziz (2007). It comprises the prescribed textbooks used by the Malaysian secondary school English Language learners of Form 1 to Form 5. This corpus consists of 311,214 running words. These textbooks were used at schools for 12 years from 1990 to 2002. Mukundan and Anealka Aziz (2007) developed this corpus to study the recurrence and repetition of vocabulary in each textbook. The corpus can help researchers to study the way grammar is presented in the textbooks.

Instrumentation
WordSmith Tools was the main instrument used for this research. It was considered the most suitable tool for the purpose of this study. Many researchers recognize it as the most capable tool to analyze the data of a corpus quantitatively (Menon, 2009; Mukundan & Roslim, 2009; Mukundan & Menon, 2006; Baker, 2006; de Klerk, 2004, 2005; Mukundan, 2004; Flowerdew, 2003; Henry & Roseberry, 2001; Scott, 2001; Nelson; Bondi, 2001). This software programme was designed by Mike Scott (1996) to enable learners and researchers alike to have better access to the software at their own space and time using their own personal computer (Scott, 2001). WordSmith is able to produce frequency listings, alphabetical listings, keywords in context, investigation of the keywords and investigation of the words used with the keywords. For this study, only the WordList Tool and the Concord Tool were used. The WordList Tool produces the list of words with their frequency of occurrences, and the Concord Tool displays concordance lines which show the location and dispersion of the keywords in the corpus.
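The two kinds of output described above, a frequency word list and keyword-in-context concordance lines, can be illustrated with a minimal Python sketch. This is not how WordSmith is implemented; the function names (tokenize, word_list, concordance), the simple regular-expression tokenizer and the sample sentence are assumptions made purely for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of the keyword."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Hypothetical sample text, not taken from the textbook corpus.
sample = "The boy saw an owl. A girl fed the owl an apple in the park."
tokens = tokenize(sample)

# Frequency of each article in the sample.
article_counts = {w: f for w, f in word_list(tokens) if w in {"a", "an", "the"}}

# Concordance lines for the article "an".
an_lines = concordance(tokens, "an")
for left, key, right in an_lines:
    print(f"{left:>20} [{key}] {right}")
```

A real corpus would need a more careful tokenizer (for example, for hyphenated words and numerals), which is one reason a dedicated tool is normally preferred.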

Distribution Patterns of the Articles Used across and within the Textbooks
There are three articles (a, an, the) which are required to be taught in the KBSM syllabus for lower secondary (Form One, Form Two and Form Three) and upper secondary (Form Four and Form Five) students. The frequency of occurrences of all the articles was investigated in this research in order to obtain the number of times these articles are presented to students throughout the texts in the Form One to Form Five English Language classrooms. There were collectively 26,822 articles in the textbook corpus. As Figure 1 shows, the article the had the highest frequency (18,270) while the article an had the lowest frequency (898) in the five English Language textbooks. The article a had a moderate frequency of 7,654. In other words, out of the total number of article occurrences in the textbook corpus, 68.1% were occurrences of the article the, followed by the article a (28.5%) and the article an (3.3%). Therefore, the article the was the most frequent article throughout the Form 1 to Form 5 English Language textbooks in Malaysia.
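The percentages above follow directly from the reported frequencies. As a small check, the following Python sketch recomputes each article's share of the total; the variable names are illustrative only, and the counts are the ones reported in this section.

```python
# Article frequencies in the whole textbook corpus, as reported in this section.
counts = {"the": 18270, "a": 7654, "an": 898}

# Total number of article occurrences.
total = sum(counts.values())

# Share of each article as a percentage of all article occurrences, to one decimal place.
shares = {article: round(100 * n / total, 1) for article, n in counts.items()}

print(total)
print(shares)
```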
Regarding the distribution of the articles (Table 1), in the Form 1 textbook, the article the makes up 70.8% (3001 occurrences) of the overall articles (4239), followed by the article a with 25.9% (1097 occurrences) and the article an with 3.3% (141 occurrences). Similarly, in the Form 2 textbook, the article the is the most frequently occurring article with 64.1% (2496 occurrences of the overall 3897 articles), followed by the article a with 32.6% (1271 occurrences) and the article an with 3.3% (130 occurrences). Of the 5101 articles in the Form 3 textbook, the article the comprises 64.9% (3309 occurrences), followed by the article a with 32.0% (1630 occurrences) and the article an with 3.1% (162 occurrences). In the Form 4 textbook, the article an is again the least frequently occurring article with 3.1% (209 occurrences) of the overall articles. Ahead of it is the article a with 27.9% (1894 occurrences), and the most frequently occurring article is the article the (69.0%, 4685 occurrences). In the Form 5 English Language textbook, the article the has the highest frequency with 70.3% (4779 occurrences), followed by the article a with 25.9% (1762 occurrences), while the article an has the lowest frequency with 3.8% (256 occurrences). Hence, in each of the Form 1 to Form 5 Malaysian English Language textbooks, the article the has the highest frequency of occurrences, the article a ranks second, and the article an has the lowest.
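The per-form counts can be cross-checked against the corpus-wide totals with a short Python sketch. The dictionary layout and variable names below are assumptions for illustration; the counts themselves are those reported above.

```python
# Occurrences of each article in each textbook, as reported in this section.
per_form = {
    "Form 1": {"the": 3001, "a": 1097, "an": 141},
    "Form 2": {"the": 2496, "a": 1271, "an": 130},
    "Form 3": {"the": 3309, "a": 1630, "an": 162},
    "Form 4": {"the": 4685, "a": 1894, "an": 209},
    "Form 5": {"the": 4779, "a": 1762, "an": 256},
}

# Total number of articles in each textbook.
form_totals = {form: sum(c.values()) for form, c in per_form.items()}

# Total occurrences of each article across the five textbooks.
article_totals = {
    article: sum(c[article] for c in per_form.values())
    for article in ("the", "a", "an")
}

# True if the ranking the > a > an holds in every textbook.
same_ranking = all(c["the"] > c["a"] > c["an"] for c in per_form.values())

print(form_totals)
print(article_totals)
print(same_ranking)
```

Summing per form and per article in this way confirms that the per-form figures add up to the 26,822 articles reported for the whole corpus.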

Distribution of Word Classes Colligating with the Articles Used across and within the Textbooks
The colligations of word classes with each article were examined based on a list of 15 structures proposed by Celce-Murcia and Larsen-Freeman (1999). The word classes that commonly colligate with each article include:
Structure 1 (S1): 'a' + singular count nouns
In order to determine the colligation of word classes with each article, the frequency of occurrences of each pattern mentioned above was retrieved from the Form 1 to Form 5 Malaysian English Language textbooks corpus using the Concord Tool.
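The idea of counting colligation patterns can be sketched in Python as counting, for every article, the word class of the token that follows it. This is only an illustration and not the procedure used in this study, where the patterns were retrieved with the Concord Tool. The function name, the tag labels (ART, ORDINAL, NOUN_SG and so on) and the hand-tagged sample are all hypothetical.

```python
from collections import Counter

ARTICLES = {"a", "an", "the"}

def colligation_counts(tagged_tokens):
    """Count (article, word class of the next token) pairs in a tagged token list."""
    counts = Counter()
    # Walk over adjacent token pairs: (current word, its tag), (next word, its tag).
    for (word, _), (_, next_tag) in zip(tagged_tokens, tagged_tokens[1:]):
        if word.lower() in ARTICLES:
            counts[(word.lower(), next_tag)] += 1
    return counts

# Hypothetical hand-tagged sentence; the tag labels are illustrative only.
tagged = [
    ("The", "ART"), ("first", "ORDINAL"), ("prize", "NOUN_SG"),
    ("went", "VERB"), ("to", "PREP"), ("a", "ART"), ("student", "NOUN_SG"),
    ("from", "PREP"), ("the", "ART"), ("villages", "NOUN_PL"),
]

patterns = colligation_counts(tagged)
for (article, word_class), n in patterns.items():
    print(article, "+", word_class, n)
```

The sketch assumes that the word class of each token is already known; in practice this requires either manual inspection of concordance lines or a part-of-speech tagger.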
The frequency of occurrences of each colligation pattern of the article a across the education levels showed that the patterns of occurrences were not consistent (Table 2). The frequency of each pattern from the lowest level (Form 1) to the highest level (Form 5) followed neither an ascending nor a descending pattern. The number of occurrences of the colligation pattern S1 increased from the Form 1 textbook (341 occurrences) to the Form 4 textbook (632 occurrences) but decreased in the Form 5 textbook (539 occurrences). The pattern S1 did not increase proportionally with the level of education. Similarly, there was an increase in the frequency of the pattern S3 from the Form 1 textbook (118 occurrences) to the Form 3 textbook (250 occurrences), but the number decreased in the Form 4 (214 occurrences) and Form 5 (200 occurrences) textbooks. The colligation pattern S2 did not occur at all in the textbooks.
The frequency of occurrences of each colligation pattern of the article an across the Form 1 to Form 5 English Language textbooks (Table 3) showed great inconsistency. The frequency of occurrences of S4, for example, varied greatly across the five levels. The frequency of S4 in the Form 1 textbook was 20, followed by 18 occurrences in the Form 2 textbook and 86 occurrences in the Form 3 textbook. The frequency went down to 41 occurrences in the Form 4 textbook but finally increased to 90 occurrences in the Form 5 textbook. The sudden increase in the frequency count in the Form 3 English Language textbook illustrates the great inconsistency in the frequency of S4 across the Form 1 to Form 5 Malaysian English Language textbooks.
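Whether a pattern rises or falls steadily across the five forms can be checked mechanically. The following Python sketch is a minimal illustration; the function name trend and its three labels are assumptions, and the input is the series of S4 counts reported above.

```python
def trend(series):
    """Classify a sequence of counts as ascending, descending, or inconsistent."""
    # Compare every count with the one that follows it.
    pairs = list(zip(series, series[1:]))
    if all(earlier < later for earlier, later in pairs):
        return "ascending"
    if all(earlier > later for earlier, later in pairs):
        return "descending"
    return "inconsistent"

# S4 occurrences from Form 1 to Form 5, as reported in this section.
s4_counts = [20, 18, 86, 41, 90]
s4_trend = trend(s4_counts)
print(s4_trend)
```

The same check could be applied to any other pattern once its five per-form counts are available.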
Out of the many word classes that the article the usually colligates with, the focus was on S7 to S15 (Table 4). Out of the nine colligation patterns, S7, in which the article the is followed by singular nouns, occurred the most throughout the entire corpus with 6107 occurrences. The colligation of the article the with plural nouns (S8) had the second highest frequency with a total of 2746 occurrences. This was followed by S9, the colligation of the article the with noncount nouns, with 1384 occurrences. The colligation pattern that came fourth, with a frequency count of 898, was S13 (the colligation of the article the with adjectives). The colligation pattern of the article the with the word 'only' (S14) had the lowest frequency with only 14 occurrences. The remaining four colligation patterns, S10 (colligation of the with proper nouns), S11 (colligation of the with superlatives), S12 (colligation of the with ordinals) and S15 (colligation of the with expressions of time), had 339, 194, 573 and 220 occurrences, respectively. In examining the distribution of the frequency counts of each of these colligation patterns across the Form 1 to Form 5 textbooks, the colligation patterns showed inconsistency in their distribution, except for the colligation pattern S8 (colligation of the with plural nouns), which displayed a consistently increasing number of occurrences from lower to higher forms.
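The order of the nine patterns can be derived from the reported totals with a short Python sketch. The dictionary and variable names are illustrative; the frequencies are those given in this section.

```python
# Total occurrences of each colligation pattern of the article "the", as reported above.
the_patterns = {
    "S7": 6107, "S8": 2746, "S9": 1384,
    "S10": 339, "S11": 194, "S12": 573,
    "S13": 898, "S14": 14, "S15": 220,
}

# Pattern labels sorted from the most to the least frequent.
ranking = sorted(the_patterns, key=the_patterns.get, reverse=True)

for rank, pattern in enumerate(ranking, start=1):
    print(rank, pattern, the_patterns[pattern])
```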

Discussion
The findings offer two main insights. Firstly, this study reveals the frequency and distribution of the articles in the Malaysian Form 1 to Form 5 English Language textbooks. The results show the frequency of the articles in each of the textbooks to which the learners are either intentionally or incidentally exposed. Conrad (2000) has emphasized the importance of teachers' awareness of such frequencies as it helps them decide which grammatical items should be emphasized in the language classrooms. It has been argued that the information on frequencies is the key that leads linguists to words or structures which are central in a language, and that the lack of frequency lists can cause difficulties in deciding what should be included or prioritized in learning-teaching materials (Mindt, 1995; Kennedy, 2002; Romer, 2004). In English Language classrooms, learners are encouraged to be exposed to the language as much as possible to gain mastery of the language. As Thornbury (2002, p. 24) suggested, "words stand a good chance of being remembered if they have been met at least seven times over spaced intervals". According to Celce-Murcia and Larsen-Freeman (1983, p. 23), "it makes sense to recycle various aspects of the target structures over a period of time: revisit old structures, elaborate on them, and use them for points of contrast as new grammatical distinctions are introduced." Frequency information of this kind can therefore guide teachers in deciding which items to prioritize in the language classrooms.
Additionally, this study has revealed the frequency and distributions of the articles' colligation patterns in all five Malaysian secondary school English Language textbooks. According to Ur (2006), knowledge of the word classes that articles often colligate with is beneficial, as learners will be unable to use words if they do not know how to put the words together. According to Frodesen and Eyring (2003), the indefinite article a categorizes a noun and is a representation of a type, group or class. Hence, the indefinite article a usually "colligates with the singular count noun" (DeCapua, 2008, p. 59). In this study, among S1, S2 and S3, the colligation pattern S1 occurred the most in all the textbooks. The article a also colligates with words that begin with vowel letters but have initial consonant sounds. Interestingly, this pattern of colligation did not occur in any of the Form 1 to Form 5 textbooks. Such findings may explain why Malaysian learners often commit errors in using the article an for a with words that have initial vowel letters that sound as consonants, for example, a university. With regard to the article an, this article is commonly known to colligate with "vowel sound singular count nouns" and "occur before nouns with vowel sounds" (DeCapua, 2008, p. 59). The high frequency of occurrences of the colligation of the article an with the vowel sound singular count nouns exposes the learners to the main structure of colligation of the article an. In the textbooks, the colligation of the article an with words that are spelled with consonant initials but pronounced with vowel sounds was the lowest among the three colligation patterns of the article an. Although this pattern of colligation is common, learners' insufficient exposure to it in the textbooks may lead them to commit frequent errors, as they cannot distinguish words with consonant initial letters that are pronounced as vowels.
As Conrad (2004, p. 69) puts it, "by minimizing the importance of variation, we are misrepresenting language in materials that we use with students". According to Conrad (2004), statistical evidence provided by corpora indicates that grammatical patterns differ systematically across varieties of English and across registers, and ignoring grammatical variants undermines the effectiveness of teaching materials. Hence, it is important to emphasize all these colligation patterns of the articles to the learners in order to help them use the articles accurately and effectively.

Conclusion
This study highlights the importance of corpus-based research. Employing a corpus-based study is very useful in investigating the colligation patterns of the articles. As Lawson (2001) posits, insight about a particular linguistic feature such as lexico-grammatical associations can only be obtained by using a corpus. In addition, Kennedy (1998) believes that corpus-based findings can help textbook writers in selecting the materials and syllabus, giving weight to various items and organizing the language being taught, like the colligation patterns of articles.
The findings of this study could inform recommendations on the teaching of the English articles in English Language classrooms. The article 'an' was the least frequent of the three articles in the textbooks, which suggests that its teaching deserves more focus. Hence, teachers can create appropriate teaching materials to expose the learners more to the article 'an'. Teachers can also put more effort into creating materials for teaching the articles in the classroom, as most of the articles that occur in the textbooks are taught in an incidental rather than an intentional way. Providing more input to the learners by bringing more teaching materials into the ESL classrooms can help them to have a better understanding of the English articles.
The findings of this study also emphasize the importance of learners' awareness of the word classes that usually colligate with the articles. Teachers are advised to put more emphasis on the colligation patterns in their intentional teaching of the articles. They can help their learners familiarize themselves with the usage of articles in reference to the word classes (for instance, nouns) with which articles often colligate. According to the frequencies of the most common colligation patterns of all the articles in the English Language textbooks, articles most commonly occur with nouns. Thus, the learners should be provided with input on the concept and types of nouns so that they can use articles more accurately. Although the textbooks expose the learners to a variety of usages and colligations of the articles with specific nouns, learners still make mistakes in using articles. Therefore, teachers should include supplementary practice in their teaching plans to help tackle the problem of using the appropriate article with a particular noun. It is also important to introduce learners to the different nouns, from the most basic nouns (like singular count nouns) to the most complex nouns (such as proper nouns). Teachers are encouraged to provide extra teaching materials to teach the learners to distinguish one type of noun from another. The ability to identify different types of nouns enables the learners to use the appropriate article before nouns.