An Analysis of Research in Academic Prose between Native Speakers and Chinese Learners

This corpus-based lexical study compares the use of research as a noun by native speakers and Chinese EAP learners in research articles in Linguistics. A self-built learner corpus of academic English (CMFD) and a parallel native-speaker corpus (PQDT) are used. The analysis combines a quantitative study of frequency with a qualitative study of the collocations of the node words. The results reveal that Chinese EAP learners use research more frequently than native speakers, and that native speakers never use “researches” as a plural noun in academic writing, whereas Chinese EAP learners use this form frequently. Compared with native speakers, Chinese learners tend to make the following errors: overuse of research; use of research as a countable noun; inconsistent use of “research” and “researches”; confusion over expressions meaning “much research”; and mixed collocation prosodies. The findings can raise instructors' and L2 writers' awareness of the proper use of research in composition, leading to clearer, more accurate texts.


Introduction
Academic vocabulary plays an important role in academic discourse. However, learners find it most problematic. A supervisor of the first author twice pointed out, in her course, postgraduates' misuse of research (see Note 1) in their papers. Research is indeed an important word in dissertations and theses. Moreover, research is one of the most common words in the Academic Word List (AWL) (see Note 2): it is found in sublist 1, which contains the most common words in the AWL; sublist 2 contains the next most common words, and so on, for a total of 10 sublists. The noun research is never used as a countable noun in articles written by native speakers of English. However, according to the present authors' questionnaire (see Appendix B), more than half of the learners of Advanced English for Academic Purposes (EAP) use "researches" in sentences where native speakers use "research".
Based on this observation, this paper compares the usage of the word research by native speakers and Chinese learners in academic prose, and tries to identify the concrete differences between the two. The study has two aims. First, by using both quantitative and qualitative procedures to examine the frequencies and collocates of "research" and "researches" as nouns, it seeks results that can raise instructors' and L2 writers' awareness of the proper use of research in composition, leading to clearer, more accurate texts. Second, by identifying differences in the use of the academic word research between native speakers and Chinese learners, it seeks to raise awareness of the learning and teaching of academic vocabulary. Some scholars have found that courses involving direct attention to language features lead to better learning than courses that focus only on incidental learning (Ellis, 1990; Long, 1988). Thus, we believe that the direct learning and teaching of frequently used AWL words can help students develop their academic reading and writing abilities.

Literature Review
Recent years have seen the growth of English for Academic Purposes (EAP). According to Hyland (2002), "English for Academic Purposes refers to language research and instruction that focuses on the specific communicative needs and practices of particular groups in academic contexts. It means grounding instruction in an understanding of the cognitive, social and linguistic demands of specific academic disciplines." This expanding role for EAP has been accompanied by research on EAP both abroad and at home. Flowerdew (2000) investigated the English language behaviors and patterns of non-native academics. At the same time, Hyland (2000) studied the ideological impact of expert discourses, the social distribution of valued literacies, and the access non-native and novice members have to prestigious genres, and found that the ways of controlling specialized discourses are related to status and credibility. Recently, collocation and corpus analysis in academic writing have also attracted interest. Collocation plays an important role in lexical cohesion. Hoey (2007) argues that exposure to collocations primes or prepares us to recall their correct meaning and to use them correctly whenever we re-encounter them. Moreover, "language obtained through corpora has the advantage of being authentic and reveals uses that native speakers do not think of" (cited in Lanteigne & Crompton, 2011). In addition, corpora are having a beneficial effect on contrastive studies (Connor & Moreno, 2005). Wu Jin (2011) compared, from the perspective of collocation, a self-built learner corpus of academic English with its reference corpus to investigate the depth of Chinese postgraduate students' academic vocabulary knowledge. Viphavee Vongpumivitch et al. (2009) conducted a corpus-based lexical study, also from the perspective of collocation, to explore the frequency of the AWL words used in the field of applied linguistics. These two investigations are valuable because they pay attention to collocations and corpus analysis. Yet research on academic writing rarely addresses specific words. Bethany Gray & Viviana Cortes (2011) studied the pronouns "this" and "these" in academic writing; at home, Zhang Xiurong & Li Zengshun (2011) examined the frequencies and discourse functions of first person pronouns (we, our, us) in research articles from a corpus-based perspective; and Sun Fang & Chen Jiansheng (2011) studied the use of "however" and "therefore" in terms of their frequencies and positions in economics research articles. However, none of these words belongs to the Academic Word List. That is, to date there is a gap in research on the frequency and collocation of specific AWL words as key words (KW) within corpora. Therefore, the present study tries to fill this gap by studying the frequency and collocation of the word research as a noun in EAP corpora of both native speakers and Chinese learners.

Research Questions
1).In terms of the frequency and collocation, what are the differences between Chinese EAP learners and native speakers in using the word research in academic writing?
2).What types of errors in detail do Chinese EAP learners tend to make in using the word research in academic prose?

Corpus
Michael Stubbs (2007) maintains that "a corpus allows us to get the facts right, amass examples and document things thoroughly, and document types of facts (e.g. about frequency and typicality) which are not open to introspection and which are not well described in current dictionaries and grammars". In contrastive studies, building comparable corpora is important. According to Connor & Moreno (2005), "Applying appropriate tertia comparationis at the design and analysis stages of contrastive research will help us build comparable corpora that can provide baseline data for meaningful cultural comparisons." In the current study, two corpora were built. One is a sub-corpus of academic papers from the China Master's Theses Full-text Database (CMFD), which consists of theses from 10 academic disciplines and 168 special topic databases. Within CMFD, the discipline of Linguistics was chosen as the focus of the present analysis because most theses in this domain are written in English. The other is the parallel corpus, an L1 English sub-corpus of PQDT (master's theses from ProQuest). PQDT is the only full-text database in China providing high-quality dissertations and theses. Its scope is extensive, and most of its dissertations and theses come from over 2000 American and European universities. In building the corpora, all texts were randomly selected from CMFD and PQDT, and the text samples are equivalent in time, discipline, number, length and level (master's). The corpora referred to in this paper are described in Table 1 & Table 2.

Instrument and Procedures
The present study employs AntConc 3.3 as the retrieval program and uses two of its tools, concordance and collocates. Using the concordance tool, all instances of "research(es)" were located in the two corpora, and all occurrences of "research(es)" used as nouns were coded. One instance of "researches" extracted from PQDT, used as the third-person singular form of the verb, was not a noun and was excluded from the analysis.
Frequencies were calculated for the total number of occurrences of "research(es)". In addition, the Chi-square Calculator was used to test the significance of the differences in frequency.
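The significance test described above can be illustrated with a short sketch. It assumes a Pearson chi-square test on a 2×2 table (hits of the key word versus all other tokens, in each corpus) with one degree of freedom and no continuity correction; the Chi-square Calculator used in the study may differ in detail. The function names and the counts in the usage line are hypothetical and are not taken from Table 1 or Table 3.

```python
# Critical value of chi-square for df = 1 at alpha = 0.01 (rounded).
CRITICAL_VALUE_P01 = 6.635


def chi_square_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson chi-square (no continuity correction) comparing a key word's
    frequency in corpus A (size_a tokens) and corpus B (size_b tokens).

    Assumes the word occurs at least once overall and does not make up
    every token, so that no expected count is zero.
    """
    observed = [
        [freq_a, size_a - freq_a],  # corpus A: hits, other tokens
        [freq_b, size_b - freq_b],  # corpus B: hits, other tokens
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [freq_a + freq_b, total - freq_a - freq_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def is_significant(chi2, critical=CRITICAL_VALUE_P01):
    """True when the statistic exceeds the chosen critical value."""
    return chi2 > critical


# Invented toy counts, for illustration only.
toy_chi2 = chi_square_2x2(10, 30, 30, 70)
```

A statistic above the critical value for one degree of freedom indicates that the difference in relative frequency between the two corpora is unlikely to be due to chance at the chosen level.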
Meanwhile, semantic prosody is used to analyse the collocations in both CMFD and PQDT. Firth (1957) claims that some words habitually collocate with other words. According to Michael Stubbs (2007), "words may habitually collocate with other words from a definable semantic set" and "words have distinctive semantic profiles or prosodies". Some scholars consider semantic prosody a further level of abstraction of the relationship between lexical units (Sinclair, 1996, 1998; Stubbs, 2001). Generally, four kinds of prosody are used to analyse the collocation of node words within a certain span in corpora: positive, negative, neutral and mixed prosodies (Michael Stubbs, 1996). According to Partington (2004), semantic prosody falls into favourable, neutral and unfavourable prosodies. In this study, a pleasant or favourable affective meaning was labelled as positive, while an unpleasant or unfavourable affective meaning was labelled as negative.
When the context was completely neutral, or provided no evidence of any semantic prosody, the instance was labelled as neutral. In addition, Michael Stubbs (2007) points out that "the strength of association between words can be measured in quantitative terms." Many statistical tests are used to measure collocational strength, e.g. the MI, z, t and log-likelihood scores. In this paper, the MI-Score is used as the quantitative measure of the strength of association between words.
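As a rough illustration of the MI-Score, the sketch below uses the common pointwise formulation MI = log2(f(n,c) × N / (f(n) × f(c))), where f(n,c) is the co-occurrence frequency of node and collocate, f(n) and f(c) are their individual frequencies, and N is the corpus size in tokens. This formula is an assumption: AntConc's own calculation may include further adjustments (for example for span size). The function names and all counts are hypothetical; the thresholds follow the selection criteria stated in this paper (MI greater than 3, co-occurrence frequency of at least 1).

```python
import math


def mi_score(f_nc, f_n, f_c, corpus_size):
    """Pointwise mutual information (base 2) between a node and a collocate."""
    if min(f_nc, f_n, f_c, corpus_size) <= 0:
        raise ValueError("all counts must be positive")
    return math.log2(f_nc * corpus_size / (f_n * f_c))


def strong_collocates(candidates, f_n, corpus_size, min_mi=3.0, min_freq=1):
    """Keep collocates whose co-occurrence frequency and MI pass the thresholds.

    candidates maps each collocate to a pair (f_nc, f_c):
    its co-occurrence frequency with the node and its own corpus frequency.
    """
    kept = {}
    for word, (f_nc, f_c) in candidates.items():
        if f_nc < min_freq:
            continue
        score = mi_score(f_nc, f_n, f_c, corpus_size)
        if score > min_mi:
            kept[word] = score
    return kept


# Invented counts for illustration; they are not taken from CMFD or PQDT.
toy_candidates = {"previous": (8, 128), "the": (10, 4096)}
toy_result = strong_collocates(toy_candidates, f_n=64, corpus_size=16384)
```

In this toy case a rare modifier that co-occurs with the node several times passes the threshold, while a very frequent function word does not, even though its raw co-occurrence count is higher.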
Additionally, SPSS is employed to produce a descriptive statistics report of "research(es)", in order to test for any difference between Chinese EAP learners and native speakers.
According to Table 4, the chi-square value for "research" is larger than the critical value and P is less than 0.01, which shows that the difference is significant. In the case of "researches", as the frequency in PQDT is 0 while the frequency in CMFD is 96, the difference is readily apparent. Besides, according to Table 5, although the means of "research(es)" in the two corpora are nearly equal, the standard deviation (SD) of "research(es)" in PQDT is larger than that in CMFD. That is, frequency varies greatly among native speakers, whereas it differs little among Chinese EAP learners.
The researchers used the collocates tool to retrieve the KW's collocations. The collocates were chosen according to the following rules: 1) Only the first modifier to the left of research (L1) was studied, for the sake of validity and ease of processing. 2) The MI-Score of the collocation is greater than 3, to ensure statistical meaningfulness. 3) The minimum co-occurrence frequency was set at 1. The L1 collocates that occur in CMFD and PQDT as modifiers of the noun "research(es)" are listed in Table 6 & Table 7.

Discussion
The results of this study answer the original research questions:

1). In terms of the frequency and collocation, what are the differences between Chinese EAP learners and native speakers in using the word research in academic writing?
As can be seen in Table 3, the frequencies of "research(es)" differ between Chinese EAP learners and native speakers, and Table 4 shows that this difference is significant ("research": χ2 = 14.86581681 > 6.634896601; P < .01). Chinese EAP learners use "research" more often than native speakers. As for "researches", native speakers never use it, while Chinese EAP learners use it frequently in academic prose. Meanwhile, as can be seen in Table 5, the means of "research(es)" in the two corpora are nearly equal, while the SD of "research(es)" in PQDT is larger than that in CMFD. This indicates that frequency varies greatly among native speakers, while Chinese EAP learners have much in common in their frequency of "research(es)". Besides, in the keyword list tool of AntConc, research is retrieved as a negative keyword, shown in highlight color. This seems to imply that research occupies a higher keyness rank in Chinese EAP learners' minds.
In the case of collocation, as shown in the previous section, there are also differences in several respects. First, Chinese EAP learners use quantitative modifiers and disciplinary modifiers to collocate with "research(es)" more often than native speakers. This result can be observed in Table 10 in two ways. For one thing, the total counts of quantitative and disciplinary modifiers are higher in CMFD than in PQDT. For another, more texts in CMFD than in PQDT contain quantitative and disciplinary modifiers.

2). What types of errors in detail do Chinese EAP learners tend to make in using the word research in academic prose?
Based on the above contrastive analysis, several errors in the use of research can be seen among Chinese EAP learners. First, Chinese EAP learners overuse research in academic prose: they use "research" more often than native speakers. As for the form "researches", native speakers never use it, while Chinese EAP learners use it frequently in academic prose (see Tables 3, 4 and 5). The first potential reason is that native speakers use the alternatives "study" and "studies" more often than Chinese EAP learners in their academic prose (935 hits to 864 hits). The second possible reason is that the two groups perceive research differently, which in turn causes another error: Chinese EAP learners tend to use research as a countable noun. In fact, native speakers use research as an uncountable noun, and it is unconventional for them to use "researches" as the plural form in academic writing. This difference can be seen in Table 6 & Table 7: native speakers use little to modify "research", whereas Chinese EAP learners use many, a few, these and those to modify "researches". Here are the examples:
(1) "It provides convenience for conducting many researches." (CL in CMFD)
(2) "Similar responses from different subjects were found by a few researches (e.g., Jenkins 1970, cited in Aitchison 1987; Kent & Rossanoff 1910, cited in Jay 2004)." (CL in CMFD)
(3) "Those researches have made great achievements and set the norms of the use of genitive, among which the researches made by A Comprehensive Grammar of the English Language (CGEL) and Longman Grammar of Spoken and Written English (LGSWE) are distinguished." (CL in CMFD)
(4) "However, the validity of these researches, which investigated the issue through general analysis with no particular case involved and a lack of data, had somewhat been affected." (CL in CMFD)
(5) "However, there is little research on the aetiology, course, prognosis or treatment of post schizophrenic depression." (NS in PQDT)
Besides, some learners use the word research inconsistently. Specifically, research is used as both a countable and an uncountable noun. This can be seen in Table 8. In text 6, the collocations "little research" and "some researches" co-occur. Similarly, "little research" and "enough researches" co-occur in text 8. This suggests that the status of the word research is confused in some Chinese EAP learners' minds.
Additionally, because of a mistaken perception of the number of research, errors occur when Chinese EAP learners express the meaning of "much research". As can be seen in Table 8, Chinese EAP learners tend to use collocations such as "some researches", "a few researches", "extensive researches", "many researches" and "enough researches", while native speakers use collocations such as "little research" and "extensive research".
Last, there is an error of collocation prosody among Chinese EAP learners. Specifically, native speakers treat research as a word with non-positive prosody, but Chinese EAP learners treat research as a word with mixed (positive & neutral & negative) prosodies. Most of the collocates used by native speakers are neutral words, several are negative words, and none are positive words (see Table 7). However, in Table 6, Chinese EAP learners use not only neutral and negative words but also positive words, such as insightful and constructive, to collocate with "researches".
According to Agustin Llach, as L2 learners become more proficient and as they face cognitively challenging writing tasks, lexical errors do not disappear; instead, the types of errors change (cited in Severino, 2012). In this study, Chinese EAP learners tend to make the following errors: overuse of research; use of research as a countable noun; inconsistent use of "research" and "researches"; confusion over expressions meaning "much research"; and mixed collocation prosodies. The research team members consider the following to be the possible sources of error.
1) Cross-linguistic influence. i) Intralingual transfer. For one thing, Chinese EAP learners confuse the plural form of the noun with the third-person singular form of the verb. In English, research can be used as both a noun and a verb, and "researches" as the third-person singular verb form is commonly used by native speakers. As a result, Chinese EAP learners mistake the verbal singular morpheme -es for the nominal plural morpheme -es. For another, there is a word similar in form to research, namely search, which is a countable noun with the plural form "searches". This too can cause negative transfer among Chinese EAP learners. ii) Interlingual transfer. In Chinese, it is common to say "一项研究，两项研究，几项研究" (literally "one research, two researches, several researches"), which differs from "a research, several research" in English. From the point of view of Chinese EAP learners, "一项研究，两项研究，几项研究" is unmarked, and "a research, several research" is marked. This is a case "where the native language shows an unmarked setting and the target language a marked one", which is "the most obvious case of transfer" (Rod Ellis, 1999). Because the parameter setting is idiosyncratic (i.e. marked), Chinese EAP learners fall back on their L1 knowledge of "研究" ("research") in the process of learning research.
2) Insufficient input. Gass & Selinker (2008) argue that "input of some sort is necessary in order for acquisition to take place" and that "there are three sources of input: (a) teacher, (b) materials, and (c) other learners". Chinese EAP learners' failure to acquire research seems to imply that they have had insufficient exposure to this word through any source of input. Alternatively, either the quantity or the quality of the input is inadequate, which causes the errors in the acquisition of research.
3) Lack of awareness. The noticing hypothesis was proposed by Schmidt. Underlying the hypothesis is the idea of noticing a gap. Schmidt and Frota (1986) suggested that "a second language learner will begin to acquire the target like form if and only if it is present in comprehended input and 'noticed' in the normal sense of the word, that is consciously". This highlights the role of attention, which is as important as input in the process of SLA. On the one hand, the effect of attention diminishes with proficiency. That is, Chinese EAP learners are more likely to pay attention to a specific word in the early stages of learning. On the other hand, Chinese EAP learners lack register awareness, that is, awareness of the conventions of academic writing. Both aspects result in the incorrect use of research among Chinese EAP learners in academic writing.

Conclusions
A limitation of this study is the size of the sample used for the analysis. The scope of the study is limited to the field of linguistics, and the samples are insufficient in quantity. Accordingly, there are 0 hits of "researches" in PQDT, which makes a test of its significance with the Chi-square Calculator unsuitable. This limitation also means that the collocations of research found in CMFD and PQDT are not significant. That is to say, some features of the collocation of research analyzed in this paper cannot safely be generalized to all Chinese EAP learners. Thus, future researchers may want to make their corpora as large as possible in order to increase the generalizability of their findings and to see whether their results are similar to ours.
The goal of this study is to explore the frequency and collocation of research, a headword of the Academic Word List, used as a noun in published academic research articles by both native speakers and Chinese EAP learners. The analysis shows that Chinese EAP learners use "research" more frequently than native speakers, and that native speakers never use "researches" as a plural noun in academic writing, while Chinese EAP learners use this form frequently. Compared with native speakers, Chinese learners tend to make the following errors: overuse of research; use of research as a countable noun; inconsistent use of "research" and "researches"; confusion over expressions meaning "much research"; and mixed collocation prosodies. The potential causes are cross-linguistic influence, insufficient input and lack of awareness.
The findings of this paper can raise awareness of the proper use of research for writing instructors and students.
Increased awareness may in turn lead to a more conscious effort to think about language use in order to create clearer and more accurate texts for readers. In addition, increased awareness of the proper use of research may promote the development of student writers' reading skills by helping them to comprehend native speakers' articles efficiently and accurately. Last, our results can raise awareness of the learning and teaching of academic vocabulary. We believe that the direct learning and teaching of frequently used AWL words can help students develop their academic reading and writing abilities.

Appendices
Appendix A. Sublists of the Academic Word List
Each word in italics is the most frequently occurring member of its word family in the Academic Corpus. For example, analysis is the most common form in the word family analyse. British and American spellings are both included in the word families, so contextualise and contextualize are both included in the family context.
Sublist 1 contains the most common words in the AWL. Sublist 2 contains the next most common words, and so on. There are 60 families in each sublist, except for sublist 10, which has 30.
Appendix B
The following questionnaire is designed for research on different perceptions and usages of research (noun) between English native speakers and Chinese learners. Please answer each question honestly and frankly according to your own opinion. There are no "correct" answers. All the data collected will be highly confidential and will be used for the research only.

Table 1. The corpora applied in this paper

Table 5. Descriptive report of "research(es)" in corpora
With the concordance tool, the frequency of the key word (KW) can be obtained. As can be seen in Table 3, the frequencies of "research" and "researches" are 455 and 96 in CMFD and 365 and 0 in PQDT. With the frequency and corpus size entered into the Chi-square Calculator, significance can be tested (see Table 4).

Table 6
Note. f(n,c) (see Note 3) is greater than or equal to 1; MI is greater than 3.

As shown in Table 6 & Table 7, the collocates of "research(es)" can be mainly classified into the following categories:
• Chronological modifier, such as recent, previous, future, etc.;
• Degree modifier, such as further;
• Descriptive modifier, such as empirical, quantitative, qualitative, etc.;
• Discipline modifier, such as monolingual, psychological, ethnographic, etc.;
• Pronoun modifier, such as my, this, our, etc.;
• Quantitative modifier, such as extensive, some, little, enough, etc.;
• Predicate verb, such as conduct, do, etc.;
• Property modifier, such as anxiety, motivation, questionnaire, etc.
Clearly, Chinese EAP learners and native speakers differ on these categories of collocates. First, Chinese EAP learners use quantitative modifiers and disciplinary modifiers to collocate with "research(es)" more often than native speakers. This excess appears not only in the total counts across all texts but also in individual texts (Table 8 & 9). Second, Chinese EAP learners are more willing to use "researches" than "research" to denote the meaning of much research, while native speakers use "research". Third, as for the predicate verb, "doing research(es)" is typical in CMFD, while "conducting research" is more traditional in PQDT. Fourth, Chinese EAP learners use relevant more often than related to collocate with "research(es)", while native speakers use only related. Fifth, native speakers say "previous research", while Chinese EAP learners prefer "previous researches". Last, the prosody of research differs between the two groups. Native speakers treat research as a word with non-positive prosody, while Chinese EAP learners treat research(es) as a word with mixed (positive & neutral & negative) prosodies.

Table 10. Distribution of QM & DM in corpora
Chinese EAP learners are more willing to use "researches" than "research" to denote the meaning of much research, while native speakers use "research". As can be seen in Table 8, Chinese EAP learners tend to use collocations such as "some researches", "a few researches", "extensive researches", "many researches" and "enough researches", while native speakers use collocations such as "little research" and "extensive research". Besides, as for the predicate verb, "doing research(es)" is typical in CMFD, while "conducting research" is more traditional in PQDT. In addition, the descriptive modifiers relevant and related are both in the list of collocates in CMFD, while only related is in PQDT, and the Stats of relevant (4.35045 & 6.55587) are higher than those of related (3.39404 & 4.0145) in CMFD. That is, Chinese EAP learners use relevant more often than related to collocate with "research(es)", while native speakers use only related. As for the chronological modifier previous, native speakers say "previous research", while Chinese EAP learners prefer "previous researches" (Stat: 6.97091) to "previous research" (Stat: 4.02853). Last, from the point of view of prosody, according to Table 7, most of the collocates used by native speakers are neutral words, several are negative words, and none are positive words. This seems to indicate that native speakers treat research as a word with non-positive prosody. However, in Table 6, Chinese EAP learners use not only neutral and negative words but also positive words, such as insightful and constructive, to collocate with "researches".

In Table 6, Chinese EAP learners use not only neutral and negative words but also positive words, such as insightful and constructive, to collocate with "researches". Here are the text examples:
(6) "Due to the complexity of genitive structures, there have been a great number of insightful researches into it conducted by foreign linguists from the perspective of semantics, syntax and corpus linguistics." (CL in CMFD)
(7) "Linguists have conducted constructive researches in this field from different perspectives and have made great achievements." (CL in CMFD)