Characteristics of the Pronoun “Who and Its Concordance” in Chinese College Students’ English Narrative Writing from the Perspective of the Corpus-Based Method: A Case Study of a Series of Compositions on “The Most Unforgettable Person I Ever Know”

Writing is a productive skill and a way of conveying information, and it is considered the most complex and challenging skill for EFL learners to acquire. Hence many studies have sought to reveal the characteristics of EFL learners' writing and ways to improve it. Among them, pronoun study has attracted extensive interest and become a hot spot in second language acquisition and contrastive linguistics. Taking a series of compositions on “The most unforgettable person I ever know” as its subject, this study aims to reveal the characteristics of “who and its concordance” (WIC) in Chinese college students' English narrative writing using a corpus-based method. A contrastive analysis of writing from 630 college students at 3 colleges over the past 6 years shows the following. 1) As a whole, WIC can be regarded as common words for Chinese college students, but the distribution of individual words is disproportionate: “who” attracts far more attention, while the other 4 words attract little or none. 2) In terms of sentence type, the distribution of the 5 types is imbalanced, with too few adverbial clauses and too many attributive clauses. In addition, the learners habitually use simple, identical sentence structures, some of which are highly repetitive. 3) The frequency of WIC use by individual students shows that the majority of learners have formed the habit of using them to depict interpersonal relationships, and that their distribution across students is also unbalanced. 4) As to clusters, the learners show a strong preference for using “who” as a relative pronoun, and some clusters are highly identical and simple. 5) To sum up, the majority of learners in this study can employ WIC consciously in their writing, but their usage is confined to simple words and structures. Therefore, the learners' comprehensive competence in the integrated use of WIC should be improved.

and Internet technology has enabled linguists to analyze a vast amount of learners' authentic language data. The corpus-based methodology can yield objective and convincing research findings through scientific statistical analysis. Therefore, with the availability of learner corpora, SLA researchers are beginning to focus on descriptive aspects of interlanguage processes and to discover in more detail the use and misuse of language features and their frequencies and distribution at different developmental stages (Li, 2017). Considerable achievements have so far been attained in both corpus construction and corpus-based language research. Moreover, the methods of corpus linguistics have been applied to almost all fields of linguistics.

Why Choose Pronoun
As a very important part of speech in English, pronouns are substitutive words that are used frequently because they avoid redundant repetition of antecedents and make discourse more compact and perspicuous. They exert great influence in 3 aspects: the expression of communication, the demonstration and maintenance of interpersonal meaning and relationships, and the cohesion of text. Pronouns may seem easy to learn on the surface but are hard to master because of their wide range of uses and their complicated links to various sentence patterns. Therefore, in recent years, many researchers and scholars have studied English pronouns, demonstrating from different angles how pronouns are used in both language teaching and learning (Li, 2012).

What is Pronoun "Who and Its Concordance"
In this study, "who and its concordance" (WIC) refers to "who" and its derivations; that is, WIC is composed of 5 words: who, whom, whose, whoever, and whomever. These words can serve as interrogative pronouns in independent clauses (sample 1), as conjunctive pronouns in dependent clauses (sample 2), or as relative pronouns in dependent clauses (sample 3). Based on the sentence components that WIC play, the researcher classifies their usage into 5 types and 8 subtypes, as the examples in table 1 show. Among them, "Role in higher matrix clause" refers to the role that WIC and its antecedent play in the higher matrix clause of an attributive clause.

Object: God helps those who help themselves.
Predicative: Happy is the man who is contented with his lot.
Predicative: There are those who eat out for a special occasion, or treat themselves.
Non-restrictive, Subject: Mrs. Smith, who has a lot of teaching experience at junior level, will be joining the school in September.
Non-restrictive, Object: He was accompanied by Winston Churchill, who was Britain's prime minister during the Second World War.
Non-restrictive, Object: We were worried about our nearest neighbors, who were newcomers to the district.
Non-restrictive, Predicative: This is Mrs. Smith, who has a lot of teaching experience at junior level.
Non-restrictive, Appositive: Winston Churchill, who was Britain's prime minister during the Second World War, died in 1965.
Nominal clause, Subject: It was uncertain who was responsible for the accident.
Nominal clause, Subject: Who goes light travels fast.
Nominal clause, Object: Can you tell me who is responsible for the accident?
Nominal clause, Object: You must give it back to who/whoever it belongs to.
Nominal clause, Predicative: The problem is not who will go but who will stay.

Research Outside China
From the perspective of acquisition, Butterworth and Hatch (1978) explore French EFL learners' acquisition of English genitive pronouns, while Felix and Hahn (1985) explore German learners' use of various forms of third person pronouns in terms of gender. Ellis (1994:96) finds that second language learners are challenged by the complicated system of English personal pronouns. Chan and Wong (2001) find that Malaysian has no gender distinction in third person singular pronouns. Harris and Bates (2002) and Wolf and Gibson (2004) test pronominal reference or processing preferences for English pronouns.
From the perspective of semantic analysis, Yuriko Ashima-Takane (1991) indicates that the child persistently made pronominal errors due to semantic confusion after clear comprehension and production. Stevenson, Knott, and Oberlander (2000:225) concentrate on "the relationship between focusing and coherence relations in pronoun interpretation". Chan and Wong (2001)

Research in China
From the perspective of semantic analysis, Zhang Meisuo (2000) studies the use of cohesive devices in expository writing by Chinese EFL learners and finds that they have problems with reference and frequently shift pronouns within or between clauses. Wang Liqin (2006) describes the general features of first person subject pronouns and explores changes in their use in argumentative writing by Chinese English major students. Yu Ping and Cheng Congmei (2008) apply Relevance Theory in a tentative analysis of pronoun reference as a cohesive device. Fang (2016, 2017, 2020) explores relativizer omission by Chinese EFL learners from various aspects.
From the perspective of semantic analysis, Niu Hongwei (2004) points out 3 binding features of English reflexives and their subject orientation for Chinese EFL learners. Zhang Guiju (2005) points out that ambiguity of reference is a common error in students' writing. Tang Shiyi (2005) and Jing Rongqin (2007) survey and analyze the use of the anticipatory it-clause by Chinese learners. Jia Tingting (2008) discusses gender confusion errors in third person animate singular English pronouns in the spoken English of Chinese EFL learners. Chi Dan (2008) explores the pragmatics of the demonstrative pronoun. Wang Chunlan (2009) points out that pronoun shift results from mental distance and self-politeness. Zhang Ning (2006) tests advanced Chinese learners on their acquisition of English reflexives in mono-clausal sentences to find out their processes (Li, 2012). Li Fang (2012) conducts an empirical study of the misuse of English pronouns among higher vocational school students. Li Li (2017) studies different uses of the pronoun "it" in cross-sectional learner corpora.
From the perspective of contrastive studies, Ma Guanghui (2001) conducts a contrastive rhetorical analysis of English essays written by Chinese students and American students and finds that Chinese students have problems in the use of pronouns. Yuan Jin (2004)
These fruitful findings have shed light on pronoun research and enlightened researchers in related areas. However, no corpus-based study of WIC in Chinese learners' writing has been found so far.

Research Question
WIC play a vital role in indicating interpersonal relationships in English. Native speakers employ them very commonly, but are Chinese college students, as EFL learners, skillful in employing them? It is therefore necessary to reveal the features of their use in a given situation. On this basis, the research question is formulated as follows: What are the characteristics of the use of WIC in Chinese college students' English writing in terms of frequency, sentence types, and clusters?

Subjects
To investigate the above question, all the subjects of this study were randomly selected from the classes the author has taught at 3 different colleges over the past 6 years; the details are shown in table 2.

Objects
As shown in table 2, the researcher collected the writings and classified them into 6 groups, forming the object corpus after the procedures stipulated in 3.4. The 580 students were given the topic "The most unforgettable person I ever know" on Pigai.net at the beginning of the term, with the following directions: For this part you are allowed 30 minutes to write a short essay about "The most unforgettable person I ever know". You should state the reasons. Write at least 120 words but no more than 180 words.
All the collected compositions share the following features. Firstly, it is the student's last version that is gathered into the self-built corpus. The composition "The most unforgettable person I ever know" is not a time-limited task: students can complete it at any time within about three months, and since the Pigai system automatically marks the essays and gives feedback, students can polish them at any time before the deadline. An individual composition therefore has many versions from the same author, and it is the last version before the deadline that is gathered as the research object. Secondly, these compositions are completed in a very relaxed atmosphere because they are after-class practice designed to improve writing competence. There are usually 6-7 such compositions per academic term in the author's teaching schedule. Students may finish them at any time with any reference available but are forbidden to plagiarize others' previous essays. Hence they can write without the interference of factors that arise in tests and competitions, such as nervousness and anxiety. Thirdly, none of these essays has been revised or polished by the instructor, so they can serve as an authentic corpus that reflects the students' true writing style and competence.

Procedures
Firstly, the researcher examines all the writing of the 6 groups (see table 2) on the topic "The most unforgettable person I ever know" in terms of content and relevance to the topic, and removes 50 unqualified essays. The remaining 580 pieces, which contain 5621 Word Types and 101372 Word Tokens, are chosen as the sample object resource. Secondly, the researcher applies the following AntConc3.3.5 analysis tools in turn to search for the features of WIC and obtains the corresponding data: Concordance, Concordance Plot, Cluster, and Word List.
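The counting step above can be approximated outside AntConc. The following is only a minimal sketch under stated assumptions, not the procedure actually used in this study: it assumes each essay is available as a plain-text string, and the names `WIC_FORMS`, `count_wic`, and the two sample essays are hypothetical illustrations.

```python
import re
from collections import Counter

# The five WIC forms defined in this study.
WIC_FORMS = ("who", "whom", "whose", "whoever", "whomever")

# Whole-word, case-insensitive match; the \b anchors keep "who" from
# matching inside longer forms such as "whoever".
WIC_PATTERN = re.compile(r"\b(" + "|".join(WIC_FORMS) + r")\b", re.IGNORECASE)


def count_wic(essays):
    """Return a Counter of WIC forms over an iterable of essay strings."""
    counts = Counter({form: 0 for form in WIC_FORMS})
    for text in essays:
        for match in WIC_PATTERN.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical two-essay corpus, for illustration only.
sample_essays = [
    "I have met many people who are really worth recalling.",
    "She is the teacher whose class I enjoyed, and whoever meets her likes her.",
]
wic_counts = count_wic(sample_essays)
print(dict(wic_counts))
```

A raw count of this kind only locates the words; classifying each hit by sentence type still requires the manual reading described in the data processing section.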

Instruments
This research is facilitated by 2 pieces of software: AntConc3.3.5 and Pigai. Pigai is an online system that automatically scans the parameters of a student's English composition and makes accurate and objective judgments and comments based on the distance between the composition and a standard corpus. Besides scores, it provides specific feedback and suggestions at the syntactic level, which facilitates students' revision. Its main function in this research is to comment on the compositions and record all traces of the learners' writing behavior. AntConc3.3.5 is a powerful tool widely used by researchers who investigate language patterns in different languages, and by language teachers and students all over the world to examine how words are used in texts. Its main function in the present study is to describe the corpus as a whole and to retrieve all the WIC in the self-built corpus. It offers the following 7 analysis tools for phraseological research: Concordance, Concordance Plot, Cluster, Word List, File View, Collocates, and Keyword List. Among these, the first 4 are the main tools applied in this study. In addition, Excel is used to record the number of WIC in each category.

Data Processing and Analysis
Firstly, the researcher uses the "Concordance" tool of AntConc3.3.5 to count the occurrences of all 5 WIC words in the self-built corpus, using the search terms shown in figure 1, and obtains 517 sentence records.

Figure 1. Research terms by concordance
Secondly, these records are classified according to the standard shown in table 1; detailed information about the number and frequency of each WIC word is presented in table 3.
Thirdly, the researcher analyzes all the retrieved sentences in terms of sentence type. To facilitate the statistics and comparison, the researcher further classifies the attributive clause into 9 sub-types according to their nature and the role that WIC and their antecedents play in the higher matrix clause. Some sentences, however, are grammatically wrong and contain many defects, which makes their sentence type hard to decide. For such sentences, the researcher tries to infer the author's original intention and classifies them accordingly (see samples 5 and 6). Those that remain hard to recognize are marked as "unknown, wrong" (see samples 7 and 8).

Sample 5
A person who weared a suit with glasses, walking confidently in front of me. (restrictive attributive clause modifying subject, wrong)
Sample 5 is grammatically wrong. It can be changed into either "A person wore a suit with glasses, walking confidently in front of me." or "A person who wore a suit with glasses, walked confidently in front of me." Based on an understanding of Chinese learners' mode of thought, the researcher classifies this sentence as "restrictive attributive clause modifying subject, wrong".

Sample 6
However, for me, a person appear to my head who is beautiful and kind. (restrictive attributive clause modifying subject, wrong)
Sample 6 also has defects, but based on logic, the researcher assumes that its original intention is "However, for me, a person who is beautiful and kind appears to my head" and classifies this sentence as "restrictive attributive clause modifying subject, wrong".

Sample 7
In my memory, the most unforgettable person I ever know who is my angel. (unknown, wrong)
For sample 7, it is better to delete "who". Therefore, this sentence is marked as "unknown, wrong".

Sample 8
San Mao, who is my most unforgettable person that is a famous woman writer and traveler in Taiwan and whose original name is Chen Ping. (unknown, wrong)
In sample 8 it is easy to identify the subtype of the "who" clause but hard to decide that of the "whose" clause. The sentence can be improved as either "San Mao, who is my most unforgettable person, is a famous woman writer and traveler in Taiwan, and her original name is Chen Ping." or "San Mao, whose original name is Chen Ping, is a famous woman writer and traveler in Taiwan and my most unforgettable person." Hence the "whose" clause is marked as "unknown, wrong".
The same procedures are implemented in turn by Concordance, Concordance Plot, Cluster and Word List. As table 3 shows:

By Concordance
1) In total, there are 517 records of WIC use, corresponding to 89% of the effective subjects, which demonstrates that the Chinese learners use them very commonly.
2) In terms of the frequency of individual words, "who" ranks first (496), accounting for 95.9%, and the other 4 words follow in this order: "whose" (10), "whom" (8), "whoever" (3), and "whomever" (0). This indicates that "who" is the most popular and common word for the subjects and "whomever" the least.
3) In terms of sentence type, there are 448 attributive clauses, 23 object clauses, 31 cleft sentences, 12 special questions (independent clauses), 2 adverbial clauses, and 1 subject clause, which indicates that using WIC in attributive clauses is the function most familiar to these learners. Their use in nominal clauses (such as predicative and subject clauses) and adverbial clauses is the least familiar.
4) In terms of the role that WIC and their antecedents play in the higher matrix clause, the top 3 are, in order, predicative (206), object (158), and subject (49). This indicates that the learners prefer simple sentence structures such as "There is one person who helps me a lot." or "She is the person who helps me a lot."
5) In terms of correctness, appropriate usage totals 88 times (34.8%), which demonstrates that a large number of students have not mastered their application at all.
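The shares in points 2) and 3) follow directly from the reported counts. As a hedged illustration, the short sketch below recomputes them; the counts are taken from the text above, while the helper name `shares` and the dictionary labels are assumptions introduced only for this example.

```python
# Counts as reported in the By Concordance results above.
word_counts = {"who": 496, "whose": 10, "whom": 8, "whoever": 3, "whomever": 0}
sentence_type_counts = {
    "attributive clause": 448,
    "cleft sentence": 31,
    "object clause": 23,
    "special question": 12,
    "adverbial clause": 2,
    "subject clause": 1,
}


def shares(counts):
    """Map each category to its percentage of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


word_shares = shares(word_counts)
type_shares = shares(sentence_type_counts)
total_records = sum(word_counts.values())

# Print one line per WIC word with its share of all records.
for name, pct in word_shares.items():
    print(f"{name:10s} {pct:5.1f}%")
```

Both dictionaries sum to the 517 records reported, so the word-level and sentence-type breakdowns describe the same set of sentences.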

By Concordance Plot
The same terms as shown in figure 1 are searched by Concordance Plot, and the detailed information is presented in table 4. As table 4 shows:
1) The number of students who employ WIC in their writing is 353 out of 580, or 60.7%, which indicates that a large number of them have the habit of using WIC to depict interpersonal relationships.
2) 123 students employ WIC more than once in a short essay, which shows that they have pragmatic awareness of them.
3) The frequency of WIC use by individual students is unbalanced: 39.3% never use them, 21.2% employ them more than once, and the highest count is 6 times (see sample 9) in this study.

By Cluster
The terms shown in figures 2-3 are searched in turn by Clusters/N-Grams, and the detailed information shows that:
2) Combining the results in figures 2-3 with the text analysis, the most repetitive sentence pattern (65 times) is: I have met many people who are really worth recalling.
This may indicate that many students regard this sentence structure as a good one and use it as an introductory sentence in the first paragraph. It also demonstrates that the learners prefer to use WIC in simple sentence structures rather than complicated ones, and that "many people" ranks first among the clusters of "who".
Note. The frequency of "who" in this figure does not match that in table 2 because 4 sentences contain "who" twice; each is counted as 1 record by Concordance and 2 records by Word List.
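The idea behind the cluster search can be sketched in a few lines. This is a simplified approximation under stated assumptions, not AntConc's own algorithm: it only counts the two words immediately to the left of "who", it ignores sentence boundaries, and the function name `left_clusters` and the sample sentences are hypothetical.

```python
import re
from collections import Counter


def left_clusters(texts, term="who", width=2):
    """Count the `width` words that immediately precede each use of `term`."""
    clusters = Counter()
    for text in texts:
        # Lowercase and keep only alphabetic words (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token == term and i >= width:
                clusters[" ".join(tokens[i - width:i])] += 1
    return clusters


# Hypothetical sentences, for illustration only.
sample_sentences = [
    "I have met many people who are really worth recalling.",
    "In my life I have met many people who helped me.",
    "She is the person who helps me a lot.",
]
cluster_counts = left_clusters(sample_sentences)
top_cluster, top_count = cluster_counts.most_common(1)[0]
print(top_cluster, top_count)
```

Ranking the resulting counts is what allows a recurring opening such as "many people who" to surface as the most frequent cluster.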

By Wordlist
The Word List results show that the self-built corpus contains 5621 Word Types and 101372 Word Tokens, which means that the average frequency per word type is 18.03, whereas WIC occur 517 times; they are therefore clearly high-frequency words. In detail, "who" ranks 35th in frequency among all the words, which further confirms that it is very popular in this study.
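The 18.03 figure is the number of tokens divided by the number of types. The sketch below, offered only as an illustration, recomputes it from the reported corpus size and shows one simple way a frequency rank could be derived from raw text; the function `word_rank` and its tokenization rule are assumptions and do not reproduce AntConc's Word List settings.

```python
import re
from collections import Counter

# Corpus size as reported above.
WORD_TYPES = 5621
WORD_TOKENS = 101372

# Average number of occurrences per distinct word type.
tokens_per_type = WORD_TOKENS / WORD_TYPES
print(round(tokens_per_type, 2))


def word_rank(text, word):
    """Return the 1-based frequency rank of `word` in `text`, or None if absent."""
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    if word not in freq:
        return None
    ranked = [w for w, _ in freq.most_common()]
    return ranked.index(word) + 1
```

For example, `word_rank("who who who the the cat", "the")` returns 2, because only "who" is more frequent in that hypothetical string.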

Conclusion
In all, the use of WIC in Chinese college students' English narrative writing bears the following features:
1) As a whole, WIC can be regarded as common words for Chinese college students because of their high frequency in this study. But the distribution of individual words is disproportionate: "who" attracts far more attention, while the other 3 words, "whom", "whose" and "whoever", attract little attention, and "whomever" attracts no record at all.
2) In terms of sentence type, the distribution of the 5 types is imbalanced, with too few adverbial clauses and too many attributive clauses. In addition, the learners habitually use simple, identical sentence structures, some of which are highly repetitive. That is to say, they lack the skills to use WIC in more complicated and advanced ways, such as in predicative clauses or adverbial clauses of concession. Furthermore, the distribution of the sub-types of attributive clause is disproportionate too: the predicative role accounts for an excessive proportion, and the appositive role attracts no attention at all.
3) The frequency of WIC use by individual students shows that the majority of learners have formed the habit of using WIC to depict interpersonal relationships, and that their distribution is also unbalanced: 39.3% never use them and 21.2% employ them more than once in this study.
4) As to clusters, the learners show a strong preference for using "who" as a relative pronoun, and some clusters are highly identical and simple.
5) To sum up, the majority of learners in this study can employ WIC consciously in their writing, but their usage is confined to simple words and structures with low accuracy. Therefore, the learners' comprehensive competence in the integrated use of WIC should be improved.
elt.ccsenet.org English Language Teaching Vol. 14, No. 4; 2021

Limitation of the Research
The reliability of the findings of this research is limited for the following reasons. Firstly, the source data are limited to the self-built corpus, with no international corpus for contrast; it is therefore hard to identify what is peculiar to WIC in Chinese college students' English narrative writing. Secondly, the research method is relatively simple, mainly a corpus-based quantitative method and text analysis rather than integrated methods; consequently, some findings rest mainly on descriptive analysis and may be superficial and less convincing. Thirdly, the research object is limited to the single topic "The most unforgettable person I ever know", which may be too narrow and simple, so the use of WIC cannot be fully reflected in these writings. Fourthly, the classification of the sentence pattern of a single sentence is not unique, since some sentences contain many defects and are hard to decide. Therefore, further investigation should be based on a larger variety of subjects and topics and on more detailed analyses of the text. Moreover, qualitative and quantitative analysis, rather than descriptive analysis, based on a comparison of corpora of Chinese learners and native speakers can be taken into account as well.

Suggestions for Future Research
·To further research the features in more detailed aspects.
·To carry out comparison research.
·To attach more importance to experimental and empirical methods.