Self-Repairing of Chinese Science and Engineering Majors in Oral English

This study uses corpus-analysis tools to investigate systematically how Chinese Science and Engineering Majors (SEMs) use self-repair in their oral English. It aims to identify the overall features of SEMs' self-repair and to determine whether the use of self-repair differs significantly across English proficiency levels. The results show that SEMs use self-repair frequently and that same information repair (SIR) is the most common type. There is no statistically significant difference in the overall frequency of self-repair across proficiency levels, and SEMs' correction rate is rather low. Self-monitoring theory and communication strategies theory are used to explain the results. The thesis closes with some suggestions for oral English teaching.


Research Background
"To err is human. To self-repair fortunately is also" (Postma, 2000, p. 98). "Repair is a term for ways in which errors, unintended forms, or misunderstandings are corrected by speakers or others during conversation. A repair which is made by the speaker (i.e. which is self-initiated) is known as a self-repair" (Richards, 2002).
Researchers have studied self-repair extensively in both first and second language acquisition. For instance, Levelt's (1983, 1989) spatial description study and Bredart's (1991) study of self-repair by French-speaking subjects examine the frequencies of different categories of self-repair in first language acquisition. Chinese linguists have also investigated the self-repair behavior of Chinese learners in oral English, but studies of Chinese college students, especially those in particular majors, are lacking.

Significance of the Research
The significance of the present study lies mainly in two aspects. Firstly, it focuses on one particular group of English learners, i.e. Chinese Science and Engineering Majors (SEMs), who make up the largest proportion of college students in China, nearly 70% according to the recruiting plan of the Ministry of Education (Retrieved August 8, 2011, from http://bbs.eduu.com/thread-307994-1-1.html).
Secondly, listening and speaking abilities are receiving more and more attention. However, college students' oral ability is probably not as good as their reading and writing abilities. This gap is particularly prominent in Chinese SEMs' English learning (Zhen, 2009). It is therefore urgent to study this group of learners' oral English.

Aim of the Research
The research investigates the self-repair patterns that Chinese SEMs use in oral English, with the aim of identifying the general features of these patterns from lower to higher English proficiency levels. It is designed to answer the following questions:
1) Which self-repair patterns are Chinese SEMs most likely to use in oral English?
2) Do Chinese SEMs at different English proficiency levels differ in their use of self-repair patterns?

Self-Repair
Research on self-repair derives from the work of Schegloff, Jefferson, and Sacks (1977). Based on a study of native speakers' conversation in naturally occurring settings, they were the first linguists to put forward the term self-repair. Schegloff et al. (1977) compared "correction" with "repair". According to them, correction refers to the replacement of an error with what is correct, and what is repaired is called the repairable or the trouble source.
On the basis of the definition given by Schegloff et al. (1977), Rieger (2003) defines repair as "error correction, the search for a word, and the use of hesitation pauses, lexical, quasi-lexical, or non-lexical pause fillers, immediate lexical changes, false starts, and instantaneous repetitions" (Rieger, 2003, p. 48). Schegloff et al. (1977) study self-repair from the perspective of conversation analysis. Levelt classifies self-repair into three main types: covert repair, overt repair and rest repair. Van Hest's (1996) classification of self-repair in L2 is considered the first and the most systematic one. Kormos (1998) claims that Levelt's classification can be adopted in the study of L2 with some modification. On the basis of the participants' retrospective comments, Kormos studies the underlying reasons for self-repair. Chen Liping (2005) gives his own classification of self-repair in the field of language acquisition, formed on the basis of those of Levelt (1983, 1989), van Hest (1996) and Kormos (1998) and of his own empirical studies.
In summary, the classifications of self-repair are varied, and each has its own strengths and limitations. In the present study they are therefore applied with some modification to fit the data.

Empirical Study on Self-Repair
The study of the distribution of self-repair is an important topic in psycholinguistics. The research results can provide indirect evidence of the function of the monitor and of its sensitivity to different types of errors (Kormos, 1999). This section reviews some representative and important empirical studies on the distribution of self-repair in the field of language acquisition. Some similarities are found across studies, but there is also disagreement about the distribution of some types of self-repair. The present study examines Chinese SEMs' self-repair in oral English and investigates the distribution of these repairs.

Self-Monitoring
In fact, the self-repair that speakers make automatically is not only a language behavior but also a cognitive activity. Hence some scholars try to uncover the underlying mechanisms of self-repair. Levelt and Kormos adopt a psycholinguistic approach, the Self-monitoring Theory, to explain the deeper reasons for self-repair. Theoretically, the analysis of repair mechanisms can provide researchers with the most direct information about the psychological and linguistic processes of spoken language.
The monitor performs two functions: matching, and creating instructions for adjustment. In the first function, it compares parsed aspects of inner and outer speech with the intentions and messages sent to the formulator, or it compares inner and outer speech with the criteria or standards of production. In the second function, it sends an alarm signal to the speaker's working memory if a mismatch is detected. The speaker then decides whether or not to make a repair.
Self-repair hints at the existence of a monitor. In the process of dynamic language use, speakers automatically monitor the form and the concept of their language in order to communicate successfully.

Communication Strategies
Communication Strategies (CSs) are located within a general model of speech production in which two phases are identified: a planning phase and an execution phase (Faerch & Kasper, 1983). CSs are seen as one part of the planning process. They are called upon when learners experience a problem with their initial plan that prevents them from executing it. One solution is avoidance, which occurs when learners change their original communication goal by means of some kind of reduction strategy. The other solution is to maintain the original goal by developing an alternative plan through the use of an achievement strategy.

Research Participants
In order to collect the most representative spoken materials, such factors as learners' majors and the universities are fully taken into account.
To balance regional distribution, five universities are chosen as the sources of data: Harbin Institute of Technology (HIT), East China University of Science and Technology (ECUST), Nanjing University of Science and Technology (NUST), University of Shanghai for Science and Technology (USST), and Chongqing University of Posts and Telecommunications (CQUPT). In addition, the learners investigated are all selected from representative science and engineering specialties, such as Thermal Energy and Power Engineering, Computer Science and Technology, Metal Materials, and Electronic Science and Technology. All the learners are classified into three groups according to their English proficiency as measured by the College English Test (CET). Learners who have passed CET6 are placed in the high-level group; those who have passed CET4 but not CET6 are placed in the middle-level group; and those who have not passed CET4 are placed in the low-level group. Students from each level are further divided into groups of two to four for discussion.
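The grouping rule described above can be stated compactly in code. The following is only an illustrative sketch: the function name and the boolean inputs are assumptions made for the example, not part of the study's materials.

```python
def proficiency_group(passed_cet4: bool, passed_cet6: bool) -> str:
    """Map CET results to the three proficiency groups described in the text.

    The function name and argument names are hypothetical.
    """
    if passed_cet6:
        return "high"    # passed CET6
    if passed_cet4:
        return "middle"  # passed CET4 but not CET6
    return "low"         # has not passed CET4


print(proficiency_group(passed_cet4=True, passed_cet6=False))
```

A learner who has passed CET4 only is therefore assigned to the middle-level group.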

AntConc 3.2.1
Frequencies and instances in context are retrieved from the SEMs' data with the corpus retrieval software AntConc 3.2.1, which was developed in 2007 by Laurence Anthony of the Faculty of Science and Engineering at Waseda University, Japan.
In this research, the author mainly uses its analytical sub-tools of concordance and word list to do the retrieving and analyzing work in Chinese SEMs' data.The concordance tool generates concordance lines of a searched self-repair and it is used for identifying self-repair as well as calculating the occurring frequencies of each self-repair.

Excel and SPSS
After the data are retrieved, all the statistics need to be stored and analyzed with statistical software. Firstly, Excel is needed for data calculation and chart creation in this thesis, because it provides multifunctional operations, especially for calculating large amounts of data and creating basic-to-intermediate charts from the information and data in the spreadsheets.
In addition, some professional statistical tools are needed. Because the descriptive statistics may reveal possible differences among the SEMs' data, significance tests are applied; as Biber et al. (1998, p. 5) point out, "significance tests show how likely it is that quantitative results could have occurred by chance, and thus they should always be reported in research articles describing a corpus-based study". There are several statistical tools for significance testing. One of the most commonly used is the Statistical Package for the Social Sciences (SPSS), created by SPSS, Inc. in Chicago, Illinois. It offers descriptive statistics with plots, frequencies, and charts, as well as many other kinds of statistical analysis such as One-way Analysis of Variance (ANOVA), correlation, t-tests, and factor analysis. In this thesis, the author applies the One-way ANOVA test and the correlation test in the significance tests of the comparative studies among the SEMs' sub-databases across different English levels.

Transcription of Raw Data
All the video recordings of students' discussions were transcribed into txt format, with no artificial corrections, deletions, or additions, to ensure authenticity (Liu Qin, 2010). All the transcripts were carefully checked by proficient English teachers to make sure that each of them is as accurate as possible. As mentioned before, the total number of raw word tokens in the SEMs' spoken database can be obtained with the word list tool. There are altogether 85,540 word tokens in 150 text samples, of which 50 samples are from the high-level group, 50 from the middle-level group and 50 from the low-level group. That is to say, the SEMs' oral English database consists of three sub-databases, i.e. the high-level database, the middle-level database and the low-level database.
Information about the learners' background, such as gender, major, school level, English proficiency and topic of discussion, is added to each transcribed text. As shown in the following sample, the first six lines present the background information of the SEMs. <schoollevel = key> means that learners in this English level group are from key universities. <school = ECUST> means that learners come from East China University of Science and Technology (ECUST). <studentlevel = 1> means that learners who have passed CET6 are placed in the first group, i.e. the high-level group. <major = Chemical Engineering and Technics> shows that the learners' specific major is Chemical Engineering and Technics. <speakers sp1 = CC, male sp2 = LYJ, female sp3 = XYH, male> shows each learner's name acronym and gender. <topic = What Do You Think of the Increased College Enrollment?> shows that the topic of the learners' discussion is "What Do You Think of the Increased College Enrollment". Each line ends with the corresponding end marker, such as </schoollevel>, </school>, </studentlevel>, </major>, </speakers> and </topic>. The discussion part follows the background information. <conversation> marks the beginning of the discussion, and </conversation> marks its end. The utterances of the first speaker begin with the code <sp1> and end with the code </sp1>. The same applies to the other speakers.
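To illustrate how transcripts marked up in this way could be read by a program, the following sketch parses the background tags and the speaker turns with regular expressions. The header tags are taken from the description above, but the exact line layout, the two utterances, and all Python names are assumptions made for the example; the study's actual files may differ.

```python
import re

# Header reconstructed from the tags described in the text. The precise
# layout of the real files and the two utterances are hypothetical.
SAMPLE = """<schoollevel = key></schoollevel>
<school = ECUST></school>
<studentlevel = 1></studentlevel>
<major = Chemical Engineering and Technics></major>
<speakers sp1 = CC, male sp2 = LYJ, female sp3 = XYH, male></speakers>
<topic = What Do You Think of the Increased College Enrollment?></topic>
<conversation>
<sp1> hypothetical first utterance </sp1>
<sp2> hypothetical reply </sp2>
</conversation>
"""

# Opening tags of the form <name = value>, e.g. <school = ECUST>.
FIELD = re.compile(r"<(\w+)\s*=\s*([^<>]*?)\s*>")
# The speakers tag holds several "spN = ACRONYM, gender" entries.
SPEAKERS = re.compile(r"<speakers\s+([^<>]*)>")
SPEAKER = re.compile(r"(sp\d+)\s*=\s*(\w+),\s*(\w+)")
# Speaker turns such as <sp1> ... </sp1>.
TURN = re.compile(r"<(sp\d+)>(.*?)</\1>", re.DOTALL)


def parse_transcript(text):
    """Return (header fields, speakers, list of (speaker, utterance))."""
    header = dict(FIELD.findall(text))
    speakers = {}
    match = SPEAKERS.search(text)
    if match:
        for code, name, gender in SPEAKER.findall(match.group(1)):
            speakers[code] = (name, gender)
    turns = [(code, utt.strip()) for code, utt in TURN.findall(text)]
    return header, speakers, turns


header, speakers, turns = parse_transcript(SAMPLE)
print(header["school"], header["studentlevel"], speakers["sp1"], len(turns))
```

Keeping the header fields separate from the turns makes it straightforward to group texts by `studentlevel` before counting self-repairs in each sub-database.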

Annotation of Self-Repair
In the present study, self-repair is divided into the following four types: same information repair (SIR), different information repair (DIR), appropriateness repair (AR), and error repair (ER). Within each type, there are sub-types. For example, SIR includes syllable repetition repair (SRR), one-word repetition repair (ORR), within-two-word repetition repair (WRR), and more-than-two-word repetition repair (MWRR). DIR includes different fact repair (DFR) and message replacement repair (MRR), etc.
According to the classification and the annotation method mentioned above, the author manually tagged all 150 text samples. The author consulted experienced professors and teachers on any uncertainty in this process.

Retrieval of Self-Repair
The author used the Concordancer of AntConc 3.2.1 to retrieve each self-repair in the Chinese SEMs' database so that the frequency of each self-repair could be obtained.

Overall Description
After all the raw frequencies were retrieved through AntConc 3.2.1, the overall frequencies and the self-repair patterns were calculated in Excel to obtain the general features of self-repair use in the oral English of the three groups of SEMs. It should be noted that, at each step of the comparative studies, all the raw frequencies of self-repair obtained directly from the retrieval have to be converted into standardized ones, because the databases being compared are of different sizes. Only in this way can the comparison be made. In the present study, frequencies are normalized to the number of occurrences per 100,000 words. The formula is Standardized Frequency (SF) = Raw Frequency (RF) / Total Tokens × 100,000 (formula taken from Gui Shichun, 2005). Table 1 displays the RF and SF of word tokens across the three SEMs databases.
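The normalization can be sketched in a few lines of code. The token counts below are those reported for the three sub-databases; the raw frequency of 1,000 and the function name are hypothetical and serve only to show how the same raw count yields different standardized frequencies in databases of different sizes.

```python
def standardized_frequency(raw_frequency, total_tokens, norm=100_000):
    """SF = RF / Total Tokens * norm (occurrences per `norm` words)."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return raw_frequency / total_tokens * norm


# Token counts of the three sub-databases as reported in the study.
TOKENS = {"high": 34_222, "middle": 24_450, "low": 26_868}

# Hypothetical raw frequency, used for illustration only.
HYPOTHETICAL_RF = 1_000

for level, tokens in TOKENS.items():
    sf = standardized_frequency(HYPOTHETICAL_RF, tokens)
    print(f"{level}: {sf:.0f} per 100,000 words")
```

Because the high-level database is the largest, the same raw count corresponds to the lowest standardized frequency there, which is why raw counts cannot be compared directly across groups.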

Comparative Studies
Comparative studies among the SEMs' three sub-databases were made in terms of overall frequency, self-repair patterns and distributions. The distributions of self-repair in each sub-database were identified and analyzed so that the similarities and differences in the use of self-repair by the three levels of SEMs could be observed. To test whether the differences among the three levels are statistically significant, One-way ANOVA tests were applied.

Frequency
All the RF of self-repair in the SEMs' database are listed in Table 2.
Note. SIR = Same Information Repair, DIR = Different Information Repair, AR = Appropriateness Repair, ER = Error Repair.
Figure 1 shows the distribution of self-repair in this study. In overt self-repair, AR makes up the highest percentage, followed by ER and DIR. DIR and AR concern the meaning of oral English, while ER concerns its form. The distribution of self-repair of meaning and form can be seen in Figure 3.
To some extent, SIR can help the speaker avoid mistakes and is indeed a communication strategy. However, if a speaker uses SIR too much, the speech becomes disfluent and incoherent.
Note. SRR = Syllable Repetition Repair, ORR = One-word Repetition Repair, WRR = Within-two-word Repetition Repair, MWRR = More-than-two-Word Repetition Repair.
We can see that participants made ORR most often and SRR least often. The following figure displays the proportions of these four subtypes in SIR. We can see from Figure 4 that ORR takes up 52% and WRR takes up a quarter. This indicates that the participants usually repeated one or two words, rather than a syllable or more than two words, to gain time in their utterances.

DIR
DIR is related to the speaker's problem in conveying information. This kind of self-repair occurs when a speaker decides to send information different from what he or she is currently formulating, or intends to replace the current message with a totally different one.
Note. DFR = Different Fact Repair, MRR = Message Replacement Repair.
From the above table, it can be seen that participants made more MRR than DFR, the RF being 84 and 45 respectively. The following figure shows the proportion of the two subtypes of DIR.
AR is concerned with whether or not an idea is expressed properly, clearly, unambiguously and cohesively. It is employed when speakers decide to encode the originally intended information but in a modified way. According to Levelt (1983), the central characteristic of an AR is that the original utterance does not contain an error. From Table 5, it can be seen that participants often made ALRR and AIR; sometimes they made AAR and ADR; they seldom made ACR. The following figure displays the proportion of the subtypes of AR.
Note. ER = Error Repair, PER = Phonological Error Repair, LER = Lexical Error Repair, MER = Morphological Error Repair, SER = Syntactic Error Repair.
From Table 6, it can be seen that the participants often made LER and MER, and sometimes they made PER and SER.The following figure can help us see the difference more directly.We can see from Figure 7 that both LER and MER take up more than 40%.Neither PER nor SER takes up more than 10%.
To sum up, SIR (65%) is the most frequent of the four main types of self-repair made by Chinese SEMs. In overt self-repair, AR (52%) is the most frequent, followed by ER and DIR. In terms of meaning and form, they make repairs of meaning (63%) more often than repairs of form (37%). The distribution of the sub-types of self-repair is also presented in the figures.

Within Main Types
The number of tokens in each sub-database (from the high-level group to the low-level group) is 34222, 24450 and 26868 respectively. The author calculated the standardized frequency of self-repair in each sub-database. From Table 7, it can be seen that the standardized frequency of self-repair in the three groups is 3565, 3820 and 4327 per 100,000 words respectively. It seems that the higher the participants' level, the less self-repair they make, and the discrepancy in standardized frequency among the three levels appears large.
To determine whether the differences are statistically significant, a one-way ANOVA test was conducted. The results are shown in the following figure. The significance value of SF is p = 0.136 > 0.05, which means that there is no statistically significant difference in the overall frequency of self-repair across the three English levels. This indicates that the three levels of SEMs tend to make a similar amount of self-repair in their discussions.
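For readers unfamiliar with the test, the computation behind a one-way ANOVA can be sketched as follows. This is not the SPSS procedure used in the study: it is a minimal pure-Python illustration of the F statistic, and it assumes that each group is represented by one standardized frequency per text sample. The values in `groups` are invented for the example and are not the study's data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-text standardized frequencies for three groups.
groups = [
    [3400.0, 3700.0, 3600.0, 3500.0],  # high-level (invented values)
    [3900.0, 3600.0, 4000.0, 3800.0],  # middle-level (invented values)
    [4300.0, 4100.0, 4500.0, 4400.0],  # low-level (invented values)
]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p-value reported by a statistics package is the probability of observing an F at least this large under the F distribution with the two degrees of freedom shown; a value above 0.05, as in the study, means the group means cannot be distinguished at that level.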

Covert Repair
As mentioned before, SIR belongs to covert repair, so the comparative analysis of SIR is in fact an analysis of covert repair across the groups. The RF and SF of SIR for each group are listed in Table 8.
Note. SRR = Syllable Repetition Repair, ORR = One-word Repetition Repair, WRR = Within-two-word Repetition Repair, MWRR = More-than-two-Word Repetition Repair.
The SF of SIR in the three groups is 2225, 2627 and 2878 respectively. It seems that the participants in the high-level group make fewer SIR than those in the other groups. Again, to determine whether the differences are statistically significant, a one-way ANOVA test was conducted. The results of the significance test are shown in Figure 9.
Figure 9. One-way ANOVA test on SIR within SEMs' databases
The significance value of SF is p = 0.127 > 0.05, which means that there is no statistically significant difference in the overall frequency of SIR (or covert repair) across the three English level groups.

Overt Repair
Overt repair includes DIR, AR and ER. The author therefore calculated the RF and SF of DIR, AR and ER separately, and then combined them to calculate and analyze the RF and SF of overt repair. DIR is dealt with first. From Table 9, we can see that the SF of DIR in the three groups is 167, 106 and 156 respectively. The participants in the middle-level group make the fewest DIR of the three, so there seems to be no regular pattern this time. A one-way ANOVA test is needed.
Figure 10. One-way ANOVA test on DIR within SEMs' databases
The significance value of SF is p = 0.293 > 0.05, which shows that there is no statistically significant difference in the overall frequency of DIR across the three English level groups.
The data for AR and ER are processed in the same way: the SF is listed in a table, and a one-way ANOVA test is then conducted to check whether there is a significant difference among the groups. Participants in the high-level group make the fewest AR and those in the low-level group make the most, but a further test is needed to confirm whether this difference is significant. The result of the one-way ANOVA test is shown in Figure 11. The significance value of SF is p = 0.235 > 0.05, which suggests that there is no statistically significant difference in the overall frequency of AR across the three English level groups. The SF of ER in the three groups is 539, 387 and 511 respectively, a pattern similar to that of DIR. The significance value of SF is p = 0.109 > 0.05, which reveals that there is also no statistically significant difference in the overall frequency of ER across the three English level groups.
Finally, the author combines the above data to calculate the RF and SF of overt repair; the results are given in Table 12. The SF of overt repair in the three groups is 1340, 1185 and 1449 respectively. The significance value of SF is p = 0.179 > 0.05, which suggests that there is also no statistically significant difference in the overall frequency of overt repair across the three English level groups.

Repair of Meaning and Form
Overt repair can also be divided into two parts: repair of meaning and repair of form. Repair of meaning includes DIR and AR, both of which deal with the content of speech. ER, which concerns the form of speech, is repair of form. All of them have been analyzed independently in the previous section, so for repair of meaning the author only needs to combine DIR and AR, and for repair of form ER has already been discussed. From Table 13, we can see that the participants in the low-level group make 934 repairs of meaning, the most among the three groups, while the numbers in the high-level and middle-level groups are almost the same.
Figure 14. One-way ANOVA test on repair of meaning within SEMs' databases
The significance value of SF is p = 0.221 > 0.05, which indicates that there is also no statistically significant difference in the overall frequency of repair of meaning across the three English level groups.

Frequent Use of SIR
The overall distribution of self-repair in the present study is very similar to the results of Chen and Pu (2007). In both studies SIR takes up over 60 percent of self-repair, a rather high proportion, and ORR takes up over 50 percent of SIR. SIR is a kind of repetition: the speaker repeats what he or she has just said to gain time to think of what to say next. To some degree, it can increase the accuracy of speech, but in many cases it breaks fluency.
On reviewing the videos, the author observes that many of the participants keep thinking about what to say and how to say it while they are speaking. A plausible explanation is that they want to make their utterances more native-like and appropriate, which requires some time for consideration. Consequently, they repeat often and thereby gain time for reconsideration and reorganization. Such a view has been proposed by many researchers. Rieger (2003) states that repetition gives speakers time to engage in linguistic and/or cognitive planning. By repeating some information, the speaker can gain time to search for a particular word or construction or to think about the content of his or her utterances. Brown (1991) also gives a possible explanation for repetition: task pressure may put a considerable strain on speakers because of their limited linguistic competence.
In the present study, SEMs can choose their topics and have enough time to prepare before videotaping. There is no time limit during the discussion, which can last from 3 to 10 minutes. Task and time pressure can therefore be almost ignored in this study. Even so, repetition is still commonly seen in every group. A reasonable explanation is that SEMs lack oral English practice. They may have mastered basic grammatical rules, but in a discussion they still need some time to recall what they planned to express. This causes the frequent repetition in discussions.

Low Self-Correction Rate
In the research, the advanced English learner could correct 41% of the errors. This reveals that most SEMs cannot monitor and detect their errors during discussion. One possible reason is that they have not mastered the grammatical rules well enough to notice and correct their errors. Another possible reason is that the participants feel less pressure because they are told that it is a free discussion rather than a test, so they may not pay much attention to the accuracy of their speech.

Self-Repair and Language Proficiency
The results of the comparative studies show no significant difference in self-repair patterns among the three English level groups. This can be explained from the following aspects. Firstly, the English levels of the participants in this study are judged by their scores in the (written) College English Test, and the differences in participants' spoken ability may not be consistent with their CET scores. Secondly, some participants in the high-level group keep considering the accuracy of their speech and may make much more SIR than the other groups; in contrast, some participants in the low-level group may not correct their errors during discussion, so their speech is more fluent and they make fewer self-repairs. Thirdly, though the difference is not significant, there is a tendency for participants in the high-level group to make less self-repair than the other groups by using pause fillers. This shows that participants in the high-level group have better learned and applied communication strategies in discussions.

Major Findings
The major findings of the present study can be summarized as follows.
1) In discussions, Chinese SEMs frequently make self-repair: there are 4.4 self-repairs per minute and 7.5 self-repairs per student, and the participants make one self-repair roughly every 25 words.
2) Chinese SEMs make SIR most among all types of self-repair. Within SIR, ORR takes up the highest percentage. In overt self-repair, AR takes up the highest percentage, followed by ER and DIR.
3) The SF of self-repair shows that participants in the high-level group make the fewest self-repairs while those in the low-level group make the most. But quantitative analysis with SPSS indicates that there is no significant difference across the groups in self-repair as a whole or in its main types. Within the subtypes of self-repair, there are significant differences between the high-level group and the low-level group in WRR and MWRR: participants from the low-level group make more than those from the high-level group. There is a significant difference between the middle-level group and the other two groups in ADR: participants in the middle-level group make fewer ADR than those in the other two groups. There is a significant difference between the low-level group and the other two groups in SER: participants in the low-level group make more SER than those in the other two groups.
4) Errors and pause fillers that may affect participants' self-repair are also taken into consideration. The correction rate of Chinese SEMs is rather low, no more than 10 percent as a whole. The participants in the high-level group use more pause fillers instead of repetition.
To sum up, in order to improve the communicative skills of Chinese SEMs, teachers should have a complete understanding of the students' self-repair behaviors.They must further develop Chinese SEMs' basic language skills, especially speaking skills.In this respect, the author suggests that teachers should create more chances for students to speak in English, and encourage them to speak.

Figure 2. Distribution of overt self-repair

Figure 3. Distribution of self-repair of meaning and form

Figure 4. Distribution of SIR

Figure 5. Distribution of DIR

Figure 6. Distribution of AR

Figure 7. Distribution of ER

Figure 8. One-way ANOVA test on self-repair within SEMs' databases

Figure 11. One-way ANOVA test on AR within SEMs' databases

Figure 12. One-way ANOVA test on ER within SEMs' databases

Figure 13. One-way ANOVA test on overt repair within SEMs' databases

Table 1. Standardized word tokens of different groups
It can be seen from Table 1 that the total raw word tokens are 85,540. Participants in the high-level group uttered 34,222 words, and the numbers for the middle-level and low-level groups are 24,450 and 26,868 respectively.

Table 2. Overall raw frequency

Table 7. Frequency of main types of self-repair in different groups

Table 9. RF and SF of DIR across different groups

Table 10. RF and SF of AR across different groups

Table 11. RF and SF of ER across different groups

Table 12. RF and SF of overt repair across different groups

Table 13. RF and SF of repair of meaning across different groups