Proficiency as a Factor in English-Medium Instruction Online Tutoring

The current study explored the effects of English as a foreign language (EFL) learners’ proficiency level on English-medium instruction (EMI) in an online tutoring project. Sixteen Taiwanese college students (tutees) collaborated with preservice teachers (tutors) in the United States in an EMI online tutorial project. The online tutor-tutee interactions were examined to determine whether tutee groups at two proficiency levels were equally receptive to EMI online tutoring, based on nine indicators. The results show that learners of both levels could generate approximately equal amounts of language-related discussions and utterances. Overall, proficiency level did not impede tutees’ ability to notice the linguistic gaps in their interlanguage. Moreover, tutees of both proficiency levels benefited from the textual display of online interactions with their tutors during a task-based learning scheme. Among the nine indicators, successful uptake and feedback type were two strong predictors of tutees’ subsequent target language (L2) learning in both groups. The findings offer pedagogical value for promoting EMI policy in globalized higher education.


Introduction
English-medium instruction (EMI) has become an unstoppable trend in many regions of the world. It is considered a critical step to "internationalize" education in the global era, especially in non-English-speaking countries. Furthermore, successful EMI implementation helps promote diverse scholarly exchange and distance learning (Chen, 2013). Be that as it may, when the instructional language shifts from students' first language (L1) to a second/foreign language (L2, English), students' receptivity is in question. How do they adapt to the drastic change? Among EMI-related studies, the most notable issue is the feasibility of such instruction in classes with students of mixed proficiency levels (Chang, 2010; Coleman, 2006; Evans, 2009; Jacobson, 2001). Chen (2013) and Yeh (2012) advocate for the inclusion of prerequisite training in higher education before full-fledged EMI enactment, in order to establish linguistic transitions (English for academic purposes). Hence, the current study responds to the urgent need to prepare nonnative English learners of different proficiency levels for EMI, with a task-based online tutoring project.
Telecommunication tools have now evolved to "telecollaboration 2.0" in education, as Guth and Helm (2010, p. 16) describe; language teachers incorporate telecommunication (also known as computer-mediated communication, CMC) into their teaching to connect the learners with international English speakers. The CMC in this study mostly refers to text-based online chat, a real-time communication between two interlocutors through an electronic agent (e.g., Google Chat). Although the physical and social cues are reduced in CMC, the existing research affirms that CMC allows L2 learners more time to process incoming messages and craft L2 output (Beauvois, 1992; Chun, 1994; Levy & Stockwell, 2006; Warschauer, 1996, 1997, 2001). CMC also facilitates awareness-raising of learners' linguistic issues (Chen & Eslami, 2013; Loewen & Erlam, 2006; Shekary & Tahririan, 2006). From this viewpoint, telecommunicative environments appear to amplify the effects of EMI and empower nonnative speakers in academic activities. Based on this account, the online tutoring project exemplifies an approach to foster students' ability to process textual input-output in EMI contexts. The following inquiries are investigated in the study: Will learners' L2 proficiency become a problem in an online EMI tutoring environment? And what discussion characteristics can predict students' learning quality?

Theoretical Framework: Noticing Hypothesis
Noticing is considered a prerequisite for L2 development (Izumi & Bigelow, 2000; Schmidt, 1990, 2001); without this awareness, learners may overlook linguistic problems because of the presence of competing stimuli during conversations (Gass, 1997). When learners consciously identify the new item from input (i.e., explicit learning, which is learning with awareness; Schmidt, 2001), they can then use their old knowledge to comprehend the new through meaning negotiation with interlocutors. They then internalize the new information into short- and long-term memory before integrating it into their output (Ellis, 1997). Literature indicates that noticing helps learners produce more accurate L2 utterances during face-to-face interactions in natural and meaningful contexts (Ellis, Basturkmen, & Loewen, 2001a, 2001b; Ellis, Loewen, & Erlam, 2006; Loewen, 2004, 2005; Schmidt & Frota, 1986).
Noticing has been repeatedly tested in face-to-face contexts, and a few researchers have also investigated noticing in CMC contexts. For example, Shekary and Tahririan (2006) tested the effects of noticing in a text-based L2 chat. Sixteen EFL learners engaged in dyadic online discussions over one month. Seven hundred and eighteen mini-discussions about language use (language-related episodes, or LREs) were identified. Three linguistic aspects were examined: suppliance of vocabulary, correction of grammar, and correction of spelling. The researchers found that the retention rate from LRE input remained high in both immediate and delayed posttests. Chen and Eslami (2013) adapted Shekary and Tahririan's project, replacing EFL-learner pairs with nonnative and native English speaker (NNES-NES) dyads. The researchers confirmed the occurrence of noticing and obtained higher memory retention rates than those reported by Shekary and Tahririan.
Certain factors of online discussion content, particularly LREs, may lead to better input recall rates (Loewen, 2004). Nine indicators have been empirically investigated: type, linguistic focus, source, timing, emphasis, directness, response, uptake, and successful uptake. Shekary and Tahririan (2006) and Chen and Eslami (2013) both concluded that successful uptake was the strongest predictor of language acquisition progress in online learning contexts. Chen and Eslami further found that direct and heavily emphasized input/correction raised learners' awareness. However, these factors may vary with learners' language proficiency, which partly determines how ready learners are to receive input. The remainder of this section discusses this important variable.
Few studies, to the author's knowledge, have explored whether more and less proficient NNESs can both benefit from EMI in online mediums and whether noticing can still occur. Face-to-face contact allows for more nonverbal cues, yet the input-output exchange is fast. In online text chats, textual display permits self-paced utterances and unlimited self-correction (i.e., the visibility of utterances gives learners more time to notice the language forms); however, nonverbal cues are reduced in online conversations. In research comparing face-to-face and online interactions (e.g., Ellis et al., 2006; Williams, 2001), learners of lower proficiency responded better to explicit comments during online interactions, suggesting that the communication strategies required for online interactions are different from those needed for in-person exchanges. Porter's (1983) empirical study included 12 native speakers and 12 L2 learners of two proficiency levels: intermediate and advanced. Intermediate learners demonstrated greater need for repair and negotiation than advanced peers during interactions. Varonis and Gass (1985) investigated interactions among NESs, highly proficient NNESs, and low-level NNESs. Discussions between NESs and high-level NNESs outnumbered those between NESs and low-level NNESs. Also, many negotiations took place to minimize confusion, indicating that NNESs strove to be understood and to understand NES and NNES interlocutors. Williams (2001) also found proficiency influential. She found that the frequency of learner-generated LREs, the posttest results, and learner proficiency were positively correlated. Learners with higher proficiency outperformed those with lower proficiency in both grammatical and lexical test scores; lower-proficiency learners better noticed and retained the information if the comments were explicit and corrective. Thus, learners of higher proficiency are more ready to receive the knowledge generated during LREs. However, more empirical evidence is needed to explore key factors that may affect L2 learners' receptivity to EMI: discussion characteristics, task design, message display, and buffered and reduced in-person contact. The purpose of the current research is, therefore, twofold: first, to contribute more empirical evidence to the literature regarding online EMI learning projects and, second, to present an analysis of learner proficiency level and noticing occurrences during CMC.

The Study
This study investigated whether EMI is feasible for learners of different proficiency levels in an online L2 tutoring context. Can noticing occur similarly and equally in the two NNES groups? If so, how? Under the Interactionist account, when more than one proficiency level is involved (as in peer tutoring), the more proficient party assumes the role of caregiver by offering scaffolding and guidance to their less proficient partners (Long, 1983a, 1983b). Therefore, in this study, NNES participants of different proficiency levels were assigned to interact with NES peers and collaborate one-on-one through task-based telecommunication. L2 proficiency level was examined as one possible variable affecting the amount of noticing and the effect on subsequent learning.

Participants
The study involved two NNES-NES dyadic combinations (tutees in Taiwan and tutors from the U.S.): lower-intermediate learners interacting with NESs (eight pairs) and higher-intermediate learners interacting with NESs (also eight pairs). The NES tutors were undergraduate education majors who were taking teacher-training courses. They joined this project as part of their course work in Multicultural Teaching. The main research focus was the NNES participants, who were junior or senior students in a Taiwanese college, Foreign Language majors aged between 20 and 22. They took a diagnostic test (shortened as GEPT, see Note 1) before the project launched. The intermediate-level test questions from the writing and reading sections were adopted. The average test score of the entire class was 45 (out of 100) with an SD of 13.4. To accentuate proficiency discrepancy as the main variable, the researcher included those who scored more than 1 SD below the mean and considered them to be lower-intermediate level (eight students). Those who scored more than 1 SD above the mean were also included and considered to be higher-intermediate level (see Figure 1). The students whose scores fell within 1 SD of the mean were excluded from the study. Additionally, a t-test analysis was conducted to confirm a significant difference between the scores of the students in the two groups (the p-value was .00).
Figure 1. Proficiency test (GEPT) results and the inclusion of participants from two proficiency levels
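The inclusion rule described above can be illustrated with a minimal Python sketch. The class mean (45) and SD (13.4) are taken from the text; the function name and the example scores are hypothetical, since the raw class scores are not reported here.

```python
CLASS_MEAN = 45.0  # reported class mean (out of 100)
CLASS_SD = 13.4    # reported standard deviation


def assign_group(score, mean=CLASS_MEAN, sd=CLASS_SD):
    """Return the proficiency group for a score, or None if the student is excluded."""
    if score < mean - sd:
        return "lower-intermediate"
    if score > mean + sd:
        return "higher-intermediate"
    return None  # within 1 SD of the mean: excluded from the study


# Hypothetical scores, for illustration only
example_scores = {"S01": 28, "S02": 45, "S03": 61, "S04": 70}
groups = {student: assign_group(score) for student, score in example_scores.items()}
```

With the reported values, the cut-offs fall at roughly 31.6 and 58.4 points, so only scores outside that band qualify for either group.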

Online Tasks
According to Long (1991), noticing helps students holistically perceive and correct their linguistic problems (filling the holes). His claim coincides with Swain's (2000) Output Hypothesis, Schmidt's (2001) assertion that noticing is SLA's prerequisite condition, and Chapelle's (1998, p. 98) seven principles for the design of online L2 learning tasks:
1) The linguistic characteristics of target language should be made salient.
2) Learners should receive help in comprehending semantic and syntactic aspects of linguistic input.
3) Learners should have the opportunities to produce target language output.
4) Learners need to notice the errors in their own output.
5) Learners need to correct their linguistic output.
6) Learners need to engage in target language interaction whose structure can be modified for negotiation of meaning.
7) Learners should engage in a second language task designed to maximize opportunities for productive interaction.
Chapelle's principles guided the task design. During the orientation, the instructors confirmed participants' basic technology proficiency. The dyads engaged in online text-based chats of around 90 minutes per week to complete their two learning tasks. Before the cyber connection started, each participant's short biography and contact information were posted on a project website, which was also the primary instruction delivery medium for the tasks and the two classes. During the first week, the participants practiced synchronous text-based chatting with each other.
Weeks 1 through 3 were the orientation phase, followed by an ice-breaking activity. In weeks 4 through 6, the first task, which highlighted cultural differences of self-value, was conducted. The second task, an action report on environmental protection, was completed between weeks 7 and 9. The immediate posttest was administered in week 10 and the delayed posttest in week 14. Participants were not informed about the posttests beforehand and thus did not review the chatscripts prior to the tests.

Coding of LREs and Posttests
Each dyad's LREs were identified from their chat archives, extracted, and saved to Word documents. LREs are considered a valid measurement to authentically and comprehensively assess students' linguistic knowledge (Swain, 2000, 2001). A total of 512 LREs were coded and categorized into nine characteristics: type, linguistic focus, source, complexity, directness, emphasis, response, uptake, and successful uptake (see Table 1 for details).

Emphasis: Combination of complexity and directness
  Light: Indirect and simple.
  Heavy: Direct, complex, or both.

Response: Type of feedback provided by the NES
  Provision: NES gives information about a language form.
  Elicitation: NES attempts to draw out from NNES a language form or information about a language form.

Uptake: NNES response to feedback
  Uptake: NNES produces response.
  No uptake: NNES does not respond.

Successful uptake: Quality of student response
  Successful uptake: NNES incorporates linguistic information into production or shows solid evidence of understanding.
  Unsuccessful uptake: NNES does not incorporate linguistic information into production.
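As a rough illustration of how a coded LRE might be represented, the following Python sketch records a subset of the characteristics and derives emphasis from complexity and directness, following the light/heavy definition given in the coding scheme. The class and field names are hypothetical, and the check that successful uptake presupposes uptake is an assumption rather than a rule stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedLRE:
    """One language-related episode with a subset of its coded characteristics."""

    is_complex: bool         # complexity: more than one response move was needed
    is_direct: bool          # directness: the feedback was explicit
    response: str            # "provision" or "elicitation"
    uptake: bool             # the NNES responded to the feedback
    successful_uptake: bool  # the NNES incorporated the feedback correctly

    def __post_init__(self):
        if self.response not in ("provision", "elicitation"):
            raise ValueError("response must be 'provision' or 'elicitation'")
        # Assumption: an episode cannot show successful uptake without uptake.
        if self.successful_uptake and not self.uptake:
            raise ValueError("successful uptake requires uptake")

    @property
    def emphasis(self) -> str:
        # Light = indirect and simple; heavy = direct, complex, or both.
        return "heavy" if (self.is_direct or self.is_complex) else "light"


# A complex, direct episode resolved by provision and followed by successful uptake
example = CodedLRE(is_complex=True, is_direct=True, response="provision",
                   uptake=True, successful_uptake=True)
```

Deriving emphasis instead of storing it keeps the three related codes from contradicting one another.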
When the online correspondence ended, two posttests (immediate and delayed assessments) were custom-made for individual Taiwanese learners based on the items each dyad had discussed in LREs. These posttests served as a quantitative index of learners' intake and output (Loewen, 2005; Shekary & Tahririan, 2006; Swain, 2001). Three possible LRE foci were correction, suppliance, and spelling (suggested in Shekary & Tahririan, 2006).
In the correction-focused test items, which were mostly grammar related (see LRE 2 for an example), the NNESs were asked to improve sentences they had incorrectly produced during interactions with their tutors. The suppliance questions were primarily vocabulary related (see LRE 1). These test items required learners to provide a definition or meaning for problematic word choice, idioms, or phrases. To test spelling, NNES students were required to identify the correct spelling of the words that appeared in LREs. The students' answers to the test items were coded into three categories: (1) correct (a response matched the targeted item in the tested LRE, e.g., LRE 1 and test item 1), (2) partially correct (an acceptable or improved response but not the same as the original items discussed in LREs, e.g., LRE 2 and test item 2), and (3) incorrect (a response showed that the NNES failed to reproduce the targeted item in the LRE, e.g., LRE 3 and test item 3). All the excerpts are unedited, i.e., as they appeared in the data.

LRE 1 (vocabulary-related item):
Yvonne: How many hours are difference between us?
Stacy: Now I think it is 14, maybe?
Yvonne: Oh! I see.
Correct test response (i.e., the linguistic issue was accurately recalled): In the United States (or some countries), every fall people put our time back one hour, and in the spring we put our time forward an hour. It is to save electricity and optimize the daylight.

LRE 2 (grammar-related item):
Yvonne (NNES): What am I proud of? I think I am proud of take care of me by myself.
Stacy (NES): it needs to be taking, because of the proposition before the verb.
Yvonne: Taking care of myself?

Correction test item 2:
Please find an error in the following sentence.
Maria is a very good student. She is always serious about submit her work in time.
Partially correct answer (i.e., the answer did not reflect the issue discussed in the LRE but was grammatically and semantically acceptable; the NNES changed the word choice and its form in addition to the preposition "in"): Maria is a very good student. She is always serious about finishing her work on time.

LRE 3 (grammar-related item):
Yvonne (NNES): Oh~ sorry, I make a misunderstood!
Stacy (NES): Oh, it is not a problem, Thanksgiving is coming very soon.
Yvonne: Do you go home for Thanksgiving?
Stacy: Sure, earlier you should have used "I misunderstood".

Correction test item 3:
There is an error in the following sentences. Please correct it.
A: I think I make a misunderstood.
B: How is so?
Incorrect test response (i.e., the respondent still could not correct the error discussed in the LRE. The focus is the first utterance, but the NNES changed the second sentence from "How is so" to "Why is so"): Why is so?
The immediate test was administered in week 10, and the delayed assessment in week 14. To avoid item repetition, the immediate test focused on the first half of the data (LREs), and the delayed test covered the second half. A total of 425 LREs were tested (about 83% of the 512 LREs produced). LREs in which dyads failed to find a linguistic solution were withdrawn.
Table 2 shows an example of an LRE and its characteristics in accordance with the coding scheme and the definitions suggested by Loewen (2005) and Shekary and Tahririan (2006). The term "night store," which the NNES misused, had temporarily caused a misunderstanding. An immediate act of meaning negotiation was called for to help both interlocutors resolve the confusion together and resume the main discussion (Gass, 1997). The NES modeled the correct language use, "strip bars," and added more information to make himself understood. Therefore, this LRE is reactive. Also, the LRE focused on the term "night store" ("strip bars"), so the linguistic focus is on vocabulary. Because the meaning of the problematic linguistic item impeded communication, the reason for the initiation of this LRE was meaning as opposed to code (grammar). More than one response move was required to resolve the communication breakdown, making it a complex LRE. In addition, the tutor gave an explicit explanation, making it direct feedback. Because the tutor gave complex and direct explanations, which made the language focus more noticeable to both interlocutors, this LRE showed a heavy emphasis on the problem trigger and resolution. The discourse move in this LRE went directly from the tutor to the tutee to provide the explicit information, without a further attempt to elicit the correct word choice from the NNES. Finally, the NNES was receptive to the tutor's explicit feedback. Not only did she follow with an uptake, but the uptake was successful: she incorporated the tutor's input into her new output. Meaning negotiation was accomplished, and the task-related discussion between the interlocutors continued.
The matching suppliance test item: What is a "strip bar/strip club"?

Data Analysis
The frequency counts of LREs and LRE characteristics produced by each dyad and group were summarized to verify the occurrence of incidental noticing in the online EMI tutoring setting. After the posttests, scores were calculated and statistics were produced for each dyad, the higher- and lower-proficiency groups, and the entire sample. A chi-square analysis (with the alpha level set at .05) was chosen to determine any significant difference between immediate and delayed test performance of the two groups (Ott & Longnecker, 2001).
Last, multi-factorial binary logistic regression analyses were conducted to reveal the best-fitting models for the relationship between the dependent variable (correct test responses) and the independent variables (the characteristics of LREs) (see Table 3). Logistic regression is used when the dependent variable is binary or dichotomous rather than numerical (continuous). In this case, regression modeling helped determine which characteristics of LREs would best predict test scores. Separate logistic regressions were performed for different proficiency groups (lower- and higher-intermediate levels) and for different question types (correction, suppliance, and spelling) on the posttests, using students' proficiency as a key variable. In stepwise regression, each independent variable is added to the equation one at a time according to the default entry criteria of SPSS 15.0 (from .15 to .20). Each step added the variable that caused the highest amount of change to the model. If a variable did not make a significant contribution to the model, it was excluded. An alpha level of .05 was chosen to conduct stepwise regression. Given that logistic regression allows only binary data (y = 0, y = 1), the dependent variable needed to be dichotomized (incorrect test response = 0, correct = 1) (Ott & Longnecker, 2001). Because partially correct responses were considered acceptable and show some degree of learning, they were merged into the category of correct response to generate a bigger sample size. Likewise, assigning binary values to the independent variables (see Table 3) yielded odds ratios (ORs) that are easier to interpret. The output of the logistic regression analysis was reported as odds ratios with 95% confidence intervals for each independent variable. An OR approximates how many times greater the odds of a correct response (y = 1 rather than y = 0) are when a binary independent variable takes the value 1 than when it takes the value 0 (see Table 3). The larger the OR, the stronger a predictor the independent variable is (Hosmer & Lemeshow, 2000). A negative relationship between two variables, however, produces an OR < 1.0, and the smaller the OR, the stronger the negative relationship.
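To make the odds ratio and its confidence interval concrete, here is a minimal Python sketch for a single binary predictor. It is not the multi-factorial stepwise procedure run in SPSS; it computes the unadjusted OR from a 2x2 table with a Wald-type 95% interval, which is what a logistic regression with one binary predictor would yield. The function names and the counts are hypothetical.

```python
import math


def dichotomize(response):
    """Code a test response as 1 (correct or partially correct) or 0 (incorrect)."""
    return 1 if response in ("correct", "partially correct") else 0


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a: predictor = 1 and y = 1    b: predictor = 1 and y = 0
    c: predictor = 0 and y = 1    d: predictor = 0 and y = 0
    All four cells must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only:
# 60 of 80 LREs with successful uptake were answered correctly,
# against 40 of 80 LREs without successful uptake.
odds_ratio, lower, upper = odds_ratio_ci(a=60, b=20, c=40, d=40)

# An OR below 1 can be restated as its reciprocal for the opposite coding.
reciprocal = 1 / odds_ratio
```

An interval that excludes 1.0 corresponds to a predictor that is significant at the .05 level.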

Reliability of Coding and Testing
The researcher of the present study coded 512 LREs. To estimate the inter-rater reliability of the coding, 50% of the data was coded by both the researcher of the study and the instructor of the Taiwanese participants. The kappa coefficient for LRE coding was κ = .95.
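For readers unfamiliar with the statistic, the following Python sketch shows how Cohen's kappa corrects raw agreement between two coders for the agreement expected by chance. The function name and the ten example codes are hypothetical and are not drawn from the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical directness codes from two coders, for illustration only
coder_1 = ["direct", "direct", "indirect", "indirect", "direct",
           "indirect", "direct", "direct", "indirect", "direct"]
coder_2 = ["direct", "direct", "indirect", "indirect", "direct",
           "indirect", "direct", "direct", "indirect", "indirect"]
kappa = cohen_kappa(coder_1, coder_2)
```

In this toy example the coders agree on nine of ten items, yet kappa is lower than .90 because half of that agreement would be expected by chance.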
After coding the LREs, the researcher crafted test items for each NNES by following the three constructs suggested by Shekary and Tahririan (2006): correction questions for grammar-related LREs, suppliance questions for vocabulary-related LREs, and spelling questions for spelling-related LREs. Because obtaining the reliability of the individualized testing is impossible in a conventional sense (e.g., testing and retesting for internal consistency), construct validity (suggested by Loewen, 2005) was chosen to ensure the suitability of the test items. In short, the aim was to verify that the test items actually measured a learner's ability to reproduce or recall the linguistic knowledge generated in the LREs. The instructors ensured that the test items tested the issues discussed in LREs.

Results and Discussion
The current study attempted to investigate the feasibility of online EMI L2 tutoring projects and the noticing effects on the NNES tutees of higher and lower language proficiency. By involving NES tutors and taking advantage of CMC's capacities, the researcher hoped little or no significant difference would be found between the performances of the two groups in the study.
Both lower- and higher-level students were able to produce approximately equal amounts of LREs during their EMI online L2 tutoring (Table 4). A two-sample one-tailed t-test (the hypothesis tested was that the lower group > the higher group) was conducted based on the average utterances by the two groups. The p-value was .227 (> .05); hence, there was no significant difference between the two groups. An additional one-tailed t-test (again testing the hypothesis that the lower group > the higher group) was performed on the LREs. Once again, there was no significant difference between the two groups (p = .401). When the data set is divided in accordance with the three linguistic foci of grammar, vocabulary, and spelling, both groups also generated similar amounts of LREs in each category (see Figure 2). This is a surface indication that the combined effects of the CMC context and NES tutors may have promoted noticing for L2 learners of different proficiency. In addition, the similar numbers of code-related and message-related episodes (257:255) indicate that the communicative tasks used in this study successfully promoted incidental noticing of different linguistic issues (see Figure 3). The task-based language-learning framework effectively raised learners' consciousness of their linguistic problems, regardless of their proficiency level.
Based on the results of the posttests (Table 5), three chi-square analyses were used to examine if there were significant differences between the distributions of correct answers produced by the two groups. The results showed that there were no significant differences between the posttest results within either group (lower group: χ²(2, n = 216) = .877, p > .05; higher group: χ²(2, n = 209) = .656, p > .05). Therefore, the students of both proficiency levels successfully retained the linguistic knowledge discussed in LREs from short-term to long-term memory. A separate chi-square analysis also showed that there were no significant differences among the distribution of correct, partially correct, and incorrect answers in the two posttests between the two proficiency groups (χ²(2, n = 425) = .318, p > .05). Proficiency level did not affect NNESs' memory retention or linguistic knowledge gained in LREs during the three weeks between the posttests. The results show that NESs' involvement and CMC context facilitated noticing, intake, and output. Ellis (2001) pinpointed the effect of contextual factors (interlocutors and media in this type of study), which could explain the discrepancy between the findings of the current research and other similar studies. Loewen's (2005) study (conducted with an NES teacher and NNES students in a classroom context) as well as Shekary and Tahririan's (2006) study (in an NNES-NNES CMC context) both reported that NNES participants showed an obvious decrease in memory ranging from 8.3% to 13.6% between the immediate and delayed posttests. The NES peers plus the text-based CMC in the present study could explain the more accurate recall of linguistic information and better performance of the students in this study than those in Loewen (2005) and Shekary and Tahririan (2006). The higher-proficiency group's memory only decreased 1.4%, while the lower-proficiency group decreased 0% from immediate to delayed posttests.
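The three chi-square statistics reported above can be converted to p-values with a short Python sketch. It relies on the fact that, for 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(-x/2). The statistics are those given in the text; the function and dictionary names are hypothetical.

```python
import math

ALPHA = 0.05  # significance level used in the study


def chi_square_p_value_df2(statistic):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2)  # closed form valid for df = 2 only


reported_statistics = {
    "lower group, immediate vs. delayed (n = 216)": 0.877,
    "higher group, immediate vs. delayed (n = 209)": 0.656,
    "lower vs. higher group (n = 425)": 0.318,
}

p_values = {label: chi_square_p_value_df2(stat)
            for label, stat in reported_statistics.items()}
all_nonsignificant = all(p > ALPHA for p in p_values.values())
```

Under this calculation the three p-values come out at roughly .65, .72, and .85, all well above .05 and consistent with the nonsignificant results reported.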
Previous studies in CMC and face-to-face contexts found that NES counterparts are more capable of giving immediate feedback (error correction), offering comprehensible input (foreigner talk discourse), and presenting language models (Kung, 2002; Long, 1983b; Porter, 1983; Schwienhorst, 2004; Williams, 2001). NES tutors' advantages, such as their ability to apply various communication strategies, allow them to repair communication breakdowns more easily than NNES tutors. Other literature supports NNES-NNES interactions and presents the possible interlanguage improvement without NESs' involvement (Porter, 1983; Smith, 2003b, 2004), yet the NES tutors in the present study were preservice teachers and could explain their first language in detail (e.g., the nuance of synonyms, idiomatic expressions, or grammatical exceptions). Along the same line, the tutors in the present study facilitated and reinforced awareness, comprehension, intake, and integration and thus had a significant impact on tutees' learning and memory. Consequently, the dual stimuli of NES peers and dyadic CMC facilitated the better performance of the learners in this study compared to similar studies.
To examine the LRE characteristics, the whole dataset was first divided by the two proficiency levels, and then two separate logistic regressions were performed on the two subsets. As shown in Table 6, successful uptake was a strong predictor for both groups of learners (OR = 2.779 for lower group and OR = 3.232 for higher group). This outcome corresponds with the findings of Loewen (2005) and Shekary and Tahririan (2006), who endorsed the importance of the quality of uptake as opposed to the mere presence of it: "It is the correct production of linguistic information during LREs that helps learners produce the same correct information in the test item. Negotiating about language is not enough" (Shekary & Tahririan, 2006, p. 570).
Note. Predictors with ORs of less than 1 were also reported in their reciprocal values (when y = 0), marked with asterisks. α = .05 was chosen to be the cut-off point to exclude the less significant predictors.
Directness, however, only entered the lower-proficiency NNESs' model with a high OR of 2.882, which means that explicit feedback was more effective in promoting noticing and subsequent learning for less proficient learners than implicit feedback (such as recast or repetition).This result agrees with Ellis et al. (2006) and Gass (1997), who found that explicit feedback was more effective than implicit feedback during communicative tasks.
When NESs attempt to elicit more information (improved output) from NNESs, NNESs are pushed (through noticing) to transform their declarative knowledge into procedural rules while contemplating a linguistic problem and possibly a solution (Schmidt, 1990). Porter (1983) concluded that less proficient learners demonstrated greater need for conversation repair tactics from NESs than advanced learners did. Meanwhile, more proficient learners did not appear to be affected by the type of feedback NES peers provided in LREs. This result echoes Williams' (2001) findings that more proficient learners were more receptive than less proficient classmates to linguistic feedback from either the NES teacher or NNES peers during LREs.
Knowing that proficiency level could qualify different characteristics as strong predictors in logistic regression analyses, the researcher included the learner's proficiency level as a new independent variable (in addition to the LRE characteristics) and recombined the subsets of data for a bigger sample size (425 LREs) to proceed with a further analysis (Table 7). The purpose was to examine the effect of tutees' proficiency level on noticing. Three more logistic analyses were conducted in accordance with test types: overall (uncategorized data), correction, and suppliance. Readers should be informed that the sample size of spelling-related LREs was too small (n = 22); thus no claim can be made.
Note. Predictors with ORs less than 1 were also reported in their reciprocal values (when y = 0), marked with asterisks. α = .05 was chosen to be the cut-off point to exclude the less significant predictors.
In the three regression analyses, successful uptake entered all three models as a prominent predictor. In the logistic regression of overall test responses, successful uptake was the most powerful predictor (OR = 2.695). Moreover, three other significant predictors also entered the model: directness, response, and proficiency level. This outcome indicated that direct (explicit) feedback (OR = 1.992) and tutors' attempts to elicit responses from tutees (OR = 1.915) also positively affected tutees' correct response performance in the posttests. A few empirical studies have found that NESs' competency in using communication strategies (e.g., clarification or elicitation) is quite critical when interacting with NNESs (Long, 1981, 1983b; Pica & Doughty, 1985). NESs can promote meaning negotiation and "pushed output" (Swain, 2000, p. 99) from NNESs by activating their existing knowledge and generating new output. Eventually, NNESs' interlanguage quality improves.
Furthermore, learners' proficiency level (OR = 1.753) in the model means that the odds of correctly recalling the LRE-related linguistic information were about 1.753 times higher for learners of higher proficiency than for their less proficient counterparts (see Table 7). Thus, NNESs' proficiency influenced learners' performance on the posttests. Williams' (1999, 2001) studies in a conventional classroom setting also reported a similar finding.
In Williams' study, learners of lower proficiency responded better to corrections given by their NES teacher than to peer feedback. The frequency of learner-generated LREs, the posttest performance, and learner proficiency were positively correlated. However, there are two major differences between Williams' study and the present study. First, the lower proficiency NNESs in the present study were able to generate almost as many LREs as the more proficient group. Second, there were no significant differences in short- and long-term memory retention between the two groups. The most probable explanation is that tutor-tutee instruction was personalized in the present study. Gass (1997) and Schinke-Llano (1986) emphasized the particular type of corrective feedback less proficient learners needed: They responded better when immediate, explicit corrections were given. Similarly, the tutees in the current study obtained direct and explicit feedback from their tutors, which facilitated knowledge integration through LREs and eventually improved L2 output (Ellis, 1997). Consistent with this, the results of this study show that direct feedback and explicit response were significant predictors of learners' performance on posttests. Moreover, the textual presentation of the interactions via CMC allowed NNESs more time to analyze and to internalize the information provided by NESs (Sharwood Smith, 1993). Under this dual influence, the difference in performance between less and more proficient NNESs was reduced in the current study.
Two additional logistic regressions were conducted to further explore the effect of NNESs' proficiency on correction and suppliance test items. In the regression analysis on the correction item type (grammar-related items) (Table 8), successful uptake entered the model with the second-highest OR of 2.123. However, proficiency level served as the strongest predictor (OR = 2.281) in this analysis. The higher proficiency group answered 75.8% of the items correctly in the posttests, while the lower proficiency group answered 57.2% of the items correctly. The impact of noticing on grammar-related performance was not similar across groups: Higher proficiency learners outperformed their counterparts. The findings support the claim that learners' skill level can determine how ready they are to notice new forms during interactions (Bardovi-Harlig, 1995; Schmidt, 2001). This test item type appeared to be more challenging for less proficient tutees. Williams' (2001) study also yielded the same result: The lower proficiency learners' existing knowledge was insufficient to support the grammatical input from their tutors. Moreover, the complex grammatical explanations from the tutors were in the target language (EMI) as opposed to their native language, which compounded the challenge (Benjamin, 2001; Feuillard, 1997; Weatherford, 1997). Weatherford (1997) explains that NNESs of lower proficiency tend to struggle when offered grammatical explanations in L2, and therefore explicit feedback can make input more salient to them. Nevertheless, proficiency level had less influence on suppliance test items and did not enter the model as a powerful predictor (Table 9). More proficient learners answered 79% of the test items correctly, while the other group correctly answered 75.3%. The difference was not substantial: Tutees in both groups were receptive to tutors' message-related input through meaning negotiation in the CMC EMI context. The focus of suppliance items was vocabulary, which could explain why source (i.e., message-related LREs) entered the regression models for the first time. The results of the logistic regression analyses support the view that task-based language learning, which focuses on both meaning and form, meaningfully contextualizes vocabulary for learners. The finding also echoes that of Tekmen (2006), who delineated the parallel increase between the content complexity in learning material and explicit instruction. The increase of directness in L2 feedback was key in facilitating learners' intake and output in multilevel classes.
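The correct-response percentages reported above for correction items (75.8% vs. 57.2%) and suppliance items (79% vs. 75.3%) can be turned into crude, unadjusted odds ratios as a rough check on the size of the proficiency gap. This is only a sketch under a stated assumption: it ignores every other predictor, so its output is not expected to match the adjusted ORs from the regression models, and the function names are illustrative rather than taken from the study.

```python
def odds(probability: float) -> float:
    """Convert a proportion of correct responses into odds."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return probability / (1.0 - probability)


def unadjusted_odds_ratio(p_higher: float, p_lower: float) -> float:
    """Crude odds ratio comparing the higher proficiency group with the
    lower proficiency group; no covariates are controlled for."""
    return odds(p_higher) / odds(p_lower)


# Proportions of correct posttest responses as reported in the text.
correction_or = unadjusted_odds_ratio(0.758, 0.572)
suppliance_or = unadjusted_odds_ratio(0.790, 0.753)

if __name__ == "__main__":
    print(f"correction items (grammar):   unadjusted OR = {correction_or:.2f}")
    print(f"suppliance items (vocabulary): unadjusted OR = {suppliance_or:.2f}")
```

The crude ratio for correction items is a little above 2.3, close to the adjusted OR of 2.281 reported for proficiency level, whereas the ratio for suppliance items stays near 1.2, in line with proficiency not entering that model as a strong predictor.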

Conclusion and Implications
Proficiency level has seldom been researched as a major variable in the literature regarding online EMI L2 learning projects. The current study aimed to explore the possible effect this variable had in an NES-NNES, tutor-tutee, higher-lower proficiency setting. The results show that learners of both proficiency levels were able to generate similar amounts of LREs, which indicates that L2 proficiency level did not impede learners' ability to notice the input during CMC. As for memory retention in the immediate and delayed tests, neither tutee group showed a major decrease between the two posttests. Statistical results further show that higher proficiency learners still performed better on grammar-related test items (i.e., correction) in posttests. Both learner groups seemed to benefit similarly from the textual display of natural interactions with their NES tutors during task-based CMC.
The high number of learner-generated LREs and the correct test responses show that the double stimuli of NES tutors and CMC helped raise linguistic awareness and subsequent L2 learning of tutees of both proficiency levels.
The logistic regression analyses also suggested that successful uptake was a significant predictor of subsequent L2 learning in both tutee groups, which is consistent with existing face-to-face and CMC research on noticing. Successful uptake is evidenced by the similar number of LREs between the two groups, indicating that lower proficiency students were as receptive to tutors' EMI instruction as their higher proficiency counterparts. Meanwhile, the LRE characteristic of directness (explicit feedback) was particularly significant to learners at the lower proficiency level. It helped them recall the linguistic knowledge discussed in LREs and generate correct test responses in posttests. Tutors' corrective feedback evidently facilitated tutees' subsequent L2 learning.
Proficiency level, as a powerful predictor, noticeably affected learners' performance on grammar-related LREs (and correction test items). To make input more comprehensible, Loewen (2005) asserts that direct and explicit feedback should be made highly available to fit learners' needs and to raise the possibility of noticing.
The fact that the more proficient learners in the current study received more direct feedback than their lower proficiency classmates did was unexpected and intriguing. The higher quantity of direct and explicit feedback might have helped proficient tutees perform better in posttests than the less proficient learners, especially on grammar-related test items.
Additionally, the results show that students of both proficiency levels were equally receptive to lexical and semantic feedback from their tutors. CMC's pedagogical capacities allowed learners to visually analyze information, internalize the input into intake, and carefully craft the output. Despite the small sample size and the short duration of the present study, the findings suggest that explicit feedback and elicitation techniques can promote learners' linguistic awareness and subsequent L2 learning in an online EMI project. Finally, NESs' involvement was a performance booster in collaborative communicative tasks; NES tutors' "explicit rich instruction can serve to bridge the gap between learners' current proficiency level and the level demanded by the input . . . and hence speed up the language learning process" (Tekmen, 2006, p. 222). Future researchers and classroom practitioners are encouraged to consider the possible pedagogical effects of offering different types of feedback for different aspects of linguistic knowledge input to L2 learners of different proficiency levels in both classroom and CMC settings.
Future researchers should also include learners of a wider range of language proficiency. The participants in this study were all from the same class; hence proficiency discrepancy was limited between the two tutee groups. Moreover, the online collaboration in this research was not conducted in a fully controlled classroom setting due to the 14-hour time difference between tutors and tutees. Each dyad engaged in CMC after class, chose their own chat agents with their partners, and autonomously followed the instructions of a 90-minute session each week. Since this study included only 16 participants and thus produced a small sample size of LREs, a more controlled experimental setting could increase the quality and the quantity of data collection. For instance, some dyads did not build a strong sense of partnership and hence engaged in shorter online conversations. On the other hand, some developed friendships and interacted voluntarily with each other more often than required, which reinforced their intrinsic learning motivation.

Figure 2. The comparison among the quantities of grammar-, vocabulary-, and spelling-related LREs generated by lower-level NNESs and higher-level NNESs

Table 2. Example of coding scheme
Carrie (NNES): The reason it's he went to night store. There have a lot of sexy girl and drink wine. I don't know how to explain that place.
Carrie: Night store sounds a little strange name. Hahaha!
Carrie: How about you?
Dennis (NES): We have stores like that. They aren't very good places...we call them "strip bars."
Dennis: It's very frowned upon to go there.
Carrie: Oh! Strip bars are like dance clubs, right?

Table 3. Binary variables of logistic regression

Table 4. Frequency of LREs generated by lower- and higher-proficiency level NNESs

Table 5. Test results (lower-level NNESs' and higher-level NNESs' correct response percentages of correction, suppliance, and spelling test items)

Table 6. Logistic regression results of uncategorized test items divided by proficiency level (lower-proficiency vs.

Table 7. Logistic regression results (with proficiency level as an additional variable)

Table 8. Results of the logistic regression of correction test item type

Table 9. Results of the logistic regression of suppliance test item type