The Effect of Total Physical Response Method on Vocabulary Learning/Teaching: A Mixed Research Synthesis

The aim of this study is to determine the effect of the TPR method on students' vocabulary learning and the factors affecting the effectiveness of this method by combining the findings obtained from both qualitative and quantitative studies. For this purpose, primary studies with 13 quantitative and 7 qualitative findings were included in this study using mixed research synthesis. The data obtained from the studies with quantitative findings were combined using meta-analysis, and the studies with qualitative findings were combined separately using thematic synthesis. Then, the analytical themes obtained from the thematic synthesis were used to try to explain the variance among the studies included in the meta-analysis. The meta-analysis showed that instruction based on the TPR method had a “strong” effect size (ES=1.131, 95% CI: 0.737 to 1.525) on academic achievement. As a result of the thematic synthesis, four descriptive themes were formed: "Learning-teaching process in TPR method", "Learning outcomes in TPR method", "Motivation" and "Implementation suggestions/requirements". It was determined that teaching based on the TPR method contributes significantly to the learning process (increasing active participation, learning by having fun, cooperative learning, etc.), to learning outcomes (word learning, correct use, creativity, etc.), and to motivation in learning; some implementation requirements (relating to the teacher and to features of the method) were also identified. Based on the descriptive themes obtained from the thematic synthesis, 10 analytical themes were developed. It was observed that all analytical themes had been implemented in the experimental studies, and that two of the 10 analytical themes significantly explained the variance among the studies included in the meta-analysis (p<.05).

voice, words, commands and body language are used (Demircan, 2013), is a method focused on students' listening to and reacting to the orders from the teacher. Richards and Rodgers (1986, p. 87) define the Total Physical Response Method (TPR) as a language teaching method that is built around the coordination of speech and action and that attempts to teach language through physical (motor) activity. The Total Physical Response Method is associated with the psychological "trace theory", which involves rote learning through intensive repetition, both verbal and physical (Richards & Rodgers, 1986, p. 87). The main purpose of the Total Physical Response method is to reduce students' stress and to entertain them during the foreign language education process (Larsen-Freeman, 2000, p. 113).
The basic principles of TPR are as follows: (i) Coordination of speech and action facilitates language learning. (ii) Grammar is taught inductively. (iii) Meaning is more important than form. (iv) Speech is delayed until comprehension skills are formed. (v) Effective language learning takes place in a low-stress environment. (vi) The teacher plays a central role and selects appropriate commands to introduce vocabulary and structure. (vii) The student is a listener and performer, responding to commands individually or collectively. (viii) Learning is maximized in a stress-free environment (Asher, 1977).
As in first language acquisition, this method is primarily about developing listening skills first and comprehension skills next. It is based on the coordination of speech and movement and tries to teach language through physical activities (Richards & Rodgers, 2001, p. 73). Physical responses precede verbal responses, and foreign language teaching begins with the use and teaching of the imperative mood (Demirel, 1999, p. 63). In the process, the meanings of words are not explained one by one but are conveyed through actions and reactions. The process starts with teaching the names of the materials in the classroom; that is, it is important to use realia and to have materials that the student can see and touch (Sarıgül, 2017). The teacher takes an active role, but the important thing is to ensure that the student is sufficiently exposed to the language (Asher, 1977).
The teacher must be a model in the classroom and should have three basic features: (i) mastering the spoken language, (ii) developing understanding with body language, and (iii) getting students ready to speak (Widodo, 2005). Students are not forced to speak in any way; they are expected to speak when they are ready, because it is believed that a second/foreign language is learned in the same way as the mother tongue. For this reason, the expected sequence is exposure to the language, understanding, showing physical reactions, and then speaking (Sarıgül, 2017). In time, students who see the teacher as a role model start to imitate the teacher and begin to respond to the teacher's commands (Richards & Rodgers, 2002). While implementing this method, students should not be corrected directly; they should be encouraged to speak and helped to make sense of the situation during the process. Trying to correct every word of individuals learning a new language and interrupting their speech will have negative effects on them. More corrections can be made as their speaking improves (Göçen, 2020).
The method is more effective with beginners and young students (Göçen, 2020). It is especially suitable for students between the ages of 7 and 11, because children learn better when they see and learn by doing, so movement is the basis of their learning (Gürsoy, 2014). Beginner students can react physically to the commands they are given. Students who learn actively and use their physical intelligence are expected to be more successful. It is a method in which both the right and left parts of the brain work actively at the same time. The method does not require much classroom material; what matters is the teacher's competence and ability to use body language. Teaching a foreign language to young learners or children differs from teaching adults, especially because it involves fun with movement and physical participation (Shin, 2006). In addition, Shin (2006) stated that the more fun learners have, the better they will remember the language they have learned. Ytreberg (1990) emphasized that "Children's understanding comes from hands, eyes and ears, and the physical world is always dominant in learning." Similarly, the "Total Physical Response" method argues that language learning should be tied to physical actions (Asher, 2002). At the same time, the method covers many teaching techniques, including drawing, music, games, role-playing, storytelling, and competition. Children are more likely to remember words associated with a fun game, an interesting picture, a song, or an absurd situation.
The Total Physical Response Method was developed by the psychologist James Asher in 1965 (Asher, 1977). Asher likens learning a foreign or second language to first language acquisition and describes the process as follows: first, listening skills develop before speaking skills do, and children begin to respond physically to their parents' commands. After a basis in listening skill is established, speaking naturally emerges (Richards & Rodgers, 2001). The student, who is sufficiently exposed to the language, is expected to construct the language, make sense of it, and use the cognitive processes of the mind actively. In general, Putri (2016, p. 19) stated the advantages of the method as follows: It is fun and easy. It does not require much preparation from the teacher's point of view. It is a good tool for learning vocabulary. Class size is not a problem. It is useful for kinesthetic learners. However, Putri  Bulan, 2019; Türkeş, 2011; Demir & Çubukçu, 2014; Sariyati, 2013; Khakim & Anwar, 2020; Harrath, 2016; Nguyen & Le Thi & Nguyen, 2020; Irshad, 2018; Kassem, 2010; Ortiz & Guaraca, 2018; Zhen, 2011; Kariuki, 2008; Khusniyati & Haryadi, 2020; Qui, 2016). In the quasi-experimental design, one group is designated as the experimental group and the other as the control group; the two groups are compared on measurements taken both before and after the experimental procedure (Fraenkel & Wallen, 200), and the difference is tested for significance.
According to the literature review, this method increases students' motivation and interest in learning (Hsu & Lin, 2012; Sariyati, 2013), improves their creativity in the lesson (Islami, 2019), and makes them feel comfortable during the lesson. It was also observed that students' stress decreased, that they had fun while learning, and that they participated actively in the lesson (Octaviany, 2007; Umah, 2017; Nuraeni, 2019; Islami, 2019). In addition, it was concluded that the number of words used by the students increased and their speaking skills improved (Ghani & Hanim, 2014; Safitri, Setiyadi, & Huzairin, 2017). On the other hand, the literature search found no meta-analysis of the effect of the Total Physical Response Method on vocabulary learning.
Alongside the studies stating that this method affects vocabulary learning and teaching, there are studies showing that it has no effect; this contradiction indicates the need for a synthesis based on the meta-analytical review method. A thematic synthesis was also considered necessary to determine the needs of the students in relation to this method. In this context, this study aims to synthesize, through mixed research synthesis, the results of studies examining the effect of learning environments created according to the Total Physical Response Method on vocabulary learning/teaching. Thus, this study is considered important because it can contribute conceptually and methodologically to future studies by resolving the contradiction in the literature. For this purpose, answers to the following questions were sought.
1) What is the effect of the Total Physical Response Method on vocabulary learning/teaching?
2) What are the students' views on and experiences of learning vocabulary with the Total Physical Response Method?
3) What are the factors affecting vocabulary learning with the Total Physical Response Method?

Method
In this study, the mixed research synthesis method was used. Mixed research synthesis is a systematic literature review method that aims to combine the findings obtained from qualitative and quantitative studies on the same subject area (Sandelowski, Voils, & Barroso, 2006). This method consists of three stages: (i) combining quantitative study results with the meta-analysis method, (ii) combining qualitative research results with thematic synthesis, and (iii) comparing the quantitative and qualitative synthesis results (Harden, 2010). The purpose of this method is to use the analytical themes obtained from the thematic synthesis to explain the variance between the studies combined in the meta-analysis (Kanadlı, 2020, p. 94). The mixed research synthesis process is given in Figure 1.

Literature Review Process
The studies included in this study were obtained by searching the YÖK Thesis Catalogue (2021), Google Scholar (2021), ERIC (2021), ULAKBİM (2021), EBSCO (2021), Proquest Digital Dissertation (2021) and Publish or Perish (2021) databases. The search was carried out by two different researchers using similar search terms. The key concepts entered into the search engines were "TPR", "Total Physical Response", "Effect of TPR language teaching method on vocabulary learning" and "Effect of TPR language teaching method on vocabulary teaching". For studies with limited or no access, an attempt was made to obtain the necessary data by contacting the authors. The literature search yielded approximately 500 records in which the TPR strategy was used for vocabulary learning/teaching.

Criteria for Inclusion of Studies
Since this study is a mixed research synthesis, separate inclusion criteria were determined for the meta-analysis and the thematic synthesis. For the meta-analysis part, in order to determine the effect size of the Total Physical Response Method on vocabulary learning, quantitative studies conducted between 2008 and 2020 on teaching environments organized according to the Total Physical Response Method (including studies using the storytelling strategy) were examined (Table 1). To be included, studies had to meet the following criteria: (i) The study (thesis or article) was conducted between 2008 and 2020, in Turkey or abroad, using an experimental design (quasi- or true experimental) with experimental and control groups. (ii) The study examined the effect on vocabulary learning/teaching of lessons organized according to the Total Physical Response Method compared with lessons organized according to the traditional method in general use. (iii) The study reported the sample size (N), mean (X) and standard deviation (SD). (iv) The study used parametric tests (t-test or F statistic). (v) The pretest means of the groups were equivalent. (vi) Vocabulary teaching was not restricted to a particular language; however, after the eliminations, the analysis was limited to the effect on English vocabulary teaching. For the thematic synthesis, studies conducted between 2008 and 2020 that collected the opinions of participants in order to examine the effect of the TPR method on vocabulary teaching/learning were considered. These studies had to report the direct expressions of the participants or the themes and codes that emerged in the study.
The studies included in the thematic synthesis (i) were conducted in English courses between 2010 and 2020, (ii) collected opinions from people who participated in TPR Method practices, (iii) reported the direct statements of the participants or the themes and codes that emerged in the study, and (iv) were qualitative or mixed method research. Studies that collected opinions from people who had not participated in TPR method implementations were not included in the thematic synthesis. Accordingly, 7 studies with qualitative findings (qualitative or mixed method) were included in the thematic synthesis. The characteristics of the studies included in the thematic synthesis are given in Table 2. According to Table 2, it was determined that 6 of the 7 studies included in the thematic synthesis were mixed method (quantitative + qualitative) and 2 of them were qualitative method. In these studies, a total of 162 participants were asked for their opinions on the TPR Method practices.

Coding Characteristics of Studies
When the studies conducted in Turkey and abroad on the Total Physical Response Method were examined, it was determined that there were 9 articles and 4 theses. Coding publication type (article or thesis) as a moderator is important for determining which type of study the obtained effect size originates from (Table 1). Because this method is mostly used with young and beginner students, education level was coded in this study as youth (preschool, primary school, secondary school and high school) or adult (university).
There are domestic studies (n=3) and international studies (n=10) examining the effect of the Total Physical Response Method on vocabulary learning. This method is not actively used in learning environments in Turkey, and its effect has only recently begun to be researched (Kandemir, 2013). Since the Ministry of National Education suggests the Total Physical Response Method among the methods that can be used in classrooms in the 2nd grade education program, studies on it are expected to increase in the near future. Because there are many studies of this method abroad but only a limited number in Turkey, this study is also expected to contribute to future studies examining whether the method makes a significant difference. When Table 3 is examined, 100% (f=13) of the studies included in the meta-analysis used a quasi-experimental design, and parametric tests were used in all of them.

Evaluation of the Quality of the Studies
The quality of the experimental studies within the scope of the meta-analysis was determined according to the evaluation model developed by Pluye, Gagnon, Griffiths & Johnson-Lafleur (2009) for experimental studies in the field of health sciences. This model recommends examining the quality of quantitative experimental studies according to three criteria: (i) the implementation process is described and the sampling is random, (ii) the group information is concealed (the groups are assigned randomly), and (iii) the validity/reliability of the obtained data is ensured and there is no data loss. The model recommends giving 1 point if a criterion is met and 0 points if it is not. In this study, however, 0.5 points were given when a criterion was partially met, because criteria can be only partially met in the social sciences. The quality score was calculated with the formula (total score/3) x 100. Pluye et al. (2009) did not specify any percentage value separating high-quality from poor-quality studies, but in the context of this study, studies with a quality score above 50% were considered high quality, and studies below it were considered poor quality.
A checklist of 12 criteria proposed by Harden, Brunton, Burchett, Oakley and Backhans (2006) was used to determine the quality of the studies included in the thematic synthesis. Although not a general rule, Harden et al. (2006) suggested that studies meeting fewer than 7 of the 12 criteria should be evaluated as "low", studies meeting 7-9 as "medium" and studies meeting 10-12 criteria as "high" quality. They recommended including medium and high quality studies in the thematic synthesis.
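The two scoring rules described above can be sketched in a few lines. This is a minimal illustration rather than the authors' instrument: the function names are hypothetical, and treating a score of exactly 50% as "poor" is an assumption, since the text only classifies scores above and below 50%.

```python
def experimental_quality_score(criterion_points):
    """Quality percentage for one experimental study.

    criterion_points: three values, each 0 (not met), 0.5 (partially met) or 1 (met).
    """
    if len(criterion_points) != 3:
        raise ValueError("expected exactly three criteria")
    if any(p not in (0, 0.5, 1) for p in criterion_points):
        raise ValueError("each criterion must be scored 0, 0.5 or 1")
    return sum(criterion_points) / 3 * 100  # (total score / 3) x 100


def experimental_quality_label(score_percent):
    # Assumption: a score of exactly 50% is labelled "poor"; the text does not say.
    return "high" if score_percent > 50 else "poor"


def qualitative_quality_label(criteria_met):
    """Label a qualitative study by how many of the 12 checklist criteria it meets."""
    if not 0 <= criteria_met <= 12:
        raise ValueError("criteria_met must be between 0 and 12")
    if criteria_met < 7:
        return "low"
    if criteria_met <= 9:
        return "medium"
    return "high"
```

For example, a study that fully meets one criterion and partially meets the other two scores (1 + 0.5 + 0.5) / 3 x 100, which is about 66.7%, and is therefore treated as high quality.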

Data Extraction
In order to code the quantitative studies, two coding forms were prepared. In the first coding form, the names of the authors, dependent variables (achievement, attitude, proficiency), sample characteristics (education level), research design (quasi-experimental designs), measurement tools (achievement test, performance test), characteristics of the intervention (course type, implementation duration) and data analysis test (parametric, non-parametric) were coded. A second coding form was prepared to extract quantitative data from the studies that met the inclusion criteria. In this coding form, the names of the authors were recorded together with the post-test mean, standard deviation and sample size for studies that did not find a significant difference between the experimental and control groups in the pre-test (p>.05), and with the dependent samples t-test result and sample size for weak experimental designs. Where information such as the mean, standard deviation, or sample size was missing, the t-test result and sample size or the F-test result and sample size were collected for independent samples.

Synthesis of Quantitative Studies
The meta-analysis method was used to determine the effectiveness of the experimental procedure. Since the sample sizes of the studies included in the meta-analysis were less than 20 (Card, 2012, p. 93), Hedges' g was used as the effect size index. Since the studies were collected from the literature, the calculated effect sizes were combined according to the random effects model (Borenstein et al., 2009, p. 86) and the overall effect size was calculated. This calculated value was interpreted as "weak" if it was between 0 and 0.20, "small" if it was between 0.21 and 0.50, "medium" if it was between 0.51 and 1.0, and "strong" if it was greater than 1.0 (Cohen, Manion, & Morrison, 2007, p. 521).
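As an illustration of the effect size index and the interpretation bands described above, the following sketch computes Hedges' g from group means, standard deviations and sample sizes using the standard small-sample correction. The function names and the example numbers are hypothetical, and applying the bands to the absolute value of the effect size is an assumption.

```python
import math


def hedges_g(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (g, variance of g) for two independent groups."""
    df = n_exp + n_ctrl - 2
    pooled_sd = math.sqrt(
        ((n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    d = (mean_exp - mean_ctrl) / pooled_sd          # standardized mean difference
    j = 1 - 3 / (4 * df - 1)                        # small-sample correction factor
    var_d = (n_exp + n_ctrl) / (n_exp * n_ctrl) + d ** 2 / (2 * (n_exp + n_ctrl))
    return j * d, j ** 2 * var_d


def interpret_effect(es):
    """Map an effect size to the bands quoted in the text."""
    size = abs(es)  # assumption: the bands apply to the magnitude
    if size <= 0.20:
        return "weak"
    if size <= 0.50:
        return "small"
    if size <= 1.0:
        return "medium"
    return "strong"


# Hypothetical post-test data: experimental M=10, SD=2, n=10; control M=8, SD=2, n=10.
g, var_g = hedges_g(10, 2, 10, 8, 2, 10)
label = interpret_effect(g)
```

With these made-up numbers the uncorrected difference is one pooled standard deviation, and the correction shrinks it slightly, so the result falls in the "medium" band.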
A heterogeneity test was performed to determine the existence and magnitude of variance between effect sizes. The amount of heterogeneity was calculated using the DerSimonian-Laird estimator (DerSimonian & Laird, 1986). The magnitude of heterogeneity was interpreted according to the I² index. I² value is considered as low as 25%, as

Synthesis of Qualitative Findings
Thematic synthesis method was used in the analysis of qualitative findings. This analysis method has three stages (Thomas and Harden, 2008). In the first stage, direct quotations or basic concepts extracted from qualitative research are coded by reading them line by line. In the second stage, the codes obtained from the first stage are grouped by comparing them according to their similarities and differences. Thus, descriptive themes are created. In the third stage, descriptive themes are compared and hypotheses are produced about what characteristics a good TPR method-based teaching practice should have. Thus, analytical themes are produced.

Cross-Study Synthesis
At this stage, moderators were formed according to the analytical themes obtained from the thematic synthesis. Studies with quantitative findings were classified according to these moderators, and categorical moderator analysis was performed to determine whether effect sizes differed significantly between categories, in other words, whether the moderator obtained from the analytical theme was a significant moderator. In this way, an attempt was made to determine the factors that influence the effect of TPR-based teaching practices on vocabulary learning/teaching.

Synthesis of Quantitative Research
The forest plot of the effect sizes of the 13 studies included in the meta-analysis is given in Figure 2.

Figure 2. Forest plot of meta-analysis results
According to Figure 2, the study with the largest effect size (ES=2.745) is the thesis by Zhen (2011), while the study with the smallest effect size (ES=-0.081) is the article by Bulan (2019). Studentized residuals and Cook's distances were examined to determine whether these studies were outliers. None of the studies had a studentized residual larger than ±2.8905, and hence there was no indication of outliers in the context of this model. The estimated common effect size based on the random-effects model was 1.131 (95% CI: 0.737 to 1.525). As seen in Figure 2, the common effect size differed significantly from zero (z=5.62, p<.01). This common effect size is "strong" according to the Cohen et al. (2007) classification. The result of the heterogeneity test was significant (Q(12)=54.46, p < .01). The I² index is 78%, which means that there is a high amount of heterogeneity among the studies.
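The reported summary statistics can be checked against one another with a short sketch. It assumes the usual definitions (I² = (Q - df)/Q and a normal-theory 95% interval of estimate ± 1.96 x SE, with SE recovered from the rounded z value), and it includes a generic DerSimonian-Laird pooling function for readers who have per-study effect sizes and variances. The function names are hypothetical, and the study-level data are not reproduced here.

```python
import math


def i_squared(q, df):
    """I-squared (%) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2.

    Requires at least two studies with positive sampling variances.
    """
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "q": q, "df": df,
            "tau2": tau2, "i2": i_squared(q, df)}


def ci_from_z(estimate, z, z_crit=1.96):
    """Approximate 95% CI, recovering the standard error as estimate / z."""
    se = estimate / z
    return estimate - z_crit * se, estimate + z_crit * se


# Consistency check using only the values reported in the text.
reported_i2 = i_squared(54.46, 12)
reported_ci = ci_from_z(1.131, 5.62)
```

Under these assumptions, Q(12)=54.46 gives an I² of about 78%, and an estimate of 1.131 with z=5.62 gives an interval of roughly 0.737 to 1.525, so the reported values agree with each other.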
Since the heterogeneity test was significant (p<.05), moderator analysis was performed to determine the source of this heterogeneity. For this purpose, the studies included in the meta-analysis were classified according to education level (young and adult) and duration. The moderator analysis based on the categorical variable is given in Table 4. As seen in Table 4, the selected moderator is significant (p<.05). This result indicates that there is a significant difference between the categories. Accordingly, part of the variance among the studies is attributable to the education level of the participants.
Meta-regression was performed to determine whether the duration of the experimental procedure (weeks) significantly predicted the effect sizes. Five studies were not included in the analysis because they did not report the duration of administration. The course hours specified in the other 8 studies were converted into a common time (week). A lesson hour is considered as 40 minutes. As a result of meta-regression, there is a significant (p<.05) relationship between the duration of applying the experimental procedure and the effect sizes, and the implementation period explains 35% of the variance.
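The idea behind the meta-regression can be illustrated with a simplified weighted least squares fit of effect size on duration, together with the usual R² analog that compares the between-study variance with and without the moderator. This is only a sketch: the function names, the example durations, weights and tau² values are all hypothetical, and dedicated meta-analysis software estimates the weights and tau² jointly rather than taking them as given.

```python
def weighted_meta_regression(effects, moderator, weights):
    """Weighted least squares fit of effect = intercept + slope * moderator."""
    sum_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / sum_w
    y_bar = sum(w * y for w, y in zip(weights, effects)) / sum_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, moderator, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r2_analog(tau2_null, tau2_model):
    """Share of between-study variance explained by the moderator."""
    if tau2_null <= 0:
        return 0.0
    return max(0.0, 1.0 - tau2_model / tau2_null)


# Hypothetical data: durations in weeks, effect sizes, and inverse-variance weights.
weeks = [2, 4, 6, 8]
effects = [0.7, 0.9, 1.1, 1.3]
weights = [1.0, 2.0, 3.0, 4.0]
intercept, slope = weighted_meta_regression(effects, weeks, weights)

# Hypothetical tau^2 values chosen so that the moderator explains 35% of the variance.
explained = r2_analog(0.40, 0.26)
```

In this made-up example each additional week adds 0.1 to the predicted effect size; the 35% figure in the text would correspond to the between-study variance dropping by about a third once duration is in the model.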

Publication Bias
According to Card (2012, p. 262), one of the best methods of determining publication bias is to include unpublished studies in the meta-analysis and to test whether the effect sizes differ significantly according to publication status (thesis versus article). If there is no significant difference between the effect sizes according to publication status (p>.05), it can be said that there is no publication bias. Accordingly, as seen in Table 1, 3 theses (unpublished) and 10 articles (published) were included in this meta-analysis. Categorical moderator analysis was performed to determine whether the effect sizes differed significantly according to publication status. As a result of the analysis, the overall effect size of the articles was 1.279 (95% CI: 0.860 to 1.698), while the overall effect size of the theses was 0.570 (95% CI: -0.227 to 1.368). No significant difference was found between these two overall effect sizes (p>.05). According to this result, it can be said that there is no publication bias.
A funnel plot of the estimates is shown in Figure 3.

Figure 3. Funnel Plot
Neither the rank correlation test nor the regression test indicated potential funnel plot asymmetry (p=0.061 and p=0.427, respectively). Rosenthal's Fail-Safe N test was performed to determine whether the estimated overall effect size was robust. According to the results of this test, 447 studies with zero effect size would have to be added to the meta-analysis to render the estimated overall effect size non-significant. If the required number of studies exceeds five times the number of studies included in the meta-analysis plus 10 (threshold value=5k+10, where k is the number of studies), it can be concluded that the calculated overall effect size is robust and is not the product of publication bias (Rosenthal, 1979). Accordingly, for the 13 studies included in the meta-analysis, the threshold value is 75 (5x13+10). Since the number of studies required (447) is much larger than the threshold value, it can be said that the calculated overall effect size is robust and is not the product of publication bias.
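The threshold comparison above is simple arithmetic, sketched below together with Rosenthal's formula for the fail-safe number itself, N = (sum of study z values)² / z_alpha² - k with a one-tailed z_alpha of 1.645. The function names are hypothetical, and the study-level z values needed to reproduce the reported 447 are not given in the text, so only the threshold is checked against the reported figures.

```python
def failsafe_threshold(k):
    """Rosenthal's tolerance level: 5k + 10 for k included studies."""
    return 5 * k + 10


def rosenthal_failsafe_n(z_values, z_alpha=1.645):
    """Number of null studies needed to bring the combined result to p = .05 (one-tailed)."""
    k = len(z_values)
    return sum(z_values) ** 2 / z_alpha ** 2 - k


k_studies = 13
reported_failsafe_n = 447          # value reported in the text
threshold = failsafe_threshold(k_studies)
is_robust = reported_failsafe_n > threshold
```

Here the threshold is 5 x 13 + 10 = 75, and the reported fail-safe N of 447 is well above it.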

Synthesis of Qualitative Findings
Eight studies with a quality score of "medium" and "high" were included in the thematic synthesis. The qualitative findings of these studies, in other words, the direct quotations in the findings or the researchers' code definition tables, were extracted and entered into QDA Miner Lite, a qualitative data analysis software. In the first stage of the thematic synthesis, the qualitative data entered into the program were read line by line and coded. In the second stage, codes with similar definitions were merged under the same code by expanding the code definition. The resulting codes were then compared according to their similarities and differences, and codes with the same features were grouped under four descriptive themes: (i) the learning-teaching process in the TPR method, (ii) the learning outcomes in the TPR method, (iii) motivation, and (iv) practice recommendations/requirements. Some requirements that affect the learning-teaching process and the learning outcomes of the students in the TPR method were identified. With respect to the teacher, being (i) energetic, (ii) patient and (iii) enthusiastic came to the fore; with respect to features of the method, (iv) TPR songs, (v) TPR games, (vi) pair work, (vii) body language (gestures and mimics), (viii) minimal and easy questions, (ix) adaptation, (x) pronunciation, (xi) pace, and (xii) institutional support came to the fore. In addition, the codes related to motivation were classified as positive or negative. The positive codes are (i) feeling happy, (ii) making comfortable, (iii) exciting, (iv) interesting, (v) appreciate, (vi) enthusiastic, (vii) enhancing motivation, and (viii) making confident. The negative codes are (i) scaring and (ii) boring.
In the third stage of the thematic synthesis, three researchers discussed the characteristics of a good TPR-based intervention in light of the descriptive themes, and the following analytical themes (hypotheses/suggestions) emerged from the discussion: 1) The class in which the implementation takes place should not be large (24 people or fewer), 2) The practitioner should ask questions appropriate to the student's readiness level, 3) The practitioner should take precautions against possible disciplinary problems, 4) The practitioner should support the method with songs, stories and games, 5) The practitioner should use pictures and realia, 6) The practitioner should organize pair work activities for active learning, 7) The practitioner should predominantly use the target language, 8) The practitioner should pay attention to the learning pace of the students, 9) The practitioner should also teach the written form of the words, 10) The practitioner should make the students feel comfortable so that they are active in the classroom environment (reduce pressure and stress).

Cross-Study Synthesis
In order to determine whether the analytical themes obtained from the thematic synthesis were implemented in the experimental studies, the experimental studies were examined according to the analytical themes by two researchers. The results of the examination are given in Table 5.
Table 5 describes the extent to which the studies included in the meta-analysis implemented the analytical themes. Although the thematic synthesis did not yield a suggestion about the ideal class size, it has been determined that the ideal class size is 24 and below for OECD countries. In three of the studies included in the meta-analysis, it was determined that the experimental group consisted of 24 and lower class size (1), (2), (3), (4), (7), (10), (11), (12), (13) (Theme 1). Asking questions appropriate to the student's readiness level was observed in studies (2), (6), (7), (10), (13) (Theme 5).
The practitioner's taking precautions against possible disciplinary problems was observed in some of the studies included in the meta-analysis (2), (6) (Theme 3). The use of songs, stories and plays, which are natural strategies given the characteristics of the TPR Model, was observed in most studies (1), (2), (3), (6), (10), (12), (13) (Theme 10). The practitioner's use of pictures and realia to support instruction and retention in vocabulary teaching was also observed in the studies included in the meta-analysis (1), (2), (6), (7), (10), (12) (Theme 4). The organization of pair work activities to ensure active learning, which is one of the codes found in the thematic synthesis, was also observed in some studies included in the meta-analysis (1), (6), (10), (12) (Theme 6). The practitioner's use of the target language in the classroom is necessary for the students to receive enough input; a student who receives enough input can complete vocabulary learning, and a student with enough vocabulary can easily express herself in the advanced stages. This theme was observed in some studies (6), (12) (Theme 9). The practitioner's attention to the students' learning pace, which can be counted as an important factor in learning, was found in only one of the studies (10) (Theme 8). In addition, the practitioner is expected to teach the written forms of words in order to support vocabulary teaching and ensure correct use; this theme was observed in studies (6), (7), (8), (10), (12) (Theme 7). Finally, the practitioner should make the students feel comfortable and stress-free (reduce pressure and stress) in order to get them involved in the classroom environment. This analytical theme was observed in studies including (3), (4), (6), (7), (9), (10) (Theme 2).
A categorical moderator analysis was conducted to determine whether the suggestions that emerged in the analytical themes could explain the variance among the studies included in the meta-analysis, and therefore whether these suggestions were functional. The results of the categorical moderator analysis of each recommendation implemented in the studies are given in Table 6. As seen in Table 6, only 2 of the 10 analytic themes (the 6th and 10th analytic themes) obtained from the thematic synthesis were able to explain the variance between studies significantly (p<.05). Accordingly, the average effect size of the studies in which practitioners implemented the TPR method while considering the students' learning pace (Theme 10, implemented) was 2.745 (1.762, 3.729), whereas the average effect size of the studies that implemented the method without considering the students' learning pace (Theme 10, non-implemented) was 1.021 (0.652, 1.390). According to this finding, when the TPR method is implemented with the learning pace of the students in mind, it can support students' vocabulary learning. Similarly, the average effect size of the studies in which practitioners also taught the written forms of the words (Theme 6, implemented) was 1.777 (1.044, 2.509), whereas the effect size of the studies that did not teach the written forms of the words (Theme 6, non-implemented) was 0.770 (0.415, 1.125). In this context, it can be said that teaching students the written form of the words during TPR-based vocabulary teaching can positively affect their vocabulary learning.
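A rough check of the two significant moderators can be made from the reported subgroup estimates alone. The sketch below assumes that the intervals in parentheses are 95% confidence intervals, recovers each standard error as (upper - lower) / (2 x 1.96), and compares the two subgroups with a two-sided z test; with two subgroups the squared z value corresponds to the between-groups Q statistic. The function names are hypothetical, and the authors' software may pool the between-study variance differently, so the exact p values in Table 6 may differ.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error implied by a symmetric normal-theory confidence interval."""
    return (upper - lower) / (2 * z_crit)


def compare_subgroups(es_a, ci_a, es_b, ci_b):
    """Two-sided z test for the difference between two subgroup effect sizes."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (es_a - es_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p value from the normal distribution
    return z, p


# Theme 10 (learning pace): implemented vs. non-implemented, values from the text.
z_pace, p_pace = compare_subgroups(2.745, (1.762, 3.729), 1.021, (0.652, 1.390))

# Theme 6 (written forms): implemented vs. non-implemented, values from the text.
z_written, p_written = compare_subgroups(1.777, (1.044, 2.509), 0.770, (0.415, 1.125))
```

Under these assumptions the z values are roughly 3.2 and 2.4, both with p below .05, which is consistent with the two themes being reported as significant moderators.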
The first analytical theme can be expressed as a continuous variable rather than a categorical one. In other words, the first analytical theme can be expressed as "As the class size increases, the effect size decreases", because the applicability of the TPR method decreases as the class size increases, which may lead to a decrease in the effect size calculated for academic achievement. Meta-regression analysis was performed to determine whether this analytical theme was supported. As a result of the meta-regression analysis, the model that tried to explain the effect sizes of academic achievement by the size of the implementation class was not found to be significant (p>.05). Therefore, it can be said that there is no significant relationship between the size of the class in which the implementation is made and the effect size of academic achievement.
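The meta-regression described above can be written compactly. The formulation below is a generic random-effects meta-regression given as a sketch; the symbols are assumptions introduced here, and the source does not state which estimator or software was used. Here $d_i$ is the effect size of study $i$, $n_i$ its class size, $v_i$ its sampling variance and $\tau^2$ the residual between-study variance.

```latex
d_i = \beta_0 + \beta_1 n_i + u_i + e_i,
\qquad u_i \sim N(0, \tau^2), \quad e_i \sim N(0, v_i),
\qquad H_0 : \beta_1 = 0 .
```

In this notation, the first analytical theme corresponds to the expectation $\beta_1 < 0$, and the non-significant result (p>.05) means that $H_0$ could not be rejected.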

Discussion
In the literature, some studies show that the TPR method has a positive effect on vocabulary learning (e.g. Demir and Çubukçu, 2014; Sariyati, 2013; Khakim and Anwar, 2020; Harrath, 2016), while others show that it does not have a significant effect (Bulan, 2019; Arumdiah and Dewi, 2010; Ortiz and Guaraca, 2018). Therefore, this study aims to determine the effect of the TPR method on students' vocabulary learning and the factors affecting the effectiveness of this method by combining the findings obtained from both qualitative and quantitative studies. For this purpose, primary studies, 13 with quantitative and 7 with qualitative findings, were included in this study using mixed research synthesis. The data obtained from the studies with quantitative findings were combined with the meta-analysis method, and the studies with qualitative findings were separately combined with the thematic synthesis method. Then, the analytical themes obtained from the thematic synthesis were used in an attempt to explain the variance among the studies included in the meta-analysis.
Since the 13 studies included in the meta-analysis were collected from the literature (Borenstein et al., 2009), they were combined according to the random effects model. The overall effect size was calculated as 1.131 (95% CI: 0.737 to 1.525) as a result of the combination. The calculated overall effect size differed significantly from zero (z=5.62, p<.01). This effect size can be interpreted as a "strong" (Cohen et al., 2007) effect size. Accordingly, it can be said that the TPR method can have a "strong" effect on increasing the success of students' vocabulary learning. For this reason, it can be said that this study resolved the existing conflict in the literature. In the heterogeneity test performed to determine whether the studies included in the meta-analysis share a common effect, it was determined that the studies were highly heterogeneous (Q(12) = 54.46, p < .01, I² = 78%). To explain the variance between studies, the studies included in the meta-analysis were coded according to various characteristics (education level, duration); these codes partially explained the source of the variance. In order to determine the reasons for this heterogeneity, a moderator analysis was performed according to level (young and adult), and a meta-regression was performed to determine the relationship between the duration of the experimental procedure (in weeks) and the effect sizes. As a result of the categorical moderator analysis, it was determined that level was a significant (p<.05) moderator contributing to the variance. Similarly, it was determined that the duration of the experimental procedure had a significant relationship with the effect sizes related to success (p<.05).
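The summary statistics reported above are internally consistent, which can be verified with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (the overall effect size, z, Q and the number of studies) and assumes a normal-approximation 95% interval and the usual definition I² = (Q − df)/Q; the function names are hypothetical helpers.

```python
def ci_from_z(es, z, crit=1.96):
    """Confidence interval implied by an effect size and its z statistic."""
    se = es / z  # standard error, since z = es / se
    return es - crit * se, es + crit * se


def i_squared(q, k):
    """I^2 (in percent) from Cochran's Q and the number of studies k."""
    df = k - 1
    return max(0.0, (q - df) / q) * 100


low, high = ci_from_z(es=1.131, z=5.62)
print(f"95% CI: {low:.3f} to {high:.3f}")      # 95% CI: 0.737 to 1.525
print(f"I^2: {i_squared(q=54.46, k=13):.0f}%")  # I^2: 78%
```

Both results match the reported interval (0.737 to 1.525) and the reported I² of 78%.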
The data extracted from the 7 studies with qualitative findings included in the thematic synthesis were coded by reading line by line. These codes were compared according to their similarities and differences, and codes with the same features were gathered under four themes: (i) learning-teaching process in TPR method, (ii) learning outcomes in TPR method, (iii) motivation and (iv) implementation suggestions/requirements. Accordingly, it was determined that when the TPR method is used in the lesson, students participate actively, meaningful and cooperative learning takes place, and students learn by having fun. As a learning outcome of this learning-teaching process, it was revealed that the motivation of the students increased, their attitudes towards the lesson improved and they felt more comfortable. However, some limitations were encountered in the application of the TPR method. For example, the shyness of some students, crowded classes, lack of equipment and discipline problems may affect the effectiveness of this method. In order to overcome these limitations, suggestions were given such as taking into account the individual differences of the students, reviewing the class sizes, assigning tasks to students to prevent disciplinary problems and ensuring their active participation in the activities. Therefore, it can be said that, once these limitations are overcome, the TPR method is a teaching method that contributes to increasing students' interest and motivation towards the lesson and improving their attitudes.
From the descriptive themes that emerged from the thematic synthesis, 10 analytical themes were identified that could affect the effectiveness of the TPR method. It was determined that all 10 analytical themes had been applied in the experimental studies. From this point of view, analytical themes were obtained by combining qualitative research with thematic synthesis. Analytical themes are hypotheses that express what characteristics a good experimental intervention should have (Thomas and Harden, 2008). Only two of the 10 analytical themes explained the variance among the studies included in the meta-analysis significantly (p<.05), while the other eight were not significant (p>.05). The significant analytical themes are the suggestions that "The practitioner should pay attention to the learning pace of the students" and "The practitioner should also teach the written form of the words". When the application of these themes was examined, it was determined that one study (Zhen, 2011) applied the first suggestion and five studies (Nguyen et al., 2020; Harrath, 2016; Kassem, 2010; Zhen, 2011; Khusniyati et al., 2020) applied the second. Accordingly, not paying attention to the learning pace of the students and not teaching the written form of the words in the TPR method may have negatively affected student achievement.
Since there is no previous meta-analysis on the effect of the TPR method on vocabulary learning, meta-analyses on the effect of other classroom methods on vocabulary learning were reviewed. Some of these studies indicate that the methods and techniques used in the classroom may support students' vocabulary learning. In the meta-analysis conducted by Hao, Wang, and Ardasheva (2021), the overall effect size of technology-assisted second language vocabulary learning was found to be .845, which supports the effect of the use of technology in the classroom on English vocabulary learning. The study also showed that technology can improve students' long-term vocabulary. In addition, the meta-analysis conducted by Haıdarı, Baysal, and Kanadlı (2020) examined the effect of digital technology-based teaching on foreign language vocabulary learning. As a result of the research, it was determined that digital technology-based teaching had a positive and large effect on foreign language vocabulary learning (effect size = 1.173).

Conclusion and Implications
The findings show that TPR is a teaching method that contributes to students' vocabulary learning. However, for this method to be effective, attention should be paid to the learning pace of the students and to teaching not only the pronunciation but also the written forms of the words. As a limitation emphasized in the qualitative research, the TPR method is difficult to implement in crowded classrooms; in the meta-regression, however, no significant relationship was found between class size and the effect size. Level, on the other hand, was found to be a significant moderator in the categorical moderator analysis. Similarly, the duration of the experimental procedure (in weeks) was found to be a significant predictor of the effect size on success. When the primary experimental studies were examined, it was observed that the TPR method was applied for periods ranging from 1 to 8 weeks.