An Innovative Way of Finding Best or Least Matching Pairs and Groups

The procedure introduced in this article is an innovative way of finding the best and least matching pairs. The method can also be extended to find the most converging or diverging groups if the objectives of a study so require. The procedure uses three pieces of information to determine which pairs or groups of subjects converge most or least: the total differences between pairs of subjects’ responses to the questions in a questionnaire, the subjects’ place on the continuum defined for the variable under study, and the correlations between pairs of students’ scores. A rank is assigned to each pair with regard to each source of information. These values are then added, and the pairs are ranked from the most converging to the most diverging, with small and large numbers representing converging and diverging pairs, respectively. After pairing subjects, it is easy to find the most converging or diverging groups by dividing the arranged pairs vertically or horizontally. The procedure is felt to be applicable to many quantitative non-experimental and qualitative studies.


Introduction
Research purposes are many and varied and are accordingly investigated by using different research instruments. Each research goal necessitates using the instrument or data collection procedure that is most appropriate for that purpose and most likely to bring about the desired results. Once the researcher has decided on the nature of the data to be collected, the second step is deciding on the way of collecting them. Mackey and Gass (2005) state that questionnaires are one of the most common methods of collecting data on attitudes and opinions from a large group of participants. However, as is evident from the literature below, these are not the only research areas in which questionnaires are used. Questionnaires are widely used in different areas of quantitative and qualitative studies, from controlling for language background and program evaluation to studies of motivation, personality factors, and anxiety, to name only a few. Dörnyei (2007), while emphasizing the role of questionnaires in qualitative research, considers them the main data collection method in surveys. Muijs (2004), on the other hand, dividing quantitative research into experimental and non-experimental types, focuses on questionnaires as descriptive experimental data collection devices.
Questionnaires, as Dörnyei (2003) defines them, are extremely versatile instruments capable of gathering a large amount of information quickly. Seliger and Shohamy (1989) categorize questionnaires among highly explicit and structured data collection instruments, although open-ended questionnaires are more appropriate for qualitative research purposes. Structured data collection procedures, according to Seliger and Shohamy (1989), unlike informal procedures such as field notes, determine in advance the specific focus of the data that will be sought.
The standard method of analyzing quantitative questionnaire data, as Dörnyei (2003) puts it, is to submit them to various statistical procedures. This involves entering the information obtained from questionnaires into the SPSS program and then running different statistical procedures.
On the other hand, pairing and grouping students are two techniques that researchers may want to use to investigate their hypotheses, especially in applied linguistics and other fields of the humanities. Different research purposes may call for the most or the least converging pairs or groups. Researchers may also be interested in selecting pairs or groups whose views are neither highly converging nor highly diverging but correspond moderately to each other.
ISSN 1925-4768 E-ISSN 1925-4776
For pairing or grouping students, the common method is to enter the questions in the questionnaire as variables and the students as cases in the SPSS program and then perform the statistical procedure most appropriate for the scores obtained, largely correlation, factor analysis, and chi-square. However, correlation tells us about the overall co-variance of the scores and nothing more. Factor analysis shows loadings and the effect that each independent variable has on the dependent variable. Chi-square, too, shows only the degree of difference between observed and expected scores. In the same way, looking at the total scores might tell us whether the collective view of the subjects under study is for or against the idea investigated by the researcher. The problem with all these procedures is that none of them by itself is enough to ensure that the paired subjects or groups are truly similar or dissimilar. Even taken together they do not make much sense, as they are disparate pieces of information whose relationship to each other is not clear.
On some occasions researchers might want to have the most converging or diverging pairs. On other occasions, however, they may wish to have groups of diverging or converging subjects, depending on their research questions.
In still other cases, researchers may be inclined to study pairs or groups whose views are neither highly convergent nor highly divergent but somewhere in between. The following procedure is an attempt to address these needs.
Suppose that a researcher is interested in finding out whether collaborative work by pairs of highly convergent students (say, students who have a very positive attitude toward English language) will result in better speaking skills (for example, by triggering more negotiation of meaning) compared to that of students whose views are highly divergent. Or think of a situation in which a researcher wants to pair his/her subjects based on some particular criterion in order to investigate their gains in writing. Regardless of the research purpose, when students are to be matched, the following procedure can be used to find the best matches.
In other research contexts, researchers may want to have two groups (not necessarily pairs) in which there is a relative one-to-one correspondence in students' motivation, so that the average motivation scores for the groups are very similar. Again, this procedure can help them divide the subjects into the two best matching groups.

Background to the Procedure
Examples of questionnaire use in applied linguistics are not difficult to find. The recent comeback of qualitative and descriptive studies has made the use of questionnaires even more popular. For example, Lee (2009) administered a questionnaire to 206 secondary teachers to investigate the effect of their beliefs on their practice. Chamot (2005, as cited in Cross, 2009) used a questionnaire to identify learners' learning strategies. Also, Pulido (2009) employed a self-reported strategy use questionnaire during the TW verification task, which was designed to examine motivation and cognitive involvement. In the same vein, Kato (2009) used a questionnaire to investigate the effectiveness of intervention quantitatively and qualitatively.
Language background questionnaires are another type of questionnaire frequently used by researchers where this factor needs to be controlled. For example, Derwing and Munro (1997) used a questionnaire to measure their subjects' degree of familiarity with languages other than English. White et al. (1997) gave a questionnaire to their subjects to determine their previous experience with English. De Groot (2006) had participants fill in a language questionnaire to assess their FL skills. Barcroft (2007) also used a language background questionnaire in investigating the benefits of providing opportunities for target word retrieval during second language vocabulary learning. And finally, prior to commencing his study, Cross (2009) administered a background questionnaire in order to identify and analyze factors that may influence the extent of comprehension of news videotexts.
Motivation, attitude, and background knowledge are not the only variables measured by questionnaires. Questionnaires can be applied in almost all kinds of research. Three examples of researchers who have used questionnaires for purposes other than those mentioned above are Byram and Risager (1999, as cited in Chapelle, 2009), Huang (2007), and Reinders (2009).
Stressing the importance of questionnaires would be a vain attempt, as every researcher is already well aware of the extent to which this instrument is used in research studies. However, as far as we know, using questionnaires as a device to find the best matching pairs or groups by integrating three pieces of information gathered from different sources has not been tried until now. In fact, what is given here as the background for this work is not a background in the usual sense, since this work has no precedent in the literature. However, examples of situations in which this procedure could have been employed will highlight its importance.
A good example of a situation in which this method could have been employed is the study by Baker and MacIntyre (2000). These two researchers examined the nonlinguistic outcomes of an immersion versus a non-immersion program. The dependent variables included attitude toward learning French, orientation for learning, willingness to communicate, etc. The researchers employed factorial analysis (MANOVA) to compare the results of the questionnaire analysis. However, it was possible to give another turn to the study, for example, by matching and dividing the subjects in each group to obtain more detailed information about subjects' behavior at the more converging and less converging ends of the groups. This would even have made it possible to compare subjects falling below the dividing line in the immersion group with subjects falling above the dividing line in the non-immersion group, to avoid sweeping conclusions. The combined results could have been obtained by separately investigating each dependent variable, rather than by submitting them to a program in which small values are crushed in favor of the overall loadings of important variables.
Another study which would have lent itself to this method is that of Lamb (2004). Lamb's study shows that the two constructs of integrative and instrumental motivation are almost indistinguishable in the Indonesian context, all the more so because of the powerful forces of globalization. Except for the eight items at the beginning of Lamb's questionnaire, the remainder of the items sought to uncover students' attitude and motivation to learn English. Although Lamb used complementary instruments such as interviews to unfold the general pattern of his subjects' activity and attitude, the results are largely founded on frequency counts of response types. The differences in response to single questions, or the differences between particular subjects, which could possibly open up new insights, are all canceled out by this collective approach.
The following are some of the other studies that motivated the present researchers to find a way of reliably matching or grouping subjects in descriptive studies. Culhane (2004), in his attempt to introduce a model to enhance understanding of SLA with regard to acculturation and motivation, uses questionnaires, typically involving Likert scale items, which ask subjects to agree or disagree with statements expressing ethnic identification between native and acquired cultures. Arumi (2006), investigating the self-regulation processes in consecutive interpreting learning in the context of formal teaching in the university sphere, used questionnaires to define a series of general characteristics all students share. Loewen et al. (2009) administered a questionnaire to investigate the beliefs of L2 learners regarding the role of grammar instruction and error correction. Blake (2009) administered an exit survey with Likert scale response items to each participant in his study upon completion of the posttest at the end of the study. The purpose of the questionnaire was to gain insight into how each of the learning environments was perceived by the participants who engaged in it. And finally, Cassanato (2008), exploring the influence of language on non-linguistic cognition, and in particular time, used a questionnaire to corroborate that English speakers tend to use distance metaphors to describe the duration of events whereas Greek speakers tend to use amount metaphors to describe the durations of the same events.

Procedure
In the following imaginary example, to make the analysis less complicated, only six students have answered the questions in a questionnaire using a Likert scale with five levels. The differences in students' views, as in most Likert scales, are ordinally measured. This analysis uses a six-stage procedure to find the best matching pairs. These stages are: 1. finding the differences in views between students on every single question, regardless of the direction of the difference; 2. finding the total between-student differences; 3. creating the total scores table; 4. creating the arranged total scores table; 5. finding the correlation between pairs of students' scores; and finally 6. assigning ranks to pairs' differences, correlations of scores, and the differences between subjects' total scores, and adding all three ranks up. The following simple imaginary example shows the procedure as it is applied to the answers given by the six students mentioned above to five questions in a questionnaire with a five-level scale. The levels range from 1 (the most negative) to 5 (the most positive) (see Appendix 1).
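Stages 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the answers below are hypothetical (they are not the values in Appendix 1), and the function names `question_differences` and `total_differences` are ours.

```python
from itertools import combinations

# Hypothetical answers on a 1-5 scale to five questions.
# These are NOT the Appendix 1 data; they only illustrate the arithmetic.
answers = {
    "A": [5, 4, 5, 4, 4],
    "B": [3, 2, 4, 3, 3],
    "C": [4, 4, 5, 4, 4],
}

def question_differences(answers, question):
    """Stage 1: absolute difference of each pair on one question (0-based index)."""
    return {
        (s, t): abs(answers[s][question] - answers[t][question])
        for s, t in combinations(sorted(answers), 2)
    }

def total_differences(answers):
    """Stage 2: per-question absolute differences summed over all questions."""
    return {
        (s, t): sum(abs(x - y) for x, y in zip(answers[s], answers[t]))
        for s, t in combinations(sorted(answers), 2)
    }

first_question = question_differences(answers, 0)
totals = total_differences(answers)
print(totals)
```

Taking absolute values is what makes the differences independent of direction, as stage 1 requires.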
Another way of calculating the total coordination of ideas is to use the total between-student differences, as represented in Table 1.
This table shows the total differences in views for all questions between pairs of students. It is as if we had superimposed all the above tables and added up all the scores that fall in cells above each other. In fact, in this table we have the sum of the differences in attitudes, or the distance between pairs of subjects' views, for all the questions in the questionnaire. So, the cell right below B, for example, shows how much the ideas of students A and B converge or diverge concerning all the questions in the questionnaire. From the table we understand that the students in two pairs, that is, AD and CD, have very similar attitudes compared to the students in the AE pair, whose attitudes are very different from each other. However, we cannot tell from the coordinated views of the pairs of students whether their views toward English language are strongly positive, strongly negative, or somewhere in between. Coordinated views are compatible with all of these possibilities, that is, with pairs of students having a strongly positive, strongly negative, or moderate attitude toward English language. But we can calculate the mean of the total differences to obtain a measure of the degree of homogeneity of views in our group as a whole. This measure will make it possible for us to compare our group with other groups whose attitudes are measured by the same questionnaire. For the above table the mean difference will be:
94 / 15 ≈ 6.27
where 94 is the grand total of differences and 15 is the number of cells. The maximum value for each cell in the above table is 20 (4 × 5), which would be reached if a pair of students disagreed completely on all questions. Therefore, we can obtain the overall coordination of views by dividing the mean value by the maximum possible difference in each cell and subtracting the result from 1, the value the quotient takes when the mean is equal to the maximum.
6.27 / 20 ≈ 0.31
1 - 0.31 = 0.69
As can be seen, the value for this coordination is very close to the value of mean coordination calculated above. The slight increase in the value is the result of rounding the numbers up.
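The same arithmetic can be checked with a short Python sketch. The figures 94, 15, and 5 come from the text; the function name `overall_coordination` and its parameters are our own labels, not part of the original procedure.

```python
def overall_coordination(grand_total, n_pairs, n_questions, scale_min=1, scale_max=5):
    """Return (mean pairwise difference, 1 - mean / maximum possible difference)."""
    mean_diff = grand_total / n_pairs
    # Largest possible total difference for one pair: full scale width on every question.
    max_diff = (scale_max - scale_min) * n_questions
    return mean_diff, 1 - mean_diff / max_diff

# Grand total of differences = 94, number of cells (pairs) = 15, five questions.
mean_diff, coordination = overall_coordination(94, 15, 5)
print(round(mean_diff, 2), round(coordination, 2))
```

For six students the number of cells is 6 × 5 / 2 = 15, which is why 15 appears as the divisor.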
To find out whether the agreements in views have been strongly positive, strongly negative, or somewhere in between, we copy the scores given by each subject to every question above each other, add them up, and then divide the result by the number of questions to see if it exceeds the median score. In this case the median score is 3. This procedure will enable us to predict with great precision the behavior of a student based on his/her best match. Table 2 shows the total scores of subjects.
For student A, for example, we have (Figure 1):
22 / 5 = 4.4
4.4 - 3 = 1.4 toward the positive end from the median
This is a highly positive attitude. So, if another subject's distance from this subject is small, we can claim that he/she too has a strongly positive attitude toward English language. In Table 1 (total between-student differences), for instance, we see that student A's difference from student D is smaller than A's difference from any other student. So, we can speculate that student D too has a highly positive attitude toward English language. Let us check this (Figure 2):
24 / 5 = 4.8
4.8 - 3 = 1.8 toward the positive end from the median
Now let us calculate the attitude measure of student F, whose difference from A is the largest (Figure 3):
18 / 5 = 3.6
3.6 - 3 = 0.6 toward the positive end from the median
One more way to group students according to their attitudes is to arrange them based on the overall scores they have given to the questions (the last row in Table 2). To this end, we have to arrange the total scores from the lowest to the highest. If we apply this to Table 2, the arrangement will be as in Table 3.
From Table 3, we can understand that E has the most negative attitude while D has the strongest positive attitude overall toward English language.
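The attitude measure and the arrangement of total scores can be sketched as follows. Only the three totals quoted in the text (A = 22, D = 24, F = 18) are used; the totals of B, C, and E are not reproduced here, so they are left out. The names `attitude_offset` and `midpoint` are our own.

```python
def attitude_offset(total_score, n_questions, midpoint=3):
    """Mean score per question minus the scale midpoint (positive = favourable)."""
    return total_score / n_questions - midpoint

# Totals quoted in the text for three of the six students.
totals = {"A": 22, "D": 24, "F": 18}

offsets = {s: round(attitude_offset(t, 5), 1) for s, t in totals.items()}
arranged = sorted(totals, key=totals.get)  # lowest total score first

print(offsets)
print(arranged)
```

A negative offset would place a student toward the negative end of the continuum.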
It is now the right time to make the best use of all the tables we have made so far. Just look at Table 1 (total between-student differences) and Table 3 (arranged total scores). Comparing these two tables makes it possible for us to extract some valuable information about the overall coordination of views between pairs of students concerning all questions. For example, student E, who has the most negative attitude toward English language according to Table 3, has the least difference from student B, as is obvious from Table 1. This means that, regardless of the individual scores they have given to the questions, the direction of their views concerning all questions is more similar than the correspondence of E's views to those of any other student.
As was mentioned above, differences tell us nothing about the direction of attitudes. That is, from differences we cannot say whether the students in a pair who have similar attitudes are close to the negative extreme of the continuum or to the positive end. Looking at students' overall scores, however, will tell us something about the direction of their attitudes. For example, two students who are both toward the positive end of the continuum will make a better match. But it is quite possible that these students disagree regarding some particular question(s). It seems that we need another piece of information, that is, the correlation between the sets of scores of the two students in each pair, to determine the degree of co-variance between their sets of scores. Therefore, to find the best matches between students we need three pieces of information: 1. the total differences between pairs, 2. their place on the positive/negative continuum, and 3. the correlations between pairs of students' scores.
This information can convincingly help us decide about the pairs at the first step, and about groups at later steps if we are going to group our students for a particular study. That is, two students make the best match if they have the smallest total difference, the closest possible distance on the negative-positive continuum, and the highest correlation between their scores on all questions. For groups too, the closer the attributes of students or pairs of students, the more homogeneous the group will be, and vice versa.
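The third piece of information can be computed as below. The text does not say which correlation coefficient was used, so Pearson's r is an assumption here; the three score sets are hypothetical and chosen only to show that equal totals and a small difference do not guarantee a high correlation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length score lists (assumed coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # undefined when a student gives the same answer to every question
    return cov / sqrt(var_x * var_y)

# Hypothetical students, not taken from the article's tables.
student_p = [5, 4, 5, 4, 4]
student_q = [4, 3, 4, 3, 3]  # always one point below P: same pattern of answers
student_r = [4, 5, 4, 5, 4]  # same total as P (22) but the opposite pattern

r_pq = pearson(student_p, student_q)
r_pr = pearson(student_p, student_r)
print(r_pq, r_pr)
```

Here P and R have identical totals and a total difference of only 4, yet their scores correlate negatively, which is why the correlation is kept as a separate source of information.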
To understand how this method works for the above students, we can make a table like Table 4, in which the differences are arranged from the least to the most and the correlations from the most to the least. Another table (Table 5), which is a copy of Table 3 (arranged total scores), is also given. The only thing left to do is to add up the three values for each pair according to their positions in the tables. For example, AD stands in the first place in the differences column and in the second place in the correlations column in Table 4. In Table 3 there is no distance between these two subjects, so the value extracted from this table is zero for this pair. Put together, the value for the AD pair is 1 + 2 + 0 = 3. But for CD the difference rank is 2, the correlation rank is 7, and the distance is 2. So, CD gathers a value of 11.
[Insert Table 4 & 5 Here]
Now, we find the ranks of all pairs in both columns and add them to their distance derived from Table 3. The smaller the outcome, the better the match will be. For the six subjects above the outcomes will be as follows:
A quick look at the differences and correlations columns will tell us that A and D, which have gathered the smallest value, have the least total difference and at the same time the second highest correlation value. Also, from Table 3 we understand that they are next-door neighbors. All this would convince us that A and D are the best matching subjects, who have the strongest positive attitude toward English language.
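The addition of the three values can be sketched as follows, mirroring the AD calculation 1 + 2 + 0 = 3. Several points are assumptions on our part: the four students W to Z and all their values are hypothetical, ties in rank are broken by input order, and the distance is read as the number of subjects standing between the two members in the arranged list (so that next-door neighbors score zero).

```python
def ordinal_ranks(values, reverse=False):
    """Rank keys 1..n by value; ties keep their input order (an assumption)."""
    ordered = sorted(values, key=lambda k: values[k], reverse=reverse)
    return {k: i for i, k in enumerate(ordered, start=1)}

def between_count(pair, arranged):
    """Subjects standing between the two members in the arranged totals list."""
    i, j = arranged.index(pair[0]), arranged.index(pair[1])
    return abs(i - j) - 1

def combined_values(differences, correlations, arranged):
    diff_rank = ordinal_ranks(differences)                 # smallest difference = rank 1
    corr_rank = ordinal_ranks(correlations, reverse=True)  # highest correlation = rank 1
    return {
        pair: diff_rank[pair] + corr_rank[pair] + between_count(pair, arranged)
        for pair in differences
    }

# Hypothetical data for four students (not the article's Tables 3-5).
differences = {("W", "X"): 2, ("W", "Y"): 6, ("W", "Z"): 9,
               ("X", "Y"): 5, ("X", "Z"): 8, ("Y", "Z"): 3}
correlations = {("W", "X"): 0.7, ("W", "Y"): 0.2, ("W", "Z"): -0.5,
                ("X", "Y"): 0.4, ("X", "Z"): -0.1, ("Y", "Z"): 0.9}
arranged = ["Z", "Y", "X", "W"]  # lowest total score first

values = combined_values(differences, correlations, arranged)
print(values)
```

In this toy data the pair WX behaves like AD in the text: first in differences, second in correlations, and adjacent in the arranged list, giving 1 + 2 + 0 = 3.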
One might ask why we do not simply use the overall scores of students to judge their attitudes. The point is that, by considering only the overall scores, as was said above, we in effect exclude information about differences in their scores on individual questions, which can be averaged out in the total scores. The following example shows that while the overall scores of the students are the same, their scores on two items are completely different.
[Insert Table 6 Here]
Interestingly enough, from the values we calculated for the matches, and by comparing them with Table 3 (arranged total scores), we understand that while C and D make the second best match, they are not next-door neighbors in this table. This is because the other variables (correlation and difference of scores) also play their own role in determining the exact degree of convergence or divergence between pairs. This method will enable us to identify the most converging or diverging pairs of students in most descriptive studies in which a questionnaire is used. Information should be collected about the position of the scores on the positive/negative continuum, the correlation between the sets of scores, and the total differences between scores. For example, in our imaginary study above, students AD, CD, and BF make the most converging and highly positive pairs, while students DE, DF, and AF make the most divergent pairs toward the negative end of the continuum. It is clear that, once we assign a student to a pair, all the other possibilities for that student vanish and we should look for matches among the other students, of course, in the same direction. In the example above, making the first match excludes 8 other possibilities. Making another match will exclude 4 more possibilities. These two matches, together, will account for 14 of the 15 possibilities. So, as we make pairs, the number of possibilities for making other pairs quickly shrinks, so that the number of pairs will in effect be equal to the number of students divided by two.
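One way to read this step is as a greedy selection: take the pair with the smallest combined value, remove both students from the pool, and repeat. The sketch below follows that reading, which is our interpretation rather than a procedure spelled out in the text; the combined values are hypothetical, and the helper `excluded_by_match` simply reproduces the counts 8 and 4 given above.

```python
def greedy_pairs(values):
    """Choose pairs from the smallest combined value upward, using each student once."""
    chosen, used = [], set()
    for pair in sorted(values, key=values.get):
        if used.isdisjoint(pair):
            chosen.append(pair)
            used.update(pair)
    return chosen

def excluded_by_match(n_unmatched):
    """Other candidate pairs ruled out when one pair is fixed among n unmatched students."""
    return 2 * (n_unmatched - 2)

# Hypothetical combined values for four students.
values = {("W", "X"): 3, ("W", "Y"): 9, ("W", "Z"): 14,
          ("X", "Y"): 6, ("X", "Z"): 11, ("Y", "Z"): 3}

pairs = greedy_pairs(values)
first, second = excluded_by_match(6), excluded_by_match(4)
print(pairs, first, second)
```

With six students, the first match rules out 8 other pairs and the second rules out 4 more; together with the two chosen pairs this accounts for 14 of the 15 candidates, leaving exactly one pair for the last two students.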
Based on this information we can divide the six students in the imaginary study above into three pairs forming negative, moderate, and positive attitudinal groups. The smaller the overall number calculated for a pair, the stronger the convergence in attitude and the relationship between the individuals in the pair will be. The opposite is true for the bigger numbers. However, since in this study we had only six students, the three arrangements for pairs from the top, in the column below, will constitute our options for the most converging pairs, and the three pairs from the bottom will be available for the most diverging pairs. As we move toward the middle of the column, the matches' attitudes become more moderate. The moderate attitude of this central group is another quality which would be of interest to researchers (Figure 4). The question of how to use this method to make diverging or converging groups is an easy one to answer. If we had, say, twenty matched subjects and arranged them from the smallest value to the biggest one, we could simply draw a line vertically between the subjects to have two very similar groups. If, on the other hand, we wanted to have two completely diverging groups, the line should be drawn horizontally between the fifth and sixth pairs. The imaginary data below show how these two types of arrangements can be made. All the letters representing subjects in the groups are used arbitrarily here (Figure 5).
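The two ways of drawing the line can be sketched as follows. The twenty subject labels S1 to S20 are hypothetical, and the list of pairs is assumed to be already ordered from the smallest to the biggest combined value.

```python
def split_vertically(pairs):
    """Two similar groups: one member of every pair goes to each group."""
    left = [a for a, _ in pairs]
    right = [b for _, b in pairs]
    return left, right

def split_horizontally(pairs):
    """Two contrasting groups: top half of the ordered pairs versus bottom half."""
    middle = len(pairs) // 2
    top = [s for pair in pairs[:middle] for s in pair]
    bottom = [s for pair in pairs[middle:] for s in pair]
    return top, bottom

# Ten hypothetical pairs (twenty subjects), ordered by combined value.
pairs = [(f"S{2 * i + 1}", f"S{2 * i + 2}") for i in range(10)]

left, right = split_vertically(pairs)
top, bottom = split_horizontally(pairs)
print(left, right)
print(top, bottom)
```

With ten pairs, the horizontal split falls between the fifth and sixth pairs, as in the description above.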

Conclusion
Researchers opting for quantitative non-experimental or qualitative research methods are heavily reliant on questionnaires as a means of data collection. At the same time, detailed information about the composition of subjects in all types of research will undoubtedly enhance the quality of research. One big advantage of the above procedure is that it takes different pieces of information about individual subjects into account and makes it possible for researchers to have an in-depth understanding of the group or groups they are studying, which is a major tenet of qualitative research. The logic provided here for the matching and grouping of subjects seems to be a strong one, which makes the procedure a worthwhile step to take, especially in studies dealing with matched pairs or groups, or with diverging and converging ones.

Table 5. Arranged total scores