A Comparison between the Difficulty Level (Readability) of English Medical Texts and Their Persian Translations

Foreign written materials are widely used in Iran's healthcare industry, but there appears to be a significant difference between the difficulty level of the original texts and that of their translations. This study compares the readability level of English medical texts with that of their corresponding Persian translations. Fifty translated booklets and their corresponding English texts were assessed; all of these booklets are translated versions of BMA publications and are kept in Iran's National Library. The texts were compared using the Gunning Fog Index and the SMOG Readability Index Grade, and the data obtained from the English medical texts and their Persian translations were then tested for significant differences. A significant difference was observed between the English medical texts and their corresponding Persian texts in the number of multi-syllable words and in readability scores, but no significant difference was observed in the number of words or sentences. Therefore, it is necessary to omit needless words, use fewer complex (multi-syllable) words, and use shorter sentences.


Introduction
Written texts and translated texts are powerful tools for developing knowledge, science and technology. The readability of written and translated texts plays a very important role in transferring information. Seeking scientific methods for increasing the readability level of written materials is of vital importance for any industry. Because the healthcare industry is paramount to any nation's economy, and a vibrant healthcare industry ensures a healthy nation and a flourishing lifestyle for its citizens, it is vital to adopt simple modes of communication between healthcare professionals and their patients. One way to improve communication in healthcare is to apply readability formulas to documents as a means of communicating effectively with staff, patients, and healthcare suppliers. Translations of such texts should also be readable and share the same features as the originals.
It should be noted that readability formulas cannot evaluate all the features that promote readability. They measure certain features of a text that can be subjected to mathematical calculation. These formulas are usually based on one semantic factor (the difficulty of words according to their length in characters or syllables) and one syntactic factor (the difficulty of sentences according to their length in characters or words). Not all features that promote readability can be measured mathematically, and these equations cannot measure comprehension directly. Readability formulas are therefore considered predictions of reading ease rather than the only method for determining readability, and they do not show how well the reader will understand the ideas in the text. Readability formulas are being increasingly used to measure the understandability of written information in industrial sectors, and they are also used very frequently in clinical and health settings. This paper compares English medical texts with their Persian translations by means of the Gunning Fog Index and the SMOG Readability Formula in order to assess whether the readability level of the translated texts is the same as that of the originals.
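The two formulas named above can be written out as a short sketch. The study does not print them, so the versions below are the commonly published forms of the Gunning Fog Index and the SMOG grade, and they are an assumption about which variant was applied; the function and parameter names are illustrative only. Both combine a word-level factor (words of three or more syllables) with a sentence-level factor (sentence length or sentence count).

```python
import math


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index in its commonly published form.

    complex_words: number of words with three or more syllables.
    """
    average_sentence_length = words / sentences
    percent_complex = 100.0 * complex_words / words
    return 0.4 * (average_sentence_length + percent_complex)


def smog_grade(polysyllables: int, sentences: int) -> float:
    """SMOG grade in its commonly published form.

    polysyllables: number of words with three or more syllables,
    normalised to a 30-sentence sample.
    """
    return 1.0430 * math.sqrt(polysyllables * (30.0 / sentences)) + 3.1291


# Hypothetical counts for one passage (not data from the study).
fog = gunning_fog(words=100, sentences=5, complex_words=10)
smog = smog_grade(polysyllables=30, sentences=30)
print(round(fog, 2), round(smog, 2))
```

Both scores are read as the approximate number of years of education needed to follow the text, so a higher score indicates a more difficult passage.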

Review of Literature
Some texts are difficult to read and understand. A text that is difficult to read may be unique in content, yet it fails to serve its purpose of making the reader understand and use it. Authors, writers, and scholars therefore need to assess the readability of their written materials (Dawson, 2008). The problem many writers face is how to assess the "readability" of their text. Over time, different methods have been developed to predict the reading difficulty of written materials objectively, and readability formulas are one of them. By applying mathematical principles, readability formulas aim to present an objective analysis of the readability of a particular text. A readability formula is simply a mathematical equation derived by regression analysis.
Many studies worldwide have assessed the readability of written texts using readability formulas. Freda et al. (1999) evaluated whether the patient education pamphlets of ACOG (American College of Obstetricians and Gynecologists) comply with the recommended readability level for health education materials intended for the general public. They used four formulas (the Fry graph, the Flesch formula, Gunning's Fog formula, and McLaughlin's SMOG formula) to evaluate these texts. Roger E. Alexander (2000) assessed the readability of dental educational materials using a computer-based program that assigns a reading level on the basis of the Flesch-Kincaid Formula. John K. Courtis and Salleh Hassan (2002) assessed the readability level of bilingual annual reports. In their study, the reading ease of different language versions of narrative disclosures within corporate annual reports was evaluated using the Flesch and Yang formulas for Hong Kong and the Flesch and Yunus formulas for Malaysia. Forbis et al. (2002) evaluated the readability of written asthma management plans (WAMPs). They used the Flesch grade level, Dale-Chall, Powers-Sumner-Kearl, FOG, SMOG, and FORCAST formulas to analyze 10 WAMPs (7 from the national guidelines, 1 from the World Health Organization, and 2 local ones). King et al. (2003) evaluated the readability level of mental health internet brochures for children and adolescents by means of the SMOG formula. Rachel E. Myers and Felisha Shepard-White (2004) assessed the readability level of psychotropic medication handouts using the SMOG formula and the Readability Assessment Instrument (RAIN). Professor John C. Hall (2005) assessed the readability level of original articles in medical journals using the Flesch formulas. Hendrickson et al. (2006) examined the content and general readability of pediatric oral health education materials for parents of young children by means of three formulas: Flesch-Kincaid grade level, Flesch Reading Ease, and SMOG grade level. Christopher et al. (2007) evaluated the readability of consent forms using four readability formulas: the Flesch Reading Ease Score (FRES), the Flesch-Kincaid Grade Level Index (FKGL), the Fog Index, and the Fry Graph. Reinhold et al. (2008) assessed the readability level of printed materials about dementia and related diseases by order of the International Psychogeriatric Association. In that study, the readability level of 118 brochures on dementia and related disorders was assessed by means of the SMOG readability index grade.
Like the work of Courtis and Hassan (2002), who compared the readability level of bilingual annual reports, this study compares the difficulty level of English medical texts and their corresponding Persian translations. The passages are scored using two formulas, the Fog and SMOG grading formulas.

Objectives of the Study
Using foreign written materials in Iran's healthcare industry is very common, so translation clearly plays an important role. It seems that translated medical texts do not have many users (Zaker, 2006). In most cases, the number of words and sentences used in the translated texts is many times greater than in the original texts (Zaker, 2006). Also, many medical terms are not translated into Persian; their original forms are used in the translated texts with slight phonetic adaptation. Such problems affect both ordinary people and those professionally engaged in the healthcare and medical industry. Ordinary people may be unable to understand translated medical texts correctly because of untranslated words and expressions and long sentences, so most of these texts are of little practical use to them. For professionals, these problems greatly increase the tendency to read English texts, because they prefer to read medical texts in the original form to understand the contents better. Professionals who do not want to read the original text still need to be thoroughly familiar with English in order to read and understand the untranslated terminology used in translated Persian texts. Solving such problems may therefore be of great importance for Iran's society from both social and economic points of view. Producing more readable translated medical texts can help. Readability formulas can be effective here because they help translators count the words, multi-syllable words, and sentences they use, compare these counts with those of the original text, and avoid long words and sentences, which can make the text ambiguous. In this study, two readability formulas, the Gunning Fog Index and the SMOG Readability Formula, are used to compare the difficulty level of English medical texts and their corresponding Persian translations. The article accordingly seeks to answer the following research questions:
- Is there any significant difference between the number of words in English medical texts and the number of words in the corresponding Persian texts?
- Is there any significant difference between the number of multi-syllable words in English medical texts and the number of multi-syllable words in the corresponding Persian texts?
- Is there any significant difference between the number of sentences in English medical texts and the number of sentences in the corresponding Persian texts?
- Is there any significant difference between the difficulty level (readability) of English medical texts and their Persian translations in terms of the number of words, multi-syllable words, and sentences?

Method
This study is descriptive comparative research; it follows the procedures of such studies and uses a body of techniques for investigating the readability of English medical texts and their corresponding translations. The method of inquiry is based on gathering measurable data that are subject to specific principles of reasoning. Data are collected through observation and assessed via a set of formulas.
The corpus consists of 50 translated medical booklets and their corresponding English texts. The original texts are published by the British Medical Association (BMA) and deal with public health. Their Persian translations were produced and published by different translators and publishers and are kept in the National Library of Iran. There are 296 translated BMA medical booklets in the National Library. The corpus was selected from among these booklets by systematic random sampling (approximately 1 booklet was randomly selected out of every 6). The paragraphs to be assessed were then also selected by systematic random sampling. According to Ken Black (2004), systematic sampling is a statistical method involving the selection of elements from an ordered sampling frame. The most common form of systematic sampling is an equal-probability method, which is as follows: $k = N / n$, where k is the number of elements (the paragraphs to be assessed in each book), n is the sample size, and N is the population size. Using this procedure, each element in the population has a known and equal probability of selection. In this study, the equal-probability method is used to determine how many paragraphs should be selected from each booklet: $k = 296 / 50 = 5.92 \approx 6$. Therefore, 6 paragraphs are selected from each translated booklet (e.g., in a book of 180 pages, 6 paragraphs are selected, 1 paragraph every 30 pages), and the required data (the numbers of words, complex words or words with more than 3 syllables, and sentences) are then obtained from these translated texts and their corresponding English texts. These data are then entered into the Gunning Fog Index and the SMOG Readability Index Grade to obtain readability scores. Next, the data obtained from each book (number of words, multi-syllable words, and sentences, and Fog and SMOG scores) are measured and checked for normal distribution. Finally, the number of words, multi-syllable words, and sentences and the difficulty level (readability) of the English medical texts and their corresponding translations are compared via the t-test and the Mann-Whitney U test to find out whether their differences are significant.
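The booklet selection step described above can be illustrated with a minimal sketch of systematic sampling. It assumes a random starting point within the first interval and selection of every k-th booklet thereafter; the function name, the seed, and the zero-based booklet indices are hypothetical and not taken from the study.

```python
import random


def systematic_sample(population_size: int, sample_size: int, seed=None):
    """Return the sampling interval k and the selected zero-based indices."""
    # Sampling interval k = N / n, rounded to the nearest whole number.
    k = round(population_size / sample_size)
    rng = random.Random(seed)
    # Random start inside the first interval, then every k-th element.
    start = rng.randrange(k)
    indices = list(range(start, population_size, k))
    return k, indices[:sample_size]


# Figures from the study: 296 translated booklets, 50 to be selected.
k, chosen = systematic_sample(population_size=296, sample_size=50, seed=1)
print(k, len(chosen), chosen[:5])
```

Because 5.92 is rounded up to 6, some starting points reach the end of the list after 49 booklets rather than 50, so a practical procedure would need a rule for drawing the last booklet in those cases.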

Results
This section summarizes and analyzes the data collected. The data are reported in sufficient detail to justify the conclusions. Descriptive statistics are presented in tables to give a better understanding of the research findings. As mentioned earlier, the corpus and the required paragraphs are selected through systematic random sampling. The required data (the numbers of words, complex words, and sentences) are then obtained, and their means are calculated for each book separately. These means, which are based on the 300 paragraphs selected from the books, are presented in Table 1.
After the total mean score of each book was calculated by taking the mean of all its paragraphs' means, the normality of the distribution of the data for each factor was checked separately to justify running t-tests. Descriptive statistics for the two groups are presented in Table 2. A statistic, N, is calculated to determine whether each set of data is normally distributed. For the number of words, N is outside the allowed range of -1.96 to 1.96, i.e., N > 1.96, so it can be concluded that this set of data is not normally distributed. Therefore, the Mann-Whitney U test should be used to determine whether there is a significant difference between the number of words in English medical texts and their corresponding Persian translations.
For the number of multi-syllable words, N is also outside the allowed range of -1.96 to 1.96, i.e., N > 1.96. This set of data is therefore not normally distributed either, and the significance of the difference between the numbers of multi-syllable words in the two groups is measured by the Mann-Whitney U test.
For the number of sentences, N falls within the normal range of -1.96 to 1.96, i.e., -1.96 < N < 1.96. It can therefore be concluded that this set of data is normally distributed, and the significance of the difference between the number of sentences in English medical texts and their corresponding Persian translations can be measured via a t-test. For the Fog scores, N also falls within the normal range, i.e., -1.96 < N < 1.96, so this set is normally distributed as well and a t-test can be used to determine whether the groups' Fog scores differ significantly. For the SMOG scores, N again falls within the normal range, i.e., -1.96 < N < 1.96. This set of data is therefore normally distributed, and a t-test can be used to determine whether there is a significant difference between the SMOG scores of English medical texts and their corresponding Persian translations. Now that the normality of each set of data has been established, the significance of the differences between the two groups can be measured. As mentioned earlier, the Mann-Whitney U test and the independent t-test are used for data sets with non-normal and normal distributions, respectively.
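The decision rule described above (a statistic compared with the range -1.96 to 1.96, followed by a choice between the two tests) can be sketched as follows. The equation used in the study is not reproduced in this text, so the sketch assumes one common choice: the sample skewness divided by its standard error. That assumption, the function names, and the example values are illustrative and are not the study's own calculation.

```python
import math


def skewness_z(values):
    """Sample skewness divided by its standard error (an assumed statistic)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    g1 = m3 / m2 ** 1.5
    # Bias-adjusted sample skewness and its standard error.
    adjusted = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    standard_error = math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return adjusted / standard_error


def choose_test(values, cutoff=1.96):
    """Pick the comparison test according to the normality check."""
    if abs(skewness_z(values)) < cutoff:
        return "t-test"
    return "Mann-Whitney U"


# Hypothetical data sets (not values from the study).
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 50]
print(choose_test(symmetric), choose_test(skewed))
```

A check of this kind only screens for asymmetry; it is one of several ways to justify the choice between a parametric and a rank-based test.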
Table 3 presents the statistics of the two groups, which reflect the difference between the two groups in the number of words and multi-syllable words. This table can be used to compare group statistics such as the mean. Table 4 presents the significance of the differences obtained from comparing the number of words and multi-syllable words in English medical texts and their corresponding Persian translations. As these sets of data are not normally distributed, the Mann-Whitney test is used to determine the significant difference between the data of the two groups. This table shows that the difference between the number of words in English medical texts and their corresponding Persian translations is not significant, i.e., P is larger than 0.05 (P = 0.436). For the number of multi-syllable words, however, P is smaller than 0.05 and the difference is significant (P = 0.00).
Although the difference between the number of words in the two groups was not significant, the mean ranks of the two groups (52.76 and 48.24 for the Persian and English texts, respectively) indicate that the number of words in the Persian texts was higher than in the English texts. In addition, based on the mean ranks of the two groups (70.15 and 30.85 for the Persian and English texts, respectively), the number of multi-syllable words in the Persian texts was significantly larger than in the English texts.
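As a rough cross-check, the group values reported above can be entered into the normal approximation of the Mann-Whitney U test. The sketch assumes that 52.76/48.24 and 70.15/30.85 are mean ranks (each pair sums to 101, which is N + 1 for 50 + 50 booklets), that each group contains 50 booklets, and that no tie correction is applied; the function name is illustrative and not from the study.

```python
import math


def mann_whitney_from_mean_ranks(mean_rank_a, mean_rank_b, n_a, n_b):
    """Return (U, z, two-sided p) using the normal approximation, no tie correction."""
    # U for each group is its rank sum minus the smallest possible rank sum.
    u_a = mean_rank_a * n_a - n_a * (n_a + 1) / 2
    u_b = mean_rank_b * n_b - n_b * (n_b + 1) / 2
    u = min(u_a, u_b)
    expected_u = n_a * n_b / 2
    sigma_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - expected_u) / sigma_u
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Number of words: Persian versus English texts.
u_words, z_words, p_words = mann_whitney_from_mean_ranks(52.76, 48.24, 50, 50)
# Number of multi-syllable words: Persian versus English texts.
u_multi, z_multi, p_multi = mann_whitney_from_mean_ranks(70.15, 30.85, 50, 50)
print(round(p_words, 3), p_multi < 0.001)
```

Under these assumptions the p-value for the number of words comes out close to the reported P = 0.436, and the p-value for multi-syllable words is far below 0.05, which agrees with the reported P = 0.00.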
Table 5 presents the statistics of the two groups, which reflect the difference between the mean number of sentences, Fog scores, and SMOG scores. This table can be used to compare group statistics such as the mean.
This table reflects the significance of the difference between the two groups in terms of the number of sentences and the Fog and SMOG scores. As these sets of data are normally distributed, an independent-samples t-test is used to determine whether there is a significant difference between them.
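The independent-samples comparison mentioned above can be sketched as a pooled-variance t statistic. This is a minimal illustration that assumes equal variances in the two groups; the function name and the sample values are hypothetical, and the study's own per-booklet data are not reproduced here.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, n_a + n_b - 2


# Hypothetical sentence counts for two small groups (not data from the study).
t, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

With 50 booklets per group there are 98 degrees of freedom, and the difference is significant at the 0.05 level when the absolute value of t exceeds roughly 1.98.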
Table 6 shows that the difference between the number of sentences in English medical texts and their corresponding Persian translations is not significant, i.e., P is larger than 0.05 (P = 0.405). Even so, the number of sentences in the Persian texts was still slightly higher than in the English texts: the mean number of sentences is 6.2016 for the English medical texts and 6.2674 for their Persian translations. Although no significant difference was observed between the number of words and sentences in English medical texts and their corresponding Persian translations (P = 0.436 and 0.832, respectively), the differences between the Fog and SMOG scores of the two groups were significant (P = 0.00).
As mentioned earlier, readability concerns the degree of "ease" of a text, and readability formulas are measurement tools that estimate how many years of education are required to read and understand a text.
Readable texts are powerful tools for transferring knowledge, so they are very important in any industry. Producing readable texts is therefore of vital importance for the health industry, as for any other industry. Readability formulas are powerful tools for evaluating and increasing the readability level of healthcare texts. It is also of great importance that translated medical texts have the same readability level as their corresponding original texts, so that target readers can read them with the same ease as native readers. Readability formulas can also be used as an effective tool for comparing translated texts with their corresponding originals. In this research, the Gunning Fog Index and the SMOG Readability Formula were used to compare the readability level of English medical texts and their corresponding Persian translations, since it seems that the numbers of words, multi-syllable words, and sentences in Persian texts exceed those in the English texts. The Mann-Whitney U test and the t-test were used to determine whether there is a significant difference between the data obtained from the English and Persian texts.

Discussion and Conclusion
The main objective of this study was to find out whether there is a significant difference between the data obtained from the two groups in the sample. The analysis showed that the numbers of words and multi-syllable words in English medical texts and their corresponding Persian translations were not normally distributed.
As stated above, the number of sentences and the Fog and SMOG scores in English medical texts and their corresponding Persian translations were normally distributed, so a t-test was used to determine whether they differ significantly between the two groups. For the number of sentences in English medical texts and their corresponding Persian translations, the difference turned out not to be significant (P = 0.835). The difficulty level (readability) of the Persian texts, however, is significantly higher than that of the English texts. This means that although there is no significant difference between the number of words and the number of sentences in the English medical texts and their corresponding Persian translations (only the number of multi-syllable words differs significantly), there is still a significant difference between their readability levels. This may also point to the role of other factors, not considered in this study, in readability. However, multi-syllable words play a greater role in readability than the other two factors.
A readable text is one that uses simple, direct, economical, and familiar language; it is free of needless words; its sentence structures are evident and unambiguous; its sentences are not too long; and the organization and structure of its sentences are orderly and logical (Stephens, 2000). So, to make Persian translated medical texts more readable and bring their readability level into line with that of their corresponding original English texts, it is necessary to use direct and understandable language, omit needless words, use fewer complex (multi-syllable) words, and use shorter sentences. Using this information, a model could be suggested to improve the readability of written texts (see Figure 1).


Table 1. Means of Data Obtained from English Medical Texts and Their Corresponding Persian Translations

Table Map:
MNWPT: Mean of the Number of Words in Persian Texts.
MNWET: Mean of the Number of Words in English Texts.
MNMSWPT: Mean of the Number of Multi-Syllable Words in Persian Texts.
MNMSWET: Mean of the Number of Multi-Syllable Words in English Texts.

Table 3. Two Groups Statistics in terms of the Number of Words and Multi-Syllable Words

Table 4. Mann Whitney U Test Results

Table 5. Two Groups Statistics in terms of Number of Sentences and Fog and SMOG Scores

Figure 1. A Model for Producing More Readable Texts