A Corpus-based Lexical Study of Sermons in Nigeria

Religious sermons play a very significant role in a pluralist setting like Nigeria, as organs of social, political and moral education. The fulfilment of these functions is partly contingent on the effective use of language. If the sermon-giver and his audience draw from the same pool of lexis, communication in the genre will not only be enhanced but its teaching and practice will also align, thereby bridging the gap between the two. It is to this end that this study attempts to identify the words most associated with Christian sermons in English, in Nigeria. To carry out this study, a corpus of present-day sermons in Nigeria was constructed and compared to a reference corpus of sermons from other parts of the world, in order to find out those lexical items which are characteristic of sermons, whether in terms of types, frequency or usage. The study reveals the role of textual context to be the definition of thematic focus because, although the sermon words in the two contexts manifest high similarity, their degrees of significance in the contexts differ substantially. Additionally, the findings reveal diversity, both in lexical choice and the discourse structure of sermons. However, in both contexts, the sermons demonstrate similarity in semantic grouping. So, for the teacher of English for religious purposes (ERP) and the user of English in church contexts, this work offers insights into the lexical world of sermons in Nigeria.


Introduction
Studies of English for Special Purposes (ESP) define it as the language used for a utilitarian purpose, whether occupational, vocational, academic or professional. In this context ESP elevates English to the status of an instrument of specialized communication. A number of approaches to English language teaching (ELT) have emerged, including English for Science and Technology (EST), English for Occupational Purposes (EOP) and English for Academic Purposes (EAP). According to Mackay & Mountford (1978) and Swales (1990), all the variants of ESP have a singular aim: communication efficiency above and beyond pedagogic effectiveness. Concerning the justification for ESP, Long and Richards, in the preface to Swales (1990, p. vii), write that: the discourse communities such as academic groupings (also professional and occupational) of various kinds are recognized by the specific genres that they employ [...]. The work that the members of the discourse communities engage in involves the processing of tasks which reflect specific linguistic, discoursal and rhetoric skills.
In relation to the present study, the church constitutes the discourse community while the use of language by ministers, including sermons, represents a genre, in other words a discourse type. Restating the importance of ESP studies, Swales (1990, p. 3) himself observes that ESP-type analyses have become narrower and deeper. This narrowing, he says, is compensated for by an interest in providing a deeper or multi-layered textual account. As a result, there is growing interest in assessing rhetorical purposes, in unpackaging information structures and in accounting for syntactic and lexical choices. Moreover, the resulting findings are no longer viewed simply in terms of stylistic appropriacy but, increasingly, in terms of the contributions they may or may not make to communicative effectiveness.
Nigeria to give insight to its tradition and language while paving the way for a better understanding of the entire discourse.
The first Christian contact in Nigeria occurred in the fifteenth century when the Portuguese, Augustinian and Capuchin monks, introduced Roman Catholicism. However, it was not until the Roman Catholic missionaries came in the 1800s that sustainable Catholic Church growth occurred, mainly in the south-eastern parts of Nigeria. Then, in 1842 the first Protestant missionaries, Henry Townsend and Samuel Ajayi Crowther, came to Nigeria, now to the south west; these were Wesleyan Methodists. Soon after, other Protestant groups followed. Today, there is a proliferation of churches in Nigeria; some maintain the tradition of their founding missionaries and some are entirely locally founded. Whatever the mission and their tradition, each has planted large and vibrant churches whose congregations are numbered in millions (http://www.ecwaevangel.org/living_in_nigeria/HistoryOfChristianity). The growth of Christianity in Africa [Nigeria] has been very spectacular and it is adjudged the continent with the fastest Christian growth in the world (Ezeogu 2000). But, as Ezeogu observes, Christian tradition did not come into a traditionally [even linguistically] vacuous society in Nigeria. Rather, it met a multiplicity of cultures, traditions and languages which expectedly impacted it and are impacted by it. Ezeogu examines the intriguing relationship of African cultures and the Christian tradition and finds that there are two contending Bible-culture relationships in African Christianity: the dialectic model and the dialogic model. In the dialectic model, the Bible and culture are seen as irreconcilable while in the dialogic model the Bible and culture are considered compatible. However, in truth, culture and Christianity may not be so polarised. While Ezeogu was interested in the intriguing relationship between Christianity and culture, the interest of this paper is to see how the English language is able to carry the weights of both traditions in sermons. The major focus is to examine how meaning is created through sermon words by investigating the underlying contextual and cultural factors which determine them.

Religion and Language
In Nigeria a rich literature exists on the use and role of language in religion, and in sermons in particular. Some of those works are discussed here to show the direction of past research and to affirm the need for the present work. Hackett (1988) observes that Nigeria is a pluralist country in which religion is a major fact of life which cannot be ignored. Williams (1997) also affirms this and argues that there has been enough religious unrest in Nigeria to suggest that religion has become a major factor in the country's contemporary body politic. Similarly, Oguntola-Laguda (2008) notes that, in a religiously homogeneous society, the aftermath of the interaction between religion and politics is political stability, but that is not the case with pluralistic societies like Nigeria where heterogeneity is a cause of religious and political crises. But, as he argues, we must recognize the symbiosis between religion and politics in Nigeria and how the polity stands to benefit from it. This is where the role of language comes in. Krolick (2010) notes its role and power in crisis management. Ker (2007) echoes these points by showing how religious songs and messages are often coloured by this world's concerns, thereby addressing societies' socio-political and economic issues, such as corruption, peace, stability, integration and development. Meanwhile Taiwo (2007) examines the social role of the preacher relative to listeners: how the speaker makes linguistic choices in order to achieve persuasion, is in total control as the knower or expert, and assumes his audience are non-knowers, so that his messages are characterized by information and directives.
In none of these attempts is the character of religious language observed from as extensive a population as the corpus of sermons this work covers by investigating the keywords of sermons and examining their characteristics.

Theoretical Orientation
The study is based on the lexical theory of J. R. Firth (1934-1951), who observed that the collocations of established 'key' or 'pivotal' words, when supported by reference to contexts of situation, may constitute material for syntactic analysis. Firth consistently argued that words in company may be said to have physiognomy, because words change their manners when changing locale. He believed words and their structural relations must be studied in context, because a word's meaning is conditioned by others in its vicinity, providing a new way to describe language. These ideas have been propagated and expanded by notable linguists including Halliday (1966, 1991), Sinclair (1987, 1991) and Leech (1981, 1991); all support a theory of language in which meaning is dispersed at different levels and contexts of language. Sinclair harnessed these ideas into a model for lexical description, as adopted in this study. Sinclair (2004) proposes an alternative model of the lexicon which projects lexical items as a higher rank of lexical structure, above words. The model, extended units of meaning, presents five categories of description for any lexical item, two compulsory, three optional. The obligatory components are the core and semantic prosody. The optional categories serve to fine-tune the meaning and cohesion of a text: collocation, colligation and semantic preference. In describing the lexis of sermons, this study focuses on the major category: the core, that is, the node word, the invariable occurrence of a lexical item. The aim is to identify lexical items peculiar to sermons in Nigeria, be they in terms of type, frequency or usage, in order to enhance communication in the genre and determine how much influence the context has on language.

The Corpus Method
Two specialized corpora were designed for this work. A corpus of Nigerian sermons in English, consisting of 4,816 running words, was constructed for this research. This was compared to joint reference corpora of sermons from the UK and America comprising approximately 15,000 words. These dual reference corpora facilitate validation of the key lexis of sermons in Nigeria. The investigation entailed examination of keywords obtained through frequency analysis of Nigerian versus British and American sermons. Thus we can see what lexical choices are determined by the Nigerian context and any symmetries or asymmetries. The Nigerian sermons were sourced from the published works of pastors; the British and American sermons were located via the Internet. For precision in analysis and the reliability of findings, and to facilitate the handling of copious data, the study uses WordSmith 5 analytical software, with the keyword tool used to determine those words demonstrating lexical salience in each context.

The Keyword Method
The notion of keywords, originated by Firth (1957), was substantially developed by Williams (1976, 1983) and Scott (1997, 2000). Firth described the importance of pivotal or focal words in language, while Williams spoke of significant binding words in texts, using the term keyword to describe this. Both of them used the term to refer to basic words in culture and society. But Scott (1997, 2000, 2006) modified the term keyword to stand for words which are particularly common or uncommon in a text or group of texts in relation to certain norms. Hence the notion becomes extremely powerful in giving insights into the content and style of texts (Johansson 2007). Sinclair (2003) (in relation to Scott & Tribble 2006: Textual Patterns) describes keywords as a powerful tool for assessing and understanding texts, while Scott (1997, p. 4) states that "a keyword may be defined as a word which occurs with unusual frequency in a given text. This does not mean high frequency but unusual frequency, by comparison with a reference corpus of some kind". The advantage of comparing a corpus's word frequency list with that of another corpus is that those words common to both are filtered out in the process, leaving only words that make corpus A distinctive from corpus B, and vice versa (Archer 2009, pp. 3-4). As Archer (1) notes, "the frequency with which particular words are used in a text can tell us something meaningful about that text and its author […] because their choices of words are seldom random". Baker (2009, p. 136) affirms this by highlighting keywords as a useful tool for identifying significant lexical differences between texts, though he cautions that attention be paid to differences in word usage and/or similarities between texts.
In this study, the identification of sermon keywords is a crucial starting point, prior to taking other essential steps to establish the lexical behaviour of sermon words in context. Those words which are statistically significant in terms of their frequency of occurrence are obtained via WordSmith 5, by sorting the word frequency list according to the resulting log likelihood values (LL). This puts the largest LL values at the top of the list, representing those words having the most significant relative frequency differences between the two corpora. Thus we observe the words that are most indicative (or characteristic) of one corpus, as compared to another, heading the list (Rayson et al. 2004). These are keywords, and are identified using the Log Likelihood Statistic set at a value of 15.13 (p < 0.0001) with 1 d.f. The log-likelihood test represents the frequency deviation from the normative/reference corpus: the higher the figure, the greater the deviation. The keyword list produced portrays both positive and negative keywords, frequent and infrequent words. Positive keywords represent those words associated with the language of sermons in Nigeria, while negative keywords represent sermon keywords in British and American contexts.
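The keyness computation described above can be illustrated with a minimal sketch. It assumes the two-corpus log-likelihood formula commonly used in corpus linguistics (observed against expected frequencies, summed over both corpora); the exact computation inside WordSmith 5 may differ in detail. The function names and the frequencies in the usage note are hypothetical and are not taken from the study's data.

```python
import math

# Critical value used in the study: LL >= 15.13 (p < 0.0001, 1 d.f.).
LL_THRESHOLD = 15.13


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-corpus log-likelihood score for a single word (illustrative sketch)."""
    total = freq_study + freq_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def classify(freq_study, size_study, freq_ref, size_ref):
    """Label a word as a positive keyword, a negative keyword, or not key."""
    ll = log_likelihood(freq_study, size_study, freq_ref, size_ref)
    if ll < LL_THRESHOLD:
        return "not key"
    # Positive: relatively more frequent in the study corpus than in the reference.
    if freq_study / size_study > freq_ref / size_ref:
        return "positive"
    return "negative"
```

With hypothetical counts, a word occurring 50 times in a 1,000-word study corpus and 10 times in a 1,000-word reference corpus would be labelled a positive keyword by `classify(50, 1000, 10, 1000)`, while equal counts in equally sized corpora give a score of zero and the label "not key".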

Analysis and Results
In this study, two types of analyses were required to identify the words more likely to occur in sermons in Nigeria than chance would suggest. The first level of analysis entailed comparison of the word frequency lists of the main corpus of Nigerian sermons (henceforth corpus 1) with the overall reference corpus tagged non-Nigerian sermons (henceforth corpus 2). Corpus 1 was then matched with corpus 2, the normative corpus, to discover what words are key in context vis-à-vis non-Nigerian contexts. This second level of keyword analysis reveals any symmetries and asymmetries of language behaviour in context. To do this, cut-off points were set using word frequency ratings depending on the size of each corpus, and the log likelihood scores. In the case of corpus 1, only words with a frequency of 5 and above were considered; but for corpus 2, treble the size of corpus 1, a cut-off point was set at a frequency of 10 and above. Then, the threshold for LL was uniformly set at 15.13. These enquiries produced the results shown in Tables 1-3, below. Table 1 displays some words from corpus 1 relative to words from corpus 2. Both lists are sorted on a frequency basis with the most frequent words occurring at the top of the list. Column 1 contains some of the words found in the Nigerian sermons. Columns 2 and 3 contain other information about the sermon words: their frequencies and relative frequencies, i.e. proportional percentages when raw frequencies are compared to the size of the corpus. Similarly, columns 4, 5 and 6 present selected words from corpus 2, their raw frequency ratings and proportional percentage scores. It can be seen from columns 1 and 4 that no clear differences can be posited because both lists present word types in common usage: articles, pronouns, prepositions, auxiliary verbs, main verbs and nouns, with function words having the highest frequencies, as expected.
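A word frequency list of the kind just described, with raw frequencies, proportional percentages and a minimum-frequency cut-off, might be produced along the following lines. This is only an illustrative sketch: the tokenization rule (lower-cased alphabetic strings and apostrophes) and the function name are assumptions, and WordSmith 5 applies its own tokenization settings.

```python
import re
from collections import Counter


def frequency_list(text, min_freq=1):
    """Return (word, raw frequency, percentage of corpus) rows, most frequent first."""
    # Simplifying assumption: a token is a run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    corpus_size = len(tokens)
    rows = [
        (word, freq, 100.0 * freq / corpus_size)
        for word, freq in counts.items()
        if freq >= min_freq  # cut-off, e.g. 5 for corpus 1 and 10 for corpus 2
    ]
    # Sort by descending frequency, then alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows
```

Calling `frequency_list(corpus_text, min_freq=5)` on the text of a corpus would yield rows comparable to the word, frequency and percentage columns of a frequency table, here with the cut-off stated for corpus 1.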
It is generally accepted that texts cannot be differentiated on the basis of raw frequencies alone. It seems therefore that sermon words cannot be differentiated on the basis of word types alone since both texts manifest the same thematic preoccupations at the macro level. However, we can investigate to what extent the handling of themes differs in terms of emphasis or focus. Our aim is to show that although sermons select words from a common pool, the Bible, to address specific issues of relevance, their degrees of significance vary according to the focus of the sermon-giver, which in turn is determined by the needs of the congregation. Here, a keyword analysis will reveal which words are given prominence in sermon contexts, thereby revealing their thematic preoccupations.
As already indicated, keywords perform an 'aboutness' function in texts: they tell us what the texts are about. Both Archer (2009) and Scott (1998), and many others in the 'keyword research school', have consistently argued that we learn something about texts from the frequency with which authors use words because their choice of words is seldom random. We shall see in Table 2 below whether and to what extent this assertion is supported by sermon words. In this table, for each corpus a list of words which were found to be key in the texts is shown. Here also, the keywords are organized according to their frequencies and thus salience. The keyness ratings in the columns adjacent to the keywords indicate their levels of significance in context by representing the extent of deviation from the normative texts: the greater the deviation, the higher the LL scores (keyness value). It should be noted at the outset that almost all the words found in one corpus were also found in the other, with the exception of a very few cases which we attribute to differences in corpus size. Nonetheless, the words found to be key in one corpus differ significantly from those found to be key in the other, and this is the interest of this study: to find those words which are key in each context and to posit the reasons why.
The first observation to be made from Table 2 is the striking divergence in the keyword lists for the two corpora. This points to differences in lexical choices and, as such, differences in thematic foci. The second observation is that the words in each corpus reveal that some semantic fields are identifiable because the words seem to have some linking threads, such that they could constitute loose hyponyms. We find that, in both corpora, there are, for instance, words referring to humanity, divinity, themes, proper names, action-target words/tools, and words of reference (pronouns). However, this categorization is not water-tight as an item can fit into more than one category. The semantic structure of the sermon words is tabulated and shown graphically in Table 3 and Figure 1 below. Obviously, this table presents the discourse structure of sermons as a special type of communicative event whose aim is to establish a symbiotic relationship between God, the maker of mankind, and man. The sermon-giver's extended aim is to portray God as the solution to man's problems. Therefore, the sermons are structured to achieve this: the preacher addresses man, discusses his needs/problems by presenting them as sermon themes, then he situates God as the solution, and points man towards the actions to be taken to get God's attention. Every sermon seems to have this circular structure: Humanity → Theme → Divinity → Action. Then, as is common in most communications, proper names and reference words are used as discourse devices. Below is a graphical representation of these semantic relationships. Although all the words in the six fields identified above together present the lexis of sermons, it is useful to investigate why some of them deserve greater mention in one context than in another. Why, for instance, do we have 'living' as an item of humanity in corpus 1 and 'heart' as an item of humanity in corpus 2? Or why do we have 'praying' as a target action in corpus 1 and 'submission' in corpus 2?
Are these merely stylistic choices or does their use transcend style? The answers are found when we closely examine the themes' fields which indicate the foci of the sermons. We find that, aside from a determination of what the texts are about, the thematic keywords also co-select other keywords to expatiate on their topics. For example, in corpus 1, we see that the focus of the sermons is the quality of people's lives; for this reason, such words as prosperity, poverty, healing, victory, wealth and challenges make up the list of themes. The aim of the sermonist in such cases is to point the people towards how to overcome or handle those issues of life. Therefore, in the humanity list, as expected, we see such keywords as you, people, life and living; and in the target action group, there are words which suggest how to obtain the desired change, e.g. through the Word, praying, confession, seed and the like. The same pattern is visible in the keywords found in corpus 2. The point is that the choices of these words are not random but are rather organized around the sermons' themes. So the textual salience of a word is definitely a function of its frequency.
And the role of context in word frequency seems to depend on thematic relevance, such that sermonists in Nigeria emphasize issues of concern to society, while those elsewhere base their sermons on areas of need in society. Little wonder then that there is great diversity in the thematic preoccupations in the sermons analysed. In corpus 1, for example, topping the themes list are issues of living conditions, while in corpus 2 issues of family are given priority to mirror the needs of society. This confirms that sermons are organs of education, whether social or moral. It is important to point out that some lexical choices revealed stylistic appropriacy over and beyond thematic considerations; the use of devil (corpus 1) as opposed to Satan (corpus 2) and the use of God (corpus 1) instead of Lord (corpus 2) are examples.
Before concluding this treatise, it is worth highlighting the implications of variations in reference-making in both corpora. In both corpora, the pronouns 'you', 'your', 'we' and 'our' were not only found to be overused, but greatly so. In each corpus, these reference words emerge as the first two most significant words: 'you' and 'your' in corpus 1 and 'we' and 'our' in corpus 2. With frequency and keyness values of fq.1840/k.1677, fq.980/k.1648, fq.8972/k.767 and fq.4889/k.358, respectively, each occupies the most salient position in each corpus. But what is the significance of their unusually high frequencies, given that they are not nouns and, as such, cannot bear thematic relevance? Clearly, their importance stems from the fact that they embody the discourse structure of sermons. In the Nigerian sermons, there is a gap between the sermon-giver, who occupies the position of knower, and his audience, whom he sees as non-knowers; as such, sermons in this context are characterized by density of information and instruction. In the non-Nigerian sermons, by contrast, the sermon-giver does not put any distance between himself and his audience; rather, he identifies with them by use of the inclusive 'we' and 'our', hence sermons in that context are less dense.

Concluding Remarks
Sermons are a special discourse event, having their own lexis and discourse structure. As such, for effectiveness of communication, communicants must necessarily adopt certain lexis and follow its pattern of discourse for a specific purpose to be achieved. As the study demonstrates, the type and usage of words in sermons are indexical of their contexts, since each context selects its lexis in accordance with the needs of the discourse community. And this, in turn, is determined by the unusually high frequencies of words in context. As Alderson (2007) affirms, knowledge of words and their meanings is a crucial component of language proficiency, both for first language acquisition and for second and foreign language learning; and word frequency is a crucial variable in text comprehension. Thus, it is clear that high-frequency words in sermons, as in other texts, need to be identified to help comprehend sermon texts in their various contexts.

Figure 1. Semantic groupings of sermon keywords

Table 1. Word types in Nigerian sermons relative to non-Nigerian sermons

Table 3. Semantic relations of words in the sermons