Population Genetics of Drosophila ananassae: Evidence for Population Sub-Structuring at the Level of Inversion Polymorphism in Indian Natural Populations

Drosophila ananassae is a cosmopolitan and domestic species distributed in tropical, subtropical and mildly temperate regions. Population structure was analysed in forty-five Indian natural populations of D. ananassae using three cosmopolitan inversions as markers. Pairwise FST analysis and genetic distance (D) values showed strong genetic differentiation. Although the lowest values correspond to the geographically closest populations, we did not find any significant ‘isolation by distance’ effect. Estimates of gene flow based on FST are very low (Nm < 5). These findings, viz. strong genetic differentiation and minimal gene flow, indicate strong sub-structuring in Indian natural populations of D. ananassae at the level of inversion polymorphism. This finding is particularly intriguing in the case of D. ananassae, as it is frequently transported via human traffic. Given limited gene flow, populations are expected to diverge genetically due to drift. A low level of gene flow coupled with a high degree of genetic differentiation might have arisen historically and is maintained currently. Demographic properties, historical and contemporary events and other factors are more important in shaping the patterns of population sub-structuring, genetic differentiation and gene flow than mere terrestrial habitat characteristics that are (un)favorable for migration.


Introduction
Inferring the origin, population structure, and demographic history of a species is a major objective of population genetics. Natural populations display geographic sub-structure, which arises from differences in allele and genotype frequencies from one geographic region to another. Population subdivision is centrally important for evolution and affects the estimation of all evolutionary parameters from natural and domestic populations (Hartl & Clark, 2007). Chromosomal inversion polymorphism is one of the best-studied systems in the population genetics of Drosophila. Chromosomal inversions can serve as genetic markers in evolutionary studies (Powell, 1997): they are treated as alleles and used to examine gene flow and other population genetic parameters. In natural populations of Drosophila, chromosomal polymorphism due to inversions is common and is an adaptive trait (Da Cunha, 1960; Dobzhansky, 1970; Sperlich & Pfriem, 1986). Drosophila ananassae is a member of the ananassae species complex of the ananassae subgroup of the melanogaster species group (Bock & Wheeler, 1972). It shows a high degree of chromosomal polymorphism (Singh, 1996) and occupies a unique status in the genus Drosophila due to certain peculiarities in its genetic behavior (Singh, 2000). D. ananassae harbors a large number of inversions in its natural populations. Of these, only three, namely Alpha (AL) in 2L, Delta (DE) in 3L and Eta (ET) in 3R, are cosmopolitan in distribution (Singh, 1998). The population genetics of chromosomal polymorphism in Indian natural populations of D. ananassae has been studied extensively, and the results clearly show geographic differentiation of inversion polymorphism (for references see reviews by Singh, 1998; Singh & Singh, 2008).
D. ananassae displays high population sub-structure across its whole distribution range (Stephan, 1989; Stephan & Langley, 1989; Stephan & Mitchell, 1992; Stephan et al., 1998; Das, 2005; Schug et al., 2007). It exhibits more population structure than both D. melanogaster and D. simulans (Vogl et al., 2003; Das, 2005). It is a cosmopolitan and domestic species, largely circumtropical in distribution. In tropical and subtropical regions of the world, D. ananassae is one of the most common Drosophila species, especially in and around human habitations. Although populations are separated by major geographical barriers such as mountains and oceans, recurrent transportation by human activity may lead to genetic exchange. It exists in many semi-isolated populations around the equator, particularly in mainland South-east Asia and on the islands of the Pacific Ocean (Tobari, 1993). D. ananassae is thought to have originated in South-east Asia, where most of its relatives occur (Das et al., 2004; Das, 2005).
Forty-five natural populations from different eco-geographic regions of the country (from Jammu in the north to Kanniyakumari in the south, and from Dwarka in the west to Deemapur in the east) were analysed for chromosomal inversions, and the data have been reported elsewhere (Singh & Singh, 2007). In the present communication, the same quantitative data on the frequencies of the three cosmopolitan inversions (Singh & Singh, 2007) are used to derive genetic variability estimates, F-statistics and gene flow. This communication represents the most comprehensive analysis of Indian natural populations of D. ananassae using a traditional genetic approach: chromosomal markers (inversions), genetic variability parameters (Ho, He, F) and F-statistics are employed to infer the population structure and gene flow among D. ananassae populations from different parts of India.

Materials and methods
Drosophila strains
D. ananassae flies were collected from forty-five different eco-geographical localities of India, ranging from Jammu in the north to Kanniyakumari in the south and from Dwarka in the west to Deemapur in the east (see Singh & Singh, 2007 for details of the collections and their geographical locations).

Population genetic analysis
Quantitative data on inversion frequencies in forty-five natural populations of Indian D. ananassae (Singh & Singh, 2007) were used to derive genetic variability estimates, F-statistics and gene flow. This is the first time that inversions have been employed as chromosomal markers for population structure analysis. We used the three cosmopolitan inversions of D. ananassae because they have a worldwide distribution owing to their adaptive role, are an integral part of the genetic endowment of the fly, and can be treated as alleles at a single locus. Since population sub-structuring, or subdivision, is the acquisition of different allele or genotype frequencies from one geographic region to another, leading to population differentiation, chromosomal inversions can be utilised for such studies.
Genetic variability was recorded as mean observed (Ho) and expected (He) heterozygosity. The population inbreeding coefficient (F) was calculated to deduce the level of inbreeding due to population sub-structuring and also the departure of Ho from Hardy-Weinberg equilibrium (HWE). Population structure analysis was done using traditional F-statistics following Wright (1951).
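As a minimal sketch of the variability measures described above, the following Python snippet computes mean observed heterozygosity, mean expected heterozygosity and F for a single population. It assumes that each inversion is scored as a two-arrangement system (standard vs. inverted), that the expected heterozygosity is the Hardy-Weinberg value 2pq, and that F = 1 - Ho/He. The text does not state these formulas, and the frequencies below are hypothetical, not values from Singh & Singh (2007).

```python
def expected_heterozygosity(p):
    """Hardy-Weinberg expected heterozygosity for two arrangements (assumed 2pq)."""
    return 2.0 * p * (1.0 - p)


def inbreeding_coefficient(h_obs, h_exp):
    """F = 1 - Ho/He (assumed definition); undefined for a monomorphic population."""
    if h_exp == 0:
        raise ValueError("expected heterozygosity is zero; F is undefined")
    return 1.0 - h_obs / h_exp


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical data for one population: frequency of each inversion (AL, DE, ET)
# and the observed proportion of inversion heterozygotes at each.
inversion_freq = {"AL": 0.80, "DE": 0.30, "ET": 0.10}
observed_het = {"AL": 0.30, "DE": 0.40, "ET": 0.20}

mean_ho = mean(observed_het.values())
mean_he = mean(expected_heterozygosity(p) for p in inversion_freq.values())
f_value = inbreeding_coefficient(mean_ho, mean_he)

print(f"Ho = {mean_ho:.3f}, He = {mean_he:.3f}, F = {f_value:.3f}")
```

Under this definition a positive F means fewer heterozygotes than expected (Ho < He) and a negative F means an excess of heterozygotes (Ho > He).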
Gene flow between populations was estimated as the number of migrants exchanged between populations per generation (Nm). Nm values were derived from FST values, following the island model of Wright (1951) with a small level of migration, whereby:
Nm = (1 - FST) / (4 FST)
A genetic distance (D) approach was also used to determine the pattern of geographic variation among Indian natural populations of D. ananassae. D was calculated from Nei's (1972) genetic identity (I) using the formula D = 1 - I. To test the 'isolation by distance' effect, genetic distance and geographic distance were correlated.
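The two calculations in this paragraph can be sketched in Python as follows. The Nm function implements the island-model formula exactly as given in the text. The identity function assumes the usual multi-locus form of Nei's (1972) I (cross-products and squared frequencies averaged over loci), and the distance is taken as D = 1 - I as stated here. The function names and the example frequencies are hypothetical.

```python
import math


def nm_from_fst(fst):
    """Migrants per generation under Wright's island model: Nm = (1 - FST) / (4 FST)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


def nei_identity(pop_x, pop_y):
    """Nei's (1972) genetic identity I.

    pop_x and pop_y are lists of loci; each locus is a list of arrangement
    frequencies summing to 1. J values are averaged over loci (assumed form).
    """
    n_loci = len(pop_x)
    j_xy = sum(sum(a * b for a, b in zip(x, y)) for x, y in zip(pop_x, pop_y)) / n_loci
    j_x = sum(sum(a * a for a in x) for x in pop_x) / n_loci
    j_y = sum(sum(b * b for b in y) for y in pop_y) / n_loci
    return j_xy / math.sqrt(j_x * j_y)


def genetic_distance(pop_x, pop_y):
    """Genetic distance as defined in the text: D = 1 - I."""
    return 1.0 - nei_identity(pop_x, pop_y)


# Hypothetical frequencies [inverted, standard] for AL, DE and ET in two populations.
pop_a = [[0.80, 0.20], [0.30, 0.70], [0.10, 0.90]]
pop_b = [[0.40, 0.60], [0.60, 0.40], [0.50, 0.50]]

print(f"D(a, b) = {genetic_distance(pop_a, pop_b):.3f}")
print(f"Nm at FST = 0.054: {nm_from_fst(0.054):.3f}")
print(f"Nm at FST = 0.617: {nm_from_fst(0.617):.3f}")
```

Applied to the extreme pairwise FST values reported in the Results (0.054 and 0.617), the formula gives Nm of about 4.38 and 0.155 respectively, which matches the reported range of gene flow estimates.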

Results
Estimates of genetic variability are given in Table 1. Mean observed heterozygosity (Ho) ranges from 0.15 (AD) to 0.61 (PC); similarly, mean expected heterozygosity (He) ranges from 0.15 (ML) to 0.45 (VD). The population inbreeding coefficient (F) ranges from -0.09 (AB) to 0.47 (PU). The overall structure among populations based on F-statistics was high. As given in Table 1, values of FIS range from -0.53 (ML) to 0.47 (PU). Values of FST for individual populations range from 0.04 (PC) to 0.64 (ML). Values of FIT, the most inclusive inbreeding coefficient, range from -0.41 (PC) to 0.68 (AD).
Pairwise FST values among populations range from 0.054 (GY vs UJ) to 0.617 (AD vs ML), showing that Indian populations of D. ananassae are not homogeneous and exhibit a high level of genetic differentiation (Table 2). Table 3 shows FST-based estimates of gene flow, ranging from 0.155 (PN-ML, AD-ML) to 4.379 (GY-AD). Gene flow measured as Nm (using FST values) probably does not represent the real flow and is not the best estimate of this geographic parameter, but the measure permitted us to observe the overall trend.
As given in Table 4, pairwise genetic distance (D) values among populations range from 0.000 (UJ vs IN and KL vs SD) to 0.436 (LK vs GK). The lowest D values correspond to the geographically closest populations. Genetic distance and geographic distance were not significantly correlated (r = 0.200; p > 0.05).
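The correlation reported above can be illustrated with a short Python sketch that correlates the pairwise entries of a geographic distance matrix with those of a genetic distance matrix. The text does not say how significance was assessed; the permutation (Mantel-type) test used here is an assumption, chosen because pairwise distances are not independent observations. The two 4 x 4 matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(geo, gen, permutations=999, seed=0):
    """One-sided permutation test: relabel populations in gen and recompute r."""
    rng = random.Random(seed)
    n = len(geo)
    geo_vec = upper_triangle(geo)
    observed = pearson_r(geo_vec, upper_triangle(gen))
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[gen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson_r(geo_vec, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical symmetric matrices for four populations.
geo_km = [[0, 100, 400, 900], [100, 0, 350, 800], [400, 350, 0, 500], [900, 800, 500, 0]]
gen_d = [[0, 0.02, 0.10, 0.30], [0.02, 0, 0.15, 0.05], [0.10, 0.15, 0, 0.20], [0.30, 0.05, 0.20, 0]]

r_obs, p_value = mantel_test(geo_km, gen_d)
print(f"r = {r_obs:.3f}, permutation p = {p_value:.3f}")
```

A small positive r with p > 0.05, as reported for the Indian populations, would mean that geographic distance explains little of the variation in genetic distance.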

Discussion
Due to its extensive population structure, D. ananassae could be an appropriate model to analyze the effect of population subdivision on genetic variation.

Genetic variability
Values of the population inbreeding coefficient range from -0.093 to 0.47. In most cases Ho is almost equal to He (indicating that populations are in HWE). In only a few cases is Ho < He (indicating inbreeding), and even in these cases Ho is close to He. In the rest of the populations Ho > He (indicating outbreeding), as is the case in most natural populations. The reduction in heterozygosity resulting from population sub-structuring is intimately related to the reduction in heterozygosity caused by inbreeding. This can be understood by interpreting each subpopulation as a sort of "extended family" or set of interconnected pedigrees. Organisms in the same population often share one or more recent or remote common ancestors, and so mating between organisms in the same subpopulation will often be mating between relatives (Hartl & Clark, 2007).

F-statistics
Values of FIS (-0.53 to 0.47) are close to zero in most of the natural populations, indicating random mating within subpopulations. Values of FIT, the most inclusive measure of inbreeding (-0.41 to 0.68), are also close to zero in most cases. Values of FST range from 0.04 to 0.64. So, across this range, population subdivision, possibly due to drift, accounts for approximately 4% to 64% of the total genetic variation. Presumably, values of FST are influenced by the size of subpopulations, which is the major determinant of the magnitude of random changes in allele frequency (Hartl & Clark, 2007).
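To make the relationship among the three coefficients concrete, the following Python sketch computes them from heterozygosities at three levels, using the standard textbook definitions of Wright's F-statistics. The names h_i (observed heterozygosity of individuals), h_s (expected heterozygosity within subpopulations) and h_t (expected heterozygosity in the total population) and the numerical values are hypothetical; how these quantities were obtained in the present study is not described in the text.

```python
def f_statistics(h_i, h_s, h_t):
    """Wright's F-statistics from hierarchical heterozygosities.

    FIS = 1 - HI/HS : deficit of heterozygotes within subpopulations
    FST = 1 - HS/HT : deficit due to subdivision among subpopulations
    FIT = 1 - HI/HT : overall deficit relative to the total population
    """
    if h_s <= 0 or h_t <= 0:
        raise ValueError("h_s and h_t must be positive")
    f_is = 1.0 - h_i / h_s
    f_st = 1.0 - h_s / h_t
    f_it = 1.0 - h_i / h_t
    return f_is, f_st, f_it


# Hypothetical heterozygosities.
f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.32, h_t=0.40)
print(f"FIS = {f_is:.4f}, FST = {f_st:.4f}, FIT = {f_it:.4f}")

# The three coefficients are linked by (1 - FIT) = (1 - FIS)(1 - FST).
print(abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12)
```

In this example FST = 0.20, which would be read as subdivision accounting for about 20% of the total genetic variation, in the same way that the observed range of 0.04 to 0.64 is read as 4% to 64%.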

Genetic differentiation and north-south trends
Pairwise FST values and genetic distance estimates reveal strong differentiation among the populations, particularly between the northern and southern regions of the study area, hence showing north-south trends. However, no clinal pattern in inversion frequencies has been found in Indian natural populations of D. ananassae. Latitudinal clines have been detected in Indian natural populations of D. melanogaster (Das & Singh, 1991; Singh & Das, 1992). The north-south trends have also been reported with respect to inversion frequencies in natural populations of D. ananassae (Singh & Singh, 2007). These trends have persisted because of diversifying selection that acts to impede the homogenization that would result from gene flow, as is commonly the case among naturally occurring clines (Endler, 1977; Arnold, 1997). Further, the strong genetic differentiation observed among D. ananassae populations from different eco-geographic regions could be due to geo-climatic heterogeneity, and this difference is canalized via the rigid polymorphic system of D. ananassae. Because of this, the system resists change brought about by any agency or homogenizing force. Localized differentiation or similarity among populations could be due to selection, local dispersal, or both. Since D. ananassae flies are domestic and cohabit human dwellings, which are capable of sustaining the residents, they have inherently poor dispersal capacity. It is possible that when these flies are subjected to forced dispersal they show an aversion to mating, hence maintaining the differentiation. Forister (2004) found a similar situation in Hesperia comma, a holarctic skipper with very high dispersal potential.

Gene flow
Our low pairwise gene flow values (only slightly above the range shown by rat snakes; Lougheed et al., 1999) and correspondingly high FST values relative to other studies involving non-Drosophila models (Carmichael et al., 2001; Nice & Shapiro, 2001; Schwartz et al., 2002; Bargelloni et al., 2003; Rueness et al., 2003; Mcrae et al., 2005) are surprising. Drosophila by itself has poor dispersal capacity, but since it is a human commensal and is co-transported with fruits and vegetables through human travel, geographic barriers or habitat discontinuity of any kind hardly hinder its movement. Despite this, it maintains a very high level of genetic differentiation and exists as structured, semi-isolated populations. Gene flow between these flies may be restricted despite their being nearly adjacent in some of the localities. Sympatric divergence could therefore also explain these data, as has been hypothesized for host races in the butterfly genus Mitoura (Nice & Shapiro, 2001).
Habitat barriers play a role in structuring populations (Walker, 2000; Mcrae et al., 2005), so where habitat barriers are present we should have found greater population structure. In our model, however, such barriers are insignificant because flies are transported with fruits and vegetables to different parts of the country. It is possible that a small number of founders started each colony afresh and that random genetic drift during that precarious bottleneck period caused differentiation. Further, significant genetic differentiation in nearly all pairwise comparisons (based on genetic distance and pairwise FST values) across the country suggests that considerable population structuring can occur regardless of distance (in some cases) or any other similarity-inducing factors. Evidence for substantial structuring also means that populations within some states may not be connected by frequent dispersal, and this would define populations with allele frequencies divergent enough to indicate functional independence (Moritz, 1994; Mcrae et al., 2005). The isolation by distance effect (Wright, 1943) was not confirmed statistically, as genetic and geographic distances are not significantly correlated in D. ananassae populations (Figure 1). Nevertheless, the D values show that, in most cases, populations separated by greater geographic distances have higher genetic dissimilarity than those situated close to each other. D. ananassae does not show temporal divergence (Singh & Singh, 2007). This again confirms limited intermixing between populations of D. ananassae through migration or gene flow across the geological time scale; otherwise populations of D. ananassae would have been genetically similar throughout its range. It is this strong genetic differentiation that keeps D. ananassae populations highly sub-structured and semi-isolated.
Similar studies (Vogl et al., 2003; Das et al., 2004; Schug et al., 2007) done earlier at the molecular level in D. ananassae arrived at a similar conclusion. These studies showed that FST values of the order of 0.1 (much lower than our FST estimates) could be applied to Indian populations. This difference could be attributed to the markers used. Compared to allozymes and molecular markers, the picture of geographic differentiation appears to be different for chromosome inversions, which are more variable and more differentiated even over short distances. This could be partly because allozymes and molecular markers are in general more "neutral" than chromosome inversions.
We can therefore conclude, at least from our reasonably broad spatio-temporal study covering the representative eco-geographical regions of the country, that natural populations of D. ananassae show strong sub-structuring due to genetic differentiation. Given limited gene flow, populations are expected to diverge genetically due to drift. A low level of gene flow coupled with a high degree of genetic differentiation might have arisen historically and is maintained currently. Demographic properties, historical and contemporary events and other factors are more important in shaping the patterns of population sub-structuring, genetic differentiation and gene flow than mere terrestrial habitat characteristics that are (un)favorable for migration.

Figure 1. Correlation between geographic distance (km) and genetic distance in natural populations of Drosophila ananassae.

Table 1. Estimates of genetic diversity and F-statistics in Indian natural populations of D. ananassae. Ho, observed average heterozygosity; He, expected average heterozygosity; F, population inbreeding coefficient; FIS, inbreeding coefficient due to non-random mating; FST, inbreeding coefficient due to population subdivision; FIT, inbreeding coefficient due to the combined effect of non-random mating within subpopulations and of population subdivision.