Application of Microsatellite Markers to Fingerprint and Determine the Representational Diversity within a Recently Established Elite Maize Inbred Line Breeding Program

This study focused on the implementation of microsatellite markers to determine the diversity among elite inbred lines representing the Vietnamese maize breeding program. Genetic relationships were assessed using twenty loci to gain an understanding of the breadth of the program compared to other Asian programs, and to develop a DNA fingerprint database. The PIC value of the 20 loci ranged from 0.10 to 0.83, with those containing a tetra-repeat being the most informative (mean PIC = 0.60). The amount of genetic diversity among the lines assessed was similar to the levels identified in other Asian breeding programs. Future monitoring will be important to maintain the genetic breadth of the Vietnamese program and to minimize the risk of narrowing under future targeted selective breeding strategies. This will be facilitated by the robust SSR-based DNA fingerprint database developed here, which will also be used for true-to-type identification, hybrid confirmation and potential trait selection.


Introduction
Maize is an important crop in Vietnam, with 1.09 million ha sown in 2009 producing 4.38 million tons, used mainly for animal feed (FAOSTAT Database 2009). By 2020, the Vietnamese population is projected to increase to 100 million, and the annual demand for maize for animal feed is projected to increase to over 8 million tons (Le et al., 2010). Therefore, there is an urgent need to increase productivity in order to reduce the need for massive future imports, currently at 0.5-0.7 million tons per annum. This is an ambitious target in light of the finite resource of arable land suitable for growing maize, together with the diverse ecological, social and economic zones in Vietnam. To meet this demand, current strategies within the Vietnamese maize breeding program are to develop hybrids with: 1) high and stable yields for intensive and combined conventional and modern maize farming practices; 2) tolerance to herbicides and to abiotic and biotic stresses; and 3) shorter life cycles for flexible crop rotation (Le et al., 2010).
In order to maximize heterosis effects from optimal hybrid combinations, as well as to maintain breadth in breeding programs, knowledge of the diversity and relationships among current genetic resources is required (Warburton et al., 2002). A narrow and/or non-representative genetic base will ultimately reduce the ability to develop genetically elite lines, because the lack of novel and/or alternate alleles may lead to vulnerability to biotic and abiotic stresses (Ali et al., 2006).
Simple sequence repeat (SSR) microsatellite markers are widely used to assess genetic relationships within maize breeding programs (Prasanna et al., 2010) due to their co-dominant inheritance, chromosome-specific location and amenability to automation (Kalia et al., 2011). In particular, they have been used to assess programs in China (Li et al., 2002), India (Pushpavalli et al., 2002; Ranatunga et al., 2009), Indonesia (Pabendon et al., 2007), the Philippines (Sales et al., 2004), Thailand (Phumichai et al., 2008), Japan (Enoki et al., 2002) and Iran (Choukan et al., 2006). SSR markers have also been applied to assess genetic diversity within the Vietnamese program (Bui et al., 2007a). However, these studies relied on conventional PCR and PAGE technologies, which limited their accuracy, and the germplasm assessed was not truly representative of the sources of genetic material within the entire program.
Fluorescence-based SSR detection and allele sizing on an automated DNA fragment analyzer was developed to provide fast and accurate genotyping (Ziegle et al., 1992). This is based on the separation of fluorescently labeled SSR amplicons by capillary or gel electrophoresis, and requires that one of the PCR primers be labeled with a fluorescent dye (Oetting et al., 1995). Most recently, the multiplex-ready PCR technique was developed to increase throughput and further reduce assay costs (Hayden et al., 2008). Therefore, the aims of this study were to genotype a representative set of the elite inbred Vietnamese maize lines using the high-throughput fluorescent SSR method in order to: 1) determine the current genetic relationships and the breadth of the Vietnamese breeding program, and compare this to the breadth of other Asian maize inbred programs; and 2) develop a robust DNA fingerprint database tool for future diversity assessment, true-to-type identification and hybrid confirmation.

Maize Inbred Lines and gDNA Extraction
Twenty elite maize inbred lines were supplied by the National Maize Research Institute (NMRI), Vietnam. The lines had been developed from commercial local and imported hybrids by a traditional selfing method. Based on their source, high yield and desirable agronomic characteristics, these lines were selected as representative of the diversity within the Vietnamese maize breeding program (Table 1). The CTAB method (Saghai-Maroof et al., 1984) was used to extract gDNA from an equal amount of fresh young leaf material from three 8-week-old plants per genotype. The gDNA was then bulked for each genotype sample.

Primer Synthesis and Amplification Optimization
Twenty SSR primer sequences were selected from the MaizeGDB website (http://www.maizegdb.org; Table 2) based on previously reported levels of informativeness (Phan et al., 2004; Bui et al., 2007a, b). They were also chosen based on genome location, so that they were distributed across the ten linkage groups of the maize genome. The primers were synthesized with generic non-complementary nucleotide sequences at the 5' end (Invitrogen, USA). Amplicons were obtained using the M13 genotyping method (Oetting et al., 1995). PCR reactions were optimized for annealing temperature at each locus by first confirming that products of the expected size were visible on an ethidium bromide-stained agarose gel.

Fluorescent Genotyping
Genotyping was performed with a MyCycler™ PCR machine (Bio-Rad, Australia) using the multiplex-ready PCR technique (Hayden et al., 2008). Each 12 μl reaction contained 10-15 ng of gDNA template, 1x multiplex-ready PCR buffer (5x Immolase PCR buffer, 7.5 mM MgCl2, 1 mM dNTPs and 2.5x BSA (Bioline, Australia)), 0.3 U Immolase DNA polymerase (Bioline, Australia), 75 nM of tagF primer (Applied Biosystems), 75 nM of tagR primer (Invitrogen, Australia) and an optimized amount of each locus-specific primer, ranging from 10 to 15 nM (Table 2). PCR conditions for all loci were 95°C for 10 min, followed by two phases. The first phase consisted of five cycles of 92°C for 60 s, 54-65°C for 90 s and 72°C for 60 s, then 20 cycles of 92°C for 30 s, 54-65°C for 90 s and 72°C for 60 s. The second phase consisted of 40 cycles of 95°C for 15 s, 54°C for 60 s and 72°C for 60 s, followed by a single final extension at 72°C for 10 min. Automated capillary electrophoresis fragment analysis was performed on an ABI3730 DNA Analyser at the Australian Genome Research Facility, Parkville, Australia.
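The cycling profile above can be written down as a small data structure, which makes it easy to tally the number of amplification cycles and the programmed hold time. The following is a minimal sketch only: the variable and function names are hypothetical, the temperature labels are plain strings copied from the text, and ramp times between steps are ignored.

```python
# Each stage is (number of cycles, [(temperature label, hold time in seconds), ...]).
# Values are transcribed from the PCR conditions described in the text.
PROGRAM = [
    (1, [("95C", 600)]),                                  # initial 95 C hold, 10 min
    (5, [("92C", 60), ("54-65C", 90), ("72C", 60)]),      # first phase, part 1
    (20, [("92C", 30), ("54-65C", 90), ("72C", 60)]),     # first phase, part 2
    (40, [("95C", 15), ("54C", 60), ("72C", 60)]),        # second phase
    (1, [("72C", 600)]),                                  # final extension, 10 min
]


def amplification_cycles(program):
    """Count cycles in stages that have more than one step (i.e. real PCR cycles)."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)


def total_hold_seconds(program):
    """Sum the programmed hold times over all stages, ignoring ramp times."""
    return sum(cycles * sum(seconds for _, seconds in steps)
               for cycles, steps in program)


print(amplification_cycles(PROGRAM))          # 5 + 20 + 40
print(total_hold_seconds(PROGRAM) / 60.0)     # programmed hold time in minutes
```

Under these assumptions the profile amounts to 65 amplification cycles and 187.5 min of programmed hold time; the actual run time on the instrument would be longer because of temperature ramping.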

Data Analyses
To determine the informativeness of the SSR loci, alleles were called and analysed with GeneMapper v.4.0 software (Applied Biosystems). The observed and effective numbers of alleles were determined with POPGENE 1.32 (Francis et al., 1999). Polymorphic Information Content (PIC) was calculated for each primer combination by applying the formula $\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$th allele (Botstein et al., 1980).
For the diversity analysis, based on the presence/absence of alleles at each locus, pairwise similarity measures ($S_{ij}$) were determined using Jaccard's coefficient, $S_{ij} = a/(n - d)$ (Jaccard, 1908), within NTSYSpc version 2.1 (Rohlf, 2000), where $a$ is the number of bands present in both genotypes $i$ and $j$, $n$ is the total number of bands scored, and $d$ is the number of bands absent in both genotypes $i$ and $j$. Subsequently, the genetic relationships among the lines were visualized through a UPGMA dendrogram. Principal components analysis (PCA) was also performed using the NTSYSpc 2.1 software to determine the relationships within a multi-dimensional space.
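As an illustration of the PIC formula given above, the following minimal sketch computes allele frequencies and PIC for a single locus. The allele sizes are invented for the example and are not data from this study. The effective number of alleles is included under the assumption that it is defined as the reciprocal of the sum of squared allele frequencies, a common definition; the text does not state which formula the software applied.

```python
from collections import Counter


def allele_frequencies(calls):
    """Return {allele: frequency} from a list of allele calls at one locus."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pic(calls):
    """PIC as given in the text: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(calls).values())


def effective_alleles(calls):
    """Assumed definition: 1 / (sum of squared allele frequencies)."""
    return 1.0 / sum(p * p for p in allele_frequencies(calls).values())


# Hypothetical allele sizes (bp) scored across ten lines at one locus.
locus_calls = [150, 150, 150, 150, 154, 154, 154, 158, 158, 162]

print(round(pic(locus_calls), 2))                 # frequencies 0.4, 0.3, 0.2, 0.1
print(round(effective_alleles(locus_calls), 2))
```

With four alleles at frequencies 0.4, 0.3, 0.2 and 0.1, the sum of squared frequencies is 0.30, so this hypothetical locus has a PIC of 0.70.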
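The similarity and clustering steps described above can be sketched in plain Python. This is an illustrative stand-in, not a reproduction of the NTSYSpc implementation: the function names are hypothetical, the four binary allele profiles are invented, and distance is assumed to be one minus the Jaccard similarity.

```python
from itertools import combinations


def jaccard(profile_i, profile_j):
    """S_ij = a / (n - d) for two equal-length presence/absence (1/0) profiles."""
    n = len(profile_i)
    a = sum(1 for x, y in zip(profile_i, profile_j) if x == 1 and y == 1)
    d = sum(1 for x, y in zip(profile_i, profile_j) if x == 0 and y == 0)
    return a / (n - d) if n != d else 0.0


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    dist maps frozenset({x, y}) to a distance. Returns nested tuples that
    record the order in which clusters were merged.
    """
    sizes = {name: 1 for name in names}
    d = dict(dist)
    while len(sizes) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted mean of the old ones.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Hypothetical presence/absence profiles over eight alleles.
profiles = {
    "A": [1, 1, 0, 1, 0, 0, 1, 0],
    "B": [1, 1, 0, 1, 0, 0, 0, 1],
    "C": [0, 0, 1, 0, 1, 1, 1, 0],
    "D": [0, 1, 1, 0, 1, 1, 0, 0],
}
distances = {
    frozenset((i, j)): 1.0 - jaccard(profiles[i], profiles[j])
    for i, j in combinations(profiles, 2)
}
tree = upgma(list(profiles), distances)
print(jaccard(profiles["A"], profiles["B"]))
print(tree)
```

In this toy example, lines A and B share three alleles and are jointly missing three of the eight, giving a similarity of 3/5, and the clustering pairs A with B and C with D before joining the two groups.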

SSR Genotyping and Genetic Relationships within the Vietnamese Inbred Breeding Program
The number of alleles observed per locus ranged from two to seven (mean of four), with a total of 88 alleles assessed, ranging in size from 100 to 350 bp. The effective allele number ranged from 1.11 to 4.08, with an average of 2.07 per locus (Table 2), which corresponded to 52% of the total number of alleles. This was somewhat lower than the allele number of 6.0 per locus observed by Ranatunga et al. (2009) or 6.8 observed by Warburton et al. (2002). However, apart from potentially more diverse germplasm having been assessed in those studies, the allele detection and reporting methods employed may greatly influence the outcome. The era of automated genotyping has brought more thorough checking of marker stability and reproducibility, with potentially more conservative allele assignment.
The PIC value of each locus ranged from 0.10 (phi448880) to 0.83 (phi072). Four markers (phi423796, phi448880, umc1279 and phi233376) had relatively low PIC values (0.14, 0.10, 0.19 and 0.14, respectively) and thus may not be as informative as the other 16 loci for diversity assessment and fingerprinting purposes within this genotype set. The mean PIC value across the 20 SSR loci was 0.44, lower than previously detected within Vietnamese maize line nurseries (Phan et al., 2004; Bui et al., 2007a, b). A similar study among Chinese elite inbred lines revealed a mean PIC of 0.60 among 50 SSR loci assessed (Li et al., 2002), also higher than that among the 20 loci in the current study. This was closer to the mean of 0.62 revealed in the much larger study of Smith et al. (1997) and the mean PIC of 0.69 for loci used to assess relationships among collections of Japanese (Enoki et al., 2002) and Thai (Phumichai et al., 2008) inbred lines. An even higher PIC range of 0.53 to 0.99, with a mean of 0.83, was previously detected among 22 SSR loci (132 alleles) screened across 45 Indian inbred lines (Ranatunga et al., 2009). The differences in the numbers and levels of informative alleles among studies may also be due to differences in sample source and number, specific selection imposed a priori, and the different SSR loci assessed.
In the current study, 45% of the loci analysed comprised tetra-repeats, reflecting their informativeness for estimating genetic diversity among maize inbred lines. Indeed, the mean number of observed alleles per tetra-repeat locus was 4.89, compared to 3.00 and 3.25 for tri- and penta-/complex-repeats, respectively. Consistent with this, the mean PIC value for tetra-repeat loci was 0.60, compared to 0.25 and 0.40 for tri- and penta-/complex-repeats, respectively, in agreement with a previous report (Phan et al., 2004). Conversely, Bantte and Prasanna (2003) reported that di-repeats represented 38.9% of the analysed loci and had a relatively higher mean PIC value (0.62) than tri- (0.44), tetra- (0.55), penta- (0.40) or complex repeats (0.53). This was consistent with the di-repeat hypervariability previously reported in maize (Smith et al., 1997; Li et al., 2002). Thus, a future optimal strategy may be to target di- and tetra-repeat SSR loci for molecular discrimination among very closely related lines within the Vietnamese program.

Current and Future Maintenance of Genetic Diversity
The similarity coefficient among the Vietnamese elite lines ranged from 0.32 to 0.91, indicating substantial genetic breadth among some of the lines (0.68 maximum dissimilarity). This was similar to the range of 0.34 to 0.72 (0.66 maximum dissimilarity) observed among Indian inbred lines (Ranatunga et al., 2009), and to the 0.57 maximum dissimilarity observed with 259 alleles from 50 SSR loci among 59 inbred Chinese lines (Li et al., 2002).
A loose cluster was observed among the Vietnamese elite inbreds, comprising lines sourced from Charoen Pokphand Seeds as well as a Syngenta (VNL13) and a Monsanto (VNL11) line. This may highlight the sources of elite germplasm used within this commercial breeding program (Figure 1). However, the majority of lines did not cluster closely, indicating genetic distinction and heterogeneous representation within the program. Indeed, the three lines representative of those bred within the NMRI program (VNL14, VNL15 and VNL16) were scattered throughout the dendrogram (Figure 1) and the PCA (data not shown), indicating substantial diversity among them at the loci assessed. Although 62% of the variation among all 20 lines was explained by the three largest principal coordinates (49.07, 7.08 and 6.00%, respectively), a far more comprehensive sampling is required to determine the detailed structure of the program. Structure on a much larger scale has previously been shown within maize inbred breeding programs using SSR markers, indicating heterotic groupings and/or pedigree (Choukan et al., 2006; Lui et al., 2003; Sales et al., 2004).

Development of an SSR Database for NMRI
The alleles identified in this study will be useful for true-to-type and hybrid identification within the Vietnamese NMRI maize inbred breeding program, saving time and money in future crossing and reselection. The database is also a tool for breeders elsewhere who will use the Vietnamese hybrid lines as parental germplasm in future material exchange programs (Table 3).
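To illustrate how a fingerprint database of this kind might be queried, the sketch below checks a sample against a reference profile (true-to-type) and checks whether a putative hybrid is consistent with its two inbred parents. Everything here is hypothetical: the function names, the line and locus labels, and the allele sizes are invented for the example and are not taken from Table 3. The hybrid check assumes co-dominant markers and homogeneous parents, so that the hybrid should carry one allele from each parent at every locus.

```python
def is_true_to_type(reference, sample):
    """True when the sample shows exactly the reference alleles at every locus."""
    return all(
        set(sample.get(locus, ())) == set(alleles)
        for locus, alleles in reference.items()
    )


def hybrid_consistent(parent_a, parent_b, hybrid):
    """True when, at every locus, the hybrid's alleles can be explained by
    taking one allele from each parent."""
    for locus in parent_a:
        observed = set(hybrid[locus])
        explained = any(
            {x, y} == observed
            for x in parent_a[locus]
            for y in parent_b[locus]
        )
        if not explained:
            return False
    return True


# Hypothetical database: line -> locus -> allele sizes (bp).
fingerprints = {
    "line_1": {"ssr_A": (150,), "ssr_B": (210,)},
    "line_2": {"ssr_A": (158,), "ssr_B": (210,)},
}

f1_sample = {"ssr_A": (150, 158), "ssr_B": (210,)}
off_type_sample = {"ssr_A": (150, 162), "ssr_B": (210,)}

print(hybrid_consistent(fingerprints["line_1"], fingerprints["line_2"], f1_sample))
print(hybrid_consistent(fingerprints["line_1"], fingerprints["line_2"], off_type_sample))
print(is_true_to_type(fingerprints["line_1"], {"ssr_A": (150,), "ssr_B": (210,)}))
```

In practice a tolerance for allele-sizing error and a rule for missing data would be needed; both are omitted here for brevity.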

Table 1. The twenty elite inbred lines from the Vietnamese breeding program that were assessed in the study

Table 2. SSR markers that produced stable and multiallele profiles among the twenty elite inbred lines

Table 3. A genotype database constructed for the twenty elite lines within the Vietnamese breeding program