Phylogeny and Molecular Evolution of Tricyrtis (Liliaceae s.l.) Inferred from Plastid DNA matK Spacer Nucleotide Sequences

Phylogenetic relationships of the genus Tricyrtis (Liliaceae s.l.) were investigated with a classical parsimony-based approach using two non-coding plastid DNA nucleotide sequences, in order to provide groundwork for further Bayesian inference research. Unlike MCMC (Markov chain Monte Carlo) methods, parsimony-based studies do not involve temporal and spatial information. However, a well-established phylogeny and age calibration information play vital roles in the estimation of molecular evolutionary rates and divergence times in lineages. The evolutionary rate estimates from the matK spacer in Tricyrtis lineages were similar to those in our previous report on rps16 intron data. In the near future, different evolutionary model systems will be tested in order to clarify evolutionary rate estimations in various flowering plants, including medicinal herbs.


Introduction
Molecular evolutionary studies using DNA nucleotide and amino-acid sequences require a well-established phylogeny and fossil or molecular-based calibration dates (Yoder & Yang, 2000; Drummond, Ho, Rawlence, & Rambaut, 2007a; Sanders & Lee, 2007; Drummond, Suchard, Xie, & Rambaut, 2012). For phylogenetic studies, parsimony- (Swofford, 1993) or maximum-likelihood-based (Drummond et al., 2012) approaches can be used. Non-coding DNA such as introns (untranslated intervening sequences) and intergenic spacers (intergenes) are reported as evolving faster than gene-coding regions in terms of nucleotide substitution rates (Kimura, 1983; Gielly & Taberlet, 1994). The introns are spliced out when mature mRNA is formed and they are not involved in protein formation (Kimura, 1983; Freifelder, 1987). According to the hypothesis of Kimura (1983), non-coding regions evolve faster because they have fewer functional constraints. He also suggested that 'silent' nucleotide substitutions, which are not associated with amino acid changes, occur more frequently than other nucleotide substitutions. Therefore, non-coding DNA sequences are expected to provide more variable and informative characters for phylogenetic studies of evolutionary rates.
DNA evolutionary models such as strict or relaxed molecular clock models (Drummond, Ho, Phillips, & Rambaut, 2006; Battistuzzi, Filipski, Hedges, & Kumar, 2010) can use various base substitution systems (e.g. HKY: Hasegawa, Kishino, & Yano; GTR: General Time Reversible) with different statistical heterogeneity models (Drummond et al., 2006). The differences inferred in lineage molecular evolutionary rates could be caused by intrinsic properties such as physiology (metabolic rates), generation time, efficiency of the DNA repair system, and population size (Thorne, Kishino, & Painter, 1998; Drummond & Suchard, 2010; Dornburg, Brandley, McGowen, & Near, 2012). It is also known that viral, plant, and mammalian lineages have different rates of evolution. This has led to revision of the constant-rate molecular clock model (Thorne et al., 1998), because the evolutionary rate can vary among lineages (Drummond et al., 2006; Pybus, 2006), in contradiction to the strict molecular clock hypothesis (Zuckerkandl & Pauling, 1965). A relaxed clock model (Lepage, Bryant, Philippe, & Lartillot, 2007; Battistuzzi et al., 2010) was used for molecular dating in our previous rps16 intron nucleotide sequence studies of Tricyrtis (Hong & Jury, 2011). This is complemented by the work reported here on comparative molecular dating in Tricyrtis, employing MCMC-based Bayesian inference approaches on sequences from the matK spacer region.
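As an illustration of the HKY substitution model mentioned above, the following minimal sketch builds the instantaneous rate matrix, in which transitions (A<->G, C<->T) are weighted by a transition/transversion ratio kappa relative to transversions. The base frequencies and the kappa value are hypothetical and are not estimates from the Tricyrtis data; the function name `hky_rate_matrix` is ours.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(a, b):
    """A<->G and C<->T are transitions; all other changes are transversions."""
    return (a in PURINES) == (b in PURINES)


def hky_rate_matrix(freqs, kappa):
    """Return the HKY rate matrix as a dict of dicts.

    Off-diagonal entry q[a][b] is freqs[b], multiplied by kappa for
    transitions. Each diagonal makes its row sum to zero. The matrix is
    scaled so that the expected rate is one substitution per site per
    unit time.
    """
    q = {a: {} for a in BASES}
    for a in BASES:
        for b in BASES:
            if a != b:
                q[a][b] = freqs[b] * (kappa if is_transition(a, b) else 1.0)
        q[a][a] = -sum(q[a].values())
    scale = -sum(freqs[a] * q[a][a] for a in BASES)
    return {a: {b: q[a][b] / scale for b in BASES} for a in BASES}


# Hypothetical, illustrative values (not estimated from the matK spacer data).
example_freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
example_q = hky_rate_matrix(example_freqs, kappa=2.0)
```

With equal base frequencies and kappa = 1 the matrix reduces to the Jukes-Cantor model, in which all substitutions are equally likely.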
The genus Tricyrtis Wall. is composed of 18-20 known taxa and shows high endemism in East Asia (Takahashi, 1974, 1980, 1987; Peng & Tiang, 2007). In particular, ca. 13 taxa occur in Japan (Masamune, 1930; Takahashi, 1974, 1980) and ca. 6 taxa are known in Taiwan (Liu & Ying, 1978). Most of the taxa of this group are rare in the wild (Kitagawa & Koyama, 1958), but they have long been cultivated as ornamental garden plants (Mathew, 1985). Morphological characters such as inflorescence types, spur shapes, apex of perianths, leaf shapes, and base types are useful characters for classification of Tricyrtis (Masamune, 1930; Takahashi, 1980). The genus name, Tricyrtis, is derived from the characteristic shape of the saccate spur (Takahashi, 1980), which is a perigonal nectary organ (Dahlgren & Clifford, 1982). The common name of the genus is "toad lily". Although there is no fossil record of a Tricyrtis ancestor, the high endemism in Asia and the localized distribution patterns attract attention in terms of evolutionary routes and diversification relationships among the taxa (Hong & Jury, 2011). In the current study we compare phylogenetic relationships among Tricyrtis groups based on analysis of matK spacer and rps16 intron nucleotide sequences and discuss the differences in evolutionary rates and molecular dating found. Many researchers have used published phylogeny information or data based upon classical parsimony analyses for establishing appropriate phylogenetic relationships. The only available phylogenetic information on the genus Tricyrtis (Liliaceae s.l.) comes from our recent Bayesian inference studies (Hong & Jury, 2011). In the current work, classical parsimony analyses on Tricyrtis have been carried out to establish phylogenetic relationships for further molecular evolutionary studies, using newly sequenced matK spacer data and the recently published rps16 intron nucleotide sequence data by the same authors (Hong & Jury, 2011).

Taxon Sampling
* For type specimen information on Tricyrtis, see Hong and Jury (2011). The matK spacer sequencing data were obtained together with the rps16 intron data during the same period of work at the University of Reading (UK) and the Royal Botanic Gardens, Kew.
1930) Takahashi (1980) Subgenus I. Brachycyrtis (Koidz.) Masam. Sect. IV. Maculatae Masam.
The close phylogenetic relationship of these genera to Tricyrtis has been established based on the combined analysis of trnL-F and rbcL sequences (Fay et al., 2000). The matK spacer sequences ranged from 751 bp and

DNA Extractions, PCR Reactions, and Cycle Sequencing Reactions
For the eight taxa from Kew, total genomic DNA extractions followed the modified CTAB method of Doyle and Doyle (1987). For all other samples of Tricyrtis, extracts were provided by the Molecular Biology Research Service at the University of Reading, who used the same method (Doyle & Doyle, 1987). The PCR and cycle sequencing reactions were performed using a DNA Thermal Cycler 480 (Perkin Elmer / Applied Biosystems). To amplify the matK spacers, two primers were used: 3914F (5´-TGG GTT GCT AAC TCA ATG G-3´) and 122SR (TGA AAG AGA AGC GGG TA). For the amplification of the rps16 introns (Hong & Jury, 2011), the primers were rpsF (GTG GTA GAA AGC AAC GTG CGA CTT) and rpsR2 (TCG GGA TCG AAC ATC AAT TGC AAC).
For automatic sequencing reactions, a 373 DNA sequencer (Applied Biosystems) was used.

Phylogenetic Analyses Using Parsimony
The SeqMan program of the LaserGene™ package (DNASTAR, Madison, Wisconsin) was used to edit and assemble the sequences. All the phylogenetic analyses were done using the PAUP* version 4.0 series (Swofford, 1993). The options for the heuristic search were random sequence addition (1000 replicates), the TBR (Tree-Bisection-Reconnection) branch-swapping algorithm, ACCTRAN (accelerated transformation character-state optimization), and MULPARS. Synapomorphous indels were treated as resulting from a single evolutionary event and coded as binary characters (presence / absence) regardless of indel length (Johnson et al., 1996; Oxelman, Lidén, & Berglund, 1997; Bakker, Hellbrügge, Culham, & Gibby, 1998). The indel information used for the coding is given in Table 3. Bootstrap, Jackknife, and Bremer support (…, 1995) analyses were performed in order to test clade support. For the Bootstrap and Jackknife analyses, 1000 replicates (fast stepwise addition), no branch swapping, ACCTRAN, MULPARS, and collapsing of branches with a minimum length of zero were used. For the Bremer support analyses, strict consensus trees were used as input constraint trees. For phylogenetic analyses using the matK spacer region, a heuristic search was performed using the TBR branch-swapping algorithm, 1000 random stepwise sequence additions, no steepest descent option, collapsing of branches with a minimum length of zero, and the MULPARS option.
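The binary indel coding described above can be sketched as follows. This is a simplified illustration and not the procedure actually applied to the Tricyrtis matrix: each distinct gap (defined by its start and end position in the alignment) becomes one presence / absence character regardless of its length. The taxon labels and the toy alignment are hypothetical.

```python
def gap_runs(seq):
    """Return the set of (start, end) positions of each maximal run of '-'.

    The end position is exclusive.
    """
    runs, start = set(), None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            runs.add((start, i))
            start = None
    if start is not None:
        runs.add((start, len(seq)))
    return runs


def code_indels(alignment):
    """Code every distinct gap as one binary character per taxon.

    alignment maps taxon name -> aligned sequence. Returns the sorted list
    of gaps and a dict mapping each taxon to its 0/1 character states.
    """
    per_taxon = {taxon: gap_runs(seq) for taxon, seq in alignment.items()}
    indels = sorted(set().union(*per_taxon.values()))
    matrix = {
        taxon: [1 if gap in runs else 0 for gap in indels]
        for taxon, runs in per_taxon.items()
    }
    return indels, matrix


# Hypothetical toy alignment (not real Tricyrtis sequences).
toy_alignment = {
    "taxon_A": "ACGT--ACGT",
    "taxon_B": "ACGT--ACGT",
    "taxon_C": "ACGTTTACGT",
    "taxon_D": "A---TTACGT",
}
toy_indels, toy_matrix = code_indels(toy_alignment)
```

In this toy example the 2-bp gap shared by taxon_A and taxon_B and the 3-bp gap of taxon_D each contribute exactly one character, so indel length does not add weight.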

Bayesian Analysis for Molecular Dating
The Bayesian inference studies for the molecular dating in this study were carried out using BEAST v1.6.1 (Bayesian Evolutionary Analysis Sampling Trees; Drummond et al., 2010). Details of the program used are given in Hong and Jury (2011). A total of 100,000,000 steps were generated in the BEAST analysis and 10% of these were discarded as burn-in. A total of 8444 unique clades were found, and the Maximum Clade Credibility tree was established as in Figure 2. Calochortus minimus (40 Myr) and Tricyrtis affinis (20 Myr) were used for age calibrations in the Bayesian analyses (Vinnersten & Bremer, 2001, Figure 1).
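The burn-in step can be illustrated with a short sketch. The chain length and the 10% burn-in are taken from the text above; the helper `discard_burn_in` is a hypothetical name and is not part of BEAST.

```python
def discard_burn_in(samples, burn_in_percent=10):
    """Drop the first burn_in_percent of a list of logged MCMC samples."""
    if not 0 <= burn_in_percent < 100:
        raise ValueError("burn_in_percent must be in [0, 100)")
    n_drop = len(samples) * burn_in_percent // 100
    return samples[n_drop:]


CHAIN_LENGTH = 100_000_000                   # steps reported in the text
BURN_IN_STEPS = CHAIN_LENGTH * 10 // 100     # 10% discarded as burn-in
RETAINED_STEPS = CHAIN_LENGTH - BURN_IN_STEPS
```

Under these figures 10,000,000 steps are discarded and 90,000,000 steps contribute to the posterior summaries.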

Combined Parsimony-based Phylogenetic Analyses
Previous taxonomic treatments are given in Table 2, while Figure 1 shows the phylogenetic relationships obtained in this study, based on the combined data analyses of both matK spacer and rps16 intron sequences. A dichotomous relationship is present between T. hirta var. hirta and T. hirta var. masamunei and a clade of sect. Flavae (Figure 1, except for T. nana), and a polytomous relationship was found in most of the taxa in sect. Hirtae. Tricyrtis hirta var. hirta differs from T. hirta var. masamunei by having pilose stems and leaves, and purple spotted tepals. Tricyrtis hirta var. masamunei is distinguishable from other T. hirta variants by the whole plant characteristically being totally glabrous (Figure 1). Sect. Brachycyrtis has a distinct floral structure, campanulate, without conspicuous bi-lobed saccate spurs. Tricyrtis macranthopsis differs from T. ishiiana var. surugensis by having an axillary inflorescence and relatively broader oblong leaves. However, the matK spacer and rps16 intron results could not resolve the relationship between these two taxa (Figure 1).
The internal branch of the polytomous clade of sect. Flavae shows strong clade support, with bootstrap / jackknife values of 96% / 91% (Figure 1) without indel coding (Tables 3-5). The clade of sect. Flavae shows a decay value of 4, indicating relatively strong character support (Figure 1). Among the outgroups, the monophyletic relationship of Calochortus, a clade of Uvulariaceae (Scoliopus, Prosartes, Streptopus) and Convallariaceae (Polygonatum), was confirmed based on a consensus tree with indel coding. In a combined analysis including only 17 taxa, the internal branch of a clade of sect. Flavae (except for T. nana) shows very strong clade support, with bootstrap / jackknife values of 100% / 97% (Figure 1). A monophyletic relationship of T. suzukii with the dichotomous clade T. formosana var. formosana and T. formosana var. amethystina was also found (Figure 1). The sister relationship of T. formosana var. formosana and T. formosana var. amethystina has strong clade support, 90% and 83%. The clade of sect. Flavae shows a decay value of 5, indicating relatively strong character support (Figure 1). The separation of ingroups and outgroups also shows very strong character support, with a decay value of 20 (Figure 1).
The mean rates of divergence of matK spacer sequences in Tricyrtis lineages were similar whether the global clock (Strict clock) model or the Relaxed Lognormal Clock (technically non-clock) model was tested. There was also little difference in the 95% HPD (Highest Posterior Density) intervals. In the Strict clock model test in the BEAST analyses, the mean rate of divergence of the spacer in Tricyrtis lineages was 7.56 × 10⁻⁴ (HPD: 5.77 × 10⁻⁴ to 9.57 × 10⁻⁴) substitutions per site per Myr. In the Relaxed Lognormal Clock model test, the mean rate was 7.41 × 10⁻⁴ (HPD: 5.35 × 10⁻⁴ to 9.62 × 10⁻⁴) substitutions per site per Myr. In Tricyrtis lineages the evolutionary rates of matK spacer nucleotide sequences were found to be 1.77-1.98 times faster than those of rps16 introns (Hong & Jury, 2011).
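The similarity of the two clock-model estimates can be checked with simple arithmetic. The sketch below uses only the mean rates and HPD bounds reported above; the class and function names are ours and purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateEstimate:
    """Mean rate and 95% HPD bounds, in substitutions per site per Myr."""
    mean: float
    hpd_low: float
    hpd_high: float


# Values as reported in the text for the matK spacer.
STRICT = RateEstimate(mean=7.56e-4, hpd_low=5.77e-4, hpd_high=9.57e-4)
RELAXED = RateEstimate(mean=7.41e-4, hpd_low=5.35e-4, hpd_high=9.62e-4)


def hpd_overlap(a, b):
    """Return the shared part of two HPD intervals, or None if disjoint."""
    low = max(a.hpd_low, b.hpd_low)
    high = min(a.hpd_high, b.hpd_high)
    return (low, high) if low <= high else None


def relative_difference(a, b):
    """Absolute difference of the means relative to the larger mean."""
    return abs(a.mean - b.mean) / max(a.mean, b.mean)


def per_year(rate_per_myr):
    """Convert a rate per Myr to a rate per year."""
    return rate_per_myr / 1e6
```

With these figures the strict-clock HPD interval lies entirely within the relaxed-clock interval and the two means differ by about 2%, which is consistent with the statement that the estimates are similar. Expressed per year, the strict-clock mean corresponds to roughly 7.56 × 10⁻¹⁰ substitutions per site.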
Divergence times were also estimated using matK spacer nucleotide sequence data (Figure 2). Seven nodes (N1-N7) were found to have posterior probability values >0.5 (Drummond et al., 2006), and the divergence times are given at the nodes (Figure 2). In the case of node N4, the divergence time between the outgroup Scoliopus bigelovii and the rest of the Tricyrtis groups was estimated as 24.96 Myr, more recent than we reported previously (Hong & Jury, 2011). Furthermore, in a previous report based upon rbcL data (Vinnersten & Bremer, 2001), the age of Scoliopus was estimated as about 30 Myr.
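The posterior probability threshold used for the nodes above can be illustrated as follows. In this minimal sketch the posterior probability of a clade is taken as the fraction of sampled trees that contain it; the tree representation (a set of clades, each a frozenset of tip names) and the tip labels are hypothetical simplifications and do not reflect the BEAST output format.

```python
from collections import Counter


def clade_posteriors(sampled_trees):
    """Return the fraction of sampled trees in which each clade occurs.

    Each tree is a set of clades; each clade is a frozenset of tip names.
    """
    counts = Counter(clade for tree in sampled_trees for clade in tree)
    n_trees = len(sampled_trees)
    return {clade: count / n_trees for clade, count in counts.items()}


def supported_clades(sampled_trees, threshold=0.5):
    """Keep only clades whose posterior probability exceeds the threshold."""
    posteriors = clade_posteriors(sampled_trees)
    return {clade: p for clade, p in posteriors.items() if p > threshold}


# Hypothetical posterior sample of four trees on tips A-D.
AB, BC = frozenset("AB"), frozenset("BC")
ABC, ABD = frozenset("ABC"), frozenset("ABD")
toy_sample = [{AB, ABC}, {AB, ABC}, {AB, ABD}, {BC, ABC}]
toy_supported = supported_clades(toy_sample)
```

In the toy sample the clades (A, B) and (A, B, C) each occur in three of four trees (0.75) and pass the threshold, whereas (B, C) and (A, B, D) occur once (0.25) and do not.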

Discussion
The parsimony-based classical phylogenetic analyses in Tricyrtis provide insights into the evolutionary affinities of the groups. The relationships support the previous taxonomic treatments at section level (Table 2, Figure 1). The outgroups also show monophyletic relationships (Figure 1). The phylogenetic data of Tricyrtis lineages based upon combined nucleotide analyses of the matK spacer and rps16 intron should be useful for further Bayesian inference studies to estimate divergence times and evolutionary rates between and within lineages of Tricyrtis.
The relationship of Tricyrtis perfoliata with other species of sect. Flavae has not been resolved at the intra-specific level. The formation of a polytomy among the lineages of sect. Flavae may be an indication of relatively rapid evolution within the group. Tricyrtis suzukii has been treated under sect. Tricyrtis by Takahashi (1980), but the current molecular data with other available morphological, anatomical, and phytochemical information (Hong, Greenham, Jury, & Williams, 1999) indicate that the taxon should be treated under sect. Hirtae. Most of the taxa of sect. Hirtae, except for the T. hirta complex, are endemic to Taiwan. T. macrantha, T. macranthopsis, and T. ishiiana var. surugensis are closely related in sect. Brachycyrtis, which is the most advanced group within the genus in terms of phylogeny. A close relationship of T. hirta var. hirta and T. hirta var. masamunei has been shown based on molecular data at infra-specific levels in the current study. Among the species of sect. Tricyrtis, a close relationship of T. macropoda and T. maculata was demonstrated based on anatomical and morphological data (Hong, 1999, Ph.D. thesis, http://ethos.bl.uk).
In the current study, the strict and relaxed clock models gave similar evolutionary rate estimates, and the relatively high Coefficient of Variation value (1.42) may be an indication that the matK spacer nucleotide sequences are evolving in a non-clock-like fashion, which also agrees with the rps16 intron data (Hong & Jury, 2011). Previous reports indicated that accelerated molecular evolutionary rates in lineages of various organisms may be caused by a shorter generation time or smaller body size (Drummond, Rambaut, Shapiro, & Pybus, 2005). T. nana fits these criteria (Hong, 1999, Ph.D. thesis, http://ethos.bl.uk). However, in Bayesian inference studies, T. nana has shown either a medium (rps16; Hong & Jury, 2011) or a slow (matK; Figure 2) evolutionary rate compared to the remainder of the genus. The generation time differences between T. nana and other Tricyrtis taxa are much smaller than, for example, those between humans and Drosophila or Dictyostelium (Ehrenman et al., 2003; Gao et al., 2007; Hong, Qi, Brabant, Bosco, & Martinez, 2008), and may not be a significant factor in this case.
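The Coefficient of Variation referred to above can be illustrated with a minimal sketch, assuming it is computed as the standard deviation of the branch rates divided by their mean. Values near zero indicate clock-like evolution, and larger values indicate greater rate heterogeneity among branches. The use of the population standard deviation is our assumption, and the branch rates in the example are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(branch_rates):
    """Standard deviation of branch rates divided by their mean."""
    average = mean(branch_rates)
    if average == 0:
        raise ValueError("mean branch rate must be non-zero")
    return pstdev(branch_rates) / average


# Hypothetical branch rates (substitutions per site per Myr).
clock_like = [7.5e-4, 7.5e-4, 7.5e-4, 7.5e-4]
heterogeneous = [1.0e-4, 2.0e-4, 5.0e-4, 2.2e-3]

cv_clock_like = coefficient_of_variation(clock_like)
cv_heterogeneous = coefficient_of_variation(heterogeneous)
```

Identical branch rates give a value of zero, whereas the heterogeneous example gives a value above 1, the same order as the 1.42 reported here.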
In further studies, different evolutionary models will be tested using Bayesian analyses in order to clarify the rate differences in Tricyrtis lineages. Another objective will be to seek information on factors which modulate evolutionary rates in lineages of different organisms. Furthermore, it will be interesting to investigate nuclear ribosomal DNA (nrDNA) nucleotide sequence divergence in comparison to chloroplast or plastid DNA data in terms of evolutionary rate and molecular dating studies.

Figure 1. One of the most parsimonious trees (randomly chosen) out of 97400 trees based on a combined analysis of matK spacer and rps16 intron sequences (tree length = 187, CI = 0.9198, RI = 0.8750). Thirteen indels were coded as binary characters (presence / absence). The numbers above the branches refer to character changes (base substitutions), with decay values in brackets. The numbers below the branches indicate bootstrap / jackknife values (in bold).
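The tree statistics quoted in the caption follow the standard definitions CI = m / s and RI = (g - s) / (g - m), where s is the tree length, m the minimum and g the maximum possible number of steps over all characters. In the sketch below, s = 187 is taken from the caption, whereas m = 172 and g = 292 are back-calculated by us from the reported CI and RI; they are assumptions and are not values stated in the source.

```python
def consistency_index(min_steps, tree_length):
    """CI = m / s: minimum possible steps over observed tree length."""
    return min_steps / tree_length


def retention_index(min_steps, max_steps, tree_length):
    """RI = (g - s) / (g - m)."""
    return (max_steps - tree_length) / (max_steps - min_steps)


TREE_LENGTH = 187   # reported in the caption
MIN_STEPS = 172     # assumption: back-calculated from CI = 0.9198
MAX_STEPS = 292     # assumption: back-calculated from RI = 0.8750

ci = consistency_index(MIN_STEPS, TREE_LENGTH)
ri = retention_index(MIN_STEPS, MAX_STEPS, TREE_LENGTH)
```

Under these assumed step counts the two functions reproduce the reported values to four decimal places.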

Figure 2. A Maximum Clade Credibility (MCC) tree obtained from Bayesian analysis of Tricyrtis matK spacer nucleotide sequence data (BEAST v1.6.1, Drummond et al., 2007). A relaxed clock model (uncorrelated lognormal distribution) with the HKY substitution model was tested. The scale at the bottom is in Mya (million years ago). The bars on the nodes denote 95% HPD (Highest Posterior Density) ranges, indicating credible intervals. The node ages are marked and numbered from N1 through N7 (brown color) for nodes with posterior probability greater than 0.5 (numbered in green color on the branches; Drummond et al., 2006). The branch colours from red (fast rate) to blue (slow rate) are also shown next to the probability values on the branches. The star (black) symbols refer to calibration sites. The blue bars on the right side indicate different sections of the genus (refer to Table 2).

Table 4. Tricyrtis phylogenetic analyses information with non-coded indels (insertions/deletions)
section for further search option information
autapomorphic substitutions were excluded from the analyses
MPT denotes most parsimonious trees

Table 5. Tricyrtis phylogenetic analyses information with coded indels (insertions/deletions)

of the Evolutionary Rate and Divergence Time of Tricyrtis Based Upon matK Spacer DNA Nucleotide Sequence data

Table 1. Details of taxa studied, DNA sequences, and EMBL/GenBank accession numbers

Table 2. Previous taxonomic treatments of the genus Tricyrtis Wall. Masamune (

Table 3. Indel information of combined matK spacer and rps16 intron sequences of Tricyrtis