CRISPR/Cas9-Mediated Targeted Mutagenesis of Wild Soybean (Glycine soja) Hairy Roots Altered the Transcription Profile of the Mutant

Clustered Regularly Interspaced Short Palindromic Repeat/CRISPR-associated protein 9 (CRISPR/Cas9) system has been regularly applied for genome editing and gene function identification in wild soybean (Glycine max) cultivars. However, till date no studies have demonstrated successful mutagenesis in wild soybean (Glycine soja) which is the ancestor of Glycine max and rich in stress tolerance genes. In the current study, we report the successful creation of mutations in the loci encoding plasma membrane Na/H antiporter (SOS1) and nonselective cation channels (NSCC) in wild soybean hairy roots using the CRISPR/Cas9 system. Two genes, GsSOS1 and GsNSCC, were mutagenized with frequencies of 28.5% and 39.9%, respectively. Biallelic mutations in GsSOS1 were detected in transgenic hairy roots. GsSOS1 mutants exhibited altered Na/K ratios in the roots under both control and salt-treated conditions. However, no significant effects of GsNSCC mutation on Na/K ratios were observed. RNA-Seq analysis revealed that both GsSOS1 and GsNSCC mutation altered the transcription profiles in mutant roots. Many differentially expressed gene sets that are associated with various cellular functions were identified. Our results demonstrated that CRISPR/Cas9 systems as powerful tools for wild soybean genome editing and would significantly advance the gene mining and functional identification in wild soybean.


Introduction
It is well known that soybean (Glycine max) is one of the most important crops of the world (Lam et al., 2010;Smil, 2000). Wild soybean (Glycine soja) is the known ancestor of Glycine max, possessing much greater adaptability to a variety of environmental stresses and is believed to be rich in stress tolerance genes. It has, therefore, been suggested as a potential source of germplasm to improve the agronomic traits of cultivated soybean (Ge et al., 2011;Wen et al., 2009). However, so far, only a few resistance genes have been successfully mined from wild soybean. In recent years, significant progress has been made in the utilization of CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeat/CRISPR-associated protein 9) system for genome editing and functional studies in various crops.
Multiple reports have described the generation of mutations using CRISPR/Cas9 in soybean (Glycine max), including somatic mutation and whole-plant soybean mutagenesis. In earlier years, the hairy soybean root system was used to detect the target gene editing efficiency with different sgRNA promoters Du et al., 2016) or different target loci (Jacobs et al., 2015) and of endogenous and exogenous genes (Cai., 2015), which significantly improved the optimization of CRISPR/Cas9 system in soybean functional genomic research. Earlier, gene function investigations with CRISPR/Cas9 system combined with the hairy root transformation (due to its high efficiency and time-saving) was successfully implemented. For example, overexpression of GmMYB118 in soybean hairy roots, significantly improved the plants' drought and salt tolerance when compared with CRISPR-only transformed plants (Du et al., 2018). In recent years, before commencing the whole-plant soybean mutagenesis, the CRISPR/Cas9 constructs functionality are assessed by transient expression in the hairy roots (Do et al., 2018;Curtin et al., 2018), thus strengthening the realization of hairy roots as an excellent transgenic model system for soybean transformation/mutagenesis. Based upon above-mentioned information and other reports in Glycine max, we investigated for the existence of studies wherein successful transformation and mutagenesis in the wild soybean (Glycine soja) using root hair was reported. To our surprise, till date, no studies have reported successful utilization of the CRISPR/Cas9 for mutagenesis in wild soybean (Glycine soja).
Hence, the first question we want to address in this study is to see if we could perform mutagenesis in Glycine soja hairy roots in the same way as Glycine max. For this purpose, we selected two target genes GsSOS1 and GsNSCC. Salt overly sensitive1 (SOS1), encodes a plasma membrane Na + /H + antiporter, which exports Na + to the apoplast (Shi et al., 2000(Shi et al., , 2002. Nonselective cation channels (NSCC) in the plasma membrane of higher plants form a large and diverse group of plant cation channels which are the major pathway for Na + influx into root cells (Amtamnn & Sanders, 1999;Tyerman & Skerrett, 1999;White, 1999). Because the unidirectional influx of Na + is rapid and greatly exceeds the rate of accumulation, efficient efflux of Na + to apoplast must function to minimize net uptake and achieve ion homeostasis in the plant cell (Tester & Davenport, 2003). Hence, SOS1 is a crucial component of plants in the defence against sodium ions that have entered the cytoplasm. Mutants of Arabidopsis lacking SOS1 are highly salt-sensitive and does not possess an effective Na + extrusion mechanism (Shi et al., 2002;Qiu et al., 2002). In addition to Na + efflux, some reports also highlighted the critical roles of SOS1 in supporting vacuolar morphology, ion homeostasis, and membrane trafficking, thus mediating salt tolerance of root cells during the early stages of salt stress (Oh et al., 2010a). Further, SOS1 mutants also exhibit altered plants pathogen responses and circadian rhythm (Oh et al., 2010b). Therefore, it is highly likely that the complex structure of the large SOS1 protein is involved in more than one function. Like SOS1, NSCC also has multiple functions. Apart from Na + influx, NSCC was shown to be involved in the uptake of K + , NH 4 + , Ca 2+ , Mg 2+ , micronutrients and trace elements, in ROS-, amino acid, purine-and cyclic nucleotide induced signaling, growth and development (Demidchik and Maathuis 2007). Both SOS1 and NSCC are related to Na + transportation and have multiple functions. The second question we want to address is the alterations in the transcriptome as a result of the mutagenesis of GsSOS1 and GsNSCC. In this study, we tried to perform CRISPR/Cas9 system in wild soybean to generate the mutagenesis of GsSOS1 and GsNSCC genes and detect the effects of gene mutation on phenotype and transcriptome.

Plant Material
The wild soybean seeds of G. soja (ZYD1239) were germinated in a growth chamber maintained at 25 °C with 16/8hrs of light/dark cycle. One-week old seedlings were selected for A. rhizogenes-mediated transformation Ming et al., 2018). After the initiation of hairy root formation from the infection site, the hairy roots were covered with vermiculite to maintain high humidity. After 30 days, each hairy root was cut into two parts. One part was used for DNA extraction (for mutation detection), and the other part was used for total RNA extraction for RNA-seq analysis. The seedling, which does not exhibit any mutation, their hairy and the primary roots were excised and used as controls for those with the mutation. Both seedlings (with and without mutations) were subsequently treated for six hours with either water or 250 mM NaCl. Post-treatment, the roots, stems and leaves were collected to measure Na + , K + concentrations. Further, plants with all hairy roots were also directly subjected to control and salt stress. All the roots (with gene mutation), stem, leaves were collected from each plant to determine the levels of Na + , K + ions.

Vector Construction
A codon-optimized Cas9 gene with an NLS was obtained from Professor Qu (Qu, State Key Laboratory for Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, College of Life Sciences, Peking University) and used for generation of pCambia3301-Cas9 and pUC57-GmU6-sgRNA vectors as described earlier in Sun et al. (2015).

Detection of Mutations in Target Genes
Genomic DNA was extracted using a DNAsecure Plant Kit (Tiangen, China) according to the manufacturer's instructions. The target genes were PCR amplified using with gene-specific primers (Table A2) using genomic DNA of each hairy root as a template. The PCR products were digested for 30 min with BglⅡ and ApalI, respectively. The undigested bands were purified and sequenced to detect gene mutation(s).

Determination of Ion Concentration
The harvested seedlings were separated into roots, stems and leaves and were initially oven-dried for 30 min at 105 o C and then at 65 o C until a constant weight is recorded. The fully dried tissues were weighted and grounded to a fine powder. The powdered material was digested with nitric acid and the total volume was made to 15 ml using ddH 2 O (double distilled water). The solution without tissues samples was used as blank. K + and Na + concentrations were measured by ICP-OES spectrometer (ICAP6300, Thermo Fisher Scientific, US). The experiments were repeated three times. The ion content was determined using the following formula: Ion content (mg/kg) = (The sample concentration -Blank control concentration) × Volume/Dry weight.

RNA-Seq Profiling Experiment
The hairy roots from the three sample groups (CK, NSCC and SOS1) were collected for RNA extraction using the RNAprep Pure Plant Kit (TIANGEN, China) according to the manufacturer's instructions. The qualified RNA samples were used for RNA-seq analysis. The strand-specific cDNA library was constructed as described before (Jiang et al., 2017). Suitable enriched fragments were sequenced using a HiSeq 2500 instrument (Illumina, USA).
Raw reads obtained from sequencing were filtered to exclude reads containing adapters, reads with more than 10% unknown nucleotides and low-quality reads containing more than 50% of bases with a quality score of ≤ 5 to obtain clean reads. The cleaned reads were mapped to the G. max references sequence using TopHat2 software with a tolerance of two mismatches (Kim et al., 2013). The soybean (Glycine max (Linn.) Merr.) genomic sequence available from the database (ftp://ftp.ensemblgenomes.org/pub/release-35/plants/fasta/glycine_max/ dna/Glycinemax.V1.0.dna. toplevel.fa.gz) was used for mapping of the reads. The basic sequencing results and assembly information are summarized in Table A3. The DEseq package was used to estimate differential gene expression after standardization of reading count (Anders and Huber 2012). The differentially expressed genes were considered to be significant at False Discovery Rate (FDR) < 0.01 and absolute fold change ≥ 2. GO annotation was carried out using Blast2GO software (Young et al., 2010). KEGG pathway annotation was performed using Path_finder software against the KEGG database (Kanehisa et al., 2008). After GO annotation of every unigene, WEGO was used to assign GO functions to all unigenes and to determine the distribution of gene functions of the species.

RT-PCR Assays
Sequencing results were validated by qRT-PCR analysis of a randomly selected set of genes. The total RNA was DNaseI-treated and used for cDNA synthesis. First-strand synthesis was carried out using Superscript III reverse transcriptase module (Invitrogen, USA) as the manufacturer's protocol. The comparative ΔΔCT method was used for relative quantitation of expression of genes (Schmittgen and Livak 2008). The Gsr18SRNA gene was amplified as an internal control. Primers used in this study were listed in Table A2.

Targeted Mutagenesis in Soybean Hairy Roots
The CRISPR/Cas9-mediated genome-editing tool was utilized to edit the endogenous gene GsSOS1 and GsNSCC in wild hairy soybean roots. PCR/RE assays were conducted to detect mutations in GsSOS1 and GsNSCC target regions. The PCR-RE assay showed that gene mutations were induced (Figure1) and mutagenesis efficiencies for GsSOS1 and GsNSCC were 28.5% and 39.9%, respectively (Table 1). We detected the biallelic and monoallelic mutations in GsSOS1 and only the monoallelic mutations in GsNSCC ( Figure 1). The undigested bands from the PCR-RE assay were cloned and sequenced to confirm the mutation. Most mutations in the two genes predominantly were multiple-nucleotide deletion. However, some rare nucleotide substitutions and insertions were also detected ( Figure A1). Sequencing of gene clones from independent mutant roots revealed a variety of mutations for each root, suggesting that the introduced CRISPR-Cas9 system continued to modify the genes during hairy root development ( Figure A1). Figure 2. Na + /K + ratio in different tissues of GsSOS1 and GsNSCC mutants and wild type plants Note. After detection of mutation in the roots (and subsequently removing the roots without mutation), leaves and stems tissues were collected to determine Na + and K + levels. S and N: Salt and Normal; SOS/NSCC and WT: SOS/NSCC mutation and Wild type; R, S and L: Root, Stem and Leaf. * indicate a significant difference at P < 0.05 compared with the corresponding controls.

Effects of Gene Mutation on the Expression of target and Non-target Genes
Nine mRNA libraries of GsmSOS1 (SOS), GsNSCC (NSCC) mutants and wild type (CK) soybean roots, each with three replicates, generated a total of 140.39 Gb raw reads via the Illumina/Solexa sequencing platform and obtained 138.42 Gb of clean reads after analysis. Mapping of reads indicated, ~81.03 to 83.84% clean reads, ~79.05 to 81.95% unique reads and one perfectly matching locus in the soybean genome (Table A3).
GsSOS1 and GsNSCC mutation affected the expression of many other genes in wild soybean roots. There were 571 up-regulated and 1246 genes down-regulated in GsNSCC mutant roots when compared with the wild type. In GsSOS1 mutant roots, 908 and 1031 genes were up-and down-regulated were observed ( Figure 3A, Supplementary File 1). The expression of the 887-common set of genes was significantly affected either of the mutations ( Figure 3B). These results indicate a) both GsSOS1 and GsNSCC are associated with common pathways or b) be regulating genes with a similar pathway. To verify the expression profiles of differentially expressed genes, about forty genes were randomly selected, and their expressions were validated using qRT-PCR. Among these, thirty-five genes expression profiles matched with that observed in RNAseq data (Figure 4).  Gene ID is shown on the x-axis. The comparative ΔΔCT method was used for the qRT-PCR experiments, and GsRNAr-18S was selected as the reference.

The Function Prediction of the DEGs Affected by the Gene Mutation
Blast2GO software was used to assign GO and KEGG functional classifications the DEGs. As a result, The DEGs were successfully classified into the three main GO categories of biological process, cellular component, and molecular function. The DEGs affected by GsSOS1 and GsNSCC mutation were further categorized into 40, and 38 GO functional groups respectively ( Figure A3). Further, the DEGs affected by GsSOS1 and GsNSCC mutation were mapped to 94 and 95 KEGG pathways respectively (Supplementary File 2). The pathways with the highest unigene representations were those associated with plant hormone signal transduction, phenylpropanoid biosynthesis, starch-sucrose metabolism, and plant-pathogen interactions ( Figure A4).
The results of GO enrichment analysis showed that GsSOS1 and GsNSCC mutation had significant effects on the expression of many genes ( Figure 5). For example, the biological process, the number of genes related to the protein phosphorylation in GsNSCC mutant, and transcription regulation and nodulation in GsSOS1 mutant respectively, were most affected ( Figure 5). Interestingly, few genes that exhibited up-regulation in GsSOS1 mutant roots were also associated with the nodulation process ( Figure A5). Significantly, both GsSOS1 and GsNSCC mutations altered the expression of a set of the gene that is associated with transporter activity,  whole-plant transformation in soybean (Taylor et al., 2006), many researchers carried out the soybean gene editing research in hairy roots Du et al., 2016;Jacobs et al., 2015;Cai et al., 2015;Du et al., 2018;Ming et al., 2018). Hairy soybean roots are an excellent model system for transformation and mutagenesis, for carrying out a functional study of the genes. For example, overexpression of GmMYB118 in hairy soybean roots, led to improved drought and salt tolerance of the plants. Furthermore, CRISPR-transformed plants exhibited reduced stresses tolerance (Du et al., 2018). GmNAC15 overexpression in hairy roots of soybean enhances their salt tolerance (Ming et al., 2018). However, the hairy root system cannot regenerate whole transgenic plants, and therefore successful heritable mutations cannot be achieved. Furthermore, the mutation frequencies in hairy roots are hard to be 100% and exhibit high variability, due to design of sgRNAs, different promoters driving sgRNA cassette and the target gene Du et al., 2016;Cai et al., 2015). The mutant and the normal roots grow together which might cause some deviation in phenotypic and physiological data of CRISPR-transformed plants. The mutation efficiencies for GsSOS1 and GsNSCC in this study were 28.5 and 39.9%, respectively. When all the hairy roots of each CRISPR-transformed plant were together collected, the Na + /K + ration between mutants and the wild type exhibited no significant difference under normal and salt stress conditions. Exclusion of hairy roots with no mutation, the significant difference of Na + /K + ratios in roots was detected between GsSOS1 mutant and wild type. No significant effects of GsNSCC mutation on Na + /K + ratios in different tissues under normal and salt treatment conditions were detected, perhaps because GsNSCC has two copies in the genome, while only one copy of GsSOS1 ( Supplementary File 3 and 4). Hence, some researchers study the sgRNA efficiencies with a hairy root system and identify the gene function with whole-plant transformation method (Do et al., 2019;Curtin et al., 2018). Therefore, our transcriptome analysis was based on the mutant hairy roots of target genes to study the influence of the target gene mutation on other gene expressions.

CRISPR
SOS1, a plasma membrane Na + /H + -antiporter, is known as a crucial component in the defence of plants against sodium ions that have entered the cytoplasm. The mutations in Arabidopsis SOS1 results in (i) enhanced sensitivity to higher levels of NaCl (Wu et al., 1996;Zhu et al., 1998), (ii) accumulation of significantly higher levels of Na + in mutants than that of wild type under salt stress (Shi et al., 2000), and (iii) K + acquisition impairment and alterations in Na + to K + ratios (Wu et al., 1996). In present study, GsSOS1 was edited in the roots with CRISPR-cas9 system into different mutant types. Under 250 mM NaCl treatment, Na + /K + ratios in GsSOS1 mutant roots were significantly higher than that of the wild type. Many studies have previously reported that SOS1 appears to have many functions, that includes supporting vacuolar morphology, ion homeostasis and membrane trafficking. Many of these functions are critically associated with tolerance of root tissue during the early stages of salt stress. In this study, with RNA-seq analysis, GsSOS1 mutation affected the expression of many other genes in wild soybean roots. Oh et al. (2010a) also outlined the function of genes and pathways that are affected when SOS1 is either mutated by T-DNA insertion or by the reduction of SOS1 transcripts through RNAi-interference (Oh, et al., 2010a). Oh et al. (2010a) also reported that non-availability of SOS1, altered the expression of genes related to pH homeostasis, membrane trafficking and ion transportation during salt stress. The similar results were obtained in this study. Eight aquaporin genes, including PIPs, TIPs, SIPs and NIPs, were differentially expressed in GsSOS1 edited roots compared with wild type. The expression differences of some the K + , Na + and Ca + transporters between GsSOS1 mutant and wild type were also observed, Some of the K + channel protein-coding genes SKOR and AKT1, sodium/calcium exchanger, sodium transporter HKT1, as well as 11 other intracellular traffic-related genes (Supplementary File 3). In Arabidopsis SOS1 mutant, expression of CNGC19 gene, encoding a calmodulin-binding cyclic nucleotide-gated channel, is strongly upregulated (Oh et al., 2010b). In this study, the alteration in the expression levels of four CNGC genes in GsSOS1 mutant roots was also detected. In addition, KEGG analysis of DEGs in GsSOS1 mutant roots showed that GsSOS1 mutation significantly influenced the expression of many other genes that involved in different pathway, including hormone signal transduction, phenylpropanoid biosynthesis, starch-sucrose metabolism, circadian rhythm and plant-pathogen interactions ( Figure A4). Earlier, a few of the reports also suggest biological functions that are affected in SOS1 mutants of Arabidopsis, includes plants pathogen responses and circadian rhythm (Oh et al., 2010b). Hence, it is highly likely that the SOS1 is involved in more than one functions such as those observed in the current study.
Nonselective cation channels (NSCC) in the plasma membrane of higher plants form a large and diverse group of plant cation channels by which the bulk of toxic Na + influx into plant roots (Amtmann & Sanders, 2002;Tyerman & Skerrett, 1999;White, 1999). In this study, GsNSCC mutation altered the expression of many genes that are integral components of the membrane (Figure 6). Unlike GsSOS mutants, GsNSCC mutation did not cause any change in the Na + /K + ratios under normal condition and salt treatment (Figure 3). This raises the possibility of involvement of other types of NSCCs for Na + influx in the root cells (Tyerman, 2002). In the Arabidopsis, 20 Glu receptor family genes may form nonselective ion channels (Lacombe et al., 2001). Apart from toxic Na + influx, NSCCs were also involved in nutritional uptake of K + , NH 4 + , Ca 2+ , Mg 2+ , micronutrients including trace elements, in ROS-, amino acid, purine-and cyclic nucleotide induced signaling, growth and development (Demidchik et al., 2002). In this study, GsNSCC mutation altered the expression of 1817 genes that are involved in different metabolism pathways (Figures 6 and A4). One notable effect was detected on the genes associated with protein phosphorylation of biological process or protein serine/threonine kinase of molecular function. In other studies, it was reported that the activity of some of these nonselective cation channels is modulated by phosphorylation (Kaupp & Seifert, 2002). Hence, it can be postulated that GsNSCC may be involved in protein phosphorylation or GsNSCC mutant may need the functional complementation of other NSCC protein and the activity of NSCC modulated by phosphorylation. As SOS1, NSCC may also be involved in more than one function. Both NSCC and SOS are also associated with biological processes (albeit in different ways) such as Na + transportation, that may be the reason of sharing 887 differentially expressed genes between two mutants of GsNSCC and GsSOS. We also found many genes whose expression were significantly altered uniquely in a mutant specific manner. For example, from the results of hierarchical clustering analysis of differentially expressed genes, some genes were up-regulated only in GsSOS1 mutant but not in GsNSCC mutant. Majority of these genes are involved in nodulation ( Figure A5).
In conclusion, the mutants with CRISPR/Cas9 gene-editing system in the hairy roots of wild soybean were successfully generated. Two genes, GsSOS1 and GsNSCC, were mutagenized with frequencies of 28.5% and 39.9%, respectively. GsSOS1 and GsNSCC mutations significantly altered the transcriptome of mutant roots. Further, many differentially expressed genes are associated with various cellular functions according to the multiple functions of SOS1 and NSCC proteins. The wild soybean genome editing would advance the gene mining and functional identification in wild soybean for improving the agronomic traits of cultivated soybean.  Note. Wild-type sequences of the target genes were shown with the protospacer adjacent motif sequence highlighted in red. The numbers of the changed nucleotides were shown to the right of each sequence. +, -and S indicate insertion, deletionand substitution. Inserted and substituted nucleotides were shown in green.   Figure A5. Hierarchical clustering analysis of differentially expressed genes from GsSOS1 or GsNSCC mutation roots (SOS, NSCC) and wild type control (CK), each with 3 replicates (-1, -2 and -3). Rows represent individual genes. Genes that increased and decreased in abundance are indicated in red and green respectively Vol. 12,No. 9; e journal.
mmons Attrib indicated in red and green, respectively 2020 ution