Abstract:
【Objective】 This study analyzed the SSR and SNP sequence features and the phylogeny of the transcriptome of the wild sugarcane species Narenga porphyrocoma (Hance) Bor., to provide a reference for in-depth studies of molecular marker development, germplasm resource utilization, population genetic structure, and the historical dynamics of differentiation in sugarcane plants. 【Method】 Based on transcriptome data of N. porphyrocoma, the MISA and SOAPSnp software packages were used to mine SSR and SNP loci and to analyze the sequence features of the obtained Unigenes. Transcriptome data of Brachypodium distachyon, Oryza sativa, Setaria italica, Sorghum bicolor, Zea mays, and Arabidopsis thaliana were downloaded from the JGI database, and the maximum likelihood (ML) method was used to construct a phylogenetic tree and estimate species divergence times. 【Result】 Sequencing the transcriptome of N. porphyrocoma yielded 171,000,000 raw reads; 156,800,000 clean reads remained after data filtering, and 130,393 Unigenes were obtained after assembly. Of these, 14,233 Unigenes contained 16,372 SSR loci, an occurrence frequency of 12.56%. There were 1,839 Unigenes containing more than one SSR locus, and 656 compound SSR loci. SSR repeat motif types were abundant, ranging from mononucleotide to hexanucleotide repeats, with 612 repeat motif types in total. The most numerous type was trinucleotide repeats (49.16%), followed by dinucleotide repeats (25.54%) and mononucleotide repeats (18.30%). Among all nucleotide repeat types, there were 14 types for which the proportion of the repeat motif in the total number of SSR loci was <0.50%, and the three most frequent repeat motifs were CCG/CGG, A/T, and AG/CT. The sequence lengths of the SSR loci were 12-191 bp. A total of 15,123 SSR loci had a sequence length ≤ 25 bp, accounting for 96.23% of all SSR loci. Among them, loci with an SSR length of 15 bp were the most numerous, accounting for 32.16% of all SSR loci. The number of repeats of the SSR motifs was 4-24, with 5, 6, and 7 repeats dominant. There were 222,106 SNP loci in the transcriptome sequence of N. porphyrocoma, an average of 1.70 SNP loci per Unigene. The proportion of transitions (65.92%) was higher than that of transversions (34.08%). Among the six single nucleotide variant types, A/G had the highest frequency (33.07%), followed by C/T (32.84%). Phylogenetic analysis and divergence time estimation showed that N. porphyrocoma was most closely related to S. bicolor, with a divergence time of 14.6 million years ago (Ma). 【Conclusion】 The SSR and SNP loci in the transcriptome of N. porphyrocoma are abundant and highly polymorphic, indicating that transcriptome sequencing is a feasible way to develop SSR and SNP molecular markers in sugarcane. Constructing phylogenetic trees from transcriptome data can also be applied to phylogenetic analyses of other species lacking genomic data.
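
The summary statistics reported above can be reproduced with a few lines of arithmetic. The following Python sketch is illustrative only and is not the pipeline used in the study: the function names are hypothetical, and it assumes that the SSR occurrence frequency is the number of SSR loci divided by the number of Unigenes, a definition inferred from the reported counts rather than stated in the abstract. It also shows the standard rule for classifying a SNP as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion.

```python
# Counts taken from the abstract.
UNIGENES = 130393
SSR_LOCI = 16372
SNP_LOCI = 222106

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ssr_frequency(ssr_loci: int, unigenes: int) -> float:
    """SSR occurrence frequency in percent (assumed: loci per Unigene x 100)."""
    return 100.0 * ssr_loci / unigenes


def snps_per_unigene(snp_loci: int, unigenes: int) -> float:
    """Average number of SNP loci per Unigene."""
    return snp_loci / unigenes


def snp_class(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    pair = {ref, alt}
    # Both purines or both pyrimidines: transition; otherwise transversion.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


if __name__ == "__main__":
    print(f"SSR frequency: {ssr_frequency(SSR_LOCI, UNIGENES):.2f}%")
    print(f"SNPs per Unigene: {snps_per_unigene(SNP_LOCI, UNIGENES):.2f}")
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(f"{ref}/{alt}: {snp_class(ref, alt)}")
```

Under this definition the computed values agree with the reported 12.56% SSR frequency and 1.70 SNP loci per Unigene, and the two most frequent variant types, A/G and C/T, are both classified as transitions.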