基于转录组测序的棘胸蛙SSR和SNP分子标记开发

SSR and SNP molecular marker development based on Quasipaa spinosa transcriptome sequencing

  • 摘要: 【目的】基于转录组开发SSR和SNP分子标记用于评价棘胸蛙(Quasipaa spinosa)遗传多样性,为其种质资源的创新利用提供理论支撑。【方法】采用TRIzol试剂盒提取棘胸蛙肝脏、肌肉和肾脏组织总RNA,构建cDNA文库后利用Illumina HiSeq 2500测序平台进行高通量测序,通过MISA对棘胸蛙转录组测序数据进行SSR检索,并以SAMtools和VarScan v.2.2.7进行SNP查找。【结果】棘胸蛙转录组测序共获得93887条非冗余基因(Unigenes),序列总长度达91352712 bp,且所有转录组Q30均超过95.00%。在93887条Unigenes中发现33019个SSRs,其中21966条Unigenes含有SSR,6688条Unigenes含有超过1个SSR;以单核苷酸重复型SSR数最多,达25788个,且出现频率最高(27.47%)。SSR的平均长度以四核苷酸重复型最长,达35.47 bp;棘胸蛙SSR以(A/T)n为绝对优势重复基元,占总SSR的65.51%,然后依次为(C/G)n、(AT/AT)n、(AC/GT)n、(AG/CT)n、(AAT/ATT)n、(AGG/CCT)n,分别占总SSR的12.59%、5.66%、5.55%、3.31%、1.55%和1.30%。在33019个SSRs中,核苷酸重复次数主要集中在5~25次,占总SSR的99.91%,且大部分SSR位于非编码区,仅有1633个SSRs位于编码区;长度≥12 bp的SSR共计17244个,占总SSR的58.53%。挑选120对SSR引物进行引物有效性验证,发现有57对引物扩增出单一条带,且条带大小与预期结果一致。对棘胸蛙转录组序列进行SNP检索,共发现87634个SNPs,其中,56300个SNPs属于转换位点、31334个SNPs属于颠换位点,转化/颠换比达1.80,碱基的转换频率明显高于颠换频率。【结论】利用高通量转录组测序开发棘胸蛙SSR和SNP分子标记是一种切实可行的方法,能开发出通用性较高、数量较多、覆盖性较广的分子标记。棘胸蛙具有中度偏高的遗传多样性,可作为种质材料进一步开发利用。

     

    Abstract: 【Objective】To develop SSR and SNP molecular markers from the transcriptome for evaluating the genetic diversity of Quasipaa spinosa, so as to provide appropriate molecular markers for the innovative utilization of its germplasm resources.【Method】Total RNA was extracted from liver, muscle and kidney tissues of Q. spinosa with a TRIzol kit. A cDNA library was constructed and then sequenced on the Illumina HiSeq 2500 platform. The microsatellite search software MISA was used to screen the Q. spinosa transcriptome for microsatellites (SSRs), and SAMtools and VarScan v.2.2.7 were used to search for SNP loci.【Result】Transcriptome sequencing of Q. spinosa yielded 93887 non-redundant unigenes with a total sequence length of 91352712 bp, and Q30 exceeded 95.00% for all transcriptomes. Among the 93887 unigenes, 33019 potential SSRs were identified; 21966 unigenes contained SSR loci, and 6688 unigenes contained more than one SSR. Mononucleotide repeats were the most abundant type, with 25788 SSRs, and had the highest frequency of occurrence (27.47%). Tetranucleotide repeats had the longest average SSR length, at 35.47 bp. (A/T)n was the overwhelmingly dominant repeat motif, accounting for 65.51% of all SSRs, followed by (C/G)n, (AT/AT)n, (AC/GT)n, (AG/CT)n, (AAT/ATT)n and (AGG/CCT)n, which accounted for 12.59%, 5.66%, 5.55%, 3.31%, 1.55% and 1.30% of all SSRs, respectively. Among the 33019 SSRs, the number of repeats was mainly between 5 and 25, covering 99.91% of all SSRs. Most SSRs were located in non-coding regions, and only 1633 were located in coding regions. A total of 17244 SSRs were ≥12 bp in length, accounting for 58.53% of all SSRs. Of the 120 SSR primer pairs selected for validation, 57 amplified a single band of the expected size. By mapping sequencing reads to the assembled unigenes, 87634 SNPs were identified (56300 transitions and 31334 transversions); the transition/transversion ratio was approximately 1.80, so base transitions were clearly more frequent than transversions.【Conclusion】High-throughput transcriptome sequencing is a feasible way to develop SSR and SNP molecular markers for Q. spinosa, and it yields markers that are highly transferable, numerous and broad in coverage. Q. spinosa has moderately high genetic diversity and can be further developed and utilized as germplasm material.
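The two calculations behind the reported results (counting perfect tandem repeats and computing the transition/transversion ratio) can be illustrated with a minimal Python sketch. This is not MISA or VarScan: the function names and the minimum repeat thresholds in `MIN_REPEATS` are hypothetical choices made for illustration, since the abstract does not state the search parameters that were used.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 nt).
# The abstract does not report the thresholds used with MISA.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ACAC')."""
    k = len(motif)
    return any(
        k % d == 0 and motif == motif[:d] * (k // d)
        for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One captured motif of length k followed by at least n-1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_redundant(motif):
                continue  # already reported under the shorter motif length
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    if len(pair) != 2:
        raise ValueError("ref and alt must be different bases")
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(n_transitions, n_transversions):
    """Transition/transversion ratio from the two SNP counts."""
    return n_transitions / n_transversions


if __name__ == "__main__":
    demo = "CCG" + "A" * 12 + "TTG" + "AC" * 7 + "GGC" + "AAT" * 5 + "C"
    print(find_ssrs(demo))
    # SNP counts reported in the abstract: 56300 transitions, 31334 transversions.
    print(round(ts_tv_ratio(56300, 31334), 2))
```

With the counts given in the abstract, 56300 / 31334 rounds to the reported ratio of 1.80, and the two counts sum to the reported total of 87634 SNPs.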

     
