SSR distribution and sequence characteristics of Diptychus maculatus based on high-throughput transcriptome sequencing

  • Abstract: 【Objective】To analyze the distribution and sequence characteristics of SSRs in Diptychus maculatus based on high-throughput transcriptome sequencing, and to provide theoretical data for SSR molecular marker development, genetic diversity studies, genetic linkage map construction, germplasm resource identification and molecular-assisted breeding in D. maculatus.【Method】Skin tissue of D. maculatus was subjected to transcriptome sequencing on the Illumina NovaSeq high-throughput platform. The high-quality reads were assembled with Trinity, and MISA was used to analyze SSR distribution and sequence characteristics in the screened Unigenes longer than 1 kb.【Result】A total of 244.42 Gb of Clean data were obtained from the D. maculatus transcriptome, and 55182 Unigenes were obtained after assembly. Of these, 8903 Unigenes (16.13%) contained SSR loci, with 12899 SSR loci in total. After 616 compound SSR loci were removed, 12283 perfect SSR loci remained. Six repeat motif types were present, from mononucleotide to hexanucleotide. Mononucleotide repeats were the most numerous (6654, 54.17%), and the proportions of dinucleotide to hexanucleotide repeats decreased as motif length increased: 32.30%, 12.04%, 1.38%, 0.07% and 0.03%, respectively. A total of 348 different repeat motifs were found, among which trinucleotide repeats were the most diverse, with 145 kinds. Within the mononucleotide repeats, the most numerous type overall, (A)10 was the most frequent. SSR length varied widely, from 10 to 75 bp, with a total length of 157088 bp and a relative abundance of 0.19%. SSRs of 10 bp were the most numerous (3232, 26.31%), followed by those of 12, 11 and 14 bp (21.73%, 12.02% and 10.06%, respectively).【Conclusion】The D. maculatus transcriptome contains a large number of SSR loci that occur at relatively high frequency and show relatively good polymorphism, so they can be used to develop molecular markers for D. maculatus. The numbers of the different nucleotide repeat motif types differ greatly, and their distribution characteristics differ markedly.
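The kind of perfect-SSR scan and per-type tally reported in the abstract can be illustrated with a minimal Python sketch. This is not MISA's implementation, and the abstract does not state the thresholds used in the study. The minimum repeat counts in `MIN_REPEATS` are an assumption, chosen only because they agree with the reported minimum SSR length of 10 bp; the function names `find_perfect_ssrs` and `summarize_by_motif_length` are hypothetical. Compound SSRs are not handled.

```python
from collections import Counter

# ASSUMPTION: minimum repeat counts per motif length (1 = mono ... 6 = hexa).
# These are illustrative values, not the settings reported by the authors.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_reducible(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    k = len(motif)
    return any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0)


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        i = 0
        # Stop once fewer than k * n bases remain: no SSR of this type can fit.
        while i + k * n <= len(seq):
            motif = seq[i:i + k]
            if set(motif) <= set("ACGT") and not is_reducible(motif):
                # Extend the run while the next k bases repeat the motif exactly.
                j = i + k
                while seq[j:j + k] == motif:
                    j += k
                repeats = (j - i) // k
                if repeats >= n:
                    hits.append((i, motif, repeats))
                    i = j  # continue scanning after this SSR
                    continue
            i += 1
    return sorted(hits)


def summarize_by_motif_length(hits):
    """Map motif length -> (count, percentage of all SSRs found)."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {k: (c, round(100 * c / total, 2)) for k, c in sorted(counts.items())}


if __name__ == "__main__":
    # Toy sequence (made up for illustration, not from the study's data).
    demo = "GGC" + "A" * 10 + "CGT" + "AG" * 6 + "TTC" + "ACG" * 5 + "GT"
    ssrs = find_perfect_ssrs(demo)
    print(ssrs)
    print(summarize_by_motif_length(ssrs))
```

Applied to every Unigene in an assembly, the same tally would give the per-type proportions of the kind reported above; the SSR length in bp is simply the motif length multiplied by the repeat count.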

     
