LU Jin-peng, QIN Chun-xiu, LI Xiao, CHEN Dai-peng, WANG Jia-nan, LI Zhi-gang, LIU Wen-bo. 2023: Prediction and functional analysis of AM2As whole genome secretory proteins of Lasiodiplodia theobromae. Journal of Southern Agriculture, 54(11): 3136-3155. DOI: 10.3969/j.issn.2095-1191.2023.11.002

Prediction and functional analysis of AM2As whole genome secretory proteins of Lasiodiplodia theobromae

Abstract: 【Objective】This study predicted and functionally analyzed the secretory proteins encoded by the whole genome of Lasiodiplodia theobromae AM2As, to provide a theoretical reference for mining key pathogenicity genes of this pathogen and disease-resistance genes of cocoa.

【Method】Based on the whole-genome sequence of L. theobromae published in the NCBI database, SignalP 5.0, ProtComp 9.0, TMHMM 2.0, GPI-SOM, LipoP 1.0 and other bioinformatics software were used to predict, screen and functionally analyze the secretory protein sequences. Based on the similarity of the secretory protein sequences, the PHI, COG and eggNOG-mapper 5.0 databases were used for functional annotation, and MEGA X, GSDS 2.0 and TBtools 1.09 were then used to analyze the phylogeny of the secretory proteome, its gene structures and its promoter cis-acting elements.

【Result】The L. theobromae genome encoded 13054 proteins in total, of which 638 (4.89%) had the features of potential typical secretory proteins. The 638 secretory proteins ranged from 55 to 1777 amino acids in length. Proteins of 100 to 400 amino acids were the most common, accounting for 53.76% of the secretory protein sequences, which indicated that most of the secretory proteins were small proteins. The counts and proportions of the 20 amino acids differed markedly among the 638 secretory proteins: Ala had the highest proportion, and the proportion of each of the other amino acids did not exceed 10.00%. The signal peptides of the 638 secretory proteins were 15 to 38 amino acids long, and 409 of them were 17 to 20 amino acids long. In the signal peptides, the nonpolar amino acid Ala had the highest proportion, while Asp and Glu, which carry charged side chains, had the lowest. The amino acids at positions -3 and -1 of the signal peptide were relatively conserved, and the cleavage site was of the A-X-A type, which can be recognized and cleaved by type I signal peptidase (Sp I). The 638 secretory proteins were divided into seven groups, and members of different groups differed markedly in protein structure, function and conserved motifs. Functional annotations were obtained for 400 secretory proteins, mainly related to carbohydrate transport and metabolism, protein post-translational modification, and amino acid metabolism and transport. A total of 221 CAZymes and 244 effector proteins were predicted in the secretory proteome, of which 48 effector proteins could be functionally annotated with the PHI database.

【Conclusion】The functions of the secretory proteins of L. theobromae are mainly concentrated in carbohydrate transport and metabolism, post-translational modification, and amino acid metabolism and transport, alongside other unknown functions and pathogenicity-related functions. It is preliminarily inferred that, during infection and pathogenesis, L. theobromae AM2As is regulated by cis-acting elements responsive to abscisic acid, auxin, low temperature, anaerobic conditions and light. By secreting a large number of proteins involved in the degradation, metabolism and transport of carbohydrates, it establishes a parasitic relationship and then uses the metabolic products to accelerate its own infection, thereby causing disease.
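The signal peptide statistics reported in the abstract (amino acid composition, and conservation at positions -3 and -1 of the cleavage site) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the three sequences below are hypothetical toy signal peptides invented for illustration, and the function names `composition`, `position_counts` and `is_axa` are assumptions. Each string is taken to end at position -1, the residue immediately before the predicted cleavage site.

```python
from collections import Counter

# Hypothetical toy signal peptides (NOT data from the paper). Each string
# ends at position -1, the last residue before the predicted cleavage site.
signal_peptides = [
    "MKFSTLLAAAGLASA",
    "MRSFVALLPLLAGAVSA",
    "MHLSNLLVAATALAAPALA",
]

def composition(seqs):
    """Return each residue's share (%) of all residues in seqs."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}

def position_counts(seqs, pos):
    """Tally residues at a negative position relative to the cleavage site."""
    return Counter(s[pos] for s in seqs if len(s) >= -pos)

def is_axa(sp):
    """True if positions -3 and -1 are both Ala (a strict A-X-A pattern)."""
    return len(sp) >= 3 and sp[-3] == "A" and sp[-1] == "A"

comp = composition(signal_peptides)
most_common = max(comp, key=comp.get)          # residue with the largest share
minus3 = position_counts(signal_peptides, -3)  # residues seen at position -3
minus1 = position_counts(signal_peptides, -1)  # residues seen at position -1
axa_share = 100.0 * sum(map(is_axa, signal_peptides)) / len(signal_peptides)

print(f"most common residue: {most_common} ({comp[most_common]:.2f}%)")
print(f"-3 residues: {dict(minus3)}; -1 residues: {dict(minus1)}")
print(f"strict A-X-A sites: {axa_share:.2f}%")
```

On real data the input would be the signal peptide segments reported by a predictor such as SignalP. The strict Ala-only test here is a simplification, since the abstract describes positions -3 and -1 as relatively conserved rather than invariant.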

     
