Abstract:
Objective: To identify the WRKY transcription factor family in the mulberry genome and characterize its structure, providing a reference for revealing the biological function of this family.
Method: The number, types, gene structure, phylogeny, conserved domains and codon usage of the mulberry WRKY transcription factor family were analyzed by bioinformatics methods.
Result: A total of 55 mulberry WRKY transcription factor genes were identified from the mulberry whole-genome protein database, accounting for 0.188% of all mulberry genes (29261). The family was divided into six types based on intron number and fifteen types based on intron phase; twenty-seven genes contained two introns and twenty-five genes belonged to the 2-2 intron phase type. Phylogenetic analysis of the conserved domain showed that the mulberry WRKY proteins fell into three categories (Ⅰ, Ⅱ and Ⅲ). Category Ⅰ could be separated into ⅠN and ⅠC, and category Ⅱ into Ⅱa, Ⅱb, Ⅱc, Ⅱd and Ⅱe. Analysis of the conserved domains showed that five motifs were highly conserved. All mulberry WRKY proteins contained the C-terminal Motif 1, and category Ⅰ proteins additionally contained the N-terminal Motif 3. The promoter regions of the WRKY transcription factor family genes were rich in PBF (C2H2 zinc finger factors) and AHL (Arabidopsis thaliana hook factors) elements. Codon usage bias analysis showed that the effective number of codons (ENC) of the mulberry WRKY genes ranged from 48.00 to 60.00, and the GC content at the third codon position (GC3s) ranged from 0.330 to 0.722. The average hydropathicity values were all negative. Twenty-nine codons had a relative synonymous codon usage (RSCU) greater than 1.000; among them, more ended with A (six) or T (eleven) than with G (four) or C (eight).
Conclusion: Fifty-five members are identified in the mulberry WRKY transcription factor family. Genes with the same intron phase in the same category probably derive from a common ancestral gene and are related to gene duplication and genome rearrangement events. The protein sequences are highly conserved, and the proteins function under environmental stress. The codon usage bias of most genes is weak and is mainly affected by the selection pressure of base mutation.
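
The codon usage measures reported above can be illustrated with a minimal sketch. It is not the pipeline used in the study, and it assumes the standard genetic code and an in-frame coding sequence. The function names `split_codons`, `rscu` and `gc3s` and the toy sequence are hypothetical. RSCU is computed as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. GC3s is computed as the fraction of G or C at the third position of codons that have synonymous alternatives (Met, Trp and stop codons excluded). ENC is not computed here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur and have more than one codon."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are not scored
            families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue  # amino acid absent, or no synonymous alternative (Met, Trp)
        for c in family:
            # observed count divided by the mean count within the synonymous family
            values[c] = counts[c] * len(family) / total
    return values


def gc3s(cds):
    """Fraction of G/C at the third position of synonymous codons."""
    synonymous = [c for c in split_codons(cds)
                  if CODON_TABLE.get(c, "*") not in "*MW"]
    if not synonymous:
        raise ValueError("no synonymous codons in sequence")
    return sum(c[2] in "GC" for c in synonymous) / len(synonymous)


# Toy sequence (hypothetical, not a mulberry gene): Met, Ala x4, stop.
toy_cds = "ATGGCTGCTGCCGCATAA"
print(rscu(toy_cds))
print(gc3s(toy_cds))
```

In this sketch a codon with RSCU above 1 is used more often than expected under equal synonymous usage, which is the criterion behind the twenty-nine codons counted in the abstract.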