| Downloads | Citations | Views |
| 57 | 0 | 46 |
Abstract: Gene expression mainly comprises transcription, splicing, and translation. It is a highly regulated process in which many types of regulatory elements take part, so modeling and characterizing these elements is important for understanding how gene expression is regulated. This study proposes a short-sequence model based on shift sequence patterns and applies it to transcription promoters and splicing regulatory elements. Promoters are the core regulatory elements of gene transcription, and splicing regulatory elements take part in the recognition of splice sites. Classification experiments show that the model effectively identifies promoter sequences and splicing regulatory element sequences. The model was further applied to genomic variants whose splicing effects have been verified by biological experiments. The results show that the model effectively predicts the splicing effects of genomic variants, which further supports its validity.
Basic information:
DOI:10.13417/j.gab.037.004253
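Chinese Library Classification (CLC) number: Q75
Citation:
[1] Shi Mengjun (施梦军), Zhao Haifeng (赵海峰), Chen Hong (陈虹), et al. Analysis of regulatory short sequences in the human genome based on shift sequence patterns (基于移动序列模式分析人基因组调控短序列) [J]. Genomics and Applied Biology (基因组学与应用生物学), 2018, 37(10): 4253-4259. DOI: 10.13417/j.gab.037.004253.
Funding:
Jointly supported by the National Natural Science Foundation of China (No. 61300057; No. 81000321) and the Key Project of the Anhui Provincial Department of Education (KJ2016A040; KJ2013A007).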
2017-12-26