Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting
Kim, Jisu (1); Huang, De-Shuang (2); Han, Kyungsook (1)
2009-01-30
Journal: BMC Bioinformatics
ISSN: 1471-2105
Abstract

Background: Supervised learning and many stochastic methods for predicting protein-protein interactions require both negative and positive interactions in the training data set. Unlike positive interactions, negative interactions cannot be readily obtained from interaction data, so these must be generated. In protein-protein interactions and other molecular interactions as well, taking all non-positive interactions as negative interactions produces too many negative interactions for the positive interactions. Random selection from non-positive interactions is unsuitable, since the selected data may not reflect the original distribution of data.

Results: We developed a bootstrapping algorithm for generating a negative data set of arbitrary size from protein-protein interaction data. We also developed an efficient boosting algorithm for finding interacting motif pairs in human and virus proteins. The boosting algorithm showed the best performance (84.4% sensitivity and 75.9% specificity) with balanced positive and negative data sets. The boosting algorithm was also used to find potential motif pairs in complexes of human and virus proteins, for which structural data was not used to train the algorithm. Interacting motif pairs common to multiple folds of structural data for the complexes were proven to be statistically significant. The data set for interactions between human and virus proteins was extracted from BOND and is available at http://virus.hpid.org/interactions.aspx. The complexes of human and virus proteins were extracted from PDB and their identifiers are available at http://virus.hpid.org/PDB_IDs.html.

Conclusion: When the positive and negative training data sets are unbalanced, the result via the prediction model tends to be biased. Bootstrapping is effective for generating a negative data set, for which the size and distribution are easily controlled. Our boosting algorithm, trained with the balanced data sets generated via the bootstrapping method, could efficiently predict interacting motif pairs from protein interaction and sequence data.
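The abstract does not spell out the bootstrapping procedure, so the following is only a minimal sketch of the general idea: resampling with replacement from the non-positive pairs yields a negative set of any requested size, here matched to the number of positives. The function names, the toy protein identifiers, and the confusion-matrix counts are all hypothetical, and the plain uniform resampling shown is a generic bootstrap, not the authors' algorithm (which is described as also controlling the distribution of the sampled data). The second function applies the standard definitions of sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP), which are the measures the abstract reports.

```python
import random
from itertools import product


def bootstrap_negatives(human, virus, positives, size, seed=0):
    """Draw `size` non-positive (human, virus) pairs with replacement.

    Generic bootstrap resampling; an illustrative assumption, not the
    procedure used in the paper.
    """
    positive_set = set(positives)
    # Candidate negatives: every cross-species pair not known to interact.
    candidates = [p for p in product(human, virus) if p not in positive_set]
    if not candidates:
        raise ValueError("no non-positive pairs to sample from")
    rng = random.Random(seed)
    # Sampling with replacement lets `size` exceed len(candidates).
    return rng.choices(candidates, k=size)


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data, for illustration only.
human = ["H1", "H2", "H3"]
virus = ["V1", "V2"]
positives = [("H1", "V1"), ("H2", "V2")]

# Balanced training data: as many negatives as positives.
negatives = bootstrap_negatives(human, virus, positives, size=len(positives))
```

Calling `bootstrap_negatives` with a larger `size` produces an unbalanced set in the same way, which makes it easy to compare a classifier's sensitivity and specificity under different positive-to-negative ratios.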

DOI: 10.1186/1471-2105-10-S1-S57
Language: English
WOS accession number: BMC:10.1186/1471-2105-10-S1-S57
Publisher: BioMed Central
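Document type: Journal article
Item identifier: http://ir.hfcas.ac.cn:8080/handle/334002/34664
Collection: Hefei Institute of Intelligent Machines, Chinese Academy of Sciences
Corresponding author: Han, Kyungsook
Author affiliations:
1. Inha University; School of Computer Science and Engineering
2. Chinese Academy of Sciences; Hefei Institute of Intelligent Machines
Recommended citation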
GB/T 7714: Kim, Jisu, Huang, De-Shuang, Han, Kyungsook. Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting[J]. BMC Bioinformatics, 2009, 10(Suppl 1): 1-8.
APA: Kim, Jisu, Huang, De-Shuang, & Han, Kyungsook. (2009). Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting. BMC Bioinformatics, 10(Suppl 1), 1-8.
MLA: Kim, Jisu, et al. "Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting". BMC Bioinformatics 10.Suppl 1 (2009): 1-8.
Files in this item
File name / size: Finding motif pairs (654KB)
Document type: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting.pdf
Format: Adobe PDF