[Invited Talk] Hydrogen bonds meet self-attention: all you need for protein structure embedding

ID: 111 Access: attendees only Updated: 2022-07-04 17:45:20 Views: 963 Invited talk

Talk start: July 24, 2022, 15:20 (Asia/Shanghai)

Duration: 20 min

Session: [S4] Parallel Session 4 » [S4-2] Structural Bioinformatics and Drug Molecule Design


Abstract
General-purpose protein structure embedding can be used for many important protein biology tasks, such as protein design, drug design, and binding affinity prediction. Recent research has shown that attention-based encoder layers are better suited to learning high-level features. Based on this key observation, we treat low-level representation learning and high-level representation learning separately, and propose a two-level general-purpose protein structure embedding neural network, called ContactLib-ATT. On the local embedding level, a simple yet meaningful hydrogen-bond representation is learned. On the global embedding level, attention-based encoder layers are employed for global representation learning. In our experiments, ContactLib-ATT achieves a SCOP superfamily classification accuracy of 82.4% (i.e., 6.7% higher than the state-of-the-art method) on the SCOP40 2.07 dataset. Moreover, ContactLib-ATT is shown to successfully simulate a structure-based search engine for remote homologous proteins, and our top-10 candidate list contains at least one remote homolog with a probability of 91.9%.
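The top-10 figure in the abstract can be read as a top-k hit rate: the fraction of query proteins whose k highest-ranked candidates include at least one true remote homolog. The abstract does not say how the evaluation was implemented, so the following is only a minimal sketch under that reading. The function name `top_k_hit_rate`, the dictionary layout, and the toy identifiers are all hypothetical and are not taken from ContactLib-ATT.

```python
from typing import Dict, List, Set


def top_k_hit_rate(
    ranked: Dict[str, List[str]],
    homologs: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of queries whose top-k list contains at least one true homolog."""
    if not ranked:
        raise ValueError("no queries given")
    hits = 0
    for query, candidates in ranked.items():
        truth = homologs.get(query, set())
        # A query counts as a hit if any of its first k candidates is a homolog.
        if any(c in truth for c in candidates[:k]):
            hits += 1
    return hits / len(ranked)


# Toy data with made-up identifiers (assumption: not results from the talk).
ranked = {
    "q1": ["a", "b", "c"],
    "q2": ["d", "e", "f"],
    "q3": ["g", "h", "i"],
    "q4": ["j", "k", "l"],
}
homologs = {"q1": {"c"}, "q2": {"x"}, "q3": {"g"}, "q4": {"l", "m"}}

rate_at_3 = top_k_hit_rate(ranked, homologs, k=3)
rate_at_1 = top_k_hit_rate(ranked, homologs, k=1)
print(rate_at_3, rate_at_1)
```

In a real evaluation, `ranked` would presumably hold the search engine's candidate list per query and `homologs` the ground-truth labels (for example, proteins sharing a SCOP superfamily with the query), with `k=10` to match the reported setting.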
Keywords
Protein structure, deep learning
Speaker
Xuefeng Cui (崔学峰)
Professor, Shandong University

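Dr. Xuefeng Cui is a professor at the School of Computer Science and Technology, Shandong University. Dr. Cui received bachelor's, master's, and doctoral degrees in computer science from the University of Waterloo, Canada. After completing the PhD in 2014, Dr. Cui joined King Abdullah University of Science and Technology (KAUST) in Saudi Arabia for more than two years of postdoctoral work. Returning to China in 2016, Dr. Cui served for three years as a tenure-track assistant professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. In 2019, Dr. Cui received the Shandong University Outstanding Young and Middle-aged Scholars fund and joined Shandong University, and in the same year received the ACM SIGBIO Rising Star Award. Dr. Cui's main research area is bioinformatics, with a long-standing focus on designing machine learning and parallel algorithms to solve biological problems closely tied to human life. For homology search, a core problem in protein bioinformatics, Dr. Cui has proposed several novel algorithms. Dr. Cui has published three first-author papers at the conference Intelligent Systems for Molecular Biology (ISMB). In addition, this research has been covered once by the international outlet Bio-Techniques and twice by Science X.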
 
