AI+Bio 24/12/18 Literature Digest | ProtChat: a multi-agent AI system that automates protein analysis tasks with GPT-4 and protein language models
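Author: WeChat article. Follow the AI+Biology briefing for a daily digest of newly published AI+Biology papers and news, and keep up with the latest technical progress in the field.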
1. GPCRchimeraDB: A Database of Chimeric G-Protein Coupled Receptors (GPCRs) to Assist Their Design
Journal: bioRxiv
Link: https://www.biorxiv.org/content/10.1101/2024.12.16.628733
Summary: This paper introduces GPCRchimeraDB, a database of chimeric G protein-coupled receptors (GPCRs). It collects the sequences of 170 chimeric GPCRs and provides tools for comparing them with natural GPCRs, with the aim of supporting and streamlining the design of new chimeric receptors and serving as a resource for drug development and receptor function studies.
Abstract: Chimeric GPCRs have emerged as valuable tools for elucidating GPCR function by facilitating the identification of signaling pathways and discovering novel ligands. However, the design process remains largely trial-and-error, lacking a standardized approach.
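2. Inconsistency of LLMs in Molecular Representations
Journal: ChemRxiv
Link: https://doi.org/10.26434/chemrxiv-2024-lnvbz
Summary: This paper examines whether large language models (LLMs) give consistent answers when the same molecule is presented in different representations, specifically SMILES strings and IUPAC names. Current commercial LLMs turn out to be extremely inconsistent across representations. The proposed KL-divergence loss improves surface-level consistency but does not improve accuracy, which points to a fundamental gap in how these models understand chemistry.
Abstract: Large language models (LLMs) have shown promising potential across diverse chemistry tasks. However, their ability to capture the intrinsic chemistry of molecules remains unclear. We evaluate the consistency of state-of-the-art LLMs using different molecular representations.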
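To make the idea of a KL-divergence consistency measure concrete, here is a minimal sketch. It is not the authors' loss function: it only assumes that a model returns a probability distribution over the same candidate answers for a SMILES prompt and for an IUPAC prompt, and it scores how far apart the two distributions are. The function names and the example probabilities are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    # eps guards against log(0) when q assigns zero mass to an outcome.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p), so the score does not depend on argument order."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Hypothetical answer distributions for the same molecule and the same question,
# once prompted with a SMILES string and once with the IUPAC name.
p_smiles = [0.7, 0.2, 0.1]
p_iupac = [0.1, 0.3, 0.6]

same = symmetric_kl(p_smiles, p_smiles)      # identical predictions
different = symmetric_kl(p_smiles, p_iupac)  # inconsistent predictions
print(same, different)
```

A score of zero means the two representations lead to identical predictions. Note that two distributions can agree with each other and still both be wrong, which matches the summary's point that better consistency did not bring better accuracy.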
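3. CoarsenConf: Equivariant Coarsening with Aggregated Attention for Molecular Conformer Generation
Journal: Journal of Chemical Information and Modeling
Link: https://doi.org/10.1021/acs.jcim.4c01001
Summary: This paper presents CoarsenConf, a molecular conformer generation model built around an aggregated attention mechanism. It coarsens the molecular graph in an equivariant way and uses a hierarchical variational autoencoder, which improves both the accuracy of the generated conformers and the efficiency of generation. The model also performs well on several downstream applications.
Abstract: Molecular conformer generation (MCG) is an important task in cheminformatics and drug discovery. CoarsenConf uses an SE(3)-equivariant hierarchical variational autoencoder to efficiently generate accurate conformers.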
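As a rough illustration of what equivariant coarsening means, the sketch below pools atoms into one bead per fragment by averaging their coordinates and then checks that shifting the molecule shifts the beads by the same amount. This is a generic centroid pooling, not CoarsenConf's actual procedure; the fragment assignment, the function name, and the coordinates are made up for the example.

```python
def coarsen(coords, fragment_ids):
    """Average atom coordinates within each fragment to get one bead per fragment."""
    sums, counts = {}, {}
    for (x, y, z), frag in zip(coords, fragment_ids):
        sx, sy, sz = sums.get(frag, (0.0, 0.0, 0.0))
        sums[frag] = (sx + x, sy + y, sz + z)
        counts[frag] = counts.get(frag, 0) + 1
    return {frag: tuple(s / counts[frag] for s in sums[frag]) for frag in sums}

# Four atoms assigned to two hypothetical fragments.
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 1.0, 0.0), (4.0, 3.0, 0.0)]
frags = [0, 0, 1, 1]
beads = coarsen(atoms, frags)

# Translate every atom by the same vector and coarsen again.
shift = (1.0, -2.0, 0.5)
shifted_atoms = [tuple(c + s for c, s in zip(atom, shift)) for atom in atoms]
shifted_beads = coarsen(shifted_atoms, frags)

# Equivariance under translation: coarsening the shifted molecule gives shifted beads.
for frag, bead in beads.items():
    expected = tuple(c + s for c, s in zip(bead, shift))
    print(frag, shifted_beads[frag] == expected)
```

The same property holds for rotations because averaging is a linear operation; the example checks only translation to keep the code short.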
4. ProtChat: An AI Multi-Agent for Automated Protein Analysis Leveraging GPT-4 and Protein Language Model
Journal: Journal of Chemical Information and Modeling
Link: https://doi.org/10.1021/acs.jcim.4c01345
Summary: This paper presents ProtChat, a multi-agent AI system built on GPT-4 and protein language models. It automates protein analysis tasks such as protein property prediction and protein-drug interaction analysis without manual intervention. The system improves efficiency and makes these analyses accessible to researchers without a computational background.
Abstract: Large language models (LLMs) have transformed natural language processing. Similarly, in computational biology, protein sequences are interpreted as natural language. We propose ProtChat, an AI multi-agent system for protein analysis that automates complex protein tasks.
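5. Artificial Intelligence for Central Dogma-Centric Multi-Omics: Challenges and Breakthroughs
Journal: arXiv/q-bio.GN
Link: https://arxiv.org/abs/2412.12668
Summary: This paper reviews how AI is used to integrate multi-omics data such as genomics, transcriptomics, and metabolomics, and discusses how multi-omics models organized around the central dogma can advance precision medicine. It summarizes strategies for multi-omics integration and recent progress in AI techniques, offering practical guidance for computational biologists.
Abstract: The integration of genomics, metabolomics, and transcriptomics has become a critical approach to advancing disease genetics research. This paper reviews AI-driven multi-omics models for disease prediction and precision medicine.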
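6. Prediction of Peptide Structural Conformations with AlphaFold2
Journal: bioRxiv
Link: https://www.biorxiv.org/content/10.1101/2024.12.03.626727
Summary: This paper explores the use of AlphaFold2 (AF2) for predicting multiple conformations of peptides. It evaluates the predicted structures of 557 peptides and compares them with nuclear magnetic resonance (NMR) data. AF2 performs fairly well at predicting peptide conformations, although its accuracy varies from case to case.
Abstract: Protein structure prediction via AI/ML approaches has sparked substantial interest. We report AF2-based structural conformation prediction of peptides, comparing results with NMR-determined ensembles.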
7. An Unsupervised Framework for Comparing SARS-CoV-2 Protein Sequences Using LLMs
Journal: bioRxiv
Link: https://www.biorxiv.org/content/10.1101/2024.12.16.628708
Summary: This paper proposes an unsupervised framework that uses large language models (LLMs) to compare and cluster SARS-CoV-2 protein sequences. Using contrastive learning and Siamese neural networks, the framework identifies relationships between variants and gives insight into how the viral sequences have changed.
Abstract: The SARS-CoV-2 pandemic resulted in extensive sequencing data. This paper proposes an unsupervised framework using large language models to characterize SARS-CoV-2 sequences, focusing on spike proteins.
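For readers unfamiliar with how a Siamese network is trained with contrastive learning, the sketch below shows the classic pairwise contrastive loss on two embedding vectors. It is a textbook formulation and not necessarily the objective used in the paper; the embeddings, the margin, and the way pairs are labeled as similar or dissimilar are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Pairwise contrastive loss.

    Similar pairs are penalized by their squared distance (pulled together).
    Dissimilar pairs are penalized only if they are closer than `margin` (pushed apart).
    """
    d = euclidean(emb_a, emb_b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical 2-D embeddings of two sequences produced by the shared encoder.
seq_a = (0.0, 0.0)
seq_b = (3.0, 4.0)  # distance 5.0 from seq_a

loss_if_similar = contrastive_loss(seq_a, seq_b, similar=True)               # 25.0
loss_if_dissimilar = contrastive_loss(seq_a, seq_b, similar=False)           # 0.0, already beyond the margin
loss_wide_margin = contrastive_loss(seq_a, seq_b, similar=False, margin=6.0) # 1.0
print(loss_if_similar, loss_if_dissimilar, loss_wide_margin)
```

In a Siamese setup both sequences pass through the same encoder, so minimizing this loss shapes a single embedding space in which distance reflects relatedness, and clustering can then be run on the embeddings.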
8. miRScore: A Rapid and Precise MicroRNA Validation Tool
Journal: bioRxiv
Link: https://www.biorxiv.org/content/10.1101/2024.12.12.628184
Summary: This paper presents miRScore, an efficient tool for validating microRNA (miRNA) annotations. By combining structural and expression data, miRScore validates miRNA annotations quickly and accurately, which improves the reliability of new annotations submitted to miRNA databases.
Abstract: MicroRNAs (miRNAs) regulate gene expression and are crucial for disease research. miRScore is an independent tool for rapid validation of miRNA annotations using sRNA-seq data.
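9. iModMix: Integrative Module Analysis for Multi-Omics Data
Journal: bioRxiv
Link: https://www.biorxiv.org/content/10.1101/2024.11.12.623208
Summary: This paper introduces iModMix, a new method for integrative analysis of multi-omics data. iModMix builds network modules with the graphical lasso to integrate metabolomics data horizontally with proteomics and transcriptomics data. It can also handle unidentified metabolites, a gap left by existing methods.
Abstract: Multi-omics integration offers insights into disease biology but remains challenging. iModMix is a novel method for integrating metabolomics with proteomics or transcriptomics to explore molecular associations.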
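DiffSBDD is a new method based on a symmetry-aware diffusion model. By framing structure-based drug design as a 3D conditional generation problem, it broadens the applicability of structure-based drug design and offers a more general approach to generating drug candidates.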