Artificial intelligence in drug discovery: from advanced molecular representation to pipeline applications
Keywords:
ADME/Tox prediction, Artificial intelligence, de novo design, Drug discovery, Models
Abstract
The pharmaceutical research and development (R&D) process is persistently challenged by high financial costs, protracted timelines, and low success rates. By simulating complex biological systems, artificial intelligence (AI) has accelerated innovation across the entire drug discovery pipeline. This review positions AI as a pivotal technology for reengineering the R&D process, using sophisticated molecular representations to predict pharmacodynamic (PD) and toxicological effects significantly earlier in that process. We systematically cover the foundations of AI in chemoinformatics, detailing how the performance of AI models is intrinsically linked to the quality of the molecular representation. We elaborate on representations ranging from robust string-based methods to advanced topological models, including the five key categories of Graph Neural Networks (GNNs), three-dimensional (3D)-aware Geometric Deep Learning (GDL), and emerging Quantum Machine Learning (QML) and Hybrid Quantum-Classical Neural Networks (HQNNs). We then analyze the practical application of these models across the drug discovery pipeline, including de novo molecular design with biological foundation models and flow matching generative architectures, solutions to data scarcity via Few-Shot Learning and meta-learning, and explainable AI (XAI) for transparent validation. We propose an integrated Q-BioFusion framework that synergizes quantum computing, autonomous experimentation, and generative models to address systemic R&D constraints. We hope that future research will improve geometric fidelity for more accurate and faster 3D molecular prediction and generation, enhance data efficiency, solve the inherent data sparsity problem in biological assays, and advance integrated XAI workflows. These efforts should help ensure transparent, reliable, and trustworthy guidance throughout the in silico drug design process.