About
The book is intended for a graduate course on machine learning/artificial intelligence for drug discovery and development. It can also serve the graduate course “Machine Learning/Artificial Intelligence and its Application.” There are two target audiences:
- Graduate (e.g., MS and PhD students) or advanced undergraduate students majoring in computer science, engineering, medicine, chemistry, biology, medical informatics, or biostatistics.
- Data scientists from the pharmaceutical and biotech industry.
Our book gives a comprehensive overview of most of the essential topics in AI for drug discovery and development, starting from basic concepts in machine learning and pharmaceutical science. It is therefore accessible to readers who do not have extensive background knowledge in either field.
We plan to organize the book around three aspects: data, methods, and applications.
- Data: We will cover the different data types related to drug discovery and development, including drug targets, genomics data, small molecules (SMILES, molecular graphs, fingerprints), biologics (amino acid sequences), clinical trials (drugs, disease codes, text-based features), patient data (electronic health records, medical claims, wearables), and scientific literature.
- Machine Learning Methods (with a special focus on deep learning): We will cover basic machine learning methods (supervised and unsupervised learning, model optimization, and training), representation learning with various deep learning models (MLP, RNN, CNN, attention models, transformers, graph neural networks, autoencoders), deep generative models (variational autoencoders, generative adversarial networks), and combinatorial methods (reinforcement learning, genetic algorithms, Bayesian optimization).
- Drug Discovery Applications: We will introduce the different drug discovery and development tasks and how deep learning models can help. Specifically, we will cover DNA/RNA-protein binding, target identification, small-molecule design (de novo small-molecule design, lead optimization, property prediction: ADMET, QSAR, adverse drug effect (drug side effect) prediction, virtual screening, retrosynthesis, drug combination prediction (synergy), drug-drug interaction, drug-target interaction prediction), and large-molecule design (protein sequence learning, biologics property prediction, epitope/paratope prediction, protein amino acid sequence prediction, protein 3D structure prediction, antibody design).
Table of Contents (tentative)
- Data
  - Target protein
  - Small-molecule drug
  - Biologics
  - Clinical trial data
  - Literature data
- Machine learning basics
  - Supervised learning
  - Unsupervised learning
  - Numerical optimization
  - Data split
  - Hyperparameter
  - Ensemble methods
- Deep learning methods
  - Multilayer perceptron (MLP)
  - Convolutional neural network (CNN)
  - Recurrent neural network (RNN)
  - Graph neural network (GNN)
  - Embedding
  - Attention mechanism
  - Transformer
  - Memory network
- Advanced machine learning methods
  - Variational autoencoder (VAE)
  - Generative adversarial network (GAN)
  - Normalizing flow model
  - Reinforcement learning (RL)
  - Genetic algorithm (GA)
  - Bayesian optimization (BO)
  - Self-supervised learning and pretraining
- Small-molecule drug discovery
  - Virtual screening and high-throughput screening
  - Drug property prediction: ADMET
  - De novo drug design
  - Lead optimization
- Large-molecule drug discovery
  - Protein property prediction
  - Protein design
- Drug development
  - Clinical trial basics
  - Clinical trial outcome prediction
  - Drug repurposing
  - Drug combination
  - Patient recruitment and patient-trial matching
  - Survival analysis
  - Clinical trial site selection
Follow Us
Please follow us on Twitter (Tianfan, Danica, Jimeng) for the latest news.
Tentative Release Dates
- 2023
Authors
Suggestion
Any feedback, suggestions, and comments to help improve our book are warmly welcome! Please feel free to reach out to us via the Google form.