

Poster in Workshop: New Frontiers of AI for Drug Discovery and Development

AbLEF: Antibody Language Ensemble Fusion for thermodynamically empowered property predictions

Zachary Rollins · Talal Widatalla · Andrew Waight · Alan Cheng · Essam Metwally

Keywords: [ GNN ] [ protein property prediction ] [ structural ensembles ] [ multimodal deep learning ] [ CNN ] [ protein language models ] [ Molecular dynamics ]


Abstract: Pre-trained protein language and/or structural models are often fine-tuned on drug development properties (i.e., developability properties) to accelerate drug discovery initiatives. However, these models generally rely on a single structural conformation and/or a single sequence as the molecular representation. We present a physics-based model in which structural ensemble representations are fused by a transformer-based architecture and concatenated to a language representation to predict antibody properties. AbLEF enables the direct infusion of thermodynamic information into latent space, which enhances property prediction by explicitly incorporating the dynamic molecular behavior that occurs during experimental measurement. We find that $\textbf{(1)}$ ensembles of structures generated from molecular simulation can further improve antibody property prediction for small datasets, $\textbf{(2)}$ fine-tuned large protein language models can match smaller antibody-specific language models at predicting antibody properties, $\textbf{(3)}$ trained multimodal sequence and structural representations outperform sequence representations alone, $\textbf{(4)}$ pre-trained models that combine sequence with structure are competitive with shallow machine learning (ML) methods in the small-data regime, and $\textbf{(5)}$ predicting measured antibody properties remains difficult with limited high-fidelity datasets.
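
The fusion step described in the abstract (ensemble representations combined by a transformer-style layer, then concatenated to a language representation) can be illustrated with a toy sketch. This is not the AbLEF implementation: the single-head self-attention, the mean pooling, the random weights, the embedding sizes, and the names `fuse_ensemble` and `fused_representation` are all assumptions chosen for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_ensemble(conformer_embs, w_q, w_k, w_v):
    """Hypothetical fusion: one self-attention pass over the conformer
    embeddings, followed by mean pooling into a single vector."""
    q = conformer_embs @ w_q
    k = conformer_embs @ w_k
    v = conformer_embs @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (n_conformers, n_conformers)
    attn = softmax(scores, axis=-1)           # each row sums to 1
    mixed = attn @ v                          # conformers exchange information
    return mixed.mean(axis=0)                 # pool over the ensemble

def fused_representation(conformer_embs, language_emb, w_q, w_k, w_v):
    # Concatenate the pooled structural vector with the sequence embedding.
    structural = fuse_ensemble(conformer_embs, w_q, w_k, w_v)
    return np.concatenate([structural, language_emb])

# Assumed toy sizes: 8 simulated conformers, 16-d structural and 32-d language embeddings.
rng = np.random.default_rng(0)
n_conformers, d_struct, d_lang = 8, 16, 32
conformer_embs = rng.normal(size=(n_conformers, d_struct))
language_emb = rng.normal(size=d_lang)
w_q, w_k, w_v = (rng.normal(size=(d_struct, d_struct)) * 0.1 for _ in range(3))

rep = fused_representation(conformer_embs, language_emb, w_q, w_k, w_v)
print(rep.shape)  # (48,)
```

In this sketch the pooled structural vector does not depend on the order of the conformers, since no positional encoding is used; whether the actual model treats the ensemble as ordered or unordered is not stated in the abstract. A downstream regression head would take `rep` as input.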
