Workshop: AI for Science: Mind the Gaps

Novel fuzzy approach to Antimicrobial Peptide Activity Prediction: A tale of limited and imbalanced data that models won’t hear

Aviral Chharia


Antimicrobial peptides (AMPs) have gained immense attention in recent years due to their potential for developing novel antibacterial medicines, next-generation anti-cancer treatment regimes, and other applications. Owing to the significant cost and time required for wet-lab AMP screening, researchers have framed the task as a machine learning (ML) problem. However, traditional models rely on the unrealistic premise that large amounts of labeled medical data are available; without such data they overfit, which reduces their precision. Collecting labeled medical data is itself challenging and expensive. The current study is the first to examine models in a real-world setting, training them on limited and highly imbalanced data. A fuzzy-intelligence-based model is proposed for activity prediction of short (<30 aa) AMPs, and its ability to learn a high-dimensional mapping from limited and severely skewed data is demonstrated over a set of experiments. The proposed model significantly outperforms state-of-the-art ML models trained on the same data. The findings demonstrate the model's efficacy as a potential method for in silico AMP activity prediction.