Automatic Auxiliary Task Selection and Adaptive Weighting Boost Molecular Property Prediction
Abstract
Recent studies in Machine Learning (ML) for biological research focus on investigating molecular properties to accelerate drug discovery. However, limited labeled molecular data often hampers the performance of ML models. A common strategy to mitigate data scarcity is leveraging auxiliary learning tasks to provide additional supervision, but selecting effective auxiliary tasks requires substantial domain expertise and manual effort, and their inclusion does not always guarantee performance gains. To overcome these challenges, we introduce Automatic Auxiliary Task Selection (AutAuT), a fully automated framework that seamlessly retrieves auxiliary tasks using large language models and adaptively integrates them through a novel gradient alignment weighting mechanism. By automatically emphasizing auxiliary tasks aligned with the primary objective, AutAuT significantly enhances predictive accuracy while reducing negative impacts from irrelevant tasks. Extensive evaluations demonstrate that AutAuT outperforms 10 auxiliary task-based approaches and 18 advanced molecular property prediction models.