Assisted Few-Shot Learning for Vision-Language Models in Agricultural Stress Phenotype Identification
Abstract
In the agricultural sector, labeled data for crop diseases and stresses are often scarce due to high annotation costs. We propose an Assisted Few-Shot Learning approach that enhances vision-language models (VLMs) on image classification tasks with limited annotated data by optimizing the selection of input examples. Our method employs one image encoder at a time—Vision Transformer (ViT), ResNet-50, or CLIP—to retrieve contextually similar examples using cosine similarity of embeddings, thereby providing relevant few-shot prompts to VLMs. We evaluate our approach on the agricultural benchmark for VLMs, focusing on stress phenotyping, where the proposed method improves performance on 6 of 7 tasks. Experimental results demonstrate that, using the ViT encoder, the average F1 score across seven agricultural classification tasks increased from 68.68\% to 80.45\%, highlighting the effectiveness of our method in improving model performance with limited data.
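The retrieval step described above—selecting the few-shot examples whose embeddings are most similar to the query image—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name and the toy 2-D embeddings are hypothetical stand-ins for encoder outputs from ViT, ResNet-50, or CLIP.

```python
import numpy as np

def top_k_similar(query_emb, pool_embs, k=3):
    """Return indices of the k pool embeddings most cosine-similar to the query."""
    # Normalize to unit length so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    sims = p @ q
    # Sort descending by similarity and keep the top k indices.
    return np.argsort(-sims)[:k]

# Toy labeled pool of 4 embeddings (in practice: encoder features of annotated images).
pool = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.5, 0.5]])
query = np.array([1.0, 0.05])
idx = top_k_similar(query, pool, k=2)  # → indices [0, 2], the two nearest examples
```

The retrieved examples (here indices 0 and 2) would then be inserted, with their labels, into the VLM's few-shot prompt alongside the query image.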