Poster in Workshop: Workshop on robustness of zero/few-shot learning in foundation models (R0-FoMo)
Image Clustering Conditioned on Text Criteria
Sehyun Kwon · Jaeseung Park · Minkyu Kim · Jaewoong Cho · Ernest Ryu · Kangwook Lee
Abstract:
Classical clustering methods do not give users direct control over the clustering results, and the results may not be consistent with the criterion that a user has in mind. In this work, we present a new methodology for performing image clustering based on user-specified criteria in the form of text by leveraging modern Vision-Language Models and Large Language Models. We call our method Image Clustering Conditioned on Text Criteria (ICTC), and it represents a different paradigm of image clustering. ICTC requires a minimal and practical degree of human intervention and, in return, grants the user significant control over the clustering results. Our experiments show that ICTC can effectively cluster images with various criteria, such as human action, physical location, or a person's mood, while significantly outperforming baselines.