Poster
A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset
Zahra Gharaee · ZeMing Gong · Nicholas Pellegrino · Iuliia Zarubiieva · Joakim Bruslund Haurum · Scott Lowe · Jaclyn McKeown · Chris Ho · Joschka McLeod · Yi-Yun Wei · Jireh Agda · Sujeevan Ratnasingham · Dirk Steinke · Angel Chang · Graham Taylor · Paul Fieguth

Thu Dec 14 03:00 PM -- 05:00 PM (PST) @ Great Hall & Hall B1+B2 #212
Event URL: https://github.com/zahrag/BIOSCAN-1M

In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-1M Insect Dataset. Each record is taxonomically classified by an expert and has associated genetic information, including raw nucleotide barcode sequences and assigned barcode index numbers, which are genetic-based proxies for species classification. This paper presents a curated million-image dataset, primarily intended for training computer-vision models capable of image-based taxonomic assessment; however, the dataset also has characteristics that should interest the broader machine learning community. Because of the biological nature of the data, the dataset exhibits a characteristic long-tailed class-imbalance distribution. Furthermore, taxonomic labelling is a hierarchical classification scheme, which presents a highly fine-grained classification problem at the lower taxonomic levels. Beyond spurring interest in biodiversity research within the machine learning community, progress on an image-based taxonomic classifier will also further the ultimate goal of all BIOSCAN research: to lay the foundation for a comprehensive survey of global biodiversity. This paper introduces the dataset and explores the classification task through the implementation and analysis of a baseline classifier. The code repository for the BIOSCAN-1M Insect Dataset is available at https://github.com/zahrag/BIOSCAN-1M

Author Information

Zahra Gharaee (University of Waterloo)
ZeMing Gong (Simon Fraser University)
Nicholas Pellegrino (University of Waterloo)

Nicholas Pellegrino is a doctoral student in Systems Design Engineering at the University of Waterloo. He is supervised by Prof. Paul Fieguth and associated with the Vision and Image Processing (VIP) Lab and the Statistical Image Processing (SIP) Lab. His main research focus is machine vision, specifically object recognition. In support of the BIOSCAN program, which is associated with the International Barcode of Life project, Nicholas has undertaken the task of classifying insect images at the taxonomic order level. Broadly, this research will enable a far more extensive and detailed understanding of global biodiversity and of the interactions between species and ecosystems. During his master's degree (also in Systems Design Engineering at the University of Waterloo), Nicholas was associated with PhotoMedicine Labs and was co-supervised by Dr. Parsin Haji Reza and Prof. Paul Fieguth. His main research contributions were in signal processing, multi-spectral unmixing, and chromophore-selective PARS® imaging, with applications in PARS histology and ophthalmology. For his master's work, Nicholas was awarded the Alumni Gold Medal, which recognizes the top graduating master's student across the whole university for academic achievement. Nicholas graduated from Mechatronics Engineering at the University of Waterloo in 2019, completed his master's degree with PhotoMedicine Labs in 2022, and is now completing a PhD at the Vision and Image Processing Lab.

Iuliia Zarubiieva (University of Guelph)
Joakim Bruslund Haurum (Aalborg University & Pioneer Centre for AI)
Scott Lowe (Vector Institute)
Jaclyn McKeown (University of Guelph)
Chris Ho (University of Guelph)
Joschka McLeod (University of Guelph)
Yi-Yun Wei (University of Guelph)
Jireh Agda (University of Guelph)
Sujeevan Ratnasingham (University of Guelph)
Dirk Steinke (University of Guelph)
Angel Chang (Simon Fraser University)
Graham Taylor (University of Guelph / Vector Institute)
Paul Fieguth (University of Waterloo)

More from the Same Authors