Deep learning for identifying bee species from images of wings and pinned specimens

RESEARCH ARTICLE

Abstract

One of the most challenging aspects of bee ecology and conservation is species-level identification, which is costly, time consuming, and requires taxonomic expertise. Recent advances in the application of deep learning and computer vision have shown promise for identifying large bumble bee species. However, most bees, such as sweat bees in the genus Lasioglossum, are much smaller and can be difficult, even for trained taxonomists, to identify. For this reason, the great majority of bees are poorly represented in the crowdsourced image datasets often used to train computer vision models. But even larger bees, such as bumble bees from the B. vagans complex, can be difficult to separate morphologically. Using images of specimens from our research collections, we assessed how deep learning classification models perform on these more challenging taxa, qualitatively comparing models trained on images of whole pinned specimens or on images of bee forewings. The pinned specimen and wing image datasets represent 20 and 18 species from 6 and 4 genera, respectively, and were used to train the EfficientNetV2L convolutional neural network. Mean test precision was 94.9% and 98.1% for pinned and wing images, respectively. Results show that computer vision holds great promise for classifying smaller, more difficult to identify bees that are poorly represented in crowdsourced datasets. Images from research and museum collections will be valuable for expanding classification models to include additional species, which will be essential for large scale conservation monitoring efforts.
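
The mean test precision figures above can be read as per-species precision averaged across species. The following is a minimal sketch of that calculation, assuming an unweighted (macro) average; the abstract does not state the averaging scheme, and the function names and toy labels are hypothetical.

```python
from collections import defaultdict


def per_class_precision(true_labels, predicted_labels):
    """Return {species: precision} for every species predicted at least once."""
    predicted_count = defaultdict(int)  # times each species was predicted
    correct_count = defaultdict(int)    # times that prediction was right
    for truth, predicted in zip(true_labels, predicted_labels):
        predicted_count[predicted] += 1
        if truth == predicted:
            correct_count[predicted] += 1
    # Precision = correct predictions of a species / all predictions of it.
    return {
        species: correct_count[species] / count
        for species, count in predicted_count.items()
    }


def mean_precision(true_labels, predicted_labels):
    """Unweighted mean of per-species precision (an assumed averaging scheme)."""
    scores = per_class_precision(true_labels, predicted_labels)
    return sum(scores.values()) / len(scores)


# Hypothetical test-set labels for three placeholder species.
true_labels = ["sp_a", "sp_a", "sp_b", "sp_b", "sp_c", "sp_c"]
predicted_labels = ["sp_a", "sp_a", "sp_b", "sp_a", "sp_c", "sp_c"]

print(per_class_precision(true_labels, predicted_labels))
print(mean_precision(true_labels, predicted_labels))
```

Species that the model never predicts have undefined precision and are left out of the average in this sketch; a real evaluation would need to decide how to treat them.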

Introduction

One of the most challenging aspects of bee ecology and conservation is classifying or identifying individuals by species. With more than 20,000 species globally, researchers may find tens to several hundred bee species to identify in any given study. The identification process requires a high level of taxonomic expertise because many species share similar and sometimes highly variable features. Such expert services are in decline for both researchers and the general public, who are increasingly providing crowdsourced data used to assess large scale trends in biodiversity. Computer vision is a promising approach that will help address the problem of bee species identification, especially for the larger taxa, such as Bombus, for which sufficient and accurate crowdsourced training datasets exist. However, the effectiveness of computer vision for identifying smaller, more taxonomically challenging species is not well tested.

Computer vision can be effective for species-level bee identification, but this requires large, annotated image datasets for model training. Public occurrence and image repositories, such as the Global Biodiversity Information Facility, provide access to images from citizen science programs and museums. For example, the Global Biodiversity Information Facility is an excellent resource for many species in the genus Bombus and other species that can be readily identified by experts from photos. However, most bee species are much smaller than Bombus, with distinguishing features that are often obscured or not sufficiently resolved in photos for identification, even by experts. One such group of bees belongs to the subgenus Lasioglossum (Dialictus), which are particularly challenging to identify. In North America, where they comprise over 250 species, Dialictus species are frequently encountered in scientific studies, yet they are often not identified to species because there are so few experts available. Even DNA barcoding often fails for this "nightmare taxon".

But even larger bumble bees can be difficult to identify without close inspection. For example, species in the Bombus vagans mimicry complex (B. vagans, B. sandersoni, and B. perplexus) have similar patterns of hair color and are frequently confused. Although they can be separated by using DNA barcodes or by comparing ratios of malar or flagella segment length, such morphometric features can be difficult to see in photos. Species-level identification usually requires careful examination under a microscope by experts. Consequently, reliably identified images of most bee species are rare or absent in crowdsourced datasets and are not available in sufficient numbers for training computer vision models. New sources of images for model training are needed.

Acquiring new annotated training images from the field is challenging because photographed individuals must also be captured and identified. However, existing museum or research collections, with pinned specimens that have been identified by experts, are ideal for acquiring new images for computer vision model training. Pinned specimens can be quickly photographed under a microscope from standard angles under standard lighting conditions. This reduces the contextual variation of images taken in the field, which can confuse vision models and necessitate more images for effective model training.

In addition to images of whole specimens, images of bee wings alone may be effective for identifying bees. Many bee species have distinctive patterns of venation on their wings that can be used in combination with other morphological features to help differentiate species. For example, Hall and Kozmus each mapped important nodes and cell centroids on bee forewings, and used k-nearest neighbor classification and discriminant analysis to effectively separate species. More recent deep learning techniques using convolutional neural networks may be more flexible as they do not rely on predetermined feature input, such as the location of specific nodes. Like whole specimens, bee wing images may be acquired under standard conditions that minimize distracting background noise. Wing images are further standardized because they do not capture uninformative variation in the pose or condition of the rest of the body. With whole specimens, for example, bee hair can be variously matted, wings folded, and appendages positioned differently for each specimen. Photographing only wings in a flat plane reduces this type of variation, which could help improve the accuracy of identifications over images of whole bees. On the other hand, focusing on only a single part of the bee may provide fewer features for the model to differentiate among species.
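
The landmark-based approach mentioned above can be illustrated with a minimal sketch of k-nearest neighbor classification on wing landmark coordinates. It is not the procedure used by Hall or Kozmus: it assumes the landmarks are already aligned and flattened into one coordinate vector per wing, and the function name, coordinates, and species labels are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training wings."""
    # Euclidean distance between flattened landmark vectors.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    # Count the species labels of the k closest wings and take the most common.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each wing is two landmarks flattened to (x1, y1, x2, y2).
train_features = [
    (0.10, 0.20, 0.80, 0.30),
    (0.12, 0.22, 0.79, 0.31),
    (0.11, 0.19, 0.81, 0.29),
    (0.30, 0.40, 0.60, 0.50),
    (0.31, 0.42, 0.62, 0.49),
    (0.29, 0.41, 0.61, 0.51),
]
train_labels = ["species_a"] * 3 + ["species_b"] * 3

print(knn_predict(train_features, train_labels, (0.11, 0.21, 0.80, 0.30)))
```

The sketch makes the contrast in the text concrete: the classifier only sees the landmark positions someone chose to record, whereas a convolutional neural network learns its own features directly from the wing image.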

We gathered new images of whole-bee specimens and bee wings for classification model training from our research collections. Our goals were to (1) assess the potential for species-level identification of small and other challenging species through computer vision, and (2) qualitatively compare the effectiveness of models trained on images of bee wings versus whole bee specimens. This was a qualitative comparison because the two image datasets largely comprised different species and sample sizes.

Materials and methods

Pinned bee imaging

Bee wing imaging

Two. Subgenus Lasioglossum

Image cropping

Classification model training

Results

Discussion

Conclusions

Supporting information

Overview

The study assesses the effectiveness of deep learning models in accurately classifying bee species from image datasets of pinned specimens and wing images. The research highlights the promise of computer vision in increasing the accuracy of species identification, particularly for smaller, challenging taxa that are often underrepresented in existing datasets.

Key Points

1. Deep learning offers a promising solution for identifying difficult-to-classify bee species
2. Images from research collections can enhance available datasets for model training
3. Models trained on different types of images can yield varying levels of identification accuracy
4. Precise species identification is crucial for biodiversity monitoring and conservation efforts
5. The study emphasizes the importance of machine learning in overcoming taxonomic expertise shortages

Details

Authors: Brian J. Spiesman, Claudio Gratton, Elena Gratton, Heather Hines
Category: Biology and Natural Sciences
