
By Elizabeth J. Carlen*, Cesar O. Estien*, Tal Caspi, Deja Perkins, Benjamin R. Goldstein, Samantha E.S. Kreling, Yasmine Hentati, Tyus D. Williams, Lauren A. Stanton, Simone Des Roches, Rebecca F. Johnson, Alison N. Young, Caren B. Cooper, and Christopher J. Schell.
As biodiversity continues to decline rapidly, the need for geolocated biodiversity data is increasing. Contributory science, which includes citizen and community science, produces participant-generated data that offer a unique opportunity to understand biodiversity across space and time. Data reported to contributory biodiversity platforms such as eBird and iNaturalist can therefore be incredibly valuable for conservation efforts. However, what gets reported to these platforms is shaped by social and ecological variables, which leads to biased data. Although empirical work has highlighted these biases, little work has articulated how they arise or what their societal consequences are. Understanding the biases in these data is crucial for equitable conservation.
We present a conceptual framework to illustrate how social and ecological variables create bias in contributory science data. The framework consists of four filters – participation, detectability, sampling, and preference – that together shape the type and location of contributory biodiversity data. The participation filter determines the spatial distribution of reports within a region, reflecting who is reporting the data. The detectability filter narrows the pool to more easily observed species, excluding many nocturnal, cryptic, timid, or microscopic organisms. The sampling filter imposes finer-scale spatial and temporal biases, reflecting the fact that people are more likely to log observations in certain circumstances: for example, while recreating in green spaces or along a trail or road rather than while commuting through gray spaces. Similarly, people are more likely to sample at certain times, such as in the morning or afternoon rather than at night, or on weekends. Lastly, the preference filter modifies the pool in favor of charismatic, flowering, rare, and colorful species and against nuisance or “boring” species. These preferences, however, can be regionally bound and culturally specific, and can vary with the individual observer.
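As a rough illustration only, the four filters described above can be sketched as a chain of checks applied one after another to a pool of potential observations. This is a toy model, not the analysis used in the paper: the `Encounter` class, its yes/no fields, and the five example encounters are all invented for illustration, and real filters act as matters of degree rather than strict pass/fail tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """A hypothetical person-meets-organism event (invented for illustration)."""
    species: str
    observer_participates: bool  # participation: is a platform user present here?
    easily_detected: bool        # detectability: not nocturnal, cryptic, timid, or microscopic
    convenient_context: bool     # sampling: e.g. daytime recreation in a green space
    appealing: bool              # preference: charismatic, colorful, rare, ...


# The four filters, in the order the framework describes them.
FILTERS = [
    ("participation", lambda e: e.observer_participates),
    ("detectability", lambda e: e.easily_detected),
    ("sampling", lambda e: e.convenient_context),
    ("preference", lambda e: e.appealing),
]


def apply_filters(encounters):
    """Return the encounters that survive every filter, plus the pool size after each step."""
    pool = list(encounters)
    sizes = [("all encounters", len(pool))]
    for name, keep in FILTERS:
        pool = [e for e in pool if keep(e)]
        sizes.append((name, len(pool)))
    return pool, sizes


# Made-up example data: generic labels, not real records.
encounters = [
    Encounter("colorful songbird", True, True, True, True),
    Encounter("common nuisance bird", True, True, True, False),
    Encounter("nocturnal bat", True, False, False, True),
    Encounter("showy butterfly, unsurveyed area", False, True, True, True),
    Encounter("cryptic burrowing mammal", True, False, True, False),
]

reported, sizes = apply_filters(encounters)
for name, n in sizes:
    print(f"{name}: {n} remaining")
print("reported:", [e.species for e in reported])
```

In this invented example, only one of the five encounters ends up as a reported observation, even though all five organisms were present. The showy butterfly is lost at the participation step for reasons that have nothing to do with the species itself, which is the kind of gap that can be misread as a true absence.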
We use these filters to examine data from the largest contributory science platforms – eBird and iNaturalist – in St. Louis, Missouri, USA, and discuss the potential consequences of biased data. Societal consequences include misunderstandings of species distributions and inequitable conservation efforts. We end our perspective piece with several recommendations for researchers and institutions to move towards a more inclusive field. These recommendations offer opportunities to ameliorate biases in contributory data and to practice equitable biodiversity conservation.