A Crowdsourcing Methodology to Measure Algorithmic Bias in Black-box Systems: A Case Study with COVID-related Searches
This repository contains the data collected via crowdsourcing to study the behavior of a commercial search engine on COVID-related queries. The work was presented at the BIAS workshop at ECIR 2022: https://www.youtube.com/watch?v=htDPUL3EqBA
If you make use of this resource, please cite the paper below.
Le, B., Spina, D., Scholer, F., Chia, H. (2022). A Crowdsourcing Methodology to Measure Algorithmic Bias in Black-Box Systems: A Case Study with COVID-Related Searches. In: Boratto, L., Faralli, S., Marras, M., Stilo, G. (eds) Advances in Bias and Fairness in Information Retrieval. BIAS 2022. Communications in Computer and Information Science, vol 1610. Springer, Cham. https://doi.org/10.1007/978-3-031-09316-6_5
@inproceedings{le2022crowdsourcing,
abstract = {Commercial software systems are typically opaque with regard to their inner workings. This makes it
challenging to understand the nuances of complex systems, and to study their operation, in particular in the context
of fairness and bias. We explore a methodology for studying aspects of the behavior of black-box systems,
focusing on a commercial search engine as a case study. A crowdsourcing platform is used to collect search
engine result pages for a pre-defined set of queries related to the COVID-19 pandemic, to investigate whether
the returned search results vary between individuals, and whether the returned results vary for the same individual
when their information need is instantiated in a positive or a negative way. We observed that crowd workers
tend to obtain different search results when using positive and negative query wording of the information needs,
as well as different results for the same queries depending on the country in which they reside.
These results indicate that using crowdsourcing platforms to study system behavior, in a way that
preserves participant privacy, is a viable approach to obtain insights into black-box systems,
supporting research investigations into particular aspects of system behavior.},
author = {Le, Binh and Spina, Damiano and Scholer, Falk and Chia, Hui},
booktitle = {Proceedings of the Third Workshop on Bias and Social Aspects in Search and Recommendation},
series = {Bias @ ECIR 2022},
title = {A Crowdsourcing Methodology to Measure Algorithmic Bias in Black-box Systems:
A Case Study with COVID-related Searches},
doi = {10.1007/978-3-031-09316-6_5},
year = {2022}
}