As a result of this work, we were invited to write a correction to the original published article. The related work is provided in the folder "Correction" and the following Jupyter notebook: Correction Work.ipynb. The correction can be found at:
Our paper "A reality check on research reproducibility in Open Science students’ projects" describes our experience of reproducing a study.
title={A reality check on research reproducibility in Open Science students’ projects},
author={Frick, Claudia and Bl{\"u}mm, Mirjam and Randall, Natasha and K{\"u}{\c{c}}{\"u}k, Berrak and Bailey, Drew},
journal={API Magazin},
The purpose of this work is to attempt to reproduce a study by re-using a published dataset, as the final project of the TH Köln module "Open Science", part of the Digital Sciences Master's Degree.
Original dataset:
Original paper:
The main Jupyter notebooks used for the analyses are located here: These can be viewed on GitHub, or the project can be downloaded and worked on locally. Guidelines for running the notebooks are included within the code.
- Cleaning and Exploring the Data
- Masculinity Preferences by Priming Group
- Masculinity Preferences over Time
- Modelling the Data
Our data management plan (DMP) is located here:
Our modified dataset is released under Creative Commons Attribution 4.0 International (CC BY 4.0).
Our research software is released under the MIT licence.
The project directory contains:
| Object | Description |
| --- | --- |
| Folder: “Original Files” | The original study files as provided by the original researchers. |
| Folder: “Reproduction Project” | Our restructured and modified versions of the original files, as well as additional datasets and software (code). |
| Folder: “Correction” | Our work on writing a correction for the original article, based on our reproduction project work. |
| PDF: “Data Management Plan” | Our current data management plan. Contains information about the datasets, software, metadata and contact information. |
The “Reproduction Project” folder contains:
| Object | Description |
| --- | --- |
| Four folders corresponding to the two “main studies” and two “supplementary studies” described in the paper | Each folder contains its respective data, e.g. “main_study_face_data.csv”, and any re-used and modified datasets, e.g. “main_study_face_data_modified.csv”. |
| The folder “Reproduction Project Code” | Contains the code and Jupyter notebooks used to analyse the data, numbered in order. The notebooks containing an incorrect analysis (labelled “WRONG DATA”) are included for transparency. |
| The xlsx file “Downloaded Data (All Studies)” | The original dataset in the form in which it was downloaded, included for reference. |
| The pdf file “Published Article” | The original journal article associated with the dataset, included for reference. |
| The docx file “Description of Supplementary Studies” | An explanation of the supplementary studies as provided by the original researchers, included for reference. |
The purpose of the original data collection, the structure of the associated study, and the meaning of the attributes in the dataset are described in the journal article:
In our modified dataset, “main_study_face_data_modified.csv”, the contents of each attribute are:
| Attribute Name | Attribute Meaning | Attribute Contents |
| --- | --- | --- |
| participant_id | The unique identifier for each participant in the study. | String; 331 possible values, e.g. “13” or “211b”. |
| prime_condition | A number representing the priming condition group each participant was assigned to. | Integer in range: 1 to 5 |
| prime_condition_names | The name (mapped from the prime condition number) of the relevant priming condition group. | String; one of: “neutral”, “male/male”, “male/group”, “male/female”, “pathogen” |
| trial_number | The number of the trial for a given participant. Each row in the dataset corresponds to one trial, in which the participant chooses whether they prefer the masculinised or the feminised face. | Integer in range: 1 to 40 |
| image | The image shown to the participant, containing a masculinised and a feminised face. | String of form Slide{}.bmp, where {} is the number of the slide, e.g. Slide13.bmp |
| pre_post_prime | An indicator of whether this particular trial occurs before (0) or after (1) the participant has been shown their respective priming images. All trials ≥ 20 are post prime (1), the remainder are pre-prime (0). | Integer: 0 or 1 |
| chose_masc | An indicator of whether on this particular trial the participant chose the masculinised face (1) or the feminised face (0). | Integer: 0 or 1 |
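
The attribute descriptions above can be turned into a quick sanity check. The following is a minimal sketch and not part of the project's notebooks. It assumes that the CSV column headers match the attribute names in the table, it uses only the Python standard library, and it runs on a tiny made-up sample instead of the real data. The function names `validate_row` and `masc_rate_by_group` are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict

# Column names taken from the attribute table (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "participant_id", "prime_condition", "prime_condition_names",
    "trial_number", "image", "pre_post_prime", "chose_masc",
]
PRIME_NAMES = {"neutral", "male/male", "male/group", "male/female", "pathogen"}


def validate_row(row):
    """Return a list of problems found in one row (empty if the row looks valid)."""
    problems = []
    if not 1 <= int(row["prime_condition"]) <= 5:
        problems.append("prime_condition outside 1-5")
    if row["prime_condition_names"] not in PRIME_NAMES:
        problems.append("unknown prime_condition_names")
    if not 1 <= int(row["trial_number"]) <= 40:
        problems.append("trial_number outside 1-40")
    if not re.fullmatch(r"Slide\d+\.bmp", row["image"]):
        problems.append("image is not of the form Slide{}.bmp")
    if row["pre_post_prime"] not in {"0", "1"}:
        problems.append("pre_post_prime is not 0 or 1")
    if row["chose_masc"] not in {"0", "1"}:
        problems.append("chose_masc is not 0 or 1")
    return problems


def masc_rate_by_group(rows):
    """Proportion of trials on which the masculinised face was chosen,
    keyed by (prime_condition_names, pre_post_prime)."""
    counts = defaultdict(lambda: [0, 0])  # [masculinised choices, total trials]
    for row in rows:
        key = (row["prime_condition_names"], int(row["pre_post_prime"]))
        counts[key][0] += int(row["chose_masc"])
        counts[key][1] += 1
    return {key: masc / total for key, (masc, total) in counts.items()}


# Made-up rows for illustration only; these are NOT real study data, and the
# pairing of condition numbers with names here is arbitrary.
SAMPLE = """participant_id,prime_condition,prime_condition_names,trial_number,image,pre_post_prime,chose_masc
13,1,neutral,1,Slide13.bmp,0,1
13,1,neutral,2,Slide7.bmp,0,0
211b,5,pathogen,25,Slide2.bmp,1,1
211b,5,pathogen,26,Slide9.bmp,1,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
missing = [name for name in EXPECTED_COLUMNS if name not in rows[0]]
issues = [problem for row in rows for problem in validate_row(row)]
rates = masc_rate_by_group(rows)

print("missing columns:", missing)
print("row problems:", issues)
print("masculinised-choice rate by group:", rates)
```

To run the same checks on the real file, replace `io.StringIO(SAMPLE)` with a file handle opened via `open(path, newline="")`, where `path` points to your local copy of “main_study_face_data_modified.csv”.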