Our dataset contains 689 minutes of recorded event streams and 1.6 million annotated faces with bounding boxes and five-point facial landmarks.
- To access the dataset and pre-trained models, please follow the link and provide your information. After filling out the form, you will receive credentials and instructions on how to access the server.
- Pre-trained models come together with the dataset; alternatively, the pre-trained models are available at this Google Drive link.
Figure 2: File structure of the FES dataset, with green representing an event stream and blue representing annotations: a) The preprocessed data are divided into three folders, containing bounding-box-only annotations, both bounding box and facial landmark annotations, and event streams in the h5 format, respectively. The raw dataset contains lab and wild folders with raw videos and annotations. b) Each controlled-experiment (Lab) file has an individual subject ID and an experiment ID. Each file in the uncontrolled (Wild) dataset contains a scene ID that identifies the location of a recording and the number (ID) of an experiment.
The final dataset contains both the originally collected raw files and the preprocessed data. To produce the preprocessed data from the raw files, the reader can refer to the preprocessing folder of this repo. The raw files contain videos in the "raw" format that can be rendered, and annotations in the "xml" format. The converted files contain a training-ready dataset in the "npy" format, bounding box and facial landmark annotations, and "h5" files, a binary format for working with the event stream data as arrays in Python.
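As a minimal sketch of inspecting one of the converted "h5" files (no particular dataset key is assumed here; list the file's keys to see what it actually contains):

```python
import h5py
import numpy as np

# Minimal sketch: inspect a converted FES event-stream file with h5py.
with h5py.File("recording.h5", "r") as f:
    keys = list(f.keys())              # discover the datasets stored in the file
    print(keys)
    events = np.asarray(f[keys[0]])    # load the first dataset as a NumPy array
    print(events.shape, events.dtype)
```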
The integration of event streams with annotated labels was based on the time dimension. Since events were recorded at microsecond precision, the timeline of the labels was also converted to microseconds, although it originally had millisecond precision and was derived from the frame number at a frame rate of 30 Hz.
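As an illustrative sketch of that conversion (the function name is ours, not part of the repo):

```python
FRAME_RATE_HZ = 30  # frame rate of the annotated videos

def frame_to_timestamp_us(frame_idx: int) -> int:
    """Map a frame index to a microsecond timestamp for alignment with the event stream."""
    return round(frame_idx * 1_000_000 / FRAME_RATE_HZ)
```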
Further, this repo contains pre-processing scripts and instructions for inference and model training for face and facial landmark detection from event streams.
To launch training or inference, you first need to install the Metavision SDK environment.
- If you use Linux, install the Metavision SDK environment from this link.
- If you use Windows, install the Metavision SDK environment from this link.
git clone https://github.com/IS2AI/faces-in-event-streams
To run inference with the face bounding box model, use the following command:
python3 detection_and_tracking_pipeline.py --object_detector_dir /path/to/model --record_file <RAW file to process> --display
Alternatively, you can follow the instructions from this link.
Before launching training, please place label_map_dictionary.json, which comes as part of this repo, in the same folder where your train/val/test folders are located.
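As a quick sanity check that the file is in place (the single "face" class shown in the comment is our assumption; rely on the file shipped with this repo):

```python
import json
import os

# Hypothetical check: label_map_dictionary.json must sit next to the train/val/test folders.
dataset_root = "/path/to/dataset"  # placeholder path
with open(os.path.join(dataset_root, "label_map_dictionary.json")) as f:
    print(json.load(f))  # e.g. {"0": "face"} for single-class face detection (assumed)
```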
- To train the model to detect face bounding boxes in event streams, run:
cd <path to train_detection.py>
python3 train_detection.py <path to output directory> <path to dataset>
Alternatively, you can follow the instructions from here.
- To select the feature extractor, specify the --feature_extractor option:
python3 train_detection.py <path to output directory> <path to dataset> --feature_extractor <Vanilla | ResNet_18 | ResNet_34 | ResNet_50>
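For example, to train with a ResNet-18 backbone (the paths here are placeholders):
python3 train_detection.py ./output ./fes_dataset --feature_extractor ResNet_18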
To train or run inference for five-point facial landmark detection, the following files should be changed:
- feature_extractors.py
- box_processing.py
- display_frame.py
- modules.py
The required changes can be found in this repo in the following files:
- box_processing_difference.py
- display_frame_difference.py
- modules_difference.py
- feature_extractors_difference.py
Beyond that, the training and inference instructions are the same as for the face detection part described above.
Since
- feature_extractors.py
- box_processing.py
- display_frame.py
- modules.py
are files that originally come as part of the Metavision SDK, which is distributed under Prophesee Metavision Licensing, we share only the changes that have to be made to the installed Metavision SDK package (once you install the Metavision SDK, you will have direct access to the original files).
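If you are unsure where these files live after installation, the following sketch prints the install location of the SDK's ML package (the package name metavision_ml is an assumption and may differ between SDK versions):

```python
import importlib.util
import os

# Hypothetical helper: locate the installed Metavision ML package so you can
# find the original files to which the *_difference.py changes apply.
spec = importlib.util.find_spec("metavision_ml")
if spec is not None and spec.origin:
    print(os.path.dirname(spec.origin))  # directory containing the SDK's Python modules
else:
    print("metavision_ml not found; check your Metavision SDK installation")
```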
@Article{s24051409,
AUTHOR = {Bissarinova, Ulzhan and Rakhimzhanova, Tomiris and Kenzhebalin, Daulet and Varol, Huseyin Atakan},
TITLE = {Faces in Event Streams (FES): An Annotated Face Dataset for Event Cameras},
JOURNAL = {Sensors},
VOLUME = {24},
YEAR = {2024},
NUMBER = {5},
ARTICLE-NUMBER = {1409},
URL = {https://www.mdpi.com/1424-8220/24/5/1409},
ISSN = {1424-8220},
DOI = {10.3390/s24051409}
}