MOUSEDataPipeline provides tools for the (automatic) processing of new MOUSE datafiles, offering a structured approach to managing and analyzing scientific data generated by the MOUSE instrument.
- Measurement Date: A rough timestamp indicating when measurements on a specific set of samples began. Each set of samples belonging together is grouped under a unique measurement date in the format YYYYMMDD.
- Batch: Represents a set of measurements for a single sample. A batch includes all measurements across various configurations for that particular sample.
- Repetition: Refers to an individual measurement within a specific configuration. This includes the measurement alongside the preceding direct beam and direct-beam-through-sample measurements, which are essential for determining the primary beam flux, beam position, and transmission factor.
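These three identifiers are combined in the repetition directory names used throughout this document, of the form YYYYMMDD_[batch]_[repetition] (for example 20250101_21_22). As a minimal sketch, such a name could be split back into its parts as shown below. The names `RepetitionId` and `parse_repetition_name` are hypothetical and not part of the pipeline, and the sketch assumes that batch and repetition are plain integers.

```python
import re
from typing import NamedTuple

# Assumed naming convention: YYYYMMDD_[batch]_[repetition], e.g. 20250101_21_22.
# Batch and repetition are assumed to be non-negative integers.
_NAME_RE = re.compile(r"^(?P<ymd>\d{8})_(?P<batch>\d+)_(?P<repetition>\d+)$")


class RepetitionId(NamedTuple):
    """Hypothetical container for the three identifiers."""
    ymd: str          # measurement date, YYYYMMDD
    batch: int        # one sample, all configurations
    repetition: int   # one measurement within a configuration


def parse_repetition_name(name: str) -> RepetitionId:
    """Split a directory name into measurement date, batch and repetition."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a repetition directory name: {name!r}")
    return RepetitionId(match["ymd"], int(match["batch"]), int(match["repetition"]))


print(parse_repetition_name("20250101_21_22"))
# RepetitionId(ymd='20250101', batch=21, repetition=22)
```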
The data is organized under a predefined directory structure to ensure consistency and facilitate automated processing:
├─── Proposals
│   └─── 2025
└─── Measurements
    ├─── SAXS002
    │   ├─── logbooks
    │   └─── data
    │       └─── Masks
    │       └─── 2025
    │           └─── 20250101 # (measurement date)
    │               └─── 20250101_[batch]_[repetition] # directory with files
    │                   └─── eiger_[number]_master.h5
    │                   └─── eiger_[number]_data00001.h5
    │                   └─── im_craw.nxs
    │                   └─── beam_profile
    │                       └─── eiger_[number]_master.h5
    │                       └─── eiger_[number]_data00001.h5
    │                       └─── im_craw.nxs
    │                   └─── beam_profile_through_sample
    │                       └─── eiger_[number]_master.h5
    │                       └─── eiger_[number]_data00001.h5
    │                       └─── im_craw.nxs
    │               └─── 20250101_[batch]_[repetition]
    │                   └─── ...
    │               └─── autoproc # (processed datafiles)
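As a rough illustration of how this layout maps identifiers to a location on disk, the sketch below builds the path to one repetition directory. It assumes that the year directory is the first four digits of the measurement date and that batch and repetition numbers are written without zero padding, as in the example path 2025/20250101/20250101_21_22. The function name `repetition_dir` is hypothetical and not part of the pipeline.

```python
from pathlib import Path


def repetition_dir(data_root: Path, ymd: str, batch: int, repetition: int) -> Path:
    """Return <data_root>/<YYYY>/<YYYYMMDD>/<YYYYMMDD>_<batch>_<repetition>.

    Assumption: the year directory is derived from the measurement date.
    """
    if len(ymd) != 8 or not ymd.isdigit():
        raise ValueError(f"expected a YYYYMMDD measurement date, got {ymd!r}")
    return data_root / ymd[:4] / ymd / f"{ymd}_{batch}_{repetition}"


root = Path("Measurements/SAXS002/data")
print(repetition_dir(root, "20250101", 21, 22).as_posix())
# Measurements/SAXS002/data/2025/20250101/20250101_21_22
```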
Some flexibility is possible: the MOUSE_settings.yaml file contains the paths to the main sections of this tree, and these can be adapted to match your own directory structure.
To process a single directory with a given configuration and set of steps, run the following command in your terminal:
python src/directory_processor.py --config MOUSE_settings.yaml --single_dir ~/Documents/BAM/Measurements/newMouseTest/Measurements/SAXS002/data/2025/20250101/20250101_21_22 --steps processstep_translator_step_1 processstep_translator_step_2 processstep_beamanalysis
Alternatively, identify the repetition by its measurement date, batch, and repetition number:
python src/directory_processor.py --config MOUSE_settings.yaml --ymd 20250101 --batch 21 --repetition 22 --steps processstep_translator_step_1 processstep_translator_step_2 processstep_beamanalysis
To run all currently available steps for all repetitions in a batch, run the following:
python src/directory_processor.py --config MOUSE_settings.yaml \
--ymd 20250101 --batch 21 --parallel --steps \
processstep_translator_step_1 \
processstep_translator_step_2 \
processstep_beamanalysis \
processstep_cleanup_files \
processstep_add_mask_file \
processstep_metadata_update \
processstep_thickness_from_absorption \
processstep_add_background_files \
processstep_stacker
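The same invocations can also be assembled from Python, for example when looping over several batches. The sketch below uses only the flags that appear in the commands above (--config, --ymd, --batch, --repetition, --parallel, --steps). The helper name `build_command` is hypothetical and not part of the pipeline, and the script path assumes the working directory is the repository root.

```python
import shlex
import sys
from typing import List, Optional, Sequence


def build_command(
    ymd: str,
    batch: int,
    steps: Sequence[str],
    repetition: Optional[int] = None,
    parallel: bool = False,
    config: str = "MOUSE_settings.yaml",
) -> List[str]:
    """Assemble an argument list for src/directory_processor.py."""
    cmd = [
        sys.executable, "src/directory_processor.py",
        "--config", config,
        "--ymd", ymd,
        "--batch", str(batch),
    ]
    if repetition is not None:
        # Leave --repetition out to address the whole batch, as in the last command above.
        cmd += ["--repetition", str(repetition)]
    if parallel:
        cmd.append("--parallel")
    cmd += ["--steps", *steps]
    return cmd


cmd = build_command(
    "20250101", 21,
    ["processstep_translator_step_1", "processstep_translator_step_2", "processstep_beamanalysis"],
    parallel=True,
)
print(shlex.join(cmd))
# To actually launch it: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.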
- Processes all data for a specified measurement date (YYYYMMDD), batch, and repetition, or for a given directory path.
- Executes the defined processing steps, which should ideally be wrappers around CLI-executable scripts, though this isn't strictly enforced.
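To illustrate what "a wrapper around a CLI-executable script" might look like, here is a minimal sketch of a step that delegates to an external command and passes the target directory as the final argument. The class `CliStep`, its fields, and the step name `processstep_example` are hypothetical; the actual step interface in the pipeline may differ. The demonstration runs a trivial inline Python command in place of a real processing script.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class CliStep:
    """Hypothetical processing step that wraps a command-line script."""
    name: str
    argv: List[str]  # command and fixed arguments; the target directory is appended

    def run(self, target_dir: str) -> str:
        """Run the wrapped command on target_dir and return its standard output."""
        result = subprocess.run(
            [*self.argv, target_dir],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if the script fails
        )
        return result.stdout


# Stand-in for a real script: echoes the directory it was asked to process.
step = CliStep(
    name="processstep_example",
    argv=[sys.executable, "-c", "import sys; print('processing', sys.argv[1])"],
)
print(step.run("20250101_21_22"), end="")
# processing 20250101_21_22
```

Keeping each step runnable from the command line means it can be tested and re-run by hand on a single directory, independently of the orchestration.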
WIP, not functional yet! This component aims to continuously monitor a measurement date directory for newly completed repetitions, automatically processing them as they become available.
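Since this component is not implemented yet, the following is only a sketch of one possible polling approach, not the planned design. It scans a measurement date directory for repetition directories that have not been reported before. Two assumptions are made here and are not taken from the pipeline: repetition directories match YYYYMMDD_[batch]_[repetition], and a repetition counts as complete once it contains an im_craw.nxs file. The function name `find_new_repetitions` is hypothetical.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Set

# Assumed naming convention for repetition directories.
REPETITION_RE = re.compile(r"^\d{8}_\d+_\d+$")
# Assumed completion marker; the real criterion may well be different.
MARKER = "im_craw.nxs"


def find_new_repetitions(date_dir: Path, seen: Set[str]) -> List[Path]:
    """Return complete repetition directories not reported before; update `seen`."""
    new_dirs = sorted(
        p for p in date_dir.iterdir()
        if p.is_dir()
        and REPETITION_RE.match(p.name)
        and p.name not in seen
        and (p / MARKER).exists()
    )
    seen.update(p.name for p in new_dirs)
    return new_dirs


# Demonstration on a temporary directory standing in for a measurement date.
with tempfile.TemporaryDirectory() as tmp:
    date_dir = Path(tmp)
    seen: Set[str] = set()
    (date_dir / "20250101_21_1").mkdir()
    (date_dir / "20250101_21_1" / MARKER).touch()
    (date_dir / "20250101_21_2").mkdir()   # still being measured: no marker yet
    (date_dir / "autoproc").mkdir()        # ignored: not a repetition directory
    print([p.name for p in find_new_repetitions(date_dir, seen)])  # ['20250101_21_1']
    (date_dir / "20250101_21_2" / MARKER).touch()
    print([p.name for p in find_new_repetitions(date_dir, seen)])  # ['20250101_21_2']
```

A watcher could call such a function in a loop with a sleep between scans and hand each returned directory to the directory processor.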
TBC...