Tools for (automatic) processing of the new MOUSE datafiles

BAMresearch/MOUSEDataPipeline

MOUSEDataPipeline

MOUSEDataPipeline provides tools for the (automatic) processing of new MOUSE datafiles, offering a structured approach to manage and analyze scientific data generated by the MOUSE instrument.

prerequisites and assumptions

Nomenclature

  • Measurement Date: A rough timestamp indicating when measurements on a specific set of samples began. Each set of samples belonging together is grouped under a unique measurement date in the format YYYYMMDD.

  • Batch: Represents a set of measurements for a single sample. A batch includes all measurements across various configurations for that particular sample.

  • Repetition: Refers to an individual measurement within a specific configuration. This includes the measurement alongside the preceding direct beam and direct-beam-through-sample measurements, which are essential for determining the primary beam flux, beam position, and transmission factor.
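The three terms above can be carried around as one small identifier. The following is a minimal sketch, not part of MOUSEDataPipeline: `RepetitionId` and `parse_repetition_dirname` are hypothetical names, and it assumes that batch and repetition are plain non-negative integers joined to the measurement date with underscores, as in the example directory name `20250101_21_22` used elsewhere in this README.

```python
import re
from typing import NamedTuple


class RepetitionId(NamedTuple):
    """Hypothetical container for the three nomenclature terms."""
    ymd: str          # measurement date, YYYYMMDD
    batch: int        # one sample, all configurations
    repetition: int   # one measurement in one configuration


# Assumption: names look like YYYYMMDD_<batch>_<repetition> with integer parts.
_NAME_RE = re.compile(r"^(\d{8})_(\d+)_(\d+)$")


def parse_repetition_dirname(name: str) -> RepetitionId:
    """Split a repetition directory name into date, batch and repetition."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a YYYYMMDD_batch_repetition name: {name!r}")
    ymd, batch, repetition = match.groups()
    return RepetitionId(ymd, int(batch), int(repetition))


print(parse_repetition_dirname("20250101_21_22"))
```

Converting to `int` would drop any zero padding, so keep the parts as strings if your directory names are padded.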

expected directory structure

The data is organized under a predefined directory structure to ensure consistency and facilitate automated processing:

├─── Proposals
│   └─── 2025
└─── Measurements
    ├─── SAXS002
    │   ├─── logbooks
    │   └─── data
    │       ├─── Masks
    │       └─── 2025
    │           └─── 20250101  # (measurement date)
    │               ├─── 20250101_[batch]_[repetition]  # directory with files
    │               │   ├─── eiger_[number]_master.h5
    │               │   ├─── eiger_[number]_data00001.h5
    │               │   ├─── im_craw.nxs
    │               │   ├─── beam_profile
    │               │   │   ├─── eiger_[number]_master.h5
    │               │   │   ├─── eiger_[number]_data00001.h5
    │               │   │   └─── im_craw.nxs
    │               │   └─── beam_profile_through_sample
    │               │       ├─── eiger_[number]_master.h5
    │               │       ├─── eiger_[number]_data00001.h5
    │               │       └─── im_craw.nxs
    │               ├─── 20250101_[batch]_[repetition]
    │               ├─── ...
    │               └─── autoproc  # (processed datafiles)

Some flexibility is possible: the MOUSE_settings.yaml file contains the paths to the relevant sections of the tree, and these can be adapted to point at the corresponding locations in your own structure.
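To make the layout concrete, here is a minimal sketch of how a repetition directory could be located from a measurement date, batch and repetition. The function `repetition_dir` and the hard-coded `root` are hypothetical; in practice the data root would come from MOUSE_settings.yaml, whose keys are not described here.

```python
from pathlib import Path


def repetition_dir(data_root: Path, ymd: str, batch: int, repetition: int) -> Path:
    """Return <data_root>/<YYYY>/<YYYYMMDD>/<YYYYMMDD>_<batch>_<repetition>."""
    year = ymd[:4]  # the year folder is the first four digits of the date
    return data_root / year / ymd / f"{ymd}_{batch}_{repetition}"


# Hypothetical data root, mirroring the tree shown above.
root = Path("Measurements") / "SAXS002" / "data"
rep = repetition_dir(root, "20250101", 21, 22)

print(rep)
# Files of the preceding direct-beam measurement live one level below.
print(rep / "beam_profile" / "im_craw.nxs")
```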

usage example:

To process a single directory with a specific configuration and set of steps, run the following command in your terminal:

python src/directory_processor.py --config MOUSE_settings.yaml --single_dir ~/Documents/BAM/Measurements/newMouseTest/Measurements/SAXS002/data/2025/20250101/20250101_21_22 --steps processstep_translator_step_1 processstep_translator_step_2 processstep_beamanalysis

Alternatively, specify measurement details directly:

python src/directory_processor.py --config MOUSE_settings.yaml --ymd 20250101 --batch 21 --repetition 22 --steps processstep_translator_step_1 processstep_translator_step_2 processstep_beamanalysis
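If you drive the processor from another script, it can help to assemble the command as an argument list rather than a string. This is only a sketch: `build_command` is a hypothetical helper, and it uses only the flags that appear in the commands above.

```python
import shlex


def build_command(config: str, ymd: str, batch: int, repetition: int,
                  steps: list[str]) -> list[str]:
    """Assemble the directory_processor call as an argv-style list."""
    return [
        "python", "src/directory_processor.py",
        "--config", config,
        "--ymd", ymd,
        "--batch", str(batch),
        "--repetition", str(repetition),
        "--steps", *steps,
    ]


steps = [
    "processstep_translator_step_1",
    "processstep_translator_step_2",
    "processstep_beamanalysis",
]
cmd = build_command("MOUSE_settings.yaml", "20250101", 21, 22, steps)

# shlex.join produces a shell-safe string for logging or copy-pasting.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting problems.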

To run all currently available steps for every repetition in a batch, run the following (note that no --repetition is given and --parallel is set):

python src/directory_processor.py --config MOUSE_settings.yaml \
--ymd 20250101 --batch 21 --parallel --steps \
processstep_translator_step_1 \
processstep_translator_step_2 \
processstep_beamanalysis \
processstep_cleanup_files \
processstep_add_mask_file \
processstep_metadata_update \
processstep_thickness_from_absorption \
processstep_add_background_files \
processstep_stacker
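How --parallel works internally is not documented here. As an illustration of the general idea only, the sketch below finds every repetition directory of one batch and hands each to a worker pool. `find_repetition_dirs` and `process_repetition` are hypothetical names, and the example builds a throwaway directory tree so that it runs anywhere.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_repetition_dirs(date_dir: Path, ymd: str, batch: int) -> list[Path]:
    """Return all <ymd>_<batch>_<repetition> directories, sorted by repetition."""
    prefix = f"{ymd}_{batch}_"
    found = [
        p for p in date_dir.iterdir()
        if p.is_dir()
        and p.name.startswith(prefix)
        and p.name[len(prefix):].isdigit()
    ]
    # Sort numerically so that repetition 10 comes after repetition 2.
    return sorted(found, key=lambda p: int(p.name[len(prefix):]))


def process_repetition(rep_dir: Path) -> str:
    # Placeholder: a real run would execute the processing steps here.
    return f"processed {rep_dir.name}"


with tempfile.TemporaryDirectory() as tmp:
    date_dir = Path(tmp)
    for name in ("20250101_21_2", "20250101_21_10", "20250101_22_1", "autoproc"):
        (date_dir / name).mkdir()

    reps = find_repetition_dirs(date_dir, "20250101", 21)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves the input order of the repetitions.
        results = list(pool.map(process_repetition, reps))

print(results)
```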

top-level methods:

1. directory_processor

  • Processes all data for a specified measurement date (YYYYMMDD), batch, and repetition, or for a given directory path.
  • Executes the defined processing steps, which should ideally be wrappers around CLI-executable scripts, though this isn't strictly enforced.
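The idea of steps as thin wrappers around CLI-executable scripts can be sketched as follows. This is not the project's implementation: `CliStep` and `run_steps` are hypothetical, the commands are stand-ins that only call the Python interpreter, and stopping at the first failing step is an assumption.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CliStep:
    """Hypothetical wrapper: a named step that runs one external command."""
    name: str
    argv: list[str]

    def run(self) -> bool:
        result = subprocess.run(self.argv, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"{self.name} failed: {result.stderr.strip()}")
        return result.returncode == 0


def run_steps(steps: list[CliStep]) -> list[str]:
    """Run steps in order; stop at the first failure (assumed behaviour)."""
    completed = []
    for step in steps:
        if not step.run():
            break
        completed.append(step.name)
    return completed


# Stand-in commands; real steps would call the actual processing scripts.
steps = [
    CliStep("step_ok", [sys.executable, "-c", "print('ok')"]),
    CliStep("step_fail", [sys.executable, "-c", "raise SystemExit(1)"]),
    CliStep("step_never_run", [sys.executable, "-c", "print('unreachable')"]),
]
completed = run_steps(steps)
print(completed)
```

Keeping each step runnable from the command line means it can be tested and re-run on its own, outside the pipeline.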

2. watcher

Work in progress, not functional yet! This component is intended to continuously monitor a measurement date directory for newly completed repetitions and to process them automatically as they become available.
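Since the watcher is not implemented yet, the following is only one possible approach, sketched under stated assumptions: a repetition counts as "completed" once a marker file exists in its directory (here `im_craw.nxs`, which is a guess), and `new_completed_repetitions` is a hypothetical function that would be called from a polling loop.

```python
import tempfile
from pathlib import Path


def new_completed_repetitions(date_dir: Path, seen: set[str],
                              marker: str = "im_craw.nxs") -> list[Path]:
    """Return repetition directories that contain the marker and are not yet seen."""
    ready = []
    for path in sorted(date_dir.iterdir()):
        if path.is_dir() and path.name not in seen and (path / marker).exists():
            seen.add(path.name)  # remember it so it is reported only once
            ready.append(path)
    return ready


with tempfile.TemporaryDirectory() as tmp:
    date_dir = Path(tmp)
    seen: set[str] = set()

    (date_dir / "20250101_21_1").mkdir()
    (date_dir / "20250101_21_1" / "im_craw.nxs").touch()
    (date_dir / "20250101_21_2").mkdir()  # still being written: no marker yet

    first = [p.name for p in new_completed_repetitions(date_dir, seen)]

    # The second repetition finishes between two polls.
    (date_dir / "20250101_21_2" / "im_craw.nxs").touch()
    second = [p.name for p in new_completed_repetitions(date_dir, seen)]

print(first, second)
```

A real watcher would call this function repeatedly, for example with `time.sleep` between polls, and pass each returned directory on to the processing steps.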

functionality methods:

TBC...
