This system is a crowdsourcing task management system: it prepares data and assigns available tasks to the appropriate users in crowdsourcing scenarios.
In its current state, the system is optimized for a crowd-assisted Optical Music Recognition (OMR) pipeline. More specifically, its functionality can be summarized as follows:
- The system receives PDF files of music scores
- A measure detector identifies measures per page and creates a "skeleton" MEI file for each score
- Each PDF file is segmented and associated with parts of the generated MEI file
- Each segment is paired with its corresponding MEI part and served as a crowdsourcing task
- The system identifies available crowdsourcing task types and makes tasks available via its API
- As results are posted back to the API, crowd judgements are processed and aggregated
- Crowd-processed MEI parts are merged to re-create the music score originally submitted as a PDF
- Versions of the crowd-processed MEI file are periodically pushed to GitHub (if enabled) until the end of the transcription campaign
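
The judgement-aggregation step in the list above could look roughly like the following. This is a minimal sketch, not the system's actual implementation: the `Judgement` fields, the `aggregate` function, the `min_votes` threshold, and the strict-majority rule are all assumptions made for illustration, since this README does not specify how judgements are aggregated.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    # All field names are hypothetical; the real API payload may differ.
    task_id: str    # identifies one segment / MEI-part task
    worker_id: str  # identifies the contributing user
    answer: str     # e.g. a transcribed value or a verification label


def aggregate(judgements, min_votes=3):
    """Return {task_id: winning answer} for tasks with a strict majority."""
    by_task = defaultdict(list)
    for j in judgements:
        by_task[j.task_id].append(j.answer)

    results = {}
    for task_id, answers in by_task.items():
        if len(answers) < min_votes:
            continue  # too few judgements so far; the task stays open
        answer, votes = Counter(answers).most_common(1)[0]
        if votes > len(answers) / 2:  # require a strict majority
            results[task_id] = answer
    return results


if __name__ == "__main__":
    posted = [
        Judgement("measure-1", "alice", "C4"),
        Judgement("measure-1", "bob", "C4"),
        Judgement("measure-1", "carol", "D4"),
        Judgement("measure-2", "alice", "E4"),
    ]
    print(aggregate(posted))  # {'measure-1': 'C4'}
```

In this sketch a task with too few judgements, or with no strict majority, is simply left out of the result, so it could be offered to further users before its MEI part is merged back into the score.
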

Prerequisites:

- CE-API (optional) (GitHub page)
- Docker (Website)
- If you also plan to run the front-end scriptoria, clone it into a `scriptoria` folder in the root of this repository
- Run `start_local.sh`; it will set up and start the Docker containers automatically
In case you get an error about file sharing, please consult this thread.
There is still a lot to be done. To make this a little easier, issues have been created and tagged with the Future Work label, in case anyone wants to carry the torch forward. A couple of these might involve large refactors.