Popular repositories

  1. extractor Public

    C++ 23 4

  2. corset Public

    Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

    SCSS 17 3

  3. keops Public

    Tool for manual evaluation of parallel sentences.

    PHP 14 4

  4. DataCollection Public

    Forked from modernmt/DataCollection

    Data collection, alignment and TAUS repository

    Python 8 3

  5. cirrus-scripts Public

    Scripts for running bitextor/paracrawl/europat jobs on cirrus.ac.uk

    Shell 7 1

  6. synthesis Public

    Data synthesis by contextualizing glossary translations

    Python 6 3

Repositories

Showing 10 of 20 repositories
  • keops Public

    Tool for manual evaluation of parallel sentences.

    PHP 14 GPL-3.0 4 0 1 Updated Mar 17, 2025
  • giashard Public

    Sharding program for Paracrawl

    Go 2 1 0 0 Updated Oct 16, 2024
  • cirrus-scripts Public

    Scripts for running bitextor/paracrawl/europat jobs on cirrus.ac.uk

    Shell 7 1 8 1 Updated Sep 26, 2024
  • giawarc Public

    Processing utilities for Internet Archive

    C++ 1 0 4 1 Updated Apr 19, 2024
  • corset Public

    Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

    SCSS 17 GPL-3.0 3 1 0 Updated Nov 6, 2023
  • europat-scripts Public

    Scripts for obtaining patent data

    Java 5 2 1 1 Updated Apr 14, 2023
  • tmxutil Public

    Tools to generate and filter Europat TMX files.

    Python 4 MIT 1 1 0 Updated Jan 17, 2023
  • synthesis Public

    Data synthesis by contextualizing glossary translations

    Python 6 3 0 0 Updated Jul 1, 2021
  • opus-train Public

    Automate download and training with OPUS corpora

    Shell 2 MIT 2 0 0 Updated Jan 28, 2021
  • human-evaluations Public

    Results of the human evaluation

    Rich Text Format 5 3 0 0 Updated Dec 9, 2020
