Commit c9b5d98 (parent 58f8df1), "Update automatic-post-editing.md" (#570): 1 changed file, 45 additions, 3 deletions.
---
nav_order: 5
parent: Building and research
title: Automatic post-editing
description: Automatic post-editing of machine translation
---
**Automatic post-editing (APE)** is the task of automatically correcting the output of a machine translation system. As in manual human post-editing, the objective is to improve the quality of the machine translation output. For example, automatic post-editing can be used to correct errors or to apply a certain style.

Automatic post-editing systems should meet two requirements:
- detect and fix machine translation errors
- avoid making overcorrections, especially those that add new errors
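The second requirement can be enforced with a simple guard. Below is a minimal sketch, where `toy_ape_model` is a hypothetical stand-in for a trained APE model and `max_edit_ratio` is an illustrative threshold, not a standard value:

```python
def toy_ape_model(source: str, mt_output: str) -> str:
    # Toy stand-in for a trained APE model: switches informal
    # "estás" to the formal "está".
    return mt_output.replace("estás", "está")

def post_edit(source: str, mt_output: str, max_edit_ratio: float = 0.5) -> str:
    """Apply the APE model, but fall back to the raw MT output
    if the candidate rewrites too large a share of the tokens."""
    candidate = toy_ape_model(source, mt_output)
    mt_tokens, cand_tokens = mt_output.split(), candidate.split()
    changed = sum(a != b for a, b in zip(mt_tokens, cand_tokens))
    changed += abs(len(mt_tokens) - len(cand_tokens))
    if changed / max(len(mt_tokens), 1) > max_edit_ratio:
        return mt_output  # avoid overcorrection: keep the MT output
    return candidate

print(post_edit("How are you?", "¿Cómo estás?"))  # ¿Cómo está?
```

In practice the guard would be a learned quality estimate rather than a raw edit ratio, but the fallback-to-MT pattern is the same.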
### Evolution

The evolution of automatic post-editing systems is similar to that of machine translation. The first systems took [rule-based approaches](/rule-based-machine-translation.md), later systems took [statistical approaches](/statistical-machine-translation.md) and then [neural approaches](/neural-machine-translation.md).
### Use cases

Automatic post-editing can be applied in several use cases:
- fixing errors in machine translation outputs
- adapting the output of a machine translation system to a custom domain
- providing alternative translation suggestions in translation tools, for example [OpenTIPE](https://aclanthology.org/2023.acl-demo.19.pdf)
### Training

Automatic post-editing systems are usually trained with parallel triplets containing:
- a source text
- the machine-translated version of this source text
- the post-edited version of the machine-translated text
| Source | Machine translation | Human post-edited translation |
|---|---|---|
| How are you? | ¿Cómo estás? | ¿Cómo está? |
| Computer | Computadora | Ordenadora |
| … | … | … |
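Such triplets can be represented as plain records. A minimal sketch with illustrative field names (`source`, `mt`, `pe` are not a standard schema), using the examples from the table above:

```python
from dataclasses import dataclass

@dataclass
class ApeTriplet:
    source: str  # original source text
    mt: str      # machine-translated version of the source
    pe: str      # post-edited version of the machine translation

training_data = [
    ApeTriplet("How are you?", "¿Cómo estás?", "¿Cómo está?"),
    ApeTriplet("Computer", "Computadora", "Ordenadora"),
]
print(training_data[0].pe)  # ¿Cómo está?
```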
### Datasets

The datasets from the [quality estimation](/quality-estimation) shared task at [WMT](/wmt) can be used for training and evaluating automatic post-editing systems.
When human post-edited translations are not available, synthetic post-editing data can be created from ordinary translation data. For example, [eSCAPE (Synthetic Corpus for Automatic Post-Editing)](https://aclanthology.org/L18-1004.pdf) creates synthetic data by inserting a machine translation.
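The idea can be sketched as follows, assuming a hypothetical `toy_translate` function in place of a real MT system; the human reference stands in for the post-edited text:

```python
def toy_translate(source: str) -> str:
    # Hypothetical MT system, here just a tiny lookup for illustration.
    lexicon = {"Computer": "Computadora", "How are you?": "¿Cómo estás?"}
    return lexicon.get(source, source)

def make_synthetic_triplets(parallel_pairs):
    """Turn ordinary parallel data (source, human reference) into
    (source, mt, pe) triplets by inserting a machine translation."""
    return [(src, toy_translate(src), ref) for src, ref in parallel_pairs]

pairs = [("Computer", "Ordenadora"), ("How are you?", "¿Cómo está?")]
print(make_synthetic_triplets(pairs)[0])  # ('Computer', 'Computadora', 'Ordenadora')
```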
### Evaluation

Automatic post-editing systems can be evaluated like machine translation systems:
- Automatic, reference-based evaluation metrics, like [TER](/building-and-research/metrics/ter.md) or [BLEU](/building-and-research/metrics/bleu.md)
- Human evaluation, like direct assessment
Evaluation also reveals how many sentences were improved. Precision can be calculated by dividing the number of improved sentences by the total number of modified sentences. Another common metric is the average number of edits per segment.
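These two metrics are simple ratios; a minimal sketch with made-up counts:

```python
def ape_precision(num_improved: int, num_modified: int) -> float:
    """Share of modified sentences that were actually improved."""
    return num_improved / num_modified if num_modified else 0.0

def avg_edits_per_segment(edit_counts) -> float:
    """Average number of edits the APE system made per segment."""
    return sum(edit_counts) / len(edit_counts) if edit_counts else 0.0

# Illustrative numbers: 10 sentences modified, 8 of them improved,
# and per-segment edit counts for four segments.
print(ape_precision(8, 10))                 # 0.8
print(avg_edits_per_segment([1, 0, 3, 2]))  # 1.5
```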