Rearrange directories in the repo (#550)
* Move features to nav_bar

* Move apis to nav_bar

* Move events to nav_bar

* Move calls-for-papers to nav_bar

* Move integrations to nav_bar

* Move aggregators to nav_bar

* Move languages to nav_bar

* Move quality-estimation to nav_bar

* Move building-and-research to nav_bar

* Move concepts to nav_bar

* Move resources to nav_bar

* Move more to nav_bar

* Move about.md to nav_bar

* Move community.md to nav_bar

* Move newsletter.md to nav_bar

* Move contributing to nav_bar

* Move community to nav_bar

* Move machinetranslate.jpg to images

* Move machinetranslate.png to images

* Move favicon.ico to images

* Move CNAME to files

* Change path for favicon.ico

* Move industry to nav_bar/more

* Move people to nav_bar/more

* Move research-laboratories to nav_bar/more

* Move associations to nav_bar/more

* Move nav_bar/community to  nav_bar/more

* Move roadmap.md to nav_bar/contributing

* Remove duplicate quality-estimation.md

* Move quality to nav_bar/building-and-research

* Move customization to nav_bar/features

* Move approaches to nav_bar/building-and-research

* Move CONTRIBUTING.md to nav_bar/contributing

* Move workflows to applications

* Move applications to nav_bar/building-and-research

* Change filepaths after rearrangement

* Change paths

* Add QE APIs support in Language pages (#549)

* Add QE APIs to languages and fix languages nav bar

* Exclude languages from nav bar and improve wording

---------

Co-authored-by: Tovmas <tharrison748@gmail.com>

tovmasharrison and Tovmas authored Oct 22, 2023
1 parent 1184139 commit fcdee82
Showing 633 changed files with 178 additions and 356 deletions.
2 changes: 1 addition & 1 deletion _includes/head.html
@@ -14,7 +14,7 @@
{% endif %}
{% endunless %}

<link rel="shortcut icon" href="{{ 'favicon.ico' | relative_url }}" type="image/x-icon">
<link rel="shortcut icon" href="{{ 'images/favicon.ico' | relative_url }}" type="image/x-icon">

<link rel="stylesheet" href="{{ '/assets/css/just-the-docs-default.css' | relative_url }}">

32 changes: 0 additions & 32 deletions applications/live-chat.md

This file was deleted.

File renamed without changes.
15 changes: 7 additions & 8 deletions generate.py
@@ -144,7 +144,7 @@ def normalize(code):
name = LANGUAGE_FAMILIES[code]

slug = slugify(name)
filepath = f'languages/{ slug }.md'
filepath = f'nav_bar/languages/{ slug }.md'

# TODO: check that it won't be overwritten by a language

@@ -257,7 +257,7 @@ def normalize(code):
}

slug = slugify(name)
filepath = f'languages/{ slug }.md'
filepath = f'nav_bar/languages/{ slug }.md'

content = read_content(filepath)

@@ -392,7 +392,7 @@ def normalize(code):

content = read_content(filepath)

filepath = f'apis/{ api_id }.md'
filepath = f'nav_bar/apis/{ api_id }.md'
with open(filepath, 'w', encoding='utf8') as f:
f.write(f'''\
---
@@ -497,7 +497,7 @@ def normalize(code):

content = read_content(filepath)

filepath = f'integrations/{tms_id}.md'
filepath = f'nav_bar/integrations/{tms_id}.md'
with open(filepath, 'w', encoding='utf8') as f:
f.write(f'''\
---
@@ -554,7 +554,7 @@ def normalize(code):

content = read_content(filepath)

filepath = f'aggregators/{a_id}.md'
filepath = f'nav_bar/aggregators/{a_id}.md'
with open(filepath, 'w', encoding='utf8') as f:
f.write(f'''\
---
@@ -679,9 +679,8 @@ def normalize(code):
}
}

slug = slugify(name)

filepath = f'quality-estimation/{ slug }.md'
slug = slugify(name.replace('-', '').replace(' ', ' '))
filepath = f'nav_bar/quality-estimation/{ slug }.md'

content = read_content(filepath)
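
The hunks above each prepend the same `nav_bar/` prefix to a hardcoded path. As a rough sketch only, the prefix could be defined once and reused; `CONTENT_ROOT` and `page_path` are hypothetical names that do not appear in `generate.py`:

```python
from pathlib import PurePosixPath

# Hypothetical constant: the directory that holds the navigation content
# after this commit. Not part of the actual generate.py.
CONTENT_ROOT = PurePosixPath('nav_bar')


def page_path(section: str, slug: str) -> str:
    """Build the Markdown path for a generated page under CONTENT_ROOT."""
    return str(CONTENT_ROOT / section / f'{slug}.md')


print(page_path('languages', 'french'))  # nav_bar/languages/french.md
```

With a helper like this, a later directory move would need a change in one place rather than in every f-string.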

File renamed without changes.
Original file line number Diff line number Diff line change
@@ -12,5 +12,5 @@ English is a common choice for the bridge because many languages have training d

## Applications

- Translation without [parallel data](/customisation/parallel-data.md)
- Translation without [parallel data](/nav_bar/features/customisation/parallel-data.md)
- Paraphrasing in a single language
Original file line number Diff line number Diff line change
@@ -5,9 +5,9 @@ title: Locale
description: Specification of language variants
---

A **locale** is an identifier of a [language](/languages/languages.md) and region, plus an optional writing script.
The locale is used in [machine translation APIs](/apis/apis.md) to specify the language of the source and target text.
Locales are used to indicate the language of documents in [web crawling](/customisation/crawling.md) to build [training data](/customisation/crawling.md).
A **locale** is an identifier of a [language](/nav_bar/languages/languages.md) and region, plus an optional writing script.
The locale is used in [machine translation APIs](/nav_bar/apis/apis.md) to specify the language of the source and target text.
Locales are used to indicate the language of documents in [web crawling](/nav_bar/features/customisation/crawling.md) to build [training data](/nav_bar/features/customisation/crawling.md).

Example: `frCA` means French (fr) as spoken in Canada (CA)

@@ -53,4 +53,4 @@ These language variations are supported by many API vendors:

## See also

- [Languages](/languages/languages.md)
- [Languages](/nav_bar/languages/languages.md)
Original file line number Diff line number Diff line change
@@ -13,11 +13,11 @@ Zero-shot machine translation is an active area of research.

## Approaches

- Multilingual [neural machine translation](/approaches/neural-machine-translation.md) (MNMT): This approach learns a single model for all language pairs. The target language is an input to the model. This approach can produce translations between languages that both have parallel data, but not necessarily parallel data with each other. As of 2022, [Google Translate uses MNMT](https://ai.googleblog.com/2022/05/24-new-languages-google-translate.html).
- Multilingual [neural machine translation](/nav_bar/building-and-research/approaches/neural-machine-translation.md) (MNMT): This approach learns a single model for all language pairs. The target language is an input to the model. This approach can produce translations between languages that both have parallel data, but not necessarily parallel data with each other. As of 2022, [Google Translate uses MNMT](https://ai.googleblog.com/2022/05/24-new-languages-google-translate.html).
- Stitching together encoders and decoders
- When training models for many language pairs, it's possible to ensure that the latent representations are similar across languages, then connect the appropriate encoder and decoder for the desired language pair.
- Unsupervised methods learn encoders and decoders as denoising autoencoders, making an effort to ensure that the latent representations are similar across languages, then connect the appropriate encoder and decoder for the desired language pair.

## See also

- [Bridging](bridging.md): Bridging is also commonly used to translate between languages without parallel data.
- [Bridging](/nav_bar/building-and-research/applications/advanced-concepts/bridging.md): Bridging is also commonly used to translate between languages without parallel data.
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ Machine translation for **commerce and marketplaces** is the translation of prod
## Goals

* sales conversion
* [SEO](seo.md) and site search
* [SEO](/nav_bar/building-and-research/applications/seo.md) and site search
* customer support

The top merchants and platforms cannot human-translate all products into all languages, because of the scale.
@@ -22,4 +22,4 @@ The translation of product titles and descriptions is handled by the platforms,
* Product titles
* Product descriptions
* Product reviews
* Customer support [chat](live-chat.md) and email
* Customer support [chat](/nav_bar/building-and-research/applications/live-chat.md) and email
Original file line number Diff line number Diff line change
@@ -25,7 +25,7 @@ When applied to localised content, machine translation is used to detect potenti
- truncated words or phrases due to lack of screen space
- unsuitable sort order in target language

Detecting these issues at an early phase allows taking corrective actions, in parallel with the actual professional translation or [post-editing](/post-editing.md), or planning for necessary post-processing of localised content, for example:
Detecting these issues at an early phase allows taking corrective actions, in parallel with the actual professional translation or [post-editing](/nav_bar/building-and-research/applications/workflows/post-editing.md), or planning for necessary post-processing of localised content, for example:

- Enable proper locale, character sets and fonts.
- Resize assets for various text lengths, or plan for resizing these assets at the time of translation.
@@ -39,11 +39,11 @@ Detecting these issues at an early phase allows taking corrective actions, in pa

The source image in English relies on the exact length of English text:

![English image](content-drafting-images/image_with_text_eng.svg)
![English image](/nav_bar/building-and-research/applications/content-drafting-images/image_with_text_eng.svg)

Localised images will have different text lengths depending on the target language. For example, German text is usually longer and does not fit, so some work is needed after the translation to adjust the localised images:

![German image](content-drafting-images/image_with_text_ger.svg)
![German image](/nav_bar/building-and-research/applications/content-drafting-images/image_with_text_ger.svg)
<!-- example of mock-up UI localisation with wrong sorting of a translated list -->


@@ -65,4 +65,4 @@ This technique omits the basic difficulty of post-editing: while the post-editor

An author may use DeepL in the browser to draft an English paragraph from a Polish source.

![Drafting text in DeepL](content-drafting-images/drafting-text-deepl.png)
![Drafting text in DeepL](/nav_bar/building-and-research/applications/content-drafting-images/drafting-text-deepl.png)
Original file line number Diff line number Diff line change
@@ -19,7 +19,7 @@ In-game:
* UI strings
* On-screen text and images
* Subtitles
* [Live chat](live-chat.md)
* [Live chat](/nav_bar/building-and-research/applications/live-chat.md)

Out-of-game:

32 changes: 32 additions & 0 deletions nav_bar/building-and-research/applications/live-chat.md
@@ -0,0 +1,32 @@
---
nav_order: 8
parent: Application areas
title: Live chat
description: Machine translation for live chat
---

# Live chat

Machine translation for **live chat** is used to translate messages between two users who do not speak a common language.

Live chat messages are a type of [user-generated content](/nav_bar/building-and-research/applications/user-generated-content.md).
Live chat is challenging for machine translation because it is noisy and context-dependent.

Common scenarios are in-game chat in [gaming](/nav_bar/building-and-research/applications/gaming.md) and customer support for [commerce and marketplaces](/nav_bar/building-and-research/applications/commerce-and-marketplaces.md).

Many commercial chat applications that have incorporated machine translation for live chat:

- Skype
- Telegram
- WeChat
- Line

## Companies

* [Unbabel](/nav_bar/more/industry/companies.md/#unbabel)
* [Language I/O](/nav_bar/more/industry/companies.md/#language-io)
* [KantanMT](/nav_bar/more/industry/companies.md/#kantanmt)

## See also

* [Speech translation](/nav_bar/building-and-research/other-input-types/speech.md)
Original file line number Diff line number Diff line change
@@ -14,6 +14,6 @@ This is a challenge for machine translation since many names and terms can have

# Content types

- [Social networks](social-networks.md)
- [Social networks](/nav_bar/building-and-research/applications/social-networks.md)
- Messenger apps
- [Commerce and marketplaces](commerce-and-marketplaces.md)
- [Commerce and marketplaces](/nav_bar/building-and-research/applications/commerce-and-marketplaces.md)
Original file line number Diff line number Diff line change
@@ -5,13 +5,13 @@ title: SEO
description: Machine translation for SEO
---

Machine translation for **search engine optimization** \(**SEO**\) is the translation of [commerce and marketplace](commerce-and-marketplaces.md) website content into the languages in which users are searching.
Machine translation for **search engine optimization** \(**SEO**\) is the translation of [commerce and marketplace](/nav_bar/building-and-research/applications/commerce-and-marketplaces.md) website content into the languages in which users are searching.

Translations for search engine optimization is challenging for machine translation because the end goal is not just to convey the meaning, but to use the words that the users actually search for in the target language.

Short input, like keywords and tags, and non-sentence input, like lists of keywords, are also a challenge for machine translation.

Content can be purely machine-translated, [hybrid-translated](/workflows/hybrid-translation.md) or human [post-edited](/workflows/post-editing.md).
Content can be purely machine-translated, [hybrid-translated](/nav_bar/building-and-research/applications/workflows/hybrid-translation.md) or human [post-edited](/nav_bar/building-and-research/applications/workflows/post-editing.md).
Search engines can penalize machine-generated content, including purely machine-translated content.

### Content types
Original file line number Diff line number Diff line change
@@ -17,7 +17,7 @@ Social networks with a machine translation feature include:
* YouTube
* TikTok

While the static content of social networks is human-translated, machine translation is used for [user-generated content](user-generated-content.md).
While the static content of social networks is human-translated, machine translation is used for [user-generated content](/nav_bar/building-and-research/applications/user-generated-content.md).

## Content types

Original file line number Diff line number Diff line change
@@ -10,12 +10,12 @@ Machine translation is a key technology in professional **translation and locali

### Integrations

Translation software, like [translation management systems](/integrations) (TMS) and computer-aided translation tools (CAT), integrate [machine translation APIs](/apis), directly or via plugins.
Translation software, like [translation management systems](/nav_bar/integrations) (TMS) and computer-aided translation tools (CAT), integrate [machine translation APIs](/nav_bar/apis), directly or via plugins.

### Workflow

The translation software fills in the machine translation for the human translator to [post-edit](/workflows/post-editing.md).
The machine translation can be inserted in whole files at once or one [segment](/concepts/segment.md) at a time.
The translation software fills in the machine translation for the human translator to [post-edit](/nav_bar/building-and-research/applications/workflows/post-editing.md).
The machine translation can be inserted in whole files at once or one [segment](/nav_bar/concepts/segment.md) at a time.
Some systems that translate segment-by-segment can learn from post-edits and adapt the machine translation output accordingly.

### Productivity
@@ -41,7 +41,7 @@ A software localisation process consists of the following steps:
2. Identifying the features that need to be replaced or adapted.
3. Translating user interface and user assistance content.
4. Replacing or adapting the features that can’t be used in the target culture.
5. Creating versions of the software in the target [locale](/applications/advanced-concepts/locale.md) that are target culture specific.
5. Creating versions of the software in the target [locale](/nav_bar/building-and-research/applications/advanced-concepts/locale.md) that are target culture specific.
6. Testing the localised versions:
- Verifying the validity of the translation in the context of the software.
- Checking if the new versions work well for the target audience in each language:
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@ User-generated content is challenging for machine translation because it is open

## Content types

* [Posts and comments](social-networks.md)
* [Live chat](live-chat.md)
* [Product titles, descriptions and reviews](commerce-and-marketplaces.md)
* [In-game content](gaming.md)
* [Posts and comments](/nav_bar/building-and-research/applications/social-networks.md)
* [Live chat](/nav_bar/building-and-research/applications/live-chat.md)
* [Product titles, descriptions and reviews](/nav_bar/building-and-research/applications/commerce-and-marketplaces.md)
* [In-game content](/nav_bar/building-and-research/applications/gaming.md)
File renamed without changes
Original file line number Diff line number Diff line change
@@ -8,16 +8,16 @@ description: Workflow with both human translations and pure machine translations

In a **hybrid translation** workflow, some raw machine translations are never seen or edited by a human translator.

Hybrid translation can be faster and cheaper than full human [post-editing](post-editing.md).
Hybrid translation can be faster and cheaper than full human [post-editing](/nav_bar/building-and-research/applications/workflows/post-editing.md).

Hybrid translation requires good machine translation.
A signification portion of the machine translated segments should be usable as-is.

> ### Workflow diagram
> The hybrid translation workflow was first presented by Microsoft, VMWare and [Unbabel](/industry/companies.md#unbabel).
> The hybrid translation workflow was first presented by Microsoft, VMWare and [Unbabel](/nav_bar/more/industry/companies.md#unbabel).
>
> ##### Slide from a [ModelFront](/industry/companies.md#modelfront) presentation
> <img title='Hybrid translation workflow' src='/workflows/hybrid-translation-workflow.png' width='700' style='padding: 1em;' />
> ##### Slide from a [ModelFront](/nav_bar/more/industry/companies.md#modelfront) presentation
> <img title='Hybrid translation workflow' src='/nav_bar/building-and-research/applications/workflows/hybrid-translation-workflow.png' width='700' style='padding: 1em;' />
A risk **threshold** is set.
Each new machine translation is automatically classified as high-quality or low-quality.
@@ -31,17 +31,17 @@ They are marked as translated or approved, and potentially even locked.

### Technology

The key technology for a hybrid translation workflow is translation [**quality prediction**](/quality/quality-estimation.md), which is known as *machine translation quality estimation* in the [research](/building-and-research/building-and-research.md) world.
The key technology for a hybrid translation workflow is translation [**quality prediction**](/nav_bar/building-and-research/quality/quality-estimation.md), which is known as *machine translation quality estimation* in the [research](/nav_bar/building-and-research/building-and-research.md) world.

### Adoption

At first, companies like Microsoft, [Unbabel](/industry/companies.md/#unbabel), VMWare and Wayfair implemented hybrid translation by researching and developing their own machine translation quality estimation.
At first, companies like Microsoft, [Unbabel](/nav_bar/more/industry/companies.md/#unbabel), VMWare and Wayfair implemented hybrid translation by researching and developing their own machine translation quality estimation.

With the launch of the [ModelFront](/industry/companies.md/#modelfront) translation quality prediction API, more companies started to adopt the hybrid translation workflow within commercially available translation management systems.
With the launch of the [ModelFront](/nav_bar/more/industry/companies.md/#modelfront) translation quality prediction API, more companies started to adopt the hybrid translation workflow within commercially available translation management systems.


---

### See also

- [**Quality estimation**](/quality/quality-estimation.md)
- [**Quality estimation**](/nav_bar/building-and-research/quality/quality-estimation.md)
Original file line number Diff line number Diff line change
@@ -9,7 +9,7 @@ description: Workflow for human translation with sentence completion
Unlike traditional predictive text, interactive translation prediction uses the machine translation model to provide better completions of the sentence.

> ##### User interface from [Casmacat](https://www.casmacat.eu/)
> <img title='Casmacat interactive machine translation user interface' src='/workflows/casmacat_interactive_machine_translation.gif' width='700' style='padding: 1em;' />
> <img title='Casmacat interactive machine translation user interface' src='/nav_bar/building-and-research/applications/workflows/casmacat_interactive_machine_translation.gif' width='700' style='padding: 1em;' />

## Challenges
@@ -19,4 +19,4 @@ Unlike traditional predictive text, interactive translation prediction uses the

## See also

- [A user study of neural interactive translation prediction](https://link.springer.com/article/10.1007/s10590-019-09235-8) (2019) found that interactive translation prediction was more efficient than [post-editing](post-editing.md) with neural machine translation and also highlights user experience challenges.
- [A user study of neural interactive translation prediction](https://link.springer.com/article/10.1007/s10590-019-09235-8) (2019) found that interactive translation prediction was more efficient than [post-editing](/nav_bar/building-and-research/applications/workflows/post-editing.md) with neural machine translation and also highlights user experience challenges.
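
Several hunks in this commit replace relative link targets such as `seo.md` or `post-editing.md` with root-absolute paths under `/nav_bar/`. The commit does not say how those edits were made. The sketch below is one possible way to automate that step, assuming a hypothetical helper `absolutize_links` that receives the file's directory relative to the repository root. It leaves links that are already root-absolute (for example `/workflows/post-editing.md`) untouched, since those would need a separate old-to-new mapping.

```python
import posixpath
import re

# Matches the target of an inline Markdown link or image: [text](target)
LINK_RE = re.compile(r'(\]\()([^)\s]+)(\))')

# Targets to skip: URLs with a scheme, root-absolute paths, in-page anchors.
SKIP_RE = re.compile(r'^([a-z][a-z0-9+.-]*:|/|#)', re.IGNORECASE)


def absolutize_links(markdown: str, file_dir: str) -> str:
    """Rewrite relative link targets as root-absolute paths.

    file_dir is the directory of the Markdown file relative to the repo
    root, e.g. 'nav_bar/building-and-research/applications'.
    """
    def fix(match: re.Match) -> str:
        target = match.group(2)
        if SKIP_RE.match(target):
            return match.group(0)
        resolved = posixpath.normpath(posixpath.join(file_dir, target))
        return f'{match.group(1)}/{resolved}{match.group(3)}'

    return LINK_RE.sub(fix, markdown)


line = '* [SEO](seo.md) and site search'
print(absolutize_links(line, 'nav_bar/building-and-research/applications'))
```

For the sample line, the rewritten target matches the new path shown in the commerce-and-marketplaces hunk of this commit.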
File renamed without changes.
File renamed without changes.