Converting Multilingual Document Representations to English #1990
Hi @MaartenGr, hi all, I need to obtain topic representations for a set of multilingual documents, and I am already using multilingual embeddings for this. However, because the documents are written in several languages, the resulting representations also come out in different languages. Is there an approach that converts all of the final representations to English? I have selected the following representation method:
Replies: 1 comment 1 reply
I would advise using a model that supports multilingual documents a bit better. I'm not sure, but as far as I remember Flan-T5 was not trained on multilingual data. Instead, if you use an LLM like Llama 3, you can ask it in the prompt to generate only English text.
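To make the prompt suggestion concrete, here is a minimal sketch of what an English-only prompt could look like. It assumes the `[KEYWORDS]` and `[DOCUMENTS]` placeholders that BERTopic's prompt-based representation models substitute; the prompt wording, the `fill_prompt` helper, and the sample keywords and documents are hypothetical and only there to preview the text an LLM would receive.

```python
# Prompt template with an explicit English-only instruction.
# [KEYWORDS] and [DOCUMENTS] are placeholders that get substituted per topic;
# the wording itself is an illustrative assumption, not a recommended default.
ENGLISH_ONLY_PROMPT = """I have a topic described by the following keywords: [KEYWORDS]
The topic contains these documents, which may be written in different languages:
[DOCUMENTS]

Based on the information above, give a short topic label.
Respond in English only, even when the keywords or documents are not in English."""


def fill_prompt(template, keywords, documents):
    """Hypothetical helper: substitute the placeholders to preview the prompt."""
    keyword_text = ", ".join(keywords)
    document_text = "\n".join(f"- {doc}" for doc in documents)
    return (
        template
        .replace("[KEYWORDS]", keyword_text)
        .replace("[DOCUMENTS]", document_text)
    )


# Made-up multilingual topic, used only to show the filled-in prompt.
preview = fill_prompt(
    ENGLISH_ONLY_PROMPT,
    keywords=["voiture", "Auto", "coche"],
    documents=["La voiture est rapide.", "Das Auto ist neu."],
)
print(preview)
```

In BERTopic, a template like this would typically be passed as the `prompt` argument of a prompt-based representation model (for example `bertopic.representation.TextGeneration` wrapping a `transformers` pipeline for the chosen LLM); that wiring is left out here because it depends on the model you pick.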