Add Belarusian language support #675

Open
timfunky opened this issue Dec 2, 2024 · 1 comment
Labels
languages Dictionary or language related issues

Comments


timfunky commented Dec 2, 2024

Hello @sspanak or should I say zdravey. First of all, thank you a bunch for your immense contribution to the community of dumb phones and open source in general! Blagodarq!

I would like to contribute by adding Belarusian language support. Creating a layout is not a big deal (the letters і, ў, ґ should be supported, right?), but I wonder how to create the .csv dictionary. I thought of parsing books in Belarusian, or maybe dictionaries (the problem there is that conjugated forms are missing), then putting the parsed words in the CSV file, skipping duplicates and one-letter words, and finally generating a frequency value for each word (I don't really understand that part; how can I calculate it?).
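For illustration, the approach described above (parse text, skip duplicates and one-letter words, count how often each word occurs) could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the names `count_words` and `to_csv`, the tokenizing regular expression, and the `word,count` output layout are all hypothetical, and none of it is the tooling or file format TT9 actually uses.

```python
import csv
import io
import re
from collections import Counter

# Assumption: a "word" is a run of lowercase Cyrillic letters or apostrophes.
# The range а-я is broader than the Belarusian alphabet; it is only a tokenizer.
WORD_RE = re.compile(r"[а-яёіўґ']+")


def count_words(text, min_length=2):
    """Count word occurrences, ignoring case and words shorter than min_length."""
    words = (w.strip("'") for w in WORD_RE.findall(text.lower()))
    return Counter(w for w in words if len(w) >= min_length)


def to_csv(counts):
    """Serialize counts as 'word,count' rows, most frequent first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for word, freq in counts.most_common():
        writer.writerow([word, freq])
    return buffer.getvalue()


sample = "Мова і мова, мова! Кніга і сям'я."
counts = count_words(sample)
print(to_csv(counts))
```

Here the "frequency" is simply the raw number of occurrences in the parsed text, so it is only as representative as the corpus it was counted from.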

I am probably reinventing the wheel, so I would like to ask what tools, databases, or other means you or others use to make dictionaries for languages.

Thank you for reading and have a nice day.

Owner

sspanak commented Dec 2, 2024

Вітаю, тімфанкі! Character support is not a problem at all. Also, don't worry about the word frequencies; I'll extract them from the general Android dictionary and inject them into the word list. I'll also take care of the rest of the technical details. I have all the necessary tools in the repo.

If you would like to help, please try finding a good Belarusian word list. Extracting from books and subtitles often produces words with spelling mistakes or words with numbers, punctuation or special characters. And, in the case of Slavic languages, a lot of verb conjugations and noun cases will be missing.
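As a rough illustration of the problem with extracted words, a candidate list could be pre-filtered so that only entries made entirely of letters from the modern Belarusian alphabet (plus the apostrophe) survive. This is a sketch under assumptions: `is_clean_word` and `clean_word_list` are hypothetical names, not tools from the repo, and whether "ґ" should be allowed is left as a switch. A filter like this cannot catch spelling mistakes or supply missing conjugations and noun cases, which is why a verified dictionary is still the better source.

```python
# The 32 letters of the modern Belarusian alphabet.
BELARUSIAN_LETTERS = set("абвгдеёжзійклмнопрстуўфхцчшыьэюя")


def is_clean_word(word, allow_ghe_with_upturn=True):
    """True if the word has only Belarusian letters and inner apostrophes."""
    allowed = BELARUSIAN_LETTERS | {"'"}
    if allow_ghe_with_upturn:
        allowed = allowed | {"ґ"}  # assumption: optional letter, off by request
    lowered = word.lower()
    if not lowered or lowered[0] == "'" or lowered[-1] == "'":
        return False
    return all(ch in allowed for ch in lowered)


def clean_word_list(words):
    """Keep clean words in their original order, dropping exact duplicates."""
    seen = set()
    result = []
    for raw in words:
        word = raw.strip()
        if is_clean_word(word) and word not in seen:
            seen.add(word)
            result.append(word)
    return result


candidates = ["мова", "мова", "кніга2", "щука", "hello", "сям'я", " ўзор ", "так!"]
print(clean_word_list(candidates))
```

In this sample, entries with digits, punctuation, Latin letters, or letters outside the Belarusian alphabet (such as "щ") are dropped.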

Instead, I recommend looking for a dictionary from a respected academy or university. In many countries, they have "the big dictionary of X". Such dictionaries are great: they are spell-checked and verified by linguists, which results in very high-quality suggestions when you type. The problem is that I do not speak Belarusian, and it is very difficult for me to search Belarusian websites. Your help as a native speaker, on the other hand, will be invaluable. This is all I am asking.

Also, I would like to clarify the letter order for ABC mode. Should the 3-key be: дежзё or деёжз? Also, where do "ш", "ы", "ь" and "э" go? In Bulgarian, we have 8: "шщъь", 9: "юя", but I can see that the positions of all letters from "ч" onward vary across Slavic languages. There is even inconsistency between some keyboards and devices for the same language. I really hope you know what people are used to and what will feel just right.

And after we are done, I hope you will enjoy using TT9 even more!

Btw, you've found the most incorrect way of spelling "Blagodarq". 😆 It should be either "Blagodarya" or "Благодаря".

sspanak added the "languages" (Dictionary or language related issues) label Dec 2, 2024