Description

This repository contains a set of scripts to build a ready-to-use Juman++ model for Jumandic.

Prerequisites

Unix environment (on Windows use WSL or MSYS2/MinGW64)
Juman++ build environment
Python 3.6+
Ruby
Perl
SSH authentication configured for GitHub (the build clones several repositories over SSH)

Recommended

32 GB of RAM
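
The command-line tools named in the list above can be checked before starting. The following is a minimal sketch, not part of this repository: missing_tools is a hypothetical helper, and make, git, and ssh are assumed from the build and clone steps rather than listed explicitly.

```shell
# Hypothetical helper: print each command from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# python3, ruby and perl come from the list above; make, git and ssh
# are assumptions based on the build and clone steps.
missing_tools python3 ruby perl make git ssh
```

An empty output means every listed tool was found. The check does not verify the Python version or the Juman++ build environment.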

How to Use

Run the configuration script: python3 configure.py. It will prompt for the location of Mainichi Shinbun texts.

After that, run make nornn to train a model without the RNN component, or make rnn to train a model with the RNN component. The resulting models are placed in the bld/model folder.
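
The two steps above can be wrapped in one shell function. This is a sketch under the assumption that it is run from the repository root; build_model is a hypothetical name, not something the repository provides.

```shell
# Hypothetical wrapper around the configure and build steps.
# Usage: build_model nornn   (or: build_model rnn)
build_model() {
    variant="$1"
    case "$variant" in
        nornn|rnn) ;;
        *) echo "usage: build_model nornn|rnn" >&2; return 2 ;;
    esac
    # Prompts for the location of the Mainichi Shinbun texts.
    python3 configure.py || return 1
    # Trains the selected model variant.
    make "$variant" || return 1
    # The built models are expected in this folder.
    ls bld/model
}
```

Calling build_model nornn runs the configuration and then the build without the RNN component; any other argument than nornn or rnn is rejected before anything is executed.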

Adding your words to the model

You can add your own words to the model as follows:

Perform the configuration as described above: python3 configure.py
Fetch the repositories: make repo
Go into the bld/repos/jumandic folder, which is a local clone of the JumanDIC repository.
Create a new file with the .dic extension in the userdic folder inside bld/repos/jumandic.
Put your words into that file in JUMAN dictionary format (see the other files there for examples).
Execute make clean-dic if you have already built a Juman++ model.
Build your model as shown above.
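
The copy step above can be scripted. The sketch below assumes the dictionary file has already been written in JUMAN format; install_user_dic and the JUMANDIC_DIR variable are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper: copy a user dictionary into the local JumanDIC clone.
# JUMANDIC_DIR defaults to the clone location named in the steps above.
install_user_dic() {
    src="$1"
    dest_dir="${JUMANDIC_DIR:-bld/repos/jumandic}/userdic"
    case "$src" in
        *.dic) ;;
        *) echo "error: $src must have the .dic extension" >&2; return 2 ;;
    esac
    if [ ! -d "$dest_dir" ]; then
        echo "error: $dest_dir not found; run 'make repo' first" >&2
        return 1
    fi
    cp "$src" "$dest_dir/"
}

# Typical sequence, run from the repository root:
#   python3 configure.py && make repo
#   install_user_dic mywords.dic
#   make clean-dic && make nornn
```

Here mywords.dic is a placeholder file name. The helper only copies the file; cleaning the old dictionary and rebuilding the model remain separate make invocations.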

If the built model does not contain your words, make sure the binary dictionary was rebuilt after you added them: run make clean-dic and build the model again.
