This repository contains the configuration for the MT interfaces at
- http://gtweb.uit.no/jorgal/
- http://gtweb.uit.no/tolkimine/
- http://gtweb.uit.no/mt-testing/
- http://gtweb.uit.no/mt/
The translation daemon/server is running under the user “apy” on the server gtweb (gtweb.uit.no), and the html pages also reside under the home directory of that user (hosted by the regular nginx server). The whole configuration is in this git repo, hosted at https://github.com/giellatekno/gtweb-apy-conf
The directory /home/apy is a git repo, with https://github.com/giellatekno/gtweb-apy-conf as the upstream.
Any change to the configuration is first checked in to git and pushed to github, then the configuration is updated at gtweb by running the update script, which might also update submodules and build them.
There are some submodules:
- the translation daemon/server apertium-apy, branch giellatekno
- the static html pages apertium-html-tools, branch giellatekno
- the grammar checker Quill editor web interface divvun-webdemo
- the grammar checker CK-editor web interface ckeditor-divvungc
The apertium-apy daemon runs as a systemd service, and there’s an nginx config that makes gtweb.uit.no:2737 available under gtweb.uit.no/apy. Nginx should serve SSL signed by Let’s Encrypt / certbot. All requests to apertium-apy from the static html pages go to https://gtweb.uit.no/apy, although we still allow accessing the pages through plain http, since https makes website translation break when loading external resources (see #2).
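One way to check that both nginx and the daemon are up is to ask apy for its pair list through the proxied URL. The following is only a sketch: the helper name apy_url and the APY_BASE variable are made up for illustration, and it assumes that apertium-apy's listPairs endpoint is reachable under the /apy prefix described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): build a URL under the
# proxied apy prefix. APY_BASE can be overridden, e.g. to test the
# daemon directly on port 2737.
APY_BASE="${APY_BASE:-https://gtweb.uit.no/apy}"

apy_url() {
    # $1 is an apy endpoint name, e.g. listPairs
    printf '%s/%s\n' "${APY_BASE%/}" "$1"
}

# Usage (needs network access, so it is left commented out here):
# curl -sS "$(apy_url listPairs)" | jq '.responseData | length'
```

If the curl call returns JSON with a non-empty responseData list, requests are getting through nginx to the daemon.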
The following sections define how to do common or uncommon tasks like installing everything from scratch, installing new language pairs, or updating the code for the subrepos.
The page mt-testing runs all language pairs, but jorgal and tolkimine only run certain pairs. See the files jorgal.conf / tolkimine.conf in the directory html-tools-confs; you’ll need to edit the line ALLOWED_PAIRS if you want your new pair to appear on one of those pages. Then check in your change and push to https://github.com/giellatekno/gtweb-apy-conf (it’s also possible to just check in that change on /home/apy without pushing, if you really have to), e.g.
edit html-tools-confs/jorgal.conf
git commit -am "added sme-eus"
git push # unless you're editing in /home/apy because you can't push
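Before committing, it can be useful to confirm that the pair really ended up on the ALLOWED_PAIRS line. This is a minimal sketch with a hypothetical helper, pair_allowed; it assumes the line has the form "ALLOWED_PAIRS = sme-nob, sme-eus" (comma-separated, on one line), which may not match the actual conf files exactly.

```shell
#!/bin/sh
# Hypothetical helper: succeed if pair $2 is listed on the ALLOWED_PAIRS
# line of conf file $1.
# Assumption: the line looks like "ALLOWED_PAIRS = a-b, c-d" on one line.
pair_allowed() {
    grep '^ALLOWED_PAIRS' "$1" \
        | cut -d= -f2- \
        | tr ',' '\n' \
        | tr -d ' \t' \
        | grep -qxF -- "$2"
}

# Usage:
# pair_allowed html-tools-confs/jorgal.conf sme-eus && echo "sme-eus is listed"
```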
To install the new language pair on gtweb, log in as a user with sudo-rights and do:
sudo yum install apertium-from-to
sudo /home/apy/update
https://build.opensuse.org/project/show/home:TinoDidriksen:nightly
A push to apertium-apy or apertium-html-tools isn’t automatically pulled into gtweb just by running /home/apy/update; you also need to record in this repo which commit you want to use.
This repo specifies which commit of each submodule we want to use. If you want to use the newest commit of the giellatekno branch of apertium-html-tools and apertium-apy, you can enter this repo and do:
git submodule foreach git pull origin giellatekno
git commit -am "updated apy/html-tools" # -a stages the updated submodule commits
git push
(if you don’t have push-rights to https://github.com/giellatekno/gtweb-apy-conf, you could just run the first two lines under gtweb:/home/apy).
Then, as a sudo-user on gtweb, do:
sudo /home/apy/update
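After the update, the checked-out submodule commits should match what this repo records. In the output of git submodule status, a leading "+" marks a submodule whose checked-out commit differs from the recorded one. The filter below is a small sketch (submodules_out_of_sync is a hypothetical name) that prints only those paths.

```shell
#!/bin/sh
# Hypothetical helper: read `git submodule status` output on stdin and
# print the paths of submodules whose checked-out commit differs from
# the commit recorded in the containing repo ("+" prefix).
submodules_out_of_sync() {
    awk '/^\+/ { print $2 }'
}

# Usage:
# (cd /home/apy && git submodule status | submodules_out_of_sync)
```

Empty output means every submodule is at the commit this repo asks for.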
In case e.g. gtweb dies and you need to reinstall everything:
Tell SELinux that nginx can proxy to ports 2737 (apy) and 3000 (spellcheck):
sudo semanage port -a -t http_port_t -p tcp 2737
sudo semanage port -a -t http_port_t -p tcp 3000
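To confirm that the ports were registered, the output of semanage port -l can be inspected. A minimal sketch, assuming the usual three-column layout (type, protocol, comma-separated ports); http_port_allowed is a hypothetical helper and it does not handle port ranges such as 5000-5010.

```shell
#!/bin/sh
# Hypothetical helper: read `semanage port -l` output on stdin and
# succeed if TCP port $1 is listed for the type http_port_t.
# Assumption: lines look like "http_port_t  tcp  80, 81, 443".
http_port_allowed() {
    awk -v port="$1" '
        $1 == "http_port_t" && $2 == "tcp" {
            for (i = 3; i <= NF; i++) {
                p = $i
                sub(/,$/, "", p)      # strip the trailing comma
                if (p == port) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage:
# sudo semanage port -l | http_port_allowed 2737 && echo "2737 ok"
```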
Install some dependencies from CentOS/EPEL:
sudo yum install certbot python2-certbot-nginx # for https
sudo yum install git jq # for this repo, build-html-pages
sudo yum install python36-tornado # for running apertium-apy
sudo yum install zip unzip # for divvun-apy-extract-all-zcheck
curl -Ss https://apertium.projectjj.com/rpm/install-nightly.sh | sudo bash
sudo yum update
sudo yum install giella-sme # for running in divvun-gramcheck
sudo yum install 'apertium-sme-*' # and other language pairs for MT
And these for libdivvun:
sudo yum install centos-release-scl # for installing devtoolset's
sudo yum install devtoolset-8-gcc-c++ # for recent C++
sudo yum install libtool automake cmake pugixml-devel libarchive-devel
sudo yum install hfst-ospell-devel
Add a new user apy (group apy) with home dir /home/apy.
sudo adduser apy
Log in as this user apy; we’ll make the home directory a git repo tracking this repo:
cd /home/apy
git init
git remote add origin https://github.com/giellatekno/gtweb-apy-conf.git
git branch --set-upstream-to=origin/master master
git pull
git checkout master
git submodule init
git submodule update --recursive
Still as user apy, compile libdivvun manually since CentOS doesn’t have pugixml-devel in non-extra repos:
git clone https://github.com/divvun/libdivvun/
cd libdivvun
scl enable devtoolset-8 bash
./autogen.sh
./configure --enable-checker --enable-cgspell --enable-xml --disable-python-bindings --prefix=/home/apy/PREFIX/divvun-gramcheck
make -j5
make test # it's ok if the run-lib test fails
make -j5 install
exit
Then log in as a user with sudo-rights, and install configuration files:
sudo /home/apy/install-and-enable-services
That script will also ensure that the yum updater, apertium-apy service and apertium-apy-restarter service are running now and on restarts of gtweb.
Then update and build apertium-apy and the apertium-html-tools pages:
sudo /home/apy/update
(Currently gtweb is using an old svn checkout in ~apy/apertium-apy/corpustools. This section needs updating to use Tino’s package, but apparently that’s only available for newer operating systems.)
apertium-apy uses CorpusTools if available. We need to ensure it’s possible to run /usr/bin/pdftohtml and to do from corpustools import pdfconverter from the apy directory. To install CorpusTools:
sudo yum install python3-corpustools
Check that you have all dependencies by logging in as user apy and
cd apertium-apy
python3 -c 'from corpustools import pdfconverter; print(pdfconverter.convert2intermediate.__doc__)'
This should print “Convert a pdf document to Giella xml format.” etc. apertium-apy will detect if corpustools and pdftohtml are available.
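The two requirements above can be wrapped into one pre-flight check. This is only a sketch: have_pdf_support is a hypothetical helper, not something apertium-apy provides, and the PDFTOHTML and PYTHON variables exist only to make the paths overridable.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if pdftohtml is executable
# and the corpustools pdfconverter module can be imported.
have_pdf_support() {
    [ -x "${PDFTOHTML:-/usr/bin/pdftohtml}" ] || return 1
    "${PYTHON:-python3}" -c 'from corpustools import pdfconverter' 2>/dev/null
}

# Usage (run from the apy directory, as user apy):
# (cd /home/apy/apertium-apy && have_pdf_support && echo "pdf support ok")
```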
All the relevant configuration files for the gtweb machine are under the etc folder of this repo, so we know what configs are relevant in case we need to reinstall everything.
Language pairs are those that are installed with yum install (ExecStart in etc/systemd/system/apy.service gives the path to the modes files), but individual html configurations can specify a subset of pairs to run (see Installing new language pairs).
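To see which modes path the daemon is started with, the ExecStart line can be pulled out of the unit file. A minimal sketch, where exec_start is a hypothetical helper and the unit file is assumed to contain a single-line ExecStart= entry:

```shell
#!/bin/sh
# Hypothetical helper: print the command from the ExecStart= line(s)
# of the systemd unit file given as $1.
exec_start() {
    sed -n 's/^ExecStart=//p' "$1"
}

# Usage:
# exec_start /home/apy/etc/systemd/system/apy.service
```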
We expect a standard nginx httpd running; see configs in etc/nginx/default.d/.
The file etc/systemd/system/apy.service says how to run the apertium-apy MT daemon, which is started on restart of the gtweb machine.