Wrapper to produce the XML editions of OBP books.
docker run --rm \
-v /path/to/local.epub:/ebook_automation/epub_file.epub \
-v /path/to/local.xml:/ebook_automation/epub_file.xml \
-v /path/to/output:/ebook_automation/output \
openbookpublishers/obp-gen-xml
Alternatively, you may clone the repo, build the image using docker build . -t some/tag, and run the command above replacing openbookpublishers/obp-gen-xml with some/tag.
This wrapper requires saxonb-xslt and python3-bs4 to be installed on your system. On Debian (or Debian-based distributions) these packages can be installed via
apt-get install libsaxonb-java python3-bs4
To perform the setup, run:
bash setup
The setup script contains the necessary instructions to initialise the submodule.
To start the conversion, place the epub file and the DOI deposit in the obp-gen-xml folder. Finally, run:
bash run prefix
where prefix is the name shared by the book and the DOI deposit files; e.g.: bash run Siklos-Advanced_Problems2.
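Before starting a conversion it can help to confirm that both input files are in place. The following is a minimal sketch, not part of the wrapper: the helper name missing_inputs is hypothetical, and it assumes the book is named prefix.epub and the DOI deposit prefix.xml, mirroring the file names mounted in the docker example.

```python
from pathlib import Path


def missing_inputs(prefix, folder="."):
    """Return the expected input files for `prefix` that are not in `folder`."""
    # Assumption: the book is <prefix>.epub and the DOI deposit is <prefix>.xml,
    # as suggested by the file names used in the docker example.
    expected = [Path(folder) / f"{prefix}{ext}" for ext in (".epub", ".xml")]
    return [str(path) for path in expected if not path.is_file()]
```

An empty list means both files were found; otherwise the list names whatever still has to be copied into the folder.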
bash clean [-y]
removes temporary files (untracked files and folders) from the obp-gen-xml folder. The script asks for the user's confirmation before removing the files, but if you are running this as part of a script you might want to use the -y flag to bypass the confirmation.
This suite of scripts works as expected, but the introduction of tailor_book_transform.py is to be regarded as a temporary patch.
The stylesheet Transform-to-XML-book.xsl
merges together XML files of the book sections. This is performed by tentative includes of possible filenames hardcoded in the spreadsheet. All this works smoothly with the XML parser embedded in Oxygen, but the (apparently less tolerant) XML parser that saxonb-xslt
uses fails at the first occurrence of a missing file.
tailor_book_transform.py creates a temporary and simplified version of Transform-to-XML-book.xsl which lists only the successful includes.
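The idea of keeping only the successful includes can be sketched as follows. This is an illustration rather than the actual code of tailor_book_transform.py: the function name prune_missing_includes is hypothetical, and it assumes, purely for the example, that each candidate file is referenced by an xsl:include element with an href attribute directly under the stylesheet root. The real stylesheet may reference its files differently.

```python
import os
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def prune_missing_includes(stylesheet_path, output_path, base_dir="."):
    """Write a copy of the stylesheet without includes of missing files.

    Returns the list of href values that were dropped.
    """
    # Keep the conventional "xsl" prefix when the tree is serialised.
    ET.register_namespace("xsl", XSL_NS)
    tree = ET.parse(stylesheet_path)
    root = tree.getroot()
    removed = []
    # Assumption: includes are top-level xsl:include elements with an href.
    # findall() returns a list, so removing elements while looping is safe.
    for include in root.findall(f"{{{XSL_NS}}}include"):
        href = include.get("href", "")
        if not os.path.exists(os.path.join(base_dir, href)):
            root.remove(include)
            removed.append(href)
    tree.write(output_path, encoding="utf-8", xml_declaration=True)
    return removed
```

The original stylesheet is left untouched; only the temporary copy written to output_path is simplified.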
There are a number of possible solutions, including (a) forcing saxonb-xslt to use a different XML parser and (b) rewriting Transform-to-XML-book.xsl to make its list of includes more precise.
The suite of XSL files stored in XML-last fails if the Crossref schema version declared in the DOI deposit does not match the one hardcoded in the stylesheets.
Since the version of our DOI deposit has changed over time, we need a resilient system able to process all the deposits. The small collection of scripts stored in ./src serves this purpose: ./src/extract_schema_version.py reads the schema version declared in the DOI deposit; ./src/tailor_book_transformation.py and ./src/tailor_section_transformation.py produce compatible variations of the stylesheets. Please note that ./src/tailor_book_transformation.py has extra instructions, described in the DEV section of this readme file.
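The two steps described above (reading the version, then tailoring a stylesheet) could look roughly like the sketch below. It is not the code of the scripts in ./src: the function bodies are illustrative, tailor_stylesheet is a hypothetical name, and the sketch assumes the deposit's root element carries a version attribute or a namespace of the form http://www.crossref.org/schema/<version>, and that the stylesheets hardcode the version inside that same namespace URI.

```python
import re
import xml.etree.ElementTree as ET

SCHEMA_BASE = "http://www.crossref.org/schema/"


def extract_schema_version(deposit_path):
    """Return the schema version declared in the deposit, or None."""
    root = ET.parse(deposit_path).getroot()
    # Preferred source: the version attribute on the root element.
    version = root.get("version")
    if version:
        return version
    # Fallback (assumption): the version embedded in the root namespace.
    match = re.match(r"\{" + re.escape(SCHEMA_BASE) + r"([\d.]+)\}", root.tag)
    return match.group(1) if match else None


def tailor_stylesheet(stylesheet_text, hardcoded_version, deposit_version):
    """Return the stylesheet text with the schema namespace version swapped."""
    # Only the namespace URI is rewritten, so unrelated occurrences of the
    # version number (e.g. the XSLT version) are left alone.
    return stylesheet_text.replace(
        SCHEMA_BASE + hardcoded_version,
        SCHEMA_BASE + deposit_version,
    )
```

Replacing the full namespace URI rather than the bare version string is a deliberate choice here, since a plain search for the number could also hit unrelated attributes.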
Use pre-commit.sh as a pre-commit Git hook to build a test image that will run flake8 to enforce PEP8 style.
ln -sf ../../pre-commit.sh .git/hooks/pre-commit