Version 0.1.10: Python 3.5+ and other features
This release includes a number of small fixes since 0.1.9 and two more significant changes.
Unidic 26 Field Format Support
Unidic has a surprising variety of formats, and the 26-field variety wasn't previously supported. This format includes kana accent information and is notably used in the binary distribution of Unidic 2.1.2.
Support for Python 3.5, 3.6
Support for these versions was initially removed due to their short remaining lifespan and the lack of a defaults option in the namedtuple constructor. @tamuhey made the necessary changes to get them working, so they're supported for now; thanks!
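For context, the `defaults` argument to `namedtuple` only arrived in Python 3.7. A common workaround on older versions is to set the defaults on the generated `__new__` directly. The sketch below shows that general technique; it is not necessarily the exact change made in this release, and the `namedtuple_with_defaults` helper and the `Feature` fields are hypothetical names used only for illustration.

```python
from collections import namedtuple


def namedtuple_with_defaults(name, fields, default=None):
    """Build a namedtuple whose fields all default to `default`.

    Works on Python 3.5+, where namedtuple has no `defaults` argument.
    """
    cls = namedtuple(name, fields)
    # __defaults__ applies to the rightmost parameters of __new__, so
    # supplying one value per field makes every field optional.
    cls.__new__.__defaults__ = (default,) * len(cls._fields)
    return cls


# Hypothetical three-field feature type, for illustration only.
Feature = namedtuple_with_defaults("Feature", ["pos", "lemma", "pron"])

# Only the first field is given; the rest fall back to None.
partial = Feature("noun")
```

This keeps a single code path across Python versions, at the cost of touching an attribute of the generated class rather than using the constructor argument.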
Other Changes
- dummy mecabrc specification for bundled Unidic support (still a work in progress)
- test fixes and documentation
- handle comma-separated values inside fields
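To illustrate the last item: when a field itself contains commas, a plain `split(",")` breaks it apart. One way to handle this in Python is the standard `csv` module, assuming such fields are wrapped in double quotes. This is only a sketch under that assumption, not necessarily how this release handles it; the `split_features` helper and the sample string are made up.

```python
import csv


def split_features(feature_string):
    """Split a CSV feature string, keeping quoted commas inside a field."""
    # csv.reader expects an iterable of lines, so pass exactly one line.
    return next(csv.reader([feature_string]))


# Made-up feature string; the second field contains a comma.
raw = 'noun,"a,b",c'

naive = raw.split(",")        # splits the quoted field in two
fields = split_features(raw)  # keeps "a,b" as a single field
```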
Upcoming Changes
I'm working on creating a bundled version of Unidic. Modern versions of Unidic are too large to distribute via PyPI, so I'm figuring out the best way to distribute the data.