This changelog was inspired by the keep-a-changelog project and follows semantic versioning.
- (#cf35c3) fixes minimum python version to be `python>=3.9`
- (#61730d, #224995, #331fc0) adds support for macOS MPS devices and updates outdated `numpy`/`sklearn` code - thanks to @d-jiao
- (#c48016, #2fe517, #c965b1, #5578ca, #5b0d85) adds security guidelines and request templates
- (#331fc0) updates actions pipeline, supported python versions and internal dependencies to the latest available versions, including `torch` and `gensim`, among others. Support for `python<=3.8` was dropped as a result. Numerous security vulnerabilities were fixed
- (#3f27ee) adds `transform` method
- (#f98f3f) adds example jupyter notebook
- (#683bec) adds contributing and conduct guidelines
- deactivates debug mode by default
- documents `get_most_similar_words` method
- optimizes original word2vec TXT file input for model training
- updates README.md
- adds support for original word2vec pretrained embeddings files on both formats (BIN/TXT)
- optimizes handling of gensim's word2vec mapping file for better memory usage
- support for python 3.6
- ETM training with partially tested support for original ETM features.
- ETM corpus preprocessing scripts - including word2vec embeddings training - adapted from the original code.
- adds methods to retrieve document-topic and topic-word probability distributions from the trained model.
- adds docstrings for tested API methods.
- adds unit and integration tests for ETM and preprocessing scripts.