Release 0.6.4

@evfro evfro released this 02 May 03:46
· 95 commits to master since this release

This release introduces a massive update to the framework, with a new internal design and additional functionality. With this release, the long-broken support for Python 2 is dropped; all future releases will target Python 3 only, starting from version 3.6.

New models and additional functionality

  • New Kernelized Probabilistic MF model.
  • Built-in support for a scaled version of PureSVD (see the "Reproducing EIGENREC results" tutorial for details).
  • Simple hybrid model that aggregates feature-similarity scores.
  • Baseline models for the item cold-start regime: popularity-based, random, similarity-aggregation model, PureSVD.
  • New classes to support item post-filtering.
  • Unified handling of side feature-based relations.
  • Support for several learning-rate schedules in SGD: adagrad, adam, rmsprop, plus three of my own heuristic schedules: adanorm, gnprop and gnpropz.
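
For readers unfamiliar with adaptive learning-rate schedules, the following is a minimal sketch of the generic AdaGrad update on a toy one-dimensional problem. It is not Polara's SGD code: the function name `adagrad_minimize` and all of its parameters are hypothetical, chosen only for illustration, and the heuristic schedules (adanorm, gnprop, gnpropz) are not covered here.

```python
import math


def adagrad_minimize(grad, x0, learning_rate=1.0, n_steps=200, eps=1e-8):
    """Minimize a 1-D function with AdaGrad, given its gradient (illustrative only)."""
    x = x0
    grad_sq_sum = 0.0  # running sum of squared gradients
    for _ in range(n_steps):
        g = grad(x)
        grad_sq_sum += g * g
        # The effective step size shrinks as squared gradients accumulate.
        x -= learning_rate * g / (math.sqrt(grad_sq_sum) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_min = adagrad_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

On this toy objective the iterate approaches the minimizer at x = 3. Schedules such as rmsprop and adam follow the same pattern but replace the plain running sum with exponentially decaying averages.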

Hyper-parameter tuning

  • Generic find_optimal_config function to perform random grid search over a user-defined hyper-parameter space.
  • New find_optimal_svd_rank routine to quickly and efficiently tune SVD.
  • New find_optimal_tucker_ranks routine to quickly and efficiently tune tensor-based models.
  • Users can now define which configurations to skip during random grid search.
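
The idea behind random grid search with skipped configurations can be sketched in plain Python as below. This is a standalone illustration under stated assumptions, not the signature or implementation of find_optimal_config: the names `random_grid_search`, `param_space`, `evaluate` and `skip` are hypothetical.

```python
import itertools
import random


def random_grid_search(param_space, evaluate, n_samples, skip=None, seed=0):
    """Score a random subset of the parameter grid and return the best config.

    All names here are hypothetical; this is not Polara's API.
    """
    names = list(param_space)
    # Full Cartesian product of all candidate values.
    grid = [dict(zip(names, values))
            for values in itertools.product(*(param_space[n] for n in names))]
    # Drop the configurations the user asked to skip.
    if skip is not None:
        grid = [config for config in grid if not skip(config)]
    if not grid:
        raise ValueError("no configurations left to evaluate")
    # Sample without replacement; never request more points than exist.
    rng = random.Random(seed)
    candidates = rng.sample(grid, min(n_samples, len(grid)))
    scored = [(evaluate(config), config) for config in candidates]
    best_score, best_config = max(scored, key=lambda pair: pair[0])
    return best_config, best_score


# Toy usage: the score peaks at rank=20 with the smallest regularization.
space = {"rank": [10, 20, 40], "regularization": [0.01, 0.1, 1.0]}
best_config, best_score = random_grid_search(
    space,
    evaluate=lambda c: -abs(c["rank"] - 20) - c["regularization"],
    n_samples=9,
    skip=lambda c: c["rank"] == 40,  # exclude every rank-40 configuration
)
```

In a real experiment, `evaluate` would build a model with the given configuration and return the target metric on a holdout set. Filtering before sampling guarantees that skipped configurations never consume part of the evaluation budget.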

Evaluation

  • New versatile run_cv_experiment routine to automate cross-validation experiments. Supports both the default and the user-defined evaluation protocols.
  • More ways to evaluate against the specific set of metrics supported by Polara.

Performance improvements

  • Efficient handling of indices in the LightFM model (reduces memory load by orders of magnitude compared to the native LightFM implementation).
  • Rating prediction with tensor-based models is now more efficient.
  • Computation of Tucker core in tensor-based models is now optional.

Other improvements

  • Revived support for Turi Create (formerly GraphLab Create) and its factorization models, including Factorization Machines.
  • Refactored evaluation code.
  • Refactored and improved code for SGD-based matrix factorization. Now supports both naive and probabilistic implementations.
  • Improved handling of sparse operations.
  • Better handling of side features.
  • Improved timing functionality.
  • Internal naming is now more consistent.
  • Support for the Amazon and Epinions datasets.
  • Allow unpacking the probe part of the Netflix dataset.
  • Some other minor improvements and fixes.