Merge branch 'release/v.0.6.4'
evfro committed May 2, 2019
2 parents b10ec32 + 989e1b8 commit ae90f14
Showing 38 changed files with 1,968 additions and 875 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
 *.pyc
 polara.egg-info/
 examples/.ipynb_checkpoints/
+.ipynb_checkpoints/
42 changes: 20 additions & 22 deletions README.md
@@ -2,16 +2,15 @@
Polara is the first recommendation framework that allows a deeper analysis of recommender systems' performance, based on the idea of feedback polarity (by analogy with sentiment polarity in NLP).

In addition to the standard question of "how good a recommender system is at recommending relevant items", it allows assessing the ability of a recommender system to **avoid irrelevant recommendations** (and thus be less likely to disappoint a user). You can read more about this idea in the research paper [Fifty Shades of Ratings: How to Benefit from a Negative Feedback in Top-N Recommendations Tasks](http://arxiv.org/abs/1607.04228). The research results can be easily reproduced with this framework; visit the "fixed state" version of the code at https://github.com/Evfro/fifty-shades (it also contains many usage examples).
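The idea can be sketched in plain Python (the 3-star threshold and the sample data below are illustrative assumptions, not polara's defaults): feedback above a rating threshold forms the relevant set, the rest forms the irrelevant one, and a top-n list is scored both for hitting the former and for avoiding the latter.

```python
def split_by_polarity(ratings, threshold=3):
    """Split (item, rating) feedback into positive and negative item sets."""
    positive = {item for item, rating in ratings if rating > threshold}
    negative = {item for item, rating in ratings if rating <= threshold}
    return positive, negative

# hypothetical single-user feedback
user_feedback = [('A', 5), ('B', 2), ('C', 4), ('D', 1)]
pos, neg = split_by_polarity(user_feedback)

# a top-n list is then judged on both sides of the polarity split
recommended = ['A', 'D', 'E']
relevant_hits = set(recommended) & pos    # relevant items that were recommended
irrelevant_hits = set(recommended) & neg  # irrelevant items that should have been avoided
```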

-The framework also features efficient tensor-based implementation of an algorithm, proposed in the paper, that takes full advantage of the polarity-based formulation. Currently, there is an [online demo](http://coremodel.azurewebsites.net) (for test purposes only), that demonstrates the effect of taking into account feedback polarity.
+The framework also features an efficient tensor-based implementation of the algorithm, proposed in the paper, that takes full advantage of the polarity-based formulation.


## Prerequisites
The current version of Polara supports both Python 2 and Python 3 environments. Future versions are likely to drop Python 2 support to make better use of Python 3 features.

-The framework heavily depends on `Pandas, Numpy, Scipy` and `Numba` packages. Better performance can be achieved with `mkl` (optional). It's also recommended to use `jupyter notebook` for experimentation. Visualization of results can be done with help of `matplotlib` and optionally `seaborn`. The easiest way to get all those at once is to use the latest [Anaconda distribution](https://www.continuum.io/downloads).
+The framework heavily depends on the `Pandas`, `Numpy`, `Scipy`, and `Numba` packages. Better performance can be achieved with `mkl` (optional). It is also recommended to use `jupyter notebook` for experimentation. Results can be visualized with the help of `matplotlib`. The easiest way to get all of these at once is to install the latest [Anaconda distribution](https://www.continuum.io/downloads).

-If you use a separate `conda` environment for testing, the following command can be used to ensure that all required dependencies are in place (see [this](http://conda.pydata.org/docs/commands/conda-install.html) for more info):
+If you use a separate `conda` environment for testing, the following command can be issued to ensure that all required dependencies are in place (see [the conda docs](http://conda.pydata.org/docs/commands/conda-install.html) for more info):

`conda install --file conda_req.txt`

@@ -98,30 +97,26 @@ random = RandomModel(data_model)
models = [i2i, svd, popular, random]

metrics = ['ranking', 'relevance'] # metrics for evaluation: NDCG, Precision, Recall, etc.
-folds = [1, 2, 3, 4, 5] # use all 5 folds for cross-validation
+folds = [1, 2, 3, 4, 5] # use all 5 folds for cross-validation (default)
topk_values = [1, 5, 10, 20, 50] # values of k to experiment with

-# run experiment
-topk_result = {}
-for fold in folds:
-    data_model.test_fold = fold
-    topk_result[fold] = ee.topk_test(models, topk_list=topk_values, metrics=metrics)
-
-# rearrange results into a more friendly representation
-# this is just a dictionary of Pandas Dataframes
-result = ee.consolidate_folds(topk_result, folds, metrics)
-result.keys() # outputs ['ranking', 'relevance']
+# run 5-fold CV experiment
+result = ee.run_cv_experiment(models, folds, metrics,
+                              fold_experiment=ee.topk_test,
+                              topk_list=topk_values)

 # calculate average values across all folds for e.g. relevance metrics
-result['relevance'].mean(axis=0).unstack() # use .std instead of .mean for standard deviation
+scores = result.mean(axis=0, level=['top-n', 'model']) # use .std instead of .mean for standard deviation
+scores.xs('recall', level='metric', axis=1).unstack('model')
```
which results in something like:

-| metric/model |item-to-item | SVD | mostpopular | random |
+| **model** | **MP** | **PureSVD** | **RND** | **item-to-item** |
 | ---: |:---:|:---:|:---:|:---:|
-| *precision* | 0.348212 | 0.600066 | 0.411126 | 0.016159 |
-| *recall* | 0.147969 | 0.304338 | 0.182472 | 0.005486 |
-| *miss_rate* | 0.852031 | 0.695662 | 0.817528 | 0.994514 |
+| **top-n** |
+| **1** | 0.017828 | 0.079428 | 0.000055 | 0.024673 |
+| **5** | 0.086604 | 0.219408 | 0.001104 | 0.126013 |
+| **10** | 0.138546 | 0.300658 | 0.001987 | 0.202134 |
+| ... | ... | ... | ... | ... |
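As a sanity check, `recall` and `miss_rate` in the old table sum to one for every model. A simplified single-user sketch of how such relevance scores are computed (polara's actual evaluation averages these values per user and also accounts for feedback polarity):

```python
def relevance_at_k(recommended, holdout, k):
    """Simplified precision, recall and miss rate of a top-k list
    against a user's held-out items."""
    top_k = set(recommended[:k])
    hits = len(top_k & set(holdout))
    precision = hits / k
    recall = hits / len(holdout)
    miss_rate = 1 - recall  # share of held-out items the list failed to retrieve
    return precision, recall, miss_rate

# hypothetical top-4 list and holdout for a single user
p, r, m = relevance_at_k(['A', 'B', 'C', 'D'], holdout=['B', 'E'], k=3)
```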

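The call to `run_cv_experiment` collapses per-fold scores into a single table that is then averaged per (top-n, model) pair. The aggregation itself is conceptually simple; here is a stdlib-only sketch of that step (the score values are made up for illustration, and polara itself does this with pandas):

```python
from collections import defaultdict
from statistics import mean, stdev

# scores[(fold, top_n, model)] = metric value; the numbers are made up
scores = {
    (1, 10, 'PureSVD'): 0.30, (2, 10, 'PureSVD'): 0.32, (3, 10, 'PureSVD'): 0.28,
    (1, 10, 'MP'):      0.13, (2, 10, 'MP'):      0.15, (3, 10, 'MP'):      0.14,
}

# collapse the fold dimension, grouping values by (top-n, model)
by_config = defaultdict(list)
for (fold, top_n, model), value in scores.items():
    by_config[(top_n, model)].append(value)

averaged = {cfg: mean(vals) for cfg, vals in by_config.items()}
spread = {cfg: stdev(vals) for cfg, vals in by_config.items()}  # analogous to .std
```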
## Custom pipelines
@@ -137,7 +132,7 @@ Now you are ready to build your models (as in examples above) and export them to

### Warm-start and known-user scenarios
By default, polara makes the testset and trainset disjoint by users, which allows evaluating models in the *user warm-start* scenario.
-However in some situations (for example, when polara is used within a larger pipeline) you might want to implement strictly a *known user* scenario to assess the quality of your recommender system on the unseen (held-out) items for the known users. The change between these two scenarios as controlled by setting `data_model.warm_start` attribute to `True` or `False`. See [Warm-start and standard scenarios](examples/Warm-start and standard scenarios.ipynb) Jupyter notebook as an example.
+However, in some situations (for example, when polara is used within a larger pipeline) you might want to implement a strictly *known-user* scenario to assess the quality of your recommender system on the unseen (held-out) items of known users. Switching between these two scenarios is controlled by setting the `data_model.warm_start` attribute to `True` or `False`. See the [Warm-start and standard scenarios](examples/Warm_start_and_standard_scenarios.ipynb) Jupyter notebook for an example.
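The two regimes differ only in how the holdout is carved out of the data. The following is a conceptual stdlib sketch, not polara's actual splitting code (the 20% user sample and the one-item-per-user holdout are illustrative choices):

```python
import random

def split_events(events, warm_start=True, seed=0):
    """events: list of (user, item) pairs.
    Warm start: hold out entire users. Known-user: hold out items per user."""
    rng = random.Random(seed)
    by_user = {}
    for user, item in events:
        by_user.setdefault(user, []).append(item)
    train, test = [], []
    if warm_start:
        users = sorted(by_user)
        test_users = set(rng.sample(users, k=max(1, len(users) // 5)))
        for user, items in by_user.items():
            target = test if user in test_users else train
            target.extend((user, item) for item in items)
    else:
        for user, items in by_user.items():
            held = set(rng.sample(items, k=1))  # one held-out item per user
            test.extend((user, item) for item in items if item in held)
            train.extend((user, item) for item in items if item not in held)
    return train, test
```

In the warm-start split the test users never appear in training; in the known-user split every user appears on both sides, with part of their items held out.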

### Externally provided test data
If you don't want polara to perform data splitting (for example, when your test data is already provided), you can use the `set_test_data` method of a `RecommenderData` instance. It has a number of input arguments that cover all major cases of externally provided data. For example, assuming that you have new users' preferences encoded in the `unseen_data` dataframe and the corresponding held-out preferences in the `holdout` dataframe, the following command includes them in the data model:
@@ -150,4 +145,7 @@ svd.build()
svd.evaluate()
```
In this case the recommendations are generated based on the testset and evaluated against the holdout.
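Here, "evaluated against the holdout" means checking each test user's recommendation list against that user's held-out items. A minimal sketch of one such check (the user-level hit rate is just one illustrative metric; polara's `evaluate` reports the richer named-tuple metrics shown in its examples):

```python
# Hypothetical data shapes: per-user top-n lists and held-out item sets.
recommendations = {'u1': ['a', 'b', 'c'], 'u2': ['d', 'e', 'f']}
holdout = {'u1': {'b'}, 'u2': {'x'}}

def mean_hit_rate(recommendations, holdout):
    """Fraction of test users for whom at least one held-out item was recommended."""
    hits = sum(1 for user, recs in recommendations.items()
               if set(recs) & holdout[user])
    return hits / len(recommendations)

hit_rate = mean_hit_rate(recommendations, holdout)
```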
-See more usage examples in the [Custom evaluation](examples/Custom evaluation.ipynb) notebook.
+See more usage examples in the [Custom evaluation](examples/Custom_evaluation.ipynb) notebook.

### Reproducing others' work
Polara offers even more options to highly customize the experimentation pipeline and tailor it to specific needs. See, for example, the [Reproducing EIGENREC results](examples/Reproducing_EIGENREC_results.ipynb) notebook to learn how Polara can be used to reproduce experiments from the *"[EIGENREC: generalizing PureSVD for effective and efficient top-N recommendations](https://arxiv.org/abs/1511.06033)"* paper.
2 changes: 1 addition & 1 deletion conda_req.txt
@@ -1,11 +1,11 @@
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>

+python>=3.6
 jupyter>=1.0.0
 numba>=0.21.0
 numpy>=1.10.1
 matplotlib>=1.4.3
 pandas>=0.17.1
 requests>=2.7.0
 scipy>=0.16.0
-seaborn>=0.6.0
@@ -45,7 +45,6 @@
"metadata": {},
"outputs": [],
"source": [
-"from __future__ import print_function\n",
"import numpy as np\n",
"from polara.datasets.movielens import get_movielens_data"
]
@@ -178,7 +177,8 @@
"output_type": "stream",
"text": [
"Preparing data...\n",
-"Done.\n"
+"Done.\n",
+"There are 766928 events in the training and 0 events in the holdout.\n"
]
}
],
@@ -466,7 +466,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
-"PureSVD training time: 0.0981479023320091s\n"
+"PureSVD training time: 0.128s\n"
]
}
],
@@ -563,7 +563,10 @@
{
"data": {
"text/plain": [
-"Hits(true_positive=4443, false_positive=1131, true_negative=15870, false_negative=19081)"
+"[Relevance(precision=0.4671998676068703, recall=0.18790260795025798, fallout=0.0587545652398142, specifity=0.7075255418074651, miss_rate=0.7362722359391265),\n",
+" Ranking(nDCG=0.16791941496631102, nDCL=0.07078245692187013),\n",
+" Experience(coverage=0.14883215643671918),\n",
+" Hits(true_positive=4443, false_positive=1131, true_negative=15870, false_negative=19081)]"
]
},
"execution_count": 19,
@@ -770,47 +773,47 @@
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
-" <th>new</th>\n",
 " <th>old</th>\n",
+" <th>new</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>775</th>\n",
-" <td>775</td>\n",
 " <td>776</td>\n",
+" <td>775</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1965</th>\n",
-" <td>1965</td>\n",
 " <td>1966</td>\n",
+" <td>1965</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4137</th>\n",
-" <td>4137</td>\n",
 " <td>4138</td>\n",
+" <td>4137</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4422</th>\n",
-" <td>4422</td>\n",
 " <td>4423</td>\n",
+" <td>4422</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4746</th>\n",
-" <td>4746</td>\n",
 " <td>4747</td>\n",
+" <td>4746</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
-" new old\n",
-"775 775 776\n",
-"1965 1965 1966\n",
-"4137 4137 4138\n",
-"4422 4422 4423\n",
-"4746 4746 4747"
+" old new\n",
+"775 776 775\n",
+"1965 1966 1965\n",
+"4137 4138 4137\n",
+"4422 4423 4422\n",
+"4746 4747 4746"
]
},
"execution_count": 27,
@@ -1102,47 +1105,47 @@
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
-" <th>new</th>\n",
 " <th>old</th>\n",
+" <th>new</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
-" <td>0</td>\n",
 " <td>4833</td>\n",
+" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
-" <td>1</td>\n",
 " <td>4834</td>\n",
+" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
-" <td>2</td>\n",
 " <td>4835</td>\n",
+" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
-" <td>3</td>\n",
 " <td>4836</td>\n",
+" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
-" <td>4</td>\n",
 " <td>4837</td>\n",
+" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
-" new old\n",
-"0 0 4833\n",
-"1 1 4834\n",
-"2 2 4835\n",
-"3 3 4836\n",
-"4 4 4837"
+" old new\n",
+"0 4833 0\n",
+"1 4834 1\n",
+"2 4835 2\n",
+"3 4836 3\n",
+"4 4837 4"
]
},
"execution_count": 35,
@@ -1180,47 +1183,47 @@
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
-" <th>new</th>\n",
 " <th>old</th>\n",
+" <th>new</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
-" <td>0</td>\n",
 " <td>1</td>\n",
+" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
-" <td>1</td>\n",
 " <td>2</td>\n",
+" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
-" <td>2</td>\n",
 " <td>3</td>\n",
+" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
-" <td>3</td>\n",
 " <td>4</td>\n",
+" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
-" <td>4</td>\n",
 " <td>5</td>\n",
+" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
-" new old\n",
-"0 0 1\n",
-"1 1 2\n",
-"2 2 3\n",
-"3 3 4\n",
-"4 4 5"
+" old new\n",
+"0 1 0\n",
+"1 2 1\n",
+"2 3 2\n",
+"3 4 3\n",
+"4 5 4"
]
},
"execution_count": 36,
@@ -1603,7 +1606,10 @@
{
"data": {
"text/plain": [
"Hits(true_positive=1063, false_positive=245, true_negative=3505, false_negative=4666)"
"[Relevance(precision=0.48771352650892064, recall=0.1962177724147278, fallout=0.05871125477406844, specifity=0.6808813050133541, miss_rate=0.741780456106087),\n",
" Ranking(nDCG=0.17149113058615703, nDCL=0.06967069219097612),\n",
" Experience(coverage=0.10999456816947312),\n",
" Hits(true_positive=1063, false_positive=245, true_negative=3505, false_negative=4666)]"
]
},
"execution_count": 41,
@@ -1655,6 +1661,13 @@
"source": [
"svd.evaluate('relevance')"
]
},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": []
}
],
"metadata": {