
Implementation and analysis of three recommendation algorithms (Naive Approaches, UV matrix decomposition, and Matrix Factorization) on the MovieLens dataset


hedzd/Recommender-Systems


MovieLens Recommender Systems

Description

This project implements several recommender systems on the MovieLens 1M dataset to provide personalized movie recommendations. The dataset contains over one million ratings, and the goal is to estimate the rating a user would give to a movie. The implementation covers naive approaches, UV matrix decomposition, and matrix factorization, and evaluates their accuracy with RMSE and MAE under 5-fold cross-validation. In addition, PCA, t-SNE, and UMAP are used to visualize the user and movie vector representations produced by the matrix factorization algorithms, giving a better understanding of the dataset's characteristics.

Algorithms Implemented

  • Naive Approaches: These methods predict unknown ratings from the overall average rating, the average rating per item, the average rating per user, and a tuned linear combination of these averages. They are straightforward baselines against which the more complex algorithms are compared.
  • UV Matrix Decomposition: This technique seeks two low-dimensional matrices $U$ and $V$ that minimize the mean squared error of $M - UV$ over the known entries of the ratings matrix $M$. To this end, we iterate through each element of $U$ and $V$ and set it to the value that minimizes the MSE while all other entries of $U$ and $V$ are held at their current values.
  • Matrix Factorization: Building on the concept of UV matrix decomposition, matrix factorization also approximates the original ratings matrix by the product of two lower-dimensional matrices. However, it uses a different optimization process: gradient descent with regularization refines the estimates.
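The two update rules described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names (`optimal_u_entry`, `train_mf`, `train_rmse`), the `(user, item, rating)` triple format, and the hyperparameter defaults are all assumptions made for the example. The first function gives the closed-form best value of a single entry of $U$ with everything else fixed; the second runs stochastic gradient descent with L2 regularization.

```python
import math
import random


def optimal_u_entry(ratings, U, V, r, s):
    """Best value for U[r][s] with all other entries of U and V held fixed.

    ratings: list of (user, item, rating) triples (known entries only).
    U is n_users x d, V is d x n_items, so the prediction is sum_k U[u][k] * V[k][i].
    """
    num = den = 0.0
    for user, item, rating in ratings:
        if user != r:
            continue  # only row r of M depends on U[r][s]
        # Prediction for this entry without the contribution of U[r][s].
        rest = sum(U[r][k] * V[k][item] for k in range(len(V)) if k != s)
        num += V[s][item] * (rating - rest)
        den += V[s][item] ** 2
    # If no known rating constrains this entry, leave it unchanged.
    return num / den if den > 0 else U[r][s]


def train_mf(ratings, n_users, n_items, d=2, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Matrix factorization by SGD; P is n_users x d, Q is n_items x d (assumed layout)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(d)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(d)] for _ in range(n_items)]
    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, r in order:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for k in range(d):
                pu, qi = P[u][k], Q[i][k]  # keep old values for a simultaneous update
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


def train_rmse(ratings, P, Q):
    """RMSE of the factor model on the given ratings."""
    se = sum((r - sum(p * q for p, q in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))
```

A full UV pass would call `optimal_u_entry` for every entry of $U$ (and the symmetric rule for $V$) repeatedly until the error stops improving, whereas `train_mf` takes many small gradient steps and shrinks the factors toward zero through `reg`.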

Data Visualization

  • Uses PCA, t-SNE, and UMAP for reducing the dimensions of the data and visualizing the vector representations of users and movies.
  • Aims to reveal patterns and clusters based on movie genres and user demographics.
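As a rough illustration of the dimensionality-reduction step, the sketch below projects a set of embedding vectors onto their top two principal components using only the standard library. It is an assumption-laden stand-in: the function name `pca_2d` is hypothetical, power iteration is used here only to keep the example self-contained, and a real analysis would normally rely on a library implementation. t-SNE and UMAP are not sketched.

```python
import math
import random


def pca_2d(vectors, iters=200, seed=0):
    """Project row vectors onto their top two principal components."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(d)]
    X = [[v[k] - mean[k] for k in range(d)] for v in vectors]  # centered data
    # Sample covariance matrix (d x d).
    C = [[sum(x[a] * x[b] for x in X) / (n - 1) for b in range(d)] for a in range(d)]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        w = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # Remove directions already found so we converge to the next component.
            for c in components:
                proj = sum(a * b for a, b in zip(w, c))
                w = [a - proj * b for a, b in zip(w, c)]
            # Power iteration step: multiply by C and renormalize.
            w = [sum(C[a][b] * w[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0:
                break  # no variance left in the remaining directions
            w = [a / norm for a in w]
        components.append(w)

    # Coordinates of each centered vector along the two components.
    return [[sum(a * b for a, b in zip(x, c)) for c in components] for x in X]
```

The resulting 2D points could then be scattered and colored by movie genre or user demographic to look for the clusters mentioned above.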

Results

Each algorithm is evaluated by its mean RMSE and mean MAE over 5-fold cross-validation.

| Algorithm | Mean RMSE | Mean MAE |
|---|---|---|
| Naive Approach - Global Average | 1.423 | 0.871 |
| Naive Approach - User Average | 1.155 | 0.794 |
| Naive Approach - Movie Average | 1.038 | 0.751 |
| Naive Approach - Linear Combo | 0.894 | 0.675 |
| UV Matrix Decomposition | 0.938 | 0.654 |
| Matrix Factorization | 0.848 | 0.642 |
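For readers who want to see how figures of this kind are produced, here is a minimal sketch of the evaluation loop: RMSE, MAE, and k-fold cross-validation around a movie-average baseline. The helper names (`k_fold_indices`, `fit_item_average`, `cross_validate`), the `(user, item, rating)` triple format, and the fallback to the global mean for unseen movies are assumptions for illustration; the repository's own splitting and fallback rules may differ.

```python
import math
import random
from collections import defaultdict


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def fit_item_average(train):
    """Return a predictor using the per-movie mean, else the global mean."""
    global_mean = sum(r for _, _, r in train) / len(train)
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, r in train:
        sums[item] += r
        counts[item] += 1

    def predict(user, item):
        # Movies never seen in training fall back to the global mean (assumption).
        return sums[item] / counts[item] if item in counts else global_mean

    return predict


def cross_validate(ratings, fit, k=5, seed=0):
    """Return (mean RMSE, mean MAE) over k folds for a model-fitting function."""
    folds = k_fold_indices(len(ratings), k, seed)
    rmses, maes = [], []
    for f in range(k):
        held_out = set(folds[f])
        train = [x for j, x in enumerate(ratings) if j not in held_out]
        test = [ratings[j] for j in folds[f]]
        predict = fit(train)
        y_true = [r for _, _, r in test]
        y_pred = [predict(u, i) for u, i, _ in test]
        rmses.append(rmse(y_true, y_pred))
        maes.append(mae(y_true, y_pred))
    return sum(rmses) / k, sum(maes) / k
```

Calling `cross_validate(ratings, fit_item_average)` on the full list of rating triples would give one row of such a table; any other model can be plugged in by passing a different `fit` function that returns a `predict(user, item)` callable.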
