Skip to content

Latest commit

 

History

History
48 lines (30 loc) · 1.71 KB

README.md

File metadata and controls

48 lines (30 loc) · 1.71 KB

Movie Recommender System: Content-Based Filtering

1.1 Overview

Content-based filtering recommends movies based on the features of the items (movies) and a profile of the user's preferences. The system analyzes movie attributes such as genre, director, cast, and plot keywords to recommend similar movies that match a user's past preferences.

1.2 Objectives

Extract and preprocess movie features. Build a content-based recommender model. Evaluate the performance of the model.

2.1 Data Source

Describe the dataset used, such as the MovieLens dataset. Mention its source, number of movies, and the type of features available (e.g., genre, director, cast, plot keywords).

2.2 Data Features

List the features extracted from the dataset. Common features include:

Movie titles Genres Director names Cast names Plot keywords

3.1 Data Cleaning

Handle missing values by filling or removing them. Remove duplicates. Standardize text data (e.g., convert all text to lowercase).

###3.2 Feature Extraction Tokenize and normalize text features (e.g., split genres into individual terms). Create a combined feature for text-based similarities, such as combining genres, plot keywords, and director into a single string.

###3.3 Vectorization Use techniques like TF-IDF (Term Frequency-Inverse Document Frequency) to convert text features into numerical vectors.

3.2 Feature Extraction

Tokenize and normalize text features (e.g., split genres into individual terms). Create a combined feature for text-based similarities, such as combining genres, plot keywords, and director into a single string.

3.3 Vectorization

Use TF-IDF (Term Frequency-Inverse Document Frequency) to convert text features into numerical vectors.