Welcome to the Book Recommender System! This project is a Streamlit-based application that provides personalized book recommendations using a k-Nearest Neighbors (kNN) model and a collaborative filtering approach via SVD.
## Features

- Displays the total number of books and users in the dataset.
- Shows the most popular books by number of ratings.
- Highlights the top-rated books in the dataset.
- Provides book recommendations based on the user's selected title.
- Allows users to rate recommendations and save their feedback.
- Lets users search for books by keyword.
- Suggests a random book from the dataset.
- Visualizes the distributions of ratings and user interactions with books.
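The keyword search can be pictured as a simple case-insensitive substring filter over the book DataFrame. This is only a sketch with made-up sample data and column names, not the app's actual implementation:

```python
import pandas as pd

# Made-up sample standing in for the book DataFrame.
books = pd.DataFrame({
    "title": ["The Hobbit", "Hobby Farming", "Dune", "Dune Messiah"],
    "avg_rating": [4.6, 3.2, 4.5, 4.1],
})

def search_books(df: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Return rows whose title contains the keyword, ignoring case."""
    mask = df["title"].str.contains(keyword, case=False, regex=False)
    return df[mask]

print(search_books(books, "dune")["title"].tolist())  # → ['Dune', 'Dune Messiah']
```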
## Technologies Used

- Streamlit: For building the web-based user interface.
- scikit-learn: For implementing the kNN recommendation algorithm.
- pandas: For data manipulation.
- plotly: For creating interactive visualizations.
- pickle: For saving and loading the pre-trained models and datasets.
- SVD (Singular Value Decomposition): For collaborative filtering-based recommendations.
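To illustrate the kNN idea (a toy sketch with made-up ratings, not the trained model shipped with the project), scikit-learn's `NearestNeighbors` can recommend the books whose rating vectors lie closest to a selected title:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy user-item matrix: rows are books, columns are users (made-up data).
titles = ["Dune", "Emma", "Hamlet", "Dracula"]
ratings = np.array([
    [5, 4, 0, 1],
    [0, 5, 4, 0],
    [1, 4, 5, 0],
    [5, 3, 0, 2],
])

# Fit kNN on the book vectors; cosine distance is a common choice for ratings.
knn = NearestNeighbors(metric="cosine", algorithm="brute")
knn.fit(ratings)

# The 2 books most similar to "Dune" (the first neighbor is the book itself).
distances, indices = knn.kneighbors(ratings[[0]], n_neighbors=3)
recommendations = [titles[i] for i in indices[0][1:]]
print(recommendations)  # → ['Dracula', 'Hamlet']
```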
## Prerequisites

- Python 3.9 or later
- Required Python libraries (listed in `requirements.txt`)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/book-recommender-system.git
   cd book-recommender-system
   ```

2. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Place the required data and model files in the `artifacts/` directory:

   - `knn_model.pkl`
   - `svd_model.pkl`
   - `book_titles.pkl`
   - `book_df.pkl`

4. Run the Streamlit app:

   ```bash
   streamlit run app.py
   ```

5. Open your web browser and navigate to `http://localhost:8501`.
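The artifacts above are ordinary pickle files. A minimal sketch of the save/load round trip (the dict below is a stand-in for a real trained model, not the project's actual object):

```python
import pickle
from pathlib import Path

# Stand-in for a trained model; train.py would pickle the real estimators.
artifact = {"model": "knn", "n_neighbors": 5}
path = Path("artifacts") / "knn_model.pkl"
path.parent.mkdir(exist_ok=True)

# Save after training.
with open(path, "wb") as f:
    pickle.dump(artifact, f)

# Load at app startup.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded)
```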
## Project Structure

```
book-recommender-system/
├── app.py                       # Main application script
├── utils.py                     # Shared helper functions
├── tabs/                        # One module per UI tab
│   ├── tab0.py
│   ├── tab1.py
│   ├── tab2.py
│   ├── tab3.py
│   ├── tab4.py
│   ├── tab5.py
│   └── tab6.py
├── artifacts/                   # Model and dataset files
│   ├── svd_model.pkl            # Trained SVD model
│   ├── knn_model.pkl            # Trained kNN model
│   ├── book_df.pkl              # Book DataFrame
│   └── book_titles.pkl          # Titles of the books
├── data/                        # Raw and cleaned datasets
│   ├── dataset.csv              # Raw dataset with book info
│   ├── Ratings.csv              # Ratings data
│   ├── dataset_with_details.csv # Extended dataset
│   ├── cleaned_data.csv         # Preprocessed dataset
│   ├── Users.csv                # User data
│   └── Books.csv                # Book data
├── scrapper/                    # Scraping scripts
│   ├── scrapper.py              # Main scraping script
│   └── scrapper_cache_check.py  # Script to verify cached data
├── notebooks/                   # Jupyter notebooks for model training and analysis
│   └── base.ipynb               # Basic EDA notebook
├── Dockerfile                   # Docker configuration file
├── train.py                     # Script for training the recommendation model
├── requirements.txt             # Dependencies for the project
├── runtime.txt                  # Specifies Python version for deployment
├── README.md                    # Project documentation
├── LICENSE                      # License information
└── .gitignore                   # Git ignore file
```
- artifacts/: Contains all the generated artifacts from the training process, including the trained models and processed data.
- data/: Directory for raw and cleaned datasets.
- scrapper/: Contains the scraping scripts used to collect book and user data.
- notebooks/: Jupyter notebooks for exploratory data analysis (EDA) and model training.
- Dockerfile: Used for containerizing the application.
- app.py: The main application file for deploying the recommendation engine.
- requirements.txt: Lists all the Python dependencies required for the project.
- runtime.txt: Specifies the runtime environment for deployment (e.g., Python version).
- train.py: A standalone Python script for training the recommendation engine model.
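For intuition on how the SVD model produces recommendations (a toy sketch with made-up ratings, not the trained model shipped in `artifacts/`): factorizing the user-item matrix and keeping only the top singular values yields a low-rank approximation whose entries act as predicted scores for books a user has not yet rated.

```python
import numpy as np

# Toy user x book rating matrix (made-up data); 0 marks "not rated".
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])

# Truncated SVD: keep the top-k singular values for a rank-k approximation.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Predicted score for user 0 on book 2, which that user has not rated.
print(round(R_hat[0, 2], 2))
```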
## License

This project is licensed under the MIT License. See the LICENSE file for details.
Thank you for exploring the Book Recommender System! 🚀