This repository contains the Data Science project for Acme Innovations, a legacy household name facing declining customer retention rates. This project is part of the Data Mining and Business Intelligence course in the Master's program. The project aims to leverage advanced data science techniques to understand customer behavior, identify growth opportunities, optimize marketing strategies, and enhance customer retention.
Course: Data Mining and Business Intelligence
Program: Master's Degree in Business Analytics
Institution: University of New Haven
Semester: 2nd Semester
- Project_Introudction.pdf: Detailed explanation of the project and its deliverables
- Proposal for Acme Innovations.pdf: Detailed project proposal document
- Presentation for ACME Innovations.pdf: Presentation document
- Customer Segmentation - Product Recommendation Project.R: R code for data analysis and modeling
- customer_data.csv: Dataset containing customer information
- customer_purchase_history_final.csv: Dataset containing customer purchase history
- README.md: This file, which provides an overview of the project
Motto: Catalyzing Growth through Data
1. K-means Clustering for Customer Segmentation
- Features used: salary and spending score
- Data scaling for unbiased clustering
- Optimal cluster determination using the elbow method
- Visualization of clusters using ggplot2
2. Apriori Algorithm for Product Recommendation
- Preprocessing of transaction data
- Careful setting of support and confidence thresholds
- Rule generation and evaluation using support, confidence, and lift metrics
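
The segmentation steps in item 1 can be sketched as follows. This is a minimal illustration, not the project's script: it assumes base R's `kmeans()` (the elbow method is a k-means diagnostic), uses synthetic data in place of customer_data.csv, and the column names `salary` and `spending_score` are hypothetical.

```r
library(ggplot2)

set.seed(42)  # reproducible synthetic data and k-means starts

# Synthetic stand-in for customer_data.csv; the column names are assumptions.
customers <- data.frame(
  salary = c(rnorm(50, 30000, 4000), rnorm(50, 60000, 5000), rnorm(50, 90000, 6000)),
  spending_score = c(rnorm(50, 70, 8), rnorm(50, 45, 8), rnorm(50, 20, 8))
)

# Scale both features so salary's larger range does not dominate the distances.
features <- scale(customers[, c("salary", "spending_score")])

# Elbow method: total within-cluster sum of squares for k = 1..10.
wcss <- sapply(1:10, function(k) {
  kmeans(features, centers = k, nstart = 25)$tot.withinss
})
plot(1:10, wcss, type = "b",
     xlab = "Number of clusters (k)", ylab = "Total within-cluster SS")

# Fit the final model; k = 3 is used only because the synthetic data has three groups.
fit <- kmeans(features, centers = 3, nstart = 25)
customers$cluster <- factor(fit$cluster)

# Plot the clusters on the original (unscaled) axes.
cluster_plot <- ggplot(customers, aes(x = salary, y = spending_score, colour = cluster)) +
  geom_point() +
  labs(x = "Salary", y = "Spending score", colour = "Cluster")
print(cluster_plot)
```

On real data, the value of k would be read off the bend in the elbow plot rather than fixed in advance.
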
- Customer segmentation revealed distinct clusters based on salary and spending score
- Identified high-salary customers with low spending scores as potential targets for revenue increase
- Discovered product combinations with high support and confidence scores for targeted recommendations
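
Product combinations of this kind can be mined along the following lines. This is a hedged sketch using the `arules` package: the baskets are made-up stand-ins for customer_purchase_history_final.csv, and the support and confidence thresholds are illustrative values, not the ones used in the project.

```r
library(arules)

# Hypothetical baskets; each vector is one customer's transaction.
baskets <- list(
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("bread", "milk"),
  c("butter", "milk"),
  c("bread", "butter", "jam"),
  c("bread", "butter", "milk")
)
transactions <- as(baskets, "transactions")

# Mine rules with at least two items; thresholds are illustrative assumptions.
rules <- apriori(
  transactions,
  parameter = list(support = 0.3, confidence = 0.6, minlen = 2),
  control = list(verbose = FALSE)
)

# Rank rules by lift and show the strongest ones with their quality metrics.
top_rules <- sort(rules, by = "lift")
inspect(head(top_rules, 5))
```

Lowering the support threshold surfaces rarer combinations at the cost of many more rules, so the thresholds are usually tuned against the size of the transaction set.
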
- Focus marketing efforts on high-salary, low-spending customers to increase revenue
- Offer loyalty program memberships to high-spending customers for recurring revenue
- Leverage product recommendations with high support and confidence scores for cross-selling
- Implement personalized marketing strategies based on customer segments
- Clone the repository to your local machine.
- Review the Proposal for Acme Innovations.pdf for a detailed project overview and methodology.
- Examine the Customer Segmentation - Product Recommendation Project.R file for the complete data analysis and modeling process.
- Ensure you have R and the required packages installed (see Dependencies section).
- Place the customer_data.csv and customer_purchase_history_final.csv files in the appropriate directory.
- Run the R script to reproduce the analysis and generate insights.
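
Before running the script, it can help to confirm that both input files are in place. The sketch below assumes the script reads the CSV files from the current working directory, which the repository does not state explicitly; `missing_inputs()` is a hypothetical helper, not part of the project.

```r
# File names are taken from the repository file listing.
required_files <- c("customer_data.csv", "customer_purchase_history_final.csv")

# Return the required files that are absent from `data_dir`.
missing_inputs <- function(data_dir = getwd()) {
  required_files[!file.exists(file.path(data_dir, required_files))]
}

missing <- missing_inputs()
if (length(missing) > 0) {
  message("Missing input files: ", paste(missing, collapse = ", "))
} else {
  # Assumes the script sits in the working directory alongside the data.
  source("Customer Segmentation - Product Recommendation Project.R")
}
```
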
The following R packages are required to run the analysis:
- readr
- datasets
- cluster
- caTools
- ggplot2
- stringr
- arules
Install any missing packages with install.packages(). The datasets package ships with base R and does not need to be installed.
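
One way to install only what is missing is sketched below. The CRAN mirror URL is an assumption chosen for illustration; any mirror works. The datasets package is left out of the vector because it is bundled with R itself.

```r
# Packages from the list above, minus datasets (bundled with base R).
required_packages <- c("readr", "cluster", "caTools", "ggplot2", "stringr", "arules")

# Keep only the packages that are not yet installed.
to_install <- setdiff(required_packages, rownames(installed.packages()))

if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```
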
- Implement more advanced product recommendation techniques, such as collaborative filtering
- Incorporate time series analysis to understand customer behavior trends over time
- Develop a predictive model for customer churn
- Create an interactive dashboard for real-time monitoring of key metrics
- Integrate external data sources for more comprehensive analysis
This project is submitted as part of the requirements for the Data Mining and Business Intelligence course. It represents my own work and adheres to the academic integrity policies of the University of New Haven. All sources used have been properly cited and acknowledged.
While this is primarily an academic project, constructive feedback and suggestions for improvement are welcome. Please feel free to open an issue or submit a pull request if you have any insights to share.
This project is licensed under the MIT License - see the LICENSE.md file for details.
For any questions or feedback regarding this project, please contact:
Aris
Email: mailaristotle@gmail.com