Google Analytics Capstone Project #687

Merged: 10 commits, Jul 6, 2024
12 changes: 12 additions & 0 deletions Google Analytics Capstone Project/Dataset/GCapstone.csv
@@ -0,0 +1,12 @@
Interest in Large Language Model by Region throughout 2023,,,,,,Sourced from Google Trends Searches,,,,,Data is created by cumulating interest throughout periods of a month,
,,,,,,,,,,,"If data includes numbers from other months, will be included in other months instead. (i.e. if data runs from November 26th to December 4th, the data will be counted for December)",
Month:,January,February,March,April,May,June,July,August,September,October,November,December
Regions:,,,,,,,,,,,,
WorldWide,0,0,0,27,51,158,149,260,312,280,303,386
China,0,7,0,46,35,73,161,303,214,217,277,234
Singapore,6,0,5,33,69,120,218,205,324,237,320,328
South Korea,3,0,0,17,25,30,153,163,252,238,313,307
Japan,1,0,1,10,44,123,239,242,338,326,351,366
United States,0,0,0,28,65,181,193,242,310,279,294,369
,,,,,,,,,,,,
"Max interest a month is four hundred, because each month has a period of four.",,,,,,,,,,,,
1 change: 1 addition & 0 deletions Google Analytics Capstone Project/Dataset/README.md
@@ -0,0 +1 @@
dataset link: https://www.kaggle.com/datasets/fredericxiong/google-analytics-capstone-project
3,440 changes: 3,440 additions & 0 deletions Google Analytics Capstone Project/Model/Google_Analytics_Capstone_Project.ipynb

Large diffs are not rendered by default.

59 changes: 59 additions & 0 deletions Google Analytics Capstone Project/Model/README.md
@@ -0,0 +1,59 @@
## **Google Analytics Capstone Project**

### 🎯 **Goal**

Build a machine learning model to analyze the Google Analytics capstone dataset.

### 🧵 **Dataset**

https://www.kaggle.com/datasets/fredericxiong/google-analytics-capstone-project

### 🧾 **Description**

Analysis of interest in generative AI across different regions, based on monthly Google Trends search interest in "Large Language Model" throughout 2023.

### 🧮 **What I have done!**

Data Collection and Preparation -> EDA -> Model Training -> Model Validation -> Comparing the performance metrics of various models
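
As a rough illustration of the data preparation step, the sketch below parses a few rows copied from `GCapstone.csv` into per-region monthly series. It uses only the standard library `csv` module; the function name `load_interest` is hypothetical, and this is not necessarily how the notebook loads the data.

```python
import csv
import io

# Rows copied from GCapstone.csv: the month header, the label-only
# "Regions:" row, and two of the six regions.
RAW = """Month:,January,February,March,April,May,June,July,August,September,October,November,December
Regions:,,,,,,,,,,,,
WorldWide,0,0,0,27,51,158,149,260,312,280,303,386
Japan,1,0,1,10,44,123,239,242,338,326,351,366
"""


def load_interest(text):
    """Return (months, {region: [monthly interest, ...]}) from CSV text."""
    # Drop rows that are completely empty.
    rows = [r for r in csv.reader(io.StringIO(text)) if any(c.strip() for c in r)]
    header, *data = rows
    months = header[1:]
    series = {}
    for row in data:
        region, values = row[0], row[1:]
        # Skip label-only rows such as "Regions:" that carry no numbers.
        if not all(v.strip().isdigit() for v in values):
            continue
        series[region] = [int(v) for v in values]
    return months, series


months, series = load_interest(RAW)
for region, values in series.items():
    peak = max(range(len(values)), key=values.__getitem__)
    print(f"{region}: peak interest {values[peak]} in {months[peak]}")
```

The two free-text note rows at the top of the real file would need to be skipped before the header row; they are left out of the sample here.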

### 🚀 **Models Implemented**

1. SARIMA
2. ARIMA
3. Linear Regression
4. Random Forest
5. LSTM
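
To give a feel for the simplest model in the list, here is a minimal sketch of a least-squares linear trend fitted to the WorldWide series from `GCapstone.csv`. The helper `fit_line` is hypothetical and written in plain Python; the notebook's own linear regression may use different features, scaling, and a library implementation.

```python
# WorldWide monthly interest for 2023, copied from GCapstone.csv.
worldwide = [0, 0, 0, 27, 51, 158, 149, 260, 312, 280, 303, 386]


def fit_line(y):
    """Least-squares fit of y against month index 1..len(y).

    Returns (slope, intercept).
    """
    n = len(y)
    x = range(1, n + 1)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(worldwide)
print(f"trend: {slope:.2f} interest points per month")
```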

### 📚 **Libraries Needed**

1. NumPy
2. Pandas
3. Matplotlib
4. scikit-learn
5. TensorFlow

### 📊 **Exploratory Data Analysis Results**

<img src="https://github.com/why-aditi/ML-Crate/blob/main/Google%20Analytics%20Capstone%20Project/Images/download%20(1).png">
<img src="https://github.com/why-aditi/ML-Crate/blob/main/Google%20Analytics%20Capstone%20Project/Images/download%20(2).png">
<img src="https://github.com/why-aditi/ML-Crate/blob/main/Google%20Analytics%20Capstone%20Project/Images/download%20(3).png">
<img src="https://github.com/why-aditi/ML-Crate/blob/main/Google%20Analytics%20Capstone%20Project/Images/download%20(4).png">
<img src="https://github.com/why-aditi/ML-Crate/blob/main/Google%20Analytics%20Capstone%20Project/Images/download%20(5).png">
<img src="https://github.com/why-aditi/ML-Crate/blob/main/Google%20Analytics%20Capstone%20Project/Images/download%20(6).png">

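### 📈 **Performance of the Models based on Mean Squared Error**

Mean Squared Error (MSE) was used as the performance metric; lower is better.

1. SARIMA: 3025.1666666666665
2. ARIMA: 0.015828689092572328
3. Linear Regression: 0.15681863010490982
4. Random Forest: 0.02206226453506559
5. LSTM: 27538.882124875207

The scores span several orders of magnitude, which suggests that the models may not all have been evaluated on the same data scale.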
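
MSE depends on the scale of the values being compared, so the same forecast errors give very different scores on raw and normalized data. The sketch below shows this with a plain-Python `mean_squared_error` helper. The forecasts are hypothetical values chosen for illustration, and dividing by 400 (the maximum monthly interest noted in the dataset) is an assumed normalization, not a step confirmed by the notebook.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# WorldWide interest for November and December, copied from GCapstone.csv.
actual = [303, 386]
# Hypothetical forecasts, for illustration only.
predicted = [280, 350]

raw_mse = mean_squared_error(actual, predicted)

# Assumed normalization: divide by 400, the dataset's stated monthly maximum.
SCALE = 400
scaled_mse = mean_squared_error(
    [v / SCALE for v in actual],
    [v / SCALE for v in predicted],
)

print(f"raw MSE:    {raw_mse}")
print(f"scaled MSE: {scaled_mse:.6f}")
```

Dividing the data by a constant `c` divides the MSE by `c ** 2`, so scores are only directly comparable when every model is evaluated on the same scale.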


### 📢 **Conclusion**

ARIMA turned out to be the best model, with the lowest MSE of about 0.016.

### ✒️ **Your Signature**

Aditi Kala
4 changes: 4 additions & 0 deletions Google Analytics Capstone Project/requirements.txt
@@ -0,0 +1,4 @@
numpy
pandas
tensorflow
scikit-learn