Merge pull request #568 from Avdhesh-Varshney/ielts
IELTS Success Analysis and Prediction Model
Showing 9 changed files with 121 additions and 0 deletions.
@@ -0,0 +1,33 @@
# IELTS Success Stories Dataset

The dataset used here comes from Kaggle. You can download it from [IELTS Success Stories Dataset](https://www.kaggle.com/datasets/zakirkhanaleemi/ielts-success-stories-dataset).

## About the dataset

- There are 27 rows (entries) in this dataset.
- There are 23 features, listed below:

- Candidate: Name or identifier of the individual who took the IELTS test.
- Location: The city or region where the candidate is located.
- Profession: The candidate's occupation or field of work/study.
- Study Duration (months): The duration, in months, that the candidate spent preparing for the IELTS test.
- IELTS Score (Overall): The overall band score achieved by the candidate in the IELTS test.
- Key Strategies: Strategies and methods employed by the candidate during their IELTS preparation.
- Education Level: The highest level of education attained by the candidate (e.g., Bachelor's, Master's).
- Age: The age of the candidate at the time of taking the IELTS test.
- Target Country: The country the candidate aspires to move to or pursue further studies in.
- English Proficiency (Preparation): The candidate's self-assessed English proficiency level before starting IELTS preparation.
- Practice Hours per Week: The average number of hours per week the candidate dedicated to IELTS practice.
- Mock Tests Taken: The number of practice/mock IELTS tests taken by the candidate.
- Achieved Desired Score: Indicates whether the candidate achieved their target IELTS score.
- Preferred Learning Resources: The materials or resources the candidate favored during their IELTS preparation.
- Application Status: The status of the candidate's application for further studies or immigration.
- Job Offer Received: Indicates whether the candidate received a job offer in their target country.
- Additional Certifications: Any additional certifications or qualifications attained by the candidate.
- Volunteer Experience: Whether the candidate has relevant volunteer experience.
- Language Fluency: The candidate's fluency in languages other than English.
- Internship Experience: Whether the candidate has relevant internship experience.
- Relevant Skills: Skills possessed by the candidate that are relevant to their profession or studies.
- Recommendations: The strength of recommendations provided for the candidate.
- Networking Efforts: Efforts made by the candidate to network within their field or community.

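A first look at a file with these columns might go roughly as follows. This is a minimal sketch, not the notebook's code: the two rows in `sample_csv` are made up for illustration, only five of the 23 column names are used, and the real file would be read from disk with the same `pd.read_csv` call.

```python
import io

import pandas as pd

# Two made-up rows using a handful of the column names listed above;
# the real file has 27 rows and 23 columns.
sample_csv = io.StringIO(
    "Candidate,Study Duration (months),IELTS Score (Overall),"
    "Mock Tests Taken,Achieved Desired Score\n"
    "A,3,7.5,10,Yes\n"
    "B,6,,4,No\n"
)
df = pd.read_csv(sample_csv)

# Number of rows and columns.
print(df.shape)

# Missing values per column (the second row has no overall score).
print(df.isna().sum())

# Columns pandas parsed as numeric; the rest would need encoding before modelling.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

Text columns such as `Achieved Desired Score` are not numeric, so they would have to be encoded before being passed to a regression model.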
Binary file added: IELTS Success Analysis and Prediction/Images/Application Status_feature.png (+17 KB)
Binary file added: IELTS Success Analysis and Prediction/Images/Study Duration (months)_feature.png (+18 KB)
IELTS Success Analysis and Prediction/Model/ielts-success-analysis-and-prediction.ipynb: 1 addition & 0 deletions (large diff not rendered)
@@ -0,0 +1,80 @@
<h1>IELTS Success Stories Analysis and Prediction Model</h1>

**GOAL**

The aim of this project is to analyze the IELTS Success Stories Dataset and predict candidates' IELTS success.

**DATASET**

https://www.kaggle.com/datasets/zakirkhanaleemi/ielts-success-stories-dataset

**DESCRIPTION**

Analyze the IELTS Success Stories Dataset, then build and train regression models on its features.

### Visualization and EDA of different attributes:

<img alt="heatmap" src="./Images/correlation_heatmap.png">

<img alt="graph" src="./Images/target_correlation.png">

<img alt="graph" src="./Images/Application Status_feature.png">

<img alt="graph" src="./Images/Location_feature.png">

<img alt="graph" src="./Images/Study Duration (months)_feature.png">

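The two correlation figures (`correlation_heatmap.png` and `target_correlation.png`) are typically derived from a correlation matrix. The sketch below shows one way to compute it. It assumes `IELTS Score (Overall)` is the target, which the README does not state, and it uses made-up values rather than rows from the dataset.

```python
import pandas as pd

# Made-up numeric values for illustration only; the column names come
# from the dataset description.
df = pd.DataFrame({
    "Study Duration (months)": [2, 3, 4, 6, 8],
    "Practice Hours per Week": [5, 8, 10, 12, 15],
    "Mock Tests Taken": [10, 3, 8, 5, 6],
    "IELTS Score (Overall)": [6.0, 6.5, 7.0, 7.5, 8.0],
})

# Pearson correlation between every pair of numeric columns;
# this matrix is what a correlation heatmap displays.
corr = df.corr()

# Correlation of each feature with the assumed target, strongest first.
target = "IELTS Score (Overall)"
target_corr = corr[target].drop(target).sort_values(ascending=False)
print(target_corr)
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would draw the heatmap itself.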
**MODELS USED**

| Model                    | MSE_train | R2_train | MSE_test | R2_test |
|--------------------------|-----------|----------|----------|---------|
| Random Forest Regression | 7.79e-03  | 0.977    | 0.0151   | 0.9257  |
| XGBoost Regression       | 1.42e-07  | 1.000    | 0.0165   | 0.919   |
| Decision Tree Regression | 0.000     | 1.000    | 0.0208   | 0.8974  |
| Ridge Regression         | 6.44e-04  | 0.998    | 0.0723   | 0.6439  |
| Elastic Net Regression   | 9.25e-02  | 0.727    | 0.1335   | 0.3428  |
| Linear Regression        | 4.13e-30  | 1.000    | 0.154    | 0.2418  |
| KNN Regression           | 1.01e-01  | 0.703    | 0.1683   | 0.1713  |

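The MSE and R2 columns follow the standard definitions of mean squared error and the coefficient of determination. The sketch below spells them out in plain Python on made-up scores; the notebook presumably uses `sklearn.metrics.mean_squared_error` and `sklearn.metrics.r2_score`, though that is an assumption.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up band scores, for illustration only (not taken from the dataset).
y_true = [6.5, 7.0, 7.5, 8.0]
y_pred = [6.5, 7.5, 7.5, 7.5]

print("MSE:", mse(y_true, y_pred))
print("R2: ", r2(y_true, y_pred))
```

An R2 of 1.0 means the predictions match the targets exactly, while a value near 0 means the model does no better than predicting the mean.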
**WHAT I HAVE DONE**

* Loaded the dataset, which contains 27 entries and 23 features.
* Checked for missing values and cleaned the data accordingly.
* Analyzed the data, found insights, and visualized them.
* Plotted a correlation heatmap to check the relationships between features.
* Examined how individual columns relate to the target variable using plotting libraries, including box plots of their distributions against the target.
* Split the dataset into training and testing sets.
* Applied PCA to reduce the number of features.
* Trained several regression models and computed their MSE and R2 scores.
* Saved each model's scores in a DataFrame for comparison.

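The split, PCA, and training steps could be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: the data is synthetic (27 rows to match the dataset size, 10 numeric features chosen arbitrarily), and the test fraction, the number of PCA components, and the default hyperparameters are all guesses. XGBoost and Elastic Net are left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the cleaned, encoded dataset (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(27, 10))
y = 6.0 + 0.5 * X[:, 0] + rng.normal(scale=0.1, size=27)

# Hold out part of the data for testing (the 20% fraction is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit PCA on the training split only, then apply it to both splits.
pca = PCA(n_components=5)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

models = {
    "Random Forest Regression": RandomForestRegressor(random_state=42),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=42),
    "Ridge Regression": Ridge(),
    "Linear Regression": LinearRegression(),
    "KNN Regression": KNeighborsRegressor(),
}

# Train each model and collect train/test scores in one row per model.
rows = []
for name, model in models.items():
    model.fit(X_train_pca, y_train)
    pred_train = model.predict(X_train_pca)
    pred_test = model.predict(X_test_pca)
    rows.append({
        "Model": name,
        "MSE_train": mean_squared_error(y_train, pred_train),
        "R2_train": r2_score(y_train, pred_train),
        "MSE_test": mean_squared_error(y_test, pred_test),
        "R2_test": r2_score(y_test, pred_test),
    })

# Best test R2 first, mirroring the ordering of the results table.
results = (
    pd.DataFrame(rows)
    .sort_values("R2_test", ascending=False)
    .reset_index(drop=True)
)
print(results)
```

Fitting PCA on the training split alone keeps information from the test rows out of the transformation, which matters when the test set is only a handful of rows.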
**LIBRARIES NEEDED**

1. Pandas
2. Matplotlib
3. scikit-learn
4. NumPy
5. XGBoost
6. TensorFlow
7. Keras
8. SciPy
9. Seaborn

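**CONCLUSION**

- Random Forest and XGBoost Regression perform best on the test set, with the lowest MSE (0.0151 and 0.0165) and the highest R2 (0.9257 and 0.919).
- Decision Tree Regression fits the training set perfectly (R2 = 1.000) and scores lower on the test set (R2 = 0.8974), which suggests some overfitting.
- Linear Regression shows a much larger gap (R2 of 1.000 on the training set versus 0.2418 on the test set), a clearer sign of overfitting.
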
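One simple way to compare overfitting across models is the gap between training and test R2. The sketch below computes it from the R2 values in the MODELS USED table; treating the gap as an overfitting indicator is a rule of thumb added here, not something the notebook is known to do.

```python
# R2 scores copied from the MODELS USED table: (R2_train, R2_test).
scores = {
    "Random Forest Regression": (0.977, 0.9257),
    "XGBoost Regression": (1.000, 0.919),
    "Decision Tree Regression": (1.000, 0.8974),
    "Ridge Regression": (0.998, 0.6439),
    "Elastic Net Regression": (0.727, 0.3428),
    "Linear Regression": (1.000, 0.2418),
    "KNN Regression": (0.703, 0.1713),
}

# Train-minus-test R2; a larger gap means the model generalizes worse.
gaps = {name: round(train - test, 4) for name, (train, test) in scores.items()}

# Largest gap first.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)
for name, gap in ranked:
    print(f"{name:<26} gap = {gap:.4f}")
```

A small gap alone does not make a model good: a model can have a modest gap and still score poorly on both sets, so the gap is best read alongside the test R2.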
**YOUR NAME**

*Avdhesh Varshney*

[![LinkedIn](https://img.shields.io/badge/linkedin-%230077B5.svg?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/avdhesh-varshney/) [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Avdhesh-Varshney)

@@ -0,0 +1,7 @@
numpy==1.19.2
pandas==1.4.3
matplotlib==3.7.1
scikit-learn~=1.0.2
scipy==1.5.0
seaborn==0.10.1
xgboost~=1.5.2