Projects developed as part of Udacity's Machine Learning Engineer Nanodegree.
- View Jupyter Notebook or Go to project directory
- Explored the dataset to identify which features best predict a passenger's survival
- Used those features to create decision functions to predict the survival of the passengers
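
A hand-written decision function of the kind described above might look like the following minimal sketch. The field names (`Sex`, `Age`), the age threshold, and the sample records are assumptions made for illustration; they are not taken from the notebook.

```python
def predict_survival(passenger):
    """Return 1 (survived) or 0 (did not survive) for one passenger record."""
    # Hypothetical rule: predict survival for women and for young children.
    if passenger["Sex"] == "female":
        return 1
    if passenger["Age"] is not None and passenger["Age"] < 10:
        return 1
    return 0


def accuracy(records, outcomes):
    """Fraction of records whose predicted outcome matches the true outcome."""
    predictions = [predict_survival(record) for record in records]
    correct = sum(p == y for p, y in zip(predictions, outcomes))
    return correct / len(outcomes)


# Made-up records for illustration only (not real dataset rows).
passengers = [
    {"Sex": "female", "Age": 29.0},
    {"Sex": "male", "Age": 4.0},
    {"Sex": "male", "Age": 40.0},
    {"Sex": "male", "Age": None},  # missing age
]
survived = [1, 1, 0, 1]

print(accuracy(passengers, survived))
```

Measuring accuracy after each added rule is one way to check whether a feature actually improves the prediction.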
- View Jupyter Notebook or Go to project directory
- Measured the linear correlation between features and selling price using Pearson's r
- Observed the effect of different training and testing splits on model performance
- Used learning and complexity curves to detect underfitting and overfitting
- Combined grid search with cross-validation to find the optimal maximum depth for a decision tree regressor
- The final model achieved an R^2 score of 0.77
- Discussed the model's applicability in a real-world scenario
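
The two metrics named above, Pearson's r and the R^2 score, can be computed from their definitions. This is a dependency-free sketch with made-up numbers; a notebook would more likely call a library routine, and the function names here are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values: a feature that is perfectly linear in the target.
feature = [1, 2, 3, 4, 5]
price = [2, 4, 6, 8, 10]
print(pearson_r(feature, price))

# Made-up true values and predictions.
print(r2_score([3, -0.5, 2, 7], [2.5, 0.0, 2, 8]))
```

An R^2 of 1 means the predictions explain all of the variance in the target, while 0 means the model does no better than always predicting the mean.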
- View Jupyter Notebook or Go to project directory
- Transformed the SMS messages using the Bag-of-Words model
- Generated features based on the frequency of each word
- Classified SMS messages as spam or not spam with a Naive Bayes model
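
The Bag-of-Words and Naive Bayes steps above can be sketched without any libraries. The version below is a multinomial Naive Bayes with add-one (Laplace) smoothing; the whitespace tokenizer, the `ham` label name, and the four training messages are assumptions for illustration, not the project's data or code.

```python
from collections import Counter
from math import log


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(messages, labels):
    """Count word frequencies per class (the Bag-of-Words representation)."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def predict(text, model):
    """Return the class with the highest log posterior for the message."""
    word_counts, class_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = log(class_counts[label] / total_messages)  # log prior
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities.
            score += log((word_counts[label][word] + 1) / (n_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages for illustration only.
messages = [
    "win a free prize now",
    "free cash offer",
    "are we meeting today",
    "see you at lunch today",
]
labels = ["spam", "spam", "ham", "ham"]
model = train(messages, labels)

print(predict("free prize", model))
print(predict("meeting at lunch", model))
```

Working in log space keeps the product of many small probabilities from underflowing on longer messages.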
- View Jupyter Notebook or Go to project directory
- Evaluated the performance of different supervised algorithms in identifying individuals making more than $50,000
- Preprocessed the data by scaling numerical features, one-hot encoding categorical features, and applying logarithmic transformations to features with skewed distributions
- Built a pipeline to quickly evaluate the performance of different algorithms
- Analyzed AdaBoost's performance in relation to the maximum number of estimators
- Identified the top 5 most important features and analyzed the effects of feature selection on AdaBoost's performance
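
The three preprocessing steps listed above can be illustrated with plain Python. This is a sketch under assumptions: the column contents are invented, min-max scaling is one of several possible scaling choices, and the helper names are hypothetical.

```python
from math import log1p


def log_transform(values):
    """Compress a right-skewed feature; log1p(v) = log(v + 1) handles zeros."""
    return [log1p(v) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode each category as a 0/1 vector, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


# Made-up skewed numerical feature and categorical feature.
gains = [0, 0, 5000, 99999]
scaled = min_max_scale(log_transform(gains))
print(scaled)

rows, categories = one_hot(["Private", "State-gov", "Private"])
print(categories, rows)
```

Applying the logarithm before scaling keeps a few very large values from squeezing the rest of the feature into a narrow band near zero.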
- View Jupyter Notebook or Go to project directory
- Applied PCA to identify customer spending patterns and to reduce the dimensionality of the data
- Compared K-means and a Gaussian Mixture Model to decide which is more suitable for grouping customers into segments
- Performed a silhouette analysis to determine the optimal number of components for the Gaussian Mixture Model
- Designed an A/B test to measure the effect of a delivery service change on each customer segment
- Trained a classifier on customer segment data to label new customers based on their estimated spending
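
The silhouette analysis mentioned above scores how well each point fits its assigned cluster. The sketch below computes the mean silhouette coefficient from its definition for any set of cluster labels, however they were produced; the points and labelings are made up, and it assumes Euclidean distance and at least two clusters.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires >= 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        # a: mean distance to the other points in the same cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Made-up 2-D points forming two obvious groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_labels = [0, 0, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1]   # splits each group across clusters

print(silhouette_score(points, good_labels))
print(silhouette_score(points, bad_labels))
```

Scores close to 1 indicate compact, well-separated clusters, so comparing the score across candidate numbers of components is one way to choose among them.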