Machine Learning Engineer Nanodegree projects

ocpodariu/udacity-mlnd

Machine Learning Engineer Nanodegree

Projects developed as part of Udacity's Machine Learning Engineer Nanodegree.

Projects

0. Titanic Survival Exploration

  • View Jupyter Notebook or Go to project directory
  • Explored the dataset to identify which features best predict a passenger's survival
  • Used those features to create decision functions to predict the survival of the passengers
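
A decision function of this kind can be sketched in a few lines of plain Python. The rule below (predict survival for females and for males under 10) and the field names `Sex` and `Age` are assumptions chosen for illustration; the notebook's actual rules may differ, and the sample records are made up.

```python
def predict_survival(passenger):
    """Return 1 (survived) or 0 (did not survive) for one passenger record."""
    # Hypothetical rule: females are predicted to survive, as are males under 10.
    if passenger["Sex"] == "female":
        return 1
    age = passenger.get("Age")
    if age is not None and age < 10:
        return 1
    return 0


def accuracy(truth, predictions):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(truth, predictions))
    return matches / len(truth)


# Made-up records for illustration only; these are not rows from the dataset.
passengers = [
    {"Sex": "female", "Age": 29},
    {"Sex": "male", "Age": 4},
    {"Sex": "male", "Age": 40},
    {"Sex": "male", "Age": None},  # a missing age falls through to "did not survive"
]
survived = [1, 1, 0, 1]

predictions = [predict_survival(p) for p in passengers]
print(predictions)
print(accuracy(survived, predictions))
```

Comparing the accuracy of several such hand-written rules is one simple way to see which features carry the most predictive signal.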

1. Predicting Boston Housing Prices

  • View Jupyter Notebook or Go to project directory
  • Measured the linear correlation between features and selling price using Pearson's r
  • Observed the effect of different training and testing splits on model performance
  • Used learning and complexity curves to detect underfitting and overfitting
  • Combined Grid Search with cross-validation to find the optimal maximum depth for a decision tree regressor
  • Final model obtained an R^2 score of 0.77
  • Discussed the model's applicability in a real-world scenario
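
The two metrics named above, Pearson's r and R^2, can be written out directly from their definitions. This is a minimal standard-library sketch, not the notebook's code; the function names and the sample numbers are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up numbers: a feature that rises with price, and some model predictions.
rooms = [4, 5, 6, 7, 8]
prices = [10.0, 14.0, 15.0, 21.0, 25.0]
predicted = [11.0, 13.0, 16.0, 20.0, 25.0]

print(pearson_r(rooms, prices))
print(r2_score(prices, predicted))
```

An R^2 of 1 means the predictions match the targets exactly, while a model that always predicts the mean of the targets scores 0.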

2. SMS Spam Classification

  • View Jupyter Notebook or Go to project directory
  • Transformed the SMS messages using the Bag-of-Words model
  • Generated features based on the frequency of each word
  • Classified SMS messages as spam or not spam with a Naive Bayes model
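
The Bag-of-Words and Naive Bayes steps can be illustrated without any library. The sketch below assumes whitespace tokenization, a multinomial model, and Laplace (add-one) smoothing, none of which is confirmed by the project description; the training messages are invented.

```python
import math
from collections import Counter


def tokenize(message):
    """Lowercase a message and split it on whitespace."""
    return message.lower().split()


def train_naive_bayes(messages, labels):
    """Count word frequencies per class (the Bag-of-Words step) and class sizes."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for message, label in zip(messages, labels):
        word_counts[label].update(tokenize(message))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def classify(message, word_counts, class_counts, vocabulary):
    """Return the class with the highest log posterior, using add-one smoothing."""
    total_messages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(class_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in tokenize(message):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training messages, for illustration only.
messages = [
    "win a free prize now",
    "free cash win",
    "are we meeting for lunch",
    "see you at lunch tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(messages, labels)
print(classify("free prize", *model))
print(classify("lunch tomorrow", *model))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.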

3. Finding Donors for CharityML

  • View Jupyter Notebook or Go to project directory
  • Evaluated the performance of different supervised algorithms in identifying individuals earning more than $50,000
  • Preprocessed the data by scaling numerical features, one-hot encoding categorical features, and applying logarithmic transformations to features with skewed distributions
  • Built a pipeline to quickly evaluate the performance of different algorithms
  • Analyzed AdaBoost's performance in relation to the maximum number of estimators
  • Identified the top 5 most important features and analyzed the effects of feature selection on AdaBoost's performance
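
The three preprocessing steps listed above can be sketched in plain Python. The choice of `log(x + 1)` and min-max scaling is an assumption, since the description does not say which log transform or scaler was used; the sample values and the helper names are hypothetical.

```python
import math


def log_transform(values):
    """Apply log(x + 1) to compress a right-skewed, non-negative feature."""
    return [math.log(v + 1) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - low) / (high - low) for v in values]


def one_hot_encode(values):
    """Map each category to a 0/1 indicator row, with categories sorted by name."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical sample values, for illustration only.
skewed_feature = [0, 0, 5000, 99999]
categorical_feature = ["Private", "State-gov", "Private"]

scaled = min_max_scale(log_transform(skewed_feature))
categories, encoded = one_hot_encode(categorical_feature)

print(scaled)
print(categories, encoded)
```

Applying the log transform before scaling keeps a handful of very large values from squeezing the rest of the feature into a narrow band near zero.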

4. Creating Customer Segments

  • View Jupyter Notebook or Go to project directory
  • Applied PCA to identify customer spending patterns and to reduce the dimensionality of the data
  • Compared K-means and a Gaussian Mixture Model to decide which is more suitable for grouping customers into segments
  • Performed a silhouette analysis to determine the optimal number of components for the Gaussian Mixture Model
  • Designed an A/B test to measure the effect of a delivery service change on each customer segment
  • Trained a classifier on customer segment data to label new customers based on their estimated spending
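
The silhouette analysis mentioned above scores how well each point fits its assigned cluster compared with the nearest other cluster. The sketch below computes the mean silhouette coefficient for a given labeling using Euclidean distance; it is a from-scratch illustration on made-up points, not the notebook's code. In practice the candidate labelings would come from fitting the clustering model with different numbers of components.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires >= 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum)
        a = sum(euclidean(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Made-up 2-D points forming two obvious groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette_score(points, [0, 0, 1, 1])  # matches the visible groups
bad = silhouette_score(points, [0, 1, 0, 1])   # mixes the groups

print(good, bad)
```

Scores close to 1 indicate compact, well-separated clusters, so the labeling with the highest mean score is the usual choice.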
