Welcome to my data portfolio!
I am a results-oriented data scientist with a strong academic background and practical experience in data analysis and engineering. I am proficient in Python and SQL, and I have hands-on experience with machine learning and deep learning frameworks such as PyTorch and TensorFlow. My goal is to use data-driven insights to tackle complex challenges and support strategic decision-making.
Below are details of the projects I've worked on:
In this project, I conducted an exploratory data analysis (EDA) of the electric vehicle (EV) registrations recorded each month by the Washington State Department of Licensing (DOL). The analysis aimed to uncover trends, identify missing or inconsistent data, and provide insights into the adoption of electric vehicles over time.
- Data Overview: This dataset shows the number of vehicles registered each month, segmented by county and vehicle type.
- Data Quality Checks: Identified missing values and inconsistent counts.
- Key Findings: Trends in EV registrations over time and by county.
- Conclusion: Insights into the growth of EV adoption in Washington State despite data challenges.
- Python for data analysis and scripting.
- Pandas, Matplotlib, and Seaborn for data manipulation and visualization.
https://github.com/VaishDeshpande234/Electric_Vehicle_Registrations_Project
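As a rough illustration of the data quality checks in the EV registrations project, the following is a minimal pandas sketch. The column names (`date`, `county`, `vehicle_type`, `ev_count`, `total_count`) and the sample rows are hypothetical stand-ins, not the actual DOL schema or the code in the repository. The "inconsistent count" rule shown (more EVs than total vehicles in a row) is one assumed example of such a check.

```python
import pandas as pd

# Hypothetical sample; the real DOL columns and values may differ.
df = pd.DataFrame({
    "date": ["2023-01-31", "2023-01-31", "2023-02-28", "2023-02-28"],
    "county": ["King", "Pierce", "King", None],
    "vehicle_type": ["Passenger", "Truck", "Passenger", "Passenger"],
    "ev_count": [120, 15, 135, 20],
    "total_count": [1000, 10, 1100, 400],
})
df["date"] = pd.to_datetime(df["date"])

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Assumed consistency rule: EVs cannot exceed total vehicles in a row.
inconsistent = df[df["ev_count"] > df["total_count"]]

# Monthly EV totals, a starting point for the trend analysis.
monthly_ev = df.groupby(df["date"].dt.to_period("M"))["ev_count"].sum()

print(missing_per_column)
print(inconsistent)
print(monthly_ev)
```

Rows flagged in `inconsistent` can then be inspected by hand before deciding whether to drop or correct them.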
The goal of this project is to predict the Remaining Useful Life (RUL) of turbofan engines from sensor data. Predictive maintenance estimates when an engine is likely to fail, so that maintenance can be scheduled in time to prevent failures and make better use of maintenance resources.
- Data Preprocessing: Loaded and cleaned the training, test, and RUL datasets.
- Feature Engineering: Calculated RUL for the training dataset and prepared the test dataset.
- Model Training and Evaluation: Trained Random Forest, Gradient Boosting, and LSTM models. Evaluated models using RMSE and MAE metrics.
- Visualizations: Created scatter plots comparing predicted and ground truth RUL, histograms of engine cycle distributions, and correlation matrices of features to analyze model performance and data relationships.
- Python for data analysis and scripting.
- Pandas, Matplotlib, and Seaborn for data manipulation and visualization.
- Scikit-learn for machine learning models.
- Keras for the LSTM model.
https://github.com/VaishDeshpande234/Predictive-Maintenance
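For the predictive maintenance project, the RUL labeling and the RMSE and MAE metrics can be sketched as follows. This assumes run-to-failure training data in which each engine's last recorded cycle is its failure point, so RUL is that last cycle minus the current cycle. The column names `engine_id` and `cycle`, the toy data, and the predictions are hypothetical and are not taken from the repository.

```python
import numpy as np
import pandas as pd

# Hypothetical run-to-failure data: engine 1 runs 3 cycles, engine 2 runs 2.
train = pd.DataFrame({
    "engine_id": [1, 1, 1, 2, 2],
    "cycle": [1, 2, 3, 1, 2],
})

# Assumed labeling rule: RUL = last observed cycle of the engine - current cycle.
last_cycle = train.groupby("engine_id")["cycle"].transform("max")
train["rul"] = last_cycle - train["cycle"]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Made-up predictions, used only to exercise the metric functions.
y_pred = [2.0, 1.0, 1.0, 1.0, 1.0]

print(train)
print("RMSE:", rmse(train["rul"], y_pred))
print("MAE:", mae(train["rul"], y_pred))
```

RMSE penalizes large errors more heavily than MAE, which is why reporting both gives a fuller picture of how far predictions drift from the true RUL.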
The goal of this project is to perform sentiment analysis on a large dataset of tweets to classify them as positive or negative. Sentiment analysis helps in understanding the sentiment of users towards specific topics, brands, or events, enabling better decision-making and strategy formulation.
- Data Preprocessing and EDA: Loaded the dataset, removed unnecessary columns, remapped the sentiment values to make them easier to interpret, and created word clouds for negative and positive tweets.
- Feature Engineering: Preprocessed text data by converting text to lowercase, replacing URLs, emojis, and usernames with placeholders, removing non-alphanumeric characters and stopwords, and lemmatizing words. Converted text data into numerical features using TF-IDF.
- Model Training and Evaluation: Split the data into training and test sets. Trained three models (Bernoulli Naive Bayes, LinearSVC, and Logistic Regression) and evaluated them using precision, recall, F1-score, and a confusion matrix.
- Results: Logistic Regression performed best of the three models, with an accuracy of 0.83.
- Visualizations:
  - Word Cloud for Negative Tweets: Visual representation of the most frequent words in negative tweets.
  - Word Cloud for Positive Tweets: Visual representation of the most frequent words in positive tweets.
  - Confusion Matrix: Heatmap of the confusion matrix showing the performance of the models in terms of true positives, false positives, false negatives, and true negatives.
- Python for data analysis and scripting.
- Pandas, Matplotlib, and Seaborn for data manipulation and visualization.
- Scikit-learn for machine learning models.
- NLTK for natural language processing.
https://github.com/VaishDeshpande234/Sentiment-Analysis
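The text cleaning steps in the sentiment analysis project can be approximated with the standard library alone. The sketch below is illustrative, not the repository's code: the placeholder tokens (`url`, `user`, `emojismile`, `emojisad`), the two-entry emoticon map, and the tiny stopword set are all hypothetical, and lemmatization is left out because it needs NLTK's WordNet data.

```python
import re

URL_PATTERN = re.compile(r"(https?://\S+|www\.\S+)")
USER_PATTERN = re.compile(r"@\w+")
NON_ALNUM_PATTERN = re.compile(r"[^a-z0-9\s]")

# Tiny illustrative subsets; a real run would use much fuller lists.
EMOTICONS = {":)": "emojismile", ":(": "emojisad"}
STOPWORDS = {"a", "an", "the", "is", "at", "to", "and", "this", "i"}


def preprocess(tweet: str) -> str:
    """Lowercase, insert placeholders, strip symbols, and drop stopwords."""
    text = tweet.lower()
    # Replace URLs first so their punctuation is not treated as emoticons.
    text = URL_PATTERN.sub(" url ", text)
    for emoticon, placeholder in EMOTICONS.items():
        text = text.replace(emoticon, f" {placeholder} ")
    text = USER_PATTERN.sub(" user ", text)
    # Remove everything that is not a lowercase letter, digit, or whitespace.
    text = NON_ALNUM_PATTERN.sub(" ", text)
    tokens = [token for token in text.split() if token not in STOPWORDS]
    return " ".join(tokens)


print(preprocess("@bob I LOVE this phone :) see https://t.co/abc!"))
```

The cleaned strings would then be passed to a TF-IDF vectorizer before model training, as described in the feature engineering step.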
I am currently expanding my portfolio with more projects in data science and machine learning. Stay tuned for updates!
Feel free to reach out to me via LinkedIn (https://www.linkedin.com/in/vaishnavi-deshpande-477392297/) or email (deshpandevaish2310@gmail.com).