This repository showcases my individual efforts in a data science project aimed at enhancing road safety, with a focus on vulnerable road users like pedestrians, cyclists, and motorcyclists. Through extensive data analysis and exploration, I aim to derive actionable insights and propose practical recommendations to improve road safety globally.
This project comprises six milestones, of which I have successfully completed four:
- Milestone 1: Data Preparation
  - Splitting the dataset into training and testing subsets.
  - Formatting the data so that it is compatible with the analysis tools.
- Milestone 2: Data Ethics, Pre-Processing, and Exploration
  - Examining the dataset for potential biases and ethical considerations.
  - Preprocessing the data to handle missing values and outliers.
  - Exploring the dataset to gain insights into road safety trends and patterns.
- Milestone 3: Time Series Analysis
  - Analyzing temporal patterns in motor vehicle collision data.
  - Using time series analysis techniques to forecast future collision trends.
- Milestone 4: Geospatial Analysis
  - Conducting geospatial analysis to understand the spatial distribution of collisions.
  - Visualizing collision hotspots on maps to identify high-risk areas.
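The completed milestones can be illustrated with a minimal, dependency-free sketch. It is not the code used in this repository: the records are synthetic, the field names (`crash_date`, `lat`, `lon`) and all function names are hypothetical, and the moving-average forecast is only a simple baseline standing in for whatever time series technique the notebooks actually use.

```python
import random
from collections import Counter
from datetime import date, timedelta


def make_sample(n=200, seed=0):
    """Build synthetic collision records (hypothetical fields, not real data)."""
    rng = random.Random(seed)
    start = date(2022, 1, 1)
    return [
        {
            "crash_date": start + timedelta(days=rng.randrange(365)),
            "lat": 40.60 + rng.random() * 0.25,
            "lon": -74.05 + rng.random() * 0.30,
        }
        for _ in range(n)
    ]


def chronological_split(rows, test_fraction=0.2):
    """Milestone 1: hold out the most recent records as the test subset."""
    ordered = sorted(rows, key=lambda r: r["crash_date"])
    n_test = round(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


def monthly_counts(rows):
    """Milestone 3: count collisions per (year, month)."""
    return Counter((r["crash_date"].year, r["crash_date"].month) for r in rows)


def moving_average_forecast(counts, window=3):
    """Milestone 3: baseline forecast, the mean of the last `window` months."""
    ordered = [counts[key] for key in sorted(counts)]
    recent = ordered[-window:]
    return sum(recent) / len(recent)


def grid_hotspots(rows, cell_deg=0.05, top=3):
    """Milestone 4: bin coordinates into a lat/lon grid and rank the cells."""
    cells = Counter(
        (int(r["lat"] // cell_deg), int(r["lon"] // cell_deg)) for r in rows
    )
    return cells.most_common(top)


rows = make_sample()
train, test = chronological_split(rows)
counts = monthly_counts(train)
forecast = moving_average_forecast(counts)
hotspots = grid_hotspots(train)
```

Splitting by date rather than at random keeps future collisions out of the training subset, which matters when the goal is forecasting. The grid binning is a rough stand-in for a map-based hotspot view; a real analysis would more likely rely on a geospatial library.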
Work is in progress for the remaining two milestones:
- Milestone 5: Self-Guided Research Question (Work in Progress)
- Milestone 6: Virtual Poster Board Creation: Data Storytelling (Work in Progress)
Using the New York City OpenData transportation dataset, I apply data science tools and techniques to derive data-driven insights for improving road safety. Analyzing the motor vehicle collision data also exposes potential biases within the data, which is crucial for ethical data science practice. In addition, I create informative data visualizations and develop analytical models aimed at real-world deployment.
- NYC OpenData Motor Vehicle Collisions – Crashes Dataset
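
As a rough illustration of working with this dataset, the sketch below totals pedestrian and cyclist injuries per borough from a few made-up rows. The column names are assumptions modeled on the public CSV export and should be checked against the current data dictionary; the sample rows and the helper names (`to_int`, `injuries_by_borough`) are hypothetical. Rows with a blank borough are kept under `UNKNOWN` instead of being dropped, so that missing location data stays visible when looking for bias.

```python
import csv
import io

# Made-up rows; the column names are assumed, not verified against the live dataset.
SAMPLE_CSV = """CRASH DATE,BOROUGH,NUMBER OF PEDESTRIANS INJURED,NUMBER OF CYCLIST INJURED
01/05/2022,BROOKLYN,1,0
01/06/2022,,0,1
01/07/2022,QUEENS,,0
01/08/2022,BROOKLYN,2,1
"""


def to_int(value):
    """Parse a count, treating a blank cell as 0 (a simplifying assumption)."""
    value = (value or "").strip()
    return int(value) if value else 0


def injuries_by_borough(csv_text):
    """Sum pedestrian and cyclist injuries per borough, keeping unknowns."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        borough = row["BOROUGH"].strip() or "UNKNOWN"
        entry = totals.setdefault(borough, {"pedestrians": 0, "cyclists": 0})
        entry["pedestrians"] += to_int(row["NUMBER OF PEDESTRIANS INJURED"])
        entry["cyclists"] += to_int(row["NUMBER OF CYCLIST INJURED"])
    return totals


totals = injuries_by_borough(SAMPLE_CSV)
for borough, entry in sorted(totals.items()):
    print(borough, entry)
```

Treating blank injury counts as zero is convenient but can understate harm; counting blanks separately is a reasonable alternative.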