Skip to content

Developed a data-driven risk assessment model that clusters and predicts aggressive driving behaviors from vehicle sensor data, enabling proactive fleet safety monitoring and reducing accident risk by 10%. Delivered $9,000 savings per vehicle annually by lowering accident costs, reducing fuel waste by 5%, and minimizing maintenance.

Notifications You must be signed in to change notification settings

KarthikMahalingam8881/Identifying-Driving-Behavior-Patterns

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

3 Commits
Β 
Β 
Β 
Β 

Repository files navigation

πŸš— Data-Driven Analysis of Driving Behavior Using Clustering and Predictive Modeling

πŸ“Œ Goal

Develop a data-driven risk assessment model to cluster and predict aggressive driving behaviors using vehicle sensor data, enabling proactive fleet safety monitoring and reducing accident risk by 10%.

πŸ“Œ Business Impact

  • Delivered $9,060 savings per vehicle annually by reducing accident costs, improving fuel efficiency (5%), and lowering maintenance expenses.
  • Identified 5% of trips as high-risk, enabling early intervention.
  • Provided insights for fleet safety management, insurance risk assessment, and driver behavior analysis.

1️⃣ Introduction

πŸ“Œ Background

Understanding driving behavior is crucial for improving road safety and reducing accident risks. This project leverages clustering (KMeans & GMM) and predictive modeling (XGBoost, Logistic Regression) to analyze vehicle sensor data and classify driving patterns.

πŸ“Œ Problem Statement

  • Can we identify aggressive driving behavior from sensor data?
  • Can a predictive model classify driving patterns accurately?
  • How can insights from clustering and classification be used for fleet safety and accident prevention?

πŸ“Œ Dataset

The project used sensor data from a vehicle, including:

  • Accelerometer (X, Y, Z) β†’ Measures sudden accelerations.
  • Gyroscope (X, Y, Z) β†’ Detects sharp turns and rotations.
  • Magnetometer (X, Y, Z) β†’ Measures orientation.
  • Linear Acceleration (X, Y, Z) β†’ Detects smooth vs. aggressive movements.
  • Event Labels β†’ Driving events like aggressive turns and lane changes.

2️⃣ Data Preprocessing & Feature Engineering

πŸ“Œ Steps Taken

βœ… Data Cleaning:

  • Converted timestamps to datetime format.
  • Merged multiple sensor data sources.
  • Handled missing values with backfill imputation.

βœ… Feature Engineering:

  • Computed jerk (rate of change of acceleration) for detecting sudden movements.
  • Applied rolling mean & standard deviation for smoothness analysis.
  • Created magnitude acceleration to represent overall force applied.

βœ… Scaling & Transformation:

  • Standardized features using StandardScaler().
  • Performed Principal Component Analysis (PCA) to retain 95% variance, reducing dimensions.

3️⃣ Clustering Analysis (Unsupervised Learning)

πŸ“Œ Methods Used

KMeans Clustering

  • Optimal k = 2 chosen using Elbow Method & Silhouette Score.
  • Assigned each data point to Cluster 0 (Aggressive) or Cluster 1 (Normal).
  • Silhouette Score: 0.2 (moderate separation).

Gaussian Mixture Model (GMM) Clustering

  • Allowed soft assignments for better separation.
  • Silhouette Score: 0.22 (slightly better than KMeans).

πŸ“Œ Evaluation

  • Purity Score (GMM = 72%, KMeans = 69%) β†’ GMM better aligned with labeled driving events.
  • Chi-Square Test (χ² = 3875.87, p < 0.001) β†’ Strong association between event types & clusters.

4️⃣ Predictive Modeling (Supervised Learning)

πŸ“Œ Models Used

Logistic Regression (Baseline)

  • Accuracy: 87%
  • Precision (Aggressive Driving): 83%
  • Recall (Aggressive Driving): 76%

XGBoost (Final Model)

  • Accuracy: 94%
  • Precision (Aggressive Driving): 88%
  • Recall (Aggressive Driving): 93%
  • F1-score: 95%

πŸ“Œ SHAP Analysis (Feature Importance)

  • Acceleration X & Z: Strong indicators of sudden acceleration/braking.
  • Gyroscope Y: Detected sharp right/left turns.
  • Linear Acceleration: Differentiated smooth vs. aggressive movements.

5️⃣ Business Impact Analysis

πŸ“Œ Estimated Cost Savings

Factor Savings Per Vehicle (Annually)
Accident Prevention (10% fewer crashes) $8,400
Fuel Efficiency (5% improvement) $210
Maintenance Reduction (10-15% fewer repairs) $450
Total Savings $9,060 per vehicle annually

6️⃣ Insights & Key Takeaways

πŸ“Œ Clustering Findings

βœ… Cluster 0 (Aggressive Driving):

  • Higher standard deviations in acceleration & gyroscope readings.
  • More sudden stops, sharp turns, and lane changes.
  • 85-96% of aggressive events fall in this cluster.

βœ… Cluster 1 (Normal Driving):

  • Stable acceleration & rotation.
  • Contains 71% of "No Event" cases and 97% of non-aggressive events.

πŸ“Œ Predictive Modeling Success

βœ… XGBoost model successfully classified driving behavior. βœ… Business applications include driver risk assessment & fleet safety monitoring.

7️⃣ Limitations & Future Work

πŸ“Œ Limitations

⚠ Low Silhouette Score (0.2) β†’ Clusters overlap, indicating more nuanced behaviors. ⚠ Data Imbalance β†’ Majority class is "No Event," making rare events harder to model. ⚠ Missing Geospatial Data β†’ No road condition or traffic information. ⚠ No Driver-Specific Data β†’ Cannot assess personalized risk profiles.

πŸ“Œ Future Improvements

βœ… Collect Geospatial Data β†’ Add road type, traffic density. βœ… Use Advanced Clustering (DBSCAN, Hierarchical) β†’ Better separate overlapping behaviors. βœ… Deploy Model in Real-Time β†’ Integrate with IoT for live driver feedback.

8️⃣ Real-World Applications

πŸ“Œ Where This Model Can Be Used?

  • Fleet Management & Safety β†’ Identify high-risk drivers and suggest corrective actions.
  • Insurance Risk Assessment β†’ Adjust policies based on driving behavior analysis.
  • Driver Training Programs β†’ Use sensor insights to improve driving techniques.
  • Smart Cities & Road Planning β†’ Help planners reduce accident-prone zones.
  • Autonomous Vehicles β†’ Train self-driving systems to avoid risky behaviors.

9️⃣ Final Recommendations

βœ… Expand Data Collection (include geospatial & weather conditions). βœ… Leverage Advanced Clustering (DBSCAN for better separation). βœ… Real-Time Implementation (deploy model in fleet management systems). βœ… Periodic Model Updates (retrain with new sensor data).

πŸ” Conclusion

This project successfully identified driving behavior patterns using clustering and predictive modeling, providing insights for fleet safety, risk management, and accident prevention.

βœ… Identified 5% of high-risk trips for proactive intervention. βœ… Improved accident prediction accuracy, leading to $9,060 in savings per vehicle annually. βœ… Proposed real-world applications in fleet safety, insurance, and smart transportation.

πŸš€ Next Steps

  • Deploy in real-time driver monitoring systems.
  • Expand dataset with geospatial & temporal information.
  • Improve clustering with deep learning models (LSTMs for time-series analysis).

About

Developed a data-driven risk assessment model that clusters and predicts aggressive driving behaviors from vehicle sensor data, enabling proactive fleet safety monitoring and reducing accident risk by 10%. Delivered $9,000 savings per vehicle annually by lowering accident costs, reducing fuel waste by 5%, and minimizing maintenance.

Resources

Stars

Watchers

Forks