San Diego County Collisions Exploratory Analysis

Overview

This project explores how traffic collisions in San Diego County relate to location, time, and policing, using collision data from 2015 to 2019 and police stop data from 2018 to 2019. We examine whether collisions cluster near popular nightlife areas, how collision frequency varies by time and day, and whether police traffic stops show demographic bias.

Table of Contents

  1. Research Questions
  2. Hypotheses
  3. Datasets
  4. Methods
  5. Key Findings
  6. Ethical Considerations
  7. Limitations
  8. Team Members
  9. Acknowledgements

Research Questions

  1. What are the most common types of traffic collisions in San Diego County?
  2. Is there a relationship between areas with a high density of bars and traffic collision frequency?
  3. Which police beats and geographic divisions experience the most severe accidents?
  4. Are there any demographic biases in police traffic stops?

Hypotheses

  1. Minor, non-fatal accidents will be most prevalent.
  2. More collisions will occur near nightlife hotspots (e.g., Pacific Beach, Gaslamp).
  3. Lower-income neighborhoods will experience more severe accidents.
  4. Younger drivers will be stopped and questioned more frequently.
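
One way a hypothesis such as the fourth could be checked is a chi-square goodness-of-fit test that compares observed stop counts per age group with the counts expected from each group's share of the population. The sketch below is only an illustration under that assumption and is not necessarily the test used in this project; the age groups, counts, and population shares are made up, and the function name `chi_square_gof` is hypothetical.

```python
def chi_square_gof(observed, expected_share):
    """Return the chi-square statistic and degrees of freedom.

    observed: dict mapping group name -> observed count of stops
    expected_share: dict mapping group name -> expected proportion (sums to 1)
    """
    total = sum(observed.values())
    stat = 0.0
    for group, obs in observed.items():
        exp = total * expected_share[group]  # expected count for this group
        stat += (obs - exp) ** 2 / exp
    return stat, len(observed) - 1


# Made-up numbers for illustration only; not taken from the police stops dataset.
observed = {"under_25": 30, "25_to_44": 50, "45_plus": 20}
expected_share = {"under_25": 0.20, "25_to_44": 0.40, "45_plus": 0.40}

stat, dof = chi_square_gof(observed, expected_share)
print(f"chi-square = {stat:.2f} with {dof} degrees of freedom")
```

A large statistic relative to the chi-square distribution with the given degrees of freedom would suggest that stops are not distributed in proportion to population share. A statistics library would normally be used to obtain the p-value.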

Datasets

  1. Traffic Collisions (2015-2019): 28,122 observations
  2. Police Stops (2018-2019): 179,725 observations
  3. Yelp Bars: 50 observations
    • Source: Yelp API
  4. Yelp Clubs: 49 observations
    • Source: Yelp API

Methods

  • Geospatial analysis of collision locations relative to nightlife areas
  • Temporal analysis of collision frequency by time and day
  • Demographic analysis of police stops
  • Statistical testing of hypotheses
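
The geospatial and temporal steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the field names (`lat`, `lon`, `date_time`), the 0.5 km radius, and the sample records are all assumptions made for the example.

```python
import math
from collections import Counter
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_near_venues(collisions, venues, radius_km=0.5):
    """Count collisions that lie within radius_km of at least one venue."""
    return sum(
        1
        for c in collisions
        if any(
            haversine_km(c["lat"], c["lon"], v["lat"], v["lon"]) <= radius_km
            for v in venues
        )
    )


def collisions_by_weekday_hour(collisions):
    """Tally collisions per (weekday name, hour of day)."""
    counts = Counter()
    for c in collisions:
        ts = datetime.fromisoformat(c["date_time"])
        counts[(WEEKDAYS[ts.weekday()], ts.hour)] += 1
    return counts


# Made-up records for illustration only; not taken from the project datasets.
venues = [{"name": "bar_a", "lat": 32.7977, "lon": -117.2500}]
collisions = [
    {"lat": 32.7980, "lon": -117.2503, "date_time": "2018-06-16 01:30:00"},
    {"lat": 32.9000, "lon": -117.1000, "date_time": "2018-06-16 01:45:00"},
    {"lat": 32.7975, "lon": -117.2498, "date_time": "2018-06-18 08:10:00"},
]

near = count_near_venues(collisions, venues)
by_time = collisions_by_weekday_hour(collisions)
print(near, by_time.most_common(1))
```

The brute-force pairwise distance check is adequate for venue lists of this size (about 100 venues); a spatial index would be worth considering for much larger inputs.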

Key Findings

  1. Most common violations: Traffic signal and sign violations
  2. Highest collision frequency: Pacific Beach (1,500 collisions)
  3. Severe accidents: Northwestern San Diego (highest average injuries), Southern San Diego (highest average fatalities)
  4. Demographics: Younger people stopped more frequently and for longer durations
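
A per-division severity comparison like the one in finding 3 amounts to averaging injury and fatality counts within each division. The sketch below shows that aggregation with the standard library; the field names (`division`, `injured`, `killed`), the division labels, and the rows are hypothetical and do not reproduce the project's data or results.

```python
from collections import defaultdict


def average_by_division(rows, field):
    """Return the mean of `field` for each division."""
    totals = defaultdict(lambda: [0, 0])  # division -> [running sum, row count]
    for row in rows:
        acc = totals[row["division"]]
        acc[0] += row[field]
        acc[1] += 1
    return {division: s / n for division, (s, n) in totals.items()}


# Made-up rows for illustration only.
rows = [
    {"division": "division_a", "injured": 2, "killed": 0},
    {"division": "division_a", "injured": 1, "killed": 0},
    {"division": "division_b", "injured": 0, "killed": 1},
    {"division": "division_b", "injured": 1, "killed": 0},
]

avg_injured = average_by_division(rows, "injured")
avg_killed = average_by_division(rows, "killed")
top_injury_division = max(avg_injured, key=avg_injured.get)
print(avg_injured, avg_killed, top_injury_division)
```

Because averages from divisions with few collisions can be unstable, it helps to report the number of collisions per division alongside each mean.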

Ethical Considerations

  • Applied the Safe Harbor de-identification method to protect individual privacy
  • Careful interpretation of results to avoid reinforcing stereotypes or biases
  • Consideration of socioeconomic factors in analyzing collision patterns

Limitations

  • Incomplete bar and nightclub data from Yelp API
  • Overlapping violation categories in the dataset
  • Broad geographic divisions may obscure local patterns
  • Limited timeframe (2015-2019) may not capture long-term trends

Team Members

Acknowledgements

This project was completed as part of COGS 108: Data Science in Practice at the University of California, San Diego. We thank our instructors and the San Diego Data Portal for providing resources and data.

For more information on San Diego police beats, visit the San Diego Police Department website.