
This project analyzes crime data from Jefferson County, KY, with a primary focus on auto thefts spanning the years 2020 to 2024. The objective is to process, clean, and visualize crime data to uncover trends, provide insights, and prepare datasets for further reporting. Inspired by the theft of my vehicle, December 2024.


matthewlondon/Capstone


Crime Data Analysis Project

Project Overview

This project analyzes crime data from Jefferson County, KY, spanning the years 2020 to 2024. The main focus is on processing, cleaning, and visualizing crime data, specifically related to auto thefts. The analysis merges crime data with ZIP code information and standardizes fields to prepare the data for insights and reporting.

The motivation for this project stems from a personal experience: my vehicle was stolen during the first week of December 2024. I became curious about local theft rates and how common auto theft was in my area, so I structured the project around this issue.

Data Sources

Feature 1: Read TWO data files (JSON, CSV, Excel, etc.).
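As a rough illustration of this feature, the two inputs could be read with pandas along the following lines. This is a minimal sketch, not the code in preprocessing.py: the function name `load_sources` and the file paths in the comment are hypothetical, and it assumes both files are CSVs with a `zip` column.

```python
import pandas as pd


def load_sources(crime_path, zip_path):
    """Read the crime file and the ZIP code file into two DataFrames.

    Hypothetical helper: assumes both inputs are CSV files with a `zip` column.
    """
    # Read ZIP codes as text so they are treated as labels, not numbers
    crime_df = pd.read_csv(crime_path, dtype={"zip": str})
    zip_df = pd.read_csv(zip_path, dtype={"zip": str})
    return crime_df, zip_df


# Example call (file names are placeholders, not the repository's actual files):
# crime_df, zip_df = load_sources("data/raw_data/crime.csv", "data/raw_data/zips.csv")
```

Reading `zip` as a string in both files keeps the merge key consistent between the two data sets.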

File Structure

  • main.py: The main script that orchestrates data processing.

  • preprocessing.py: Contains modular functions for loading, cleaning, and processing data.

  • data/raw_data/: Folder containing raw crime data files.

  • data/processed_data/: Folder containing processed data outputs.

Feature 2: Clean your data and perform a pandas merge with your two data sets, then calculate some new values based on the new data set.
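The merge step might look roughly like the sketch below. It uses column names from the data dictionary, but every row value is invented for illustration, and the exact cleaning done in preprocessing.py may differ. The ZIP table's integer key is cast to text first so it matches the string key on the crime side.

```python
import pandas as pd

# Toy rows; column names follow the data dictionary, values are invented
crime_df = pd.DataFrame({
    "zip": ["40202", "40202", "40214"],
    "incident_number": ["A1", "A2", "A3"],
})
zip_df = pd.DataFrame({
    "zip": [40202, 40214],
    "irs_estimated_population": [7000, 46000],
})

# Align key dtypes: convert the integer ZIP codes to 5-character strings
zip_df["zip"] = zip_df["zip"].astype(str).str.zfill(5)

# Left merge keeps every incident; validate guards against duplicate ZIP rows
merged = crime_df.merge(
    zip_df, on="zip", how="left", validate="many_to_one", indicator=True
)

# Incidents whose ZIP code was not found in the ZIP table
unmatched = merged[merged["_merge"] == "left_only"]

# A new calculated value: number of incidents recorded in each row's ZIP code
merged["incidents_in_zip"] = merged.groupby("zip")["incident_number"].transform("count")
```

The `indicator=True` column makes it easy to inspect incidents that fall outside the ZIP table before dropping or keeping them.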

Steps to Reproduce

  1. Clone the repository. Open a new Git Bash terminal in your code editor and run:
git clone https://github.com/matthewlondon/Capstone.git
  2. Set up a virtual environment:
cd Capstone
python -m venv venv
source venv/bin/activate

If using Windows (Git Bash):

source venv/Scripts/activate

Feature 4: Utilize a virtual environment and include instructions in your README on how the user should set one up.

  3. Install dependencies:
pip install -r requirements.txt
  4. Run the main.py script:
python src/main.py
  5. Open the exploration.ipynb file.

  6. Run all cells to view the matplotlib and seaborn visualizations:

  • Most Common Days for Reported Auto Thefts
  • Auto Theft Trends Over the Years
  • Top 10 ZIP Codes for Auto Thefts
  • Trends by Day of Week and Top 10 Locations
  • Yearly Trends by ZIP Code
  • Personal Analysis of My Car Theft Incident
  • Incidents per 100 People by ZIP Code
Troubleshooting: You may encounter a "cannot resolve (library) from source" warning. Check whether the selected Python interpreter is the base Python or venv(python). If it is not venv(python), select venv(python) as the kernel. If venv(python) does not appear in the dropdown, clear the cache and reload the window: in VS Code, press Ctrl + Shift + P, search for "Python: Clear Cache and Reload Window", then select venv(python) as the kernel.
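One quick way to confirm which interpreter a notebook cell or script is actually using is the standard-library check below. This is an illustrative sketch (the helper name `in_virtualenv` is made up here) and is not part of the repository.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Running inside a virtual environment:", in_virtualenv())
```

If the printed interpreter path does not sit inside the project's venv folder, the kernel selection is the likely cause of the unresolved imports.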

Feature 3: Make 3 matplotlib or seaborn (or another plotting library) visualizations to display your data.
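For a sense of what one of these charts involves, here is a minimal matplotlib sketch of the "Most Common Days for Reported Auto Thefts" plot. The rows are invented, and it assumes `week_day_reported` holds full English day names; the notebook's actual plotting code and styling may differ.

```python
import matplotlib.pyplot as plt
import pandas as pd

DAY_ORDER = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# Invented sample rows; the real data comes from cleaned_crime_data.csv
df = pd.DataFrame({"week_day_reported": ["Monday", "Friday", "Monday", "Sunday"]})

# Count incidents per day and keep calendar order, filling missing days with 0
counts = df["week_day_reported"].value_counts().reindex(DAY_ORDER, fill_value=0)

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(counts.index, counts.values)
ax.set_title("Most Common Days for Reported Auto Thefts")
ax.set_xlabel("Day reported")
ax.set_ylabel("Number of incidents")
fig.tight_layout()
```

Reindexing against `DAY_ORDER` keeps the bars in weekday order rather than sorted by count, which makes day-to-day comparisons easier to read.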

Processed Data Outputs

  • cleaned_crime_data.csv: Contains the cleaned and processed crime data.
  • jefferson_zip_df.csv: Contains filtered ZIP code data for Jefferson County, KY.

Data Dictionary

Cleaned Crime Data

| Column Name | Description | Data Type |
| --- | --- | --- |
| zip | ZIP code of the crime location. | String |
| incident_number | Unique identifier for each incident. | String |
| date_reported | Date when the crime was reported. | DateTime |
| date_occurred | Date when the crime occurred. | DateTime |
| offense_classification | Standardized offense classification (e.g., AUTO THEFT). | Category |
| location_category | Category describing the crime location. | Category |
| was_offense_completed | Whether the offense was completed (YES/NO/UNKNOWN). | Category |
| value_range | Estimated monetary range associated with the offense. | Category |
| week_day_reported | Day of the week the crime was reported. | Category |
| week_day_occurred | Day of the week the crime occurred. | Category |
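The data types listed for the cleaned crime data could be applied with pandas roughly as follows. This is a sketch with invented rows, assuming the weekday columns are derived from the two date columns; it is not necessarily how preprocessing.py does it.

```python
import pandas as pd

# Invented sample rows using a subset of the columns in the data dictionary
raw = pd.DataFrame({
    "zip": [40202, 40214],
    "date_reported": ["2024-12-03", "2023-07-15"],
    "date_occurred": ["2024-12-02", "2023-07-14"],
    "offense_classification": ["AUTO THEFT", "AUTO THEFT"],
})

cleaned = raw.copy()

# zip is a label, not a quantity, so store it as text
cleaned["zip"] = cleaned["zip"].astype(str)

# Parse dates; values that cannot be parsed become NaT instead of raising
for col in ("date_reported", "date_occurred"):
    cleaned[col] = pd.to_datetime(cleaned[col], errors="coerce")

# Low-cardinality text columns are stored as categoricals
cleaned["offense_classification"] = cleaned["offense_classification"].astype("category")

# Derive the weekday columns from the parsed dates
cleaned["week_day_reported"] = cleaned["date_reported"].dt.day_name().astype("category")
cleaned["week_day_occurred"] = cleaned["date_occurred"].dt.day_name().astype("category")
```

Note that category and datetime types are not preserved in a CSV file, so they would need to be reapplied after reading cleaned_crime_data.csv back in.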

Jefferson County ZIP Data

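| Column Name | Description | Data Type |
| --- | --- | --- |
| zip | ZIP codes in Jefferson County, KY. | Integer |
| latitude | Latitude coordinate of the ZIP code. | Float |
| longitude | Longitude coordinate of the ZIP code. | Float |
| irs_estimated_population | 2020 population estimates based on exemption filings. | Integer |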
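The population column is what makes a per-capita figure such as "Incidents per 100 People by ZIP Code" possible. A minimal sketch of that calculation is shown below; the counts and populations are invented, and the column names `incident_count` and `incidents_per_100` are hypothetical.

```python
import pandas as pd

# Invented data: five incidents across two ZIP codes, with made-up populations
crime_df = pd.DataFrame({"zip": ["40202"] * 3 + ["40214"] * 2})
zip_df = pd.DataFrame({
    "zip": ["40202", "40214"],
    "irs_estimated_population": [600, 500],
})

# Count incidents per ZIP code
incidents = crime_df.groupby("zip").size().rename("incident_count").reset_index()

# Attach the population estimate and scale to incidents per 100 residents
rates = incidents.merge(zip_df, on="zip", how="inner")
rates["incidents_per_100"] = (
    rates["incident_count"] / rates["irs_estimated_population"] * 100
)
```

Because the counts cover 2020 to 2024 while the population estimate is for 2020, a rate computed this way is best read as a relative comparison between ZIP codes.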

Visualizations

  • Most Common Days for Reported Auto Thefts
  • Auto Theft Trends Over the Years
  • Top 10 ZIP Codes for Auto Thefts
  • Trends by Day of Week and Top 10 Locations
  • Yearly Trends by ZIP Code
  • Personal Analysis of My Car Theft Incident
  • Incidents per 100 People by ZIP Code

Contact

Matthew London - mmatthewlondon@gmail.com
