This project analyzes crime data from Jefferson County, KY, spanning the years 2020 to 2024. The main focus is on processing, cleaning, and visualizing crime data, specifically related to auto thefts. The analysis merges crime data with ZIP code information and standardizes fields to prepare the data for insights and reporting.
The motivation for this project stems from a personal experience: my vehicle was stolen during the first week of December 2024. I became curious about theft rates and how common auto theft was in my area, so I structured the project around this question.
- Crime Data: Raw crime data from 2020 to 2024 provided in CSV format.
  - https://data.louisvilleky.gov/datasets/LOJIC::louisville-metro-ky-crime-data-2020/about
  - https://data.louisvilleky.gov/datasets/LOJIC::louisville-metro-ky-crime-data-2021/about
  - https://data.louisvilleky.gov/datasets/LOJIC::louisville-metro-ky-crime-data-2022/about
  - https://data.louisvilleky.gov/datasets/LOJIC::louisville-metro-ky-crime-data-2023/about
  - https://data.louisvilleky.gov/datasets/LOJIC::louisville-metro-ky-crime-data-2024/about
- ZIP Codes: ZIP code data for filtering relevant locations (Jefferson County, KY).
- main.py: The main script that orchestrates data processing.
- preprocessing.py: Contains modular functions for loading, cleaning, and processing data.
- data/raw_data/: Folder containing raw crime data files.
- data/processed_data/: Folder containing processed data outputs.
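
As a rough illustration of the loading step, the sketch below reads every CSV in a folder and stacks the results into one DataFrame. The function name `load_raw_crime_data` is hypothetical and is not taken from `preprocessing.py`; the actual module may be organized differently.

```python
from pathlib import Path

import pandas as pd


def load_raw_crime_data(raw_dir):
    """Read every CSV in raw_dir and stack them into one DataFrame.

    Hypothetical helper for illustration only; the project's own
    loading functions may differ.
    """
    csv_paths = sorted(Path(raw_dir).glob("*.csv"))
    if not csv_paths:
        raise FileNotFoundError(f"No CSV files found in {raw_dir}")
    # dtype=str keeps ZIP codes and incident numbers from being parsed as numbers.
    frames = [pd.read_csv(path, dtype=str) for path in csv_paths]
    return pd.concat(frames, ignore_index=True)
```

Called as `load_raw_crime_data("data/raw_data")`, this would return one row per incident across all yearly files, assuming the files share the same column names.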
Feature 2: Clean your data and perform a pandas merge with your two data sets, then calculate some new values based on the new data set.
- Clone the repository:
Open a new Git Bash terminal in your code editor, then run:
git clone https://github.com/matthewlondon/Capstone.git
- Set up a virtual environment:
cd Capstone
python -m venv venv
source venv/bin/activate
If using Windows (Git Bash), run this instead:
source venv/Scripts/activate
Feature 4: Utilize a virtual environment and include instructions in your README on how the user should set one up.
- Install dependencies:
pip install -r requirements.txt
- Run the main.py script:
python src/main.py
- Open the exploration.ipynb file.
- Run all cells to view the matplotlib and seaborn visualizations:
- Most Common Days for Reported Auto Thefts
- Auto Theft Trends Over the Years
- Top 10 ZIP Codes for Auto Thefts
- Trends by Day of Week and Top 10 Locations
- Yearly Trends by ZIP Code
- Personal Analysis of My Car Theft Incident
- Incidents per 100 People by ZIP Code
You may encounter a "cannot resolve (library) from source" warning. If so, check whether the selected interpreter is Python or venv(python). If it is not venv(python), try selecting venv(python) as the kernel. If venv(python) is not in the drop-down, clear the cache and reload the window: in VS Code, press Ctrl + Shift + P, search for "Python: Clear Cache and Reload Window", then select venv(python) as the kernel.
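
To confirm which interpreter a notebook kernel is actually using, a small check like the one below can be run in a cell. It relies only on the standard library; the helper name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

If the second line prints `False`, the kernel is probably still using the system Python rather than the project's `venv`.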
Feature 3: Make 3 matplotlib or seaborn (or another plotting library) visualizations to display your data.
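
For a sense of how one of these charts might be built, here is a minimal sketch of a bar chart of auto thefts by reported weekday. The sample rows are invented for illustration, and the notebook's actual plotting code may look quite different.

```python
import matplotlib

matplotlib.use("Agg")  # render without a display; not needed inside a notebook

import matplotlib.pyplot as plt
import pandas as pd

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Made-up sample data standing in for the cleaned auto theft records.
thefts = pd.DataFrame({
    "week_day_reported": ["Monday", "Monday", "Friday",
                          "Sunday", "Friday", "Monday"],
})

# Count incidents per weekday and keep calendar order, filling missing days with 0.
counts = thefts["week_day_reported"].value_counts().reindex(day_order, fill_value=0)

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(counts.index, counts.values)
ax.set_title("Reported Auto Thefts by Day of Week (sample data)")
ax.set_xlabel("Day reported")
ax.set_ylabel("Number of incidents")
fig.tight_layout()
```

Reindexing against `day_order` keeps the bars in calendar order instead of sorting them by count, which tends to make weekday patterns easier to read.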
- cleaned_crime_data.csv: Contains the cleaned and processed crime data.
- jefferson_zip_df.csv: Contains filtered ZIP code data for Jefferson County, KY.
Columns in cleaned_crime_data.csv:

Column Name | Description | Data Type |
---|---|---|
zip | ZIP code of the crime location. | String |
incident_number | Unique identifier for each incident. | String |
date_reported | Date when the crime was reported. | DateTime |
date_occurred | Date when the crime occurred. | DateTime |
offense_classification | Standardized offense classification (e.g., AUTO THEFT). | Category |
location_category | Category describing the crime location. | Category |
was_offense_completed | Whether the offense was completed (YES/NO/UNKNOWN). | Category |
value_range | Estimated monetary range associated with the offense. | Category |
week_day_reported | Day of the week the crime was reported. | Category |
week_day_occurred | Day of the week the crime occurred. | Category |
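
The weekday and category columns can be derived from the raw fields in a few lines of pandas. The sketch below uses invented sample rows and shows one plausible approach; it is not necessarily how `preprocessing.py` does it.

```python
import pandas as pd

# Made-up sample rows; real incident numbers and dates will differ.
crimes = pd.DataFrame({
    "incident_number": ["A1", "A2", "A3"],
    "date_reported": ["2024-12-02", "2024-12-07", "not a date"],
    "offense_classification": ["AUTO THEFT", "auto theft ", "ASSAULT"],
})

# Unparseable dates become NaT instead of raising an error.
crimes["date_reported"] = pd.to_datetime(crimes["date_reported"], errors="coerce")

# Derive the weekday name and store it as a categorical column.
crimes["week_day_reported"] = (
    crimes["date_reported"].dt.day_name().astype("category")
)

# Standardize the offense labels before converting to a category.
crimes["offense_classification"] = (
    crimes["offense_classification"].str.strip().str.upper().astype("category")
)

auto_thefts = crimes[crimes["offense_classification"] == "AUTO THEFT"]
```

Stripping whitespace and upper-casing before the category conversion avoids ending up with near-duplicate labels such as `auto theft ` and `AUTO THEFT`.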

Columns in jefferson_zip_df.csv:

Column Name | Description | Data Type |
---|---|---|
zip | ZIP codes in Jefferson County, KY. | Integer |
latitude | Coordinates | Float |
longitude | Coordinates | Float |
irs_estimated_population | 2020 population estimates based on exemption filings | Integer |
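
Because `zip` is stored as a string in the crime data and as an integer in the ZIP code data, the two columns need a common type before they can be merged. The following is a minimal sketch of that merge and of an incidents-per-100-people calculation; the sample rows and population figures are invented, and the project's actual merge logic may differ.

```python
import pandas as pd

# Made-up sample data using the column names from the tables above.
crimes = pd.DataFrame({
    "zip": ["40202", "40202", "40203", "99999"],
    "incident_number": ["A1", "A2", "A3", "A4"],
})
zips = pd.DataFrame({
    "zip": [40202, 40203],
    "irs_estimated_population": [5000, 20000],
})

# Align key types: convert integer ZIP codes to 5-character strings.
zips["zip"] = zips["zip"].astype(str).str.zfill(5)

# An inner merge keeps only incidents whose ZIP code is in the county list.
merged = crimes.merge(zips, on="zip", how="inner")

summary = (
    merged.groupby("zip")
    .agg(
        incidents=("incident_number", "count"),
        population=("irs_estimated_population", "first"),
    )
    .reset_index()
)
summary["incidents_per_100"] = summary["incidents"] / summary["population"] * 100
```

Using `how="inner"` doubles as the Jefferson County filter here, since rows with ZIP codes outside the county list (such as `99999` in the sample) are dropped by the merge.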
Matthew London - mmatthewlondon@gmail.com