- Introduction
- Dataset
- Data Pre-Processing
- Objective of Data Analysis
- Tasks
- Analytics Workflow
- Visualizations
  - Bar Chart - Distribution of Crime Counts Across Areas
  - Heatmap of Crime Incidents Across Los Angeles
  - Seasonal Crime Comparison Across Areas
  - Yearly Crime Trends Over Days
  - Moving Average - Small Multiples of Crimes Across Months Per Area
  - Bubble Map within each Area per Year
  - Bar Graph of Arrests in the Detected Area Name
- Inferences
- Work Distribution
- References
- Functions
This README accompanies the Data Visualization Assignment 3 report, detailing the analysis, tasks performed, visualizations created, and insights derived from crime and arrest data in Los Angeles.
The dataset includes crime and arrest data from Los Angeles, spanning January 2020 to December 2023.
The data was cleaned during preprocessing, and entries with unknown location coordinates were filtered out.
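The filtering step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `LAT` and `LON`, the sample rows, the helper `drop_unknown_locations`, and the idea that unknown coordinates appear as missing values or as a (0, 0) placeholder are all assumptions.

```python
import pandas as pd

# Hypothetical sample rows; the column names LAT and LON are assumptions.
crimes = pd.DataFrame({
    "AREA NAME": ["Central", "Hollywood", "Central", "Pacific"],
    "LAT": [34.0459, 0.0, 34.0522, None],
    "LON": [-118.2545, 0.0, -118.2437, -118.45],
})


def drop_unknown_locations(df, lat_col="LAT", lon_col="LON"):
    """Keep rows whose coordinates are present and not the (0, 0) placeholder."""
    has_coords = df[lat_col].notna() & df[lon_col].notna()
    # Assumption: an unknown location may be recorded as latitude 0, longitude 0.
    not_placeholder = ~((df[lat_col] == 0) & (df[lon_col] == 0))
    return df[has_coords & not_placeholder].reset_index(drop=True)


clean = drop_unknown_locations(crimes)
print(len(crimes), "->", len(clean))
```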
The primary objective was to determine the impact of location and time on crime counts within various areas of Los Angeles.
Two main tasks were identified:
- Identifying temporal periods when crime counts surged in different areas.
- Examining the impact of crime surges on the quantity of arrests within specific areas.
The workflow encompassed various stages, including data loading, preprocessing, analysis, visualization, and deriving insights.
Detailed visualizations were created to analyze crime patterns, including bar charts, heatmaps, seasonal comparisons, yearly trends, moving averages, and spatial distribution maps.
This visualization illustrates the total number of reported crimes across different areas in Los Angeles. Each bar represents a distinct area, providing a visual comparison of crime frequency among various areas.
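A chart of this kind could be produced along these lines. The column name `AREA NAME` and the sample rows are assumptions made for illustration; the notebook may build its chart differently.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; one row per reported crime.
crimes = pd.DataFrame({
    "AREA NAME": ["Central", "Hollywood", "Central", "Pacific", "Central", "Hollywood"],
})

# Count reported crimes per area, most frequent first.
area_counts = crimes["AREA NAME"].value_counts()

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(area_counts.index, area_counts.values)
ax.set_xlabel("Area")
ax.set_ylabel("Reported crimes")
ax.set_title("Distribution of crime counts across areas")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
```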
The heatmap visualizes the spatial distribution and intensity of reported crime incidents across different areas in Los Angeles. It showcases the geographical concentration of crime incidents, offering insights into crime hotspots or patterns based on location coordinates.
Small multiples compare seasonal crime incidents across different areas in Los Angeles. The plot comprises one subplot per season (Fall, Spring, Summer, and Winter), each showing the count of reported crimes in every area during that season.
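The per-season counts behind such a plot could be derived as sketched below. The month-to-season mapping (meteorological seasons), the column names `DATE OCC` and `AREA NAME`, and the sample rows are assumptions, not details confirmed by the report.

```python
import pandas as pd

# Assumed mapping: meteorological seasons for the northern hemisphere.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}

# Hypothetical sample; one row per reported crime.
crimes = pd.DataFrame({
    "AREA NAME": ["Central", "Central", "Hollywood", "Central", "Hollywood"],
    "DATE OCC": pd.to_datetime(
        ["2021-01-15", "2021-04-02", "2021-07-20", "2021-10-31", "2021-12-25"]
    ),
})

# Label each crime with its season, then count crimes per season and area.
crimes["season"] = crimes["DATE OCC"].dt.month.map(SEASON_BY_MONTH)
seasonal_counts = (
    crimes.groupby(["season", "AREA NAME"]).size().unstack(fill_value=0)
)
print(seasonal_counts)
```

Each row of `seasonal_counts` then corresponds to one subplot, with one bar per area.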
This visualization presents the yearly trends of reported crime incidents from 2020 to 2023, demonstrating how the count of reported crimes fluctuates over the days of the respective years.
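One way to prepare data for such a plot is to count crimes per day of the year, with one column per year. This is a sketch under assumptions: the column name `DATE OCC` and the sample dates are illustrative.

```python
import pandas as pd

# Hypothetical sample; one row per reported crime.
crimes = pd.DataFrame({
    "DATE OCC": pd.to_datetime([
        "2020-01-01", "2020-01-01", "2020-01-02",
        "2021-01-01", "2021-01-02", "2021-01-02",
    ]),
})

# Count crimes per (year, day of year), then pivot the years into columns
# so that each year can be drawn as its own line over the same x-axis.
daily = (
    crimes.groupby([
        crimes["DATE OCC"].dt.year.rename("year"),
        crimes["DATE OCC"].dt.dayofyear.rename("day_of_year"),
    ])
    .size()
    .unstack("year", fill_value=0)
)
print(daily)
```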
The moving average plots visualize the 7-day average of crimes in each area for each year as a line chart. These plots help in understanding trends by smoothing out short-term fluctuations.
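A 7-day average of daily counts can be computed with a rolling window, as in this minimal sketch. The sample counts are made up, and the use of a trailing window (rather than a centered one) is an assumption.

```python
import pandas as pd

# Hypothetical daily crime counts for one area.
days = pd.date_range("2022-01-01", periods=10, freq="D")
daily_counts = pd.Series([10, 12, 8, 15, 11, 9, 14, 20, 7, 13], index=days)

# Trailing 7-day mean; the first six days have no full window and stay NaN.
moving_avg = daily_counts.rolling(window=7).mean()
print(moving_avg)
```

For several areas at once, the same rolling mean could be applied per group after grouping the daily counts by area.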
This visualization uses unsupervised clustering to identify crime clusters within each area for each year. Each cluster represents the concentration of crimes, plotted on the Los Angeles map based on location coordinates.
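The report does not name the clustering algorithm here. As one possible illustration, the sketch below assumes k-means from scikit-learn (which is among the listed dependencies); the sample coordinates and the number of clusters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical (latitude, longitude) pairs for crimes in one area and year.
coords = np.array([
    [34.040, -118.250], [34.050, -118.240], [34.045, -118.245],
    [34.100, -118.330], [34.110, -118.320], [34.105, -118.325],
])

# Assumption: k-means with two clusters; the notebook may use another method.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(coords)
labels = kmeans.labels_

# Cluster centers give the bubble positions; cluster sizes give the bubble radii.
centers = kmeans.cluster_centers_
sizes = np.bincount(labels)
print(centers, sizes)
```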
Bar graphs depict the number of arrests in specific areas during particular time frames, helping understand the impact of crime surges on arrest quantities.
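Counting arrests for one area within a time frame might look like the sketch below. The column names `Area Name` and `Arrest Date`, the helper `count_arrests`, and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample; one row per arrest.
arrests = pd.DataFrame({
    "Area Name": ["Central", "Central", "Central", "Hollywood"],
    "Arrest Date": pd.to_datetime(
        ["2022-06-01", "2022-06-15", "2022-08-01", "2022-06-10"]
    ),
})


def count_arrests(df, area, start, end):
    """Count arrests in `area` between `start` and `end`, both inclusive."""
    in_area = df["Area Name"] == area
    in_window = df["Arrest Date"].between(pd.Timestamp(start), pd.Timestamp(end))
    return int((in_area & in_window).sum())


june_central = count_arrests(arrests, "Central", "2022-06-01", "2022-06-30")
print(june_central)
```

Calling the helper once per time frame yields the bar heights for a given area.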
The inferences highlight specific periods and areas that experienced notable spikes in crime counts, and analyze the correlation between those spikes and arrest quantities.
- Munagala Kalyan Ram
- Vikas Kalyanapuram
- Ramsai Koushik
- Crime Data 2020 to Present in Los Angeles: Crime Data Link
- Crime Dataset Description: Crime Dataset Description Link
- Arrest Data 2020 to Present in Los Angeles: Arrest Data Link
- Arrest Dataset Description: Arrest Dataset Description Link
- Matplotlib Tutorial: Matplotlib Tutorial Link
- Seaborn Tutorial: Seaborn Tutorial Link
- Folium Tutorial: Folium Tutorial Link
- A1 Report: A1 Report Link
Ensure you have the following installed:
- Python (version 3.x)
- Jupyter Notebook
- Required libraries: pandas, matplotlib, seaborn, folium, scikit-learn
- Clone the Repository:
git clone https://github.com/DataViz-Trio/Assignment-3.git
cd Assignment-3
- Install Dependencies: Ensure the required libraries are installed using the following command:
pip install pandas matplotlib seaborn folium scikit-learn
- Run the Notebook: Open Jupyter Notebook and navigate to the cloned repository. Run the DataVisualizationAssignment3.ipynb notebook.
- Exploring Visualizations:
  - The notebook contains different sections, each generating specific visualizations. Execute each cell to generate the visualizations.
  - Modify parameters or input data as needed to explore different aspects of the data or adjust the visualizations.
- Understanding Outputs:
  - The notebook generates various visualizations showcasing crime patterns, seasonal trends, spatial distributions, and more.
  - Explore each visualization to understand the insights derived from the data.