The Simpsons Episodes Dataset Analysis #527

Merged
merged 4 commits into from
Jan 16, 2024
6,723 changes: 6,723 additions & 0 deletions Simpsons Episodes Analysis/Dataset/simpsons_characters.csv

Large diffs are not rendered by default.

601 changes: 601 additions & 0 deletions Simpsons Episodes Analysis/Dataset/simpsons_episodes.csv

Large diffs are not rendered by default.

4,460 changes: 4,460 additions & 0 deletions Simpsons Episodes Analysis/Dataset/simpsons_locations.csv

Large diffs are not rendered by default.

Binary file not shown.
666 changes: 666 additions & 0 deletions Simpsons Episodes Analysis/Model/Simpsons_Analysis.ipynb

Large diffs are not rendered by default.

101 changes: 101 additions & 0 deletions Simpsons Episodes Analysis/Readme.md
@@ -0,0 +1,101 @@
# Data Analysis Report: The Simpsons Dataset

## Introduction
This report analyzes The Simpsons dataset, which covers characters, episodes, locations, and script lines. It examines line counts by character and by gender, episode counts and average viewership per season, the most frequent locations, and the words and sentiment of the spoken lines.

## Data Loading and Cleaning
The datasets were loaded into Pandas DataFrames, and basic cleaning steps were performed to handle encoding issues. The following datasets were used:
- simpsons_characters
- simpsons_episodes
- simpsons_locations
- simpsons_script_lines

Dataset source: https://www.kaggle.com/datasets/thedevastator/the-simpsons-episodes-dataset
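
The loading step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper name `load_csv` and the fallback order of encodings are assumptions, chosen because Latin-1 can decode any byte sequence and so works as a last resort.

```python
import pandas as pd


def load_csv(path, encodings=("utf-8", "latin-1")):
    """Read a CSV, trying each encoding in turn until one decodes cleanly.

    `load_csv` and the default encoding order are illustrative assumptions.
    """
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            # Remember the failure and fall through to the next encoding.
            last_error = err
    raise ValueError(f"could not decode {path} with any of {encodings}") from last_error
```

A call such as `load_csv("Dataset/simpsons_characters.csv")` would return a DataFrame; the relative path shown here is only an example.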

## Character Analysis

### Top Speaking Characters
Identified and visualized the top characters based on the number of lines spoken.

![Top Speaking Characters](https://github.com/Bayyana-kiran/sdf/assets/99533113/60bd9db2-8bc4-4ffe-b333-75fcd6d8c2f0)
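
A ranking like the one above could be produced along these lines. The sketch assumes the script-lines table has one row per spoken line and a `raw_character_text` column holding the speaker's name; both the column name and the function name `top_speakers` are assumptions, not taken from the notebook.

```python
import pandas as pd


def top_speakers(script_lines, n=10, name_col="raw_character_text"):
    """Return the n characters with the most rows (lines), highest first.

    `name_col` is an assumed column name for the speaker.
    """
    # value_counts() sorts in descending order and ignores missing names.
    return script_lines[name_col].value_counts().head(n)
```

The resulting Series can be passed to a bar-chart routine, with the character names on one axis and the line counts on the other.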


### Gender Distribution of Speaking Characters
Explored the distribution of lines spoken by male and female characters.

![Gender Distribution](https://github.com/Bayyana-kiran/sdf/assets/99533113/3f43f0b7-53c3-4a11-ab76-69044cbfab55)
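
One way to obtain these counts is to join each script line to its character record and count lines per gender. The sketch below assumes a `character_id` column in the script lines and `id` and `gender` columns in the characters table; these names, the `unknown` label, and the function name are assumptions.

```python
import pandas as pd


def lines_by_gender(script_lines, characters):
    """Count spoken lines per gender via a left join on the character id.

    Column names (`character_id`, `id`, `gender`) are assumed.
    """
    merged = script_lines.merge(
        characters[["id", "gender"]],
        left_on="character_id",
        right_on="id",
        how="left",  # keep lines whose character has no record
    )
    # Lines without a known gender are grouped under an explicit label.
    return merged["gender"].fillna("unknown").value_counts()
```

A left join keeps lines spoken by characters that are missing from the characters table, so the totals still add up to the number of script lines.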


### Character Lines Pie Chart
Visualized the percentage of lines spoken by each character in a pie chart.

![Character Lines pie Chart](https://github.com/Bayyana-kiran/sdf/assets/99533113/6fa8afb2-0529-40c1-ab50-d177ee9fab87)
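
The percentages behind a pie chart like this can be computed as in the sketch below. Grouping everything outside the top `n` characters into an `Other` slice is an illustrative choice, not necessarily what the notebook does, and `raw_character_text` is an assumed column name.

```python
import pandas as pd


def line_share(script_lines, n=5, name_col="raw_character_text"):
    """Percentage of lines per character, with the tail merged into 'Other'.

    The 'Other' grouping and `name_col` are assumptions for illustration.
    """
    share = script_lines[name_col].value_counts(normalize=True) * 100
    top = share.head(n)
    other = share.iloc[n:].sum()
    if other > 0:
        top = pd.concat([top, pd.Series({"Other": other})])
    return top
```

Capping the number of slices keeps the chart readable when the table contains thousands of characters.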


## Episode Analysis

### Season-wise Episode Count
Visualized the number of episodes in each season.

![Episode count per season](https://github.com/Bayyana-kiran/sdf/assets/99533113/632a943e-b57b-4abc-b0ea-1885c3693630)
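
Counting episodes per season is a single aggregation. The sketch assumes the episodes table has a `season` column with one row per episode; the function name is hypothetical.

```python
import pandas as pd


def episodes_per_season(episodes, season_col="season"):
    """Number of episode rows in each season, ordered by season number.

    `season_col` is an assumed column name.
    """
    # sort_index() orders the result by season rather than by count.
    return episodes[season_col].value_counts().sort_index()
```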




### Seasonal Viewership Bar Chart
Visualized the average viewership per season using a bar chart.

![Seasonal Viewership](https://github.com/Bayyana-kiran/sdf/assets/99533113/7368c975-fa54-49a4-9ee5-8572ec98726c)
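
The per-season average could be computed as sketched here. The column name `us_viewers_in_millions` is an assumption about the episodes table, and coercing unparseable values to missing is an illustrative cleaning choice.

```python
import pandas as pd


def mean_viewers_by_season(episodes, col="us_viewers_in_millions"):
    """Average viewership per season, ignoring values that are not numeric.

    `col` and the `season` column are assumed names.
    """
    # Non-numeric entries become NaN, which mean() skips by default.
    viewers = pd.to_numeric(episodes[col], errors="coerce")
    return viewers.groupby(episodes["season"]).mean()
```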

## Location Analysis

### Top Locations Across Episodes
Identified and visualized the most popular locations based on the number of lines spoken.

![Top Locations across episodes](https://github.com/Bayyana-kiran/sdf/assets/99533113/d6fb1949-e18d-4fae-b95e-0017bc36faef)
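
A sketch of the location ranking: count lines per location id, then replace the ids with readable names from the locations table. The columns `location_id`, `id`, and `name` are assumed, as is the function name.

```python
import pandas as pd


def top_locations(script_lines, locations, n=10):
    """The n locations with the most lines, labelled by location name.

    Column names (`location_id`, `id`, `name`) are assumptions.
    """
    counts = script_lines["location_id"].value_counts().head(n)
    id_to_name = locations.set_index("id")["name"].to_dict()
    # Ids that are missing from the locations table keep their numeric label.
    return counts.rename(index=id_to_name)
```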



## Text Analysis

### Word Cloud of Spoken Words
Created a word cloud to visualize the most frequently used words in the spoken lines.

![Word Cloud of spoken words](https://github.com/Bayyana-kiran/sdf/assets/99533113/90fbb2aa-7ca3-4a0c-8e14-592581db492a)
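
A word cloud is drawn from word frequencies, which can be computed with the standard library alone. In the sketch below the stopword set is a small illustrative sample and the tokenizing pattern is an assumption; the notebook may filter and tokenize differently.

```python
import re
from collections import Counter

# Illustrative stopwords only; a real analysis would use a fuller list.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "i", "you", "it", "is"}


def word_frequencies(lines, stopwords=STOPWORDS):
    """Count lowercase words across all lines, skipping stopwords."""
    counts = Counter()
    for line in lines:
        if not isinstance(line, str):
            continue  # skip missing values such as None or NaN
        words = re.findall(r"[a-z']+", line.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts
```

If the `wordcloud` package is used for rendering, which the README does not state, a mapping like this can be passed to `WordCloud().generate_from_frequencies(...)`.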


### Sentiment Analysis Over Time
Analyzed the sentiment of spoken words over time.

![Sentiment Analysis over time](https://github.com/Bayyana-kiran/sdf/assets/99533113/c4390f94-6e7a-4871-a230-c477f6ac9e6c)
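
The README does not say which sentiment scorer was used, so the sketch below substitutes a toy word-list scorer purely to show the shape of the computation: score each line, attach the season through the episode id, and average per season. The word lists, the column names `normalized_text`, `episode_id`, `id`, and `season`, and both function names are assumptions.

```python
import pandas as pd

# Toy lexicon for illustration only; not a real sentiment model.
POSITIVE = {"love", "great", "happy", "excellent", "good"}
NEGATIVE = {"hate", "bad", "stupid", "sad", "terrible"}


def toy_polarity(text):
    """Score in [-1, 1]: (positive - negative) / matched words, 0 if none."""
    if not isinstance(text, str):
        return 0.0
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def sentiment_by_season(script_lines, episodes, scorer=toy_polarity):
    """Mean polarity of spoken lines per season (column names are assumed)."""
    scored = script_lines.assign(
        polarity=script_lines["normalized_text"].map(scorer)
    )
    seasons = episodes[["id", "season"]].rename(columns={"id": "episode_id"})
    merged = scored.merge(seasons, on="episode_id", how="inner")
    return merged.groupby("season")["polarity"].mean()
```

Because `scorer` is a parameter, a library-based polarity function can be swapped in without changing the aggregation.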


## Conclusion

### Top Characters
- Identified leading characters by lines spoken.
- Explored gender distribution in dialogues.

### Episode Insights
- Visualized season-wise episode counts.
- Analyzed average viewership trends.

### Location Highlights
- Identified popular locations across episodes.

### Text Analysis
- Word cloud for frequent words.
- Sentiment analysis over time.

### Summary
- Provided comprehensive insights.
- Dataset lays the groundwork for further exploration.

This analysis of The Simpsons dataset covered character dynamics, episode trends, and textual patterns. It identified the top-speaking characters, examined the gender distribution of dialogue, and charted episode counts and average viewership by season. It also ranked locations by the number of lines spoken and summarized the spoken words with a word cloud and sentiment analysis over time. The report is intended as a starting point for further investigation of the dataset by researchers and enthusiasts.


---
## Contributor: Sai Kiran B L S

GitHub: [Sai Kiran B L S](https://github.com/Bayyana-kiran)