This repository contains an ongoing project to compile and analyze Taylor Swift's discography, including all songs she has written for other artists. All data, including lyrics, is initially scraped from Genius via parsel and stored in a SQLite database. SQL queries, run through pandas and sqlite3, transform the data, which is then visualized with matplotlib and seaborn.
Please note this project is a work-in-progress and not yet complete.
This project is currently deployed on Streamlit!
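As a rough illustration of the storage and query steps described above, here is a minimal sketch using only the standard-library sqlite3 module. The table name, column names, and sample rows are hypothetical placeholders, not the project's actual schema, and the scraping and charting steps are left out.

```python
import sqlite3

# Hypothetical rows standing in for scraped output: (title, album, lyrics).
scraped = [
    ("Song A", "Album X", "first line\nsecond line"),
    ("Song B", "Album X", "only one line here"),
    ("Song C", "Album Y", "a b c"),
]

# An in-memory database keeps the sketch self-contained; the project
# itself stores a SQLite database file under the data directory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (title TEXT, album TEXT, lyrics TEXT)")
conn.executemany("INSERT INTO songs VALUES (?, ?, ?)", scraped)

# A simple transformation query: number of songs per album.
album_counts = conn.execute(
    "SELECT album, COUNT(*) AS n_songs FROM songs GROUP BY album ORDER BY album"
).fetchall()
conn.close()

print(album_counts)
```

With pandas installed, the same query string could presumably be passed to `pandas.read_sql_query` along with an open connection to get a dataframe ready for plotting.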
- app: contains application files for the Streamlit app
- pages: contains application files for additional pages in Streamlit app
- assets: contains third-party assets
- fonts: contains font .ttf files used in charts (courtesy of Google Fonts)
- img: contains image files used in app
- data: contains pickle versions of both raw and cleaned webscraping data, as well as SQLite database file
- csv: contains CSV files used to add/remove data from dataframe
- kaggle: contains CSV file used for Kaggle dataset
- figures: contains project PNG files, including database schema (courtesy of dbdiagram.io)
- charts: contains all matplotlib/seaborn charts created
- notebooks: contains all Jupyter Notebooks
- data_collection.ipynb: initial webscraping from Genius using parsel, creating dataframes and database of Taylor Swift songs and lyrics
- sql: contains all SQL queries used in project (note: they are written in the SQLite dialect of SQL)
- src: contains all Python modules used in project
This discography prioritizes song coverage over album coverage; this means that deluxe versions of albums with more songs are preferred over standard versions, and that album releases are preferred over single/EP releases. While not all versions of a song's release will be covered in this dataset, it contains every unique song. Rerecorded songs count as separate entries from their original versions.
Please see table below for more information on what is and isn't included in the discography:
Included in Discography | Not Included in Discography |
---|---|
Songs released on Taylor Swift studio albums | Duplicate song entries (e.g. for single release, album release, deluxe album release, etc.) |
Songs released on Taylor Swift rerecorded "Taylor's Version" albums | Unreleased/leaked songs and demos |
Songs by Taylor Swift for soundtracks or released as non-album singles | Covers of Taylor Swift songs by other artists |
Remixes of Taylor Swift songs featuring new performing artists | Remixes/acoustic versions of songs not featuring new performing artists |
Songs written by Taylor Swift for other artists, even if they don't feature Taylor | Live versions of songs/song mashups previously accounted for |
Songs by other artists featuring Taylor Swift | Songs that sample/interpolate Taylor Swift songs, even if they list her as a writer or feature |
Song covers by or featuring Taylor Swift that have been officially recorded and released | Songs that only exist in video format (DVD, live show recording, etc.) |
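The coverage rule described above (deluxe albums over standard albums, albums over single/EP releases, one entry per unique song) can be sketched as a small query. This is only an illustration under stated assumptions: the release-type labels, the priority ranking, the table and column names, and the sample rows are all hypothetical, and the project's own queries may work differently.

```python
import sqlite3

# Hypothetical release types, ranked by the stated preference:
# deluxe album > standard album > single/EP (lower number wins).
PRIORITY = {"deluxe_album": 0, "standard_album": 1, "single_or_ep": 2}

# Hypothetical scraped rows: (song_title, release_title, release_type).
releases = [
    ("Song A", "Album X", "standard_album"),
    ("Song A", "Album X (Deluxe)", "deluxe_album"),
    ("Song A", "Song A - Single", "single_or_ep"),
    ("Song B", "Song B - Single", "single_or_ep"),
    ("Song A (Taylor's Version)", "Album X (Taylor's Version)", "standard_album"),
]


def pick_preferred(rows):
    """Keep, for each song title, the release with the best priority."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE releases (song_title TEXT, release_title TEXT, priority INTEGER)"
    )
    conn.executemany(
        "INSERT INTO releases VALUES (?, ?, ?)",
        [(song, release, PRIORITY[kind]) for song, release, kind in rows],
    )
    # A row survives only if no other release of the same song ranks better.
    query = """
        SELECT r.song_title, r.release_title
        FROM releases AS r
        WHERE NOT EXISTS (
            SELECT 1
            FROM releases AS better
            WHERE better.song_title = r.song_title
              AND better.priority < r.priority
        )
        ORDER BY r.song_title
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


discography = pick_preferred(releases)
for song, release in discography:
    print(song, "<-", release)
```

Because the rerecorded song has its own title, it stays as a separate entry from the original. Note that two releases of the same song with equal priority would both survive this query, so a real implementation would need a further tie-breaker.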