The backend for a proof-of-concept internal tool to find and monitor copyrighted content on YouTube.
This is a Python Flask application that searches the YouTube Data API, filters specific channels out of the results, and saves the data to a live PostgreSQL database.
The data includes:
Channel Title, Channel ID, Video ID, Description, Thumbnail URL, and Publish Time
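As a rough illustration, the fields above could be pulled out of one YouTube Data API `search.list` result item along these lines. This is a minimal sketch, not this app's actual code: the helper name `extract_record` and the output key names are hypothetical, and it assumes the usual `id.videoId` / `snippet` layout of a search result.

```python
from typing import Any, Dict


def extract_record(item: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one search result item into the fields listed above.

    Hypothetical helper: the returned key names are assumptions, not the
    app's real column names. Missing fields come back as None.
    """
    snippet = item.get("snippet", {})
    thumbnails = snippet.get("thumbnails", {})
    return {
        "channel_title": snippet.get("channelTitle"),
        "channel_id": snippet.get("channelId"),
        "video_id": item.get("id", {}).get("videoId"),
        "description": snippet.get("description"),
        # Assumes the "default" thumbnail size is the one being stored.
        "thumbnail_url": thumbnails.get("default", {}).get("url"),
        "publish_time": snippet.get("publishedAt"),
    }
```

Using `.get()` throughout keeps the sketch tolerant of results that lack a field, such as channel or playlist results with no `videoId`.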
The app is deployed continuously to Heroku and the PostgreSQL database is hosted on Supabase.
- The live base URL:
https://flask-youtube-scraper-a55f990bea9f.herokuapp.com/
- Local development URL:
http://localhost:5000/
To run a search, replace {{URL}} with one of the base URLs above:
{{URL}}/api/search?query=<YOUR_SEARCH_QUERY>
Optionally exclude specific channels by name:
{{URL}}/api/search?query=<YOUR_SEARCH_QUERY>&exclude=ChannelNameToExclude,AnotherChannelToExclude
For example, to search for "Lil Wayne" but exclude his official channel by its channel name:
{{URL}}/api/search?query=lil%20wayne&exclude=LilWayneVEVO
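The request format above can be sketched in Python as follows. Both helpers are hypothetical and only illustrate the idea: `build_search_url` assembles a URL in the documented shape, and `filter_excluded` shows one way the `exclude` parameter might be applied to result items. Matching on the channel title, and doing so case-insensitively, are assumptions rather than confirmed behaviour of this app.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import quote, urlencode


def build_search_url(base_url: str, query: str,
                     exclude: Optional[List[str]] = None) -> str:
    """Build an /api/search URL in the format shown above."""
    params = {"query": query}
    if exclude:
        params["exclude"] = ",".join(exclude)
    # quote_via=quote encodes spaces as %20; safe="," keeps the commas
    # between channel names readable.
    query_string = urlencode(params, safe=",", quote_via=quote)
    return f"{base_url.rstrip('/')}/api/search?{query_string}"


def filter_excluded(items: List[Dict[str, Any]],
                    exclude_param: str) -> List[Dict[str, Any]]:
    """Drop items whose channel title is named in the exclude parameter.

    Assumption: names are compared case-insensitively against channelTitle.
    """
    excluded = {n.strip().lower() for n in exclude_param.split(",") if n.strip()}
    return [
        item for item in items
        if item.get("snippet", {}).get("channelTitle", "").lower() not in excluded
    ]
```

For instance, `build_search_url("http://localhost:5000/", "lil wayne", ["LilWayneVEVO"])` produces the same URL shape as the example above.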
- Rename .env.example to .env and add your environment variables.
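Before starting the server, it can help to confirm that the variables from .env are actually set. The sketch below is one way to do that. The names in `REQUIRED_VARS` are hypothetical placeholders, so substitute whatever .env.example actually lists.

```python
import os
from typing import Iterable, List, Mapping

# Hypothetical names for illustration only; use the ones in .env.example.
REQUIRED_VARS = ("YOUTUBE_API_KEY", "DATABASE_URL")


def missing_env_vars(required: Iterable[str] = REQUIRED_VARS,
                     environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment, and an empty list means every listed variable is present.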
$ # Create virtual environment
$ python3 -m venv venv
$ # Activate virtual environment
$ # If on Mac or Linux
$ source venv/bin/activate
$ # If on Windows
$ venv\Scripts\activate
$ # Install dependencies
$ pip install -r requirements.txt
$ # Export Flask app (in Windows cmd, use "set" instead of "export")
$ export FLASK_APP=app.py
$ # Run the development server
$ flask run