vdutts7/reddit-map

Visual Map | Reddit comments

Visualizing maps of Reddit comments based on semantic similarity

Table of Contents

    💸 FREE 200 USD cloud credits
    📝 About
    💻 How to build
    🔧 Tools used
    👤 Contact

💸FREE 200 USD cloud credits

Click the banner to activate $200 free personal cloud credits on DigitalOcean (deploy anything).

📝About

  • How to automate the extraction, processing, and mapping of Reddit comments using Python and Nomic Atlas
  • Use the Reddit API to fetch comments from a Reddit post URL
  • Store the data in Nomic Atlas
  • Create an Atlas map on the dataset to produce a visualization

💻How to build

1. Setup:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

2. Environment variables

Get your Reddit developer credentials: https://www.reddit.com/prefs/apps

REDDIT_CLIENT_ID=<your_client_id>
REDDIT_CLIENT_SECRET=<your_client_secret>
REDDIT_USER_AGENT=<your_user_agent>
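
As a minimal sketch of how a script might consume these three variables, the helper below reads them from the environment and fails early if any is missing. It assumes the values are exported in the shell (or loaded from a `.env` file by a separate tool); the function names `load_reddit_credentials` and `make_reddit_client` are hypothetical and are not taken from `reddit.py`.

```python
import os

REQUIRED_VARS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def load_reddit_credentials(env=None):
    """Return the three Reddit credentials as keyword arguments for praw.Reddit."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "user_agent": env["REDDIT_USER_AGENT"],
    }


def make_reddit_client(env=None):
    """Build a read-only PRAW client (assumes `pip install praw`)."""
    import praw  # imported lazily so the credential check works without PRAW

    return praw.Reddit(**load_reddit_credentials(env))
```

Checking for missing variables before creating the client turns a later authentication failure into an immediate, readable error.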

3. Data Collection

Run the script to collect Reddit comments:

python reddit.py

The script will:

  • Prompt for a Reddit post URL
  • Extract all comments recursively
  • Show real-time progress
  • Save data to a CSV file
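
The steps above could be sketched roughly as follows. This is an illustration, not the code in `reddit.py`: the helper names (`walk_comments`, `progress_line`, `save_comments`) and the CSV columns are assumptions. The functions only rely on each comment having `id`, `body`, and `replies` attributes, which PRAW comment objects provide.

```python
import csv
import time


def walk_comments(comments, depth=0):
    """Yield (comment, depth) for every comment, visiting replies depth-first."""
    for comment in comments:
        yield comment, depth
        # With PRAW, `comment.replies` holds the child comments once
        # `submission.comments.replace_more(limit=None)` has been called.
        yield from walk_comments(getattr(comment, "replies", []), depth + 1)


def progress_line(done, total):
    """Format a progress message in the style of the example output."""
    return f"Progress: {done / total:.1%} ({done}/{total} comments processed)"


def save_comments(comments, total, path=None, report_every=100):
    """Write one CSV row per comment and print progress as rows are written."""
    # Default file name assumed to be a Unix timestamp, as in the example output.
    path = path or f"reddit_comments_{int(time.time())}.csv"
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "depth", "body"])
        for done, (comment, depth) in enumerate(walk_comments(comments), start=1):
            writer.writerow([comment.id, depth, comment.body])
            if done % report_every == 0:
                print(progress_line(done, total))
    return path
```

With PRAW, the inputs would typically come from `submission = reddit.submission(url=post_url)` followed by `submission.comments.replace_more(limit=None)`; `submission.comments` is then the top-level list to walk and `len(submission.comments.list())` gives the total.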

4. Data Processing

The script automatically:

  • Extracts comment metadata (author, score, timestamp)
  • Handles nested comment structures
  • Implements rate limiting and error handling
  • Saves processed data in a structured format
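
One way the metadata extraction and error handling could look is sketched below. The field names, the `[deleted]` placeholder, and the exponential-backoff retry policy are assumptions chosen for illustration; the actual script may handle these differently.

```python
import time
from datetime import datetime, timezone


def comment_to_row(comment):
    """Flatten one comment into a dict holding its metadata and text."""
    author = getattr(comment, "author", None)
    return {
        "id": comment.id,
        # PRAW sets `author` to None when the account was deleted.
        "author": str(author) if author is not None else "[deleted]",
        "score": comment.score,
        # `created_utc` is a Unix timestamp in seconds.
        "timestamp": datetime.fromtimestamp(
            comment.created_utc, tz=timezone.utc
        ).isoformat(),
        "body": comment.body,
    }


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff if it raises."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

A network-bound call can then be wrapped as `with_retries(lambda: submission.comments.replace_more(limit=None))`, so a transient failure does not abort a long run.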

5. Visualization

After data collection, Nomic Atlas visualizes the comments. It:

  • Creates semantic embeddings of comments
  • Generates interactive 2D/3D visualizations
  • Clusters similar comments together
  • Allows exploration of comment relationships
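
A hedged sketch of the upload step follows. It assumes a recent release of the `nomic` Python client, where `atlas.map_data` accepts a list of dicts and an `indexed_field` naming the text column to embed (older releases exposed `atlas.map_text` instead), and that you have authenticated with `nomic login`. The helper names and the `reddit-dataset` identifier are illustrative.

```python
def prepare_records(rows):
    """Keep rows that have real text and a unique id, ready for upload."""
    seen = set()
    records = []
    for row in rows:
        body = (row.get("body") or "").strip()
        if not body or body in ("[deleted]", "[removed]"):
            continue  # nothing meaningful to embed
        if row["id"] in seen:
            continue  # skip duplicate ids
        seen.add(row["id"])
        records.append({**row, "body": body})
    return records


def upload_to_atlas(records, name="reddit-dataset"):
    """Create an Atlas map that embeds the `body` field of each record."""
    # Assumes `pip install nomic` and a prior `nomic login <token>`.
    from nomic import atlas

    return atlas.map_data(
        data=records,
        indexed_field="body",  # text column Atlas embeds and clusters
        identifier=name,
    )
```

Filtering out empty and deleted comments first keeps placeholder text from forming its own cluster on the map.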

Example Output

(venv) (base) vdutts7@Vacbook-Vro reddit-map % python reddit.py             
Enter Reddit post URL: https://www.reddit.com/r/pics/comments/5bx4bx/thanks_obama/
Loading comments...
Found 5936 comments to process
Progress: 1.7% (100/5936 comments processed)
Progress: 3.4% (200/5936 comments processed)
Progress: 5.1% (300/5936 comments processed)
Progress: 6.7% (400/5936 comments processed)
Progress: 8.4% (500/5936 comments processed)
Progress: 10.1% (600/5936 comments processed)
Progress: 11.8% (700/5936 comments processed)
Progress: 13.5% (800/5936 comments processed)
Progress: 15.2% (900/5936 comments processed)
Progress: 16.8% (1000/5936 comments processed)
Progress: 18.5% (1100/5936 comments processed)
Progress: 20.2% (1200/5936 comments processed)
Progress: 21.9% (1300/5936 comments processed)
Progress: 23.6% (1400/5936 comments processed)
Progress: 25.3% (1500/5936 comments processed)
Progress: 27.0% (1600/5936 comments processed)
Progress: 28.6% (1700/5936 comments processed)
Progress: 30.3% (1800/5936 comments processed)
Progress: 32.0% (1900/5936 comments processed)
Progress: 33.7% (2000/5936 comments processed)
Progress: 35.4% (2100/5936 comments processed)
Progress: 37.1% (2200/5936 comments processed)
Progress: 38.7% (2300/5936 comments processed)
Progress: 40.4% (2400/5936 comments processed)
Progress: 42.1% (2500/5936 comments processed)
Progress: 43.8% (2600/5936 comments processed)
Progress: 45.5% (2700/5936 comments processed)
Progress: 47.2% (2800/5936 comments processed)
Progress: 48.9% (2900/5936 comments processed)
Progress: 50.5% (3000/5936 comments processed)
Progress: 52.2% (3100/5936 comments processed)
Progress: 53.9% (3200/5936 comments processed)
Progress: 55.6% (3300/5936 comments processed)
Progress: 57.3% (3400/5936 comments processed)
Progress: 59.0% (3500/5936 comments processed)
Progress: 60.6% (3600/5936 comments processed)
Progress: 62.3% (3700/5936 comments processed)
Progress: 64.0% (3800/5936 comments processed)
Progress: 65.7% (3900/5936 comments processed)
Progress: 67.4% (4000/5936 comments processed)
Progress: 69.1% (4100/5936 comments processed)
Progress: 70.8% (4200/5936 comments processed)
Progress: 72.4% (4300/5936 comments processed)
Progress: 74.1% (4400/5936 comments processed)
Progress: 75.8% (4500/5936 comments processed)
Progress: 77.5% (4600/5936 comments processed)
Progress: 79.2% (4700/5936 comments processed)
Progress: 80.9% (4800/5936 comments processed)
Progress: 82.5% (4900/5936 comments processed)
Progress: 84.2% (5000/5936 comments processed)
Progress: 85.9% (5100/5936 comments processed)
Progress: 87.6% (5200/5936 comments processed)
Progress: 89.3% (5300/5936 comments processed)
Progress: 91.0% (5400/5936 comments processed)
Progress: 92.7% (5500/5936 comments processed)
Progress: 94.3% (5600/5936 comments processed)
Progress: 96.0% (5700/5936 comments processed)
Progress: 97.7% (5800/5936 comments processed)
Progress: 99.4% (5900/5936 comments processed)
Completed! Total comments fetched: 5936
Comments saved to reddit_comments_1730443051.csv

Demo: https://atlas.nomic.ai/data/auth0thread765/reddit-dataset

🔧Tools Used

Python, PRAW, Pandas, Nomic

👤Contact

Email Twitter
