Fohlen/reddit_sentiment

reddit_sentiment

A toolkit to download, preprocess and analyse Reddit sentiment data. The data is derived from the Reddit Comments Archive hosted by Pushshift.

How to run

  1. Install the requirements (jq, curl), e.g. brew install jq curl
  2. Install Poetry
  3. Download archives
  4. Distill a smaller dataset

poetry install

# use --help for help with the commands
poetry run download-annotate-archives 2005 2006 --multithreading
poetry run distill-dataset
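
The steps above can be wrapped in a small script that checks the prerequisites before starting. This is only a sketch: `missing_tools` and `run_pipeline` are hypothetical helper names that are not part of this repository, and the years and the `--multithreading` flag are copied from the example commands above.

```shell
#!/usr/bin/env bash
# Sketch: verify the prerequisites, then run the README steps in order.
# missing_tools and run_pipeline are illustrative names, not part of the repo.

# Print the name of every given tool that is not on PATH.
# Returns 1 if at least one tool is missing, 0 otherwise.
missing_tools() {
  local status=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Install dependencies, download the archives, and distill the dataset.
# Stops at the first step that fails.
run_pipeline() {
  if ! missing_tools jq curl poetry; then
    echo "Install the tools listed above first." >&2
    return 1
  fi
  poetry install &&
    poetry run download-annotate-archives 2005 2006 --multithreading &&
    poetry run distill-dataset
}
```

To use it, source the file in a shell at the repository root and call `run_pipeline`. Checking for the tools first avoids starting a long download that later fails because jq or curl is missing.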

Analysis

For basic and more advanced uses of the resulting dataset, see the analysis folder.

About

Software to create a reddit sentiment corpus
