NLTK
When exporting news about a certain topic, a lot of trash is incorporated. Traditionally, finding the trash meant either manually looking through each article or using Boolean search methods. Boolean search restricts how many articles come in. I would like to use machine learning to sift through all the trash and keep only the relevant articles.
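One way the filtering step could look is a small Naive Bayes classifier from NLTK, trained on a handful of hand-labeled articles. This is only a sketch: the example headlines, the "relevant"/"trash" labels, and the helper names `word_features` and `is_relevant` are assumptions made for illustration, and a real training set would need far more labeled articles.

```python
import re

from nltk import NaiveBayesClassifier


def word_features(text):
    """Turn a text into bag-of-words presence features for NLTK."""
    words = re.findall(r"[a-z']+", text.lower())
    return {f"contains({word})": True for word in words}


# Hypothetical hand-labeled examples; replace with real exported articles.
labeled_articles = [
    ("Central bank raises interest rates to curb inflation", "relevant"),
    ("Inflation slows as interest rates stay high", "relevant"),
    ("Bank holds interest rates steady amid inflation fears", "relevant"),
    ("Celebrity chef opens new restaurant downtown", "trash"),
    ("Local team wins championship after dramatic final", "trash"),
    ("New restaurant review: the best pizza downtown", "trash"),
]

# NLTK expects a list of (feature_dict, label) pairs.
train_set = [(word_features(text), label) for text, label in labeled_articles]
classifier = NaiveBayesClassifier.train(train_set)


def is_relevant(text):
    """Return True if the classifier labels the text as relevant."""
    return classifier.classify(word_features(text)) == "relevant"
```

Calling `is_relevant(article_text)` on each exported article would then keep or drop it. Unlike a Boolean query, nothing is excluded up front: every article comes in and the classifier decides afterwards.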
Once the relevant articles are exported, I would like to extract interesting words and graph how often they appear over time.
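The counting behind such a graph might be sketched as follows, using only the Python standard library. The article dates, texts, the word list, and the function name `word_counts_over_time` are all assumptions for illustration. The regex tokenizer is a stand-in that could be swapped for an NLTK tokenizer, and grouping by month is just one possible choice of time bucket.

```python
import re
from collections import Counter, defaultdict


def word_counts_over_time(articles, words):
    """Count how often each word of interest appears per month.

    articles: iterable of (date, text) pairs, date as "YYYY-MM-DD".
    words: set of lowercase words to track.
    Returns (months, series), where series[word][i] is the count of
    that word in months[i].
    """
    per_month = defaultdict(Counter)
    for date, text in articles:
        month = date[:7]  # "YYYY-MM"
        tokens = re.findall(r"[a-z']+", text.lower())
        per_month[month].update(t for t in tokens if t in words)

    months = sorted(per_month)
    series = {w: [per_month[m][w] for m in months] for w in sorted(words)}
    return months, series


# Hypothetical articles that already passed the relevance filter.
sample_articles = [
    ("2021-01-05", "Inflation rises as rates stay low"),
    ("2021-01-20", "Rates unchanged, inflation steady"),
    ("2021-02-11", "Inflation, inflation, inflation: rates go up"),
    ("2021-03-02", "Markets calm after rates decision"),
]

months, series = word_counts_over_time(sample_articles, {"inflation", "rates"})
```

Each list in `series` lines up with `months`, so it can be passed directly to a plotting library as one line per word.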