This repo applies natural language processing to gauge the sentiment of the latest news articles featuring Bitcoin and Ethereum. It also applies fundamental NLP techniques to explore other factors that may relate to coin prices, such as common words and phrases and the organizations and entities mentioned in the articles.
- Which coin had the highest mean positive score? Bitcoin
- Which coin had the highest negative score? Ethereum
- Which coin had the highest positive score? Ethereum
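
The three comparisons above can be reproduced from per-article sentiment scores. The following is a minimal sketch, assuming each article has already been scored into a dict with `neg`, `neu`, and `pos` keys (the shape returned by NLTK's VADER `SentimentIntensityAnalyzer().polarity_scores(text)`; whether this repo uses VADER is an assumption). The `sample_scores` values and the `summarize` and `leader` helpers are hypothetical and only illustrate the comparison, not the repo's actual results.

```python
from statistics import mean

# Hypothetical per-article scores (made-up numbers, not data from this repo).
# VADER-style dicts also carry a "compound" key, omitted here for brevity.
sample_scores = {
    "bitcoin": [
        {"neg": 0.00, "neu": 0.85, "pos": 0.15},
        {"neg": 0.05, "neu": 0.85, "pos": 0.10},
    ],
    "ethereum": [
        {"neg": 0.20, "neu": 0.80, "pos": 0.00},
        {"neg": 0.00, "neu": 0.80, "pos": 0.20},
    ],
}


def summarize(scores):
    """Return mean positive, max positive, and max negative scores."""
    return {
        "mean_pos": mean(s["pos"] for s in scores),
        "max_pos": max(s["pos"] for s in scores),
        "max_neg": max(s["neg"] for s in scores),
    }


summary = {coin: summarize(scores) for coin, scores in sample_scores.items()}


def leader(metric):
    """Name of the coin with the largest value for the given metric."""
    return max(summary, key=lambda coin: summary[coin][metric])


for metric in ("mean_pos", "max_neg", "max_pos"):
    print(metric, "->", leader(metric))
```

Keeping the mean and the maximum as separate metrics matters here: a coin can have the single most positive article without having the highest average positivity.
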
- Use NLTK to produce the ngrams for N = 2.
- List the top 10 words for each coin.
  - Bitcoin:
    - 'bitcoin', 16
    - 'reuters', 12
    - 'cryptocurrency', 5
    - 'year', 5
    - 'november', 5
    - 'currency', 5
    - 'ruvic', 5
    - 'virtual', 4
    - 'day', 3
    - 'reaching', 3
  - Ethereum:
    - 'reuters', 28
    - 'bitcoin', 15
    - 'currency', 12
    - 'virtual', 11
    - 'ruvic', 10
    - 'london', 7
    - 'november', 6
    - 'new', 6
    - 'york', 5
    - 'ethereum', 4
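
The bigram and top-10 word counts listed above could be produced along these lines. This is a rough sketch rather than the repo's actual pipeline: the regex tokenizer, the tiny `STOP_WORDS` set, the sample `text`, and the helper names `tokenize`, `top_bigrams`, and `top_words` are all assumptions made for illustration. It uses `nltk.util.ngrams` when NLTK is installed and falls back to an equivalent `zip`-based helper otherwise.

```python
import re
from collections import Counter

try:
    from nltk.util import ngrams  # NLTK's n-gram helper
except ImportError:
    # Fallback yielding the same tuples, so the sketch runs without NLTK.
    def ngrams(sequence, n):
        sequence = list(sequence)
        return zip(*(sequence[i:] for i in range(n)))

# Tiny illustrative stop-word set (assumption); a real run would likely use
# a fuller list such as nltk.corpus.stopwords.words("english").
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "and", "is", "on", "for", "as"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def top_bigrams(tokens, k=10):
    """Return the k most frequent bigrams (N = 2) with their counts."""
    return Counter(ngrams(tokens, 2)).most_common(k)


def top_words(tokens, k=10):
    """Return the k most frequent single words with their counts."""
    return Counter(tokens).most_common(k)


# Hypothetical text standing in for the concatenated article content.
text = "Bitcoin is a virtual currency. The virtual currency bitcoin rose."
tokens = tokenize(text)
print(top_words(tokens, 3))
print(top_bigrams(tokens, 3))
```

Running the same two helpers once on the Bitcoin articles and once on the Ethereum articles would give one word list and one bigram list per coin.
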
Run named entity recognition on the articles for both coins and visualize the tags using spaCy.
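
As a hedged sketch of that step, the helper below runs a spaCy pipeline over a text, collects the entity tags, and asks displaCy for the highlighted markup. The model name `en_core_web_sm` and the function names `analyze` and `count_labels` are assumptions, not something this repo specifies; the `example` pairs are hypothetical. spaCy is imported inside `analyze` so the counting helper can be tried without it installed.

```python
from collections import Counter


def analyze(text, model="en_core_web_sm"):
    """Return (entities, html) for a text using a spaCy pipeline.

    Assumes spaCy and the named model are installed; the model name is an
    assumption made for this sketch.
    """
    import spacy
    from spacy import displacy

    nlp = spacy.load(model)
    doc = nlp(text)
    # Each entity as (surface text, label), e.g. an organization or a place.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    # jupyter=False makes displaCy return the markup as a string.
    html = displacy.render(doc, style="ent", jupyter=False)
    return entities, html


def count_labels(entities):
    """Count how often each entity label occurs."""
    return Counter(label for _, label in entities)


# Hypothetical pairs in the shape analyze() returns for its entities.
example = [("Reuters", "ORG"), ("London", "GPE"), ("New York", "GPE")]
print(count_labels(example))
```

Calling `analyze` once on the combined Bitcoin text and once on the combined Ethereum text would give one entity list and one visualization per coin; inside a notebook, the returned markup can be displayed directly.
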