An implementation of LSTM architectures in Keras for sentiment analysis of movie reviews from the Large Movie Review Dataset.
The Large Movie Review Dataset (often referred to as the IMDB dataset) contains 25,000 highly polar movie reviews (good or bad) for training and another 25,000 for testing. The task is to determine whether a given movie review expresses positive or negative sentiment.
Link to download dataset here
The data was collected by Stanford researchers and used in a 2011 paper, in which the data was split 50-50 between training and testing and an accuracy of 88.89% was achieved.
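As a rough illustration of how this data can be prepared for an LSTM, the sketch below uses the copy of the dataset bundled with Keras (`keras.datasets.imdb`), where each review is already encoded as a list of word indices and labels are 0 (negative) or 1 (positive). The vocabulary size of 5,000 and the review length of 500 tokens are assumptions chosen for the example, not values confirmed by this repository.

```python
from tensorflow import keras

VOCAB_SIZE = 5000  # assumption: keep only the 5,000 most frequent words
MAX_LEN = 500      # assumption: pad or truncate every review to 500 tokens


def pad_reviews(sequences, max_len=MAX_LEN):
    """Left-pad with zeros (or truncate from the front) to a fixed length."""
    return keras.utils.pad_sequences(sequences, maxlen=max_len)


def load_imdb(vocab_size=VOCAB_SIZE, max_len=MAX_LEN):
    """Download the Keras copy of the dataset and return padded arrays."""
    (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(
        num_words=vocab_size
    )
    return (pad_reviews(x_train, max_len), y_train), (
        pad_reviews(x_test, max_len),
        y_test,
    )
```

Calling `load_imdb()` downloads the data on first use and returns `(x_train, y_train), (x_test, y_test)`, with each `x` array shaped `(25000, MAX_LEN)`.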
The following models are implemented:
- A Simple LSTM Network
- LSTM Network With Dropout
- LSTM and Convolutional Neural Network with dropout
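The three models listed above could be sketched in Keras roughly as follows. This is a minimal sketch, not the repository's exact code: the embedding size, number of LSTM units, dropout rate, and convolution settings are all assumptions, so these definitions will not necessarily reproduce the accuracies reported in this README.

```python
from tensorflow import keras
from tensorflow.keras import layers

# All hyperparameters below are assumptions made for illustration.
VOCAB_SIZE = 5000   # size of the word-index vocabulary
MAX_LEN = 500       # length of each padded review
EMBED_DIM = 32      # dimensionality of the word embeddings
LSTM_UNITS = 100    # number of LSTM memory units
DROPOUT_RATE = 0.2  # fraction of activations dropped during training


def build_simple_lstm():
    """Embedding -> LSTM -> sigmoid output."""
    return keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        layers.LSTM(LSTM_UNITS),
        layers.Dense(1, activation="sigmoid"),
    ])


def build_lstm_with_dropout():
    """Same as the simple model, with dropout before and after the LSTM."""
    return keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        layers.Dropout(DROPOUT_RATE),
        layers.LSTM(LSTM_UNITS),
        layers.Dropout(DROPOUT_RATE),
        layers.Dense(1, activation="sigmoid"),
    ])


def build_cnn_lstm_with_dropout():
    """A 1D convolution and pooling stage feeding the LSTM, with dropout."""
    return keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(DROPOUT_RATE),
        layers.LSTM(LSTM_UNITS),
        layers.Dropout(DROPOUT_RATE),
        layers.Dense(1, activation="sigmoid"),
    ])


def compile_for_sentiment(model):
    """Binary classification setup: positive vs. negative review."""
    model.compile(
        loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]
    )
    return model
```

A model built this way can be trained with `compile_for_sentiment(build_simple_lstm()).fit(x_train, y_train, ...)` on padded integer sequences. The pooling layer in the third model halves the sequence length seen by the LSTM, which typically shortens training time.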
Accuracy achieved by each model:
- Simple LSTM - 85.72%
- LSTM Network With Dropout - 85.67%
- LSTM and Convolutional Neural Network with dropout - 88.41%
The following resources are extremely useful for a detailed understanding of this task or Deep Learning techniques:
- WildML blog by Denny Britz
- Machine Learning Mastery blog by Dr. Jason Brownlee
- Andrej Karpathy's blog on RNNs