The tasks performed are:
Basic Text Processing
Dividing the processed text corpus into train and test sets in an 80:20 ratio.
Computing maximum-likelihood estimates (MLE) for unigrams, bigrams, trigrams and quadgrams.
generate(mle) : generates a sentence given a model.
probLog(sentence,model_name) : gives the log-space probability of a sentence given a model.
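
As a rough illustration of the MLE and log-probability steps, here is a minimal bigram-only sketch. The names bigram_mle and log_prob, the <s> and </s> padding tokens, and the toy corpus are assumptions made for this example; they are not the project's actual generate/probLog code.

```python
import math
from collections import Counter

def bigram_mle(sentences):
    """Return {(w1, w2): count(w1, w2) / count(w1)} for tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        # Assumed sentence-boundary markers, not taken from the project.
        padded = ["<s>"] + tokens + ["</s>"]
        for w1, w2 in zip(padded, padded[1:]):
            context_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {bg: c / context_counts[bg[0]] for bg, c in bigram_counts.items()}

def log_prob(tokens, model):
    """Sum of log P(w2 | w1) over the sentence; -inf if a bigram is unseen."""
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for bg in zip(padded, padded[1:]):
        p = model.get(bg, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total

# Toy corpus, for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
model = bigram_mle(corpus)
seen = log_prob(["the", "cat", "sat"], model)
unseen = log_prob(["the", "bird"], model)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied. The unseen sentence gets a log probability of minus infinity under plain MLE, which is the problem smoothing addresses.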
Smoothing:
Classes goodTuring and ADD1 update the counts after smoothing and give the new probability of a sample.
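
The two smoothing ideas can be sketched as follows. This is not the goodTuring or ADD1 class from the project; the function names, the toy counts, and the fallback to the raw count when no n-gram has count c+1 are assumptions made to keep the example short.

```python
from collections import Counter

def add_one_prob(w1, w2, bigram_counts, context_counts, vocab_size):
    """Add-one (Laplace) estimate: (count(w1, w2) + 1) / (count(w1) + V)."""
    return (bigram_counts[(w1, w2)] + 1) / (context_counts[w1] + vocab_size)

def good_turing_counts(ngram_counts):
    """Basic Good-Turing adjusted counts: c* = (c + 1) * N_{c+1} / N_c.

    Assumption for this sketch: when N_{c+1} is zero the raw count is kept.
    """
    freq_of_freq = Counter(ngram_counts.values())
    adjusted = {}
    for ngram, c in ngram_counts.items():
        n_c, n_c1 = freq_of_freq[c], freq_of_freq[c + 1]
        adjusted[ngram] = (c + 1) * n_c1 / n_c if n_c1 > 0 else float(c)
    return adjusted

# Toy counts, for illustration only.
bigram_counts = Counter({("the", "cat"): 1, ("the", "dog"): 1})
context_counts = Counter({"the": 2})
vocab = ["the", "cat", "dog"]
p_seen = add_one_prob("the", "cat", bigram_counts, context_counts, len(vocab))
p_unseen = add_one_prob("the", "the", bigram_counts, context_counts, len(vocab))

toy_counts = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1, ("d", "e"): 2}
adjusted = good_turing_counts(toy_counts)
```

With add-one smoothing the unseen bigram receives a small nonzero probability, and the probabilities over the vocabulary for a given context still sum to one.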
Model Evaluation:
perPlexity(data,bigramLst,model) : measures how well a model predicts a sample/sentence (lower perplexity means a better fit).
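
For reference, perplexity over N predicted tokens is exp(-(1/N) * sum of log probabilities). The sketch below is a bigram version under stated assumptions: the perplexity function and its prob callback are hypothetical names and do not match the signature of perPlexity above, and prob must return a nonzero value for every bigram (for example, a smoothed estimate).

```python
import math

def perplexity(sentences, prob):
    """Perplexity of tokenized sentences under a bigram probability function.

    prob(w1, w2) is assumed to return P(w2 | w1) > 0 for every pair.
    """
    log_sum = 0.0
    n = 0
    for tokens in sentences:
        # Assumed sentence-boundary markers, not taken from the project.
        padded = ["<s>"] + tokens + ["</s>"]
        for w1, w2 in zip(padded, padded[1:]):
            log_sum += math.log(prob(w1, w2))
            n += 1
    return math.exp(-log_sum / n)

# A model that assigns 0.25 to every bigram, for illustration only.
uniform_pp = perplexity([["a", "b"]], lambda w1, w2: 0.25)
```

A model that spreads probability uniformly over four choices at every step has perplexity 4, which is why perplexity is often read as an effective branching factor.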