assignment2

Info

The tasks performed are:

Basic Text Processing
Dividing the processed text corpus into train and test sets in an 80:20 ratio.
Computing maximum likelihood estimates (MLE) for unigrams, bigrams, trigrams, and quadgrams.
generate(mle) : generates a sentence from a given model.
probLog(sentence,model_name) : gives the log-space probability of a sentence under a given model.
Smoothing:
  Classes goodTuring and ADD1 update the counts after smoothing and give the new probability of a sample.
Model Evaluation:
  perPlexity(data,bigramLst,model) : measures how well a model predicts a sample or sentence (lower is better).
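
The tasks above can be illustrated with a minimal bigram sketch in Python. It is not the repository's code: the function names (`bigram_counts`, `bigram_prob`, `log_prob`, `perplexity`), the `<s>` and `</s>` boundary markers, and the toy corpus are all assumptions made for illustration. It shows an MLE estimate, an add-one smoothed estimate, a log-space sentence probability, and perplexity. Good-Turing smoothing and sentence generation are left out.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bigram_counts(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(ngrams(tokens, 2))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams, add_one=False):
    """P(word | prev): MLE by default, add-one smoothed if requested."""
    if add_one:
        # Assumption: the vocabulary is every seen type except <s>,
        # which is never predicted as a next word.
        vocab_size = len(unigrams) - 1
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

def log_prob(sentence, unigrams, bigrams, add_one=True):
    """Natural-log probability of a tokenized sentence under the bigram model."""
    tokens = ["<s>"] + sentence + ["</s>"]
    total = 0.0
    for prev, word in ngrams(tokens, 2):
        p = bigram_prob(prev, word, unigrams, bigrams, add_one)
        if p == 0.0:
            return float("-inf")  # unseen bigram without smoothing
        total += math.log(p)
    return total

def perplexity(sentences, unigrams, bigrams, add_one=True):
    """exp of the negative average log probability per predicted token."""
    total_log_prob = 0.0
    n_predictions = 0
    for sent in sentences:
        total_log_prob += log_prob(sent, unigrams, bigrams, add_one)
        n_predictions += len(sent) + 1  # each word plus the </s> marker
    return math.exp(-total_log_prob / n_predictions)

# Toy corpus (hypothetical data, already tokenized).
train = [["a", "b"], ["a", "c"]]
unigrams, bigrams = bigram_counts(train)
print(bigram_prob("a", "b", unigrams, bigrams))                # MLE
print(bigram_prob("a", "b", unigrams, bigrams, add_one=True))  # add-one
print(perplexity([["a", "b"]], unigrams, bigrams))
```

Without smoothing, any test sentence containing an unseen bigram gets probability zero and infinite perplexity, which is why the smoothed estimates are the ones normally used for evaluation on the held-out 20%.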