Bag-of-Words Model #37

imsanjoykb · 2020-12-14T17:42:52Z

A simple and effective model for thinking about text documents in machine learning is called the Bag-of-Words Model, or BoW.

The model is simple in that it throws away all of the order information in the words and focuses on the occurrence of words in a document.

This can be done by assigning each word a unique number. Then any document we see can be encoded as a fixed-length vector with the length of the vocabulary of known words. The value in each position in the vector could be filled with a count or frequency of each word in the encoded document.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bag-of-Words Model #37

Bag-of-Words Model #37

imsanjoykb commented Dec 14, 2020

Bag-of-Words Model #37

Bag-of-Words Model #37

Comments

imsanjoykb commented Dec 14, 2020