Replies: 3 comments
-
One-hot vectors might work, except the question is at which level you get the embedding. Typically each row is labeled with a word, and you get a column vector of 0's except for a 1 at the word of interest. How would the following pattern be represented?
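For concreteness, here is a minimal Wolfram Language sketch of that word-level one-hot construction (the vocabulary is made up for illustration):

    (* toy vocabulary; purely illustrative *)
    vocabulary = {"the", "quick", "brown", "fox"};
    oneHot[word_String] := Boole[# === word] & /@ vocabulary
    oneHot["fox"]  (* -> {0, 0, 0, 1} *)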
-
Realizing now, LexicalDispersionPlot effectively does this. The positions of lexical patterns are visualized with MatrixPlot: each row is a pattern and each column is a position in the source text. So to construct the one-hot vector for a pattern, just put a 1 at each position (x, y) where the pattern occurs.
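A sketch of that pattern-by-position matrix, using single words to stand in for full lexical patterns (the text and patterns are invented for illustration):

    (* rows are patterns, columns are positions in the source text *)
    words = StringSplit["the quick brown fox jumps over the lazy dog the fox"];
    patterns = {"the", "fox"};
    positionMatrix = Table[Boole[w === p], {p, patterns}, {w, words}];
    MatrixPlot[positionMatrix]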
-
A word embedding is a word-context encoding vector; a lexical embedding is a lexical-pattern-context encoding vector. For each l-gram in the text whose size equals that of the lexical pattern, there is a 0 or 1 representing its position. Then take the sum over the lexical patterns at each position across the text to get real-valued embeddings. Then you have lexical features instead of word features.
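One reading of that summing step, reusing positionMatrix from the sketch above (which binary matrix and which axis to sum over are assumptions here):

    (* sum the 0/1 indicators over patterns at each position *)
    positionEmbedding = Total[positionMatrix]        (* one value per position *)
    patternEmbedding = Total[positionMatrix, {2}]    (* one value per pattern *)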
-
A LexicalPattern embedding is like a word embedding.
Maybe a class(?) embedding?
It might be difficult because the structure of a lexical pattern is complex, whereas a sequence of words is simple. That is to say: each word in the vocabulary gets its own integer, but a lexical pattern can have variations if it is AnyOrdered or contains Optionals (see the sketch below).
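A purely hypothetical sketch: assuming AnyOrdered permits any permutation of its words and an Optional word may be absent, each concrete variant of a pattern could be enumerated and given its own integer id. expandAnyOrdered and expandOptional are invented helpers, not part of any existing API:

    (* enumerate concrete variants of a lexical pattern; hypothetical semantics *)
    expandAnyOrdered[words_List] := Permutations[words]
    expandOptional[words_List, opt_String] := {words, DeleteCases[words, opt]}
    variants = expandAnyOrdered[{"quick", "brown"}]
    (* -> {{"quick", "brown"}, {"brown", "quick"}} *)
    variantIds = AssociationThread[variants -> Range[Length[variants]]]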