Can token vectors be used for word sense disambiguation? #13047
lawctan started this conversation in Help: Best practices
I don't know how Japanese works, so I will talk about English word embeddings. With static vector embeddings, a given word (i.e. token) is always assigned a single vector representation, so IMHO it isn't possible to use the vector on its own for disambiguation. For example, the word "fish" will have the same vector representation whether you use it as a noun (a fish that swims in water) or as a verb (to fish, as in fishing). Instead, you can use a transformer, which takes the surrounding words and their positions in the sentence into account, or use the dependency relations to identify whether the word is a noun or a verb, for example.
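To make the point about a single vector per word concrete, here is a minimal sketch of a static lookup. It does not use spaCy; `STATIC_VECTORS`, `embed`, and the vector values are made up for illustration, on the assumption that a static embedding table behaves like a plain dictionary keyed by word form.

```python
# Toy static embedding table: one fixed vector per word form.
# All names and numbers here are hypothetical, chosen only for illustration.
STATIC_VECTORS = {
    "the": [0.1, 0.1, 0.1],
    "fish": [0.8, 0.4, 0.1],
    "swims": [0.7, 0.2, 0.3],
    "we": [0.2, 0.6, 0.1],
    "here": [0.1, 0.3, 0.5],
}


def embed(sentence):
    """Look up each lowercased token; the neighbouring words play no role."""
    return {tok: STATIC_VECTORS[tok] for tok in sentence.lower().split()}


noun_use = embed("The fish swims")  # "fish" as a noun
verb_use = embed("We fish here")    # "fish" as a verb

# The lookup cannot tell the two uses apart.
same = noun_use["fish"] == verb_use["fish"]
print(same)  # True
```

Because the vector depends only on the word form, anything computed from it alone gives the same answer in both sentences.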
I am using the ja_core_news_lg model, which contains vectors. If a word can have multiple meanings, can I compare the token's vector with a vector for each of those meanings/senses and choose the most similar one? Would that be a reliable way to figure out the correct meaning of a token?
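The approach described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: it does not call spaCy, and `TOKEN_VECTORS`, `SENSE_VECTORS`, `pick_sense`, and all numbers are hypothetical. In particular, it assumes you already have one vector per sense from somewhere, which the model's vectors alone do not give you.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical static vector for the token (one per word form).
TOKEN_VECTORS = {"fish": [0.8, 0.4, 0.1]}

# Hypothetical sense vectors; how to obtain these is left open.
SENSE_VECTORS = {
    "fish (noun: the animal)": [0.9, 0.2, 0.1],
    "fish (verb: to catch fish)": [0.3, 0.9, 0.2],
}


def pick_sense(token, context):
    """Return the sense whose vector is closest to the token's static vector.

    `context` is accepted but never used: a static vector carries no
    information about the sentence the token appears in.
    """
    vec = TOKEN_VECTORS[token]
    return max(SENSE_VECTORS, key=lambda sense: cosine(vec, SENSE_VECTORS[sense]))


sense_a = pick_sense("fish", "The fish swims")
sense_b = pick_sense("fish", "We fish here")
print(sense_a == sense_b)  # True
```

With a static vector as the only input, the comparison always selects the same sense for a given word, whichever sentence it occurs in. Making the choice depend on the sentence requires an input that varies with context.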