What is the cleanest method to get coreferenced text returned #12142
-
Could you explain a bit more what you'd like to accomplish? There are two ways to access the coreference chains, both described in the coreferee documentation on interacting with the data model: https://github.com/msg-systems/coreferee#2-interacting-with-the-data-model. Suppose that we have the English example from the coreferee documentation:
At a basic level, you can see a
Suppose that you want the first (and only) coreference chain that wife is part of. You can index a
This is a
To unpack this:
Since a
I would recommend going through the documentation of
-
What I want is the following. The input is doc = nlp("Although he was very busy with his work, Peter had had enough of it. He and his wife decided they needed a holiday. They travelled to Spain because they loved the country very much.") and the output is:
Coreference resolution is only a starting step in the information extraction pipeline. I want to retrieve the coreferenced text and input it into the REBEL model next. I could perhaps come up with some crappy code to do this, but I want to use this example in my book and would therefore like the code to be as clean as possible. Thanks
-
A method like the one neuralCoref provides would be great.
-
Found a solution
-
Hi @tomasonjo, we considered but decided against implementing this requirement at the very beginning of Coreferee's life: see msg-systems/coreferee#1. The problem is that some of the languages Coreferee supports, and probably a majority of the world's languages that it might one day support, do not consistently express anaphors with overt pronouns. So in many cases there is nothing for a referred-to noun to replace. You can do this in English, but because it is language-specific, it is more appropriate to use the code supplied above outside Coreferee rather than within it.
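To make the idea of replacing an overt anaphor with its referred-to noun concrete, here is a minimal sketch of what such a helper could look like outside Coreferee. It is not the code posted in this thread. The names `resolve_text`, `joiner`, `tok`, and `fake_resolve` are hypothetical, and the sketch assumes a resolver that returns a list of referent tokens for an anaphor and `None` otherwise, which is how `doc._.coref_chains.resolve` is understood to behave. Stand-in tokens are used so that it runs without spaCy or Coreferee installed.

```python
from types import SimpleNamespace


def resolve_text(tokens, resolve, joiner=" and "):
    """Rebuild the text, replacing each anaphor with the tokens it refers to.

    tokens  : iterable of objects with .text and .whitespace_ (e.g. a spaCy Doc)
    resolve : callable returning a list of referent tokens, or None
              (assumption: with Coreferee, pass doc._.coref_chains.resolve)
    joiner  : hypothetical parameter, joins multiple referents (English-specific)
    """
    pieces = []
    for token in tokens:
        referents = resolve(token)
        if referents:
            # Anaphor: substitute the text of everything it points to.
            replacement = joiner.join(t.text for t in referents)
        else:
            # Not an anaphor: keep the token unchanged.
            replacement = token.text
        # Re-attach the token's trailing whitespace to preserve spacing.
        pieces.append(replacement + token.whitespace_)
    return "".join(pieces)


# Stand-in tokens that mimic the two spaCy Token attributes used above.
def tok(text, ws=" "):
    return SimpleNamespace(text=text, whitespace_=ws)


peter, wife, they = tok("Peter"), tok("wife"), tok("they")
sentence = [peter, tok("and"), tok("his"), wife, tok("said"),
            they, tok("left", ""), tok(".", "")]


# Stand-in resolver: only "they" is treated as an anaphor.
def fake_resolve(token):
    return [peter, wife] if token is they else None


resolved = resolve_text(sentence, fake_resolve)
print(resolved)  # Peter and his wife said Peter and wife left.
```

With a real pipeline the call would presumably be `resolve_text(doc, doc._.coref_chains.resolve)`. Plain substitution like this ignores case and possessives (a resolved "his" becomes "Peter", not "Peter's"), so the output usually needs some English-specific post-processing.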
-
Hi @richardpaulhudson, thanks for your input. However, it would probably make sense to at least include the code for the English version in the docs. At least, that's my opinion, as coreference resolution is usually only a single step in the NLP pipeline and not the end output. Thanks
-
Hey, I like using spaCy but I have little to no experience with Doc or Token objects. I would simply like to return the coreferenced text that I get by using the coreferee plugin. It seems that you can check coref clusters and resolve tokens with the appropriate method, but I have no idea what the cleanest way would be to return the coreferenced text of the entire Doc object.
Thanks,
Tomaž