Predicting and ranking possible next sentences in a narrative in an artificial grammar #9126

metalaureate · 2021-09-03T02:46:28Z


I'm probably pushing it with this question, but I'm hoping for the wise direction of spaCy's wonderful maintainers. Is it possible to use spaCy to predict and rank the probability of possible next sentences in an artificial grammar? The training data is a corpus of thousands of documents, each a sequence of 5 to 200 sentences. I've scoured Jurafsky & Martin and I'm wondering if this is an NLP problem that I can tackle in the spaCy framework. What spaCy feature domain would that map to?

My domain of discourse is a metalanguage for labeling narratives. A corpus of thousands of narratives (documents) is labeled into sentences of an artificial grammar (using spaCy to create this meta-grammar labeling). Each sentence can be described in BNF notation, involving about 200 entity types related through 5 semantic role relationships as arguments of 20 possible narrative function verbs. The purpose of my project is to auto-generate possible narrative paths from the preceding path of narrative sentences according to a global scoring function for narrative excitement.
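To make the shape of one grammar sentence concrete, here is a minimal sketch, assuming each sentence is one verb plus a set of role-to-entity-type arguments. The verb, role, and entity names are placeholders (the real inventories are not listed in this post), and `PlotSentence` and `make_sentence` are hypothetical helpers, not part of spaCy.

```python
from dataclasses import dataclass

# Placeholder inventories: the real grammar has about 20 verbs, 5 roles and
# about 200 entity types, none of which are named in the post.
VERBS = {"BETRAY", "RESCUE", "REVEAL"}
ROLES = {"AGENT", "PATIENT", "INSTRUMENT", "LOCATION", "GOAL"}
ENTITY_TYPES = {"HERO", "VILLAIN", "MENTOR", "SECRET", "CITY"}


@dataclass(frozen=True)
class PlotSentence:
    verb: str
    args: tuple  # (role, entity_type) pairs, sorted by role

    def __post_init__(self):
        # Reject anything outside the (placeholder) grammar inventories.
        if self.verb not in VERBS:
            raise ValueError(f"unknown verb: {self.verb}")
        for role, entity in self.args:
            if role not in ROLES or entity not in ENTITY_TYPES:
                raise ValueError(f"bad argument: {role}={entity}")

    def label(self) -> str:
        """Flatten the sentence into a single symbol usable as a 'token'."""
        inner = ",".join(f"{role}={entity}" for role, entity in self.args)
        return f"{self.verb}({inner})"


def make_sentence(verb, **roles):
    """Build a sentence from keyword arguments such as AGENT='HERO'."""
    return PlotSentence(verb, tuple(sorted(roles.items())))


sentence = make_sentence("BETRAY", AGENT="MENTOR", PATIENT="HERO")
```

Flattening each sentence into one label is only one possible encoding; it treats a whole sentence as a single symbol, which keeps sequence modelling simple but ignores similarity between sentences that share a verb or an argument.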

polm · 2021-09-04T04:37:10Z


Next sentence prediction is kind of like language modelling / next token prediction, and is a pretty common task. There isn't anything similar in spaCy by default. You could wire up a component to do this if you have a way of getting candidate next sentences; the Entity Linker might be a useful reference, since it uses an external database. Basically, you have an input text, and your model uses features of that text to select from a list of candidates generated from the input text and external data (a knowledge base for entity linking, or your grammar in your case).
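As a rough illustration of the "select from a list of candidates" idea, here is a minimal sketch in plain Python, not a spaCy component. It assumes each grammar sentence has already been reduced to a single label and uses a smoothed bigram count over those labels as a stand-in for a real model; the labels, the toy corpus, and the function names are all made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(documents):
    """Count how often each sentence label follows another."""
    counts = defaultdict(Counter)
    for doc in documents:
        for prev, nxt in zip(doc, doc[1:]):
            counts[prev][nxt] += 1
    return counts


def rank_candidates(counts, previous, candidates, alpha=1.0):
    """Rank candidates by add-alpha smoothed P(candidate | previous).

    Probabilities are normalised over the candidate list only, so they
    sum to 1 across the candidates that were passed in.
    """
    seen = counts.get(previous, Counter())
    total = sum(seen[c] + alpha for c in candidates)
    scored = [(c, (seen[c] + alpha) / total) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus: each document is a list of made-up sentence labels.
corpus = [
    ["MEET", "BETRAY", "CHASE", "RESCUE"],
    ["MEET", "BETRAY", "REVEAL"],
    ["MEET", "CHASE", "RESCUE"],
]
model = train_bigrams(corpus)
ranking = rank_candidates(model, "BETRAY", ["CHASE", "REVEAL", "RESCUE"])
```

A bigram only looks one sentence back; with sequences of up to 200 sentences, a longer context or a learned scorer would likely be needed, but the candidate-then-rank structure stays the same.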

I have absolutely no idea how that would go together with your artificial grammar. It sounds like maybe you could model it as a neural NLG/seq2seq problem, but NLG is specifically out of scope for spaCy, and it wouldn't necessarily fit your grammar constraints anyway.

If you can generate candidate sentences it should be possible to train a model to pick the best one, but I'm not sure I really know of any similar problems...


metalaureate Sep 4, 2021
Author

Awesome, yes... NLG/seq2seq is a great call out. You guys help me so much just by pointing out the domain I might be in. It stops me floundering. I didn't even know NLG was a thing. I was messing about with LSTMs in Keras thinking that perhaps this was like a weather forecasting problem.

You can think of my training set as a folkloric abstraction of the plots of movies and TV shows. Each sentence is a plot move that is scored by its contribution to a global performance variable, which I call excitement in this scenario. So for any input series of sentences, I have, in theory, candidate next sentences.
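One simple way to use those per-move scores is to average each move's observed contribution and order candidate next moves by that average. This is a minimal sketch under that assumption; the move labels and scores are invented for illustration, and a real scoring function would presumably depend on the preceding path as well.

```python
from collections import defaultdict


def mean_excitement(scored_documents):
    """Average the excitement contribution observed for each plot move."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for doc in scored_documents:
        for move, score in doc:
            totals[move] += score
            counts[move] += 1
    return {move: totals[move] / counts[move] for move in totals}


def rank_by_excitement(averages, candidates, default=0.0):
    """Order candidate next moves by average observed contribution."""
    return sorted(candidates, key=lambda m: averages.get(m, default), reverse=True)


# Toy data: each document is a list of (move label, excitement contribution).
scored_corpus = [
    [("MEET", 0.1), ("BETRAY", 0.8), ("RESCUE", 0.7)],
    [("MEET", 0.3), ("CHASE", 0.5), ("RESCUE", 0.5)],
]
averages = mean_excitement(scored_corpus)
best_first = rank_by_excitement(averages, ["MEET", "CHASE", "RESCUE", "BETRAY"])
```

Moves never seen in training fall back to `default`, which is a design choice worth revisiting: a neutral default hides novel moves, while a higher one would encourage exploring them.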
