pronoun resolution backwards


This is perhaps not an answer to be happy with, but I think there's no such functionality built in anywhere. You can code it yourself without too much difficulty, though. Here's an outline of how I'd do it with CoreNLP:

  1. Still run coref. This'll tell you that the first "the man" and the second "the man" are coreferent, and so you can replace the second one with a pronoun.

  2. Run the gender annotator from CoreNLP. This is a poorly-documented and even more poorly advertised annotator that tries to attach gender to tokens in a sentence.

  3. Somehow figure out plurals. Most of the time you could use the part-of-speech tag: plural nouns get the tags NNS or NNPS. There are some complications, though, so you might also want to consider (1) the existence of conjunctions in the antecedent; (2) the lemma of a word being different from its text; (3) especially in combination with (2), the word ending in 's' or 'es'. That last check can distinguish between lemmatizations which strip out plurals and lemmatizations which strip out tenses, etc.

  4. This is enough to figure out the right pronoun. Now it's just a matter of chopping up the sentence and putting it back together. This is a bit of a pain if you do it in CoreNLP -- the code is just not set up to change the text of a sentence -- but in the worst case you can always just re-annotate a new surface form.
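
To make steps 3 and 4 a bit more concrete, here is a minimal Python sketch of the plural heuristic, the pronoun choice, and the token-level substitution. It doesn't call CoreNLP at all: it assumes you've already pulled the word, lemma, POS tag, and gender label for each token out of the annotations. The `Token` class, the function names, the `"MALE"`/`"FEMALE"` label strings, the "head is the last token of the mention" shortcut, and the `subject` flag for choosing case are all my own assumptions for illustration, not anything CoreNLP gives you in this form.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    word: str
    lemma: str
    pos: str                      # Penn Treebank tag, e.g. "NN", "NNS"
    gender: Optional[str] = None  # assumed labels: "MALE", "FEMALE", or None


def is_plural(mention: List[Token]) -> bool:
    """Heuristic plural check for a mention (step 3)."""
    # A conjunction inside the mention ("the man and the dog") suggests plural.
    if any(t.pos == "CC" for t in mention):
        return True
    head = mention[-1]  # crude assumption: the head noun is the last token
    if head.pos in ("NNS", "NNPS"):
        return True
    # Fallback: noun whose lemma differs from its text and which ends in "s"
    # (this also covers the "es" ending).
    word = head.word.lower()
    return (head.pos.startswith("NN")
            and word != head.lemma.lower()
            and word.endswith("s"))


# (subject form, object form) per gender label; None means unknown/neuter.
PRONOUNS = {"MALE": ("he", "him"), "FEMALE": ("she", "her"), None: ("it", "it")}


def choose_pronoun(mention: List[Token], subject: bool = True) -> str:
    """Pick a pronoun from plurality and the head token's gender label."""
    if is_plural(mention):
        return "they" if subject else "them"
    subj, obj = PRONOUNS.get(mention[-1].gender, PRONOUNS[None])
    return subj if subject else obj


def replace_mention(tokens: List[Token], start: int, end: int,
                    subject: bool = True) -> List[str]:
    """Return the words with tokens[start:end] replaced by a pronoun (step 4)."""
    pronoun = choose_pronoun(tokens[start:end], subject)
    # Capitalize when the mention opens a sentence.
    if start == 0 or tokens[start - 1].word in {".", "!", "?"}:
        pronoun = pronoun.capitalize()
    words = [t.word for t in tokens]
    return words[:start] + [pronoun] + words[end:]


tokens = [
    Token("The", "the", "DT"), Token("man", "man", "NN", "MALE"),
    Token("walked", "walk", "VBD"), Token("in", "in", "RB"),
    Token(".", ".", "."),
    Token("The", "the", "DT"), Token("man", "man", "NN", "MALE"),
    Token("sat", "sit", "VBD"), Token(".", ".", "."),
]
# Coref (step 1) would tell you that tokens[5:7] repeats the earlier mention.
print(" ".join(replace_mention(tokens, 5, 7)))  # The man walked in . He sat .
```

The output here is just a list of words, which sidesteps the problem of editing a CoreNLP sentence in place: you join the words back into a string and re-annotate that if you need annotations on the new surface form.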

Hope this helps somewhat!