Convert words between verb/noun/adjective forms

python nlp nltk wordnet

This is more of a heuristic approach. I have just coded it, so apologies for the style. It uses derivationally_related_forms() from WordNet. I have implemented nounify; I guess verbify works analogously. From what I've tested, it works pretty well:

    from nltk.corpus import wordnet as wn

    # Note: written against an older NLTK release in which lemmas, name and
    # synset were attributes; newer releases expose them as methods.

    def nounify(verb_word):
        """ Transform a verb to the closest noun: die -> death """
        verb_synsets = wn.synsets(verb_word, pos="v")

        # Word not found
        if not verb_synsets:
            return []

        # Get all verb lemmas of the word
        verb_lemmas = [l for s in verb_synsets
                       for l in s.lemmas if s.name.split('.')[1] == 'v']

        # Get related forms
        derivationally_related_forms = [(l, l.derivationally_related_forms())
                                        for l in verb_lemmas]

        # Filter only the nouns
        related_noun_lemmas = [l for drf in derivationally_related_forms
                               for l in drf[1] if l.synset.name.split('.')[1] == 'n']

        # Extract the words from the lemmas
        words = [l.name for l in related_noun_lemmas]
        len_words = len(words)

        # Build the result in the form of a list containing tuples (word, probability)
        result = [(w, float(words.count(w))/len_words) for w in set(words)]
        result.sort(key=lambda w: -w[1])

        # Return all the possibilities sorted by probability
        return result
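The last step of nounify, turning the collected lemma names into (word, probability) pairs, can be looked at on its own. The sketch below is only an illustration of that step: the helper name rank_by_share is hypothetical, it assumes Python 3 division, and it breaks ties alphabetically, which the original code does not promise. The "probability" here is simply each word's share among the related lemmas that were found.

```python
from collections import Counter


def rank_by_share(words):
    """Return (word, share) pairs, most frequent first.

    `words` is a plain list of lemma names, possibly with repeats.
    The share is count / total, i.e. a relative frequency.
    """
    total = len(words)
    counts = Counter(words)
    # No division happens for an empty list because counts is empty too.
    ranked = [(word, count / total) for word, count in counts.items()]
    # Highest share first; alphabetical order among equal shares (an assumption).
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


print(rank_by_share(["hunger", "hunger", "thirst", "hunger"]))
# [('hunger', 0.75), ('thirst', 0.25)]
```

Counter counts every word in one pass, whereas words.count(w) rescans the list once per distinct word. For the short lists involved here the difference is negligible.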


Here is a function that can, in theory, convert words between noun, verb, adjective, and adverb forms. I updated it from here (originally written by bogs, I believe) to be compliant with nltk 3.2.5, now that synset.lemmas and synset.name are functions.

    from nltk.corpus import wordnet as wn

    # Just to make it a bit more readable
    WN_NOUN = 'n'
    WN_VERB = 'v'
    WN_ADJECTIVE = 'a'
    WN_ADJECTIVE_SATELLITE = 's'
    WN_ADVERB = 'r'


    def convert(word, from_pos, to_pos):
        """ Transform words given from/to POS tags """
        synsets = wn.synsets(word, pos=from_pos)

        # Word not found
        if not synsets:
            return []

        # Get all lemmas of the word (consider 'a' and 's' equivalent)
        lemmas = []
        for s in synsets:
            for l in s.lemmas():
                if (s.name().split('.')[1] == from_pos
                        or from_pos in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE)
                        and s.name().split('.')[1] in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE)):
                    lemmas += [l]

        # Get related forms
        derivationally_related_forms = [(l, l.derivationally_related_forms()) for l in lemmas]

        # Filter only the desired pos (consider 'a' and 's' equivalent)
        related_noun_lemmas = []
        for drf in derivationally_related_forms:
            for l in drf[1]:
                if (l.synset().name().split('.')[1] == to_pos
                        or to_pos in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE)
                        and l.synset().name().split('.')[1] in (WN_ADJECTIVE, WN_ADJECTIVE_SATELLITE)):
                    related_noun_lemmas += [l]

        # Extract the words from the lemmas
        words = [l.name() for l in related_noun_lemmas]
        len_words = len(words)

        # Build the result in the form of a list containing tuples (word, probability)
        result = [(w, float(words.count(w)) / len_words) for w in set(words)]
        result.sort(key=lambda w: -w[1])

        # Return all the possibilities sorted by probability
        return result


    convert('direct', 'a', 'r')
    convert('direct', 'a', 'n')
    convert('quick', 'a', 'r')
    convert('quickly', 'r', 'a')
    convert('hunger', 'n', 'v')
    convert('run', 'v', 'a')
    convert('tired', 'a', 'r')
    convert('tired', 'a', 'v')
    convert('tired', 'a', 'n')
    convert('tired', 'a', 's')
    convert('wonder', 'v', 'n')
    convert('wonder', 'n', 'a')

As the output below shows, the results are mixed. It cannot switch between adjective and adverb forms (my specific goal), but it does give some interesting results in other cases.

    >>> convert('direct', 'a', 'r')
    []
    >>> convert('direct', 'a', 'n')
    [('directness', 0.6666666666666666), ('line', 0.3333333333333333)]
    >>> convert('quick', 'a', 'r')
    []
    >>> convert('quickly', 'r', 'a')
    []
    >>> convert('hunger', 'n', 'v')
    [('hunger', 0.75), ('thirst', 0.25)]
    >>> convert('run', 'v', 'a')
    [('persistent', 0.16666666666666666), ('executive', 0.16666666666666666), ('operative', 0.16666666666666666), ('prevalent', 0.16666666666666666), ('meltable', 0.16666666666666666), ('operant', 0.16666666666666666)]
    >>> convert('tired', 'a', 'r')
    []
    >>> convert('tired', 'a', 'v')
    []
    >>> convert('tired', 'a', 'n')
    [('triteness', 0.25), ('banality', 0.25), ('tiredness', 0.25), ('commonplace', 0.25)]
    >>> convert('tired', 'a', 's')
    []
    >>> convert('wonder', 'v', 'n')
    [('wonder', 0.3333333333333333), ('wonderer', 0.2222222222222222), ('marveller', 0.1111111111111111), ('marvel', 0.1111111111111111), ('wonderment', 0.1111111111111111), ('question', 0.1111111111111111)]
    >>> convert('wonder', 'n', 'a')
    [('curious', 0.4), ('wondrous', 0.2), ('marvelous', 0.2), ('marvellous', 0.2)]

Hope this saves someone a little trouble.
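For the adverb-to-adjective direction that comes back empty above, one thing that might be worth trying is WordNet's pertainym relation, which NLTK exposes as Lemma.pertainyms(). The sketch below is not part of the original answer and has not been checked against the examples above. The helper name adverb_to_adjectives and its synsets parameter are hypothetical; the parameter exists only so that the lookup can be swapped for a stand-in, and it defaults to wn.synsets. Like convert, it looks at every lemma in the word's synsets, so synonyms of the adverb can contribute results too.

```python
def adverb_to_adjectives(adverb, synsets=None):
    """Collect adjectives linked to an adverb through pertainym pointers.

    `synsets` is a hypothetical hook: any callable with the shape of
    wn.synsets(word, pos=...). When omitted, NLTK's WordNet reader is used.
    """
    if synsets is None:
        # Imported lazily so the helper can also run with a stand-in lookup.
        from nltk.corpus import wordnet as wn
        synsets = wn.synsets

    names = [
        target.name()
        for synset in synsets(adverb, pos='r')   # adverb senses only
        for lemma in synset.lemmas()
        for target in lemma.pertainyms()         # lexical link to other lemmas
    ]
    # Drop repeats while keeping the order of first appearance.
    return list(dict.fromkeys(names))
```

Calling adverb_to_adjectives('quickly') requires the WordNet corpus to be installed (nltk.download('wordnet')). Going the other way, from adjective to adverb, has no equally direct pointer, so it would need a scan over adverb lemmas whose pertainyms match the adjective.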


I understand that this doesn't answer your whole question, but it does answer a large part of it. I would check out http://nodebox.net/code/index.php/Linguistics#verb_conjugation. This Python library can conjugate verbs and recognize whether a word is a verb, noun, or adjective.

EXAMPLE CODE

    print en.verb.present("gave")
    print en.verb.present("gave", person=3, negate=False)
    >>> give
    >>> gives

It can also categorize words.

    print en.is_noun("banana")
    >>> True

The download is at the top of the link.
