How to use Stanford Parser in NLTK using Python


Note that this answer applies to NLTK v3.0, and not to more recent versions.

Sure, try the following in Python:

import os
from nltk.parse import stanford

os.environ['STANFORD_PARSER'] = '/path/to/stanford/jars'
os.environ['STANFORD_MODELS'] = '/path/to/stanford/jars'

parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?"))
print sentences

# GUI
for line in sentences:
    for sentence in line:
        sentence.draw()

Output:

[Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]
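The results are ordinary nltk.Tree objects, so you can manipulate them with the usual Tree API. As a small illustration, here is the second tree above rebuilt from its bracketed string form (Tree.fromstring is the standard NLTK constructor):

from nltk.tree import Tree

# Rebuild the second parse above from its bracketed string form.
t = Tree.fromstring(
    "(ROOT (SBARQ (WHNP (WP What)) (SQ (VBZ is) (NP (PRP$ your) (NN name))) (. ?)))")
print(t.leaves())  # ['What', 'is', 'your', 'name', '?']
print(t.pos())     # [('What', 'WP'), ('is', 'VBZ'), ('your', 'PRP$'), ('name', 'NN'), ('?', '.')]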

Note 1: In this example, both the parser and model jars are in the same folder.

Note 2:

  • File name of the Stanford parser: stanford-parser.jar
  • File name of the Stanford models: stanford-parser-x.x.x-models.jar

Note 3: The englishPCFG.ser.gz file can be found inside the models jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Please use some archive manager to 'unzip' the models.jar file.
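A jar is just a zip archive, so if you prefer to script the extraction instead of clicking through an archive manager, a minimal sketch (both paths are placeholders; adjust the models jar version to your download):

import zipfile

# Assumed location of the models jar; adjust the version number to yours.
models_jar = '/path/to/jars/stanford-parser-3.5.2-models.jar'
with zipfile.ZipFile(models_jar) as jar:
    # Extracts the model into /path/to/extracted/edu/stanford/nlp/...
    jar.extract('edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz',
                path='/path/to/extracted')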

Note 4: Be sure you are running Java 1.8 (the JRE shipped with Oracle JDK 8). Otherwise you will get: Unsupported major.minor version 52.0.
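To double-check which Java your Python process will actually invoke, a quick sketch (assumes the java binary is on your PATH; java -version writes to stderr, hence the redirect):

import subprocess

# Prints something like: java version "1.8.0_..."
print(subprocess.check_output(['java', '-version'], stderr=subprocess.STDOUT))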

Installation

  1. Download NLTK v3 from https://github.com/nltk/nltk and install it:

    sudo python setup.py install

  2. You can use the NLTK downloader to get Stanford Parser, using Python:

    import nltk
    nltk.download()
  3. Try my example! (don't forget to change the jar paths and the model path to the ser.gz location)

OR:

  1. Download and install NLTK v3, same as above.

  2. Download the latest version from http://nlp.stanford.edu/software/lex-parser.shtml#Download (the current version's filename is stanford-parser-full-2015-01-29.zip).

  3. Extract the stanford-parser-full-20xx-xx-xx.zip.

  4. Create a new folder ('jars' in my example). Place the extracted files stanford-parser-3.x.x-models.jar and stanford-parser.jar into this 'jars' folder.

    As shown above, you can use the environment variables (STANFORD_PARSER & STANFORD_MODELS) to point to this 'jars' folder. I'm using Linux; if you use Windows, use a path like C:\folder\jars instead (see the sketch after this list).

  5. Open the stanford-parser-3.x.x-models.jar using an Archive manager (7zip).

  6. Browse inside the jar file to edu/stanford/nlp/models/lexparser. Again, extract the file called 'englishPCFG.ser.gz'. Remember the location where you extract this ser.gz file.

  7. When creating a StanfordParser instance, you can provide the model path as a parameter. This is the complete path to the model, in our case /location/of/englishPCFG.ser.gz.

  8. Try my example! (don't forget to change the jar paths and the model path to the ser.gz location)
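For the Windows caveat in step 4, a minimal sketch of setting the two variables from inside Python (the folder names are assumptions; raw strings avoid backslash-escaping issues in Windows paths):

import os

# Assumed Windows layout; point these at your own 'jars' folder.
os.environ['STANFORD_PARSER'] = r'C:\folder\jars'
os.environ['STANFORD_MODELS'] = r'C:\folder\jars'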


Deprecated Answer

The answer below is deprecated, please use the solution on https://stackoverflow.com/a/51981566/610569 for NLTK v3.3 and above.


EDITED

Note: The following answer will only work on:

  • NLTK version >=3.2.4
  • Stanford Tools compiled since 2015-04-20
  • Python 2.7, 3.4 and 3.5 (Python 3.6 is not yet officially supported)

Both tools change rather quickly, and the API might look very different 3-6 months later. Please treat the following answer as temporary and not an eternal fix.

Always refer to https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software for the latest instructions on how to interface the Stanford NLP tools with NLTK!!


TL;DR

cd $HOME

# Update / Install NLTK
pip install -U nltk

# Download the Stanford NLP tools
wget http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip
wget http://nlp.stanford.edu/software/stanford-postagger-full-2015-04-20.zip
wget http://nlp.stanford.edu/software/stanford-parser-full-2015-04-20.zip

# Extract the zip files.
unzip stanford-ner-2015-04-20.zip
unzip stanford-parser-full-2015-04-20.zip
unzip stanford-postagger-full-2015-04-20.zip

export STANFORDTOOLSDIR=$HOME

export CLASSPATH=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/stanford-postagger.jar:$STANFORDTOOLSDIR/stanford-ner-2015-04-20/stanford-ner.jar:$STANFORDTOOLSDIR/stanford-parser-full-2015-04-20/stanford-parser.jar:$STANFORDTOOLSDIR/stanford-parser-full-2015-04-20/stanford-parser-3.5.2-models.jar

export STANFORD_MODELS=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/models:$STANFORDTOOLSDIR/stanford-ner-2015-04-20/classifiers

Then:

>>> from nltk.tag.stanford import StanfordPOSTagger
>>> st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
[(u'What', u'WP'), (u'is', u'VBZ'), (u'the', u'DT'), (u'airspeed', u'NN'), (u'of', u'IN'), (u'an', u'DT'), (u'unladen', u'JJ'), (u'swallow', u'VB'), (u'?', u'.')]

>>> from nltk.tag import StanfordNERTagger
>>> st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz')
>>> st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
[(u'Rami', u'PERSON'), (u'Eid', u'PERSON'), (u'is', u'O'), (u'studying', u'O'), (u'at', u'O'), (u'Stony', u'ORGANIZATION'), (u'Brook', u'ORGANIZATION'), (u'University', u'ORGANIZATION'), (u'in', u'O'), (u'NY', u'O')]

>>> from nltk.parse.stanford import StanfordParser
>>> parser = StanfordParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
>>> list(parser.raw_parse("the quick brown fox jumps over the lazy dog"))
[Tree('ROOT', [Tree('NP', [Tree('NP', [Tree('DT', ['the']), Tree('JJ', ['quick']), Tree('JJ', ['brown']), Tree('NN', ['fox'])]), Tree('NP', [Tree('NP', [Tree('NNS', ['jumps'])]), Tree('PP', [Tree('IN', ['over']), Tree('NP', [Tree('DT', ['the']), Tree('JJ', ['lazy']), Tree('NN', ['dog'])])])])])])]

>>> from nltk.parse.stanford import StanfordDependencyParser
>>> dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
>>> print [parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")]
[Tree('jumps', [Tree('fox', ['The', 'quick', 'brown']), Tree('dog', ['over', 'the', 'lazy'])])]

In Long:


Firstly, one must note that the Stanford NLP tools are written in Java and NLTK is written in Python. The way NLTK interfaces with these tools is by calling the Java tools through the command line interface.
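For intuition only, NLTK assembles a Java command line along these lines internally (see nltk.parse.stanford). The paths and flags below are assumptions for illustration; edu.stanford.nlp.parser.lexparser.LexicalizedParser is the parser's real command-line entry point:

import subprocess

# Illustrative sketch, not NLTK's exact command line.
cmd = [
    'java', '-mx1000m',
    '-cp', '/path/to/jars/stanford-parser.jar:/path/to/jars/stanford-parser-3.5.2-models.jar',
    'edu.stanford.nlp.parser.lexparser.LexicalizedParser',
    '-outputFormat', 'penn',
    'edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz',  # model, loaded from the models jar
    'input.txt',  # hypothetical input file, one sentence per line
]
# print(subprocess.check_output(cmd))  # uncomment to run against a real install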

Secondly, the NLTK API to the Stanford NLP tools has changed quite a lot since version 3.1. So it is advisable to update your NLTK package to v3.1.

Thirdly, the NLTK API to Stanford NLP Tools wraps around the individual NLP tools, e.g. Stanford POS tagger, Stanford NER Tagger, Stanford Parser.

For the POS and NER taggers, it DOES NOT wrap around the Stanford CoreNLP package.

For the Stanford Parser, it's a special case: it wraps around both the Stanford Parser and Stanford CoreNLP (personally, I have not used the latter through NLTK; I would rather follow @dimazest's demonstration on http://www.eecs.qmul.ac.uk/~dm303/stanford-dependency-parser-nltk-and-anaconda.html).

Note that as of NLTK v3.1, the STANFORD_JAR and STANFORD_PARSER variables are deprecated and NO LONGER used.


In Longer:


STEP 1

Assuming that you have installed Java appropriately on your OS.

Now, install/update your NLTK version (see http://www.nltk.org/install.html):

  • Using pip: sudo pip install -U nltk
  • Debian distro (using apt-get): sudo apt-get install python-nltk

For Windows (Use the 32-bit binary installation):

  1. Install Python 3.4: http://www.python.org/downloads/ (avoid the 64-bit versions)
  2. Install Numpy (optional): http://sourceforge.net/projects/numpy/files/NumPy/ (the version that specifies Python 3.4)
  3. Install NLTK: http://pypi.python.org/pypi/nltk
  4. Test installation: Start>Python34, then type import nltk

(Why not 64 bit? See https://github.com/nltk/nltk/issues/1079)


Then out of paranoia, recheck your nltk version inside python:

from __future__ import print_function
import nltk
print(nltk.__version__)

Or on the command line:

python3 -c "import nltk; print(nltk.__version__)"

Make sure that you see 3.1 as the output.

For even more paranoia, check that all your favorite Stanford NLP tool APIs are available:

from nltk.parse.stanford import StanfordParser
from nltk.parse.stanford import StanfordDependencyParser
from nltk.parse.stanford import StanfordNeuralDependencyParser
from nltk.tag.stanford import StanfordPOSTagger, StanfordNERTagger
from nltk.tokenize.stanford import StanfordTokenizer

(Note: The imports above will ONLY ensure that you are using a correct NLTK version that contains these APIs. Not seeing errors in the import doesn't mean that you have successfully configured the NLTK API to use the Stanford Tools)


STEP 2

Now that you have checked that you have the correct version of NLTK with the necessary Stanford NLP tools interface, you need to download and extract all the necessary Stanford NLP tools.

TL;DR, in Unix:

cd $HOME

# Download the Stanford NLP tools
wget http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip
wget http://nlp.stanford.edu/software/stanford-postagger-full-2015-04-20.zip
wget http://nlp.stanford.edu/software/stanford-parser-full-2015-04-20.zip

# Extract the zip files.
unzip stanford-ner-2015-04-20.zip
unzip stanford-parser-full-2015-04-20.zip
unzip stanford-postagger-full-2015-04-20.zip

In Windows / Mac: download the same three zip files from http://nlp.stanford.edu/software/ and extract them manually.


STEP 3

Set up the environment variables such that NLTK can find the relevant file paths automatically. You have to set the following variables (a Python equivalent is sketched after this list):

  • Add the appropriate Stanford NLP .jar file to the CLASSPATH environment variable.

    • e.g. for the NER, it will be stanford-ner-2015-04-20/stanford-ner.jar
    • e.g. for the POS, it will be stanford-postagger-full-2015-04-20/stanford-postagger.jar
    • e.g. for the parser, it will be stanford-parser-full-2015-04-20/stanford-parser.jar and the parser model jar file, stanford-parser-full-2015-04-20/stanford-parser-3.5.2-models.jar
  • Add the appropriate model directory to the STANFORD_MODELS variable (i.e. the directory where the pre-trained models are saved)

    • e.g. for the NER, it will be in stanford-ner-2015-04-20/classifiers/
    • e.g. for the POS, it will be in stanford-postagger-full-2015-04-20/models/
    • e.g. for the Parser, there won't be a model directory.
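If you'd rather set these from inside Python (before instantiating any of the wrappers), the equivalent is below; stanford_dir is an assumed location, and os.pathsep picks the right separator (':' on Unix, ';' on Windows):

import os

stanford_dir = '/home/me/stanford-tools'  # assumed root; adjust to your layout
os.environ['CLASSPATH'] = os.pathsep.join([
    stanford_dir + '/stanford-postagger-full-2015-04-20/stanford-postagger.jar',
    stanford_dir + '/stanford-ner-2015-04-20/stanford-ner.jar',
    stanford_dir + '/stanford-parser-full-2015-04-20/stanford-parser.jar',
    stanford_dir + '/stanford-parser-full-2015-04-20/stanford-parser-3.5.2-models.jar',
])
os.environ['STANFORD_MODELS'] = os.pathsep.join([
    stanford_dir + '/stanford-postagger-full-2015-04-20/models',
    stanford_dir + '/stanford-ner-2015-04-20/classifiers',
])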

In the code, you can see that NLTK searches the STANFORD_MODELS directory before appending the model name, and that the API also automatically searches the OS environment for the CLASSPATH.
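A trivial way to sanity-check what the wrappers will see at lookup time:

import os

# What NLTK's Stanford wrappers will find in the environment.
print(os.environ.get('CLASSPATH'))
print(os.environ.get('STANFORD_MODELS'))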

Note that as of NLTK v3.1, the STANFORD_JAR variable is deprecated and NO LONGER used. Code snippets from older Stack Overflow answers that rely on it might not work.

TL;DR for STEP 3 on Ubuntu

export STANFORDTOOLSDIR=/home/path/to/stanford/tools/

export CLASSPATH=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/stanford-postagger.jar:$STANFORDTOOLSDIR/stanford-ner-2015-04-20/stanford-ner.jar:$STANFORDTOOLSDIR/stanford-parser-full-2015-04-20/stanford-parser.jar:$STANFORDTOOLSDIR/stanford-parser-full-2015-04-20/stanford-parser-3.5.2-models.jar

export STANFORD_MODELS=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/models:$STANFORDTOOLSDIR/stanford-ner-2015-04-20/classifiers

(For Windows: See https://stackoverflow.com/a/17176423/610569 for instructions for setting environment variables)

You MUST set the variables as above before starting Python, then:

>>> from nltk.tag.stanford import StanfordPOSTagger
>>> st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
[(u'What', u'WP'), (u'is', u'VBZ'), (u'the', u'DT'), (u'airspeed', u'NN'), (u'of', u'IN'), (u'an', u'DT'), (u'unladen', u'JJ'), (u'swallow', u'VB'), (u'?', u'.')]

>>> from nltk.tag import StanfordNERTagger
>>> st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz')
>>> st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
[(u'Rami', u'PERSON'), (u'Eid', u'PERSON'), (u'is', u'O'), (u'studying', u'O'), (u'at', u'O'), (u'Stony', u'ORGANIZATION'), (u'Brook', u'ORGANIZATION'), (u'University', u'ORGANIZATION'), (u'in', u'O'), (u'NY', u'O')]

>>> from nltk.parse.stanford import StanfordParser
>>> parser = StanfordParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
>>> list(parser.raw_parse("the quick brown fox jumps over the lazy dog"))
[Tree('ROOT', [Tree('NP', [Tree('NP', [Tree('DT', ['the']), Tree('JJ', ['quick']), Tree('JJ', ['brown']), Tree('NN', ['fox'])]), Tree('NP', [Tree('NP', [Tree('NNS', ['jumps'])]), Tree('PP', [Tree('IN', ['over']), Tree('NP', [Tree('DT', ['the']), Tree('JJ', ['lazy']), Tree('NN', ['dog'])])])])])])]

Alternatively, you could try adding the environment variables inside Python, as the previous answers have suggested, but you can also directly tell the parser/tagger to initialize with the direct path where you kept the .jar files and your models.

There is NO need to set the environment variables if you use the following method, BUT when the API changes its parameter names, you will need to change your code accordingly. That is why it is MORE advisable to set the environment variables than to modify your Python code to suit the NLTK version.

For example (without setting any environment variables):

# POS tagging:
from nltk.tag import StanfordPOSTagger

stanford_pos_dir = '/home/alvas/stanford-postagger-full-2015-04-20/'
eng_model_filename = stanford_pos_dir + 'models/english-left3words-distsim.tagger'
my_path_to_jar = stanford_pos_dir + 'stanford-postagger.jar'

st = StanfordPOSTagger(model_filename=eng_model_filename, path_to_jar=my_path_to_jar)
st.tag('What is the airspeed of an unladen swallow ?'.split())

# NER Tagging:
from nltk.tag import StanfordNERTagger

stanford_ner_dir = '/home/alvas/stanford-ner/'
eng_model_filename = stanford_ner_dir + 'classifiers/english.all.3class.distsim.crf.ser.gz'
my_path_to_jar = stanford_ner_dir + 'stanford-ner.jar'

st = StanfordNERTagger(model_filename=eng_model_filename, path_to_jar=my_path_to_jar)
st.tag('Rami Eid is studying at Stony Brook University in NY'.split())

# Parsing:
from nltk.parse.stanford import StanfordParser

stanford_parser_dir = '/home/alvas/stanford-parser/'
eng_model_path = stanford_parser_dir + "edu/stanford/nlp/models/lexparser/englishRNN.ser.gz"
my_path_to_models_jar = stanford_parser_dir + "stanford-parser-3.5.2-models.jar"
my_path_to_jar = stanford_parser_dir + "stanford-parser.jar"

parser = StanfordParser(model_path=eng_model_path, path_to_models_jar=my_path_to_models_jar, path_to_jar=my_path_to_jar)
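And then use the parser exactly as before; for example (a one-line sanity check, assuming the parser object from the snippet above; output should resemble the trees shown earlier):

# Assumes the `parser` object created in the snippet above.
print(list(parser.raw_parse("the quick brown fox jumps over the lazy dog")))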


As of NLTK v3.3, users should avoid the Stanford NER and POS taggers from nltk.tag, and avoid the Stanford tokenizer/segmenter from nltk.tokenize.

Instead use the new nltk.parse.corenlp.CoreNLPParser API.

Please see https://github.com/nltk/nltk/wiki/Stanford-CoreNLP-API-in-NLTK


(To avoid a link-only answer, I've pasted the docs from the NLTK GitHub wiki below.)

First, update your NLTK

pip3 install -U nltk  # Make sure it's >= 3.3

Then download the necessary CoreNLP packages:

cd ~
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-02-27.zip
unzip stanford-corenlp-full-2018-02-27.zip
cd stanford-corenlp-full-2018-02-27

# Get the Chinese model
wget http://nlp.stanford.edu/software/stanford-chinese-corenlp-2018-02-27-models.jar
wget https://raw.githubusercontent.com/stanfordnlp/CoreNLP/master/src/edu/stanford/nlp/pipeline/StanfordCoreNLP-chinese.properties

# Get the Arabic model
wget http://nlp.stanford.edu/software/stanford-arabic-corenlp-2018-02-27-models.jar
wget https://raw.githubusercontent.com/stanfordnlp/CoreNLP/master/src/edu/stanford/nlp/pipeline/StanfordCoreNLP-arabic.properties

# Get the French model
wget http://nlp.stanford.edu/software/stanford-french-corenlp-2018-02-27-models.jar
wget https://raw.githubusercontent.com/stanfordnlp/CoreNLP/master/src/edu/stanford/nlp/pipeline/StanfordCoreNLP-french.properties

# Get the German model
wget http://nlp.stanford.edu/software/stanford-german-corenlp-2018-02-27-models.jar
wget https://raw.githubusercontent.com/stanfordnlp/CoreNLP/master/src/edu/stanford/nlp/pipeline/StanfordCoreNLP-german.properties

# Get the Spanish model
wget http://nlp.stanford.edu/software/stanford-spanish-corenlp-2018-02-27-models.jar
wget https://raw.githubusercontent.com/stanfordnlp/CoreNLP/master/src/edu/stanford/nlp/pipeline/StanfordCoreNLP-spanish.properties

English

Still in the stanford-corenlp-full-2018-02-27 directory, start the server:

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -preload tokenize,ssplit,pos,lemma,ner,parse,depparse \
    -status_port 9000 -port 9000 -timeout 15000 &

Then in Python:

>>> from nltk.parse import CoreNLPParser

# Lexical Parser
>>> parser = CoreNLPParser(url='http://localhost:9000')

# Parse tokenized text.
>>> list(parser.parse('What is the airspeed of an unladen swallow ?'.split()))
[Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('NP', [Tree('DT', ['the']), Tree('NN', ['airspeed'])]), Tree('PP', [Tree('IN', ['of']), Tree('NP', [Tree('DT', ['an']), Tree('JJ', ['unladen'])])]), Tree('S', [Tree('VP', [Tree('VB', ['swallow'])])])])]), Tree('.', ['?'])])])]

# Parse raw string.
>>> list(parser.raw_parse('What is the airspeed of an unladen swallow ?'))
[Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('NP', [Tree('DT', ['the']), Tree('NN', ['airspeed'])]), Tree('PP', [Tree('IN', ['of']), Tree('NP', [Tree('DT', ['an']), Tree('JJ', ['unladen'])])]), Tree('S', [Tree('VP', [Tree('VB', ['swallow'])])])])]), Tree('.', ['?'])])])]

# Neural Dependency Parser
>>> from nltk.parse.corenlp import CoreNLPDependencyParser
>>> dep_parser = CoreNLPDependencyParser(url='http://localhost:9000')
>>> parses = dep_parser.parse('What is the airspeed of an unladen swallow ?'.split())
>>> [[(governor, dep, dependent) for governor, dep, dependent in parse.triples()] for parse in parses]
[[(('What', 'WP'), 'cop', ('is', 'VBZ')), (('What', 'WP'), 'nsubj', ('airspeed', 'NN')), (('airspeed', 'NN'), 'det', ('the', 'DT')), (('airspeed', 'NN'), 'nmod', ('swallow', 'VB')), (('swallow', 'VB'), 'case', ('of', 'IN')), (('swallow', 'VB'), 'det', ('an', 'DT')), (('swallow', 'VB'), 'amod', ('unladen', 'JJ')), (('What', 'WP'), 'punct', ('?', '.'))]]

# Tokenizer
>>> parser = CoreNLPParser(url='http://localhost:9000')
>>> list(parser.tokenize('What is the airspeed of an unladen swallow?'))
['What', 'is', 'the', 'airspeed', 'of', 'an', 'unladen', 'swallow', '?']

# POS Tagger
>>> pos_tagger = CoreNLPParser(url='http://localhost:9000', tagtype='pos')
>>> list(pos_tagger.tag('What is the airspeed of an unladen swallow ?'.split()))
[('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'), ('of', 'IN'), ('an', 'DT'), ('unladen', 'JJ'), ('swallow', 'VB'), ('?', '.')]

# NER Tagger
>>> ner_tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner')
>>> list(ner_tagger.tag(('Rami Eid is studying at Stony Brook University in NY'.split())))
[('Rami', 'PERSON'), ('Eid', 'PERSON'), ('is', 'O'), ('studying', 'O'), ('at', 'O'), ('Stony', 'ORGANIZATION'), ('Brook', 'ORGANIZATION'), ('University', 'ORGANIZATION'), ('in', 'O'), ('NY', 'STATE_OR_PROVINCE')]
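CoreNLPParser is just a thin HTTP client, so you can also query the same server directly. A sketch using the third-party requests library (the properties parameter and JSON response shape follow the CoreNLP server conventions):

import json
import requests  # third-party; pip install requests

text = 'What is the airspeed of an unladen swallow ?'
props = {'annotators': 'tokenize,ssplit,pos', 'outputFormat': 'json'}
resp = requests.post('http://localhost:9000/',
                     params={'properties': json.dumps(props)},
                     data=text.encode('utf-8'))
for token in resp.json()['sentences'][0]['tokens']:
    print(token['word'], token['pos'])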

Chinese

Start the server a little differently, still from the stanford-corenlp-full-2018-02-27 directory:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -serverProperties StanfordCoreNLP-chinese.properties \
    -preload tokenize,ssplit,pos,lemma,ner,parse \
    -status_port 9001 -port 9001 -timeout 15000

In Python:

>>> parser = CoreNLPParser('http://localhost:9001')
>>> list(parser.tokenize(u'我家没有电脑。'))
['我家', '没有', '电脑', '。']
>>> list(parser.parse(parser.tokenize(u'我家没有电脑。')))
[Tree('ROOT', [Tree('IP', [Tree('IP', [Tree('NP', [Tree('NN', ['我家'])]), Tree('VP', [Tree('VE', ['没有']), Tree('NP', [Tree('NN', ['电脑'])])])]), Tree('PU', ['。'])])])]

Arabic

Start the server:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -serverProperties StanfordCoreNLP-arabic.properties \
    -preload tokenize,ssplit,pos,parse \
    -status_port 9005 -port 9005 -timeout 15000

In Python:

>>> from nltk.parse import CoreNLPParser
>>> parser = CoreNLPParser('http://localhost:9005')
>>> text = u'انا حامل'

# Parser.
>>> parser.raw_parse(text)
<list_iterator object at 0x7f0d894c9940>
>>> list(parser.raw_parse(text))
[Tree('ROOT', [Tree('S', [Tree('NP', [Tree('PRP', ['انا'])]), Tree('NP', [Tree('NN', ['حامل'])])])])]
>>> list(parser.parse(parser.tokenize(text)))
[Tree('ROOT', [Tree('S', [Tree('NP', [Tree('PRP', ['انا'])]), Tree('NP', [Tree('NN', ['حامل'])])])])]

# Tokenizer / Segmenter.
>>> list(parser.tokenize(text))
['انا', 'حامل']

# POS tagger.
>>> pos_tagger = CoreNLPParser('http://localhost:9005', tagtype='pos')
>>> list(pos_tagger.tag(parser.tokenize(text)))
[('انا', 'PRP'), ('حامل', 'NN')]

# NER tagger.
>>> ner_tagger = CoreNLPParser('http://localhost:9005', tagtype='ner')
>>> list(ner_tagger.tag(parser.tokenize(text)))
[('انا', 'O'), ('حامل', 'O')]

French

Start the server:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -serverProperties StanfordCoreNLP-french.properties \
    -preload tokenize,ssplit,pos,parse \
    -status_port 9004 -port 9004 -timeout 15000

In Python:

>>> parser = CoreNLPParser('http://localhost:9004')
>>> list(parser.parse('Je suis enceinte'.split()))
[Tree('ROOT', [Tree('SENT', [Tree('NP', [Tree('PRON', ['Je']), Tree('VERB', ['suis']), Tree('AP', [Tree('ADJ', ['enceinte'])])])])])]
>>> pos_tagger = CoreNLPParser('http://localhost:9004', tagtype='pos')
>>> pos_tagger.tag('Je suis enceinte'.split())
[('Je', 'PRON'), ('suis', 'VERB'), ('enceinte', 'ADJ')]

German

Start the server:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -serverProperties StanfordCoreNLP-german.properties \
    -preload tokenize,ssplit,pos,ner,parse \
    -status_port 9002 -port 9002 -timeout 15000

In Python:

>>> parser = CoreNLPParser('http://localhost:9002')
>>> list(parser.raw_parse('Ich bin schwanger'))
[Tree('ROOT', [Tree('NUR', [Tree('S', [Tree('PPER', ['Ich']), Tree('VAFIN', ['bin']), Tree('AP', [Tree('ADJD', ['schwanger'])])])])])]
>>> list(parser.parse('Ich bin schwanger'.split()))
[Tree('ROOT', [Tree('NUR', [Tree('S', [Tree('PPER', ['Ich']), Tree('VAFIN', ['bin']), Tree('AP', [Tree('ADJD', ['schwanger'])])])])])]

>>> pos_tagger = CoreNLPParser('http://localhost:9002', tagtype='pos')
>>> pos_tagger.tag('Ich bin schwanger'.split())
[('Ich', 'PPER'), ('bin', 'VAFIN'), ('schwanger', 'ADJD')]

>>> ner_tagger = CoreNLPParser('http://localhost:9002', tagtype='ner')
>>> ner_tagger.tag('Donald Trump besuchte Angela Merkel in Berlin.'.split())
[('Donald', 'PERSON'), ('Trump', 'PERSON'), ('besuchte', 'O'), ('Angela', 'PERSON'), ('Merkel', 'PERSON'), ('in', 'O'), ('Berlin', 'LOCATION'), ('.', 'O')]

Spanish

Start the server:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -serverProperties StanfordCoreNLP-spanish.properties \
    -preload tokenize,ssplit,pos,ner,parse \
    -status_port 9003 -port 9003 -timeout 15000

In Python:

>>> pos_tagger = CoreNLPParser('http://localhost:9003', tagtype='pos')
>>> pos_tagger.tag(u'Barack Obama salió con Michael Jackson .'.split())
[('Barack', 'PROPN'), ('Obama', 'PROPN'), ('salió', 'VERB'), ('con', 'ADP'), ('Michael', 'PROPN'), ('Jackson', 'PROPN'), ('.', 'PUNCT')]
>>> ner_tagger = CoreNLPParser('http://localhost:9003', tagtype='ner')
>>> ner_tagger.tag(u'Barack Obama salió con Michael Jackson .'.split())
[('Barack', 'PERSON'), ('Obama', 'PERSON'), ('salió', 'O'), ('con', 'O'), ('Michael', 'PERSON'), ('Jackson', 'PERSON'), ('.', 'O')]