
Generating plain text from a Wikipedia database dump


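The first non-option argument to python must be the script name; everything after it is passed to the script as its own arguments. If the XML file comes first, python tries to execute the dump itself as Python source.

You probably need to swap the xml and py file names: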

$ python WikiExtractor.py enwiki-latest-pages-articles.xml -b 500K -o extracted
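If you want to guard against this mistake in a wrapper script, a minimal sketch is below. The function name check_order is hypothetical and not part of WikiExtractor; it only checks that the first argument looks like a Python script, on the assumption that the script file ends in .py.

```shell
# Hypothetical helper (not part of WikiExtractor): verify that the first
# argument is a .py file before handing the arguments to python.
check_order() {
  case "$1" in
    *.py)
      echo "ok: $1 will be run as the script"
      ;;
    *)
      echo "error: first argument '$1' is not a Python script" >&2
      return 1
      ;;
  esac
}

# Correct order: script first, then the dump and the options.
check_order WikiExtractor.py enwiki-latest-pages-articles.xml -b 500K -o extracted
```

With the arguments in the swapped order, the function prints an error to stderr and returns a non-zero status, so a wrapper can stop before calling python.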