Apache Spark: How to use pyspark with Python 3


Just set the environment variable:

export PYSPARK_PYTHON=python3
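To confirm the variable took effect in your current shell, a quick check (a sketch, assuming python3 is on your PATH):

```shell
# Set the interpreter PySpark workers should use
export PYSPARK_PYTHON=python3
# Confirm the variable is visible to child processes such as spark-submit
echo "$PYSPARK_PYTHON"   # python3
```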

If you want this to be a permanent change, add that line to your shell profile or to the pyspark script itself. To set it for a single run instead, prefix the command:


PYSPARK_PYTHON=python3 ./bin/pyspark

If you want to run it in an IPython Notebook, write:

PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark

If python3 is not accessible via PATH, pass the full path to the interpreter instead.
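One way to find that full path is with command -v; this sketch resolves it and exports the absolute path (the printed location depends on your system):

```shell
# Resolve the absolute path of python3 (assumption: python3 is installed on PATH)
PY3="$(command -v python3)"
echo "$PY3"
# Point PySpark at the absolute path rather than the bare name
export PYSPARK_PYTHON="$PY3"
```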

Bear in mind that the current documentation (as of 1.4.1) has outdated instructions. Fortunately, it has since been patched.


Alternatively, to make the change permanent for your user:

1. Edit your profile: vim ~/.profile

2. Add this line to the file: export PYSPARK_PYTHON=python3

3. Reload the profile: source ~/.profile

4. Launch PySpark: ./bin/pyspark
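The steps above can be sketched as a single script; a temporary file stands in for ~/.profile here so the sketch is safe to try without touching your real profile:

```shell
# Steps 1-2: append the export line (temp file used instead of ~/.profile)
profile="$(mktemp)"
echo 'export PYSPARK_PYTHON=python3' >> "$profile"
# Step 3: source the file so the variable is set in this shell
. "$profile"
echo "$PYSPARK_PYTHON"   # python3
# Step 4 would then be: ./bin/pyspark
rm -f "$profile"
```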