How to package python script with dependencies into zip/tar? How to package python script with dependencies into zip/tar? hadoop hadoop

How to package python script with dependencies into zip/tar?


I have solved similar problem in Apache Spark and Python context by creating a Docker image which has needed python libraries and Spark slave script installed. The image is distributed to other machines and when the container is started it auto-joins to to the cluster, we have only one such image / host machine.

Our ever-changing python project is submitted as a zip file along with the job and imports work transparently form there. Luckily we rarely need to re-create those slave images and we don't run jobs with conflicting requirements.

I'm not sure how applicable this in your scenario, especially since (in my understanding) some python libraries must be compiled.