How to build an image for KubeFlow pipeline?


You can follow the steps provided by Docker here to create a Docker image and publish it to Docker Hub. (You could also run your own private Docker registry, but that is probably too much work for a beginner.) Roughly, the steps are:

  1. Create a Dockerfile. It only needs to specify a few things: the base image (in your case, just use the python image from Docker), the working directory, and the command to execute when the image is run.
  2. Install Docker if you haven't already, build the image, and run it locally to make sure it works as expected. Then push it to Docker Hub.
  3. Once the image is published, you will have its URL on Docker Hub. Use that URL when you create pipelines in Kubeflow.
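The three things the Dockerfile in step 1 has to state can be sketched as follows. This is a minimal illustration only: the script name `crop.py`, the `/app` directory, and the `python:3.9-slim` tag are assumptions, not something the question specifies.

```python
def render_dockerfile(base_image: str, workdir: str, script: str) -> str:
    """Return the text of a minimal Dockerfile for a single-script image."""
    lines = [
        f"FROM {base_image}",                   # base image
        f"WORKDIR {workdir}",                   # working directory inside the image
        f"COPY {script} .",                     # copy the script into the working directory
        f'ENTRYPOINT ["python", "{script}"]',   # command executed when the image runs
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, chosen only for illustration.
dockerfile_text = render_dockerfile("python:3.9-slim", "/app", "crop.py")
print(dockerfile_text)
```

Saving the printed text as a file named `Dockerfile` next to the script, then running `docker build -t <user>/<image>:<tag> .` followed by `docker push <user>/<image>:<tag>`, covers steps 2 and 3.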

You can also read this doc to learn how to create pipelines (a Kubeflow pipeline is essentially an Argo workflow). In your case, just fill in the inputs and/or outputs sections of the relevant step in the pipeline YAML file.
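To make the inputs and outputs sections concrete, here is a rough sketch of a one-step Argo workflow built as a Python dictionary and printed as JSON (which a YAML parser also accepts). The image URL, the `crop.py` script with its flags, and the output path are all hypothetical placeholders.

```python
import json

# Hypothetical image URL; replace with the one you pushed to Docker Hub.
IMAGE = "docker.io/your-user/crop:latest"

workflow = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Workflow",
    "metadata": {"generateName": "crop-image-"},
    "spec": {
        "entrypoint": "crop",
        # Values supplied to the entrypoint template's input parameters.
        "arguments": {
            "parameters": [
                {"name": "start", "value": "10"},
                {"name": "end", "value": "200"},
            ]
        },
        "templates": [
            {
                "name": "crop",
                # Inputs section of the step.
                "inputs": {"parameters": [{"name": "start"}, {"name": "end"}]},
                "container": {
                    "image": IMAGE,
                    "command": ["python", "crop.py"],  # hypothetical script
                    "args": [
                        "--start", "{{inputs.parameters.start}}",
                        "--end", "{{inputs.parameters.end}}",
                        "--output-path", "/tmp/outputs/cropped/data",
                    ],
                },
                # Outputs section of the step: a file the container writes.
                "outputs": {
                    "artifacts": [
                        {"name": "cropped", "path": "/tmp/outputs/cropped/data"}
                    ]
                },
            }
        ],
    },
}

print(json.dumps(workflow, indent=2))
```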


  1. You do not need to build images. For small to medium-sized components you can work on top of existing images. Check the lightweight components sample. For Python, see "Data passing in python components". For non-Python, see "Creating components from command-line programs".

  2. KFP SDK has some support for building container images. See the container_build sample.

  3. Read the official component authoring documentation.
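As a rough illustration of the command-line route mentioned in point 1, the sketch below is a small program that takes its input and output locations as file paths on the command line, which is the shape such a component usually has. The flag names and the byte-range "crop" are stand-ins invented for this example; a real component would call an image library instead.

```python
import argparse
import tempfile
from pathlib import Path


def crop_bytes(data: bytes, start: int, end: int) -> bytes:
    """Stand-in for real image processing: keep only data[start:end]."""
    return data[start:end]


def main(argv=None) -> None:
    parser = argparse.ArgumentParser(description="Crop a file by byte range.")
    parser.add_argument("--input-path", required=True)
    parser.add_argument("--start", type=int, required=True)
    parser.add_argument("--end", type=int, required=True)
    parser.add_argument("--output-path", required=True)
    args = parser.parse_args(argv)

    data = Path(args.input_path).read_bytes()
    out = Path(args.output_path)
    # The output directory may not exist yet, so create it before writing.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(crop_bytes(data, args.start, args.end))


# Local smoke test; the temporary paths stand in for the ones the pipeline would pass.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in.bin"
    dst = Path(tmp) / "outputs" / "cropped" / "data"
    src.write_bytes(b"0123456789")
    main(["--input-path", str(src), "--start", "2", "--end", "6",
          "--output-path", str(dst)])
    result = dst.read_bytes()

print(result)  # b'2345'
```

To use this as a container entry point, you would drop the smoke test and call `main()` under the usual `if __name__ == "__main__":` guard.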

Let's assume that I have a simple python function that crops images:

You can just create a component from a python function like this:

    import kfp
    from kfp.components import InputPath, OutputPath, create_component_from_func

    # Declare function (with annotations)
    def crop_image(
        image_path: InputPath(),
        start_pixel: int,
        end_pixel: int,
        cropped_image_path: OutputPath(),
    ):
        # Imports go inside the function so they run inside the component's container
        import some_image_lib
        some_image_lib.crop(image_path, start_pixel, end_pixel, cropped_image_path)

    # Create component
    crop_image_op = create_component_from_func(
        crop_image,
        # base_image=...,  # Optional. Base image that has most of the packages that you need. E.g. tensorflow/tensorflow:2.2.0
        packages_to_install=['some_image_lib==1.2.3'],
        output_component_file='component.yaml',  # Optional. Use this to share the component between pipelines, teams or people in the world
    )

    # Create pipeline
    def my_pipeline():
        # download_image_op is assumed to be a component defined elsewhere
        download_image_task = download_image_op(...)
        crop_image_task = crop_image_op(
            image=download_image_task.output,  # the "_path" suffix of image_path is dropped from the input name
            start_pixel=10,
            end_pixel=200,
        )

    # Submit pipeline
    kfp.Client(host=...).create_run_from_pipeline_func(my_pipeline, arguments={})