
Kubernetes: Managing uploaded user content


Hello. I think you should set Kubernetes aside for a moment and think about the architecture and capabilities of your Django application. I assume you have built a web app that offers some 'upload image' functionality, and that you have code which stores this image somewhere. In the simplest scenario, running the app on your laptop, the web app is configured to save this content to a local folder. A more advanced example is deploying your application to a VM or a cloud VM, e.g. an AWS EC2 instance, with your app saving the files to the local storage of that instance.

The question then is: what happens if you have 2 instances of your web app deployed? Can they be configured and run so that they 'share' the same folder to save the images? I assume this is what you want, because otherwise your app would not scale horizontally; each user would have to hit one specific instance in order to upload or retrieve particular images.

Having that in mind, this is a design decision of your application, which I am pretty sure you have already worked out. You then need to ask: how can I share a folder or bucket so that all instances of my web app can save files to it? If you spun up 3 different VMs on any cloud, you would have to use some kind of shared storage so that all three instances point to the same physical storage location: an NFS drive, or a cloud object storage service such as S3.
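For reference, the 'save to a local folder' setup in Django usually looks like the sketch below (the paths are just placeholders). It makes the problem obvious: only the machine running that particular instance ends up holding the uploaded files.

```python
# settings.py -- minimal sketch of the "local folder" setup (paths are placeholders).
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

# Every ImageField/FileField upload is written under this directory,
# i.e. onto the disk of whichever machine runs this particular instance.
MEDIA_ROOT = BASE_DIR / "media"
MEDIA_URL = "/media/"
```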

With all of the above in mind, and understanding clearly that you need to decouple your application from the notion of local storage, especially if you want to make it as stateless as possible (whatever that means to you): packaging your web app as a Docker container, deploying it to a Kubernetes cluster as a pod, and saving files to local storage is not going to get you far. Each pod (each Docker container) will use the storage of the underlying Kubernetes worker node (VM) to save files, so another instance will be saving files on some other VM, and so on.

Kubernetes provides an abstraction for applications (pods) that want to 'share' some storage within the cluster and, of course, persist it. Something I did not mention above: if you save files inside the pod or on the Kubernetes worker itself, you will lose that data once the VM/instance is restarted. So you want something durable.

To cut a long story short,

1) You can deploy your application / pod along with a PersistentVolumeClaim, assuming your Kubernetes cluster supports it. What happens is that you mount into your pod some kind of folder / storage, which is backed by whatever is available to your cluster, e.g. some kind of NFS store. See https://kubernetes.io/docs/concepts/storage/persistent-volumes/ — a rough sketch follows below.
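Here is a rough sketch of this option using the official Kubernetes Python client (the equivalent YAML manifests express the same thing). Everything here is a placeholder: the media-uploads claim, the nfs-client storage class, the image name and the /app/media mount path. It also assumes your cluster offers a storage class that supports ReadWriteMany, so several replicas can mount the same volume.

```python
# Sketch: a PVC plus a Deployment that mounts it into every Django pod.
# All names, the image and the mount path are placeholders; it assumes a
# ReadWriteMany-capable storage class (e.g. NFS-backed) exists in the cluster.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() inside the cluster
core = client.CoreV1Api()
apps = client.AppsV1Api()

# 1) A claim for shared storage that all web pods will mount.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="media-uploads"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],           # shared by several pods at once
        storage_class_name="nfs-client",          # placeholder storage class
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)

# 2) The Django Deployment mounts the claim where the app writes its media files.
container = client.V1Container(
    name="web",
    image="example/django-app:latest",            # placeholder image
    volume_mounts=[client.V1VolumeMount(name="media", mount_path="/app/media")],
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="django-web"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "django-web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "django-web"}),
            spec=client.V1PodSpec(
                containers=[container],
                volumes=[client.V1Volume(
                    name="media",
                    persistent_volume_claim=client.V1PersistentVolumeClaimVolumeSource(
                        claim_name="media-uploads"
                    ),
                )],
            ),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)
```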

2) You can 'outsource' this need for shared storage to some external provider, a common case being an S3 bucket, and not tackle the problem in Kubernetes at all; just keep and provision the app itself within Kubernetes. A sketch of the Django side of this option follows below.
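On the Django side, the usual way to do this is the django-storages package, which swaps the default file storage backend so that ImageField/FileField uploads go straight to the bucket instead of the pod's filesystem. A rough sketch, assuming the bucket already exists and credentials reach the pod via environment variables or an IAM role; the bucket and region names are placeholders:

```python
# settings.py -- sketch of outsourcing uploads to S3 with django-storages.
# pip install django-storages boto3
import os

INSTALLED_APPS = [
    # ... your other apps ...
    "storages",
]

# Route every FileField/ImageField upload to the bucket instead of local disk.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"

# Placeholder values; boto3 picks up credentials from env vars or an IAM role.
AWS_STORAGE_BUCKET_NAME = os.environ.get("AWS_STORAGE_BUCKET_NAME", "my-uploads-bucket")
AWS_S3_REGION_NAME = os.environ.get("AWS_S3_REGION_NAME", "eu-west-1")
```

The nice side effect of this option is that your pods stay stateless: any replica can serve any upload or download.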

I hope I gave you some basic ideas.


Note: Kubernetes 1.14 (March 2019) now ships "Durable Local Storage Management" as GA, which:

  • Makes locally attached (non-network attached) storage available as a persistent volume source.
  • Allows users to take advantage of the typically cheaper and better performance of persistent local storage (kubernetes/kubernetes: #73525, #74391, #74769; kubernetes/enhancements: #121 (KEP))

That might help secure truly persistent storage for your case.
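For illustration, a local PersistentVolume looks roughly like the sketch below (again via the Python client; the node name, path and class name are placeholders). Note that a local PV is pinned to one node through nodeAffinity, so it gives you durability across pod restarts but not, on its own, a volume shared across nodes.

```python
# Sketch: a local PersistentVolume (Kubernetes >= 1.14). Names/paths are placeholders.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

local_pv = client.V1PersistentVolume(
    metadata=client.V1ObjectMeta(name="local-pv-media"),
    spec=client.V1PersistentVolumeSpec(
        capacity={"storage": "10Gi"},
        access_modes=["ReadWriteOnce"],
        persistent_volume_reclaim_policy="Retain",
        storage_class_name="local-storage",
        # The disk that is physically attached to one specific worker node.
        local=client.V1LocalVolumeSource(path="/mnt/disks/ssd1"),
        # A local PV must be pinned to the node that owns the disk.
        node_affinity=client.V1VolumeNodeAffinity(
            required=client.V1NodeSelector(
                node_selector_terms=[client.V1NodeSelectorTerm(
                    match_expressions=[client.V1NodeSelectorRequirement(
                        key="kubernetes.io/hostname",
                        operator="In",
                        values=["worker-1"],
                    )]
                )]
            )
        ),
    ),
)
core.create_persistent_volume(body=local_pv)
```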

As noted by x-yuri in the comments:
See "Kubernetes 1.14: Local Persistent Volumes GA", from Michelle Au (Google), Matt Schallert (Uber), and Celina Ward (Uber).