How to avoid having idle connection timeout while uploading large file? (nginx)



I faced the same issue and fixed it by using django-queued-storage on top of django-storages. What django-queued-storage does is, when a file is received, create a Celery task to upload it to the remote storage such as S3; in the meantime, if the file is accessed by anyone and it is not yet available on S3, it is served from the local file system. This way you don't have to wait for the file to be uploaded to S3 before sending a response back to the client.
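
A rough sketch of the setup, assuming django-queued-storage's QueuedStorage backend with local and remote dotted paths as in its documentation (the model and field names here are placeholders):

from django.db import models
from queued_storage.backends import QueuedStorage

# Save locally first; a Celery task then transfers the file to S3,
# and reads fall back to the local copy until the transfer finishes.
queued_s3storage = QueuedStorage(
    local='django.core.files.storage.FileSystemStorage',
    remote='storages.backends.s3boto.S3BotoStorage')

class Upload(models.Model):  # placeholder model
    file = models.FileField(upload_to='uploads', storage=queued_s3storage)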

Since your application is behind a load balancer, you might want to use a shared file system such as Amazon EFS in order to use the above approach.
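
For example (the mount path below is hypothetical), every instance behind the load balancer would write to the same mounted directory:

# settings.py - /mnt/efs is a hypothetical EFS mount point shared by all instances
MEDIA_ROOT = '/mnt/efs/media'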


You can try skipping the upload to your server entirely and uploading the file to S3 directly, then just pass the resulting URL back to your application.

There is an app for that: django-s3direct. You can give it a try.
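
django-s3direct handles the signing for you, but the underlying idea is a presigned request that lets the browser send the file straight to S3. A minimal sketch with boto3 (a newer library than the boto used further down; the bucket name is made up):

import boto3

s3 = boto3.client('s3')

# Generate a presigned POST so the browser can upload directly to S3;
# the file never passes through the Django server.
presigned = s3.generate_presigned_post(
    Bucket='my-upload-bucket',      # hypothetical bucket
    Key='uploads/${filename}',      # S3 substitutes the uploaded file's name
    ExpiresIn=3600,
)
# Pass presigned['url'] and presigned['fields'] to the upload form,
# then store only the final S3 URL in your application.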


You can create an upload handler that uploads the file directly to S3. This way you shouldn't hit the connection timeout.

https://docs.djangoproject.com/en/1.10/ref/files/uploads/#writing-custom-upload-handlers

I did some tests and it works perfectly in my case.

You have to start a new multipart upload (with boto, for example) and send the chunks progressively.

Don't forget to validate the chunk size: 5 MB is the minimum part size if your file contains more than one part (an S3 limitation).

I think this is the best alternative to django-queued-storage if you really want to upload directly to S3 and avoid the connection timeout.

You'll probably also need to create your own FileField to manage the file correctly and avoid sending it a second time.

The following example is with S3BotoStorage.

import sys
import uuid
from StringIO import StringIO

from django.core.files.storage import default_storage
from django.core.files.uploadhandler import FileUploadHandler
from storages.utils import setting  # small settings helper bundled with django-storages

# S3 rejects parts smaller than 5 MB (except the last one).
S3_MINIMUM_PART_SIZE = 5242880


class S3FileUploadHandler(FileUploadHandler):
    chunk_size = setting('S3_FILE_UPLOAD_HANDLER_BUFFER_SIZE', S3_MINIMUM_PART_SIZE)

    def __init__(self, request=None):
        super(S3FileUploadHandler, self).__init__(request)
        self.file = None
        self.part_num = 1
        self.last_chunk = None
        self.multipart_upload = None

    def new_file(self, field_name, file_name, content_type, content_length, charset=None, content_type_extra=None):
        super(S3FileUploadHandler, self).new_file(field_name, file_name, content_type, content_length, charset, content_type_extra)
        self.original_filename = file_name  # keep the original name; file_name is replaced below
        self.file_name = "{}_{}".format(uuid.uuid4(), file_name)
        default_storage.bucket.new_key(self.file_name)
        self.multipart_upload = default_storage.bucket.initiate_multipart_upload(self.file_name)

    def receive_data_chunk(self, raw_data, start):
        # Always hold one chunk back so the final (possibly < 5 MB) chunk
        # can be merged into the previous part instead of being sent alone.
        buffer_size = len(raw_data)
        if self.last_chunk:
            file_part = self.last_chunk
            if buffer_size < S3_MINIMUM_PART_SIZE:
                file_part += raw_data
                self.last_chunk = None
            else:
                self.last_chunk = raw_data
            self.upload_part(part=file_part)
        else:
            self.last_chunk = raw_data

    def upload_part(self, part):
        self.multipart_upload.upload_part_from_file(
            fp=StringIO(part),
            part_num=self.part_num,
            size=len(part),  # actual byte count of this part
        )
        self.part_num += 1

    def file_complete(self, file_size):
        # Flush whatever is still buffered, close the multipart upload and
        # hand back a file object opened through the storage backend.
        if self.last_chunk:
            self.upload_part(part=self.last_chunk)
        self.multipart_upload.complete_upload()
        self.file = default_storage.open(self.file_name)
        self.file.original_filename = self.original_filename
        return self.file
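
To activate the handler you still have to register it, for example globally in settings (the dotted path depends on where you put the class):

# settings.py - replace Django's default upload handlers with the S3 one
FILE_UPLOAD_HANDLERS = [
    'myapp.uploadhandlers.S3FileUploadHandler',  # hypothetical module path
]

You can also set it per view via request.upload_handlers if you only need it for specific endpoints.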