How can I access s3 files in Python using urls?


For opening, it should be as simple as:

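    # Python 3; in Python 2 the equivalent was urllib.URLopener().open(myurl)
    from urllib.request import urlopen

    myurl = "https://s3.amazonaws.com/skyl/fake.xyz"
    myfile = urlopen(myurl)  # file-like HTTP response; myfile.read() returns bytes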

This works with S3 only if the file is publicly readable; a request for a private object is rejected with a 403 error.
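If you only have a bucket name and a key, the URL can be put together along these lines. This is a rough sketch: `public_s3_url` is a made-up helper name, and the path-style `s3.amazonaws.com` host is simply copied from the URL above, so buckets in other regions or with other settings may need a different host.

```python
from urllib.parse import quote


def public_s3_url(bucket, key):
    """Build a path-style HTTPS URL for a publicly readable S3 object.

    Assumption: the generic s3.amazonaws.com endpoint from the answer above
    is the right host for this bucket.
    """
    # quote() percent-encodes spaces and other unsafe characters
    # but leaves the "/" separators in the key alone.
    return "https://s3.amazonaws.com/%s/%s" % (bucket, quote(key))


myurl = public_s3_url("skyl", "fake.xyz")
```

The resulting `myurl` can be opened the same way as the hard-coded one.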

To write a file using boto, it goes a little something like this:

    from boto.s3.connection import S3Connection

    conn = S3Connection(AWS_KEY, AWS_SECRET)
    bucket = conn.get_bucket(BUCKET)
    destination = bucket.new_key()
    destination.name = filename
    destination.set_contents_from_file(myfile)
    destination.make_public()

lemme know if this works for you :)


Here's how they do it in awscli:

    def find_bucket_key(s3_path):
        """
        This is a helper function that given an s3 path such that the path is of
        the form: bucket/key
        It will return the bucket and the key represented by the s3 path
        """
        s3_components = s3_path.split('/')
        bucket = s3_components[0]
        s3_key = ""
        if len(s3_components) > 1:
            s3_key = '/'.join(s3_components[1:])
        return bucket, s3_key


    def split_s3_bucket_key(s3_path):
        """Split s3 path into bucket and key prefix.
        This will also handle the s3:// prefix.
        :return: Tuple of ('bucketname', 'keyname')
        """
        if s3_path.startswith('s3://'):
            s3_path = s3_path[5:]
        return find_bucket_key(s3_path)

You could use it directly with code like this:

    from awscli.customizations.s3.utils import split_s3_bucket_key
    import boto3

    client = boto3.client('s3')
    bucket_name, key_name = split_s3_bucket_key(
        's3://example-bucket-name/path/to/example.txt')
    response = client.get_object(Bucket=bucket_name, Key=key_name)

This doesn't address the goal of interacting with an S3 key as a file-like object, but it's a step in that direction.
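One way to get closer to a file-like object is to wrap the response body in an in-memory buffer. The sketch below is only that: `open_s3_object` is a hypothetical helper name, and it assumes `client` behaves like `boto3.client('s3')`, i.e. `get_object()` returns a dict whose `'Body'` entry has a `read()` method.

```python
import io


def open_s3_object(client, bucket_name, key_name):
    """Return a read-only, seekable file-like object for one S3 object.

    Assumption: `client` behaves like boto3.client('s3').
    """
    response = client.get_object(Bucket=bucket_name, Key=key_name)
    # The whole object is read into memory, so this only suits small objects.
    return io.BytesIO(response["Body"].read())


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   f = open_s3_object(boto3.client("s3"),
#                      "example-bucket-name", "path/to/example.txt")
#   first_line = f.readline()
```

The buffer supports `read()`, `readline()`, `seek()` and iteration, at the cost of downloading the full object up front.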


I haven't seen anything that works directly with S3 URLs, but you could use an S3 access library (simples3 looks decent) and some simple string manipulation:

    >>> url = "s3://bucket/path/"
    >>> _, path = url.split(":", 1)
    >>> path = path.lstrip("/")
    >>> bucket, path = path.split("/", 1)
    >>> print(bucket)
    bucket
    >>> print(path)
    path/
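The same split can also be done with the standard library's URL parser instead of manual string handling. This is a minimal sketch, assuming URLs of the form `s3://bucket/key`; `parse_s3_url` is a name made up for illustration.

```python
from urllib.parse import urlsplit


def parse_s3_url(url):
    """Split an s3://bucket/key URL into a (bucket, key) tuple."""
    parts = urlsplit(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError("not an s3://bucket/key URL: %r" % (url,))
    # Caveat: urlsplit treats "?" and "#" as query/fragment delimiters,
    # so keys containing those characters would be cut short here.
    return parts.netloc, parts.path.lstrip("/")


bucket, key = parse_s3_url("s3://example-bucket-name/path/to/example.txt")
```

Unlike the manual split, this rejects strings that are not `s3://` URLs instead of silently returning something odd.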