What is the best way to get a semi-long, unique (non-sequential) ID key for database objects?


Encoding the integers

You could use a reversible encoding for your integers:

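    def int_str(val, keyspace):
        """ Turn a non-negative integer into a string. """
        assert val >= 0
        if val == 0:
            # Without this special case, zero would encode to an empty string.
            return keyspace[0]
        out = ""
        while val > 0:
            val, digit = divmod(val, len(keyspace))
            out += keyspace[digit]
        return out[::-1]


    def str_int(val, keyspace):
        """ Turn a string back into a non-negative integer. """
        out = 0
        for c in val:
            # keyspace.index() raises ValueError for characters outside the keyspace.
            out = out * len(keyspace) + keyspace.index(c)
        return out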

Quick testing code:

    # Can be anything you like - this was just shuffled letters and numbers, but...
    keyspace = "fw59eorpma2nvxb07liqt83_u6kgzs41-ycdjh"
    # ...each character must occur only once.
    assert len(set(keyspace)) == len(keyspace)


    def test(v):
        s = int_str(v, keyspace)
        w = str_int(s, keyspace)
        print("OK? %r -- int_str(%d) = %r; str_int(%r) = %d" % (v == w, v, s, s, w))


    test(1064463423090)
    test(4319193500)
    test(495689346389)
    test(2496486533)

outputs

    OK? True -- int_str(1064463423090) = 'antmgabi'; str_int('antmgabi') = 1064463423090
    OK? True -- int_str(4319193500) = 'w7q0hm-'; str_int('w7q0hm-') = 4319193500
    OK? True -- int_str(495689346389) = 'ev_dpe_d'; str_int('ev_dpe_d') = 495689346389
    OK? True -- int_str(2496486533) = '1q2t4w'; str_int('1q2t4w') = 2496486533
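As a rough guide to how long these keys get, the encoded string has as many characters as the integer has digits in base len(keyspace). A minimal sketch of that relationship follows; the helper name encoded_length is hypothetical and not part of the code above.

```python
def encoded_length(val, base):
    """Number of digits needed to write a non-negative integer in the given base."""
    assert val >= 0 and base >= 2
    length = 1
    while val >= base:
        val //= base
        length += 1
    return length


# 38 is the size of the shuffled keyspace used in the test code.
print(encoded_length(1064463423090, 38))  # 8, matching the 8 characters of 'antmgabi'
print(encoded_length(2496486533, 38))     # 6, matching the 6 characters of '1q2t4w'
```

A larger keyspace gives shorter keys for the same integer, at the cost of more distinct characters in the URL.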

Obfuscating them and making them non-continuous

To make the IDs non-contiguous, you could, say, multiply the original value by some arbitrary value and add random "chaff" in the low digits, which are discarded again on decoding. My example validates the chaff with a simple modulus check:

    import random


    def chaffify(val, chaff_size=150, chaff_modulus=7):
        """ Add chaff to the given positive integer.

        chaff_size defines how large the chaffing value is; the larger it is, the larger (and more unwieldy) the resulting value will be.
        chaff_modulus defines the modulus value for the chaff integer; the larger this is, the less chances there are for the chaff validation in dechaffify() to yield a false "okay".
        """
        # The chaff is a multiple of chaff_modulus that stays strictly below
        # chaff_size, so divmod() in dechaffify() recovers val exactly.
        chaff = random.randint(0, (chaff_size - 1) // chaff_modulus) * chaff_modulus
        return val * chaff_size + chaff


    def dechaffify(chaffy_val, chaff_size=150, chaff_modulus=7):
        """ Dechaffs the given chaffed value. The chaff_size and chaff_modulus parameters must be the same as given to chaffify() for the dechaffification to succeed.

        If the chaff value has been tampered with, then a ValueError will (probably - not necessarily) be raised.
        """
        val, chaff = divmod(chaffy_val, chaff_size)
        if chaff % chaff_modulus != 0:
            raise ValueError("Invalid chaff in value")
        return val


    for x in range(1, 11):
        chaffed = chaffify(x)
        print("%d %d %d" % (x, chaffed, dechaffify(chaffed)))

outputs (with randomness):

    1 262 1
    2 440 2
    3 576 3
    4 684 4
    5 841 5
    6 977 6
    7 1197 7
    8 1326 8
    9 1364 9
    10 1528 10

EDIT: On second thought, the randomness of the chaff may not be a good idea, as you lose the canonicality of each obfuscated ID: the same ID can map to many different obfuscated values. The following version lacks the randomness but still has validation (changing one digit will likely invalidate the whole number if chaff_val is large enough).

    def chaffify2(val, chaff_val=87953):
        """ Add chaff to the given positive integer. """
        return val * chaff_val


    def dechaffify2(chaffy_val, chaff_val=87953):
        """ Dechaffs the given chaffed value. chaff_val must be the same as given to chaffify2(). If the value does not seem to be correctly chaffed, raises a ValueError. """
        val, chaff = divmod(chaffy_val, chaff_val)
        if chaff != 0:
            raise ValueError("Invalid chaff in value")
        return val
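To get a feel for the validation, here is a small sketch that tries every single-digit change to a chaffed value and counts how many still pass dechaffify2(). The two functions are repeated so the snippet runs on its own; count_accepted_edits is a hypothetical helper added only for this illustration.

```python
def chaffify2(val, chaff_val=87953):
    return val * chaff_val


def dechaffify2(chaffy_val, chaff_val=87953):
    val, chaff = divmod(chaffy_val, chaff_val)
    if chaff != 0:
        raise ValueError("Invalid chaff in value")
    return val


def count_accepted_edits(chaffed):
    """Return (accepted, total) over all single decimal digit changes."""
    digits = str(chaffed)
    accepted = 0
    total = 0
    for pos in range(len(digits)):
        for d in "0123456789":
            if d == digits[pos]:
                continue  # not a change
            total += 1
            tampered = int(digits[:pos] + d + digits[pos + 1:])
            try:
                dechaffify2(tampered)
                accepted += 1
            except ValueError:
                pass
    return accepted, total


print(count_accepted_edits(chaffify2(12345)))  # (0, 90)
```

None of the 90 edits is accepted here. A single-digit change shifts the value by k * 10**p with k between 1 and 9, and 87953 shares no factor with 10, so the result can never again be a multiple of 87953. A multiplier that is even or divisible by 5 would weaken this check.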

Putting it all together

    document_id = random.randint(0, 1000000)
    # int_str() and str_int() both need the keyspace defined in the test code.
    url_fragment = int_str(chaffify(document_id), keyspace)
    print("URL for document %d: http://example.com/%s" % (document_id, url_fragment))
    request_id = dechaffify(str_int(url_fragment, keyspace))
    print("Requested: Document %d" % request_id)

outputs (with randomness)

    URL for document 831274: http://example.com/w840pi
    Requested: Document 831274
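If you prefer the deterministic chaff from the EDIT, so that every document has exactly one URL, the pieces could be combined roughly as follows. This is only a sketch: encode_id, decode_id and the two constants are hypothetical names, and the keyspace and multiplier are simply the example values used earlier.

```python
KEYSPACE = "fw59eorpma2nvxb07liqt83_u6kgzs41-ycdjh"  # example keyspace from the test code
CHAFF_VAL = 87953                                    # example multiplier from chaffify2()


def encode_id(document_id):
    """Multiply by CHAFF_VAL, then write the result in base len(KEYSPACE)."""
    assert document_id >= 0
    val = document_id * CHAFF_VAL
    out = KEYSPACE[0] if val == 0 else ""
    while val > 0:
        val, digit = divmod(val, len(KEYSPACE))
        out += KEYSPACE[digit]
    return out[::-1]


def decode_id(fragment):
    """Reverse encode_id(); raises ValueError for fragments that do not validate."""
    val = 0
    for c in fragment:
        val = val * len(KEYSPACE) + KEYSPACE.index(c)
    document_id, remainder = divmod(val, CHAFF_VAL)
    if remainder != 0:
        raise ValueError("Invalid chaff in value")
    return document_id


fragment = encode_id(831274)
print("http://example.com/%s -> document %d" % (fragment, decode_id(fragment)))
```

Because nothing here is random, encode_id() always returns the same fragment for the same ID, which makes the URLs safe to cache or bookmark.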


You could use a UUID from Python's uuid module, although it is probably a little longer than you would like.

    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import uuid
    >>> uuid.uuid4()
    UUID('ba587488-2a96-4daa-b422-60300eb86155')
    >>> str(uuid.uuid4())
    '001f8565-6330-44a6-977a-1cca201aedcc'
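If the 36-character form is too long for a URL, one possible way to shorten it is to base64-encode the UUID's 16 raw bytes instead of using the hex string. The sketch below uses only the standard library; short_uuid and parse_short_uuid are hypothetical helper names.

```python
import base64
import uuid


def short_uuid(u):
    """Encode a UUID's 16 bytes as 22 URL-safe base64 characters."""
    # 16 bytes always encode to 24 characters ending in "==", which we strip.
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def parse_short_uuid(s):
    """Reverse short_uuid(): restore the stripped padding and decode."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(s + "=="))


u = uuid.uuid4()
s = short_uuid(u)
print(len(str(u)), len(s))  # 36 22
assert parse_short_uuid(s) == u
```

The result uses letters, digits, "-" and "_", so it can go into a URL path without escaping.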

And if you are using SQLAlchemy, you can define an id column of type UUID like so:

    import uuid

    from sqlalchemy import types
    from sqlalchemy.schema import Column
    # Very old SQLAlchemy releases used:
    #   from sqlalchemy.databases.mysql import MSBinary
    from sqlalchemy.types import BINARY


    class UUID(types.TypeDecorator):
        """ Stores a uuid.UUID as 16 raw bytes. """
        impl = BINARY

        def __init__(self):
            # The length is passed on to the underlying BINARY type.
            types.TypeDecorator.__init__(self, length=16)

        def process_bind_param(self, value, dialect=None):
            if value and isinstance(value, uuid.UUID):
                return value.bytes
            elif value and not isinstance(value, uuid.UUID):
                raise ValueError('value %s is not a valid uuid.UUID' % value)
            else:
                return None

        def process_result_value(self, value, dialect=None):
            if value:
                return uuid.UUID(bytes=value)
            else:
                return None

        def is_mutable(self):
            return False


    id_column_name = "id"


    def id_column():
        return Column(id_column_name, UUID(), primary_key=True, default=uuid.uuid4)

If you are using Django, Preet's answer is probably more appropriate, since a lot of Django's machinery depends on primary keys that are ints.


Looking at your requirement, the best bet would be to use itertools.combinations, somewhat like this:

    >>> import itertools, string
    >>> urls = itertools.combinations(string.ascii_letters, 6)
    >>> 'someurl.com/object/' + ''.join(next(urls))
    'someurl.com/object/abcdef'
    >>> 'someurl.com/object/' + ''.join(next(urls))
    'someurl.com/object/abcdeg'
    >>> 'someurl.com/object/' + ''.join(next(urls))
    'someurl.com/object/abcdeh'