Replacing sequence with random number Replacing sequence with random number sql sql

Replacing sequence with random number


For generating unique and random-looking identifiers from a serial, using ciphers might be a good idea. Since their output is bijective (there is a one-to-one mapping between input and output values) -- you will not have any collisions, unlike hashes. Which means your identifiers don't have to be as long as hashes.

Most cryptographic ciphers work on 64-bit or larger blocks, but the PostgreSQL wiki has an example PL/pgSQL procedure for a "non-cryptographic" cipher function that works on (32-bit) int type. Disclaimer: I have not tried using this function myself.

To use it for your primary keys, run the CREATE FUNCTION call from the wiki page, and then on your empty tables do:

ALTER TABLE foo ALTER COLUMN foo_id SET DEFAULT pseudo_encrypt(nextval('foo_foo_id_seq')::int);

And voila!

pg=> insert into foo (foo_id) values(default);pg=> insert into foo (foo_id) values(default);pg=> insert into foo (foo_id) values(default);pg=> select * from foo;  foo_id   ------------ 1241588087 1500453386 1755259484(4 rows)


I added my comment to your question and then realized that I should have explained myself better... My apologies.

You could have a second key - not the primary key - that is visible to the user. That key could use the primary as the seed for the hash function you describe and be the one that you use to do lookups. That key would be generated by a trigger after insert (which is much simpler than trying to ensure atomicity of the operation) and

That is the key that you share with your clients, never the PK. I know there is debate (albeit, I can't understand why) if PKs are to be invisible to the user applications or not. The modern database design practices, and my personal experience, all seem to suggest that PKs should NOT be visible to users. They tend to attach meaning to them and, over time, that is a very bad thing - regardless if they have a check digit in the key or not.

Your joins will still be done using the PK. This other generated key is just supposed to be used for client lookups. They are the face, the PK is the guts.

Hope that helps.

Edit: FWIW, there is little to be said about "right" or "wrong" in database design. Sometimes it boils down to a choice. I think the choice you face will be better served by leaving the PK alone and creating a secondary key - just that.


I think you are way over-complicating this. Why not let the database do what it does best and let it take care of atomicity and ensuring that the same id is not used twice? Why not use a postgresql SERIAL type and get an autogenerated surrogate primary key, just like an integer IDENTITY column in SQL Server or DB2? Use that on the column instead. Plus it will be faster than your user-defined function.

I concur regarding hiding this surrogate primary key and using an exposed secondary key (with a unique constraint on it) to lookup clients in your interface.

Are you using a sequence because you need a unique identifier across several tables? This is usually an indication that you need to rethink your table design, and those several tables should perhaps be combined into one, with an autogenerated surrogate primary key.

Also see here