Is it faster to search for a large string in a DB by its hashcode?


In general: probably not, assuming the column is indexed. Database servers are designed to do such lookups quickly and efficiently. Some databases (e.g. Oracle) provide options to build indexes based on hashing.

However, in the end this can only be answered by performance testing with data and usage patterns that are representative of your requirements.
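As a rough starting point for such a test, here is a minimal sketch that times lookups through an index on the full string against lookups through an index on an MD5 column. It uses an in-memory SQLite database and made-up random data purely for illustration; the table, column, and index names (`docs`, `body`, `body_md5`, `idx_body`, `idx_md5`) are hypothetical, and results on your real database server with your real data may look quite different.

```python
import hashlib
import random
import sqlite3
import string
import time


def random_text(rng, length):
    """Build a random string of ASCII letters (stand-in for real data)."""
    return "".join(rng.choices(string.ascii_letters, k=length))


def build_db(texts):
    """Create an in-memory table with an index on the string and on its MD5."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (body TEXT NOT NULL, body_md5 TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [(t, hashlib.md5(t.encode("utf-8")).hexdigest()) for t in texts],
    )
    conn.execute("CREATE INDEX idx_body ON docs (body)")
    conn.execute("CREATE INDEX idx_md5 ON docs (body_md5)")
    return conn


def time_lookups(conn, probes, use_hash):
    """Return (elapsed_seconds, hit_count) for looking up every probe string."""
    hits = 0
    start = time.perf_counter()
    for probe in probes:
        if use_hash:
            # Hashing time is part of the cost, so it is measured too.
            digest = hashlib.md5(probe.encode("utf-8")).hexdigest()
            row = conn.execute(
                "SELECT 1 FROM docs INDEXED BY idx_md5 "
                "WHERE body_md5 = ? AND body = ?",
                (digest, probe),
            ).fetchone()
        else:
            row = conn.execute(
                "SELECT 1 FROM docs INDEXED BY idx_body WHERE body = ?",
                (probe,),
            ).fetchone()
        if row is not None:
            hits += 1
    return time.perf_counter() - start, hits


if __name__ == "__main__":
    rng = random.Random(42)
    texts = [random_text(rng, 2000) for _ in range(2000)]
    conn = build_db(texts)
    probes = rng.sample(texts, 500)
    for label, use_hash in (("string index", False), ("md5 index", True)):
        elapsed, hits = time_lookups(conn, probes, use_hash)
        print(f"{label}: {elapsed:.4f}s for {hits} hits")
```

The `INDEXED BY` clause is there only so that each query is forced onto the index being measured. Swap in your own schema, string lengths, and hit/miss ratio before drawing any conclusions.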


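Though I've never done it, it sounds like this would work in principle. There's a chance of false positives from hash collisions (two different strings producing the same hash), but that chance is very slim, and you can rule them out by also comparing the full string on any match.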

I'd go with a fast algorithm such as MD5, as you don't want to spend longer hashing the string than it would have taken to just search for it.

The final thing I can say is that you'll only know whether it is better if you try it out and measure the performance.


I'd be surprised if this offered a huge improvement, and I would recommend against rolling your own performance optimisations for a DB search.

If you use a database index, there is scope for performance to be tuned by a DBA using tried and trusted methods. Hard-coding your own index optimisation will prevent this and may stop you benefiting from any indexing performance improvements in future versions of the DB.