
Find similar images in (pure) PHP / MySQL


I've had this exact same issue before.

Feel free to copy what I did, and hopefully it will help you / solve your problem.


How I solved it

My first idea, which failed and may be similar to what you are thinking, was to build strings for every single image at full size (no matter what size it was). I quickly worked out that this fills your database super fast and wasn't effective.

The next option, which worked, was a smaller image (like your 5px idea). I did exactly that, but with 10px*10px images. I created the 'hash' for each image with the imagecolorat() function.

See the imagecolorat() documentation on php.net.

When receiving the rgb colours for the image, I rounded them to the nearest 50, so that the colours were less specific. That number (50) is what you want to change depending on how specific you want your searches to be.

For example:

    // Pixel RGB
    rgb(105, 126, 225) // Original
    rgb(100, 150, 250) // After rounding numbers to nearest 50
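The rounding step can be sketched roughly like this. The helper names roundChannel() and roundPixel() are made up for illustration, and the default step of 50 is simply the value from the example above.

```php
<?php
// Round one colour channel (0-255) to the nearest multiple of $step.
// roundChannel() is an illustrative helper, not a built-in function.
function roundChannel(int $value, int $step = 50): int
{
    return (int) (round($value / $step) * $step);
}

// Round a whole pixel given as [red, green, blue].
function roundPixel(array $rgb, int $step = 50): array
{
    $rounded = [];
    foreach ($rgb as $channel) {
        $rounded[] = roundChannel($channel, $step);
    }
    return $rounded;
}
```

With these helpers, roundPixel([105, 126, 225]) gives [100, 150, 250], matching the example. Note that with a step of 50 the brightest value, 255, ends up as 250.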

After doing this to every pixel (10px*10px will give you 100 rgb()'s back), I put them into an array and stored that array in the database after running it through serialize() and base64_encode().
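A minimal sketch of building and storing such a hash, assuming the GD extension is loaded and the image has already been shrunk. The names imageHash(), encodeHash() and decodeHash() are hypothetical. The serialize-then-base64 order and the use of imagecolorsforindex() (so that palette images also work) are assumptions, not necessarily what was done originally.

```php
<?php
// Collect one rounded "rgb(r, g, b)" string per pixel of a small GD image.
function imageHash($image, int $step = 50): array
{
    $hash = [];
    $width = imagesx($image);
    $height = imagesy($image);
    for ($y = 0; $y < $height; $y++) {
        for ($x = 0; $x < $width; $x++) {
            // imagecolorat() returns a packed colour (or a palette index);
            // imagecolorsforindex() turns either into named channels.
            $c = imagecolorsforindex($image, imagecolorat($image, $x, $y));
            $hash[] = sprintf(
                'rgb(%d, %d, %d)',
                round($c['red'] / $step) * $step,
                round($c['green'] / $step) * $step,
                round($c['blue'] / $step) * $step
            );
        }
    }
    return $hash;
}

// Turn the array into a text value that fits in a TEXT column.
function encodeHash(array $hash): string
{
    return base64_encode(serialize($hash));
}

// Reverse of encodeHash(); objects are disallowed when unserializing.
function decodeHash(string $stored): array
{
    return unserialize(base64_decode($stored), ['allowed_classes' => false]);
}
```

A 10px*10px image produces an array of 100 strings, which encodeHash() packs into a single value for the database.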

When searching for similar images, I ran the exact same process on the image the user wanted to upload, then pulled the image 'hashes' from the database and compared them all to see which had matching rounded rgb values.
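One way the comparison could look, as a sketch only: count how many positions hold the same rounded colour and keep the candidates above a cut-off. The function names and the 0.9 threshold are assumptions for illustration; the right threshold depends on your images.

```php
<?php
// Fraction (0.0 to 1.0) of positions where both hashes hold the same value.
function hashSimilarity(array $a, array $b): float
{
    $n = min(count($a), count($b));
    if ($n === 0) {
        return 0.0;
    }
    $matches = 0;
    for ($i = 0; $i < $n; $i++) {
        if ($a[$i] === $b[$i]) {
            $matches++;
        }
    }
    return $matches / $n;
}

// $candidates maps an image id to its decoded hash array.
// Returns id => score for every candidate at or above $threshold, best first.
function findSimilar(array $query, array $candidates, float $threshold = 0.9): array
{
    $scores = [];
    foreach ($candidates as $id => $hash) {
        $score = hashSimilarity($query, $hash);
        if ($score >= $threshold) {
            $scores[$id] = $score;
        }
    }
    arsort($scores); // highest score first, keys preserved
    return $scores;
}
```

Comparing every stored hash in PHP gets slow as the table grows, which is why limiting the rows you pull first is worth doing.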


Tips

  • The bigger that 50 is in the rgb rounding, the less specific your search will be (and vice versa).

  • If you want your SQL to be more specific, it may be better to store extra info about each image in the database, so that you can limit the rows your search pulls back. E.g. if the aspect ratio is 4:3, only pull images around 4:3 from the database.

  • It can be difficult to get the image down to exactly the size you want (e.g. 5px*5px), so a suggestion is phpThumb. I used it with the syntax:

    phpthumb.php?src=IMAGE_NAME_HERE.png&w=10&h=10&zc=1
    // &w=  width of your image
    // &h=  height of your image
    // &zc= zoom-crop. 0: keep aspect ratio, 1: crop to fill your width+height
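If you would rather not pull in phpThumb, the resize can also be done with GD directly. This is a minimal sketch, assuming the GD extension is loaded. shrinkImage() and ratioBounds() are hypothetical helpers. Unlike a crop, shrinkImage() stretches the whole picture into the square. ratioBounds() just computes a range for the aspect-ratio pre-filter mentioned in the tips, for example for a BETWEEN clause on a ratio column you would have to add yourself.

```php
<?php
// Resample any GD image down to $size x $size pixels (stretching, not cropping).
function shrinkImage($source, int $size = 10)
{
    $thumb = imagecreatetruecolor($size, $size);
    imagecopyresampled(
        $thumb, $source,
        0, 0, 0, 0,            // destination x/y, source x/y
        $size, $size,          // destination width/height
        imagesx($source), imagesy($source)
    );
    return $thumb;
}

// Lower and upper aspect ratio (width / height) to accept around $ratio.
function ratioBounds(float $ratio, float $tolerance = 0.1): array
{
    return [$ratio - $tolerance, $ratio + $tolerance];
}
```

For a 4:3 upload, ratioBounds(4 / 3) gives roughly 1.23 to 1.43, so only rows whose stored ratio falls in that range need to be hashed and compared.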

Good luck mate, hope I could help.


For an easy PHP implementation, check out: https://github.com/kennethrapp/phasher

However, I wonder whether there is a native MySQL function for "compare" (see the PHP class above).


I scale the image down to 8x8, then convert RGB to 1-byte HSV, so the resulting hash is a 172-byte string.

    HSVHSVHSVHSVHSVHSVHSVHSV... (from 8x8 block, 172 bytes long)
    0fff0f3ffff4373f346fff00...

It's not 100% accurate (some duplicates aren't found), but it works nicely and there appear to be no false positives.