March 27, 2020
Determining how similar two images are with Python + Perceptual Hashing
Years ago I dabbled a bit with a side project that was an app to identify fashion items in an image.
The idea was that users could upload an image of a fashion item and find out things like:
- What brand are those shoes?
- Where can I get that dress?
The app would take the uploaded image and search a database of fashion images to find potential matches that were partly determined by finding similar images to the uploaded image.
So how do you determine how similar two images are?
To compare two images to see how similar they are, you can hash the images and compare the hashes.
But you can't just hash them any old way...
Regular cryptographic hash functions like MD5 or SHA-1 won't work for this, because images that differ by even a couple of pixels produce completely different hashes.
To most human eyes, two images that differ by a couple of pixels are essentially the same image, so we need to compare them in a way that more closely matches how humans see them.
Instead, we will use perceptual hashing algorithms to hash the images in such a way that the more similar the two images are, the more similar their hashes also are.
The perceptual hashing algorithms used here involve scaling the original image to an 8x8 grayscale image, and then performing calculations on each of the 64 pixels.
The result is a fingerprint of the image that can be compared to other fingerprints.
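To make that concrete, here is a simplified sketch of the average-hash idea: shrink to an 8x8 grayscale image, then set one bit per pixel based on whether it is brighter than the mean. This is not the exact wavelet hash used later in this post, but it has the same overall shape. It uses Pillow, which we install in a moment, and a generated image so it runs without any files.

```python
from PIL import Image

def simple_average_hash(img):
    # Scale down to an 8x8 grayscale image
    small = img.convert("L").resize((8, 8), Image.LANCZOS)
    pixels = list(small.getdata())          # 64 brightness values
    mean = sum(pixels) / len(pixels)
    # One bit per pixel: 1 if brighter than the mean, else 0
    return "".join("1" if p > mean else "0" for p in pixels)

# Demo image generated in code: top half white, bottom half black
img = Image.new("L", (100, 100), 0)
img.paste(255, (0, 0, 100, 50))
print(simple_average_hash(img))  # 64-bit fingerprint as a bit string
```

Shift the demo image down by a few pixels and most of those 64 bits stay the same, which is exactly the property a cryptographic hash lacks.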
We can use the imagehash library in Python to compute image hashes and then compare them to find the most similar ones.
I used the wavelet hash in this project, and you can read more about that here.
You may have heard of the average hash as well, which is also available in this library.
In your own project you might want to experiment with different hashes to see which one gives you the best results.
I'm also using the Image module from Pillow.
In a virtual environment, install these two packages.
pip install imagehash
pip install pillow
And import them in a new file.
import imagehash
from PIL import Image
Let's look at a few images to demonstrate.
This black coat is from Uniqlo.
First hash the image.
image_one = 'uniqlo_jacket.jpg'
img = Image.open(image_one)
image_one_hash = imagehash.whash(img)
Now we will compare the black coat to this grey hoodie that is also from Uniqlo.
Hash the grey hoodie image.
image_two = 'uniqlo_grey_hoodie.jpg'
img2 = Image.open(image_two)
image_two_hash = imagehash.whash(img2)
The Hamming distance can be used to measure the similarity of the images.
To calculate the Hamming distance, you take two strings of the same length and compare them at each index; the distance increases by one for every position where the strings differ.
Note: In this case, the Hamming distance is calculated from the binary representation of the hashes.
The black coat hash converted to binary:
and the grey hoodie:
If the images are identical, they will have the same hash, and the hamming distance will be zero.
So a smaller hamming distance means that they are more similar.
The imagehash package does all of this for us, so we don't have to convert anything to a binary representation ourselves and can easily get the Hamming distance between the two images.
similarity = image_one_hash - image_two_hash
These two images have a Hamming distance of 22.
That doesn't mean much by itself, so let's look at another image.
This black wrap is also from Uniqlo, and we will compare the image to the first black coat.
Do you think the Hamming distance will be more or less than 22?
image_three = 'uniqlo_black_wrap.jpg'
img3 = Image.open(image_three)
image_three_hash = imagehash.whash(img3)
And then find the Hamming distance.
similarity_two = image_one_hash - image_three_hash
With a Hamming distance of 12, the image of the black wrap is considered more similar to the black coat than the grey hoodie was at 22.
How similar is similar enough?
Depending on your goals, you might have a different idea of what threshold makes two images similar enough to count as a match.
But the gist of this idea is that the user's uploaded image is compared to images and then similar matches are returned to them as search results.
Scaling and performance issues
If you were just taking an input image and comparing it to every other image in your database, that could perform terribly.
You might have millions or even billions of images to compare it to!
In that case you would want to look into data structures that are much better suited to this type of search, such as k-d trees, VP-trees, and ball trees.
I used the Ball Tree data structure implementation from scikit-learn.
Read more about nearest neighbors as well.
Thanks for reading!
Interestingly, another part of this fashion app project that I dabbled in involved training a classifier to tell which category of fashion item was in the image - shoes, dresses, tops, bottoms, handbags, etc.
This was years ago and I had to compile all of the images to train this classifier myself, but now there is a Fashion MNIST dataset available!
As well, there is a TensorFlow tutorial that uses the Fashion MNIST dataset to classify items of clothing.
If you're not familiar with MNIST, it's a huge database of hand-written digits that is often used for training image processing systems.
If you have any questions or comments, write them below or reach out to me on Twitter @LVNGD.