Determining how similar two images are with Python + Perceptual Hashing

Years ago I dabbled a bit with a side project that was an app to identify fashion items in an image.

The idea was that users could upload an image of a fashion item and find out things like:

  • What brand are those shoes?
  • Where can I get that dress?

The app would take the uploaded image and search a database of fashion images for potential matches, determined in part by how visually similar each image was to the upload.

So how do you determine how similar two images are?

Perceptual hashing

To compare two images to see how similar they are, you can hash the images and compare the hashes.

But you can't just hash them any old way...

Regular cryptographic hashes like MD5 or SHA-1 won't work for this: if two images differ at all, even by a couple of pixels, their hashes will be completely different.
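
Here is a quick sketch of that effect. The byte string below is a made-up stand-in for raw pixel data (not a real image file), and I'm using SHA-1 from Python's hashlib. Changing a single value by one should typically flip somewhere around half of the digest's bits.

```python
import hashlib

# Made-up stand-in for raw pixel data, not a real image file.
original = bytes([10, 20, 30, 40] * 16)

# Copy the data and nudge one "pixel" value by one.
tweaked = bytearray(original)
tweaked[0] += 1
tweaked = bytes(tweaked)

digest_a = hashlib.sha1(original).hexdigest()
digest_b = hashlib.sha1(tweaked).hexdigest()

# Count how many of the 160 digest bits differ.
differing_bits = bin(int(digest_a, 16) ^ int(digest_b, 16)).count("1")

print(digest_a)
print(digest_b)
print(differing_bits, "of 160 bits differ")
```

The two digests tell you the inputs are different, but nothing about how different they are.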

To most human eyes, the images that differ by a couple of pixels are essentially the same image, so we need to be able to compare them in a way that is more closely aligned with how human eyes would compare them.

Instead, we will use perceptual hashing algorithms to hash the images in such a way that the more similar the two images are, the more similar their hashes also are.

The perceptual hashing algorithms used here boil the original image down to a small grayscale representation (the average hash, for example, scales it to 8x8) and then perform a calculation on each of the 64 resulting values to produce one bit apiece.

The result is a fingerprint of the image that can be compared to other fingerprints.
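
To make that concrete, here is a minimal sketch of the simplest of these algorithms, the average hash, written in plain Python. It is not the wavelet hash I used, and it skips the resizing step: the 8x8 grid of grayscale values below is made up for illustration (a bright border around a dark center), where in practice Pillow would produce it from a real image. The function names are my own, not part of imagehash.

```python
def average_hash_bits(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def bits_to_hex(bits):
    """Pack a bit string into a zero-padded hex string."""
    return format(int(bits, 2), "0{}x".format(len(bits) // 4))


# Made-up 8x8 grayscale grid: bright border (200), dark center (50).
border_row = [200] * 8
middle_row = [200] + [50] * 6 + [200]
grid = [border_row] + [list(middle_row) for _ in range(6)] + [border_row]

# The same grid with one pixel made slightly brighter.
tweaked_grid = [list(row) for row in grid]
tweaked_grid[3][3] = 60

grid_hash = bits_to_hex(average_hash_bits(grid))
tweaked_hash = bits_to_hex(average_hash_bits(tweaked_grid))

print(grid_hash)
print(tweaked_hash)
```

Both grids produce the same fingerprint here, because the small brightness change does not push any pixel across the mean.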

We can use the imagehash library in Python to compute the hash of each image and then compare the hashes to find the most similar images.

I used the wavelet hash in this project, and you can read more about that here.

You may have heard of the average hash as well, which is also available in this library.

In your own project you might want to experiment with different hashes to see which one gives you the best results.


I'm also using the Image module from Pillow.

In a virtual environment, install these two packages.

pip install imagehash
pip install pillow

And import them in a new file.

from PIL import Image
import imagehash

Let's look at a few images to demonstrate.

Uniqlo black jacket

This black coat is from Uniqlo.

First hash the image.

image_one = 'uniqlo_jacket.jpg'

img = Image.open(image_one)
image_one_hash = imagehash.whash(img)
print(image_one_hash)

ffd3c181818181ff

Now we will compare the black coat to this grey hoodie that is also from Uniqlo.

Uniqlo grey hoodie

Hash the grey hoodie image.

image_two = 'uniqlo_grey_hoodie.jpg'

img2 = Image.open(image_two)
image_two_hash = imagehash.whash(img2)
print(image_two_hash)

e3c38787878d8f81

The Hamming distance can be used to measure how similar the images are.

To calculate the Hamming distance, you take two strings of the same length and compare them position by position; the distance is the number of positions at which they differ.
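
As a small sketch of that definition, here it is for ordinary strings. The function name and the example words are my own choices for illustration.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(x != y for x, y in zip(a, b))


# The words differ at the third, fourth, and fifth letters.
word_distance = hamming_distance("karolin", "kathrin")
print(word_distance)  # 3
```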

Note: In this case, the Hamming distance is calculated from the binary representation of the hashes.

The black coat hash converted to binary:

1111111111010011110000011000000110000001100000011000000111111111

and the grey hoodie:

1110001111000011100001111000011110000111100011011000111110000001
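
If you want to reproduce those strings, Python's built-in int and format functions are enough. This is just a sketch with a helper name I made up; the hex strings are the two hashes printed earlier.

```python
def hash_to_bits(hex_hash):
    """Expand a hex hash string into a zero-padded binary string."""
    return format(int(hex_hash, 16), "0{}b".format(len(hex_hash) * 4))


coat_bits = hash_to_bits("ffd3c181818181ff")
hoodie_bits = hash_to_bits("e3c38787878d8f81")

print(coat_bits)
print(hoodie_bits)
```

Each hex digit becomes four bits, so a 16-character hash expands to 64 bits.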

If the images are identical, they will have the same hash, and the Hamming distance will be zero.

So a smaller Hamming distance means that the images are more similar.

The ImageHash package does all of this for us: subtracting one hash object from another returns the Hamming distance, so we don't have to convert anything to a binary representation ourselves.

# Subtracting two hash objects gives their Hamming distance (lower = more similar).
distance = image_one_hash - image_two_hash
print(distance)

22

These two images have a Hamming distance of 22.
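
If you'd like to double-check that number without the library, one way (a sketch, with a helper name of my own) is to XOR the two hashes as integers and count the 1 bits, since XOR leaves a 1 wherever the inputs differ.

```python
def bit_distance(hex_a, hex_b):
    """Hamming distance between two hex hash strings, counted in bits."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


coat_vs_hoodie = bit_distance("ffd3c181818181ff", "e3c38787878d8f81")
print(coat_vs_hoodie)  # 22
```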

That doesn't mean much by itself, so let's look at another image.

This black wrap is also from Uniqlo, and we will compare the image to the first black coat.

Do you think the Hamming distance will be more or less than 22?

Uniqlo black wrap

image_three = 'uniqlo_black_wrap.jpg'

img3 = Image.open(image_three)
image_three_hash = imagehash.whash(img3)
print(image_three_hash)

c3c383838181dbff

And then find the Hamming distance.

# Hamming distance between the black coat and the black wrap.
distance_two = image_one_hash - image_three_hash
print(distance_two)

12

At 12, the black wrap is considered more similar to the black coat than the grey hoodie is, at 22.


How similar is similar enough?

What counts as similar enough depends on your goals, so you will need to pick a Hamming distance threshold that gives acceptable results for your images.
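
One way to express that choice in code is sketched below. The cutoff of 15 is a number I made up purely for illustration, and the function names are hypothetical; you would tune the threshold against your own images.

```python
HASH_BITS = 64        # size of the hashes used in this post
MAX_DISTANCE = 15     # made-up cutoff; tune this for your own data


def similarity_score(distance, hash_bits=HASH_BITS):
    """Map a Hamming distance to a 0..1 score, where 1 means identical hashes."""
    return 1 - distance / hash_bits


def is_similar(distance, max_distance=MAX_DISTANCE):
    """Decide whether two images count as a match under the chosen cutoff."""
    return distance <= max_distance


for distance in (12, 22):
    print(distance, similarity_score(distance), is_similar(distance))
```

With this cutoff the black wrap (12) would count as a match for the black coat and the grey hoodie (22) would not.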

The gist of the idea is that the user's uploaded image is compared to the images in the database, and the closest matches are returned as search results.
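
A bare-bones version of that search might look like the sketch below. The "database" is just a dictionary holding the two hashes computed earlier in this post, the black coat plays the part of the uploaded image, and the function names are my own.

```python
import heapq


def bit_distance(hex_a, hex_b):
    """Hamming distance between two hex hash strings, counted in bits."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def top_matches(query_hash, catalog, k=5):
    """Return the k closest catalog entries as (distance, filename) pairs."""
    scored = ((bit_distance(query_hash, h), name) for name, h in catalog.items())
    return heapq.nsmallest(k, scored)


# Stand-in for a database of precomputed hashes.
catalog = {
    "uniqlo_grey_hoodie.jpg": "e3c38787878d8f81",
    "uniqlo_black_wrap.jpg": "c3c383838181dbff",
}
uploaded_hash = "ffd3c181818181ff"  # hash of the black coat

results = top_matches(uploaded_hash, catalog, k=2)
print(results)
```

Note that this compares the query against every stored hash, which is fine for a handful of images but not for a large collection.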


Scaling and performance issues

If you just took an input image and compared it to every other image in your database, the search time would grow linearly with the size of the database, and that could perform terribly.

You might have millions or even billions of images to compare it to!

In that case you would want to look into data structures that are much better suited to this type of search, such as k-d trees, VP-trees, and ball trees, among others.

I used the Ball Tree data structure implementation from scikit-learn.
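
To show the general idea without any dependencies, here is a small hand-rolled VP-tree over integer hashes. To be clear, this is not the scikit-learn Ball Tree I used, just a sketch of a related structure, and all of the names in it are my own. Each node picks a vantage point and a radius, and the triangle inequality lets the search skip whole branches that cannot contain a close enough match.

```python
import random


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def build_vp_tree(hashes, rng=None):
    """Build a node (vantage, radius, inside, outside), or None if empty."""
    if rng is None:
        rng = random.Random(0)
    hashes = list(hashes)
    if not hashes:
        return None
    vantage = hashes.pop(rng.randrange(len(hashes)))
    if not hashes:
        return (vantage, 0, None, None)
    distances = [hamming(vantage, h) for h in hashes]
    radius = sorted(distances)[len(distances) // 2]  # median distance
    inside = [h for h, d in zip(hashes, distances) if d <= radius]
    outside = [h for h, d in zip(hashes, distances) if d > radius]
    return (vantage, radius, build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def search_within(node, query, max_distance, found=None):
    """Collect (distance, hash) pairs within max_distance of the query."""
    if found is None:
        found = []
    if node is None:
        return found
    vantage, radius, inside, outside = node
    d = hamming(query, vantage)
    if d <= max_distance:
        found.append((d, vantage))
    # Points inside the radius can only match if d - max_distance <= radius.
    if d - max_distance <= radius:
        search_within(inside, query, max_distance, found)
    # Points outside the radius can only match if d + max_distance > radius.
    if d + max_distance > radius:
        search_within(outside, query, max_distance, found)
    return found


jacket = int("ffd3c181818181ff", 16)
hoodie = int("e3c38787878d8f81", 16)
wrap = int("c3c383838181dbff", 16)

tree = build_vp_tree([jacket, hoodie, wrap])
matches = sorted(search_within(tree, jacket, max_distance=12))
print(matches)
```

With only three hashes there is nothing to gain over a plain loop, but the pruning starts to pay off as the collection grows.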

Read more about nearest neighbors as well.


Thanks for reading!

Interestingly, another part of this fashion app project that I dabbled in involved training a classifier to tell which category of fashion item was in the image - shoes, dresses, tops, bottoms, handbags, etc.

This was years ago and I had to compile all of the images to train this classifier myself, but now there is a Fashion MNIST dataset available!

There is also a TensorFlow tutorial that uses the Fashion MNIST dataset to classify items of clothing.

If you're not familiar with MNIST, it's a huge database of handwritten digits that is often used for training image processing systems.


If you have any questions or comments, write them below or reach out to me on Twitter @LVNGD.
