Determining how similar two images are with Python + Perceptual Hashing

Years ago I dabbled with a side project: an app to identify fashion items in an image.

The idea was that users could upload an image of a fashion item and find out things like:

  • What brand are those shoes?
  • Where can I get that dress?

The app would take the uploaded image and search a database of fashion images for potential matches, which were determined in part by finding images similar to the uploaded one.

So how do you determine how similar two images are?

Perceptual hashing

To compare two images to see how similar they are, you can hash the images and compare the hashes.

But you can't just hash them any old way...

Regular cryptographic hashes like MD5 or SHA-1 won't work for this, because two images that differ even slightly, by a couple of pixels, will produce completely different hashes.

To most human eyes, two images that differ by a couple of pixels are essentially the same image, so we need to compare them in a way that is more closely aligned with how human eyes would compare them.

Instead, we will use perceptual hashing algorithms to hash the images in such a way that the more similar the two images are, the more similar their hashes also are.

The perceptual hashing algorithms used here involve scaling the original image to an 8x8 grayscale image, and then performing calculations on each of the 64 pixels.

The result is a fingerprint of the image that can be compared to other fingerprints.
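To make that concrete, here is a minimal sketch of the simpler average-hash flavor of this idea, using only Pillow. The wavelet hash used below replaces the averaging step with a wavelet transform, but the scale-down-and-fingerprint structure is the same; the function name here is just for illustration.

from PIL import Image

def toy_average_hash(path, hash_size=8):
    # Scale down to an 8x8 grayscale image -> 64 pixels.
    img = Image.open(path).convert('L').resize((hash_size, hash_size))
    pixels = list(img.getdata())
    # One bit per pixel: 1 if the pixel is brighter than the mean, else 0.
    mean = sum(pixels) / len(pixels)
    return ''.join('1' if p > mean else '0' for p in pixels)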

We can use the imagehash library in Python to compute the hash of an image and then compare them to find the most similar ones.

I used the wavelet hash in this project, and you can read more about that here.

You may have heard of the average hash as well, which is also available in this library.

In your own project you might want to experiment with different hashes to see which one gives you the best results.
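The library exposes several of these algorithms side by side, so swapping one for another is a one-line change. A quick sketch, using the same jacket image from the example below:

from PIL import Image
import imagehash

img = Image.open('uniqlo_jacket.jpg')

print(imagehash.average_hash(img))  # average hash
print(imagehash.phash(img))         # perceptual (DCT) hash
print(imagehash.dhash(img))         # difference hash
print(imagehash.whash(img))         # wavelet hash, used in this post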


I'm also using the Image module from Pillow.

In a virtual environment, install these two packages.

pip install imagehash
pip install pillow

And import them in a new file.

from PIL import Image
import imagehash

Let's look at a few images to demonstrate.

Uniqlo black jacket

This black coat is from Uniqlo.

First hash the image.

image_one = 'uniqlo_jacket.jpg'

img = Image.open(image_one)
image_one_hash = imagehash.whash(img)
print(image_one_hash)

ffd3c181818181ff

Now we will compare the black coat to this grey hoodie that is also from Uniqlo.

Uniqlo grey hoodie

Hash the grey hoodie image.

image_two = 'uniqlo_grey_hoodie.jpg'

img2 = Image.open(image_two)
image_two_hash = imagehash.whash(img2)
print(image_two_hash)

e3c38787878d8f81

The Hamming distance can be used to measure how similar the two images are.

To calculate the Hamming distance, take two strings of the same length and compare them index by index, adding one to the distance for every position where the strings differ.

Note: In this case, the Hamming distance is calculated from the binary representation of the hashes.

The black coat hash converted to binary:

1111111111010011110000011000000110000001100000011000000111111111

and the grey hoodie:

1110001111000011100001111000011110000111100011011000111110000001

If the images are identical, they will have the same hash, and the hamming distance will be zero.

So a smaller Hamming distance means that the images are more similar.
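Just to check the arithmetic by hand, here is a quick sketch that counts the differing bits in the two binary strings above:

black_coat  = '1111111111010011110000011000000110000001100000011000000111111111'
grey_hoodie = '1110001111000011100001111000011110000111100011011000111110000001'

# Count the positions where the two bit strings differ.
print(sum(a != b for a, b in zip(black_coat, grey_hoodie)))

This prints 22, the same number the library gives us below.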

The ImageHash package does all of this, so we don't have to convert anything to a binary representation ourselves; subtracting one hash from the other gives the Hamming distance between the two images.

similarity = image_one_hash - image_two_hash
print(similarity)

22

These two images have a Hamming distance of 22.

That doesn't mean much by itself, so let's look at another image.

This black wrap is also from Uniqlo, and we will compare the image to the first black coat.

Do you think the Hamming distance will be more or less than 22?

Uniqlo black wrap

image_three = 'uniqlo_black_wrap.jpg'

img3 = Image.open(image_three)
image_three_hash = imagehash.whash(img3)
print(image_three_hash)

c3c383838181dbff

And then find the Hamming distance.

similarity_two = image_one_hash - image_three_hash
print(similarity_two)

12

With a distance of 12, the image of the black wrap is considered more similar to the image of the black coat than the image of the grey hoodie, which had a distance of 22.


How similar is similar enough?

Depending on your goals, you might have a different idea of what threshold is acceptable for calling two images similar.

But the gist of the idea is that the user's uploaded image is compared to the images in the database, and similar matches are returned to them as search results.
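As a rough sketch of that flow (the helper name, the stored-hash list, and the threshold of 15 are all hypothetical and worth tuning for your own data):

import imagehash
from PIL import Image

def find_matches(upload_path, database_hashes, threshold=15):
    # database_hashes: list of (image_id, imagehash.ImageHash) pairs,
    # precomputed and stored for the images in the database.
    query_hash = imagehash.whash(Image.open(upload_path))
    matches = []
    for image_id, stored_hash in database_hashes:
        if query_hash - stored_hash <= threshold:
            matches.append(image_id)
    return matches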


Scaling and performance issues

If you just took an input image and compared it to every other image in your database, performance could be terrible.

You might have millions or even billions of images to compare it to!

In that case you would want to look into data structures that are better suited to this type of search, such as k-d trees, VP-trees, and ball trees, among others.

I used the Ball Tree data structure implementation from scikit-learn.
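Here is a rough sketch of how the hashes could be fed into scikit-learn's BallTree, assuming database_hashes is a list of ImageHash objects for the images already in the database (each hash object exposes its bits as an 8x8 boolean array via its .hash attribute):

import numpy as np
from sklearn.neighbors import BallTree

# Flatten each 8x8 boolean hash into a 64-bit vector.
X = np.array([h.hash.flatten() for h in database_hashes])
tree = BallTree(X, metric='hamming')

# Find the 5 nearest neighbors of the query image's hash.
query = image_one_hash.hash.flatten().reshape(1, -1)
distances, indices = tree.query(query, k=5)

# scikit-learn's hamming metric is the fraction of differing bits,
# so multiplying by 64 recovers the bit-count distance used above.
print(distances * 64)
print(indices)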

Read more about nearest neighbors as well.


Thanks for reading!

Interestingly, another part of this fashion app project involved training a classifier to tell which category of fashion item was in the image - shoes, dresses, tops, bottoms, handbags, etc.

This was years ago and I had to compile all of the images to train this classifier myself, but now there is a Fashion MNIST dataset available!

As well, there is a TensorFlow tutorial that uses the Fashion MNIST dataset to classify items of clothing.

If you're not familiar with MNIST, it's a huge database of handwritten digits that is often used for training image processing systems.
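If you want to poke at Fashion MNIST yourself, it ships with Keras, so loading it is a one-liner (roughly how the TensorFlow tutorial starts):

import tensorflow as tf

# 60,000 training and 10,000 test images, each a 28x28 grayscale image
# labeled with one of 10 clothing categories.
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(test_labels[:10])    # the first ten labels, as integers 0-9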


If you have any questions or comments, write them below or reach out to me on Twitter @LVNGD.
