Perceptual hashing is an interesting technique for processing media files. I stumbled upon it not long ago and find it very exciting. Just like cryptographic hashing algorithms such as the SHA family, perceptual hashing functions generate short hash values for given input files. They work on media content, such as images, videos and audio clips.
Perceptual and cryptographic hashing differ in an interesting way. In case of the latter, minor changes to input result in an entirely different hash value. Take a look at the example below; changing just a single letter yields a completely different output.
This is an important property. Thanks to it, we can easily tell whether two files are different by simply comparing their cryptographic hashes. The rsync algorithm, Git commit identifiers and Bitcoin mining are just some applications of this property.
Perceptual hashing offers us a reversed guarantee. Minor changes to its input result in hashes which are similar to one another. Specifically, a bitwise exclusive OR of hashes of two similar inputs will have most bits unset. Perceptual hashes preserve similarity of its inputs.
pHash is an example of a perceptual hashing algorithm. It’s published as an open source library with a C API. Let’s try it out by building a simple command-line tool for calculating perceptual hashes of images. To make matters interesting we’ll do it with Rust; a language featured on this blog before, most recently when Kofi partially automated airing the office.
pHash isn’t available in most popular package managers, but we can install it straight from the source. We’ll focus on images, so after satisfying the library’s dependencies all we need is:
With the library in place, let’s use Cargo to create a Rust project.
To interact with the C API of pHash we’ll need some type aliases available in
the libc crate.
Let’s add it to the list of dependencies in Cargo.toml
The function we’re interested in is ph_dct_imagehash
.
Here’s its declaration from the pHash.h
header file.
To call the function we need to declare its interface in an extern
block.
We have to name the library the function is implemented in;
otherwise the linker wouldn’t be able to resolve it.
We can now define a function adapting the C API to Rust. For one, instead of using an integer value to signal an error or a success we can model it with the Option type.
We want our tool to take file names as its arguments and pass them to
imagehash
.
It should print hexadecimal hashes of all successfully processed images
and report all errors.
We can now compile our tool with cargo build
and let it process some files.
Just as expected, hashes of two photos I took recently in short succession
reflect their similarity; they differ just on 10 out of 64 bits.
Now we can process our entire photo library and generate a perceptual hash of
every image.
Using find
we invoke our tool on all JPEG files we’ve got in ~/pictures
.
Notice that we’re processing one file at a time,
but the task itself is pleasingly parallelizable.
Using find
, we can generate a null-separated list of files and pipe them to
xargs
.
We instruct the latter to run four parallel instances of phash
passing 50
files to each one.
Alternatively, we can implement parallel processing within phash
by using
Rayon, a data-parallelism library for Rust.
Let’s add rayon = "0.9.0"
to the dependency list in Cargo.toml
and modify
our main
function.
We need to accumulate all command-line arguments into a single vector.
Afterwards, we create a parallel iterator over the vector.
The rest remains unaltered.
Parallelism comes nearly for free.
Thanks to Rayon, our tool will use all cores of our machine to their fullest. We don’t even need to specify the level of parallelism explicitly.
Our job here is done. It took us just a screenful of code to implement what we wanted, multi-threaded processing included. Feel free to review the source code in its entirety on GitHub.
A task arising quite naturally at this point is similarity search. Given hashes of all those photos can we tell if a given picture resembles something in our library? Because — I could swear — it does look oddly familiar…
That’s a question I’ll address in a follow-up post. Brace yourself for more Rust, this time with algorithms and data structures. Until next time!