Have you ever heard of, or even used, a service like Shazam? Cool, right? No, we are not going to make something as magical as that. But using chromaprint we can create an audio fingerprint so that we can search for music using a music sample.
Before we can use the chromaprint Python library, pyacoustid, we need to install the chromaprint C library on our OS. The build instructions are in its repo. Or, if you use Debian/Ubuntu, you can install it using apt
(if there is any error, try googling it; most of the time it is due to a missing dependency):
sudo apt install ffmpeg acoustid-fingerprinter
Then you can create a virtual environment (or not) and install the pip package:
pip install pyacoustid
Now to generate a fingerprint from a file:
import acoustid
import chromaprint
duration, fp_encoded = acoustid.fingerprint_file('music.mp3')
fingerprint, version = chromaprint.decode_fingerprint(fp_encoded)
print(fingerprint)
The fingerprint will contain an array of signed ints (948 of them for this file) that represent the compact characteristics, aka the fingerprint, of the audio file. It can also be visualized with the help of numpy and matplotlib:
pip install numpy matplotlib
import numpy as np
import matplotlib.pyplot as plt
...
fig = plt.figure()
bitmap = np.transpose(np.array([[b == '1' for b in list('{:32b}'.format(i & 0xffffffff))] for i in fingerprint]))
plt.imshow(bitmap)
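To see what the bit expansion in the bitmap line does to a single value, here is a small sketch that uses only the standard library. The helper name to_bits is made up for this illustration. It zero-pads instead of space-padding like '{:32b}', which should give the same bitmap because only the '1' characters are counted.

```python
def to_bits(value):
    """Return the 32 bits of a signed int as a list of 0/1, most significant bit first."""
    # Masking with 0xFFFFFFFF maps a negative int to its unsigned 32-bit
    # two's-complement form, so every value yields exactly 32 bits.
    return [int(b) for b in format(value & 0xFFFFFFFF, "032b")]


# One row of the bitmap per fingerprint item (before the transpose).
rows = [to_bits(i) for i in [19467542, 53038342, -1]]
for row in rows:
    print("".join(str(bit) for bit in row))
```

Each fingerprint item becomes one column of 32 pixels in the plotted image once the rows are transposed.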

Now, let's say we have another music file and want to check its similarity with the previous file. One way to do this is by calculating the similarity between the fingerprints:
pip install fuzzywuzzy[speedup]
from fuzzywuzzy import fuzz
similarity = fuzz.ratio(sample_fingerprint, fingerprint)
print(similarity)
The similarity will contain the percentage of similarity between sample_fingerprint and fingerprint, calculated using a fuzzy matching algorithm.
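As a point of comparison, and not what this post uses, another way to compare two integer fingerprints is to count how many bits differ between items at the same position. The sketch below is a minimal version of that idea. The function name bit_error_rate is hypothetical, and it assumes both fingerprints start at the same point in the audio.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two fingerprints, compared position by position."""
    n = min(len(fp_a), len(fp_b))  # only the overlapping prefix is compared
    if n == 0:
        raise ValueError("both fingerprints must be non-empty")
    differing = sum(
        # XOR leaves a 1 wherever the two 32-bit items disagree.
        bin((a ^ b) & 0xFFFFFFFF).count("1")
        for a, b in zip(fp_a[:n], fp_b[:n])
    )
    return differing / (32 * n)


sample_fingerprint = [19467542, 53038342, 53042438]
fingerprint = [19467542, 53038342, 53042438]
print(bit_error_rate(sample_fingerprint, fingerprint))
```

A lower value means more similar, with 0.0 for identical fingerprints. What counts as "similar enough" is a threshold you would have to tune on your own files.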
I also made a simple program that uses all the code above. The program calculates the similarity between files in two directories, finds the best match, and also visualizes the fingerprints.
That’s all. Thanks for reading.
Hello, this post is great. I’m trying to create a music service… I want to know how I can compare two big audio files (radio fragments) and find the small similar fragments (songs). I need to split out and extract the common fragments.
Thanks in advance!
Hi. I haven’t done such a thing before. If I had to, maybe I’d try to find the boundaries between songs in the big audio files and then compare the signatures. You can train a machine learning model to detect the boundaries https://pdfs.semanticscholar.org/0fee/81a489242eae8f0ef578321311481ffbda9f.pdf
Thanks. The text behind the link is very interesting, so I need to understand it. Can you explain it to me, or recommend something more about it?
As far as I understand, the idea in the paper is to split the audio into smaller chunks, manually label them either as song or accompaniment (intro, filler between songs, etc.) and use them to train a machine learning model (SVM). Unfortunately I can’t explain more than that due to my lack of knowledge & experience in audio processing.
If you prefer a more practical recommendation, you can search GitHub using the keywords “audio segmentation”. This is the closest one I can find https://github.com/amsehili/audio-segmentation-by-classification-tutorial. It detects more classes than the paper, but you should be able to study it to get a better idea.
Thanks for your explanation… I’ll research it.
Hi, Yohanes.
How are you?
I would like to know how to store the fingerprints in mongodb.
The fingerprint here is just an array of signed ints, so we can store it as it is in MongoDB with the help of a package such as `pymongo`. The actual problem is how to search them again based on similarity. I haven’t done such a thing before, but I would try to cluster all the fingerprints (e.g. using k-means) and store each fingerprint along with its cluster id. Then, to find similar ones, I can (1) find the cluster id of the input, (2) query MongoDB for all fingerprints with the same cluster id, (3) run text similarity on the results…
Hi. I would like to know how I can write the first code snippet in C++
//
duration, fp_encoded = acoustid.fingerprint_file('music.mp3')
fingerprint, version = chromaprint.decode_fingerprint(fp_encoded)
print(fingerprint)
Unfortunately, I haven’t tried it in C++ and found no documentation. I’ve tried to skim the pyacoustid source code but it seems quite complex. If you really need to do it, you should try to trace it from there https://github.com/beetbox/pyacoustid
I tried typing in this code:
import acoustid
import chromaprint
duration, fp_encoded = acoustid.fingerprint_file('YTP.mp3')
fingerprint, version = chromaprint.decode_fingerprint(fp_encoded)
print(fingerprint)
but I get an error saying:
raise FingerprintGenerationError("audio could not be decoded")
acoustid.FingerprintGenerationError: audio could not be decoded
What could be the issue?
@Spencer I think it’s a dependency or audio file encoding issue. If you can share the mp3 file, I can try to process it on my machine
Yes, it seems that it is a dependency problem: freetype: no [The C/C++ header for freetype (freetype2\ft2build.h) could not be found. You may need to install the development package.] png: no [The C/C++ header for png (png.h) could not be found. You may need to install the development package.] However, when I download both freetype and png, it still cannot find them on my system. For example, I have run PS C:\Users\sptzk> pip install pypng, which reports Requirement already satisfied: pypng in c:……, and I also tried downloading from the developers’ site. Is there a way to make it recognize that I…
I don’t use Windows for development, so I can’t be certain. But based on the error message, it is still missing the libraries (freetype2\ft2build.h and png.h). So I think you need to install or copy them and verify their existence.
Having a Python package installed doesn’t guarantee that the actual library is installed, because Python packages usually only act as wrappers around the actual libraries or binaries.
Hey Yohanes, me one more time.
If I don’t need to compare similarity, but I need to know whether the fingerprint of an audio fragment is contained in the fingerprint of a song… What can I do?
Example:
fragmentAudio = [4, 5, 6]
song = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10…]
In my experiment, a segment of a song has a quite similar fingerprint to its longer/full version https://gist.github.com/yohanesgultom/630a831eff1fbdcd84b3cfec6feabe02#file-music_fingerprint-py. Of course you can try to improve the accuracy in various ways, like comparing a segment’s fingerprint against the fingerprints of chunks of the full/long version of a song
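The chunk-by-chunk comparison mentioned above could be sketched as a sliding window: the fragment is compared against every same-length slice of the song, and the offset with the fewest differing bits wins. This is only an illustration under assumptions. The names bit_errors and find_fragment are made up, the bit-difference measure is swapped in for the fuzzy ratio used in the post, and real fingerprints of a fragment will rarely match a slice exactly, so a tolerance on the error rate would have to be tuned.

```python
def bit_errors(a, b):
    """Number of differing bits between two 32-bit fingerprint items."""
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


def find_fragment(fragment, song):
    """Return (offset, error_rate) of the best alignment of fragment inside song."""
    if not fragment or len(fragment) > len(song):
        raise ValueError("fragment must be non-empty and no longer than song")
    best = None
    for offset in range(len(song) - len(fragment) + 1):
        # zip stops at the end of the fragment, so only one window is compared.
        errors = sum(bit_errors(f, s) for f, s in zip(fragment, song[offset:]))
        rate = errors / (32 * len(fragment))
        if best is None or rate < best[1]:
            best = (offset, rate)
    return best


fragment_audio = [4, 5, 6]
song = list(range(-1, 11))  # [-1, 0, 1, ..., 10], as in the question
print(find_fragment(fragment_audio, song))
```

With the toy lists from the question, the best window starts where the value 4 sits in the song and has an error rate of zero.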
Hello. I really appreciate the post. A great work. I just have a small query.
Where are these fingerprints stored in Ubuntu System as I didn’t connect to any local database.
Is there any option to do so? If yes, please revert.
Cheers!
In this post, chromaprint returns an array of integers as an audio fingerprint. So you can store it any way you want, regardless of operating system. For example, you can convert the array of ints to a JSON array and store it in a database.
Thank you for clarifying my query.
Is it possible for you to provide me python code to store the above fingerprints in a local MySQL Database?
My motive is to store these fingerprints in a local database and later access them for verification purposes.
Cheers!
You should be able to find tutorials on inserting data into MySQL (like this) scattered all over the internet. As for converting the array (list) to JSON, it is as simple as:
fingerprint_json = json.dumps(fingerprint)
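For a complete round trip, here is a minimal sketch that stores a fingerprint as a JSON string and reads it back. It uses the standard library's sqlite3 instead of MySQL so that it runs without a server. The table name, column names and helper functions are all made up for this example. The same pattern should carry over to a MySQL driver, with its own connection call and placeholder style.

```python
import json
import sqlite3


def save_fingerprint(conn, name, fingerprint):
    """Store the fingerprint list as a JSON string under the given name."""
    conn.execute(
        "INSERT OR REPLACE INTO fingerprints (name, data) VALUES (?, ?)",
        (name, json.dumps(fingerprint)),
    )
    conn.commit()


def load_fingerprint(conn, name):
    """Return the stored fingerprint as a list of ints, or None if absent."""
    row = conn.execute(
        "SELECT data FROM fingerprints WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
conn.execute(
    "CREATE TABLE IF NOT EXISTS fingerprints (name TEXT PRIMARY KEY, data TEXT NOT NULL)"
)
save_fingerprint(conn, "piano2.wav", [19467542, 53038342, 53042438])
print(load_fingerprint(conn, "piano2.wav"))
```

Replacing ":memory:" with a file path would keep the data between runs.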
Hey, I made a program to analyze bird sounds and print the species of birds as output. But it always gives me the wrong answer. For example, if I input a sample hawk sound, it shows more similarity to the dove sound. Any help?
Interesting stuff! But unfortunately, I’m not an expert in audio analysis or bird sounds. There are many places where such a thing can go wrong: the fingerprinting (feature extraction) method, the similarity function, or even the data itself. If it were me, I would first imitate proven feature extraction methods from a paper like this, then try to improve from there
hi, thank you for the article, when I use the line fig = plt.figure() or plt.imshow(bitmap)
I get errors: _tkinter.TclError: no display name and no $DISPLAY environment variable
what can it be ?
this is the code
import acoustid
import chromaprint
import numpy as np
import matplotlib.pyplot as plt
duration, fp_encoded = acoustid.fingerprint_file('music.mp3')
fingerprint, version = chromaprint.decode_fingerprint(fp_encoded)
print(fingerprint)
fig = plt.figure()
bitmap = np.transpose(np.array([[b == '1' for b in list('{:32b}'.format(i & 0xffffffff))] for i in fingerprint]))
plt.imshow(bitmap)
after printing the fingerprint array, these are the errors:
Traceback (most recent call last):
File "test.py", line 9, in <module>
fig = plt.figure()
File "/usr/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 539, in figure
**kwargs)
File "/usr/lib/python2.7/dist-packages/matplotlib/backend_bases.py", line 171, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/usr/lib/python2.7/dist-packages/matplotlib/backends/backend_tkagg.py", line 1049, in new_figure_manager_given_figure
window = Tk.Tk(className="matplotlib")
File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 1828, in __init__
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable
Hi. Are you running the script on a server (no GUI)? If yes, try the solution from https://stackoverflow.com/a/37605654/1862500
Hi, thanks for the tutorial. Do you know more about how chromaprint makes its fingerprint? I’m after the specific technical details of how it works. I’m trying to identify the same word spoken by the same person at the same speed, but the scores are really bad…
Hi. Unfortunately, I haven’t explored it that far. But you can just check its source code https://github.com/acoustid/chromaprint. You can also try to contact its creator from the official project website https://acoustid.org/
Hello, this post is great. I am trying to do a simple song identification. I am running the code given in the post:
import acoustid
import chromaprint
duration, fp_encoded = acoustid.fingerprint_file('input_1.wav')
fingerprint, version = chromaprint.decode_fingerprint(fp_encoded)
print(fingerprint)
The following error has been thrown:
raise NoBackendError("fpcalc not found")
NoBackendError: fpcalc not found
I am running the file on windows with anaconda.
I have searched blogs that mention downloading fpcalc, but I could not understand how to download and install it. Can you please let me know?
Thank you
You can try to download & use chromaprint’s Windows installer from its website https://acoustid.org/chromaprint
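After installing, a quick way to check whether the fpcalc executable is actually reachable is to look for it on the PATH. The sketch below uses only the standard library. The helper name find_fpcalc is made up, and it assumes (as far as I know from the pyacoustid source) that an FPCALC environment variable, when set, names the executable to use instead of plain fpcalc.

```python
import os
import shutil


def find_fpcalc():
    """Return the path of the fpcalc executable, or None if it cannot be found."""
    # Assumption: FPCALC, when set, overrides the default command name.
    candidate = os.environ.get("FPCALC", "fpcalc")
    # shutil.which searches the PATH, or checks the file directly if a path is given.
    return shutil.which(candidate)


path = find_fpcalc()
print(path if path else "fpcalc not found on PATH")
```

If this reports that fpcalc is not found, adding the folder that contains fpcalc.exe to the PATH, or pointing FPCALC at the full path of the executable, is worth trying before running the fingerprint code again.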
Hello,
I wanted to ask if there is a problem if I use the code for a wav file, not an mp3 file.
Because I am trying, but I get the "audio could not be decoded" error.
Thanks
Hi. I can generate fingerprint from this file https://www.kozco.com/tech/piano2.wav (tested on Ubuntu 20). So the problem may be in your backend (chromaprint/ffmpeg) and/or the wav file itself.
Python 3.8.5 (default, Jul 28 2020, 12:59:40)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import acoustid
>>> import chromaprint
>>> input_path = "/mnt/data/Downloads/piano2.wav"
>>> duration, fp_encoded = acoustid.fingerprint_file(input_path)
>>> fingerprint, version = chromaprint.decode_fingerprint(fp_encoded)
>>> print(fingerprint)
[19467542, 53038342, 53042438, 36292870, 36293982, 36292063, 40751581, 40227293, 48583165, 43086329, 41972072, 41972073, 41969001, 42004329, 113307497, 100921160, 100787657, 117490121, 264290761, 260096505, 260174315, 264462699, 247685482, 239296826, 239297338, 235107130, 234967834, 234902298, 234901850]