
Shazam and similar apps are among the most amazing things we can carry on a smartphone. It's not that they're unknown, since everyone already uses them, but it still seems almost magical that they can recognize whatever song is playing at any given moment. That "magic" is achieved with mathematics, spectrograms and a gigantic database. Let's see how apps like Shazam actually work.
Spectrography, the essential pillar
These applications are built on what we know as spectrography: the analysis of a signal's frequency spectrum over time. That may sound difficult, but it's easy to explain. When any sound is produced, we hear it because the particles between us and the source vibrate, and those vibrations propagate as waves. The number of times these particles oscillate per second is called the frequency, a term we've all heard applied to sound. Spectrography, in this case, measures the frequencies present in a sound over a period of time. Each sound has a different frequency content at each moment, and this lets us distinguish, on a spectrogram, which sounds are being played.
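The idea of "frequencies over time" can be sketched in a few lines of Python. This is a minimal illustration, not what Shazam actually ships: the window size, hop length and Hann window here are common textbook choices, chosen only to show how a spectrogram is built from short overlapping FFTs.

```python
import numpy as np

def spectrogram(samples, sample_rate, window_size=1024, hop=512):
    """Magnitude spectrogram: one row of frequency magnitudes per time frame."""
    window = np.hanning(window_size)
    frames = []
    for start in range(0, len(samples) - window_size + 1, hop):
        frame = samples[start:start + window_size] * window
        # The FFT turns the windowed slice into per-frequency magnitudes
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)  # shape: (time_frames, freq_bins)

# A pure 440 Hz tone should concentrate its energy near the 440 Hz bin
rate = 8000
t = np.arange(rate) / rate
spec = spectrogram(np.sin(2 * np.pi * 440 * t), rate)
peak_bin = spec[0].argmax()
freq = peak_bin * rate / 1024  # convert bin index back to Hz
```

Running this on one second of a 440 Hz sine wave yields a spectrogram whose strongest bin sits within one bin width of 440 Hz, which is exactly the property recognition relies on.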
For reliable recognition, the system looks for stable energy peaks in that spectrogram (points of concentrated energy at certain frequencies and times). These peaks are much more robust to noise than the rest of the signal, allowing a song to be identified even in challenging environments.
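Peak picking can be sketched as a simple local-maximum search over the spectrogram grid. This is a toy version under assumed parameters (neighborhood size and threshold are made up for the example); production systems use more careful, density-controlled selection.

```python
import numpy as np

def find_peaks(spec, neighborhood=2, threshold=0.5):
    """Keep time-frequency cells that dominate their local neighborhood."""
    peaks = []
    t_max, f_max = spec.shape
    for t in range(t_max):
        for f in range(f_max):
            v = spec[t, f]
            if v < threshold:
                continue  # below the noise floor, ignore
            # local window around (t, f), clipped at the grid edges
            patch = spec[max(0, t - neighborhood):t + neighborhood + 1,
                         max(0, f - neighborhood):f + neighborhood + 1]
            if v >= patch.max():
                peaks.append((t, f))
    return peaks

# Toy spectrogram: two strong cells stand out from a flat noise floor
spec = np.full((10, 10), 0.1)
spec[2, 3] = 1.0
spec[7, 8] = 0.9
print(find_peaks(spec))  # → [(2, 3), (7, 8)]
```

Only the surviving (time, frequency) coordinates move on to the next stage; the bulk of the spectrogram is discarded.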
It's all about comparing
How do you know what song is playing? By comparing. It's like taking an "X-ray" of the sound and comparing it with other X-rays we already had stored, so we can tell which one it matches. This is exactly how Shazam and similar apps work. The process is called "audio fingerprinting": creating an acoustic fingerprint. The app transforms a few seconds of music into a compact representation based on frequency pairs and the time between them. This fingerprint is converted into a set of hashes that are queried against a database.
Using pairs of peaks together with their temporal distance makes matching very fast and robust, even if the recording is compressed or there is echo or background noise. When the match score is high enough, the app returns the song, artist, and other contextual information.
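A pair of peaks plus their time gap can be packed into a single integer hash. The bit layout below is purely illustrative (10 bits per frequency bin, 12 bits for the gap, made up for this sketch); the point is that one small number encodes the whole pair and can serve as a database key.

```python
def pair_hash(f1, f2, dt):
    """Pack two frequency bins and their time gap into one integer.

    Bit widths are illustrative: 10 bits per frequency, 12 for the gap.
    """
    assert 0 <= f1 < 1024 and 0 <= f2 < 1024 and 0 <= dt < 4096
    return (f1 << 22) | (f2 << 12) | dt

h = pair_hash(440, 660, 35)
# The packing is reversible, so we can unpack the fields and verify
f1, f2, dt = h >> 22, (h >> 12) & 0x3FF, h & 0xFFF
```

Because equal pairs always hash to the same value, looking up a hash instantly narrows the search to the handful of songs that contain that exact peak pair.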
Shazam is a spectrograph
When we open Shazam and it tells us it is recognizing the song, what it is actually doing is turning our smartphone into a spectrograph. It captures the sound and generates a spectrogram of it. Once the spectrogram is detailed enough, it moves on to comparing against the entire stored database: in seconds, it generates and sends the fingerprint and waits for the match.
When there is a match, Shazam can show the title, artist, album, synchronized lyrics, music videos, biographies and recommendations of similar songs. In some markets, the same technology is also used to recognize television content and advertisements in order to surface extra information.
More useful technical context
The "miracle" of getting it right so quickly rests on three pillars: the compact acoustic fingerprint (based on peaks and their temporal relationships), a very efficient inverted index that organizes those hashes by song and time, and matching heuristics that score which song fits best even when data is missing. Without needing formulas, the key idea is that "constellations" of characteristic points are compared, not the entire waveform.
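The inverted index and the scoring heuristic can be sketched together in a toy Python example. The hash values, song IDs and helper names (`register`, `match`) are invented for the illustration; the real systems are far larger, but the voting idea is the same: a true match piles votes onto one consistent time offset, while chance collisions scatter.

```python
from collections import defaultdict

# Inverted index: hash -> list of (song_id, time_in_song) postings
index = defaultdict(list)

def register(song_id, fingerprints):
    """fingerprints: iterable of (hash, time_in_song) pairs."""
    for h, t in fingerprints:
        index[h].append((song_id, t))

def match(sample_fingerprints):
    """Vote on (song, time-offset) pairs and return the best-scoring song."""
    votes = defaultdict(int)
    for h, t_sample in sample_fingerprints:
        for song_id, t_song in index.get(h, []):
            # A correct match keeps t_song - t_sample constant
            votes[(song_id, t_song - t_sample)] += 1
    if not votes:
        return None
    (song, _offset), score = max(votes.items(), key=lambda kv: kv[1])
    return song, score

# Toy catalog: song "A" contains hashes 101, 202, 303 at known times
register("A", [(101, 10), (202, 12), (303, 15)])
register("B", [(101, 40), (404, 41)])
# A clip starting 10 units into song "A" yields the same hashes, shifted
print(match([(101, 0), (202, 2), (303, 5)]))  # → ('A', 3)
```

All three query hashes agree on the same offset for song "A", so it wins with three votes; song "B" only collects a stray collision.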
The database is the most complex
In reality, the most complex part of all is the database that stores the fingerprints of all the songs. We know how difficult it is to build a music service that contains all the music in the world; Spotify is one such service, and important songs are still missing from it. If that is already complex, imagine storing fingerprints for all those songs. It is no surprise that part of the work at Shazam and similar apps is dedicated to expanding the database, which is, in fact, the heart of the application. To make this scale, fingerprints are stored as lightweight hashes, allowing millions of references to be compared in milliseconds.
In addition to cloud recognition, there are approaches where the database lives on the device itself to protect privacy and work offline. A well-known example is the "Now Playing" feature on some phones, which maintains a local, optimized catalog for identifying songs privately and offline.
Its offline operation is very simple
Sometimes we might wonder how these applications can work offline, without an Internet connection. It is actually very simple: they just hold on to the result until they can connect. They don't have to save the entire song, or even the snippet of music we want to analyze. All they keep is the fingerprint data, so it can later be compared against the database, and that takes up practically no space. Some apps offer a "nearly offline mode": they capture the sample, generate the fingerprint and leave it waiting to be uploaded and resolved when the connection returns.
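That deferred flow can be sketched as a tiny queue. Everything here is hypothetical (the class, the `capture`/`flush` names, and the `lookup` callback standing in for the real server call); it only shows that nothing but the small fingerprint needs to be stored while offline.

```python
import time

class PendingShazams:
    """Queue fingerprints captured offline; resolve them once online."""

    def __init__(self):
        self.queue = []

    def capture(self, fingerprint):
        # Only the tiny fingerprint is stored, never the audio itself
        self.queue.append((fingerprint, time.time()))

    def flush(self, lookup):
        # `lookup` is a placeholder for the real recognition request
        results = [(lookup(fp), captured_at) for fp, captured_at in self.queue]
        self.queue.clear()
        return results

pending = PendingShazams()
pending.capture("hash-set-1")
pending.capture("hash-set-2")
# Later, with a connection: resolve everything at once
resolved = pending.flush(lambda fp: f"song for {fp}")
```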
Features like Auto Shazam let the app listen in the background and group matches by date. And if the phone has recognition built into the system itself, it can display the song on the lock screen without sending any audio to servers.
The algorithm is essential
However, another essential aspect of these applications is the algorithm they use to compare songs. An algorithm, in essence, is nothing more than a procedure for carrying out a task. Shazam's algorithm must improve constantly, because the system needs to find the song ever faster. One might think that once the spectrograms are understood and the song database is complete, everything is done, but nothing could be further from the truth: the fingerprint must be compared against millions and millions of songs. There are several computing techniques to speed this up, and we won't discuss any in particular, because it would be like describing the shape of the clouds on a stormy day. Still, it is worth knowing that the algorithm is one of the essential elements, alongside the spectrography step and the song database. Details such as selecting "anchor" peaks, pairing them with nearby peaks and encoding each pair with its temporal delta are key to speed and noise resilience.
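The anchor-and-target-zone idea can be sketched briefly. In this toy version (the `fan_out` and `max_dt` parameters are invented for the example), each peak acts as an anchor and is paired with the next few peaks after it, producing the (f1, f2, Δt) triples described above.

```python
def fingerprint_pairs(peaks, fan_out=3, max_dt=100):
    """Pair each anchor peak with the next few peaks in its 'target zone'.

    peaks: list of (time, freq) tuples sorted by time.
    Returns a list of ((f1, f2, dt), anchor_time) fingerprints.
    """
    pairs = []
    for i, (t1, f1) in enumerate(peaks):
        # Target zone: the next `fan_out` peaks, no further than max_dt away
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            dt = t2 - t1
            if dt <= max_dt:
                pairs.append(((f1, f2, dt), t1))
    return pairs

peaks = [(0, 40), (3, 90), (5, 60), (9, 40)]
print(fingerprint_pairs(peaks, fan_out=2))
```

The fan-out multiplies the number of fingerprints per anchor, which is the trade-off that buys noise resilience: even if some peaks are drowned out in a noisy recording, enough pairs usually survive to vote for the right song.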
Beyond Shazam: Methods and Complementary Apps
- SoundHound: similar to Shazam, but adds the ability to recognize hummed or whistled melodies. It works best with popular songs and when the humming keeps the pitch and rhythm.
- Browser extensions: there are options such as the official Shazam extension for Chrome or alternatives like AHA Music, useful for identifying what is playing on the computer itself.
- Midomi: a web service from the creators of SoundHound for searching from your browser, also supporting singing and whistling.
- Deezer SongCatcher: integrated with Deezer to recognize music and add it to your playlists without leaving the app.
- Voice assistants: Google Assistant, Siri and Alexa incorporate recognition; Siri uses Shazam's technology. In many cases you can even hum so they can find the song.
- Recognition on the device: some smartphones offer "Now Playing", with a local database to identify songs offline and privately.
- Musipedia: lets you tap the rhythm with the space bar or whistle to search for matches in its open database.
- Search by lyrics: using search engines with quoted phrases helps find the song; Genius and Musixmatch provide lyrics and timing.
- Communities: in Reddit (tipofmytongue) and social media, it's common for the community to identify songs with a description, a snippet, or a short video. It can also help to explore TikTok viral lists on streaming platforms.
- Radio stations: many offer history of songs broadcast by time slot; usually available for a limited time.
- Listening history: with Last.fm you can "scrobble" your plays and recover forgotten songs from your history.
- AI Assistance: Conversational tools can help you if you give them phrases from the lyrics or descriptions. There are music-focused add-ons that expand the database and refine recommendations.
Practical tips to improve recognition: bring the phone closer to the source, avoid shouting or talking over the music, try to capture fragments with lead vocals or a chorus, and if the first take fails, try another part of the song. Ambient noise and live mixes can make it hard to get a correct answer, although modern algorithms tolerate quite a bit of interference.
Music recognition isn't magic: it combines physics of sound, intelligent information compression and colossal databases. Thanks to this you can discover instantly what's playing, save your findings, link them to your favorite streaming platform, and rely on assistants, websites, and communities when the situation requires it, all with an increasingly faster, more private, and more precise experience.

