
JavaScript: Browser-based Audio Analysis Using AnalyserNode

Greetings. 🎩

I will demonstrate how to use AnalyserNode to compute the peaks and other metrics from an audio source. This demo only outputs the numbers, rather than an elaborate animation.

In this demo, we can choose between a sample audio file (hosted on cloud storage) and a local audio file.

Demonstration

An amazing user interface is ready to interface.


Application Flow

The overall flow of the application is as follows (short code sketches of the main steps come right after the list):

  1. User loads the page

    DOMContentLoaded

    ⬇️

    applicationUI() runs

    ⬇️

    Initial UI and variables are prepared.

  2. User selects a source (sample URL or local file)
    • Sample audio chosen

      Button clicked

      ⬇️

      stopStream() clears old state

      ⬇️

      File fetched from the network → Blob → Blob URL

      ⬇️

      setupAudioAnalysis(blobURL) builds a fresh audio element.


      Make sure the hosted file is served with a cross-origin response header, e.g.:

      Access-Control-Allow-Origin: *

      Or use your own CORS pattern that allows fetching from the URL where the application resides.

      This is needed because a browser fetch requires an explicit CORS policy on the resource being fetched; the response is then converted into a Blob URL for the audio src and analysed via createMediaElementSource.


      The demonstration accepts .mp3 files, but it also works with .mp4 (if you insist 😂).

    • Local file chosen

      File input changes

      ⬇️

      stopStream() clears old audio, context, analyser, RAF

      ⬇️

      File converted to Blob URL

      ⬇️

      setupAudioAnalysis(blobURL) builds a fresh audio element.

  3. Setting up the audio element

    setupAudioAnalysis(url)

    ⬇️

    Clears output/UI

    ⬇️

    Creates <audio controls> and a note ➡️ Sets src, sets crossOrigin = "anonymous"

    ⬇️

    Appends it into the DOM

    ⬇️

    Browser loads metadata

    ⬇️

    onloadedmetadata fires

    ⬇️

    queueMicrotask(...) builds the Web Audio graph.

  4. Building the Web Audio graph

    Inside the microtask:

    Create AudioContext

    ⬇️

    Create AnalyserNode (FFT size set)

    ⬇️

    Create MediaElementSource from the audio

    ⬇️

    Connect:

    sourceNode → analyser → destination (see the sketches after this list)

  5. User presses Play

    audioEl.onplay runs

    ⬇️

    audioContext.resume() summons the audio graph

    ⬇️

    Start the RAF loop:

    requestAnimationFrame(getAudioLevels)

  6. Continuous real-time analysis

    getAudioLevels() → every RAF tick

    ⬇️

    Reads time-domain data (waveform)

    ⬇️

    Reads frequency-domain data (FFT bins)

    ⬇️

    Computes:

    • Peak amplitude
    • RMS amplitude
    • Zero-crossing rate
    • Peak-hold
    • Spectral centroid
    • 10-band EQ levels (log-spaced)
    ⬇️

    Formats results into HTML

    ⬇️

    Displays live output

    ⬇️

    Schedules next RAF tick.

  7. User switches audio or presses "C L E A R"

    stopStream() fires

    ⬇️

    Cancels RAF

    ⬇️

    Stops input stream if any

    ⬇️

    Disconnects nodes

    ⬇️

    Closes AudioContext

    ⬇️

    Removes audio element

    ⬇️

    Resets UI

    ⬇️

    System returns to idle state.
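
Here's a minimal sketch of steps 2–4, stripped of the demo's UI plumbing. The sample URL is a placeholder, not the demo's actual file; the real page wires the same calls into its buttons:

let audioEl, audioContext, analyser, sourceNode, rafId;

// Step 2: fetch the hosted sample and convert it to a Blob URL.
async function loadSample() {
  stopStream(); // clear old audio, context, analyser, RAF (step 7, below)
  const response = await fetch("https://example.com/sample.mp3"); // placeholder URL
  const blob = await response.blob();
  setupAudioAnalysis(URL.createObjectURL(blob));
}

// Local file chosen: no network fetch needed, a File is already a Blob.
function onFilePicked(event) {
  stopStream();
  const file = event.target.files[0];
  if (file) setupAudioAnalysis(URL.createObjectURL(file));
}

// Steps 3–4: fresh <audio> element, then the Web Audio graph.
function setupAudioAnalysis(url) {
  audioEl = document.createElement("audio");
  audioEl.controls = true;
  audioEl.crossOrigin = "anonymous"; // needed for the hosted (CORS) source
  audioEl.src = url;
  document.body.appendChild(audioEl);

  audioEl.onloadedmetadata = () => {
    // Deferred to a microtask; see the queueMicrotask note below.
    queueMicrotask(() => {
      audioContext = new AudioContext();
      analyser = audioContext.createAnalyser();
      analyser.fftSize = 2048; // gives 1024 frequency bins
      sourceNode = audioContext.createMediaElementSource(audioEl);
      sourceNode.connect(analyser);
      analyser.connect(audioContext.destination); // keep it audible
    });
  };
}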
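
And steps 5–6, the per-frame loop. Only peak, RMS, and zero-crossing rate are computed here; the spectral centroid and the 10-band levels read the same freqData buffer (the band logic is sketched further down). The "output" element id is an assumption for this example:

// Step 5: in the demo this is attached inside setupAudioAnalysis.
function attachPlayHandler() {
  audioEl.onplay = () => {
    audioContext.resume(); // contexts start suspended until a user gesture
    rafId = requestAnimationFrame(getAudioLevels);
  };
}

// Step 6: one reading per animation frame.
function getAudioLevels() {
  const timeData = new Uint8Array(analyser.fftSize);
  const freqData = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteTimeDomainData(timeData); // waveform, centred on 128
  analyser.getByteFrequencyData(freqData);  // FFT magnitudes, 0–255 per bin

  let peak = 0, sumSquares = 0, crossings = 0;
  for (let i = 0; i < timeData.length; i++) {
    const v = (timeData[i] - 128) / 128; // normalise to -1..1
    peak = Math.max(peak, Math.abs(v));
    sumSquares += v * v;
    // A sign change around the midline is one zero crossing.
    if (i > 0 && (timeData[i - 1] - 128) * (timeData[i] - 128) < 0) crossings++;
  }
  const rms = Math.sqrt(sumSquares / timeData.length);

  document.getElementById("output").textContent =
    `peak ${peak.toFixed(3)} · rms ${rms.toFixed(3)} · zero crossings ${crossings}`;

  rafId = requestAnimationFrame(getAudioLevels); // schedule the next tick
}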
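
Finally, step 7, the stopStream() teardown in the same spirit:

// Step 7: tear everything down so a new source starts clean.
function stopStream() {
  if (rafId) cancelAnimationFrame(rafId);
  rafId = null;
  // A getUserMedia input stream, if one were in use, would be stopped here too.
  if (sourceNode) sourceNode.disconnect();
  if (analyser) analyser.disconnect();
  if (audioContext && audioContext.state !== "closed") audioContext.close();
  if (audioEl) {
    audioEl.pause();
    audioEl.remove();
  }
  audioEl = audioContext = analyser = sourceNode = null;
}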

Speaking of a long scroll, here's the condensed version.

Simplified Flow

Load page → pick a source → build the <audio> element → build the Web Audio graph → press play → analyse every frame → clear.


queueMicrotask

I use queueMicrotask to wrap the AudioContext creation inside the onloadedmetadata handler (as in the sketch above), to avoid Chrome's "long task violation" complaint.

queueMicrotask is quite new (2020), so I employ it partly to show you the queue method. But since AudioContext boot is synchronous and heavy, Chrome will still show the "long task violation" warning.

I tried it with setTimeout, requestAnimationFrame, and a bare Promise, and even moved the creation into onplay. Still... 😂

The heavy work is not in the handler at all; it's inside Chrome's internal AudioContext boot. AudioContext startup is synchronous, so it blocks the main thread. So. 🤷‍♂️

That warning only happens the very first time an AudioContext is created. After that, even across browser reloads, there should be no warning. Interesting, that. It means the real-time audio thread stays alive for some time. Reproducing the "very first time" (first boot) is a bit arbitrary at the moment; it's tied to when Chrome suspends its audio engine (the trigger) to save energy and CPU.


HTML of the Demonstration (Style-Element-Script)

This has more than 350 lines. 😶

You can use the button below to open it in a new tab:

Or skim the code below:


The 10-band Logic

The bands:

31, 62, 125, 250, 500, 1k, 2k, 4k, 8k, 16k (Hz)

It's the common web-EQ layout.

Each band:

  • Has a centre frequency (31 to 16k).
  • Uses a 1-octave window around each centre:
    • From centre ÷ √2.
    • To centre × √2.
  • Averages all FFT bins falling inside that range.
  • Outputs the strength of the band as an integer amplitude (0–255).

This is the same approach used by browser EQ visualisers and light JS music visualisers.
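
A sketch of that band logic, reusing the analyser and audioContext from the flow sketches above; the bin width comes from the context's sample rate and the analyser's FFT size:

const BAND_CENTRES = [31, 62, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]; // Hz

function getBandLevels(analyser, audioContext) {
  const freqData = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(freqData);

  // Each FFT bin spans sampleRate / fftSize Hz.
  const binWidth = audioContext.sampleRate / analyser.fftSize;

  return BAND_CENTRES.map((centre) => {
    // 1-octave window: centre ÷ √2 up to centre × √2.
    const lo = Math.max(0, Math.floor(centre / Math.SQRT2 / binWidth));
    const hi = Math.min(freqData.length - 1, Math.ceil(centre * Math.SQRT2 / binWidth));
    if (hi < lo) return 0; // band sits above Nyquist at this sample rate
    let sum = 0;
    for (let i = lo; i <= hi; i++) sum += freqData[i];
    return Math.round(sum / (hi - lo + 1)); // average magnitude, 0–255
  });
}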


Microphone

We can also use the microphone as the audio source, though it's not covered in the application above.
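
For reference, a hedged sketch of what that would look like. It reuses the globals and the getAudioLevels loop from the sketches above, swapping createMediaElementSource for createMediaStreamSource:

async function useMicrophone() {
  stopStream(); // same teardown as before
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  audioContext = new AudioContext();
  analyser = audioContext.createAnalyser();
  analyser.fftSize = 2048;
  sourceNode = audioContext.createMediaStreamSource(stream);
  sourceNode.connect(analyser);
  // Deliberately not connected to destination: live mic → speakers = feedback.
  rafId = requestAnimationFrame(getAudioLevels);
}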


Discrete Fourier Transform

The Fast Fourier Transform (FFT) itself is an algorithm — an optimisation — for computing the Discrete Fourier Transform (DFT) efficiently. So the discrete formulation starts with the DFT itself.

The Discrete Fourier Transform (DFT) is a mathematical method that converts a finite sequence of equally spaced samples from the time domain into a set of complex numbers representing the signal's amplitude and phase at specific frequencies in the frequency domain. It is essential in computing because it enables the conversion of time-based digital signals — like audio, images, or sensor data — into their frequency components, allowing computers to analyse, filter, compress, and visualise these signals more effectively by revealing patterns and structures that are not immediately visible in the time domain.
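
Written out: for N equally spaced samples x_0, x_1, …, x_(N−1), the DFT produces N complex values

  X_k = Σ_(n=0…N−1) x_n · e^(−2πikn/N),   k = 0, 1, …, N−1

where the magnitude of X_k gives the amplitude, and its angle the phase, of the k-th frequency component.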

There's also the continuous counterpart, simply called the Fourier Transform (FT), used in pure mathematics and physics to analyse continuous-time signals (theoretical waveforms, electromagnetic fields, and so on). The DFT is there to approximate the FT when things are sampled and digitised, within the digital realm.
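
To make that concrete, here's a naive O(N²) DFT in JavaScript. It computes exactly what an FFT computes, minus the cleverness; this is purely illustrative, not how AnalyserNode does it internally:

// Naive DFT: magnitude of each frequency bin for a real-valued signal.
// O(N²); the FFT gets the same result in O(N log N).
function dftMagnitudes(samples) {
  const N = samples.length;
  const magnitudes = new Array(N);
  for (let k = 0; k < N; k++) {
    let re = 0, im = 0;
    for (let n = 0; n < N; n++) {
      // e^(−2πikn/N) = cos(−2πkn/N) + i·sin(−2πkn/N)
      const angle = (-2 * Math.PI * k * n) / N;
      re += samples[n] * Math.cos(angle);
      im += samples[n] * Math.sin(angle);
    }
    magnitudes[k] = Math.hypot(re, im); // |X_k|
  }
  return magnitudes;
}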

I remember back in college, in the "Signals and Systems" class, because the professor was quite captivating, I got a "B". Not because I was any good at signal analysis, but because in my thoughts:

🧠 Interesting.

And I wrote plenty of swirling shenanigans using the formulation. More pages with ink, convincing. Ended with a double-line strike to underscore the final answers. I guess the TA (Teaching Assistant) was like, Oh, by golly. This lad. At least he wrote e^(−2πift). And bloody [S_H]_(u,r) = e^(−ikN)?! Isn't it [F_N]_(k,n) = e^(−2πikn/N)? 🤔 It does look like it. All right, let's count the pages.


Fourier

The name refers to Jean-Baptiste Joseph Fourier (1768–1830), a French mathematician and physicist who developed Fourier series and the Fourier transform. He served under Napoleon in Egypt, collecting scientific data and dabbling in administrative roles.

Fourier saw order where others saw noise.

And gave us the mathematical key to unlock the hidden frequencies of the universe.

He's the bloke who quite literally reshaped how we see waves.


Thanks for your visit. All the best. 🎩
