Extract audio waveform data

โ€œBeatSyncโ€ (a music collaboration app like SoundCloud) shows waveform visualizations in its audio player. Rather than rendering waveforms client-side from raw audio, extract the data server-side and ship lightweight JSON to the browser.

API

With the ittybit CLI:

ittybit audio \
  -i https://beatsync-app.com/tracks/collab-99.wav \
  --waveform json \
  --waveform_points 200

JavaScript:

const task = {
  input: 'https://beatsync-app.com/tracks/collab-99.wav',
  kind: 'audio',
  options: {
    waveform: 'json',
    waveform_points: 200,
  },
};

const res = await fetch('https://api.ittybit.com/jobs', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.ITTYBIT_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(task),
});
const data = await res.json();

Python:

import os

import requests

# Read the API key from the environment, as the other examples do.
api_key = os.environ["ITTYBIT_API_KEY"]

task = {
    "input": "https://beatsync-app.com/tracks/collab-99.wav",
    "kind": "audio",
    "options": {
        "waveform": "json",
        "waveform_points": 200,
    },
}

res = requests.post(
    "https://api.ittybit.com/jobs",
    headers={"Authorization": f"Bearer {api_key}"},
    json=task,
)
data = res.json()

curl:

TASK='{
  "input": "https://beatsync-app.com/tracks/collab-99.wav",
  "kind": "audio",
  "options": {
    "waveform": "json",
    "waveform_points": 200
  }
}'

curl -X POST https://api.ittybit.com/jobs \
  -H "Authorization: Bearer $ITTYBIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$TASK"

The waveform_points option controls resolution: how many amplitude samples the output contains. 200 works well for most player widths. Use fewer for thumbnails, more for full-screen views.
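
One way to pick a value is to derive it from the player's pixel width, for example one bar every few pixels. This is only a sketch of how a client might choose the number, not part of the ittybit API; pointsForWidth, barPx, and gapPx are hypothetical names, and the 2px bar with a 1px gap is an assumed layout.

```javascript
// Hypothetical helper: estimate a waveform_points value from the player width.
// Assumes each bar is `barPx` pixels wide and followed by a `gapPx` pixel gap.
function pointsForWidth(widthPx, barPx = 2, gapPx = 1) {
  if (!(widthPx > 0) || !(barPx > 0) || gapPx < 0) {
    throw new RangeError('widthPx and barPx must be positive, gapPx non-negative');
  }
  // Always return at least one point, even for very narrow containers.
  return Math.max(1, Math.floor(widthPx / (barPx + gapPx)));
}

// A 600px-wide player with 2px bars and 1px gaps fits 200 bars.
console.log(pointsForWidth(600)); // 200
```

Under these assumptions, a 600px player lands on the 200 points suggested above, while a 150px thumbnail would need about 50.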

CLI

ittybit audio \
  -i collab-99.wav \
  -o collab-99-waveform.json \
  --waveform json \
  --waveform_points 200

Output format

The JSON output is an object whose data array holds normalized amplitude values between 0 and 1:

{
  "duration": 245.3,
  "sample_rate": 44100,
  "points": 200,
  "data": [0.02, 0.15, 0.43, 0.87, 0.91, 0.76, 0.54, 0.22, ...]
}

Each value represents the peak amplitude for that time slice. A 200-point waveform of a 4-minute track gives roughly 1.2 seconds per point.
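
Because each point covers an equal share of the track, a point index can be mapped to a time range (and a playback time back to a point) with simple arithmetic. The sketch below assumes the duration and points fields shown in the sample output and that points are evenly spaced in time, which the example implies but does not state; secondsPerPoint, timeRangeForPoint, and pointIndexAtTime are hypothetical helper names.

```javascript
// Seconds of audio covered by a single waveform point.
function secondsPerPoint(waveform) {
  return waveform.duration / waveform.points;
}

// Start and end time (in seconds) of the slice behind one point.
function timeRangeForPoint(waveform, index) {
  const step = secondsPerPoint(waveform);
  return { start: index * step, end: (index + 1) * step };
}

// Index of the point covering a playback time, clamped to the valid range.
// Useful for highlighting the bars that have already been played.
function pointIndexAtTime(waveform, seconds) {
  const index = Math.floor((seconds / waveform.duration) * waveform.points);
  return Math.min(Math.max(index, 0), waveform.points - 1);
}

// The 4-minute, 200-point case from the text.
const example = { duration: 240, points: 200 };
console.log(secondsPerPoint(example)); // 1.2
console.log(pointIndexAtTime(example, 60)); // 50
```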

Rendering the waveform

Use the data array to draw bars or a path in a <canvas> or SVG:

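const waveform = await fetch('/api/tracks/collab-99/waveform').then((r) => r.json());

// Assumes the page contains a single <canvas> element for the player.
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');
// Each bar takes an equal share of the width, minus a 1px gap.
const barWidth = Math.max(1, canvas.width / waveform.points - 1);

waveform.data.forEach((amplitude, i) => {
  const x = (i / waveform.points) * canvas.width;
  const height = amplitude * canvas.height;
  // Draw each bar upward from the bottom edge of the canvas.
  ctx.fillRect(x, canvas.height - height, barWidth, height);
});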
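
For SVG, the same data array can be turned into a path string. The following is a minimal sketch of one possible approach, a closed shape mirrored around the vertical center; waveformToSvgPath and its width and height parameters are hypothetical and not part of any ittybit library.

```javascript
// Hypothetical helper: build an SVG path "d" string from normalized amplitudes.
// The shape is mirrored around the vertical midpoint of a width x height box.
function waveformToSvgPath(data, width, height) {
  if (data.length === 0) return '';
  const mid = height / 2;
  // Spread points across the full width; a single point sits at x = 0.
  const step = data.length > 1 ? width / (data.length - 1) : 0;
  // Round coordinates to two decimals to keep the string short.
  const fmt = (n) => Number(n.toFixed(2));
  // Upper outline, left to right.
  const top = data.map((a, i) => `${fmt(i * step)},${fmt(mid - a * mid)}`);
  // Lower outline, right to left, so the path closes cleanly.
  const bottom = data
    .map((a, i) => `${fmt(i * step)},${fmt(mid + a * mid)}`)
    .reverse();
  return `M${top.join(' L')} L${bottom.join(' L')} Z`;
}

console.log(waveformToSvgPath([0, 1], 100, 20));
// M0,10 L100,0 L100,20 L0,10 Z
```

The returned string can be assigned to the d attribute of a path element inside an svg whose viewBox matches the width and height passed in.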

Resolution guide

| Use case | Points | Notes |
| --- | --- | --- |
| Thumbnail / list view | 50-100 | Small player, low detail |
| Standard player | 150-200 | Good balance of detail and payload size |
| Full-screen / timeline editor | 500-1000 | High detail for scrubbing |
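
If the same track appears at several of these sizes, one option is to request a single high-resolution waveform and reduce it client-side rather than running a job per size. The sketch below assumes that taking the maximum of each bucket is an acceptable approximation, in line with each value being a peak amplitude for its time slice; downsamplePeaks is a hypothetical helper, not an ittybit function.

```javascript
// Hypothetical helper: reduce a waveform to fewer points by keeping the
// peak (maximum) value of each bucket of consecutive samples.
function downsamplePeaks(data, targetPoints) {
  if (!Number.isInteger(targetPoints) || targetPoints < 1) {
    throw new RangeError('targetPoints must be a positive integer');
  }
  // Nothing to reduce: return a copy so callers can mutate it safely.
  if (targetPoints >= data.length) return data.slice();

  const out = [];
  for (let i = 0; i < targetPoints; i++) {
    // Bucket boundaries; buckets are never empty because
    // targetPoints is smaller than data.length here.
    const start = Math.floor((i * data.length) / targetPoints);
    const end = Math.floor(((i + 1) * data.length) / targetPoints);
    let peak = 0;
    for (let j = start; j < end; j++) {
      peak = Math.max(peak, data[j]);
    }
    out.push(peak);
  }
  return out;
}

console.log(downsamplePeaks([0.1, 0.5, 0.3, 0.9, 0.2, 0.4], 3));
// [ 0.5, 0.9, 0.4 ]
```

With this approach, a 1000-point waveform fetched once could feed the full-screen view directly and be reduced to 200 or 50 points for the standard player and thumbnails.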