Split audio by silence detection

“Podstack” (a podcast hosting platform like Transistor) lets creators upload full episode recordings and automatically split them into chapters at natural pauses.

API

ittybit audio \
  -i https://podstack-app.com/uploads/ep-201-raw.wav \
  --split silence \
  --silence_threshold -35 \
  --silence_min_duration 1.5 \
  --format mp3

const task = {
  input: 'https://podstack-app.com/uploads/ep-201-raw.wav',
  kind: 'audio',
  options: {
    split: 'silence',
    silence_threshold: -35,
    silence_min_duration: 1.5,
    format: 'mp3',
  },
};

const res = await fetch('https://api.ittybit.com/jobs', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.ITTYBIT_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(task),
});
const data = await res.json();

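import os

import requests

# Read the key from the same environment variable the other examples use.
api_key = os.environ["ITTYBIT_API_KEY"]
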
task = {
    "input": "https://podstack-app.com/uploads/ep-201-raw.wav",
    "kind": "audio",
    "options": {
        "split": "silence",
        "silence_threshold": -35,
        "silence_min_duration": 1.5,
        "format": "mp3",
    },
}

res = requests.post(
    "https://api.ittybit.com/jobs",
    headers={"Authorization": f"Bearer {api_key}"},
    json=task,
)
data = res.json()

TASK='{
  "input": "https://podstack-app.com/uploads/ep-201-raw.wav",
  "kind": "audio",
  "options": {
    "split": "silence",
    "silence_threshold": -35,
    "silence_min_duration": 1.5,
    "format": "mp3"
  }
}'

curl -X POST https://api.ittybit.com/jobs \
  -H "Authorization: Bearer $ITTYBIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$TASK"

The task returns multiple output files, one per detected segment. Each output includes start and end timestamps from the original recording.
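
As a rough illustration of how those timestamps might be used, the sketch below turns a list of segment outputs into chapter markers. The response shape is an assumption: the `start`, `end`, and `url` field names (with times in seconds) are hypothetical and not confirmed by this page, so adjust them to whatever the task actually returns.

```python
def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def chapter_markers(outputs: list[dict]) -> list[str]:
    """Build one 'HH:MM:SS Chapter N' line per segment, ordered by start time."""
    ordered = sorted(outputs, key=lambda o: o["start"])
    return [
        f"{format_timestamp(o['start'])} Chapter {i}"
        for i, o in enumerate(ordered, start=1)
    ]


# Hypothetical example data; real field names and values may differ.
example_outputs = [
    {"url": "https://example.com/002.mp3", "start": 754.2, "end": 1630.0},
    {"url": "https://example.com/001.mp3", "start": 0.0, "end": 752.6},
]
markers = chapter_markers(example_outputs)
```

Sorting by start time keeps the markers in playback order even if the outputs arrive in a different order.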

CLI

ittybit audio \
  -i ep-201-raw.wav \
  -o chapters/ \
  --split silence \
  --silence_threshold -35 \
  --silence_min_duration 1.5

Output files are named sequentially: chapters/001.mp3, chapters/002.mp3, etc.
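
Because the names are zero-padded, a plain lexicographic sort returns them in playback order. A minimal sketch for collecting the chapters afterwards, assuming the `chapters/` directory and `.mp3` extension shown above (`list_chapters` is a hypothetical helper, not part of the CLI):

```python
from pathlib import Path


def list_chapters(directory: str, extension: str = ".mp3") -> list[Path]:
    """Return the chapter files in a directory, sorted by their zero-padded names."""
    return sorted(Path(directory).glob(f"*{extension}"))
```

Calling `list_chapters("chapters")` after the command above would be expected to return `chapters/001.mp3`, `chapters/002.mp3`, and so on, in that order.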

Silence threshold guide

The silence_threshold is in dBFS (decibels relative to full scale). Lower (more negative) values are stricter: they require deeper silence to trigger a split. Higher values are more permissive and treat louder passages as silence.

Content type	Threshold (dBFS)	Min duration	Notes
Speech / podcast	`-35` to `-30`	1.5 s	Speakers pause between topics
Audiobook	`-40` to `-35`	2.0 s	Chapter breaks are longer, quieter
Live recording	`-25` to `-20`	1.0 s	Background noise raises the floor
Music with gaps	`-50` to `-40`	3.0 s	Only split on true silence between tracks

Start with -35 dBFS and 1.5 seconds for speech content. If you get too many splits, lower the threshold or increase the minimum duration. If real pauses are missed, raise the threshold or shorten the minimum duration.
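
To make the interaction between the two settings concrete, here is a minimal sketch of a generic energy-based silence detector. It is not Ittybit's actual algorithm, which this page does not describe; the function names, the 50 ms analysis window, and the RMS measure are all assumptions chosen for illustration.

```python
import math


def dbfs_to_amplitude(dbfs: float) -> float:
    """Convert a dBFS level to linear amplitude, where 1.0 is full scale."""
    return 10 ** (dbfs / 20)


def rms_dbfs(samples: list[float]) -> float:
    """RMS level in dBFS of a window of samples in the range -1.0..1.0."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def find_silences(samples, sample_rate, threshold_dbfs=-35.0,
                  min_duration=1.5, window_ms=50):
    """Return (start, end) times in seconds of quiet runs lasting min_duration or more."""
    step = max(1, sample_rate * window_ms // 1000)
    silences = []
    run_start = None
    for offset in range(0, len(samples), step):
        quiet = rms_dbfs(samples[offset:offset + step]) < threshold_dbfs
        t = offset / sample_rate
        if quiet and run_start is None:
            run_start = t  # a quiet run begins
        elif not quiet and run_start is not None:
            if t - run_start >= min_duration:
                silences.append((run_start, t))
            run_start = None  # the quiet run ended
    end = len(samples) / sample_rate
    if run_start is not None and end - run_start >= min_duration:
        silences.append((run_start, end))
    return silences


# Synthetic signal at 1 kHz: 2 s loud (about -6 dBFS), 2 s quiet (about -60 dBFS), 1 s loud.
loud = [0.5 if n % 2 == 0 else -0.5 for n in range(2000)]
quiet = [0.001] * 2000
signal = loud + quiet + loud[:1000]

pauses = find_silences(signal, sample_rate=1000)
strict = find_silences(signal, sample_rate=1000, threshold_dbfs=-70.0)
```

With the default -35 dBFS threshold the two-second quiet passage counts as a pause. At -70 dBFS the same passage is not quiet enough, so no split point is found, which is the "lower is stricter" behaviour described above.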

Minimum segment length

Avoid tiny fragments by setting a minimum segment duration:

ittybit audio \
  -i ep-201-raw.wav \
  -o chapters/ \
  --split silence \
  --silence_threshold -35 \
  --silence_min_duration 1.5 \
  --min_segment 30

This discards any segment shorter than 30 seconds, which is useful for filtering out coughs, mic bumps, or other false positives.
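
The filtering step can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: segments are taken to be the audio between detected silences, and a segment of exactly `min_segment` seconds is assumed to be kept. Both helper names are hypothetical.

```python
def segments_between(silences, total_duration):
    """Turn silence intervals into the audible (start, end) segments between them."""
    segments = []
    cursor = 0.0
    for silence_start, silence_end in sorted(silences):
        if silence_start > cursor:
            segments.append((cursor, silence_start))
        cursor = max(cursor, silence_end)
    if cursor < total_duration:
        segments.append((cursor, total_duration))
    return segments


def drop_short_segments(segments, min_segment=30.0):
    """Keep only segments that last at least min_segment seconds."""
    return [(start, end) for start, end in segments if end - start >= min_segment]


# Two pauses close together leave a 7-second fragment between them.
silences = [(412.5, 414.0), (421.0, 423.5)]
segments = segments_between(silences, total_duration=1290.0)
kept = drop_short_segments(segments, min_segment=30.0)
```

Here the 7-second fragment between the two pauses is dropped, leaving two chapters.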

Split and convert

Combine splitting with format conversion. Upload a WAV, get MP3 chapters:

ittybit audio \
  -i ep-201-raw.wav \
  -o chapters/ \
  --split silence \
  --silence_threshold -35 \
  --silence_min_duration 1.5 \
  --format mp3 \
  --quality high