# Podcast search with Supabase and Ittybit

Build a searchable podcast platform using Ittybit audio processing and Postgres full-text search

Podcast episodes are long and hard to browse. If your users can't search for what was said, they won't find it. This guide wires up Supabase Storage, Ittybit audio processing, and Postgres full-text search so every uploaded episode is transcribed automatically and becomes searchable as soon as processing finishes.

## Architecture

1. Creator uploads a raw audio file to Supabase Storage
2. A database webhook fires an Edge Function on insert
3. The Edge Function creates an Ittybit `audio` task to normalize the file to MP3
4. A second task transcribes the audio
5. Ittybit sends a webhook on completion
6. A receiving Edge Function stores the processed URL and transcript in Postgres
7. A `tsvector` column enables instant full-text search across all episodes

## Create the episodes table

The `search_vector` column is a generated `tsvector` built from the title and transcript, so it updates automatically whenever either one changes. The GIN index keeps queries fast even across thousands of episodes.

```sql
create table public.episodes (
  id uuid primary key default gen_random_uuid(),
  title text not null,
  storage_path text not null,
  source_url text not null,
  audio_task_id text,
  transcript_task_id text,
  status text default 'pending',
  audio_url text,
  transcript text,
  search_vector tsvector generated always as (
    to_tsvector('english', coalesce(title, '') || ' ' || coalesce(transcript, ''))
  ) stored,
  duration_seconds numeric,
  created_at timestamptz default now(),
  updated_at timestamptz default now()
);

create index idx_episodes_search on episodes using gin(search_vector);
```

## Edge Function: dispatch processing

When a file lands in the `podcasts` bucket, this function creates two Ittybit tasks -- one to normalize the audio and one to transcribe it.

<CodeGroup labels={["TypeScript", "curl"]}>
```typescript
// supabase/functions/process-episode/index.ts
// Version pins below are examples; use the ones your project standardizes on.
import { serve } from "https://deno.land/std@0.168.0/http/server.ts";
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

serve(async (req) => {
  const payload = await req.json();
  const record = payload.record;

  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );

  // Build the public URL for the uploaded file (requires a public bucket)
  const { data: urlData } = supabase.storage
    .from(record.bucket_id)
    .getPublicUrl(record.name);

  const sourceUrl = urlData.publicUrl;
  const title = record.name.replace(/\.[^.]+$/, "").replace(/[-_]/g, " ");

  // Insert an episode row to track processing
  const { data: episode, error } = await supabase
    .from("episodes")
    .insert({
      title,
      storage_path: `${record.bucket_id}/${record.name}`,
      source_url: sourceUrl,
      status: "processing",
    })
    .select()
    .single();

  // Stop early if the row could not be created; the tasks need its id
  if (error || !episode) {
    return new Response(JSON.stringify({ ok: false, error: error?.message }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }

  const headers = {
    Authorization: `Bearer ${Deno.env.get("ITTYBIT_API_KEY")}`,
    "Content-Type": "application/json",
  };

  // Task 1: Normalize audio to MP3
  const audioRes = await fetch("https://api.ittybit.com/jobs", {
    method: "POST",
    headers,
    body: JSON.stringify({
      input: sourceUrl,
      kind: "audio",
      options: {
        format: "mp3",
        quality: "high",
      },
      metadata: {
        episode_id: episode.id,
        callback_type: "audio",
      },
    }),
  });
  const audioTask = await audioRes.json();

  // Task 2: Transcribe the audio
  const transcriptRes = await fetch("https://api.ittybit.com/jobs", {
    method: "POST",
    headers,
    body: JSON.stringify({
      input: sourceUrl,
      kind: "transcript",
      metadata: {
        episode_id: episode.id,
        callback_type: "transcript",
      },
    }),
  });
  const transcriptTask = await transcriptRes.json();

  // Store task IDs
  await supabase
    .from("episodes")
    .update({
      audio_task_id: audioTask.id,
      transcript_task_id: transcriptTask.id,
    })
    .eq("id", episode.id);

  return new Response(JSON.stringify({ ok: true }), {
    headers: { "Content-Type": "application/json" },
  });
});
```

```bash
# Test the audio processing task manually
curl -X POST https://api.ittybit.com/jobs \
  -H "Authorization: Bearer $ITTYBIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "https://your-project.supabase.co/storage/v1/object/public/podcasts/ep-42.wav",
    "kind": "audio",
    "options": {
      "format": "mp3",
      "quality": "high"
    }
  }'
```

</CodeGroup>
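The dispatch function derives the title directly from `record.name`. For objects stored inside folders, that name includes the folder path, so the title would too. The helper below is a minimal sketch of a stricter version, assuming you want only the file's base name; `titleFromObjectName` is a hypothetical name, not part of Supabase or Ittybit.

```typescript
// Hypothetical helper: turn a storage object name into a display title.
function titleFromObjectName(name: string): string {
  // Keep only the last path segment, e.g. "shows/tech/ep-42.wav" -> "ep-42.wav"
  const base = name.split("/").pop() ?? name;
  return base
    .replace(/\.[^.]+$/, "") // drop the file extension
    .replace(/[-_]+/g, " ") // treat dashes and underscores as spaces
    .trim();
}
```

In the Edge Function, this could replace the inline `record.name.replace(...)` expression.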

## Wire up the database webhook

In the Supabase Dashboard, go to **Database > Webhooks** and create a new webhook:

- **Table:** `storage.objects`
- **Events:** `INSERT`
- **Type:** Supabase Edge Function
- **Function:** `process-episode`

You can filter to only the `podcasts` bucket by adding a condition on `bucket_id`.
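As a second line of defense, the function itself can ignore uploads it should not process. The guard below is a sketch under stated assumptions: `shouldProcess` and the extension list are illustrative choices, and the record shape covers only the two `storage.objects` fields this guide reads.

```typescript
// Only the fields of the storage.objects record that this guide uses.
interface StorageObjectRecord {
  bucket_id: string;
  name: string;
}

// Assumed list of audio extensions; adjust to what your creators upload.
const AUDIO_EXTENSIONS = [".mp3", ".wav", ".m4a", ".flac", ".ogg"];

// Hypothetical guard: true only for audio files in the expected bucket.
function shouldProcess(
  record: StorageObjectRecord,
  bucket = "podcasts",
): boolean {
  if (record.bucket_id !== bucket) return false;
  const lower = record.name.toLowerCase();
  return AUDIO_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```

Calling this at the top of `process-episode` and returning early when it is `false` avoids creating episode rows for cover art or files in other buckets.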

## Edge Function: receive Ittybit webhook

This function handles callbacks for both the audio and transcript tasks. It checks which type arrived via `metadata.callback_type` and updates the appropriate columns. Once both are done, the episode status moves to `completed`.

<CodeGroup labels={["TypeScript", "curl"]}>
```typescript
// supabase/functions/ittybit-webhook/index.ts
// Version pins below are examples; use the ones your project standardizes on.
import { serve } from "https://deno.land/std@0.168.0/http/server.ts";
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

serve(async (req) => {
  const payload = await req.json();

  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );

  const episodeId = payload.metadata?.episode_id;
  const callbackType = payload.metadata?.callback_type;
  if (!episodeId || !callbackType) {
    return new Response("Missing metadata", { status: 400 });
  }

  // A failure in either task marks the whole episode as failed
  if (payload.status === "failed") {
    await supabase
      .from("episodes")
      .update({ status: "failed", updated_at: new Date().toISOString() })
      .eq("id", episodeId);
    return new Response(JSON.stringify({ ok: true }), {
      headers: { "Content-Type": "application/json" },
    });
  }

  // Update the appropriate fields based on callback type
  const updates: Record<string, unknown> = {
    updated_at: new Date().toISOString(),
  };

  if (callbackType === "audio") {
    updates.audio_url = payload.output?.url;
    updates.duration_seconds = payload.output?.duration;
  }

  if (callbackType === "transcript") {
    updates.transcript = payload.output?.text;
  }

  await supabase.from("episodes").update(updates).eq("id", episodeId);

  // Check if both tasks are now complete
  const { data: episode } = await supabase
    .from("episodes")
    .select("audio_url, transcript")
    .eq("id", episodeId)
    .single();

  if (episode?.audio_url && episode?.transcript) {
    await supabase
      .from("episodes")
      .update({ status: "completed", updated_at: new Date().toISOString() })
      .eq("id", episodeId);
  }

  return new Response(JSON.stringify({ ok: true }), {
    headers: { "Content-Type": "application/json" },
  });
});
```

```bash
# Register your webhook endpoint in the Ittybit dashboard
# or via the API:
curl -X POST https://api.ittybit.com/webhooks \
  -H "Authorization: Bearer $ITTYBIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-project.supabase.co/functions/v1/ittybit-webhook",
    "events": ["job.succeeded", "job.failed"]
  }'
```

</CodeGroup>
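The handler's completion check relies on truthiness, so an empty-string transcript would keep an episode in `processing` indefinitely. Whether Ittybit ever returns empty text (for example, for an instrumental episode) is an assumption here, but an explicit check is a cheap safeguard. In this sketch, `resolveStatus` and the `EpisodeProgress` type are hypothetical names.

```typescript
// The two columns the webhook handler reads back after each update.
interface EpisodeProgress {
  audio_url: string | null;
  transcript: string | null;
}

type EpisodeStatus = "processing" | "completed";

// Hypothetical helper: decide the status from what has been stored so far.
function resolveStatus(episode: EpisodeProgress): EpisodeStatus {
  const hasAudio = episode.audio_url !== null && episode.audio_url !== "";
  // An empty transcript still means the transcript task has reported back.
  const hasTranscript = episode.transcript !== null;
  return hasAudio && hasTranscript ? "completed" : "processing";
}
```

The handler could call this on the row it selects and write the result only when it is `"completed"`.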

## Search endpoint

Create an Edge Function that queries the `search_vector` column. Postgres `ts_rank` sorts results by relevance, and `ts_headline` returns a snippet with matching terms highlighted.

<CodeGroup labels={["TypeScript", "curl"]}>
```typescript
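// supabase/functions/search-episodes/index.ts
// Version pins below are examples; use the ones your project standardizes on.
import { serve } from "https://deno.land/std@0.168.0/http/server.ts";
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

serve(async (req) => {
  const { searchParams } = new URL(req.url);
  const query = searchParams.get("q");
  if (!query) {
    return new Response(JSON.stringify({ episodes: [] }), {
      headers: { "Content-Type": "application/json" },
    });
  }

  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );

  // Delegate ranking and highlighting to the Postgres function
  const { data: episodes, error } = await supabase.rpc("search_episodes", {
    search_query: query,
  });

  if (error) {
    return new Response(JSON.stringify({ error: error.message }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }

  return new Response(JSON.stringify({ episodes }), {
    headers: { "Content-Type": "application/json" },
  });
});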
```

```bash
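# Search for episodes mentioning "kubernetes"
# The Authorization header is needed unless the function is deployed with --no-verify-jwt
curl "https://your-project.supabase.co/functions/v1/search-episodes?q=kubernetes" \
  -H "Authorization: Bearer $SUPABASE_ANON_KEY"
```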

</CodeGroup>
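On the client side, the query needs to be URL-encoded before it is sent to the endpoint. The sketch below shows one way to build the request URL and type the response; `buildSearchUrl` and `EpisodeHit` are hypothetical names, and the fields mirror the columns returned by the `search_episodes` function used in this guide.

```typescript
// One search result, mirroring the columns search_episodes returns.
interface EpisodeHit {
  id: string;
  title: string;
  audio_url: string | null;
  duration_seconds: number | null;
  headline: string;
  rank: number;
}

// Shape of the JSON body the search endpoint responds with.
interface SearchResponse {
  episodes: EpisodeHit[];
}

// Hypothetical helper: build the search URL with a safely encoded query.
function buildSearchUrl(functionsBaseUrl: string, query: string): string {
  const base = functionsBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/search-episodes?q=${encodeURIComponent(query.trim())}`;
}
```

Pass the resulting URL to `fetch` and parse the body as `SearchResponse`. If JWT verification is enabled for the function (the Supabase default), include an `Authorization: Bearer` header carrying your project's anon key.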

The `search_episodes` function lives in Postgres:

```sql
create or replace function search_episodes(search_query text)
returns table (
  id uuid,
  title text,
  audio_url text,
  duration_seconds numeric,
  headline text,
  rank real
) language sql as $$
  select
    e.id,
    e.title,
    e.audio_url,
    e.duration_seconds,
    ts_headline('english', e.transcript, plainto_tsquery('english', search_query),
      'StartSel=<mark>, StopSel=</mark>, MaxWords=35, MinWords=15'
    ) as headline,
    ts_rank(e.search_vector, plainto_tsquery('english', search_query)) as rank
  from episodes e
  where e.search_vector @@ plainto_tsquery('english', search_query)
    and e.status = 'completed'
  order by rank desc
  limit 20;
$$;
```
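One caveat when rendering `headline`: `ts_headline` inserts the `<mark>` tags but does not HTML-escape the surrounding transcript text, so its output should not be injected into a page as-is. A minimal sketch of a sanitizer follows, assuming the `<mark>`/`</mark>` delimiters configured above; `escapeHtml` and `safeHeadline` are hypothetical helper names.

```typescript
// Escape the characters that are significant in HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Escape everything, then restore only the attribute-free <mark> tags.
function safeHeadline(headline: string): string {
  return escapeHtml(headline)
    .replace(/&lt;mark&gt;/g, "<mark>")
    .replace(/&lt;\/mark&gt;/g, "</mark>");
}
```

Because only bare `<mark>` and `</mark>` are restored, any other markup that appears in a transcript stays inert text.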

## Deploy

```bash
supabase functions deploy process-episode
# Ittybit cannot send a Supabase JWT, so skip JWT verification for its webhook
supabase functions deploy ittybit-webhook --no-verify-jwt
supabase functions deploy search-episodes
```

Set your secrets:

```bash
supabase secrets set ITTYBIT_API_KEY=your_ittybit_api_key
```

`SUPABASE_URL` and `SUPABASE_SERVICE_ROLE_KEY` are available automatically in Edge Functions.
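The functions in this guide read secrets with a non-null assertion (`!`), which hides a missing value until a request fails later. A small sketch of a stricter lookup is shown below; `requireEnv` is a hypothetical helper, and it takes the getter as a parameter so it does not depend on any particular runtime.

```typescript
// Hypothetical helper: fetch a required variable or fail with a clear message.
function requireEnv(
  name: string,
  get: (key: string) => string | undefined,
): string {
  const value = get(name);
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In an Edge Function, the getter would wrap Deno.env.get:
// const apiKey = requireEnv("ITTYBIT_API_KEY", (key) => Deno.env.get(key));
```

Failing at startup with the variable's name makes a forgotten `supabase secrets set` much easier to diagnose.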

## Test the pipeline

Upload an episode and watch it progress:

```sql
select id, title, status, audio_url is not null as has_audio,
       transcript is not null as has_transcript
from episodes
order by created_at desc
limit 5;
```

Once the status reads `completed`, search for something mentioned in the episode:

```sql
select title, ts_headline('english', transcript,
  plainto_tsquery('english', 'machine learning'),
  'StartSel=<mark>, StopSel=</mark>, MaxWords=35, MinWords=15'
) as snippet
from episodes
where search_vector @@ plainto_tsquery('english', 'machine learning')
order by ts_rank(search_vector, plainto_tsquery('english', 'machine learning')) desc;
```

## See also

- [Auto-process Supabase uploads](/guides/auto-process-supabase-uploads) -- general-purpose Supabase + Ittybit pipeline
- [Prepare podcast audio](/guides/prepare-podcast-audio) -- format and quality options for podcast distribution
- [Build a user upload pipeline](/guides/build-a-user-upload-pipeline) -- multi-task processing for uploads