Subtitles
Overview
The Subtitles task creates caption tracks (.vtt or .srt) for video or audio files.
It converts detected speech into subtitle text with timestamps synchronized to playback.
When a Subtitles task runs, it creates a Track file with kind: "subtitles"
and outputs a .vtt or .srt text file containing caption data.
Example Output
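As an illustration, a Track file produced by a Subtitles task might look like the sketch below. The keys are taken from the properties listed under File Structure; every value (the ID, filename, URL, sizes, and timestamps) is a hypothetical placeholder rather than real API output.

```python
import json

# Hypothetical Track file for a Subtitles task.
# Keys follow the File Structure table; all values are invented placeholders.
example_track = {
    "id": "file_abc123",
    "object": "track",
    "kind": "subtitles",
    "language": "en",          # ISO 639-1 language code
    "format": "vtt",           # "vtt" or "srt"
    "filename": "example.vtt",
    "duration": 123.45,        # duration of the associated media, in seconds
    "filesize": 2048,          # size of the subtitle file, in bytes
    "url": "https://example.com/example.vtt",
    "metadata": {},
    "created": "2025-01-01T00:00:00Z",
    "updated": "2025-01-01T00:00:00Z",
}

# Print the object as formatted JSON.
print(json.dumps(example_track, indent=2))
```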
Creating a Subtitles Task
Subtitles tasks can be created directly for a file or URL.
They typically follow a Speech task that transcribes the spoken audio.
When the task completes, ittybit will create a Track file in your project
and send a webhook to your endpoint if webhook_url is defined.
Webhook Example
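A minimal sketch of how a webhook receiver might pick out the subtitle file URL. It assumes the webhook body is a JSON object shaped like the Track file described under File Structure; the actual payload may be wrapped or structured differently, and the function name `subtitle_url_from_webhook` is hypothetical.

```python
import json
from typing import Optional


def subtitle_url_from_webhook(body: bytes) -> Optional[str]:
    """Return the subtitle file URL if the payload describes a subtitles track.

    Assumption: the body is a JSON object with the Track file properties
    ("object", "kind", "url"). The real webhook payload may differ.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return None  # body is not valid JSON; ignore it
    if not isinstance(payload, dict):
        return None  # expected a JSON object, not a list or scalar
    if payload.get("object") == "track" and payload.get("kind") == "subtitles":
        url = payload.get("url")
        return url if isinstance(url, str) else None
    return None  # some other kind of file or event
```

A handler built this way ignores anything that is not a subtitles track, so the same endpoint can safely receive webhooks from other tasks.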
File Structure
| Property | Type | Description | 
|---|---|---|
| id | string | Unique file ID for the subtitle track. | 
| object | string | Always "track". | 
| kind | string | Always "subtitles". | 
| language | string | Language code (ISO 639-1). | 
| format | string | Output format — "vtt" or "srt". | 
| filename | string | Name of the subtitle file. | 
| duration | number | Duration of the associated media file in seconds. | 
| filesize | number | Size of the subtitle file in bytes. | 
| url | string | Publicly accessible subtitle file URL. | 
| metadata | object | Reserved for additional details. | 
| created / updated | string (ISO 8601) | Timestamps for creation and last update. | 
Supported Inputs
Subtitles can be generated from:
- Video files: .mp4, .mov, .webm
- Audio files: .mp3, .wav, .m4a
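The extensions listed above can be checked before a task is created. The sketch below is one way to do that on the client side; it assumes the list is exhaustive, which this page does not state, and the helper name `input_type` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions taken from the Supported Inputs list.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def input_type(path_or_url: str) -> Optional[str]:
    """Return "video", "audio", or None based on the file extension."""
    # urlparse strips any query string or fragment; plain paths pass through.
    path = urlparse(path_or_url).path
    ext = PurePosixPath(path).suffix.lower()
    if ext in VIDEO_EXTENSIONS:
        return "video"
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    return None
```

For example, `input_type("clip.MP4")` returns `"video"` and `input_type("notes.txt")` returns `None`.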
Example Workflow Integration
Subtitles tasks can be part of a broader Automation
that processes uploaded media.
For example, an automation triggered when a new media file is created can encode the video and then generate subtitles.
Example Output Format
Typical .vtt output:
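To make the format concrete, the sketch below renders a list of cues as WebVTT, with an SRT variant for comparison. The cue text and timings are invented for illustration, and the function names are hypothetical; this shows the general shape of the two formats, not the exact output ittybit produces.

```python
def format_timestamp(seconds: float, separator: str = ".") -> str:
    """Format seconds as HH:MM:SS.mmm (WebVTT) or HH:MM:SS,mmm (SRT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def render_vtt(cues) -> str:
    """Render (start, end, text) cues as a WebVTT document."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


def render_srt(cues) -> str:
    """Render the same cues as SRT: numbered cues, comma before milliseconds."""
    lines = []
    for index, (start, end, text) in enumerate(cues, start=1):
        lines.append(str(index))
        lines.append(
            f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        )
        lines.append(text)
        lines.append("")
    return "\n".join(lines)


# Invented sample cues: (start seconds, end seconds, caption text).
cues = [
    (0.0, 2.5, "Welcome to the video."),
    (2.5, 5.0, "Subtitles appear in sync with playback."),
]
print(render_vtt(cues))
```

The printed document starts with the `WEBVTT` header, followed by one block per cue: a timing line such as `00:00:00.000 --> 00:00:02.500` and then the caption text.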
Common Use Cases
- Generating closed captions for accessibility
- Localizing content into multiple languages
- Improving SEO and video searchability
- Creating transcribed learning materials
Summary
The Subtitles task converts detected speech into time-coded captions for video or audio.
It produces .vtt or .srt files compatible with standard web and media players
and can be chained with other tasks in workflows or automations.