Last updated: 16/04/2025

OpenAI Official Docs

GPT-4o mini TTS

gpt-4o-mini-tts

active

A versatile text-to-speech model powered by GPT-4o mini, capable of generating high-quality audio from text inputs.
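As a rough illustration of sending text to this model, the sketch below builds (but does not send) an HTTP request for speech generation. It assumes the standard OpenAI `/v1/audio/speech` endpoint and its `model`, `input`, `voice`, and `response_format` fields; none of these appear on this page, and the helper name `build_speech_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: the standard OpenAI speech endpoint (not stated on this page).
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, voice="alloy", response_format="mp3"):
    """Build a POST request asking gpt-4o-mini-tts to speak `text`.

    Hypothetical helper: it only constructs the request object and does
    not contact the network.
    """
    payload = {
        "model": "gpt-4o-mini-tts",  # model ID as listed on this page
        "input": text,               # the text to convert to speech
        "voice": voice,              # assumed default voice name
        "response_format": response_format,
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request needs a real API key and network access, for example:
# with urllib.request.urlopen(build_speech_request("Hello!", key)) as resp:
#     with open("speech.mp3", "wb") as f:
#         f.write(resp.read())
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any billable call is made.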

Supports a 128,000-token context window. Handles text, image, video, audio, transcription, and text-to-speech inputs and outputs. Supports fine-tuning for custom applications.

Can generate structured output.

Capabilities

Text

Input Pricing

$0.15/KTok

Context: 128,000 tokens

Output Pricing

$0.20/KTok

Max tokens: 4,096
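The text rates above can be turned into a quick cost estimate. This is a minimal sketch that assumes "KTok" means 1,000 tokens and that "Max tokens" caps the output of a single request; the function name `text_cost_usd` is hypothetical.

```python
# Rates and limits as listed on this page.
INPUT_USD_PER_KTOK = 0.15
OUTPUT_USD_PER_KTOK = 0.20
CONTEXT_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 4_096


def text_cost_usd(input_tokens, output_tokens):
    """Estimate the text cost of one request in US dollars.

    Assumptions: 1 KTok = 1,000 tokens, the context window bounds the
    input, and the max-token figure bounds the output.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_TOKENS:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed max tokens")
    return (input_tokens / 1000) * INPUT_USD_PER_KTOK + (
        output_tokens / 1000
    ) * OUTPUT_USD_PER_KTOK


# 2,000 input tokens and 1,000 output tokens cost about $0.50.
estimate = text_cost_usd(2_000, 1_000)
```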

Audio

Input Pricing

Not available

Generation Pricing

$0.01/minute

Transcription

Transcription Pricing

$0.01/minute

Text-to-Speech

Text-to-Speech Pricing

$0.01/1k characters
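For text-to-speech, the listed rate is per 1,000 characters, so the cost scales with the length of the input text. The sketch below assumes characters are counted the way Python's `len()` counts them, which may differ from how the provider actually meters usage; `tts_cost_usd` is a hypothetical name.

```python
# Rate as listed on this page: $0.01 per 1,000 characters.
TTS_USD_PER_1K_CHARS = 0.01


def tts_cost_usd(text):
    """Estimate the text-to-speech cost of `text` in US dollars.

    Assumption: billing counts characters as len(text) does.
    """
    return len(text) / 1000 * TTS_USD_PER_1K_CHARS


# A 2,500-character script costs about $0.025.
script = "a" * 2_500
script_cost = tts_cost_usd(script)
```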

Embeddings

Embeddings Pricing

$0.02/1k tokens

Fine-Tuning

Fine-Tuning Pricing

$3.00/MTok training

Additional Model Information

Tool Use

No

Structured Output

Yes

Reasoning

No
