Last updated: 2025-04-16

OpenAI Official Docs

GPT-4o mini Realtime

gpt-4o-mini-realtime-preview

active

A smaller, real-time model for text and audio input and output, with a 128K-token context window. Listed capabilities include text, image, video, audio, transcription, and text-to-speech.

Supports fine-tuning for custom applications, tool use for automation, and structured output generation.

Model Timeline

Launch Date

2024-12-17

Capabilities

Text

Input Pricing

$0.60/KTok

Context: 128,000 tokens

Output Pricing

$0.30/ KTok

Max tokens: 4,096
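
As a rough illustration of how the text rates and limits listed above combine, the following sketch estimates the cost of a single text request. The function name `estimate_text_cost` is hypothetical, the rates are copied from this listing as-is rather than verified against billing, and the check that input plus output must fit inside the context window is an assumption.

```python
# Values copied from the listing above; treat them as listed figures,
# not as verified billing rates.
INPUT_USD_PER_KTOK = 0.60
OUTPUT_USD_PER_KTOK = 0.30
CONTEXT_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 4_096


def estimate_text_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one text request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed max tokens")
    # Assumption: input and output tokens share the same context window.
    if input_tokens + output_tokens > CONTEXT_TOKENS:
        raise ValueError("request exceeds the listed context window")
    cost = (input_tokens / 1000) * INPUT_USD_PER_KTOK
    cost += (output_tokens / 1000) * OUTPUT_USD_PER_KTOK
    return round(cost, 6)
```

For example, `estimate_text_cost(10_000, 1_000)` charges 10 KTok of input at the listed input rate plus 1 KTok of output at the listed output rate.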

Vision Capabilities

Max resolution: 2048x2048
Max images per prompt: 10

Image

Input Pricing

1,000 tokens/image

Audio

Input Pricing

$10.00/minute

Generation Pricing

$20.00/minute
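
The per-minute audio rates above can be turned into a session estimate along the following lines. This is a minimal sketch: `estimate_audio_cost` is a hypothetical helper, and it assumes usage is pro-rated by the second, which this listing does not state.

```python
# Per-minute rates copied from the listing above (unverified).
AUDIO_INPUT_USD_PER_MIN = 10.00
AUDIO_GENERATION_USD_PER_MIN = 20.00


def estimate_audio_cost(input_seconds: float, generated_seconds: float) -> float:
    """Estimate the USD cost of an audio session (hypothetical helper)."""
    if input_seconds < 0 or generated_seconds < 0:
        raise ValueError("durations must be non-negative")
    # Assumption: billing is pro-rated by the second, not rounded up to minutes.
    cost = (input_seconds / 60) * AUDIO_INPUT_USD_PER_MIN
    cost += (generated_seconds / 60) * AUDIO_GENERATION_USD_PER_MIN
    return round(cost, 6)
```

If billing turns out to round up to whole minutes, the durations would need to be rounded with `math.ceil` before multiplying.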

Transcription

Transcription Pricing

$10.00/minute

Text-to-Speech

Text-to-Speech Pricing

$0.01/1k characters

Embeddings

Embeddings Pricing

$0.02/1k tokens

Fine-Tuning

Fine-Tuning Pricing

$3.00/MTok training

Additional Model Information

Tool Use

Yes

Structured Output

Yes

Reasoning

Yes
