Last updated: 17/03/2025

Mistral · Official Docs

Mistral OCR

mistral-ocr-latest

active

## Mistral OCR

Mistral OCR is a document understanding API, introduced by Mistral as the world's best. It takes documents and images as input and returns the extracted text, within a 32,768-token context window.
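As a rough illustration of calling a document understanding API with the model name listed on this page, the sketch below builds (but does not send) an HTTP request. The endpoint URL, the payload shape, and the helper name `build_ocr_request` are assumptions for illustration and are not stated on this page; check the official docs before relying on them.

```python
import json
import urllib.request

# Assumption: endpoint path and payload layout are illustrative, not taken from this page.
OCR_ENDPOINT = "https://api.mistral.ai/v1/ocr"


def build_ocr_request(document_url: str, api_key: str,
                      model: str = "mistral-ocr-latest") -> urllib.request.Request:
    """Build a POST request asking the OCR model to process a document by URL."""
    payload = {
        "model": model,  # model name as listed on this page
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical document URL and placeholder key; nothing is sent here.
req = build_ocr_request("https://example.com/sample.pdf", api_key="YOUR_API_KEY")
```

Passing `req` to `urllib.request.urlopen` would perform the call; it is left out so the sketch runs without a key or network access.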

Additional Information

Notes

Context length: 32,768 tokens. Alias: mistral-ocr-2503. Pricing: $1 per 1,000 pages.
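The per-page pricing in the note above can be turned into a quick estimate. This is a minimal sketch that assumes billing is strictly linear in page count with no minimum charge or rounding; the function name `ocr_cost_usd` is hypothetical.

```python
from decimal import Decimal

# From the Notes line: $1 buys 1,000 pages.
PRICE_PER_1000_PAGES_USD = Decimal("1")


def ocr_cost_usd(pages: int) -> Decimal:
    """Estimate OCR cost in USD, assuming linear per-page billing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_1000_PAGES_USD * pages / 1000


small_job = ocr_cost_usd(250)
large_job = ocr_cost_usd(12_000)
```

Under that assumption a 250-page job comes to $0.25 and a 12,000-page job to $12. `Decimal` is used so that small per-page amounts do not pick up floating-point error.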

Capabilities

Text

Input Pricing

$0.70 / MTok

Context: 32,768 tokens

Output Pricing

$0.70 / MTok

Max tokens: 4,096

Vision Capabilities

Max resolution: 4096x4096
Max images per prompt: 10

Image

Input Pricing

1000 tokens/image
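The figures listed on this page (32,768-token context, 10 images per prompt, 1,000 tokens per image) suggest a simple budget check. The sketch below assumes image tokens count against the same context window as text, which this page does not state; `image_token_budget` is a hypothetical helper.

```python
CONTEXT_TOKENS = 32_768        # context length listed on this page
TOKENS_PER_IMAGE = 1_000       # image input accounting listed on this page
MAX_IMAGES_PER_PROMPT = 10     # vision limit listed on this page


def image_token_budget(num_images: int) -> tuple[int, int]:
    """Return (tokens used by images, tokens left for text).

    Assumes image tokens are drawn from the same context window as text.
    """
    if not 0 <= num_images <= MAX_IMAGES_PER_PROMPT:
        raise ValueError(f"num_images must be between 0 and {MAX_IMAGES_PER_PROMPT}")
    image_tokens = num_images * TOKENS_PER_IMAGE
    return image_tokens, CONTEXT_TOKENS - image_tokens


used, remaining = image_token_budget(MAX_IMAGES_PER_PROMPT)
```

With the maximum of 10 images, 10,000 tokens go to images and 22,768 remain for text under this assumption.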

Embeddings

Embeddings Pricing

$0.10/1k tokens