Mistral Official Docs

Mistral OCR

mistral-ocr-2503

active

Mistral OCR is a document understanding service that extracts interleaved text and images from a wide range of document formats. Its 32,768-token context window lets it process complex, multi-page documents.

Supports Text, Image, Video, Audio, Transcription, and Text-to-Speech inputs and outputs. Specialized for document processing and analysis tasks.

Supports fine-tuning for custom applications, allowing users to adapt the model to their specific needs. Capable of generating structured output formats for seamless integration into downstream workflows.

Additional Information

Notes

Also available via the alias 'mistral-ocr-latest'.
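As an illustration of choosing between the alias and the dated identifier, the sketch below builds a request body without sending it. The model names come from this page. The body shape (a `document` object with `type` and `document_url` fields) is an assumption based on Mistral's public OCR API and should be checked against the current API reference. The helper name `build_ocr_request` and the example URL are hypothetical.

```python
import json


def build_ocr_request(document_url: str, pinned: bool = False) -> dict:
    """Build a request body for an OCR call (hypothetical helper).

    pinned=True selects the dated identifier from this page;
    otherwise the alias is used and may resolve to a newer model later.
    """
    model = "mistral-ocr-2503" if pinned else "mistral-ocr-latest"
    return {
        "model": model,
        # Assumed body shape; verify against the official API reference.
        "document": {"type": "document_url", "document_url": document_url},
    }


if __name__ == "__main__":
    body = build_ocr_request("https://example.com/report.pdf")
    print(json.dumps(body, indent=2))
```

Pinning the dated identifier is the usual choice when reproducible output matters more than picking up model updates automatically.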

Model Timeline

Launch Date

3/1/2025

Last Updated

3/1/2025

Capabilities

Text

Input Pricing

$0.70/MTok

Context: 32,768 tokens

Output Pricing

$0.70/MTok

Max tokens: 4,096
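The two limits above interact when a request is large. A minimal sketch of the budget arithmetic follows. It assumes that input and output tokens share the 32,768-token context window, which this page does not state explicitly. The function name `output_budget` is hypothetical.

```python
CONTEXT_WINDOW = 32_768    # "Context: 32,768 tokens" on this page
MAX_OUTPUT_TOKENS = 4_096  # "Max tokens: 4,096" on this page


def output_budget(input_tokens: int) -> int:
    """Return how many output tokens a request can still receive.

    Assumption: input and output tokens count against the same window.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - input_tokens
    # The output cap applies even when plenty of context remains.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))
```

Under that assumption, a 30,000-token input leaves 2,768 tokens for output, and any input of 28,672 tokens or fewer leaves the full 4,096.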

Vision Capabilities

Max resolution: 4096x4096

Max images per prompt: 10
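These limits can be checked on the client before a request is sent. The sketch below assumes that "max resolution" means neither side of an image may exceed 4096 pixels. The function name `check_images` is hypothetical.

```python
MAX_SIDE = 4096   # from "Max resolution: 4096x4096"
MAX_IMAGES = 10   # from "Max images per prompt: 10"


def check_images(sizes: list[tuple[int, int]]) -> list[str]:
    """Return a list of problems for the given (width, height) pairs.

    An empty list means the batch fits the limits listed on this page.
    """
    problems = []
    if len(sizes) > MAX_IMAGES:
        problems.append(f"{len(sizes)} images exceeds the limit of {MAX_IMAGES}")
    for index, (width, height) in enumerate(sizes):
        # Assumption: the limit applies to each side independently.
        if width > MAX_SIDE or height > MAX_SIDE:
            problems.append(
                f"image {index} is {width}x{height}, above {MAX_SIDE}x{MAX_SIDE}"
            )
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller decide whether to resize, split the batch, or both.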

Image

Input Pricing

1,000 tokens/image
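The listed prices can be combined into a rough per-request estimate. The sketch below assumes that each image is billed as 1,000 input tokens at the text input rate, which is one reading of the figures on this page rather than a confirmed billing rule. The function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

INPUT_USD_PER_MTOK = Decimal("0.70")   # text input price on this page
OUTPUT_USD_PER_MTOK = Decimal("0.70")  # text output price on this page
TOKENS_PER_IMAGE = 1000                # image input figure on this page


def estimate_cost_usd(text_tokens: int, output_tokens: int, images: int = 0) -> Decimal:
    """Estimate the cost of one request in US dollars.

    Assumption: images are billed as input tokens at the text input rate.
    """
    input_tokens = text_tokens + images * TOKENS_PER_IMAGE
    total = (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    )
    # Prices are quoted per million tokens (MTok).
    return total / Decimal(1_000_000)
```

For example, 20,000 text tokens plus 10 images (30,000 input tokens in total) and 4,000 output tokens come to 23,800 tokens at $0.70/MTok, or about $0.0238. `Decimal` is used so that small currency amounts are not distorted by binary floating point.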

Embeddings

Embeddings Pricing

$0.10/1k tokens

Additional Model Information

Tool Use

Structured Output

Yes
