Last updated: 16/04/2025

MistralOfficial Docs

Pixtral 12B

pixtral-12b-2409

active

Pixtral 12B

Pixtral 12B is a powerful AI model from Mistral AI that combines advanced text understanding with state-of-the-art image comprehension capabilities. This versatile model can handle a wide range of inputs and outputs, including text, images, video, audio, transcription, and text-to-speech.

Supports a 131K token context window. Handles Text, Image, Video, Audio, Transcription, Text-to-Speech inputs and outputs. Supports fine-tuning for custom applications. Supports tool use for advanced automation. Capable of generating structured output formats.

Model Timeline

Launch Date

9/1/2024

Last Updated

9/1/2024

Capabilities

Text

Input Pricing

$0.70/ MTok

Context: 131,072 tokens

Output Pricing

$0.70/ MTok

Max tokens: 4,096

Vision Capabilities

Max resolution: 2048x2048
Max images per prompt: 16

Image

Input Pricing

85 tokens/image

Embeddings

Embeddings Pricing

$0.10/1k tokens

Additional Model Information

Tool Use

Yes

Structured Output

Yes

Reasoning

Yes

Flatten your repo for AI in seconds

Flatten repos. Prompt faster. One click → one GPT-ready file

Free Online & Desktop