Mistral Official Docs

Pixtral 12B

pixtral-12b

active

Pixtral 12B

Pixtral 12B is a powerful AI model from Mistral AI that combines advanced text understanding with image comprehension capabilities. This versatile model can handle a wide range of inputs and outputs, including text, images, video, audio, transcription, and text-to-speech.

Supports a 131K token context window. Handles Text, Image, Video, Audio, Transcription, Text-to-Speech inputs and outputs. Supports fine-tuning for custom applications. Supports tool use for advanced automation. Capable of generating structured output formats.

Pixtral 12B is available under the Apache2 license and can be accessed through Mistral AI's API. To learn more, check out the blog post.

Additional Information

Notes

Context length: 131,072 tokens. Available under Apache2 license. Also known as pixtral-12b-2409.

Model Timeline

Launch Date

9/1/2024

Capabilities

Text

Input Pricing

$0.70/ MTok

Context: 131,072 tokens

Output Pricing

$0.70/ MTok

Max tokens: 4,096

Vision Capabilities

Max resolution: 2048x2048

Max images per prompt: 16

Image

Input Pricing

85 tokens/image

Embeddings

Embeddings Pricing

$0.10/1k tokens

Additional Model Information

Tool Use

Yes

Structured Output

Yes

Reasoning

Yes

Mistral Official Docs

Pixtral 12B

Pixtral 12B

Additional Information

Notes

Model Timeline

Launch Date

Capabilities

Text

Input Pricing

Output Pricing

Vision Capabilities

Image

Input Pricing

Embeddings

Embeddings Pricing

Additional Model Information

Tool Use

Structured Output

Reasoning

Anthropic

Cohere

DeepSeek

Google Vertex AI

Groq

Mistral

OpenAI

X.AI

Additional Information

Notes

Model Timeline

Launch Date

Capabilities

Text

Input Pricing

Output Pricing

Vision Capabilities

Image

Input Pricing

Embeddings

Embeddings Pricing

Additional Model Information

Tool Use

Structured Output

Reasoning

Flatten your repo for AI in seconds

Anthropic

Cohere

DeepSeek

Google Vertex AI

Groq

Mistral

OpenAI

X.AI