Groq Official Docs

Llama 4 Scout 17B 16E Instruct

meta-llama/llama-4-scout-17b-16e-instruct

active

Llama 4 Scout is a mixture-of-experts model with 17B active parameters and 16 experts, designed for efficient instruction following. It has a 131,072-token context window and generates approximately 460 tokens per second, which suits it to a variety of text-based applications.

Handles text, image, video, audio, transcription, and text-to-speech inputs and outputs. Supports fine-tuning for custom applications.

Additional Information

Notes

This is a preview model intended for evaluation purposes only and should not be used in production environments as it may be discontinued at short notice. It has a maximum completion token limit of 8,192 tokens.
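The two limits quoted on this page (a 131,072-token context window and an 8,192-token completion cap) can be combined into a simple pre-flight check. The sketch below is illustrative only: the function name `completion_budget` is hypothetical, and it assumes that the prompt and the completion share the same context window, which this page does not state explicitly.

```python
# Limits copied from the model card above.
CONTEXT_WINDOW = 131_072         # total tokens
MAX_COMPLETION_TOKENS = 8_192    # maximum completion tokens


def completion_budget(prompt_tokens: int, requested: int = MAX_COMPLETION_TOKENS) -> int:
    """Return the largest completion size that respects both limits.

    Assumption (not stated on the page): prompt and completion tokens
    count against the same context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    # The tightest of: what was asked for, the hard cap, and the space left.
    return min(requested, MAX_COMPLETION_TOKENS, remaining)
```

With a short prompt the completion cap is the binding limit; with a prompt near the context limit, the remaining window is.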

Capabilities

Text

Input Pricing

$0.11/MTok

Context: 131,072 tokens

Output Pricing

$0.34/MTok

Max tokens: 8,192
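As a rough illustration of how the per-million-token (MTok) text prices translate into a per-request cost, here is a minimal sketch. The function name `text_cost_usd` is hypothetical, the rates are the ones listed on this page, and actual billing may differ in rounding or in how tokens are counted.

```python
# Text prices from the model card, in USD per million tokens (MTok).
INPUT_USD_PER_MTOK = 0.11
OUTPUT_USD_PER_MTOK = 0.34


def text_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # A 100,000-token prompt with a full 8,192-token completion.
    print(f"${text_cost_usd(100_000, 8_192):.6f}")
```

Because output tokens cost roughly three times as much as input tokens here, long completions dominate the cost only when prompts are short.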

Image

Input Pricing

6,400 tokens/image
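The flat per-image token charge makes it easy to estimate how much of the context window images consume. The following sketch uses hypothetical helper names and assumes that each image counts as exactly 6,400 input tokens against the 131,072-token window. It does not model any separate per-request image limit the API may impose.

```python
TOKENS_PER_IMAGE = 6_400     # from the model card
CONTEXT_WINDOW = 131_072     # from the model card


def prompt_tokens_with_images(text_tokens: int, image_count: int) -> int:
    """Total input tokens for a prompt with text plus images."""
    if text_tokens < 0 or image_count < 0:
        raise ValueError("counts must be non-negative")
    return text_tokens + image_count * TOKENS_PER_IMAGE


def max_images(text_tokens: int, reserve_for_completion: int = 0) -> int:
    """How many images fit after the text and a reserved completion budget.

    Assumption: images, text, and completion share one context window.
    """
    free = CONTEXT_WINDOW - text_tokens - reserve_for_completion
    return max(0, free // TOKENS_PER_IMAGE)
```

Under these assumptions, an otherwise empty prompt fits at most 20 images, or 19 if the full 8,192-token completion budget is reserved.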

Text-to-Speech

Text-to-Speech Pricing

$0.05/1k characters

Embeddings

Embeddings Pricing

$0.0001/1k tokens

Additional Model Information

Tool Use

Structured Output

Reasoning

Yes
