Last updated: 16/04/2025

GroqOfficial Docs

Llama 4 Scout 17B 16E Instruct

meta-llama/llama-4-scout-17b-16e-instruct

active

Llama 4 Scout 17B 16E Instruct

Llama 4 Scout is a powerful 17B parameter model with 16 experts, designed for efficient instruction-following tasks. With a large 131,072 token context window, it delivers high-speed performance of approximately 460 tokens per second, making it well-suited for a variety of text-based applications.

Supports a 131,072 token context window. Handles Text, Image, Video, Audio, Transcription, Text-to-Speech inputs and outputs. Supports fine-tuning for custom applications.

Additional Information

Notes

This is a preview model intended for evaluation purposes only and should not be used in production environments as it may be discontinued at short notice. It has a maximum completion token limit of 8,192 tokens.

Capabilities

Text

Input Pricing

$0.11/ MTok

Context: 131,072 tokens

Output Pricing

$0.34/ MTok

Max tokens: 8,192

Image

Input Pricing

6400 tokens/image

Text-to-Speech

Text-to-Speech Pricing

$0.05/1k characters

Embeddings

Embeddings Pricing

$0.0001/1k tokens

Additional Model Information

Tool Use

No

Structured Output

No

Reasoning

Yes

Flatten your repo for AI in seconds

Flatten repos. Prompt faster. One click → one GPT-ready file

Free Online & Desktop