Groq Official Docs

Llama 3.1 8B Instant

llama-3.1-8b-instant

active

Meta's Llama 3.1 8B Instant model offers high-speed inference at 750 tokens per second, making it a powerful and efficient choice for a wide range of natural language processing tasks. With a 128K token context window, this model can handle long-form content and complex inputs.

Handles Text, Image, Video, Audio, Transcription, and Text-to-Speech inputs and outputs. Supports fine-tuning for custom applications.
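To put the quoted speed in perspective, the sketch below turns the 750 tokens per second figure from the description into a rough generation-time estimate. The function name is hypothetical, and the estimate assumes steady decoding at the quoted rate, ignoring network latency, queueing, and prompt processing.

```python
# Throughput quoted in the model description (tokens per second).
TOKENS_PER_SECOND = 750


def estimated_generation_seconds(output_tokens: int,
                                 tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Rough decode time in seconds at a constant token rate.

    Assumption: the model sustains the quoted rate for the whole response.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # A 1,500-token response at 750 tokens/s.
    print(f"{estimated_generation_seconds(1500):.1f} s")  # prints "2.0 s"
```

Real-world latency will likely be higher than this figure, since it covers only token generation.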

Capabilities

Text

Input Pricing

$0.05/MTok

Context: 131,072 tokens

Output Pricing

$0.08/MTok

Max tokens: 8,192
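The text pricing and limits listed above can be combined into a quick per-request estimate. This is a minimal sketch: the function names are hypothetical, MTok is read as one million tokens, and it assumes input and output tokens share the 131,072-token context window, which the listing does not state explicitly.

```python
# Figures taken from the listing above.
INPUT_USD_PER_MTOK = 0.05    # input price per million tokens
OUTPUT_USD_PER_MTOK = 0.08   # output price per million tokens
CONTEXT_WINDOW = 131_072     # context size in tokens
MAX_OUTPUT_TOKENS = 8_192    # maximum tokens per response


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one text request at the listed rates."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumption: prompt and response together must fit in the context window.
    """
    return (output_tokens <= MAX_OUTPUT_TOKENS
            and input_tokens + output_tokens <= CONTEXT_WINDOW)


if __name__ == "__main__":
    prompt_tokens, response_tokens = 100_000, 8_192
    print(fits_limits(prompt_tokens, response_tokens))                 # True
    print(f"${estimate_cost_usd(prompt_tokens, response_tokens):.6f}")  # $0.005655
```

At these rates, even a prompt that nearly fills the context window costs well under one cent per request.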

Image

Input Pricing

6400 tokens/image

Text-to-Speech

Text-to-Speech Pricing

$0.05/1k characters

Embeddings

Embeddings Pricing

$0.08/1k tokens

Additional Model Information

Tool Use

Structured Output

Reasoning

Yes
