Last updated: 16/04/2025

Google Vertex AIOfficial Docs

Gemini 2.0 Flash-Lite

gemini-2.0-flash-lite

active

Gemini 2.0 Flash-Lite

Gemini 2.0 Flash-Lite is a cost-efficient and low-latency AI model optimized for multimodal tasks, with a 1 million token context window and support for text, image, video, audio, transcription, and text-to-speech capabilities. It outperforms the previous 1.5 Flash model on a variety of benchmarks.

Supports a 1048576 token context window. Handles Text, Image, Video, Audio, Transcription, Text-to-Speech inputs and outputs. Supports fine-tuning for custom applications. Supports tool use for advanced automation. Capable of generating structured output formats.

Additional Information

Notes

A Gemini 2.0 Flash model optimized for cost efficiency and low latency. It has a 1 million token context window and supports multimodal input including audio, images, video, and text. It outperforms 1.5 Flash on the majority of benchmarks.

Model Timeline

Launch Date

2/25/2025

Capabilities

Text

Input Pricing

$0.07/ MTok

Context: 1,048,576 tokens

Output Pricing

$0.30/ MTok

Max tokens: 1,048,576

Vision Capabilities

Max images per prompt: 1

Image

Input Pricing

1290 tokens/image

Video

Input Pricing

$3.00/video

Text-to-Speech

Text-to-Speech Pricing

$0.02/1k characters

Embeddings

Embeddings Pricing

$0.0001/1k tokens

Fine-Tuning

Fine-Tuning Pricing

$1.00/MTok training

Storage:

$0.02/month

Additional Model Information

Tool Use

Yes

Structured Output

Yes

Reasoning

Yes

Flatten your repo for AI in seconds

Flatten repos. Prompt faster. One click → one GPT-ready file

Free Online & Desktop