c4ai-aya-vision-32b
## C4AI Aya Vision 32B

A multimodal vision model based on the Aya research model and part of the Aya Expanse model family. It handles text, image, video, audio, transcription, and text-to-speech inputs and outputs, supports a 16,384-token context window, and can be fine-tuned for custom applications.
$-/ KTok
$-/ KTok
$0.0001/image
$0.0001/1k tokens
$3.00/MTok training
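
The listed rates can be turned into a rough cost estimate. The sketch below is illustrative only: the listing does not label what each rate bills for, so it assumes the per-image rate applies to each image sent, the per-1k-token rate applies to all inference tokens, and the per-MTok rate applies to fine-tuning tokens. The function and constant names are hypothetical and not part of any provider API.

```python
from decimal import Decimal

# Rates copied from the listing above. What each rate bills for is an
# assumption; the listing itself does not label them.
PRICE_PER_IMAGE = Decimal("0.0001")             # USD per image
PRICE_PER_1K_TOKENS = Decimal("0.0001")         # USD per 1,000 tokens
PRICE_PER_M_TRAINING_TOKENS = Decimal("3.00")   # USD per 1,000,000 training tokens


def estimate_inference_cost(images: int, tokens: int) -> Decimal:
    """Estimate the USD cost of one request: image charge plus token charge."""
    image_cost = PRICE_PER_IMAGE * images
    token_cost = PRICE_PER_1K_TOKENS * tokens / 1000
    return image_cost + token_cost


def estimate_training_cost(training_tokens: int) -> Decimal:
    """Estimate the USD cost of fine-tuning on the given number of tokens."""
    return PRICE_PER_M_TRAINING_TOKENS * training_tokens / 1_000_000


if __name__ == "__main__":
    # Ten images plus a full 16,384-token context window.
    print(estimate_inference_cost(images=10, tokens=16_384))
    # A fine-tuning run over two million tokens.
    print(estimate_training_cost(training_tokens=2_000_000))
```

`Decimal` is used instead of `float` so that very small per-unit rates add up without binary rounding noise.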