Llama 3.1 405B

Ultra-cheap

Maximum cost efficiency for high-volume tasks

Pricing

Input: $0.30 per 1M tokens

Output: $0.50 per 1M tokens

Cached: N/A
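To make the per-token rates concrete, here is a minimal sketch of a cost estimate using the input and output prices listed above. The function name `estimate_cost` and the constants are illustrative only and not part of any provider SDK; actual billing may round or meter tokens differently.

```python
from decimal import Decimal

# Prices from the listing above, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.30")
OUTPUT_PRICE_PER_M = Decimal("0.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts.

    Hypothetical helper: assumes simple linear pricing with no cached-token
    discount (the listing shows "N/A" for cached pricing).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 10,000 requests, each with 2,000 input tokens and 500 output tokens.
    requests = 10_000
    print(estimate_cost(2_000 * requests, 500 * requests))
```

Under these assumptions, a workload of 10,000 requests at 2,000 input and 500 output tokens each comes to 20M input tokens ($6.00) plus 5M output tokens ($2.50), or $8.50 in total.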

Note: Released July 2024. Described at release as the world's largest openly available model.

Context & Output

Context Window: 128K tokens

Max Output: 32K tokens
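The two limits above interact: a long prompt can leave less room for output than the stated maximum. The sketch below shows one way to compute the usable output budget. It rests on two assumptions that the listing does not confirm: that "128K" and "32K" mean 128,000 and 32,000 tokens (some providers mean 131,072 and 32,768), and that generated tokens count against the same context window as the prompt. The name `max_output_budget` is hypothetical.

```python
# Assumed numeric values for the limits in the listing; adjust if the
# provider defines "K" as 1,024 tokens.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 32_000


def max_output_budget(prompt_tokens: int) -> int:
    """Return how many output tokens can be requested for a given prompt size.

    Assumes prompt and output share one context window, so the budget is the
    smaller of the output cap and the space left after the prompt.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (1_000, 100_000, 130_000):
        print(prompt, max_output_budget(prompt))
```

With these assumed values, a 100,000-token prompt leaves room for 28,000 output tokens rather than the full 32,000, and a prompt larger than the window leaves none.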

Latency

Fast

Capabilities

Multimodal

Streaming

Function Calling

Prompt Caching

Key Strengths

What makes this model stand out

Largest open model

Multilingual

128K context

Similar Models in Ultra-cheap Tier

Other models with similar pricing and performance characteristics

Llama 4 Scout