Llama 3.1 70B
by Meta
Tier: Ultra-cheap
Maximum cost efficiency for high-volume tasks
Pricing
Input: $0.08 per 1M tokens
Output: $0.15 per 1M tokens
Cached input: N/A
Note: Released in July 2024; the most popular open model.
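To make the per-token rates concrete, here is a minimal cost-estimation sketch using the input and output prices listed above. The token counts in the example are hypothetical; real counts come from the provider's usage reporting.

```python
# Cost estimate for one request at the listed Llama 3.1 70B rates.
# Token counts below are hypothetical examples.

INPUT_PRICE_PER_1M = 0.08   # USD per 1M input tokens
OUTPUT_PRICE_PER_1M = 0.15  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.6f}")   # $0.000235
# At 1M such requests, that works out to about $235.
```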
Context & Output
Context window: 128K tokens
Max output: 32K tokens
Latency: Very Fast
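A quick sketch of checking whether a prompt plus its requested completion fits within the 128K context window and 32K output cap. The 4-characters-per-token heuristic is a rough assumption; a real tokenizer would be used in practice.

```python
# Rough fit check against the listed 128K context window and 32K max output.
# The ~4 chars/token heuristic is an assumption standing in for a real tokenizer.

CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 32_000

def fits(prompt: str, requested_output_tokens: int) -> bool:
    est_prompt_tokens = len(prompt) // 4          # crude estimate
    if requested_output_tokens > MAX_OUTPUT:
        return False
    # Prompt and completion typically share the context window (provider-dependent).
    return est_prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW

print(fits("hello " * 10_000, 4_096))  # True: ~15K estimated prompt tokens + 4K output
```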
Capabilities
Multimodal
Streaming
Function Calling
Prompt Caching
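Many hosts serve Llama 3.1 70B behind an OpenAI-compatible API; the sketch below shows streaming under that assumption. The base URL, model ID, and environment variable names are placeholders, not official values, and support for each capability above depends on the provider.

```python
# Streaming sketch against an OpenAI-compatible endpoint (assumed; many Llama hosts
# expose one). Base URL, model ID, and env var names are placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["LLAMA_BASE_URL"],    # your provider's endpoint
    api_key=os.environ["LLAMA_API_KEY"],
)

stream = client.chat.completions.create(
    model="llama-3.1-70b-instruct",           # placeholder model ID; check your provider
    messages=[{"role": "user", "content": "Summarize the Llama 3.1 release in one line."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```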
Key Strengths
What makes this model stand out
Popular for production
Open weights
Strong reasoning
Similar Models in Ultra-cheap Tier
Other models with similar pricing and performance characteristics
Llama 4 Scout (Meta): Input $0.11/M tokens, Context 10M tokens
GPT-5 Nano (OpenAI): Input $0.05/M tokens, Context 272K tokens
Llama 3.1 405B (Meta): Input $0.30/M tokens, Context 128K tokens
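To compare the tier on input cost alone, a small sketch pricing the same hypothetical workload against each model's listed input rate. Only input pricing is compared because output rates for the similar models are not listed above; the workload size is an assumption.

```python
# Input-cost comparison for the models listed above, at their per-1M-token input rates.
# Workload size is hypothetical; output pricing is omitted because it is not listed here.

INPUT_RATES = {            # USD per 1M input tokens
    "Llama 3.1 70B": 0.08,
    "Llama 4 Scout": 0.11,
    "GPT-5 Nano": 0.05,
    "Llama 3.1 405B": 0.30,
}

workload_tokens = 50_000_000   # e.g. 50M input tokens per month (assumption)

for model, rate in sorted(INPUT_RATES.items(), key=lambda kv: kv[1]):
    cost = workload_tokens / 1_000_000 * rate
    print(f"{model:<16} ${cost:,.2f}")
# GPT-5 Nano $2.50, Llama 3.1 70B $4.00, Llama 4 Scout $5.50, Llama 3.1 405B $15.00
```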