ModelsLlama 3.3 70B

Llama 3.3 70B

by Meta

Cost-effective

Best value for production workloads

Pricing
Input
$0.35
per 1M tokens
Output
$0.55
per 1M tokens
Cached
N/A
per 1M tokens
Note: Released Dec 2024. 405B performance at fraction of cost
Context & Output
Context Window128K tokens
Max Output32K tokens
Latency
Fast
Capabilities
Multimodal
Streaming
Function Calling
Prompt Caching
Key Strengths
What makes this model stand out
Similar to 3.1 405B
Open weights
Cost-efficient
Similar Models in Cost-effective Tier
Other models with similar pricing and performance characteristics
DeepSeek V3.1
DeepSeek
Input:$0.56/M
Context:128K tokens
View Details
GPT-5 Mini
OpenAI
Input:$0.25/M
Context:272K tokens
View Details
DeepSeek R1
DeepSeek
Input:$0.55/M
Context:128K tokens
View Details