GPT-OSS-120B

Cost-effective

Best value for production workloads

Pricing

Input: Self-hosted (per 1M tokens)

Output: Self-hosted (per 1M tokens)

Cached: N/A (per 1M tokens)

Note: Released August 2025. OpenAI's first open-weight language model since GPT-2.

Context & Output

Context Window: 128K tokens

Max Output: 32K tokens
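
The two limits above interact when sizing a request. The following is a minimal sketch of a token budget check, not a provider API. It assumes that "128K" and "32K" mean 128,000 and 32,000 tokens (some deployments use 131,072 and 32,768) and that output tokens count against the context window. The names `max_prompt_tokens` and `fits` are hypothetical.

```python
# Assumption: limits read as decimal thousands; adjust if your deployment
# uses 131_072 / 32_768 instead.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 32_000


def max_prompt_tokens(requested_output: int,
                      context_window: int = CONTEXT_WINDOW,
                      max_output: int = MAX_OUTPUT) -> int:
    """Return how many prompt tokens remain after reserving room for output.

    Assumes prompt and output share one context window.
    """
    if requested_output < 0:
        raise ValueError("requested_output must be non-negative")
    # The model cannot emit more than max_output, so cap the reservation.
    reserved = min(requested_output, max_output)
    return context_window - reserved


def fits(prompt_tokens: int, requested_output: int) -> bool:
    """Check whether a prompt of the given size leaves room for the output."""
    return prompt_tokens <= max_prompt_tokens(requested_output)


if __name__ == "__main__":
    print(max_prompt_tokens(32_000))  # prompt budget with full output reserved
    print(fits(100_000, 32_000))      # a 100,000-token prompt with full output
```

Under these assumptions, reserving the full output allowance leaves 96,000 tokens for the prompt. Token counts themselves depend on the tokenizer used by the serving stack.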

Latency

Fast

Capabilities

Multimodal

Streaming

Function Calling

Prompt Caching

Key Strengths

What makes this model stand out

Near parity with OpenAI o4-mini on core reasoning benchmarks

117B total parameters (5.1B active per token)

Open weights
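
To put the parameter figures listed above in perspective, the share of weights active per token can be worked out directly. This is a small arithmetic sketch using the rounded counts from this page; the function name `active_fraction` is illustrative only.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Return the fraction of parameters used per token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


if __name__ == "__main__":
    # Rounded figures from the listing: 117B total, 5.1B active.
    frac = active_fraction(117e9, 5.1e9)
    print(f"{frac:.1%}")  # prints "4.4%"
```

Per-token compute scales roughly with the active count, while the memory needed to hold the weights scales with the total count, so self-hosting still has to accommodate all 117B parameters.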

Similar Models in Cost-effective Tier

Other models with similar pricing and performance characteristics

Amazon Nova 2 Lite

Amazon

Input: TBD per 1M tokens

Context: 1M tokens

Amazon Nova 2 Sonic

Amazon

Input: TBD per 1M tokens

Context: 1M tokens

DeepSeek V3.1

DeepSeek

Input: $0.56 per 1M tokens

Context: 128K tokens