Phi-4

by Microsoft

Local/Edge

Pricing (per 1M tokens)
Input: Self-hosted
Output: Self-hosted
Cached: N/A
Note: Latest Phi release, optimized for edge deployment
Context & Output
Context Window: 16K tokens
Max Output: 4K tokens
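
The two limits above imply a simple prompt budget. The sketch below is illustrative only and rests on two assumptions that the card does not state: that "16K" and "4K" mean 16 * 1024 and 4 * 1024 tokens, and that generated tokens count against the same context window as the prompt. The function names are hypothetical and do not come from any Phi-4 tooling.

```python
# Limits read from the card. Interpreting "K" as 1024 is an assumption.
CONTEXT_WINDOW = 16 * 1024  # total tokens the model can attend to
MAX_OUTPUT = 4 * 1024       # largest reply the card lists


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return the prompt budget left after reserving room for the reply.

    Assumes output tokens share the context window with the prompt.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt of the given token count leaves room for the reply."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)
```

Under these assumptions, reserving the full 4K reply leaves 12,288 tokens for the prompt. Token counts should come from the tokenizer of the runtime actually used, since counts differ between tokenizers.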
Latency
Very Fast
Capabilities
Multimodal
Streaming
Function Calling
Prompt Caching
Key Strengths
What makes this model stand out
SOTA small model
Reasoning
Runs on a laptop
Similar Models in Local/Edge Tier
Other models with similar pricing and performance characteristics
IBM Granite 4.0 Nano 1.5B
IBM
Input: Self-hosted (per 1M tokens)
Context: 128K tokens

IBM Granite 4.0 Nano 350M
IBM
Input: Self-hosted (per 1M tokens)
Context: 128K tokens

Phi-4-multimodal
Microsoft
Input: Self-hosted (per 1M tokens)
Context: 16K tokens