🚀
Production AI
Ship AI features that scale
38 resources
10-15 hours
5 beginner · 29 intermediate · 4 advanced
🎯 Start Here
Perfect for beginners or those new to this topic
interactive
beginner
Prompt Engineering Interactive Tutorial
Anthropic
2-3 hours
Hands-on Jupyter notebooks teaching production-grade prompting with Claude. Nine chapters covering prompt structure, common pitfalls, Claude's strengths/limitations, and crafting effective prompts. Interactive exercises throughout.
Updated: 2024
guide
beginner
Prompt Engineering Guide
Anthropic
30 min read
Official documentation on prompt engineering best practices for Claude 4.x models. Covers clear instructions, providing context, using examples, chain-of-thought reasoning, and Claude 4-specific techniques.
guide
beginner
Prompt Injection Explained
Simon Willison
20 min read
Clear explanation of prompt injection attacks and how they work. Real examples and implications for AI security.
guide
beginner
Best LLM Evaluation Platforms 2025
Braintrust
30 minutes
Comparison of LLM evaluation platforms including Braintrust, LangSmith, Weights & Biases, Langfuse, and LangWatch. Feature comparison, integration ecosystems, and use case recommendations. Understand the evaluation tooling landscape.
Updated: 2025
course
beginner
Complete Beginner's Course on AI Evaluations
Aman Khan (Arize AI)
2-3 hours
Step-by-step course teaching AI evaluations from scratch. Covers why evals matter, how to build your first eval, evaluation methods, and production patterns. Designed for product managers and builders new to AI evaluation.
Updated: 2025
🚀 Core Concepts
Dive deeper into the fundamentals and best practices
guide
intermediate
Building Effective Agents
Anthropic
30 min read
Comprehensive guide from Anthropic's experience working with dozens of teams building LLM agents. Covers workflows vs agents distinction, practical patterns, tool design, and when to use agentic systems. Includes real-world examples and code.
Updated: December 2024
guide
intermediate
Building Agents with Claude Agent SDK
Anthropic
45 min read
Guide to building agents with the Claude Agent SDK (formerly Claude Code SDK). Covers tool design, bash access, file operations, and building various agent types: finance agents, personal assistants, customer support, research agents.
tutorial
intermediate
AutoGen Multi-Agent Framework Tutorial
Microsoft
2-3 hours
Official tutorial for building multi-agent AI applications with AutoGen. Learn how to create conversational agents, build agent teams, and orchestrate complex workflows. Covers AutoGen Studio for no-code prototyping, AgentChat for conversations, and Core API for event-driven systems. Includes practical examples and working code.
Updated: 2025
guide
intermediate
Amazon Bedrock Agents
AWS
1-2 hours
Build, deploy, and scale AI agents on AWS with Amazon Bedrock. Orchestrate multi-step tasks, connect to enterprise data sources, and invoke APIs. Supports Claude, Llama, and other foundation models.
Updated: 2025
tutorial
intermediate
Amazon Bedrock Agents Workshop
AWS
2-3 hours
Hands-on workshop for building production AI agents with Amazon Bedrock. Learn to orchestrate multi-step tasks, connect to enterprise data, and invoke APIs. Includes code examples for creating agents with Claude, implementing RAG, and deploying to production with built-in observability and guardrails.
Updated: 2025
guide
intermediate
Google Vertex AI Agent Builder
Google Cloud
2-3 hours
Comprehensive platform to build, scale, and govern reliable agents on Google Cloud. Python Agent Development Kit (ADK) downloaded over 7 million times. New observability dashboards and faster build-and-deploy tools.
Updated: 2025
course
intermediate
Introduction to Model Context Protocol (MCP)
Anthropic
2-3 hours
Official Anthropic course teaching how to build MCP servers and clients from scratch using Python. Learn MCP's three core primitives—tools, resources, and prompts—to connect Claude with external services. Think of MCP like USB-C for AI applications.
Updated: 2025
guide
intermediate
Model Context Protocol Documentation
Anthropic
1 hour
Official MCP documentation covering server and client implementation. Use the MCP connector in the Messages API, add MCP servers to Claude Code or Claude Desktop, and enable MCP for your team. Includes pre-built servers for Google Drive, Slack, GitHub, Git, Postgres, and Puppeteer.
Updated: 2025
tutorial
intermediate
Microsoft Copilot Studio ❤️ MCP
Microsoft
1-2 hours
Lab teaching how to deploy an MCP server and integrate it with Microsoft Copilot Studio. Deploy locally via VS Code with dev tunnels or to Azure Container Apps. Includes example 'Jokes MCP Server' template. Full infrastructure-as-code with Azure Developer CLI.
Updated: 2025
guide
intermediate
Beyond 'Vibe-Coding': Scaling AI with GitHub Spec Kit
Mark Beacom
25 minutes
Critical analysis of spec-driven development vs 'vibe-coding' (prompt-and-pray). Explains why specifications are essential for scaling AI-powered development in enterprise teams. Real-world examples and architectural insights.
Updated: 2025
guide
intermediate
Cost Optimization Guide
Anthropic
20 min read
Official strategies for optimizing LLM costs. Covers prompt caching (up to 90% savings on repeated context), model selection, prompt design, and batch processing.
guide
intermediate
Prompt Caching Explained
Anthropic
15 min read
Deep dive into prompt caching: how it works, when to use it, and implementation patterns. Can cut costs by up to 90% for repeated context.
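To see where a figure close to 90% can come from, here is a small arithmetic sketch. The numbers are assumptions for illustration, not quoted from the guide: a base input price of $3 per million tokens, cache writes billed at 1.25x the base price, cache reads at 0.1x, and a cache that stays warm across all calls. Check current pricing before relying on any of them.

```python
def input_cost(prefix_tokens, calls, base_price_per_mtok=3.0,
               write_mult=1.25, read_mult=0.10, cached=True):
    """Dollar cost of sending the same prompt prefix `calls` times.

    All prices and multipliers are illustrative assumptions.
    """
    per_token = base_price_per_mtok / 1_000_000
    if not cached:
        return prefix_tokens * calls * per_token
    # The first call writes the cache; every later call reads from it.
    write = prefix_tokens * write_mult * per_token
    reads = prefix_tokens * read_mult * per_token * (calls - 1)
    return write + reads


def savings(prefix_tokens, calls, **prices):
    """Fraction of input cost saved by caching (negative means it costs more)."""
    base = input_cost(prefix_tokens, calls, cached=False, **prices)
    with_cache = input_cost(prefix_tokens, calls, cached=True, **prices)
    return 1 - with_cache / base


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(savings(10_000, n), 4))
```

Under these assumptions a single call is more expensive with caching (the write premium is never recovered), and the savings approach 90% only as the number of reuses grows.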
guide
intermediate
Reducing AI Latency
Vercel
30 min read
Strategies for faster AI responses: streaming, parallel calls, edge deployment. Practical patterns for production applications.
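As a rough illustration of the "parallel calls" strategy, the sketch below issues several independent requests concurrently with `asyncio.gather`. The `fake_model_call` coroutine is a hypothetical stand-in that sleeps instead of calling a real API, so the example stays self-contained.

```python
import asyncio
import time


async def fake_model_call(prompt: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for a network call to a model."""
    await asyncio.sleep(delay)
    return f"answer to: {prompt}"


async def run_parallel(prompts):
    # gather runs the calls concurrently and returns results in input order.
    return await asyncio.gather(*(fake_model_call(p) for p in prompts))


def timed_run(prompts):
    """Run all prompts concurrently and report wall-clock time in seconds."""
    start = time.perf_counter()
    results = asyncio.run(run_parallel(prompts))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run(["a", "b", "c"])
    print(results, f"{elapsed:.2f}s")
```

With three calls of 0.2 seconds each, total time stays close to one call rather than the sum of all three. This only helps when the calls do not depend on each other's output.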
guide
intermediate
Error Handling Guide
Anthropic
25 min read
Best practices for handling AI failures gracefully. Fallbacks, retries, error types, and user-facing error messages.
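One common way to combine retries and fallbacks is sketched below. It is a minimal illustration, not the guide's own code: `TransientError`, `call_with_retries`, and the `primary`/`fallback` callables are hypothetical names, and a real application would map its client library's timeout and rate-limit exceptions onto the retryable case.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (timeouts, rate limits)."""


def call_with_retries(primary, fallback, max_attempts=3,
                      base_delay=0.5, sleep=time.sleep):
    """Try `primary` up to `max_attempts` times, then use `fallback`.

    Only TransientError is retried; any other exception propagates.
    """
    for attempt in range(max_attempts):
        try:
            return primary()
        except TransientError:
            if attempt == max_attempts - 1:
                break  # out of attempts, fall through to the fallback
            # Exponential backoff with jitter so clients do not retry in lockstep.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    return fallback()
```

The fallback might be a cheaper model, a cached answer, or a plain user-facing message; which one is appropriate depends on the feature.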
guide
intermediate
Observability for LLM Applications
Langfuse
30 min read
What to monitor in production LLM apps. Beyond API errors: latency, costs, quality, and user satisfaction. From an open-source monitoring platform.
guide
intermediate
LLM Security Guide (OWASP Top 10)
OWASP
45 min read
Top 10 security risks for LLM applications. Prompt injection, data poisoning, model theft, and more. Must-read for production deployments.
course
intermediate
Prompt Evaluations Course
Anthropic
3-4 hours
Comprehensive course on building AI evaluations. Nine lessons teaching code-graded, human-graded, and LLM-graded evaluation methods. Learn to implement evals successfully in your workflows with the Anthropic API. Build the skill that OpenAI and Anthropic CPOs call 'the most important for 2025.'
Updated: 2025
tutorial
intermediate
OpenAI Evals Framework
OpenAI
2-3 hours
Open-source framework for evaluating LLMs and LLM systems. Battle-tested eval templates including Match, Includes, FuzzyMatch, and model-graded evals. Learn OpenAI's recommended 'chain-of-thought then classify' approach for more accurate LLM judges. Complete registry of standardized benchmarks.
Updated: 2025
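To give a feel for what code-graded checks like Match, Includes, and FuzzyMatch do, here is a simplified sketch. These are hypothetical re-implementations written for illustration; the framework's own classes have a different interface and may normalize text differently.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def match(completion, ideals):
    """True if the completion starts with any ideal answer."""
    return any(completion.strip().startswith(i) for i in ideals)


def includes(completion, ideals):
    """True if any ideal answer appears anywhere in the completion."""
    return any(i in completion for i in ideals)


def fuzzy_match(completion, ideals):
    """True if completion and an ideal contain one another after normalization."""
    c = normalize(completion)
    if not c:
        return False  # an empty completion should never count as a match
    return any(normalize(i) in c or c in normalize(i) for i in ideals)


def accuracy(grader, samples):
    """Share of (completion, ideals) pairs that the grader accepts."""
    return sum(grader(c, ideals) for c, ideals in samples) / len(samples)
```

Stricter graders such as `match` suit tasks with a fixed answer format, while looser ones such as `fuzzy_match` tolerate extra wording at the cost of more false positives.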
guide
intermediate
Your AI Product Needs Evals
Hamel Husain
45 minutes
Influential guide from former GitHub ML engineer on why evals are essential. Hamel advocates spending 60-80% of development time on evals and error analysis. Learn how to move beyond demos to production-grade AI products through systematic evaluation.
Updated: 2025
guide
intermediate
LLM Evaluation Metrics: Complete Guide
Braintrust
40 minutes
Comprehensive guide covering all LLM evaluation metrics: accuracy (exact match, semantic similarity), quality (relevance, coherence, factuality), and production metrics (latency, cost, success rate). Learn which metrics matter for different use cases.
Updated: 2025
guide
intermediate
LLM Evaluation: Metrics, Frameworks, and Best Practices
Weights & Biases
1 hour
Comprehensive report on LLM evaluation covering functional evaluation (accuracy, hallucination rates), quality metrics, and evaluation-driven development. Learn how W&B upgraded Wandbot to v1.1 using systematic evaluation.
Updated: 2025
tutorial
intermediate
Vertex AI: Getting Started
Google Cloud
1-2 hours
Comprehensive introduction to Vertex AI, Google's enterprise ML platform. Learn to deploy models, build AI applications, and scale ML workflows. Covers Gemini integration, model deployment, and production best practices.
Updated: 2025
tutorial
intermediate
Azure AI Foundry Quickstart
Microsoft
45 minutes
Official quickstart for Azure AI Foundry. Learn to build and deploy multi-agent systems at enterprise scale. Covers Agent Service, model catalog, observability, and Entra Agent ID for security.
Updated: 2025
guide
intermediate
Microsoft 365 Agents: Developer Guide
Microsoft
1 hour read
Comprehensive guide to building agents for Microsoft 365. Learn multi-agent orchestration, declarative agents, and integration with Copilot. Covers architecture patterns and best practices for enterprise agents.
Updated: 2025
tutorial
intermediate
Amazon Bedrock: Developer Guide
AWS
1-2 hours
Comprehensive guide to building with Amazon Bedrock. Access Claude, Llama, Mistral, and Amazon Nova via unified API. Learn to build agents with AgentCore, implement RAG with Knowledge Bases, and deploy at scale.
Updated: 2025
tutorial
intermediate
Amazon SageMaker: Getting Started
AWS
2-3 hours
Complete introduction to Amazon SageMaker for building, training, and deploying ML models. Learn the full ML workflow from data preparation to model deployment. Covers SageMaker Studio, training jobs, and endpoints.
Updated: 2025
guide
intermediate
Prompt Engineering for Business Performance
Anthropic
30 min read
How a Fortune 500 company used prompt engineering to improve Claude accuracy by 20%. Real case study with techniques: scratchpads, few-shot examples, and subject matter expert collaboration.
Updated: February 2024
guide
intermediate
LLM Patterns: Metrics That Matter
Eugene Yan
45 min read
Practical patterns for building with LLMs including evaluation, metrics, and production considerations. From an AI/ML practitioner at Amazon.
course
intermediate
Building Systems with the ChatGPT API
DeepLearning.AI (Andrew Ng & OpenAI)
1 hour
Learn to build multi-step systems using LLMs. Covers breaking complex tasks into subtasks, chaining prompts, evaluating outputs, and building end-to-end customer service assistants. Hands-on Python examples.
Updated: 2024
🔥 Advanced Topics
For experienced practitioners looking to go deeper
guide
advanced
Code Execution with MCP: Building Efficient AI Agents
Anthropic Engineering
45 minutes
Deep dive into using MCP for code execution in AI agents. Learn how MCP enables secure, sandboxed code execution while maintaining agent efficiency. Engineering insights from the Anthropic team.
Updated: 2025
guide
advanced
Advanced RAG Techniques
LlamaIndex
1 hour read
Beyond basic RAG: chunking strategies, reranking, query transformation, hybrid search, and more. Cheat sheet and recipes for powerful RAG systems.
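Hybrid search needs some way to merge a keyword ranking with a vector ranking. One widely used option is reciprocal rank fusion (RRF), sketched below. This illustrates the general technique and is not taken from the linked resource; the function name and the constant `k=60` are conventional choices rather than requirements.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) per list it appears in; higher
    totals rank first. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # e.g. from a keyword (BM25-style) index
    vector_hits = ["b", "c", "a"]   # e.g. from an embedding index
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF uses only ranks, so it avoids having to put keyword scores and vector similarities on a common scale.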
guide
advanced
Using LLM-as-a-Judge: Complete Guide
Hamel Husain
1 hour
Comprehensive guide to using LLMs to evaluate other LLMs. When to use LLM judges, how to validate them against human judgments, best practices for rubrics and prompts. Practical patterns for scaling subjective evaluations.
Updated: 2025
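Validating an LLM judge against human judgments usually starts with measuring how often the two agree on the same examples. The sketch below computes raw agreement and Cohen's kappa, which corrects for agreement expected by chance. It is an illustrative assumption about how such a check might look, not the guide's own procedure, and the labels are made up.

```python
from collections import Counter


def agreement(human, judge):
    """Fraction of examples where the judge label equals the human label."""
    return sum(h == j for h, j in zip(human, judge)) / len(human)


def cohens_kappa(human, judge):
    """Chance-corrected agreement between two label sequences of equal length."""
    n = len(human)
    p_observed = agreement(human, judge)
    h_counts, j_counts = Counter(human), Counter(judge)
    labels = set(human) | set(judge)
    # Probability that both raters pick the same label independently.
    p_expected = sum(h_counts[l] * j_counts[l] for l in labels) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    human = ["pass", "pass", "fail", "fail"]
    judge = ["pass", "pass", "fail", "pass"]
    print(agreement(human, judge), cohens_kappa(human, judge))
```

A judge can show high raw agreement simply because most examples share one label, which is why a chance-corrected measure is worth reporting alongside it.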
course
advanced
Machine Learning Systems Design (CS329S)
Stanford University (Chip Huyen)
10 weeks
Stanford course on designing and deploying ML systems in production. Covers data engineering, model development, deployment, monitoring, and maintenance. Includes case studies from industry. Perfect for understanding the full ML lifecycle.
Updated: 2024
What to Learn Next
Continue your AI learning journey with these related paths