🚀

Production AI

Ship AI features that scale

38 resources

10-15 hours

5 beginner · 29 intermediate · 4 advanced


🎯 Start Here

Perfect for beginners or those new to this topic

interactive

beginner

Prompt Engineering Interactive Tutorial

Anthropic

2-3 hours

Hands-on Jupyter notebooks teaching production-grade prompting with Claude. Nine chapters covering prompt structure, common pitfalls, Claude's strengths/limitations, and crafting effective prompts. Interactive exercises throughout.

Updated: 2024


guide

beginner

Prompt Engineering Guide

Anthropic

30 min read

Official documentation on prompt engineering best practices for Claude 4.x models. Covers clear instructions, providing context, using examples, chain-of-thought reasoning, and Claude 4-specific techniques.


guide

beginner

Prompt Injection Explained

Simon Willison

20 min read

Clear explanation of prompt injection attacks and how they work. Real examples and implications for AI security.


guide

beginner

Best LLM Evaluation Platforms 2025

Braintrust

30 minutes

Comparison of LLM evaluation platforms including Braintrust, LangSmith, Weights & Biases, Langfuse, and LangWatch. Feature comparison, integration ecosystems, and use case recommendations. Understand the evaluation tooling landscape.

Updated: 2025


course

beginner

Complete Beginner's Course on AI Evaluations

Aman Khan (Arize AI)

2-3 hours

Step-by-step course teaching AI evaluations from scratch. Covers why evals matter, how to build your first eval, evaluation methods, and production patterns. Designed for product managers and builders new to AI evaluation.

Updated: 2025


🚀 Core Concepts

Dive deeper into the fundamentals and best practices

guide

intermediate

Building Effective Agents

Anthropic

30 min read

Comprehensive guide drawing on Anthropic's experience working with dozens of teams building LLM agents. Covers the distinction between workflows and agents, practical patterns, tool design, and when to use agentic systems. Includes real-world examples and code.

Updated: December 2024


guide

intermediate

Building Agents with Claude Agent SDK

Anthropic

45 min read

Guide to building agents with the Claude Agent SDK (formerly Claude Code SDK). Covers tool design, bash access, file operations, and building various agent types: finance agents, personal assistants, customer support, research agents.


tutorial

intermediate

AutoGen Multi-Agent Framework Tutorial

Microsoft

2-3 hours

Official tutorial for building multi-agent AI applications with AutoGen. Learn how to create conversational agents, build agent teams, and orchestrate complex workflows. Covers AutoGen Studio for no-code prototyping, AgentChat for conversations, and Core API for event-driven systems. Includes practical examples and working code.

Updated: 2025


guide

intermediate

Amazon Bedrock Agents

AWS

1-2 hours

Build, deploy, and scale AI agents on AWS with Amazon Bedrock. Orchestrate multi-step tasks, connect to enterprise data sources, and invoke APIs. Supports Claude, Llama, and other foundation models.

Updated: 2025


tutorial

intermediate

Amazon Bedrock Agents Workshop

AWS

2-3 hours

Hands-on workshop for building production AI agents with Amazon Bedrock. Learn to orchestrate multi-step tasks, connect to enterprise data, and invoke APIs. Includes code examples for creating agents with Claude, implementing RAG, and deploying to production with built-in observability and guardrails.

Updated: 2025


guide

intermediate

Google Vertex AI Agent Builder

Google Cloud

2-3 hours

Comprehensive platform to build, scale, and govern reliable agents on Google Cloud. Python Agent Development Kit (ADK) downloaded over 7 million times. New observability dashboards and faster build-and-deploy tools.

Updated: 2025


course

intermediate

Introduction to Model Context Protocol (MCP)

Anthropic

2-3 hours

Official Anthropic course teaching how to build MCP servers and clients from scratch using Python. Learn MCP's three core primitives—tools, resources, and prompts—to connect Claude with external services. Think of MCP like USB-C for AI applications.

Updated: 2025


guide

intermediate

Model Context Protocol Documentation

Anthropic

1 hour

Official MCP documentation covering server and client implementation. Use the MCP connector in the Messages API, add MCP servers to Claude Code or Claude Desktop, and enable MCP for your team. Includes pre-built servers for Google Drive, Slack, GitHub, Git, Postgres, and Puppeteer.

Updated: 2025


tutorial

intermediate

Microsoft Copilot Studio ❤️ MCP

Microsoft

1-2 hours

Lab teaching how to deploy an MCP server and integrate it with Microsoft Copilot Studio. Deploy locally via VS Code with dev tunnels or to Azure Container Apps. Includes example 'Jokes MCP Server' template. Full infrastructure-as-code with Azure Developer CLI.

Updated: 2025


guide

intermediate

Beyond 'Vibe-Coding': Scaling AI with GitHub Spec Kit

Mark Beacom

25 minutes

Critical analysis of spec-driven development vs 'vibe-coding' (prompt-and-pray). Explains why specifications are essential for scaling AI-powered development in enterprise teams. Real-world examples and architectural insights.

Updated: 2025


guide

intermediate

Cost Optimization Guide

Anthropic

20 min read

Official strategies for optimizing LLM costs. Covers prompt caching (save up to 90% on repeated context), model selection, prompt design, and batch processing.


guide

intermediate

Prompt Caching Explained

Anthropic

15 min read

Deep dive on prompt caching: how it works, when to use it, and implementation patterns. Can save up to 90% on costs for repeated context.
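As a rough back-of-the-envelope check on the savings figure, the sketch below compares input-token cost with and without a cached prefix. The function name, the example price, and the multipliers (cache writes at 1.25x and cache reads at 0.10x the base input price) are illustrative assumptions rather than values taken from the guide, so check current pricing before relying on them. It also assumes the cache never expires between requests.

```python
def input_cost(prefix_tokens, suffix_tokens, requests, price_per_mtok,
               cached=False, write_multiplier=1.25, read_multiplier=0.10):
    """Estimate total input-token cost (dollars) for calls that share one prefix."""
    per_token = price_per_mtok / 1_000_000
    # The per-request suffix (the part that changes) is always billed at full price.
    suffix_cost = suffix_tokens * per_token * requests
    if not cached:
        return prefix_tokens * per_token * requests + suffix_cost
    # Assumption: the first request writes the cache, every later request reads it.
    write_cost = prefix_tokens * per_token * write_multiplier
    read_cost = prefix_tokens * per_token * read_multiplier * (requests - 1)
    return write_cost + read_cost + suffix_cost


# Hypothetical workload: a 10,000-token shared prefix, 200 fresh tokens per call.
baseline = input_cost(10_000, 200, 100, price_per_mtok=3.0)
with_cache = input_cost(10_000, 200, 100, price_per_mtok=3.0, cached=True)
savings = 1 - with_cache / baseline
print(f"baseline=${baseline:.4f} cached=${with_cache:.4f} savings={savings:.0%}")
```

Under these assumptions the overall saving comes out at roughly 87%, slightly below 90%, because the one-time cache write and the uncached suffix are still billed at or above the base rate.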


guide

intermediate

Reducing AI Latency

Vercel

30 min read

Strategies for faster AI responses: streaming, parallel calls, edge deployment. Practical patterns for production applications.
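To make the "parallel calls" idea concrete, here is a minimal sketch using Python's `asyncio`. The `fake_model_call` coroutine is a hypothetical stand-in that only sleeps; no real model API is called, and the helper names are not taken from the Vercel guide.

```python
import asyncio
import time


async def fake_model_call(prompt: str, delay: float = 0.1) -> str:
    """Stand-in for a network-bound model request (hypothetical)."""
    await asyncio.sleep(delay)
    return f"response to: {prompt}"


async def run_sequential(prompts):
    # Each call waits for the previous one to finish.
    return [await fake_model_call(p) for p in prompts]


async def run_parallel(prompts):
    # gather starts all calls together and returns results in input order.
    return await asyncio.gather(*(fake_model_call(p) for p in prompts))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


prompts = ["summarize", "classify", "extract"]
seq_results, seq_time = timed(run_sequential(prompts))
par_results, par_time = timed(run_parallel(prompts))
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

This only helps when the calls are independent of each other; a chain where one prompt needs the previous answer still has to run in sequence.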


guide

intermediate

Error Handling Guide

Anthropic

25 min read

Best practices for handling AI failures gracefully. Fallbacks, retries, error types, and user-facing error messages.
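The retry-then-fallback pattern mentioned above can be sketched as follows. This is a generic illustration, not code from the guide: `TransientError` and `call_with_retries` are hypothetical names, and which failures count as retryable (rate limits, timeouts, overload) depends on the client library in use.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (rate limit, timeout, overload)."""


def call_with_retries(primary, fallback=None, max_attempts=3,
                      base_delay=0.5, sleep=time.sleep):
    """Call `primary`, retrying transient failures with exponential backoff.

    Non-transient exceptions propagate immediately. If every attempt fails,
    `fallback` is called when provided; otherwise a RuntimeError is raised.
    """
    for attempt in range(max_attempts):
        try:
            return primary()
        except TransientError:
            if attempt == max_attempts - 1:
                break  # out of attempts, fall through to the fallback
            # Exponential backoff plus jitter so clients do not retry in lockstep.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    if fallback is not None:
        return fallback()
    raise RuntimeError("primary call failed after retries and no fallback was provided")
```

A fallback might be a cheaper model, a cached answer, or a plain user-facing message; the `sleep` parameter is injectable so the backoff can be skipped in tests.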


guide

intermediate

Observability for LLM Applications

Langfuse

30 min read

What to monitor in production LLM apps. Beyond API errors: latency, costs, quality, and user satisfaction. Open-source monitoring platform.


guide

intermediate

LLM Security Guide (OWASP Top 10)

OWASP

45 min read

Top 10 security risks for LLM applications. Prompt injection, data poisoning, model theft, and more. Must-read for production deployments.


course

intermediate

Prompt Evaluations Course

Anthropic

3-4 hours

Comprehensive course on building AI evaluations. Nine lessons teaching code-graded, human-graded, and LLM-graded evaluation methods. Learn to implement evals successfully in your workflows with the Anthropic API. Build the skill that OpenAI and Anthropic CPOs call 'the most important for 2025.'

Updated: 2025


tutorial

intermediate

OpenAI Evals Framework

OpenAI

2-3 hours

Open-source framework for evaluating LLMs and LLM systems. Battle-tested eval templates including Match, Includes, FuzzyMatch, and model-graded evals. Learn OpenAI's recommended 'chain-of-thought then classify' approach for more accurate LLM judges. Complete registry of standardized benchmarks.

Updated: 2025
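For a feel of what string-based eval templates do, the sketch below implements simplified graders in the spirit of Match, Includes, and FuzzyMatch. These are approximations written for illustration, not the framework's own code: the function names, the normalization rules, and the sample data are assumptions, and the real templates may behave differently in edge cases.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def match(completion: str, references: list[str]) -> bool:
    # Passes if the completion starts with any reference answer.
    return any(completion.strip().startswith(ref) for ref in references)


def includes(completion: str, references: list[str]) -> bool:
    # Passes if any reference answer appears anywhere in the completion.
    return any(ref in completion for ref in references)


def fuzzy_match(completion: str, references: list[str]) -> bool:
    # Passes if, after normalization, either string contains the other.
    c = normalize(completion)
    for ref in references:
        r = normalize(ref)
        if c and r and (r in c or c in r):
            return True
    return False


def accuracy(grader, samples) -> float:
    """Fraction of (completion, references) pairs the grader accepts."""
    return sum(grader(c, refs) for c, refs in samples) / len(samples)


# Hypothetical model outputs paired with their accepted answers.
samples = [
    ("Paris", ["Paris"]),
    ("The capital is Paris.", ["Paris"]),
    ("paris, france", ["Paris"]),
    ("Lyon", ["Paris"]),
]
for grader in (match, includes, fuzzy_match):
    print(grader.__name__, accuracy(grader, samples))
```

The same outputs score differently under each grader, which is why the choice of template matters as much as the dataset.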


guide

intermediate

Your AI Product Needs Evals

Hamel Husain

45 minutes

Influential guide from a former GitHub ML engineer on why evals are essential. Hamel advocates spending 60-80% of development time on evals and error analysis. Learn how to move beyond demos to production-grade AI products through systematic evaluation.

Updated: 2025


guide

intermediate

LLM Evaluation Metrics: Complete Guide

Braintrust

40 minutes

Comprehensive guide covering all LLM evaluation metrics: accuracy (exact match, semantic similarity), quality (relevance, coherence, factuality), and production metrics (latency, cost, success rate). Learn which metrics matter for different use cases.

Updated: 2025


guide

intermediate

LLM Evaluation: Metrics, Frameworks, and Best Practices

Weights & Biases

1 hour

Comprehensive report on LLM evaluation covering functional evaluation (accuracy, hallucination rates), quality metrics, and evaluation-driven development. Learn how W&B upgraded Wandbot to v1.1 using systematic evaluation.

Updated: 2025


tutorial

intermediate

Vertex AI: Getting Started

Google Cloud

1-2 hours

Comprehensive introduction to Vertex AI, Google's enterprise ML platform. Learn to deploy models, build AI applications, and scale ML workflows. Covers Gemini integration, model deployment, and production best practices.

Updated: 2025


tutorial

intermediate

Azure AI Foundry Quickstart

Microsoft

45 minutes

Official quickstart for Azure AI Foundry. Learn to build and deploy multi-agent systems at enterprise scale. Covers Agent Service, model catalog, observability, and Entra Agent ID for security.

Updated: 2025


guide

intermediate

Microsoft 365 Agents: Developer Guide

Microsoft

1 hour read

Comprehensive guide to building agents for Microsoft 365. Learn multi-agent orchestration, declarative agents, and integration with Copilot. Covers architecture patterns and best practices for enterprise agents.

Updated: 2025


tutorial

intermediate

Amazon Bedrock: Developer Guide

AWS

1-2 hours

Comprehensive guide to building with Amazon Bedrock. Access Claude, Llama, Mistral, and Amazon Nova via unified API. Learn to build agents with AgentCore, implement RAG with Knowledge Bases, and deploy at scale.

Updated: 2025


tutorial

intermediate

Amazon SageMaker: Getting Started

AWS

2-3 hours

Complete introduction to Amazon SageMaker for building, training, and deploying ML models. Learn the full ML workflow from data preparation to model deployment. Covers SageMaker Studio, training jobs, and endpoints.

Updated: 2025


guide

intermediate

Prompt Engineering for Business Performance

Anthropic

30 min read

How a Fortune 500 company used prompt engineering to improve Claude's accuracy by 20%. Real case study covering techniques such as scratchpads, few-shot examples, and collaboration with subject matter experts.

Updated: February 2024


guide

intermediate

LLM Patterns: Metrics That Matter

Eugene Yan

45 min read

Practical patterns for building with LLMs including evaluation, metrics, and production considerations. From an AI/ML practitioner at Amazon.


course

intermediate

Building Systems with the ChatGPT API

DeepLearning.AI (Andrew Ng & OpenAI)

1 hour

Learn to build multi-step systems using LLMs. Covers breaking complex tasks into subtasks, chaining prompts, evaluating outputs, and building end-to-end customer service assistants. Hands-on Python examples.

Updated: 2024


🔥 Advanced Topics

For experienced practitioners looking to go deeper

guide

advanced

Code Execution with MCP: Building Efficient AI Agents

Anthropic Engineering

45 minutes

Deep dive into using MCP for code execution in AI agents. Learn how MCP enables secure, sandboxed code execution while maintaining agent efficiency. Engineering insights from the Anthropic team.

Updated: 2025


guide

advanced

Advanced RAG Techniques

LlamaIndex

1 hour read

Beyond basic RAG: chunking strategies, reranking, query transformation, hybrid search, and more. Cheat sheet and recipes for powerful RAG systems.
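Hybrid search needs some way to merge a keyword ranking with a vector ranking. One common choice is reciprocal rank fusion (RRF), sketched below. It is shown as an illustration of the idea, not as the method the LlamaIndex material prescribes; the function name, the document IDs, and the conventional constant `k=60` are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (BM25-style) retriever and a vector retriever.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that both retrievers return rise to the top, and because RRF uses only ranks it avoids having to put keyword scores and embedding similarities on a common scale.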


guide

advanced

Using LLM-as-a-Judge: Complete Guide

Hamel Husain

1 hour

Comprehensive guide to using LLMs to evaluate other LLMs. When to use LLM judges, how to validate them against human judgments, best practices for rubrics and prompts. Practical patterns for scaling subjective evaluations.

Updated: 2025
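Validating a judge against human judgments usually starts with measuring how often the two agree. The sketch below computes raw agreement and Cohen's kappa, which corrects for agreement expected by chance. The labels are made up for illustration, and choosing kappa as the statistic is an assumption here rather than something the guide is known to prescribe.

```python
from collections import Counter


def agreement_rate(human, judge):
    """Fraction of items where the human label and the judge label are equal."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(h == j for h, j in zip(human, judge)) / len(human)


def cohens_kappa(human, judge):
    """Chance-corrected agreement: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(human)
    p_observed = agreement_rate(human, judge)
    human_counts, judge_counts = Counter(human), Counter(judge)
    labels = set(human) | set(judge)
    # Probability that both raters pick the same label if they labeled independently.
    p_expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical pass (1) / fail (0) labels on the same eight outputs.
human_labels = [1, 1, 0, 0, 1, 0, 1, 1]
judge_labels = [1, 0, 0, 0, 1, 1, 1, 1]
print(agreement_rate(human_labels, judge_labels))
print(round(cohens_kappa(human_labels, judge_labels), 3))
```

In this example 75% raw agreement corresponds to a kappa of about 0.47, a reminder that raw agreement can look better than it is when one label dominates.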


course

advanced

Machine Learning Systems Design (CS329S)

Stanford University (Chip Huyen)

10 weeks

Stanford course on designing and deploying ML systems in production. Covers data engineering, model development, deployment, monitoring, and maintenance. Includes case studies from industry. Perfect for understanding the full ML lifecycle.

Updated: 2024


What to Learn Next

Continue your AI learning journey with these related paths

🎓

AI Fundamentals

Understand how LLMs work and what they can do

🤖

Building AI Agents

Build agents that can use tools and reason

🧠

Agent Memory & RAG

Give agents context and memory