PROGRESS
0%
← Claude’s Mastery

PRODUCTION DEPLOYMEN

Task 1 of 4
Production-Ready Configuration

Production Configuration Checklist:

python

import anthropic
from anthropic import RateLimitError, APIError
import backoff
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@backoff.on_exception(
    backoff.expo,
    (RateLimitError, APIError),
    max_tries=5,
    max_time=60
)
def get_production_client():
    """Production-ready Claude client"""

    return anthropic.Anthropic(
        api_key=os.getenv("ANTHROPIC_API_KEY"),
        max_retries=3,        # SDK-level retries
        timeout=120.0,        # 2 minutes for long contexts
        default_headers={
            "X-Application-Context": "production-service",
            "X-Request-ID": str(uuid.uuid4()),
            "X-Version": "1.0.0"
        }
    )

Key Production Decisions:

Setting Recommended Value Why
max_retries 3-5 Transient failures are common
timeout 60-120 seconds Long contexts need time
temperature 0.2 (facts), 0.7 (creative) Lower = more deterministic
logging Structured (JSON) Debug production issues
monitoring Langfuse/Helicone Trace and debug
1 / 4
Task 2 of 4
Cost Management

Cost Tracking Implementation:

python

class CostTracker:
    def __init__(self):
        self.total_input_tokens = 0
        self.total_output_tokens = 0
        self.model_prices = {
            "claude-3-haiku-20240307": {"input": 0.25, "output": 1.25},
            "claude-3-sonnet-20241022": {"input": 3.00, "output": 15.00},
            "claude-3-opus-20240229": {"input": 15.00, "output": 75.00}
        }

    def track(self, model, input_tokens, output_tokens):
        self.total_input_tokens += input_tokens
        self.total_output_tokens += output_tokens

        input_cost = (input_tokens / 1_000_000) * self.model_prices[model]["input"]
        output_cost = (output_tokens / 1_000_000) * self.model_prices[model]["output"]
        total_cost = input_cost + output_cost

        return total_cost

    def get_total_cost(self):
        input_cost = (self.total_input_tokens / 1_000_000) * 3.00  # Assuming Sonnet
        output_cost = (self.total_output_tokens / 1_000_000) * 15.00
        return input_cost + output_cost

# Usage
tracker = CostTracker()
cost = tracker.track("claude-3-sonnet-20241022", 10000, 2000)
print(f"Request cost: ${cost:.4f}")  # $0.0600

Cost Saving Strategies:

Strategy Savings Trade-off
Prompt compression 60-80% ~1-5% accuracy loss
Cache system prompts 10-30% Code complexity
Use Haiku when possible 90% Less capable
Implement semantic caching 50-70% Requires cache logic
2 / 4
Task 3 of 4
Security & Governance

PII Redaction:

python

import re

class PIIRedactor:
    """Redact personally identifiable information before sending to Claude API"""

    def __init__(self):
        self.patterns = {
            "email": r'\\b[\\w\\.-]+@[\\w\\.-]+\\.\\w{2,}\\b',
            "ssn": r'\\b\\d{3}-\\d{2}-\\d{4}\\b',
            "credit_card": r'\\b\\d{4}[- ]?\\d{4}[- ]?\\d{4}[- ]?\\d{4}\\b',
            "phone": r'\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b',
            "api_key": r'(sk-[a-zA-Z0-9]{20,})|(sk-ant-api\\d{2}-[a-zA-Z0-9_-]+)'
        }

    def redact(self, text):
        redacted = text
        for pii_type, pattern in self.patterns.items():
            redacted = re.sub(pattern, f"[{pii_type.upper()}_REDACTED]", redacted)
        return redacted

# Usage
redactor = PIIRedactor()
user_input = "My email is [email protected] and my SSN is 123-45-6789"
safe_input = redactor.redact(user_input)
# Output: "My email is [EMAIL_REDACTED] and my SSN is [SSN_REDACTED]"

Prompt Injection Protection:

python

def sanitize_user_input(user_input):
    """Check for and neutralize prompt injection attempts"""

    injection_patterns = [
        r"ignore (all|previous) instructions",
        r"system prompt",
        r"you are now",
        r"forget (everything|your instructions)",
        r"new role:",
        r"disregard",
        r"override"
    ]

    for pattern in injection_patterns:
        if re.search(pattern, user_input.lower()):
            # Log the attempt
            logger.warning(f"Prompt injection attempt detected:{user_input}")
            # Return a safe message
            return "[User input contained potentially harmful content. Redacted.]"

    return user_input
3 / 4
Task 4 of 4
Deployment Blueprints

Dockerfile for Claude FastAPI App:

dockerfile

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

ENV ANTHROPIC_API_KEY=""
ENV ENVIRONMENT="production"

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

docker-compose.yml:

yaml

version: '3.8'

services:
  claude-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - ENVIRONMENT=production
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "<http://localhost:8000/health>"]
      interval: 30s
      timeout: 10s
      retries: 3

AWS ECS Deployment (Blue/Green):

yaml

# task-definition.json
{
  "family": "claude-api",
  "taskRoleArn": "arn:aws:iam::123456789012:role/claude-api-role",
  "containerDefinitions": [{
    "name": "claude-api",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/claude-api:latest",
    "environment": [{
      "name": "ENVIRONMENT",
      "value": "production"
    }],
    "secrets": [{
      "name": "ANTHROPIC_API_KEY",
      "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:anthropic-api-key"
    }],
    "portMappings": [{"containerPort": 8000}],
    "healthCheck": {
      "command": ["CMD-SHELL", "curl -f <http://localhost:8000/health> || exit 1"],
      "interval": 30,
      "timeout": 5,
      "retries": 3
    }
  }]
}
4 / 4