Hacklido Learn

Task 1

The Origin Story: Why Claude Exists

Anthropic was founded in 2021 by former OpenAI researchers including Dario Amodei and Tom Brown. They worked on GPT-3 and witnessed firsthand the problems with RLHF (Reinforcement Learning from Human Feedback): reward hacking, sycophancy (the model telling you what you want to hear), and the "alignment tax" where safety features reduced capabilities.

Their solution was Constitutional AI.

Instead of learning from potentially biased human feedback, Claude is trained on a written "constitution" - a set of explicit principles. The model learns to evaluate its own outputs against these principles.

Why This Matters To You:

Problem with Other AI	How Claude Solves It
Models ignore instructions	Claude follows constraints with near-100% accuracy
Models hallucinate confidently	Claude says "I don't know" when uncertain
Safety reduces capabilities	Constitutional AI maintains both
Hard to steer	System prompts have outsized impact

Check your understanding:

Q1. What problem with RLHF did Anthropic's founders witness firsthand?

Overfitting, hallucination, and data poisoning Reward hacking, sycophancy, and the alignment tax Token limits, slow inference, and model collapse Bias amplification, prompt injection, and context loss

Task 2

The Claude Timeline

March 2023 - Claude 1.0 released with 100K context window (revolutionary at the time - most models had 4K-8K)

July 2023 - Claude 2.0 with 200K context, better reasoning, improved safety

November 2023 - Claude 2.1 with tool use, reduced hallucinations

March 2024 - Claude 3 Family launches: Haiku (fastest), Sonnet (balanced), Opus (most capable)

June 2024 - Claude 3.5 Sonnet with 1M context, improved coding, computer use beta

October 2024 - Claude 3.5 Haiku and Claude 3.5 Opus announced

The 100K Context Revolution:

Before Claude, most AI models could only handle 4,000-8,000 tokens at once. That's about 3-6 pages of text. Claude 1.0's 100K context window meant you could feed it an entire novel (The Great Gatsby is 72K tokens) and ask questions about any part of it.

Today, Claude 3.5 Sonnet offers 200K standard and 1M tokens in beta. That's all three Lord of the Rings books at once. Or a full year of Slack messages for a 20-person team. Or an entire startup's codebase.

Check your understanding:

Q1. What could you fit in Claude 3.5 Sonnet's 1M token context window?

All three Lord of the Rings books, or a year of Slack messages for a 20-person team, or a complete startup codebase A single novel, one month of emails, or a small website's HTML files A 10-page report, a week of chat logs, or a few hundred lines of code An entire encyclopedia, 10 years of social media posts, or a full operating system's source code

Task 3

The Model Family: Haiku, Sonnet, Opus

Claude 3.5 Haiku - The Speed Demon

Speed: Sub-second responses
Cost (per 1M tokens): Input 0.25,Output0.25,Output1.25
Best for: Real-time chat, content moderation, classification, edge devices
Analogy: The security guard - quick, efficient, handles routine tasks

Claude 3.5 Sonnet - The Workhorse

Speed: 1-3 seconds
Cost (per 1M tokens): Input 3.00,Output3.00,Output15.00
Best for: Most production workloads, coding, RAG systems, agents
Analogy: The software engineer - balanced, reliable, handles 90% of real work

Claude 3.5 Opus - The Genius

Speed: 5-15 seconds
Cost (per 1M tokens): Input 15.00,Output15.00,Output75.00
Best for: Complex reasoning, research, strategic planning, breakthrough tasks
Analogy: The Nobel laureate - slow, expensive, but brilliant

Decision Framework:

text

Use Haiku when:
├── You need sub-second responses
├── You're doing high-volume classification
├── Cost is a primary concern
└── The task is straightforward

Use Sonnet when:
├── You need the best balance of capability and cost
├── You're building production applications
├── You need reliable code generation
└── This should be your default choice 80% of the time

Use Opus when:
├── Sonnet isn't performing well enough
├── You have complex reasoning tasks
├── You're doing research or strategy work
└── Budget allows for higher cost

Hands-On Mini Task:

Open Claude Web or API and ask the same question to all three models:

"Explain quantum computing to a 10-year-old using an ice cream shop analogy."*

Notice the difference in speed, creativity, and depth.

Task 4

Strengths and Weaknesses (Be Honest)

STRENGTHS:

1. Instruction Following (Near-Perfect)

If you say "respond in JSON only," Claude responds in JSON only. Not "Here's your JSON as requested" with extra text. Just JSON. This is Claude's superpower.

2. Long Context Coherence

Claude maintains consistency across 200,000 tokens. It can reference a detail from page 1 on page 200. Most models lose track after 20-30K tokens.

3. Reduced Hallucinations

When Claude doesn't know something, it says "I don't know." Industry-leading factuality. A fintech startup switched from GPT-4 to Claude 3.5 Sonnet for loan document analysis. Hallucinations dropped from 8% to 1.5%.

4. Code Quality

Claude generates idiomatic, well-structured, documented code with fewer security flaws. It's the preferred coding assistant for many developers for a reason.

5. Multi-Step Reasoning

Claude can hold 10-15 steps of reasoning without losing track. Chain-of-thought prompting is exceptionally effective.

6. Tool Use (Native Function Calling)

Claude has robust schema understanding for tool use. Unlike OpenAI's explicit function calling, Claude figures out when and how to use tools naturally.

7. Safety by Default

Lower risk of jailbreaks or harmful outputs. You don't need to build safety guardrails from scratch.

WEAKNESSES:

1. Latency (Opus)

5-15 seconds for complex prompts. Not suitable for real-time chat applications. Use Haiku for real-time needs.

2. No Fine-Tuning

You cannot fine-tune Claude models. Unlike OpenAI or open-source models, what you get is what you work with. This means you must master prompt engineering - you can't train your way out of bad prompts.

3. Limited Multimodality

Vision only (images). No audio, no video, no image generation. Need DALL-E? Use ChatGPT. Need audio understanding? Use Gemini.

4. Structured Output Inconsistency

Sometimes struggles with complex nested JSON schemas. Always validate outputs.

5. Rate Limits

Lower than OpenAI on standard tiers. Plan accordingly.

6. No Native Caching

Each request reprocesses the context (though prompt caching is rolling out). This increases cost for repeated requests with the same context.

7. Cost at Scale

Opus is expensive at scale. Haiku is cheap but less capable. For high-volume applications, consider prompt compression (Chapter 5 covers this).

Check your understanding:

Q1. Name two weaknesses of Claude that would make you choose ChatGPT instead

No support for multiple languages, and poor performance on creative writing and summarization Limited context window, and inability to handle coding tasks or technical problem-solving Multimodality needs like audio and image generation, or already being in the OpenAI ecosystem with GPT-4o-mini Lack of internet access, and inability to handle long documents or complex reasoning tasks

Task 5

Real-World Use Cases by Industry

Software Development

Code generation and refactoring
Unit test creation (90%+ coverage)
Bug diagnosis from stack traces
Documentation generation
Code review and security analysis
API client generation from OpenAPI specs

Healthcare

Medical literature summarization (200K token papers)
Clinical note generation from conversations
Drug interaction checking
Patient question answering (screened by clinicians)
Insurance claim processing

Finance

SEC filing analysis (10-K, 10-Q)
Earnings call transcription analysis
Risk assessment from loan applications
Fraud detection pattern analysis
Trading strategy backtesting explanation

Legal

Contract review and redlining
Case law summarization
Deposition transcript analysis
Compliance document generation
Discovery document classification

E-commerce

Product description generation (SEO-optimized)
Customer support automation (50%+ deflection)
Review summarization (thousands of reviews → insights)
Personalized product recommendations
Abandoned cart recovery emails

Education

Lesson plan generation
Quiz and assessment creation
Student essay feedback
Personalized tutoring
Syllabus design

Marketing

Blog post and article generation
Social media content calendars
Ad copy A/B testing
SEO meta description generation
Email newsletter personalization

Real Production Example:

A fintech startup switched from GPT-4 to Claude 3.5 Sonnet for their loan document analyzer. Results:

Hallucinations dropped from 8% to 1.5%
Customer support tickets about incorrect analyses decreased by 73%
Saved $15,000/month in manual review costs
Check your understanding:

Q1. Which Claude strength was most valuable for the fintech loan document analyzer?

Improved multilingual support across 30 different languages Larger context window handling up to 500 pages in one request Faster response time cutting processing from 10 seconds to 2 seconds Reduced hallucination rate dropping from 8% to 1.5%

Task 6

Why Developers Love Claude (The Real Reasons)

Survey of 500+ Developers (Anthropic internal data, 2024):

Reason	Percentage
Follows instructions precisely	94%
Generates working code first try	87%
Admitted when it didn't know	83%
Maintains context across long files	79%
Easy API integration	76%

The "Coding Assistant That Actually Works":

Developer: "Refactor this function to be async and add error handling"

Claude: [Provides working async code with try/catch, logging, and retry logic]

Developer: "Now make it idempotent"

Claude: [Adds request ID tracking, deduplication, and atomic operations]

The Unreasonable Effectiveness of System Prompts:

With ChatGPT, system prompts are suggestions:

System: "Respond in French"
User: "Hello"
ChatGPT: "Hello! How can I help you today?" # Fails

With Claude, system prompts are rules:

System: "Respond in French"
User: "Hello"
Claude: "Bonjour! Comment puis-je vous aider?" # Follows perfectly

WHAT IS CLAUDE?