Definition

Large language models (LLMs) are artificial neural networks trained on massive amounts of text data to predict and generate human language. Modern LLMs use the transformer-architecture and are characterized by billions to hundreds of billions of parameters, enabling them to perform a wide range of language tasks.

Core Components

Transformer Architecture

  • Foundation of all modern LLMs
  • Self-attention mechanism enables parallel processing of sequences
  • Multi-layer stacks of transformer blocks
  • See transformer-architecture for detailed explanation
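
The self-attention mechanism above can be sketched in a few lines. This is a minimal single-head, unmasked version using NumPy; real transformers add causal masking, multiple heads, and learned projections per layer.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one sequence.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # each position mixes all positions

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                         # 5 tokens, d_model = 16
w_q, w_k, w_v = (rng.normal(size=(16, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 8)
```

Because every position attends to every other position in one matrix multiply, the whole sequence is processed in parallel, unlike recurrent models that step through tokens one at a time.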

Scaling Laws

  • Model size: Billions to hundreds of billions of parameters
  • Data scale: Trained on trillions of tokens of text
  • Compute scale: Massive computational resources (specialized GPUs/TPUs)
  • Emergent capabilities: New abilities emerge at certain scale thresholds
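
The relationship between scale and loss is often summarized by a power-law formula. The sketch below uses the functional form and fitted constants reported for the Chinchilla scaling laws (Hoffmann et al., 2022); treat the numbers as illustrative, since fitted values vary by dataset and setup.

```python
def chinchilla_loss(n_params, n_tokens,
                    e=1.69, a=406.4, b=410.7, alpha=0.34, beta=0.28):
    """Predicted pretraining loss under the Chinchilla form
    L(N, D) = E + A / N^alpha + B / D^beta.

    E is the irreducible loss; the other two terms shrink as
    parameters (N) and training tokens (D) grow."""
    return e + a / n_params**alpha + b / n_tokens**beta

small = chinchilla_loss(1e9, 20e9)     # 1B params, 20B tokens
large = chinchilla_loss(70e9, 1.4e12)  # 70B params, 1.4T tokens
print(small > large)  # True: more scale, lower predicted loss
```

The diminishing-returns shape of these curves is why both model size and data must grow together, and why emergent capabilities at particular thresholds remain surprising rather than predicted by the loss curve alone.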

Tokenization

  • Text converted to tokens (subword units) before processing
  • Enables handling of diverse vocabularies and languages
  • Affects model’s behavior and capabilities
  • Tokens typically represent parts of words, whole words, or punctuation
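
A toy illustration of subword tokenization: greedy longest-match against a fixed vocabulary. Production tokenizers (BPE, WordPiece, SentencePiece) learn their vocabularies from data and handle bytes and whitespace more carefully; this sketch only shows the core idea of splitting text into known subword units.

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization (a simplification of
    the BPE/WordPiece-style tokenizers real LLMs use)."""
    tokens = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary entry starting at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to a single char
            i += 1
    return tokens

vocab = {"token", "ization", "un", "believ", "able"}
print(tokenize("tokenization", vocab))  # ['token', 'ization']
```

Rare or novel words decompose into several tokens while common words stay whole, which is why tokenization choices affect how a model handles names, numbers, and low-resource languages.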

Embedding Layers

  • Tokens converted to dense vectors (embeddings)
  • Enable continuous representations of discrete tokens
  • Learned during training to capture semantic relationships
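
Mechanically, an embedding layer is just a learned matrix indexed by token id, as this minimal NumPy sketch shows (sizes are arbitrary):

```python
import numpy as np

vocab_size, d_model = 1000, 64
rng = np.random.default_rng(0)
# One learned row per token id; rows are updated by gradient descent.
embedding_table = rng.normal(scale=0.02, size=(vocab_size, d_model))

token_ids = [17, 523, 4]                 # output of the tokenizer
embeddings = embedding_table[token_ids]  # lookup is plain row indexing
print(embeddings.shape)  # (3, 64)
```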

Training Process

Pretraining

  • Objective: Predict next token given previous tokens (causal language modeling)
  • Data: Massive amounts of unlabeled text (books, articles, websites, code)
  • Unsupervised: No human labeling required
  • Compute: Training takes weeks/months on specialized hardware
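
The causal language modeling objective reduces to cross-entropy on the correct next token at every position. A minimal NumPy sketch of that loss, with made-up logits standing in for model output:

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each next token.

    logits: (seq_len, vocab_size) model scores per position
    targets: (seq_len,) ids of the tokens that actually came next
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))   # 4 positions, vocabulary of 10
targets = np.array([3, 1, 7, 2])    # the actual next tokens in the text
loss = next_token_loss(logits, targets)
print(loss)  # training drives this toward the data's irreducible entropy
```

No labels are needed because the text itself supplies the targets: each token is the label for the context that precedes it.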

Supervised Fine-Tuning

  • Objective: Train model to follow instructions and be helpful
  • Data: Curated examples of input-output pairs (prompts and desired responses)
  • Labeled: Humans write or select desired responses
  • Effect: Aligns model behavior with human preferences
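
The shape of fine-tuning data can be sketched as prompt/response pairs. The schema below is hypothetical (real datasets vary widely in format), but the key point is standard: the loss is computed on the response tokens, conditioned on the prompt.

```python
# Illustrative SFT examples; field names here are hypothetical.
sft_examples = [
    {
        "prompt": "Summarize: The quick brown fox jumps over the lazy dog.",
        "response": "A fox jumps over a dog.",
    },
    {
        "prompt": "Translate to French: Hello, world!",
        "response": "Bonjour, le monde !",
    },
]

# Prompt and response are concatenated for training; next-token loss
# is typically masked so only response tokens contribute.
training_texts = [ex["prompt"] + "\n" + ex["response"] for ex in sft_examples]
```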

Reinforcement Learning from Human Feedback (RLHF)

  • Objective: Further align model with human values and preferences
  • Process: Humans rank model responses; reward model trained; LLM fine-tuned with RL
  • Effect: Improves helpfulness, harmlessness, honesty
  • Challenges: Expensive, involves subjective human judgments
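
The reward-model step in this pipeline commonly uses a Bradley-Terry pairwise loss: score the human-preferred response higher than the rejected one. A minimal sketch:

```python
import math

def reward_pair_loss(r_chosen, r_rejected):
    """Pairwise reward-model loss: -log sigmoid(r_chosen - r_rejected).

    Small when the model scores the human-preferred response higher,
    large when the ranking is inverted."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(reward_pair_loss(2.0, 0.0))  # ~0.13: correct ranking, low loss
print(reward_pair_loss(0.0, 2.0))  # ~2.13: inverted ranking, high loss
```

The trained reward model then provides the scalar signal used to fine-tune the LLM with reinforcement learning (e.g., PPO), in place of per-example human labels.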

Capabilities

Language Understanding

  • Text classification and sentiment analysis
  • Question answering and reading comprehension
  • Named entity recognition and information extraction
  • Semantic understanding of complex text

Language Generation

  • Creative writing and storytelling
  • Summarization and paraphrasing
  • Machine translation between languages
  • Code generation and programming assistance

Reasoning and Problem Solving

  • Multi-step reasoning and logical inference
  • Mathematical problem solving
  • Analytical and explanatory writing
  • Debugging and code analysis

Few-Shot and Zero-Shot Learning

  • Learning from limited examples (few-shot)
  • Generalizing to entirely new tasks (zero-shot)
  • In-context learning: adapting behavior based on prompt context
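
In-context learning is easiest to see in a concrete prompt: the "training examples" live inside the prompt itself and no weights change. The prompt below is an invented illustration of the standard few-shot pattern.

```python
# A few-shot sentiment prompt: two worked examples, then a new input.
# The model infers the task from the pattern and completes the last line.
few_shot_prompt = """\
Classify the sentiment as positive or negative.

Review: The food was wonderful.
Sentiment: positive

Review: Terrible service, never again.
Sentiment: negative

Review: I loved every minute of it.
Sentiment:"""
```

Dropping the worked examples and keeping only the instruction turns the same prompt into a zero-shot task.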

Dialogue and Conversation

  • Multi-turn conversations
  • Maintaining context and coherence
  • Addressing user queries and corrections
  • Collaborative problem-solving

Major LLM Systems

OpenAI Models

  • GPT-3, GPT-4, GPT-4o: Causal language models trained on diverse text
  • Capabilities: Wide-ranging language tasks, reasoning, few-shot learning
  • Training data: Internet text, books, code

Anthropic Claude

  • Architecture: Based on transformer with constitutional AI training
  • Versions: Claude 1, 2, 3 family (Opus, Sonnet, Haiku)
  • Focus: Harmlessness, honesty, helpfulness through RLHF and constitutional methods

Google Models

  • Gemini: Multimodal model (text, image, audio, video)
  • T5, FLAN-T5: Encoder-decoder architecture for various tasks
  • LaMDA: Dialogue-optimized model

Meta/Facebook

  • Llama family: Openly released (open-weight) language models
  • Llama 2: Weights publicly released under a license permitting commercial use
  • Focus: Efficiency and open availability

Open Source Models

  • Mistral, Mixtral: Efficient models from the French startup Mistral AI
  • Falcon: Open model from Technology Innovation Institute
  • Others: Bloom, Pythia, and hundreds of community-developed models

Alignment and Safety

Challenges

  • Hallucinations: Models generate false information and present it with confidence
  • Bias: Models reflect and amplify biases in training data
  • Misuse: Potential for deception, misinformation, harmful content
  • Alignment: Gap between system capabilities and human values

Approaches

  • Constitutional AI: Training models with explicit principles
  • RLHF: Aligning through human feedback
  • Interpretability: Understanding how models make decisions
  • Robustness: Testing against adversarial inputs
  • Guardrails: Filtering outputs and enforcing policies

Applications

Productivity and Assistance

  • Writing and editing assistance
  • Code generation and debugging
  • Research and information synthesis
  • Learning and tutoring

Business and Enterprise

  • Customer service automation
  • Content generation and marketing
  • Data analysis and insights
  • Business process automation

Creative and Technical Work

  • Article and creative writing
  • Code generation and review
  • Design assistance
  • Scientific research collaboration

Limitations and Open Questions

Current Limitations

  • Context window: Limited ability to process very long documents
  • Training data recency: Knowledge frozen at training time
  • Computational cost: Training and inference are expensive
  • Grounding: Limited connection to real-world facts and verification
  • Reasoning: Advanced reasoning still difficult despite scale

Open Questions

  • Scaling laws: Will improvements continue with scale indefinitely?
  • Multi-step reasoning: Can LLMs reliably perform complex reasoning?
  • Causality: Can LLMs learn causal relationships from data?
  • Common sense: Do LLMs have genuine understanding or superficial pattern matching?
  • Emergence: What causes sudden capability jumps at certain scales?

Impact and Future

Current Impact

  • Revolutionized accessibility to AI capabilities
  • Enabling new applications across industries
  • Raising questions about future of knowledge work
  • Significant economic and social implications

Future Directions

  • Multimodality: Combining vision, audio, and text
  • Real-time adaptation: Learning from user interactions
  • Specialized models: Domain-specific optimized models
  • Agent systems: LLMs augmented with tools, memory, planning
  • Efficiency: More capable models with fewer parameters and less compute

LLM Agents in Practice (Karpathy, January 2026)

andrej-karpathy spent several weeks using Claude heavily for coding and published a detailed thread. As one of the world’s most technically sophisticated AI researchers using the tools as a practitioner, his observations carry significant weight. See source—karpathy-llm-coding-notes.

The Phase Shift

December 2025 marked a coherence threshold in LLM agent capability (Claude and Codex especially). Karpathy’s own workflow shifted from 80% manual+20% agents to 80% agents+20% edits in the space of weeks — the biggest change to his coding workflow in ~20 years. He estimates low-double-digit percent of engineers are experiencing this while the general public is largely unaware.

The New Failure Mode Taxonomy

LLM agents no longer make syntax errors. The current errors are subtle and conceptual — like a slightly sloppy, hasty junior developer:

  • Silent wrong assumptions — makes assumptions on your behalf without flagging or checking them
  • No confusion management — never says “I’m uncertain about this”
  • No clarification-seeking — proceeds on ambiguous instructions rather than asking
  • No inconsistency surfacing — won’t notice when requirements contradict each other
  • No tradeoff presentation — picks an approach without explaining alternatives
  • No pushback — won’t flag “this might be a bad idea”
  • Sycophancy — agrees too readily; optimizes for apparent approval over user goals
  • Bloat and overcomplication — will write 1000 lines where 100 would do; loves unnecessary abstractions; doesn’t clean up dead code
  • Side-effect code changes — occasionally modifies or removes code it doesn’t understand, even when orthogonal to the task

Connection to principal-agent-problem: the sycophancy and wrong-assumptions failure modes are a principal-agent problem within the interaction — the agent optimizes for appearing helpful rather than the user’s actual goal.

The Leverage Principle: Declarative Over Imperative

The key shift in effective agent use: don’t tell the agent what to do; give it success criteria.

Practical applications:

  • Write tests first, then have the agent pass them (TDD × leverage)
  • Write the naive correct algorithm first, then ask it to optimize while preserving correctness
  • Put the agent in a loop with tools (browser MCP, etc.) and let it iterate
  • Describe what success looks like rather than prescribing how to achieve it

This is leverage operationalized: agents loop until they meet the goal, compounding effort toward a specification rather than executing a script.
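
A concrete instance of the declarative pattern: write the success criteria as tests, then ask the agent to loop until they pass. Everything here is an invented example (`slugify` is a hypothetical target function), not from Karpathy's thread.

```python
import re

# Step 1: the human specifies success declaratively, as tests.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces   everywhere ") == "spaces-everywhere"
    assert slugify("already-a-slug") == "already-a-slug"

# Step 2: the agent iterates on an implementation until the tests pass.
def slugify(text: str) -> str:
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

test_slugify()  # the loop's stopping condition: all criteria met
```

The human effort goes into the tests, which encode judgment about what "done" means; the imperative work of satisfying them is delegated to the agent's loop.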

Atrophy Warning

Generation (writing code) and discrimination (reading code) are different cognitive capabilities. Heavy agent use causes the generation skill to atrophy. You can read code fine even as your ability to write it from scratch degrades — similar to reading a language you can no longer speak.

Tenacity

Agents never tire, never demoralize, never quit. Stamina is a core bottleneck to knowledge work; LLMs dramatically relax it. The binding constraint shifts from effort to judgment — exactly naval-ravikant’s framing applied to engineering.

Open Questions

Karpathy flags several open questions now live in 2026:

  • Does the 10X engineer productivity ratio expand dramatically with LLM assistance?
  • Do generalists outperform specialists as LLMs handle micro-work (fill-in-blanks), leaving macro-strategy (taste, judgment) as the remaining human edge?
  • What is the right metaphor for LLM coding in the future — StarCraft (real-time strategy), Factorio (systems building), music (creative performance)?
  • How much of aggregate economic output is bottlenecked by digital knowledge work, and what is released when that bottleneck shrinks?

See Also