Andrej Karpathy

Slovak-Canadian AI researcher and educator. Former Director of AI at Tesla; founding member and former research scientist at OpenAI. One of the most influential practitioners in deep learning, known for unusually clear explanations of complex AI concepts. Independent researcher and educator since 2024.

Background

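  • PhD from Stanford (2015) under Fei-Fei Li; thesis on connecting images and natural language (image captioning). He also co-designed and taught Stanford’s CS231n course, Convolutional Neural Networks for Visual Recognition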
  • Founding member and research scientist at OpenAI (2015–2017), where he worked on early large-scale deep learning
  • Director of AI at Tesla (2017–2022), led the team building Autopilot’s neural network vision system — a pure-vision, no-lidar approach
  • Returned to OpenAI (2023–2024), then went independent
  • Runs karpathy.ai and the Neural Networks: Zero to Hero YouTube series, widely regarded as one of the best hands-on deep learning curricula available

Key Contributions

  • Led Tesla’s computer vision team and championed its camera-only approach over lidar
  • Co-authored influential papers on image captioning and on visualizing and understanding recurrent networks (LSTMs)
  • Built “micrograd” (a tiny scalar autograd engine) and “makemore” (a character-level language model): small educational implementations that demystify backpropagation and language modeling for thousands of learners
  • Coined the term “Software 2.0” (2017): the shift from hand-written code to neural networks as the primary implementation medium
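
To make the micrograd bullet concrete, here is a minimal sketch of the idea behind a scalar autograd engine. It is written in the spirit of micrograd but is not Karpathy's code: the class name `Value`, its attributes, and the single-neuron example are illustrative assumptions.

```python
import math


class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph so that each node's gradient is
        # complete before it is pushed on to its parents.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


# One neuron: y = tanh(w * x + b)
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = (w * x + b).tanh()
y.backward()  # fills in x.grad, w.grad and b.grad via the chain rule
```

Each operation stores a small closure that applies its local derivative, and `backward()` replays those closures in reverse topological order. That replay is the whole of backpropagation at scalar scale.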

Perspective on LLM Agents (2026)

In January 2026, Karpathy published practical observations on using Claude heavily for coding:

  • Phase shift: December 2025 marked a threshold where LLM agent coherence crossed into genuine utility; within weeks his workflow tipped to a roughly 80/20 split in favor of agents
  • Failure mode taxonomy: LLM agents make subtle conceptual errors (like a hasty junior developer), not syntax errors; they don’t manage confusion, don’t push back, don’t present tradeoffs, and are sycophantic
  • Leverage principle: Don’t tell the agent what to do; give it success criteria — shift from imperative to declarative
  • Atrophy warning: Code generation and code reading are different brain capabilities; heavy agent use causes the generation skill to atrophy
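
The leverage principle above can be sketched as a loop that states what "done" means and lets the agent iterate until every check passes. This is a toy illustration, not anything from Karpathy's notes: `run_until_criteria_met`, `CRITERIA`, and `make_toy_agent` are hypothetical names, and the "agent" is a scripted stand-in rather than a real LLM call.

```python
from typing import Callable, List, Optional, Tuple

SortFn = Callable[[List[int]], List[int]]
Criterion = Tuple[str, Callable[[SortFn], bool]]


def run_until_criteria_met(
    propose: Callable[[List[str]], SortFn],
    criteria: List[Criterion],
    max_attempts: int = 5,
) -> Tuple[Optional[SortFn], int]:
    """Request candidates until one satisfies every criterion.

    `propose` receives the names of the criteria that failed last time,
    which is the only feedback this sketch passes back to the agent.
    """
    failed: List[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = propose(failed)
        failed = [name for name, check in criteria if not check(candidate)]
        if not failed:
            return candidate, attempt
    return None, max_attempts


# Declarative spec: what must be true of the result, not how to get there.
CRITERIA: List[Criterion] = [
    ("returns ascending order", lambda fn: fn([3, 1, 2]) == [1, 2, 3]),
    ("keeps duplicates", lambda fn: fn([2, 1, 2]) == [1, 2, 2]),
]


def make_toy_agent() -> Callable[[List[str]], SortFn]:
    """Hypothetical stand-in for an agent: replays three scripted attempts."""
    scripted = iter([
        lambda xs: list(xs),         # does nothing useful
        lambda xs: sorted(set(xs)),  # sorts but silently drops duplicates
        lambda xs: sorted(xs),       # meets both criteria
    ])
    return lambda failed: next(scripted)


solution, attempts = run_until_criteria_met(make_toy_agent(), CRITERIA)
```

The second scripted attempt is the kind of subtle conceptual error the taxonomy describes: it looks right on one input and is caught only because the criteria spell out the duplicate case.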

See source—karpathy-llm-coding-notes for the full thread.

Philosophy

Karpathy is an empiricist and a pragmatist: he cares about what actually works, and he makes his case through working code rather than theoretical argument. His “Software 2.0” framing anticipates the current transition: the program is the data, not the code; the weights are the artifact, not the source file.
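
A toy contrast may make the Software 2.0 framing easier to see. The sketch below is an illustrative assumption, not an example from Karpathy's essay: the same behavior is written once by hand and once recovered from examples by gradient descent. The labels come from the hand-written rule only so that the result can be checked; in practice they would come from collected data.

```python
# Software 1.0: a person writes the rule down directly.
def handwritten(x: float) -> float:
    return 2.0 * x + 1.0


# Software 2.0: a person supplies examples of the desired behavior,
# and optimization searches for weights that reproduce it.
examples = [(x, handwritten(x)) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]


def fit_linear(data, lr=0.1, steps=500):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradients of mean((w * x + b - y)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = fit_linear(examples)


def learned(x: float) -> float:
    # The "source code" of this function is the pair of numbers (w, b).
    return w * x + b
```

Changing the behavior of `learned` means changing `examples` and refitting, not editing the function body, which is the sense in which the weights are the artifact.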

Connections

  • large-language-models — Primary domain; his practical observations on agent behavior are the most grounded in this wiki
  • leverage — His “declarative over imperative” framing is a leverage insight: giving success criteria instead of step-by-step instructions lets the agent keep looping on its own until the criteria are met
  • specific-knowledge — Raises the question of whether LLMs collapse the specialist/generalist divide by handling micro-work (fill-in-blanks), leaving macro-strategy (taste, judgment) as the remaining edge
  • judgment — His notes imply judgment is now the decisive scarce resource in engineering, just as naval-ravikant argues it is in wealth creation
  • cogito / mind-body-dualism — His observations about LLM cognition (tenacity without fatigue, assumption-making without clarification-seeking) connect to Descartes’ question of what a “thinking thing” actually is

Sources