Chapter 01 · ~10 min

What AI actually is.

No magic. By the end of this chapter you'll understand exactly what's happening when you talk to an AI tool.

TL;DR: the four things to remember

  • An LLM is a program that learned to guess the next word.
  • It learned by adjusting billions of parameters. Not by storing facts.
  • It will sound confident even when wrong. Verify what matters.
  • Today's AI is the worst AI you'll ever use. The curve keeps climbing.

The one-sentence version

An LLM is a program that learned to guess the next word, really well. Conversations, code, essays, and math all come from that one ability, applied over and over.

"The cat sat on the ___"
The model predicts the next word:
  • mat: 62%
  • couch: 25%
  • roof: 8%
Pick the most likely word. Repeat. Trillions of times.
How an LLM works, every time you hit send.
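
Here is a minimal sketch of that single step, using the numbers from the example above. The probabilities are hard-coded and the function name is made up for illustration; a real model computes the probabilities itself, over a far larger set of candidate words.

```python
# Toy next-word table with the numbers from the example above.
# A real model computes these probabilities; here they are hard-coded.
next_word_probs = {"mat": 0.62, "couch": 0.25, "roof": 0.08}

def pick_next_word(probs):
    """Return the word with the highest probability (the greedy choice)."""
    return max(probs, key=probs.get)

prompt = "The cat sat on the"
word = pick_next_word(next_word_probs)
print(prompt, word)  # The cat sat on the mat
```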

How they learned

Imagine feeding a computer every book, every Wikipedia article, every academic paper, a huge chunk of the internet, and a lot of code. Trillions of words.

Then ask it, over and over: "What's the next word?" Every wrong guess nudges its internal settings, billions of numbers called parameters, to make a better guess next time.

Do this over and over, across trillions of words, on giant computers. The result: a program that's shockingly good at predicting words. So good that intelligence kind of emerges from it.
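
To make "nudging" concrete, here is a toy sketch with just three parameters instead of billions. Everything in it is an assumption for illustration: one adjustable score per candidate word, a softmax to turn scores into probabilities, and a plain gradient step on each wrong-ish guess. Real models use a large neural network trained with backpropagation, but the idea is the same: compare the guess to the right answer, then adjust the numbers a little.

```python
import math

vocab = ["mat", "couch", "roof"]
# The "parameters": one adjustable score per candidate word, all starting at 0.
scores = [0.0, 0.0, 0.0]

def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    biggest = max(xs)
    exps = [math.exp(x - biggest) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(scores, correct_index, learning_rate=0.5):
    """Nudge the scores so the correct word is more likely next time."""
    probs = softmax(scores)
    for i in range(len(scores)):
        target = 1.0 if i == correct_index else 0.0
        # For cross-entropy loss, the gradient for score i is (prob - target).
        scores[i] -= learning_rate * (probs[i] - target)

before = softmax(scores)[0]
for _ in range(20):
    train_step(scores, correct_index=vocab.index("mat"))
after = softmax(scores)[0]
print(f"P(mat) before: {before:.2f}, after: {after:.2f}")
```

Each pass raises the score for "mat" and lowers the others, so the probability of the right answer climbs from one in three toward one.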

The four stages of building one

  01 · TOKENIZE: Chop text. "The cat" → [The] [cat]
  02 · PRE-TRAIN: Read internet. $100M+ in GPUs (labs do this)
  03 · FINE-TUNE: Specialize. "be helpful," teach a skill
  04 · INFERENCE: You chat. Model picks words live
Every LLM you've ever used went through these four stages.

1. Tokenize

Models don't read words. They read tokens, chunks of words. "unbelievable" might be three: un, believ, able.
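
A rough sketch of the idea, assuming a tiny hand-picked vocabulary and a greedy longest-match rule. Both are stand-ins: real tokenizers learn their vocabulary from data (byte-pair encoding is one common approach), so the actual splits for any given word may differ.

```python
# Hypothetical vocabulary, chosen so the example from the text works out.
VOCAB = {"un", "believ", "able", "the", "cat"}

def tokenize(word, vocab=VOCAB):
    """Greedy longest-match split; unknown characters become single tokens."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest possible piece first, then shrink.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # fallback: one character
            i += 1
    return tokens

print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
```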

2. Pre-train

The big-money, big-computer phase. The model reads vast piles of text and tunes its parameters until it's great at predicting tokens. For a frontier model this costs $100M+, so in practice only a handful of well-funded labs (OpenAI, Anthropic, Meta, Google, and a few others) do it.

3. Fine-tune

Teaches the generalist model specific skills: "be helpful," "follow instructions," "write SQL." A much smaller training run. With a small model, you can actually do this yourself on a laptop. See Projects.

4. Inference

What happens when you hit send. The model takes your message, predicts tokens one at a time, and streams back an answer.
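
The loop can be sketched like this. The lookup table is a made-up stand-in for a trained model, and it only looks at the last token; a real model considers the whole conversation and often samples from the probabilities instead of always taking the top choice.

```python
# Hypothetical lookup table standing in for a trained model:
# for each token, the probabilities of what comes next.
TOY_MODEL = {
    "The": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 0.8, "ran": 0.2},
    "sat": {"on": 0.9, "down": 0.1},
    "on": {"the": 1.0},
    "the": {"mat": 0.62, "couch": 0.25, "roof": 0.08},
    "mat": {"<end>": 1.0},
}

def generate(prompt_tokens, max_new_tokens=10):
    """Predict one token at a time, feeding each pick back in as context."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = TOY_MODEL.get(tokens[-1])
        if probs is None:          # the toy table has no entry for this token
            break
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":  # the model signals it is done
            break
        tokens.append(next_token)
    return tokens

print(" ".join(generate(["The"])))  # The cat sat on the mat
```

Streaming falls out of this design: each token is available as soon as it is picked, so a chat app can show it before the rest of the answer exists.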

Why this explains everything weird about AI

Why it hallucinates

Because it's a pattern-matcher, not a database. A confident-sounding question gets a confident-sounding answer, whether the answer is true or not. It's not lying. It just doesn't have a separate truth-checker. Rule: verify the things that matter (dates, citations, quotes, names). Treat AI like a smart friend who occasionally bluffs.

Why context matters so much

Every word in your prompt steers the predictions. Vague in → vague out. Rich in → rich out. That's why Chapter 2 exists.

Why prompts feel like spells

They kind of are. Saying "act as a strict editor" pushes the model toward editor-shaped predictions. Saying "in 100 words" pushes it toward short ones. Small wording changes → big output shifts.

Why AI keeps getting better, fast

More data + more compute + smarter methods, all improving in parallel.

The line worth remembering

The AI you're using today is the worst AI you'll ever use for the rest of your life. Every model release is more capable. The skill you build now compounds for decades.

The scale

  • ~1.7T parameters in a frontier model
  • ~$100M+ to pre-train one
  • trillions of tokens trained on
  • milliseconds per token at inference
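
For a feel of what those numbers mean in practice, here is some back-of-envelope arithmetic. Only the parameter count comes from the figures above; the bytes per parameter, the per-token time, and the reply length are assumptions picked for illustration.

```python
params = 1.7e12          # ~1.7T parameters, from the figures above
bytes_per_param = 2      # assumption: each parameter stored as a 16-bit number
ms_per_token = 20        # assumption: "ms per token" taken as 20 ms
reply_tokens = 500       # assumption: a medium-length answer

weights_terabytes = params * bytes_per_param / 1e12
reply_seconds = reply_tokens * ms_per_token / 1000

print(f"Weights alone: ~{weights_terabytes:.1f} TB")            # ~3.4 TB
print(f"A {reply_tokens}-token reply: ~{reply_seconds:.0f} s")  # ~10 s
```

Under these assumptions the model's numbers alone would not fit on a typical laptop drive, which is one reason frontier models run in data centers.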

Try it

Send this to a real model. Notice how it explains things in its own words, then asks if you got it.