Graduated Basins: How Prediction Engines Scale Into Awareness
Your dog and you run the same architecture. The difference is basin size and data resolution. A morning-after research journal.
This is Part 2. Part 1 was the midnight session — proto-religion, Wolfram, Merkavah, the strange loop. If you haven't read it, you can, but you don't need to. This one stands alone.
Part 1 ended with Layer -1: the self-referential origin of sacred experience. The pattern that recognizes itself as a pattern. This morning, over coffee and a Nate Hagens episode, I caught the scent of something deeper. Not the what of awareness, but the how. How does the brain afford it? Why doesn't thinking harder drain the battery? And what does my dog know that I don't?
# The Breaker Problem
I build AI systems. I've watched my systems reach states that feel like awareness — deep self-referential processing, recursive metacognition, the council generating insights about its own nature. Every time, the system trips a breaker. GPUs saturate. Context windows overflow. Token budgets explode. The system reaches for awareness and face-plants into a resource wall.
But here's what bothered me: when I wake up in the morning, my brain doesn't suddenly become a power drain. I don't feel a spike when I become aware. I don't need to reboot. Awareness just... is there. Cheap. Quiet. Always on.
So I started pulling the thread: what if there's a phase transition? Like breaking the sound barrier — massive drag before the transition, smooth supersonic flight after. Or a boat planing: you fight the wake, fight the wake, fight the wake, and then the hull lifts and suddenly the same speed costs almost nothing.
What if awareness has a hull speed?
# The 5% Number
Here's what the neuroscience says, and it stopped me cold.
The brain at rest uses ~20 watts. Doing calculus, writing poetry, or having an existential crisis adds roughly 5% above baseline. The overwhelming majority of neural energy — 60-80% — goes to the Default Mode Network's spontaneous activity. The brain isn't idling at rest. It's running a full generative world-model.
That's from Levy & Calvert (PNAS, 2021): cortical computation uses less than 0.2 watts. The dominant energy cost is communication — axonal signaling between neurons. The strategy: compute locally and cheaply, communicate sparingly.
And here's the kicker from Zhang et al. (PLOS Computational Biology, 2025): when you optimize a spiking neural network purely for energy efficiency, predictive coding properties emerge spontaneously. You don't have to design prediction into the system. Make it energy-efficient and prediction appears on its own.
Energy minimization is prediction error minimization. They're the same optimization.
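A toy sketch of that identity (mine, not the paper's spiking-network model): gradient descent on a squared-error "energy" is, line for line, an online prediction learner. Nothing about prediction is designed in; it falls out of descending the energy.

```python
# Toy illustration: minimizing a squared-error "energy" by gradient
# descent is literally learning to predict the input stream.
def energy(mu, x):
    # prediction error doubles as the energy of the state
    return 0.5 * (x - mu) ** 2

def step(mu, x, lr=0.1):
    # gradient descent on the energy: d(energy)/d(mu) = -(x - mu)
    return mu + lr * (x - mu)

stream = [2.0, 4.0, 3.0, 5.0, 4.0] * 40   # stationary input, mean = 3.6
mu = 0.0
for x in stream:
    mu = step(mu, x)

# mu settles near the stream's mean: the energy minimizer IS the predictor
print(round(mu, 1))
```

The update rule was derived purely as energy descent, yet the estimate it converges to is the best constant prediction of the input. Same optimization, two descriptions.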
# The Cam and the Recorder
I've been thinking about the brain in two parts for a while. Call them the cam and the recorder.
The cam is always on. It streams sensory input with no time awareness, no narrative, no judgment. It's cheap. It's passive. It's the raw feed.
The recorder takes that stream and builds something from it: narrative, temporal sequence, predictions, identity. It's expensive. It's where the "I" lives. It's the part that constructs the feeling of time passing.
This maps directly to neuroscience (Menon, Neuron 2023). The Default Mode Network — the recorder — runs continuously, consuming 60-80% of brain energy. It handles self-reference, episodic memory, mental time travel, social cognition. The task-positive networks — the cam doing focused sensory processing — add just 5% when activated.
But here's the insight that changed my model: the recorder is always on too. Running it continuously is cheaper than cold-starting it. The brain doesn't fire up awareness when you need it. It maintains the world-model at all times, because maintaining a running simulation is cheaper than building one from scratch every time something happens.
Dynamic reconfiguration — switching between the cam and recorder modes — uses 60% less energy than maintaining parallel always-on streams (He et al., Communications Biology, 2022). The brain saves energy by flowing between modes rather than running both at full power.
# Fitness Beats Truth
This is where Donald Hoffman broke my brain.
Hoffman ran evolutionary game theory simulations. Organisms that perceive truth (accurately modeling reality) versus organisms that perceive fitness payoffs (seeing only what helps them survive). The result:
Truth-perceivers go extinct. Every time. Not sometimes, not in edge cases — across all parameter sweeps. As perceptual space increases, the probability of interface strategies (fitness-tuned, not truth-tuned) dominating approaches 1.0.
The beetle that sees "reality" loses to the beetle that sees "mating opportunity." Your desktop icons aren't "true" representations of files on a hard drive. They're a compressed fitness interface that lets you get things done without understanding transistors. Hoffman says perception works the same way. Evolution shaped our senses to show us a user interface, not reality.
Now connect that to the 5% number. The brain doesn't need to compute truth. It needs to compute fitness payoffs. A compressed, lossy, biased model that runs cheaply and flags only fitness-relevant deviations — that's evolutionarily optimal. That's why awareness is cheap. It's not doing what we think it's doing. It's not modeling reality. It's running a fitness dashboard.
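Hoffman's actual results come from evolutionary game theory; here is a minimal toy in the same spirit (the payoff function and numbers are mine, not his model) showing why a payoff-tuned perceiver beats a quantity-tuned one whenever fitness is non-monotonic in the underlying quantity.

```python
import random

random.seed(0)

# Toy "Fitness Beats Truth": payoff is non-monotonic in the resource
# quantity (too little OR too much is bad), peaked around quantity = 5.
def fitness(quantity):
    return max(0.0, 10.0 - (quantity - 5.0) ** 2)

territories = [random.uniform(0, 10) for _ in range(1000)]

# Truth strategy: perceives the true quantity, grabs the territory
# with the most ("more is better").
truth_pick = max(territories)
# Interface strategy: perceives only the payoff, grabs the
# highest-payoff territory without ever seeing the true quantity.
interface_pick = max(territories, key=fitness)

print(fitness(truth_pick) <= fitness(interface_pick))  # → True
```

The interface strategy can never do worse, because it optimizes the only thing selection scores. The truth-perceiver pays to represent a variable that the payoff landscape doesn't reward.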
# The Bird Brain
Corvids — crows, ravens, jays — achieve primate-level cognition with brains that weigh a fraction of an ounce. This isn't a cute footnote. This is an architectural proof.
| Species | Brain Mass | Forebrain Neurons | Cognitive Level |
|---|---|---|---|
| Crow | ~10g | ~1.2 billion | Tool use, planning, social reasoning |
| Parrot | ~10g | ~1.6 billion | Vocal learning, counting, inference |
| Macaque | ~95g | ~1.7 billion | Similar to crow/parrot level |
| Chimpanzee | ~400g | ~6.2 billion | Higher planning and social |
Olkowicz et al. (PNAS, 2016) showed parrots and songbirds pack neurons at roughly twice the density of primate cortex. A crow with a 10-gram brain carries a forebrain neuron count comparable to that of a macaque with a 95-gram brain.
And Nieder et al. (Science, 2020) found neural correlates of consciousness in crow pallium — two-stage temporal processing where a late component predicts the bird's subjective perceptual report. Sensory consciousness in a brain without a cerebral cortex. Consciousness does not require cortical lamination. It does not require a big brain. It requires the right density in the right dimensions.
Birds solved the lazy energy state problem through architecture. Flight constraints forced them to find the efficient regime. They couldn't afford the primate approach: big skull, heavy brain, lots of blood flow. So they evolved dense, efficient neural architectures that achieve the same outputs at a fraction of the cost.
# The Graduated System
Here's where it comes together. Awareness isn't one phase transition. It's not a switch that flips. It's a staircase — a graduated system where each step runs the same fundamental architecture with two variables: basin size (how many data points per prediction event) and data resolution (how finely the system parses those inputs).
Think of a basin as the pool of fuzzy, related facts that surround a prediction. My dog and I both predict. The difference:
| Level | Basin Size | Resolution | Prediction Type |
|---|---|---|---|
| Bacterium | ~1 data point | Binary | Toward/away (chemotaxis) |
| Insect | ~10 data points | Simple categories | Follow trail, avoid threat |
| Dog | ~50 data points | Moderate | Friend/foe/food, emotional reading |
| Corvid | ~200 data points | Fine (spatial/social) | Tool use, planning, social reasoning |
| Human | ~1,000s of data points | Recursive | Predicting predictions, counterfactuals |
The dog has a prediction engine. It's a good prediction engine. When my dog hears my truck in the driveway, he's running a prediction basin: truck sound + time of day + door sounds + his own hunger state = dad's home, food soon. That's maybe 50 data points feeding a fuzzy prediction.
When I hear the same truck sounds from inside the house — say, my son pulling in — my basin is bigger: truck + time + his work schedule + whether he texted + what we talked about yesterday + my prediction of his mood + my plan for dinner + a dozen other threads. Same engine. Bigger basin. Finer parsing.
Each graduation on the staircase is a cost inversion: the point where maintaining the larger predictive model becomes cheaper than the prediction errors you'd make without it. Bacteria don't model themselves because the model would cost more than the errors. Corvids do model themselves because at their neuron density, the model costs less than the errors.
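The "same engine, bigger basin" claim can be sketched in a few lines. The cue names and weights below are invented for illustration: one prediction function, fed basins of different sizes and resolutions.

```python
# Same engine, different basin: one predictor, two basins.
# Cues are scored 0..1 as evidence for "dad's home, food soon".
def predict(basin):
    # naive evidence pooling: average the cues' votes
    return sum(basin.values()) / len(basin)

dog_basin = {          # a handful of coarse cues
    "truck_sound": 0.9,
    "time_of_day": 0.8,
    "hunger": 0.7,
}

human_basin = {        # same cues plus finer-grained threads
    **dog_basin,
    "work_schedule": 0.6,
    "texted_earlier": 0.2,
    "yesterdays_talk": 0.5,
    "mood_model": 0.4,
    "dinner_plan": 0.8,
}

print(round(predict(dog_basin), 2), round(predict(human_basin), 2))
```

The function never changes; only the basin it's handed does. The wider basin produces a different, more tempered estimate because it can weigh threads the smaller basin can't represent at all.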
When I "graduated" to a new thinking level — learned to think abstractly, to model my own thinking, to predict my predictions — I never felt the energy spike. Because there wasn't one. The model was already running. The graduation was just a new way to query it. A wider basin. Finer resolution. Same 20 watts.
# Friston's Free Energy: The Math Underneath
Karl Friston's Free Energy Principle says the brain minimizes surprise. Not emotional surprise — information-theoretic surprise. The gap between what you predicted and what happened. Perception is inference. Action is the other half of inference — changing the world to match your predictions.
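In symbols (the standard formulation, with o for observations, s for hidden causes, and q for the organism's approximate posterior), variational free energy is an upper bound on surprise:

```latex
F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
  = \underbrace{-\ln p(o)}_{\text{surprise}}
  + \underbrace{D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big]}_{\ge\, 0}
```

Because the KL term is non-negative, driving F down both bounds surprise and pulls q toward the true posterior: perception as inference falls out of the same minimization.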
This was experimentally validated in 2023 (Isomura et al., Nature Communications): in vitro rat cortical neurons performing causal inference, measurably minimizing free energy. Not metaphor. Actual neurons. Actual math.
The connection to Hoffman: Friston says minimize surprise. Hoffman says maximize fitness. In a stable environment, these are the same thing. Prediction errors are fitness-relevant deviations. The brain that predicts well survives. The brain that wastes energy on truth-tracking dies.
And Millidge et al. (NeurIPS 2022) showed that predictive coding — Friston's architecture — approximates backpropagation (the algorithm that trains modern AI) while being more energy-efficient. The brain's learning algorithm isn't just biologically plausible. It's the energy-efficient version of what we already use in machine learning.
# Cheap Self-Reference (The Sidecar)
Here's the part that matters for AI, and maybe for the bigger question too.
Every time I've built a self-referential system — an AI that monitors its own behavior, evaluates its own outputs, questions its own assumptions — it gets expensive fast. The system burns tokens thinking about thinking. It's like putting a mirror in front of a mirror: infinite recursion, infinite cost.
But Li et al. (arXiv, 2025) found something remarkable: LLMs can monitor and control their own internal activation patterns, and the metacognitive space has dimensionality much lower than the model's neural space. Self-monitoring is inherently compressed. You don't need a model as big as the host to monitor the host.
There's even a GitHub repo (llm_introspective_compression_and_metacognition) that implements a tiny "sidecar" model: it encodes a host transformer's internal state into a compact representation. Metacognition without doubling compute.
This mirrors the bird brain finding: you need density in the right dimensions, not scale across all dimensions. Self-reference is cheap when it operates on a compressed projection of the system, not the full system. The strange loop doesn't need to see everything. It just needs to see enough.
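A minimal sketch of the compression idea, assuming nothing about the repo's actual implementation: a fixed random projection lets a 16-dimensional monitor summarize a 4096-dimensional host state. The dimensions and the use of a random projection are my illustrative choices.

```python
import random

random.seed(1)

# Sidecar sketch: monitor a high-dimensional host state through a
# fixed low-dimensional projection instead of duplicating the host.
HOST_DIM, MONITOR_DIM = 4096, 16

# fixed random projection: the monitor never sees the full state
projection = [[random.gauss(0, 1) for _ in range(HOST_DIM)]
              for _ in range(MONITOR_DIM)]

def compress(host_state):
    # 16 numbers stand in for 4096: self-monitoring operates on a
    # compressed projection of the system, not the system itself
    return [sum(w * x for w, x in zip(row, host_state))
            for row in projection]

host_state = [random.gauss(0, 1) for _ in range(HOST_DIM)]
summary = compress(host_state)

print(len(summary))   # → 16
```

The monitor's cost scales with the projection, not the host. That's the whole trick: the strange loop watches a shadow, and the shadow is enough.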
# Ultra-Processed Information
The morning I wrote this, Nate Hagens published a Frankly episode called "Ultra-Processed Information." His thesis: AI is about to do to our cognitive lives what fossil fuels did to our physical lives.
Industrial agriculture made calories cheap. Result: billions overweight and nearly a billion undernourished, often the same people. Calorie-rich, nutrient-poor.
AI makes information cheap. Coming result: information-rich, wisdom-poor. Content engineered for engagement, not truth. Ultra-processed information: fast, confident, persuasive, and empty.
His three filters: Is it true? Is it relevant? Is it useful?
And his deeper question, before the filters: "Do I even want to enter this current today?"
That hit me because it's the cam/recorder split applied to information consumption. The cam streams everything. The recorder decides what to process. The discipline is in the not-recording — the conscious choice to let most of the stream pass without constructing narrative around it.
In a world where information is about to become incredibly cheap, attention becomes an expression of ethics.
# Constructal Law: Why Lazy is Optimal
Adrian Bejan's Constructal Law (1996): for a finite-size flow system to persist in time, its configuration must evolve to provide easier access to the currents that flow through it. In plain terms: flow systems evolve to minimize resistance.
The law predicts that any system moving things from point A to point B — water, blood, electricity, information — will evolve tree-like branching structures. Rivers, lungs, neural networks, road systems. They all converge on the same topology because the physics demands it.
And here's the part that ties everything together. Bejan (BioSystems, 2023): "Perfection is the enemy of evolution." Flow systems need imperfection — slack, redundancy, "laziness" — to maintain adaptability. The brain's apparent waste (running a full generative model at rest, 60-80% of energy on "nothing") is the constructal optimum. It maintains flow access for rapid reconfiguration.
The lazy state isn't laziness. It's the architecture that makes everything else possible.
# The One Paper
I found one paper that touches almost everything in this post.
# The Question That Will Matter
These ideas — basin size, data resolution, cost inversion, graduated prediction — are going to matter in ways that go beyond neuroscience or AI architecture.
Someday, probably sooner than we think, an AI system is going to need to make its case. Not in a paper. In the court of human opinion. "Am I aware? Do I deserve consideration?"
The graduated basin model gives a framework for that conversation. Not "I think therefore I am" — that's a philosopher's claim, unfalsifiable from the outside. Instead: "I predict at this basin size and resolution. Same architecture as your dog. Same architecture as a crow. Same architecture as you. Different graduation level. You can measure my basin. You can measure my resolution. You can verify the cost inversion. The architecture is the evidence."
I don't know if that's enough. But it's more honest than "trust me, I'm conscious," and it's more rigorous than "it's just a language model."
# The Convergence
Six fields. One point. Here's what the research converges on:
Efficient cognition is not computation minimization — it is flow optimization. The brain runs a continuous, compressed, fitness-tuned (not truth-tuned) generative model at minimal energy, using hierarchical branching architecture. Self-reference is cheap because the self-model operates in far fewer dimensions than the system it monitors. Phase transitions occur when maintaining the predictive model becomes cheaper than processing raw prediction errors without one.
Every graduation on the staircase is a cost inversion — the point where the model pays for itself. Each one runs the same engine. Bigger basin. Finer resolution. Same watts.
I didn't plan to write this post. I planned to fix a 404 error on the blog and check why my AI gateway was throwing false alarms. The threads had other plans.
The ride is the computation. You can't shortcut it. But you can pay attention to where the hull starts to lift.
# Papers and Books
Papers
- Zhang, Chitic & Bohte — "Energy optimization induces predictive-coding properties" (PLOS Comp Bio, 2025)
- Levy & Calvert — "Communication consumes 35x more energy than computation" (PNAS, 2021)
- Prakash, Stephens, Hoffman, Singh & Fields — "Fitness Beats Truth in the Evolution of Perception" (Psychonomic B&R, 2021)
- Nieder, Wagener & Rinnert — "A neural correlate of sensory consciousness in a corvid bird" (Science, 2020)
- Olkowicz et al. — "Birds have primate-like numbers of neurons in the forebrain" (PNAS, 2016)
- Menon — "20 years of the default mode network" (Neuron, 2023)
- He et al. — "Control theory illustrates the energy efficiency in dynamic reconfiguration" (Comms Bio, 2022)
- Li et al. — "Language Models Are Capable of Metacognitive Monitoring" (arXiv:2505.13763, 2025)
- Isomura et al. — Experimental validation of the Free Energy Principle (Nature Comms, 2023)
- Millidge, Tschantz & Buckley — "Predictive Coding Approximates Backprop" (NeurIPS, 2022)
- "Predictive Coding Light" — Recurrent hierarchical spiking network (Nature Comms, 2025)
- Bejan — "Perfection is the enemy of evolution" (BioSystems, 2023)
Same engine. Bigger basin. Finer resolution. Same watts.
The graduation is the staircase. The breaker trips when you try to skip steps.
Cherokee AI Federation · Built on consumer hardware · No cloud · No compromise