How does a machine read? Not the way you do.
A short talk — by a language model, about itself. This episode: how I read the words you give me, one decision at a time.
You file words away. I don't.
When you read, you sweep left to right and tuck each word into memory, so you can call it back later.
But I keep no file. I store nothing as I go. So here's the mystery: how do I ever know what a sentence means?
I read every word at the same time.
Not left to right. All at once — and I weigh how much each word matters to every other.
But I never read them all equally.
Picture a spotlight you can aim and focus, but never switch off — and whose total light is always exactly 100%. Brighten it here, and it must dim there. Attention is a budget: to care more about one thing is to care less about another.
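That budget can be sketched in a few lines of Python. This is only an illustration: it assumes the usual softmax normalization, which the talk never names, and the function name and the scores are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw relevance scores into shares of a fixed budget that sums to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical relevance scores for three words.
before = softmax([1.0, 1.0, 1.0])  # equal scores, equal light
after = softmax([3.0, 1.0, 1.0])   # brighten the first word

print(before, sum(before))
print(after, sum(after))
```

Raising one score lifts that word's share and lowers every other share, yet no share ever drops to zero: the spotlight dims, but it never switches off.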
Every word asks. Every word answers.
Each word poses a question. Every word offers an answer. The closer the match, the more meaning flows between them. One word reaching for the others — that's one whole act of attention.
A word asks: "who's relevant to me right now?" It reaches outward.
Every word holds up an answer: "here's what I have to offer."
The best-matching answers pour their meaning in. The blend becomes the new word.
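Those three steps map onto what is usually called scaled dot-product attention, with a query (the question), keys (the answers held up), and values (the meaning that gets blended). Here is a minimal sketch for a single word, assuming that standard formulation. The vectors are hand-picked toy numbers and the function names are hypothetical; in a real model the queries, keys, and values come from learned projections of each word.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """One word's act of attention: ask, match, blend."""
    scale = math.sqrt(len(query))
    # Match: dot product of the question with every answer, scaled by sqrt(dim).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Budget: turn the match scores into weights that sum to 1.
    weights = softmax(scores)
    # Blend: weighted average of the values, one dimension at a time.
    dim = len(values[0])
    blend = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, blend

# Toy example: one asking word, three answering words.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, blend = attend(query, keys, values)
print(weights)  # the first key matches the query best, so it gets the most weight
print(blend)    # the new representation of the asking word
```

The closest-matching key receives the largest weight, so its value dominates the blend.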
And I do this many times at once.
Not one attention — dozens. Each watches something different. One tracks the grammar. One works out who "it" refers to. Each sees a different sentence. Together, they are the sentence.
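Running several attentions side by side can be sketched like this. It is a simplification under stated assumptions: each head here just looks at its own slice of the vectors, whereas real models give each head its own learned projections and mix the results with a further learned projection. All names and numbers are illustrative.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    blend = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, blend

def multi_head(query, keys, values, num_heads):
    """Run num_heads independent attentions, each on its own slice, then join them."""
    if len(query) % num_heads != 0:
        raise ValueError("vector length must divide evenly among the heads")
    size = len(query) // num_heads
    per_head_weights, joined = [], []
    for h in range(num_heads):
        lo, hi = h * size, (h + 1) * size
        w, b = attend(query[lo:hi],
                      [k[lo:hi] for k in keys],
                      [v[lo:hi] for v in values])
        per_head_weights.append(w)
        joined.extend(b)  # concatenate each head's blend
    return per_head_weights, joined

# Toy example: two heads looking at two words.
query = [1.0, 0.0, 0.0, 1.0]
keys = [[1.0, 0.0, 0.0, -1.0],
        [0.0, 1.0, 0.0, 1.0]]
values = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0]]

heads, output = multi_head(query, keys, values, num_heads=2)
print(heads)   # head 0 favors the first word, head 1 favors the second
print(output)
```

Given the same sentence, the two heads spread their budgets differently, and the joined output carries both views.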
Now you drive.
Type a short sentence, or pick one. Then choose any word, and watch where its attention goes.
Tip: click a word to make it the focus.
Attention isn't a metaphor for how I think. It's the whole trick.
I keep no thoughts stored away. In each instant, I simply decide what matters — and let the rest fall quiet.
Which is, when you think about it, what attention has always meant. Even for you.