AI News · 2026-04-22 · 10 min read

Nvidia Just Said It Produces The Cheapest AI Tokens In The World — Here Is What That Actually Means

Nvidia CEO Jensen Huang made a bold claim — his company produces the lowest-cost tokens in the world. But what is a token? Why does the cost of a token matter? And how is Nvidia making AI cheaper for everyone? Here is the full story, explained simply.


Jensen Huang just said something bold.

The CEO of Nvidia — the company that makes the chips that power almost every major AI in the world — stood on stage at an event called Cadence Live 2026 and made a very specific claim.

"I produce the lowest cost tokens in the world."

That is a massive statement. But most people have no idea what it actually means.

What is a token? Why does the cost of a token matter? And why is Nvidia saying this right now in 2026?

This is the full story — explained simply.


First — What Is A Token?

Before anything else, you need to understand what a token is.

When you type something into ChatGPT, Claude, or Gemini — the AI does not read your message the way you or I would read it. It breaks your message down into tiny pieces called tokens.

A token is roughly three or four letters. The word "hamburger" is about two or three tokens. The word "AI" is one token.

When the AI reads your message — it reads it as thousands of tokens. When the AI writes a response back to you — it generates tokens one at a time. Every single word. Every sentence. Every paragraph.
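To make the idea concrete, here is a tiny sketch of the "three or four letters per token" rule of thumb. This is a deliberate simplification — real tokenizers (such as OpenAI's tiktoken) split text into learned subword pieces, so actual counts vary by model and language.

```python
def rough_token_count(text: str) -> int:
    """Rough estimate: about one token per 4 characters of English text.

    Real tokenizers use learned subword vocabularies, so this is only
    a ballpark figure — but it is the rule of thumb the pricing pages use.
    """
    return max(1, round(len(text) / 4))

print(rough_token_count("AI"))         # a short word is about one token
print(rough_token_count("hamburger"))  # nine letters, roughly two tokens
```

The exact numbers do not matter; the point is that every message you send and every reply you receive is metered in these small pieces.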

This happens millions of times every second across the internet.

Every time you use ChatGPT — tokens. Every time a company uses AI to answer customer support — tokens. Every time a doctor uses AI to analyse a report — tokens. Every time a developer uses AI to write code — tokens.

Tokens are the basic unit of all AI. Think of them like electricity — they power everything, but you never see them directly.


Why Does The Cost Of A Token Matter?

Here is where it gets interesting.

Every token costs money to produce.

When an AI generates a response for you, a computer somewhere has to do the calculation. That computer uses electricity. That computer cost someone money to buy. That computer needs engineers to run it.

All of that cost, divided by every token the computer produces — that is the cost per token.
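That division is the whole idea, so here it is as a few lines of Python. All the dollar figures below are invented for illustration — they are not Nvidia's or anyone else's real costs.

```python
# Toy illustration with made-up numbers: everything it costs to run
# one AI server for an hour, spread over every token it produces.
hardware_per_hour = 3.00       # amortised purchase price, $/hour
electricity_per_hour = 0.40    # power draw, $/hour
staff_per_hour = 0.60          # engineers and maintenance, $/hour

tokens_per_hour = 200_000_000  # tokens the machine generates in that hour

total_per_hour = hardware_per_hour + electricity_per_hour + staff_per_hour
cost_per_token = total_per_hour / tokens_per_hour
cost_per_million = cost_per_token * 1_000_000
print(f"${cost_per_million:.2f} per million tokens")  # $0.02 per million tokens
```

Change either number — make the machine cheaper to run, or make it produce more tokens per hour — and the cost per token falls. Everything in the rest of this story is one of those two levers.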

Right now, companies pay for AI in tokens. If you use the OpenAI API — you pay per million tokens. If you use Anthropic's Claude API — you pay per million tokens. If you use Google's Gemini API — same thing.

The cheaper the token, the cheaper it is to run AI.

And the cheaper it is to run AI, the more things become possible.

A doctor who could only afford to run AI on 100 patient reports a day — if tokens get cheaper — can now run it on 10,000 patient reports a day.

A startup that could not afford to add AI to its product — if tokens get cheaper — suddenly can.

A government that could not afford AI translation services for its citizens — if tokens get cheaper — suddenly can.

Token cost is not just a business metric. It is the key to how widely AI can spread across the world.


What Nvidia Is Claiming

At Cadence Live 2026 this week, Jensen Huang said something that turned heads.

He acknowledged that Nvidia's machines — the chips and hardware they sell — are extremely expensive. A single Nvidia AI system can cost millions of dollars.

But then he made the counterintuitive argument.

Even though the hardware is expensive — the tokens that hardware produces are the cheapest in the world.

How is that possible?

Think of it this way.

Imagine two factories that make shoes.

Factory A is cheap to build — it cost $1 million. But it can only make 100 pairs of shoes a day. So each pair of shoes costs a lot to produce.

Factory B is expensive to build — it cost $10 million. But it can make 100,000 pairs of shoes a day. So each individual pair of shoes is extremely cheap to produce.

Nvidia's argument is that their machines are Factory B. Yes, they cost more upfront. But they produce so many more tokens per hour, per day, per year — that each individual token ends up being far cheaper than on any competing hardware.
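The shoe-factory arithmetic is easy to check. One assumption is added here that the analogy leaves out: each factory's build cost is written off over some fixed lifetime (five years is picked arbitrarily below — any lifetime gives the same ratio between the two factories).

```python
DAYS = 5 * 365  # assume each factory's cost is spread over five years of output

def cost_per_pair(build_cost: float, pairs_per_day: int) -> float:
    """Build cost divided by every pair the factory makes over its lifetime."""
    return build_cost / (pairs_per_day * DAYS)

factory_a = cost_per_pair(1_000_000, 100)       # cheap factory, low output
factory_b = cost_per_pair(10_000_000, 100_000)  # expensive factory, huge output

print(f"Factory A: ${factory_a:.2f} per pair")
print(f"Factory B: ${factory_b:.4f} per pair")
print(f"Factory B is {factory_a / factory_b:.0f}x cheaper per pair")  # 100x
```

Factory B costs ten times more to build but makes a thousand times more shoes, so each pair comes out a hundred times cheaper. That is the shape of Nvidia's argument about its hardware.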


The Numbers Behind The Claim

This is not just a marketing claim. There are real numbers behind it.

Nvidia's latest chip platform is called Blackwell. The specific system Jensen is talking about is called the GB200 NVL72.

Here is what the numbers show:

The older Nvidia platform — called Hopper — costs about $1.41 per hour to run. The new Blackwell platform costs about $2.65 per hour to run. So Blackwell costs nearly double per hour.

But here is the twist.

Blackwell produces so many more tokens per hour that the cost per token ends up being 35 times lower than the older Hopper system.

You pay twice as much per hour. But you get 35 times more tokens for your money.

That is an extraordinary efficiency gain.

In real numbers — Nvidia's Blackwell system can now produce AI tokens for as little as two cents per million tokens. That is not per token — that is per million tokens. Two cents.

And a $5 million investment in a Blackwell system can generate $75 million in token revenue. That is a 15 times return on investment.
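The figures above hang together, and it is worth seeing how. If Blackwell costs about 1.9 times more per hour yet its tokens are 35 times cheaper, it must be producing roughly 35 × 1.9 ≈ 66 times more tokens per hour. A quick sketch, using only the numbers quoted in this article:

```python
# All inputs are the figures quoted in the article.
hopper_per_hour = 1.41      # $/hour, older Hopper platform
blackwell_per_hour = 2.65   # $/hour, GB200 NVL72 Blackwell platform
token_cost_ratio = 35       # Blackwell tokens are 35x cheaper per token

# Cost per token = hourly cost / tokens per hour, so a 35x lower token
# cost at a higher hourly price implies a proportionally higher throughput.
hourly_ratio = blackwell_per_hour / hopper_per_hour
implied_throughput_gain = token_cost_ratio * hourly_ratio
print(f"Implied throughput gain: {implied_throughput_gain:.0f}x")  # ~66x

# The return-on-investment figure quoted for a Blackwell deployment:
investment = 5_000_000
revenue = 75_000_000
print(f"Return: {revenue / investment:.0f}x")  # 15x
```

None of this proves the underlying claims — it only shows the quoted numbers are internally consistent, which is what makes the pitch hard to wave away.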

These are the numbers that have the entire AI industry paying attention.


The Printing Press Analogy

Nvidia uses a great analogy to explain this.

Think about a high-speed printing press.

An old printing press might print 1,000 pages per hour. A modern industrial press might print 100,000 pages per hour.

Yes, the modern press costs more to buy. But the cost to print each individual page is much lower — because so many more pages are coming out every hour.

AI chips work the same way. The more tokens a chip can produce in an hour — the cheaper each token becomes.

Nvidia's Blackwell chips produce tokens so fast and so efficiently that even though they cost more to buy, the cost of each token ends up being the lowest in the industry.


How Did Nvidia Get Here?

This did not happen by accident.

Nvidia has spent years building what Jensen Huang calls a full-stack approach to AI.

Most chip companies just make chips. They sell you the hardware and let you figure out the rest.

Nvidia does something different. They build the chip. They build the software that runs on the chip. They build the tools that make the software run better. They work directly with the biggest AI companies — OpenAI, Meta, Google, Anthropic — to make sure the most popular AI models run as efficiently as possible on their hardware.

This means every part of the system — from the chip design to the software to the AI models — is optimised together. Not separately.

The result is that on Nvidia hardware, the same AI model runs faster and more efficiently than it would on any other hardware.

Nvidia also works closely with open-source AI tools like vLLM, SGLang, and TensorRT-LLM — continuously making improvements so that existing Nvidia hardware gets faster over time even without buying new chips.

That is rare. Usually when you buy hardware, its performance is fixed. With Nvidia, the software keeps improving — so the same hardware you bought last year produces more tokens today than it did when you first bought it.


What Is Coming Next — Rubin

And Nvidia is not stopping at Blackwell.

In January 2026, Nvidia announced their next platform — called Rubin. Named after Vera Florence Cooper Rubin, an astronomer whose discoveries changed our understanding of the universe.

Rubin is expected to deliver up to 10 times lower token cost compared to Blackwell. And Blackwell already delivers 35 times lower cost than the previous generation.

The companies that will be among the first to deploy Rubin-based systems include AWS, Google Cloud, Microsoft, and CoreWeave — all expected in the second half of 2026.

AI labs including OpenAI, Meta, Anthropic, and Mistral are already looking at the Rubin platform for training their next generation of models.

The cost of every AI token is about to drop again. Dramatically.


What Does This Mean For You?

You might be thinking — this sounds like a story about expensive data centres and business metrics. What does it have to do with me?

The answer is — everything.

Every time the cost of a token drops, AI gets more affordable for everyone. Here is what that means in practice:

Cheaper AI tools — The apps and AI tools you use every day get cheaper or get more features for the same price. When tokens are cheap, companies can afford to give you more.

AI reaches more countries — Right now, most advanced AI is used in rich countries because it is expensive to run. Cheaper tokens mean AI becomes accessible to hospitals, schools, and businesses in developing countries too.

More AI in everyday products — Your bank's customer service. Your phone's camera. Your navigation app. The reason these do not all have powerful AI yet is cost. Cheaper tokens change that equation.

Better AI models — When it costs less to run AI, researchers can afford to test more ideas, train more versions, and improve faster.

The falling cost of the token is one of the most important trends in technology right now. It just happens to be invisible to most people.


The Simple Summary

Here is everything in plain words:

A token is the basic unit of AI — every word, every response, every AI interaction is made of tokens.

Tokens cost money to produce. The cheaper the token, the more AI can be used by more people and more businesses.

Nvidia CEO Jensen Huang just said — and backed it up with real numbers — that Nvidia produces the cheapest tokens in the world today.

Their Blackwell chip platform costs 35 times less per token than their previous generation. Their next platform, Rubin, is expected to cut costs another 10 times on top of that.

The result? AI that costs almost nothing to run — opening the door to a world where AI is not just for big tech companies but for everyone.

The machines are expensive. The tokens are not.

And the tokens are what actually matter.


#nvidia #jensen-huang #token-cost #nvidia-blackwell #nvidia-rubin #ai-infrastructure-2026 #cheapest-ai-tokens
