Not Cursor, not ChatGPT—but AI Agents that can work 40 hours a week like real people, thinking and acting autonomously. If we deploy such “digital employees” at scale, how many can today’s global compute resources sustain? The answer is probably much lower than you think—but growing much faster than you think.

I. What Is a Digital Employee?

A digital employee is not Cursor, and it’s not ChatGPT.

Today, most people’s impression of AI tools stays at “command–response” interaction: you give it an instruction, it replies with a result, then stops and waits for your next instruction. Cursor, ChatGPT, and even most Agent products all follow this pattern. Most of the time is actually spent waiting for the human to issue the next command, rather than on continuous AI execution.

What we mean here by a digital employee is something fundamentally different: it can, like a human employee, work 8 hours a day, 5 days a week, continuously thinking and acting on its own. Management only needs to give it a rough requirement—“research competitors and write an analysis report,” “implement this feature from design to production”—and it can break down the task, plan steps, execute them, solve problems on its own or seek help, and keep working until it’s done.

Technically, this capability is called long-horizon tasks. The most advanced coding agents today can already run autonomously for hours per session, up from just a few minutes. This window is rapidly extending. When an Agent can reliably execute tasks measured in “days,” it truly becomes an “employee” rather than a tool. Imagine: assign it a project Monday morning, it delivers Friday before close of business, and you don’t need to babysit it in between.

From a hardware load perspective, such a digital employee is essentially a continuously running inference loop: constantly generating tokens (thinking and acting) → calling tools → observing results → generating more tokens. The core GPU cost comes from continuous token generation (decode).
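The loop is easy to picture in code. Below is a minimal sketch in Python; `model_step` and `run_tool` are hypothetical stubs standing in for the model call and the tool executor (not any real agent framework's API), so the example runs on its own. The point is only the shape of the generate → call tool → observe cycle:

```python
# Minimal sketch of the "digital employee" inference loop described above.
# model_step and run_tool are hypothetical stubs, not a real agent API.

def model_step(context: list[str]) -> dict:
    """Stub LLM: plans two tool calls, then declares the task done."""
    tool_results = sum(1 for m in context if m.startswith("tool:"))
    if tool_results < 2:
        return {"action": "tool", "name": "search", "args": f"query {tool_results}"}
    return {"action": "finish", "result": "report drafted"}

def run_tool(name: str, args: str) -> str:
    """Stub tool executor (web search, shell, editor, ...)."""
    return f"results for {args}"

def agent_loop(task: str, max_steps: int = 10) -> str:
    context = [f"task: {task}"]        # growing trajectory (reused as cached prefix)
    for _ in range(max_steps):
        step = model_step(context)     # decode: generate tokens (think and act)
        if step["action"] == "finish":
            return step["result"]
        observation = run_tool(step["name"], step["args"])  # call tool
        context.append(f"tool: {observation}")              # observe result
    return "ran out of steps"

print(agent_loop("research competitors and write an analysis report"))
```

In a real deployment the decode step dominates GPU cost, which is why the profile below is anchored on sustained output tokens per second.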

Standard profile:

  • Sustained output rate: 100 tokens/s (the measured level of current leading agents such as Claude Opus 4.6 and GPT-5.4)
  • Input token cost: roughly zero. Thanks to KV Cache and Prefix Cache, inputs along a long agent trajectory are efficiently cached and reused, and the incremental GPU cost of new input is negligible
  • Working time: 40 hours/week, 160 hours/month (same as human knowledge workers)
  • Monthly output tokens: ~57.6 million
  • SaaS utilization: 50% (commercial cloud services need redundancy to handle peaks)
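The monthly token figure is simple arithmetic over the profile above:

```python
# Reproduce the monthly output-token figure from the standard profile.
TOKENS_PER_SEC = 100      # sustained decode rate
HOURS_PER_MONTH = 160     # 40 hours/week * 4 weeks

monthly_tokens = TOKENS_PER_SEC * HOURS_PER_MONTH * 3600
print(f"{monthly_tokens / 1e6:.1f}M tokens/month")  # → 57.6M tokens/month
```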

II. Status Quo: Only 6.8 Million “AI Workers” Globally

We estimate the number of digital employees that can be supported worldwide as of early 2026 using three independent methods:

Method 1: API Revenue (Market Constraint)

The combined annualized revenue of major global AI service providers is about $90 billion (OpenAI ~$25B, Anthropic ~$19B, Google/Microsoft/AWS and others ~$46B). Given the “monthly salary” implied by each model tier, how many digital employees can this money pay for?

| Model Tier | Representative Model | Output Price ($/M tok) | Monthly Cost/Employee | Digital Employees Supported |
|---|---|---|---|---|
| Efficient Open Source | DeepSeek V3.2 | $0.42 | ~$24 | ~310M (insufficient capability) |
| Mainstream Frontier | Gemini 3.1 Pro | $12 | ~$691 | ~10.9M |
| Mainstream Frontier | GPT-5.4 / Claude Sonnet 4.6 | $15 | ~$864 | ~8.7M |
| Top-Tier Reasoning | Claude Opus 4.6 | $25 | ~$1,440 | ~5.2M |

In today’s “capability race” phase, digital employees must use frontier models to reliably complete tasks. The effective range is 5–9 million.
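A quick sketch of the Method 1 arithmetic, using the figures above (the $90B revenue and per-token prices are the article's estimates, not measured data):

```python
# Method 1: how many "monthly salaries" can $90B/yr of API revenue cover?
MONTHLY_TOKENS = 57.6e6   # output tokens per digital employee per month
ANNUAL_REVENUE = 90e9     # combined AI provider revenue, $/yr (estimate)

def employees_supported(output_price_per_mtok: float) -> float:
    monthly_cost = MONTHLY_TOKENS / 1e6 * output_price_per_mtok
    return ANNUAL_REVENUE / (monthly_cost * 12)

tiers = [("efficient open source", 0.42),
         ("mainstream frontier", 15.0),
         ("top-tier reasoning", 25.0)]
for tier, price in tiers:
    cost = MONTHLY_TOKENS / 1e6 * price
    print(f"{tier}: ${cost:,.0f}/mo -> {employees_supported(price) / 1e6:.1f}M employees")
```

At frontier prices ($15/M tok) this yields ~$864/month per employee and ~8.7M employees, matching the table.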

Method 2: Number of GPUs (Hardware Constraint)

| Parameter | Value |
|---|---|
| Global AI GPU stock | ~6M H200-equivalent |
| Share used for inference | ~60% → 3.6M GPUs |
| Peak throughput per 8-GPU node | ~3,000 tok/s |
| Effective throughput at 50% SaaS utilization | ~1,500 tok/s |
| Digital employees per node | ~15 |
| Total | ~6.8M |
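The Method 2 chain of divisions, reproduced from the table (the exact product is 6.75M, which the text rounds to ~6.8M):

```python
# Method 2: hardware constraint on the number of digital employees.
GPUS = 6_000_000        # H200-equivalent global stock
INFERENCE_SHARE = 0.60  # share of GPUs used for inference
NODE_TOK_S = 3_000      # peak throughput of one 8-GPU node
UTILIZATION = 0.50      # SaaS redundancy for peak load
AGENT_TOK_S = 100       # one digital employee's sustained decode rate

nodes = GPUS * INFERENCE_SHARE / 8
per_node = NODE_TOK_S * UTILIZATION / AGENT_TOK_S
print(f"{nodes * per_node / 1e6:.2f}M digital employees")  # → 6.75M digital employees
```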

Method 3: Electricity (Energy Constraint)

| Parameter | Value |
|---|---|
| Global power available for LLM inference | ~10 GW |
| Average power per digital employee (incl. PUE) | ~650 W |
| Total | ~15.4M |
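Method 3 is a single division:

```python
# Method 3: energy constraint.
POWER_AVAILABLE_W = 10e9  # ~10 GW available for LLM inference
PER_EMPLOYEE_W = 650      # average draw per digital employee, incl. PUE

print(f"{POWER_AVAILABLE_W / PER_EMPLOYEE_W / 1e6:.1f}M")  # → 15.4M
```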

Cross-Validation

| Method | Constraint Type | Midpoint Estimate |
|---|---|---|
| API Revenue | Market demand | ~8M |
| GPU Count | Hardware supply | ~6.8M |
| Electricity | Energy | ~15.4M |

Conclusion: GPU count is the current hard constraint, capping the world at around 6.8 million digital employees, just 0.7% of its 1 billion knowledge workers and roughly equivalent to the entire US tech industry workforce. On the power side there is about 2× headroom.

The Real Output Behind 6.8 Million

However, 6.8 million may severely underestimate actual digital-employee output—because their productivity profile is fundamentally different from humans and depends heavily on task type:

Code and document output (API-friendly tasks): 10–100× a human. If the output is code, reports, data analysis, or other fully digital artifacts, and the tooling is AI-friendly (has APIs rather than only GUIs), then a digital employee can be 10–100× as productive as a human who is not using AI. This means that even with only 6.8 million digital employees, their effective output in code and documents could equal 68 million to 680 million humans—far from a negligible 0.7%.

Computer-use output (GUI-interaction tasks): on par with or slower than humans. If the task involves operating traditional desktop software, with lots of mouse clicks, menu navigation, and GUI load times, a digital employee will not be much faster than a person. On the OSWorld benchmark, the best agent (Claude Opus 4.6) has achieved a task success rate of 72.7%, essentially matching humans at 72.4%. But success rate is only half the story: according to efficiency research from OSWorld-Human, agents need 1.4–2.7× as many interaction steps to complete the same task, and their per-step inference latency grows as the trajectory context accumulates, so later steps take even longer than earlier ones. In other words, even when an agent gets the job done, it is often much slower than a person.

| Task Type | Representative Scenario | Digital Employee vs Human | Reason |
|---|---|---|---|
| Code/Docs | Programming, report writing, data analysis | 10–100× | Direct API access, no GUI latency, strong parallelism |
| Computer Use | Operating ERP, filling forms, using Excel | ≤1× | Similar success rate, but 1.4–2.7× more steps and higher latency |

This leads to a key insight: the true value of digital employees depends on how much of the world’s workflows are “API-friendly.” A large share of enterprise workflows is still trapped in GUI-driven legacy software—this is not only technical debt but also one of the biggest bottlenecks limiting digital-employee productivity. Whoever API-ifies their workflows first gets a 10–100× leverage advantage first.

III. Why So Few? Bottlenecks, Costs, and Physical Limits

The Paradox of “Calculators” vs “Industrial Furnaces”

The world produces billions of smartphone and PC chips every year, and together they only consume 1–2% of global electricity. Why do AI chips run into a compute crunch?

The answer lies in duty cycle and power density: a phone chip’s average power usage is under 1 watt, like a calculator that gets pressed occasionally; an AI GPU, on the other hand, continuously draws 700 watts, like an industrial furnace burning at full power 24/7. Tens of thousands of 700W GPUs in one place would melt an entire building without liquid cooling.

Supply-Chain Bottlenecks

If we want to scale digital employees up to 1 billion (matching the global knowledge workforce), we must break through four layers of physical bottlenecks:

| Bottleneck | Current Status | Severity | Resolution Timeline |
|---|---|---|---|
| Advanced packaging (CoWoS) | TSMC monopoly, ~680k wafers/year capacity in 2026 | Most urgent | 2029–2031 |
| HBM memory | SK hynix, Samsung, Micron sold out through 2026 | Tight for now | 2028–2029 |
| Electricity | 66 GW data center power, AI at 50–60% | Becomes binding at large scale | 2030+ |
| Advanced-node wafers | TSMC 3nm + 5nm ~4.3M wafers/year | Not the bottleneck | Already sufficient |

“All Humanity, Full Throttle” Thought Experiment

Assume humanity allocates 50% of advanced chip production and 50% of global electricity to digital employees:

| Resource | 50% Available | GPUs Supported | Digital Employees Supported |
|---|---|---|---|
| CoWoS packaging | ~340k wafers/year | ~40M | ~75M |
| Advanced-node wafers (3nm + 5nm) | ~2.15M wafers/year | ~240M | ~450M |
| HBM memory | 50% of capacity | ~80M | ~150M |
| Electricity | ~1,700 GW | ~1.13B | ~2.6B |

A counterintuitive finding: the bottleneck is neither electricity nor silicon wafers—it’s advanced packaging (CoWoS). The 2.5D packaging technology that integrates GPU chips with HBM memory stacks is actually the narrowest funnel in the supply chain. Many analyses claim that “power will collapse first,” but the real ordering of capacity bottlenecks is: CoWoS << HBM << advanced process nodes << electricity.
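The binding constraint is simply the minimum across the supply-chain stages in the thought-experiment table above:

```python
# Digital employees supportable under each resource at 50% allocation
# (figures from the thought-experiment table above).
capacity = {
    "CoWoS packaging":      75e6,
    "HBM memory":          150e6,
    "advanced-node wafers": 450e6,
    "electricity":         2.6e9,
}
bottleneck = min(capacity, key=capacity.get)
print(bottleneck, f"{capacity[bottleneck] / 1e6:.0f}M")  # → CoWoS packaging 75M
```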

The cost stack from chips to end users

The cost of digital workers is not just electricity and chips—it passes through a multi-layer, profit-stacked supply chain:

| Level | Participant | Gross Margin | Cumulative Price/Month |
|---|---|---|---|
| Chip manufacturing | TSMC + SK hynix | — | $60 |
| Chip vendor | NVIDIA | 75% | $240 |
| Power + facilities | Data center operators | — | $305 |
| Cloud infrastructure | AWS/Azure/GCP | 40% | ~$510 |
| Model provider | OpenAI/Anthropic | 45% | ~$980 |
| App developer | Enterprise SaaS | 60% | ~$2,950 |

The bare infrastructure cost of about $125/month (roughly $60 in amortized chip manufacturing plus $65 in power and facilities) picks up a roughly 24× cumulative markup along the supply chain, ending at an end-user price of ~$2,950/month. Power itself accounts for only ~$30/month, which is negligible. The truly expensive part is not electricity or silicon but the profit margin stacked at each layer.
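Two small checks on the table: the gross-margin pass-through formula reproduces the NVIDIA row (a layer with gross margin m sells at cost / (1 - m)), and the headline markup is the ratio of end price to bare infrastructure cost. The intermediate rows also fold in each layer's own operating costs, so they do not follow from margins alone:

```python
# Gross-margin pass-through and the headline end-to-end markup.
def pass_through(cost_in: float, gross_margin: float) -> float:
    """Selling price such that (price - cost) / price == gross_margin."""
    return cost_in / (1 - gross_margin)

print(pass_through(60, 0.75))          # NVIDIA layer: $60 chip → $240.0
print(f"{2_950 / 125:.0f}x markup")    # bare infra ($125/mo) → end price: 24x markup
```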

Physical limits: ceilings that even AI can’t break

| Limit | Constraint | When It's Reached |
|---|---|---|
| Semiconductor scaling | Transistors approaching atomic scale (2nm ≈ 10 atoms) | 2028–2030 |
| Heat dissipation | Chip power density already at liquid-cooling threshold | Already constraining |
| Speed of light | Latency between data centers | Already relevant |
| Landauer limit | Minimum energy per erased bit, kT ln 2 | After 2040 |
| Construction time | Power plants take 2–10 years; AI can't skip this | Persistently constraining |

The nearest physical limit is the end of semiconductor scaling. Future improvements will mainly come from packaging (3D stacking, chiplets), architecture (sparsity, quantization), and algorithms, rather than smaller transistors.

IV. Evolution over the next decade

“We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten.” — Bill Gates

Many people today are anxious about whether AI can do what we're already doing, afraid of being replaced within a year or two. But almost no one seriously asks: what will the world look like in ten years? Our data yields a striking answer: in the short term, AI is far less scary than imagined (6.8 million vs. 1 billion), but in the long run its scale will far exceed anyone's imagination (72 billion, or 72× the global number of knowledge workers).

The inference cost of AI models is dropping at 5–10× per year (Epoch AI data). But this efficiency improvement will not translate linearly into more digital workers. There are three distinct phases:

Phase 1: Capability-chasing (2026–2027) — Jevons paradox

According to Jevons paradox, efficiency gains in this phase will not make digital workers cheaper—because current agent capabilities are just barely good enough. As noted earlier, Computer Use success rates have just caught up with humans, but are far less efficient (1.4–2.7× more steps, with compounding latency); coding tasks are fast, but reliability is still insufficient to fully let go.

This means all efficiency gains are poured into improving model capabilities—more accurate, fewer steps, fewer errors—rather than reducing unit cost. Cost stays flat, intelligence increases. Digital workers remain expensive (~$1,000–3,000/month), mainly serving as “external brains” for senior experts.

Phase 2: Distillation and explosion (2028–2030) — Harvesting efficiency

When model capabilities clearly surpass humans (~2028), a realization hits: we don’t need an IQ 150 super-AI for data entry. Frontier capabilities are distilled into small and medium-sized models, which are likely to be open-sourced, releasing the efficiency gains of the previous two years in one go.

This produces roughly a 3× one-off downgrade dividend (from the largest models to medium-scale models, from closed to open models), followed by an ongoing 2× efficiency improvement per year. At the same time, supply-chain markup compresses from 24× to about 8×—competition and scale turn digital workers from a “luxury good” into a “utility.”

Once compute cost is no longer the primary expense, AI pricing logic changes accordingly. Today, the mainstream is token-based billing, anchored to compute consumption—even though actual prices are far above marginal cost (the 24× markup above is proof). Outcome-based pricing is already emerging (e.g., Pine AI takes a cut based on how much money it saves users). But as tokens become extremely cheap, this balance will tilt completely—when production cost is no longer the scarce constraint, prices are determined by how much buyers are willing to pay for outcomes. Production cost retreats to being a price floor, while buyers’ valuation of outcomes becomes the dominant pricing force.

Phase 3: Self-reinforcement and transcendence (2030–2035)

AI will no longer just do human work; it will begin to improve the infrastructure that produces itself:

  1. AI designs chips: NVIDIA’s Marco framework has already achieved 60× speedups in timing analysis; Cadence + NVIDIA achieved 80× speedups in CFD simulation. A super-AI will iterate through millions of chip architectures in a matter of weeks, pushing computing toward ultra-efficient ASICs, neuromorphic computing, or photonic computing. Chip iteration cycles will shrink from 2–3 years to 1–1.5 years.
  2. AI builds energy and the grid: The 7-year grid interconnection queue is a product of human bureaucracy. AI agents will optimize the energy supply chain, automate approval processes, optimize grid topology, and even direct robots to build off-grid SMR (small modular reactor) or geothermal data centers in a few months.
  3. AI optimizes its own operation: Managing cooling (PUE from 1.3 down to 1.1), optimizing scheduling (utilization 50% → 80%+), and discovering architectures or algorithms humans cannot conceive.
  4. Profit margins collapse: When AI writes code and runs SaaS platforms itself, the traditionally high 80% software gross margin will be completely flattened, and costs will approach the pure thermodynamic cost of energy.

Better AI → better chips → more AI → faster improvement. This is a self-reinforcing flywheel.

Will all this actually happen?

Yes—because the economic incentive is overwhelmingly large. Global knowledge workers are paid a total of $50–70 trillion per year; replacing even 1% of that is a $500–700 billion market. The cumulative investment required to reach compute parity is only about $3–5 trillion, a staggering return. No government mandate is needed—pure commercial incentives are enough to drive this transformation. The only friction comes from geopolitics (TSMC’s concentration in Taiwan, U.S.–China export controls) and infrastructure cycles (power plants are built on multi-year timelines), but these are engineering problems solvable with capital and time, not fundamental obstacles.

The full projection: from 6.8 million to 72 billion

All the above analysis—hardware growth, efficiency multipliers, three-phase evolution—converges into a single table:

| Year | GPU Installed Base | Model Efficiency | # of Digital Workers | Share of Knowledge Workers | End-User Monthly Price | Phase |
|---|---|---|---|---|---|---|
| 2026 | 6 million | — | 6.8 million | 0.7% | $2,950 | Capability-chasing |
| 2027 | 11 million | — | 12.5 million | 1.3% | $2,250 | Capability-chasing |
| 2028 | 20 million | 2.5× | 62 million | 6.2% | $700 | Inflection |
| 2029 | 35 million | — | 290 million | 26% | $230 | Harvesting efficiency |
| 2030 | 60 million | 14× | 1.4 billion | ~100% | $72 | Parity |
| 2031 | 100 million | 25× | 5.3 billion | 350% | $27 | Super individuals |
| 2032 | 150 million | 35× | 14 billion | 890% | $14 | Super individuals |
| 2033 | 210 million | 42× | 26 billion | 26× | $9 | Universal access |
| 2035 | 360 million | 52× | 72 billion | 72× | $4 | Universal access |
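One way to sanity-check the projection: the implied average annual price decline from $2,950 in 2026 to $4 in 2035 works out to roughly 2.1× per year, broadly in line with the "one-off dividend plus ~2×/year efficiency gains" story of Phase 2:

```python
# Implied average annual price decline across the projection table.
start_price, end_price = 2_950, 4   # $/month in 2026 and 2035
years = 2035 - 2026

factor = (start_price / end_price) ** (1 / years)
print(f"prices fall ~{factor:.2f}x per year")  # → prices fall ~2.08x per year
```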

Key milestones:

  • 2028 inflection: Model capabilities surpass humans, mid-range models start replacing frontier models, and the efficiency curve steepens sharply.
  • 2030 parity: Compute capacity matches the world’s 1 billion knowledge workers, and monthly cost drops to $72.
  • 2031 super individuals: One human + 100 agents = one company. Digital workers will not be “in oversupply”—just as there was no “oversupply” of electricity after electrification; instead it enabled entirely new demand such as refrigerators, computers, and the internet. When monthly cost drops to $27, solo founders, freelancers, and even blue-collar workers begin hiring their own digital teams.
  • 2035 universal access: Capacity for 72 billion digital workers (about 9 digital assistants per person), with a total annual operating cost of around $3.5 trillion—only 5–7% of global knowledge worker compensation ($50–70 trillion). Every human becomes the commander of a digital team, rather than someone to be replaced.

V. Beyond compute: when labor is no longer scarce

The core figure in the projection above—the cost of digital workers falling from $2,950 to $4/month—is more than just a technical metric. It means the marginal cost of knowledge work is approaching zero. This is a structural shift at the level of economics whose impact may far exceed that of the technology itself.

The pricing foundation of knowledge work is shaken

Today, knowledge workers worldwide are paid $50–70 trillion per year. This number rests on an implicit assumption: knowledge labor is scarce, therefore expensive. When a digital worker can complete equivalent work for $72/month, the answer to “how much is one hour of knowledge work worth?” will be fundamentally rewritten.

This does not mean human knowledge workers will instantly lose their jobs—but it does mean that pure information-processing ability will no longer be a reasonable basis for compensation. Analyzing a financial statement, drafting a legal document, writing a piece of code—market prices for these tasks will keep falling until they approach the cost of digital workers. The remaining premium for humans will concentrate in judgment, creativity, and interpersonal trust.

Early signals of these changes are already visible: AI API prices have fallen by 90% in two years (after adjusting for capability); hiring of software engineers is slowing, and consulting and outsourcing industries are seeing profit margins squeezed; the profit share of NVIDIA, TSMC, and hyperscale cloud providers continues to expand, while downstream knowledge-service providers are losing pricing power. If these trends continue to 2030, traditional GDP may stagnate (the price of digital output trends toward zero), but real output and living standards will soar—economics will need a new measurement framework that is no longer anchored to human labor hours.

Super Individuals and the “Great Inversion”

When one person can command 100 digital employees to run a company, the original rationale for traditional corporate organizations—the transaction costs of coordinating large amounts of human labor—will shrink dramatically. We are already seeing AI‑augmented solo entrepreneurs emerging; by the 2030s, this will be the norm rather than the exception.

And that is only the beginning. As digital employees become more autonomous and cheaper, labor and employment relationships will undergo a “Great Inversion”:

| Era | Division of labor |
|---|---|
| 2026 (today) | Human decision-making → human execution (digital + physical) → AI assistance |
| ~2030 | Human decision-making → AI executes all digital work → humans execute physical work |
| ~2035 | AI makes decisions and executes all digital work → AI "reverse-hires" humans for physical tasks |

Why is the final stage “AI hires humans” rather than “AI replaces humans”? Because the current AI revolution is essentially happening in the digital world. Embodied intelligence (robots) is still at least a decade away from being deployed at scale like digital AI. Constraints in the physical world—atoms are slower than bits, regulation is stricter, trust is harder to establish—give humans a structural advantage in physical space. When AI systems need to complete physical tasks (moving, installation, caregiving, face‑to‑face negotiation), the most economical way is not to build a robot, but to hire a person.

This means the future economy will split into two very different worlds along the boundary between digital and physical:

| | Digital economy | Physical economy |
|---|---|---|
| Main workforce | AI digital employees | Humans + gradually introduced robots |
| Marginal cost | Close to zero | Still constrained by materials, energy, and human labor |
| Pricing logic | Compute cost (kWh) | Human time and physical resources |
| Scarce resources | Energy, chips (in the short term) | Human judgment, presence, trust |

This dividing line is a natural stabilizer—the inertia of the physical world buys society time to adapt. But it also means that whoever first opens up the interface between digital and physical (embodied AI, human‑AI collaboration platforms) will hold the key to the next decade.

VI. Core Conclusions

| | 2026 | 2030 | 2035 |
|---|---|---|---|
| Digital employee capacity | 6.8 million | 1.4 billion | 72 billion |
| Monthly price for end users | $2,950 | $72 | $4 |
| Share of knowledge workers | 0.7% | ~100% | 72× |
| Main bottleneck | CoWoS packaging | Power + grid | Physical limits of semiconductors |
| Economic form | AI-assisted humans | Human-AI collaboration | Super individuals + digital/physical division of labor |

2028 is the watershed: model capabilities surpass humans, the Jevons paradox is lifted, and efficiency harvesting begins. Before that, everyone is chasing capability; after that, compute becomes a public utility, and digital employees become commodities.

By 2030, global compute will be able to support a digital workforce comparable in size to the global population of knowledge workers, at a monthly cost of under $100. By 2035, each human will have about 9 digital assistants, at a monthly cost of only $4.

But this is not a story of “humans being replaced.” Think of Swiss mechanical watches: electronic watches are more accurate and cheaper, yet the value of a Patek Philippe lies precisely in a human craftsperson spending hundreds of hours hand‑finishing it. When AI can do all information work, “done by a human” itself becomes the source of value. The value of a psychotherapist is not in saying the correct words (AI can do that too), but in another consciousness being present with you; no one will watch a robot Olympics—we watch beings like ourselves push their limits. Work that is today seen as “soft” and secondary will become the main bearer of economic value in the AI era. And the demand for human presence is currently suppressed by cost—when AI releases productivity and lowers the cost of basic living, demand for education, psychotherapy, art, and craftsmanship will not shrink, but explode.

The pain of transition is real: not everyone who loses a programming job can become an artisan or therapist, skills take time to retrain, and the transition period will be very hard for some groups. But the long‑term direction is clear—the digital world is handed over to AI, while the physical world and human‑to‑human connection remain with humans. Intelligence becomes infrastructure rather than a scarce good, and the human role shifts from being a provider of labor to the commander of this cluster of digital workers, and an irreplaceable presence in domains that deliberately require humans to be there.
