
AI's invisible bottleneck: why even US-made chips do a round trip to Taiwan

There is a technical detail that almost nobody outside the semiconductor industry understands, yet it defines the real pace of every new generation of AI models: CoWoS, short for Chip on Wafer on Substrate.

It is TSMC's advanced packaging technology, which bonds the logic die (the GPU) and stacks of high-bandwidth memory (HBM) onto the same 2.5D package. Without CoWoS, models like GPT-5.5 and Claude Opus 4.7 simply would not exist at their current scale: the system could not feed the GPU with data fast enough.

The memory wall, and how CoWoS broke through it

For decades, GPUs and CPUs have been limited by something called the memory wall: conventional RAM sits far from the processor, connected over relatively narrow channels, so every memory access costs hundreds of clock cycles of latency.

CoWoS attacks this with an extra layer called a silicon interposer, essentially a high-density routing die that places the HBM stacks millimeters from the GPU. Centimeters of board trace become millimeters of silicon, and thousands of short, wide interconnects multiply the memory bandwidth available to the chip.
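The effect of that bandwidth jump can be illustrated with a quick back-of-the-envelope roofline calculation. The numbers below are rough, illustrative ballpark figures (not vendor specs): roughly 100 GB/s for off-package DRAM versus roughly 3 TB/s for HBM on an interposer, against a hypothetical 1,000 TFLOP/s accelerator.

```python
# Back-of-the-envelope roofline: how memory bandwidth caps usable compute.
# All figures are rough, illustrative ballpark numbers, not vendor specs.

def attainable_tflops(peak_tflops: float, bandwidth_tbps: float,
                      arithmetic_intensity: float) -> float:
    """Roofline model: achievable throughput is the lesser of peak
    compute and (memory bandwidth x FLOPs performed per byte moved)."""
    return min(peak_tflops, bandwidth_tbps * arithmetic_intensity)

PEAK = 1000.0   # hypothetical accelerator peak, TFLOP/s
DDR_BW = 0.1    # ~100 GB/s from off-package DRAM (0.1 TB/s)
HBM_BW = 3.0    # ~3 TB/s from HBM stacks on a CoWoS interposer

# Big-batch matrix multiplies run at high arithmetic intensity;
# token-by-token LLM decoding can be close to 1 FLOP per byte.
for ai in (1, 10, 100):
    ddr = attainable_tflops(PEAK, DDR_BW, ai)
    hbm = attainable_tflops(PEAK, HBM_BW, ai)
    print(f"AI={ai:>3} FLOP/B  DDR: {ddr:7.1f} TFLOP/s  HBM: {hbm:7.1f} TFLOP/s")
```

At low arithmetic intensity, the off-package configuration leaves more than 99% of the compute idle, which is the memory wall in one line of arithmetic; the interposer raises the bandwidth term by roughly 30x.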

Nvidia's Blackwell was the first commercial product built on the new CoWoS-L generation. And here is the political part: Nvidia reserved most of the capacity on this new line, which means that even if Anthropic, Meta, or OpenAI wanted their own accelerators in this packaging, they could not get it. Jensen Huang got there first with a bigger check.

Why "Made in USA" is not enough

Here is the twist. Even chips manufactured in Arizona need to travel back to Taiwan for packaging.

  • Wafer fabrication (turning sand into a logic chip) — can be done in Arizona
  • Advanced packaging (CoWoS, joining GPU + HBM) — only Taiwan has the capacity

Result: every H200 or B200 built from US-fabbed wafers currently makes the round trip Arizona → Taiwan → world. That adds weeks to lead times and a geopolitical dependency that Arizona's billions in capex have not yet solved.

What this means for you

Three practical effects in the short term:

  • Enterprise GPU prices stay high while CoWoS is the bottleneck
  • Hyperscalers (AWS, Azure, GCP) negotiate priority access to CoWoS capacity — anyone renting cloud GPUs pays this invisible premium
  • Alternative architectures (Groq, Cerebras) that do not depend on CoWoS gain relevance for specific workloads

For developers and creators who depend on inference: it is worth having a plan B. Not for today, but for the next 12-18 months, in case tensions between Taiwan and China escalate.

Sources

  • CNBC (April 8, 2026): AI's next bottleneck: Why even the best chips made in the U.S. take a round trip to Taiwan
  • Digitimes (April 7, 2026): Global AI chip suppliers compete as TSMC remains top foundry partner
  • Next Platform (April 20, 2026): AI Will Soon Drive A Third Of TSMC's Business