Beyond Text: Why AI Agents Need Interactive UI

Published: 2026-05-15

The Text Trap

ChatGPT, Claude, Gemini — they all have the same bottleneck. You ask for a calculation, they respond:

99.5 × 3 = 298.5

Correct — but inert. Change 3 to 4? New round. Add tax? Another round. Charts, forms, sliders — all flattened to text you can't interact with.

The ecosystem solved input (MCP, Function Calling, REST). Output is still text.

Yes, You Can Already Embed an Iframe

ERI (Embedded Result Interface) is an open spec that turns ad-hoc iframe embedding into a repeatable pattern: your web app appears as interactive UI inside any Agent conversation, via a skill.md and a URL.

So why a spec? Because without a convention, every Agent-provider integration is ad-hoc. One Agent embeds your URL with the data in the hash. Another puts it in query params. A third forgets to include a text fallback for platforms that strip iframes. Your API gets called differently each time, and your embed page has no consistent contract.

ERI defines the contract: a skill.md instructs the Agent to call a specific API, encode the result, embed a specific URL, and include a plain text fallback. Same pattern, every Agent, every platform that renders iframes. The result is consistent and discoverable — not a lucky coincidence.
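To make the contract concrete, here is a minimal sketch of the encode, embed, and fallback steps. It rests on several assumptions not confirmed by the text above: the result is JSON encoded as unpadded base64url, the data travels in the URL hash under a `data` key, and the embed page lives at the hypothetical `https://example.com/embed`. The function names are illustrative only; the ERI spec defines the actual encoding and URL shape.

```python
import base64
import json
from html import escape

# Hypothetical embed page URL; a real provider would use its own.
EMBED_BASE = "https://example.com/embed"


def encode_payload(data: dict) -> str:
    """Serialize the API result as compact JSON, then unpadded base64url (assumed encoding)."""
    raw = json.dumps(data, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_payload(token: str) -> dict:
    """Inverse of encode_payload: what the embed page would do on load."""
    padded = token + "=" * (-len(token) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def build_embed(data: dict, fallback: str) -> str:
    """Return iframe markup followed by the plain text fallback."""
    # Assumption: data goes in the hash fragment under a "data" key.
    url = f"{EMBED_BASE}#data={encode_payload(data)}"
    return f'<iframe src="{escape(url, quote=True)}"></iframe>\n\n{fallback}'


# Stand-in for the result of the provider's API call.
result = {"price": 99.5, "quantity": 3, "total": 298.5}
message = build_embed(result, "99.5 × 3 = 298.5")
```

On platforms that strip iframes, the reader still sees the last line of `message`; everywhere else, the embed page decodes the hash and renders the interactive view.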

The Landscape

Three approaches exist for interactive Agent output: ERI, MCP Apps, and A2UI. ERI works on most platforms today using only what they already have: iframe rendering. MCP Apps and A2UI offer richer capabilities but require platform-specific runtimes. See the spec comparison table for details, or the live demo to see it in action.

Key point: MCP Apps also render inside sandboxed iframes. An ERI embed page is structurally compatible with the MCP Apps runtime, and ERI Level 2 uses the same ui/* JSON-RPC bridge. Start with ERI; the embed page carries forward unchanged.

Why ERI

Low commitment. Your embed page works with or without ERI. The skill.md is ~15 lines of Markdown. MIT-licensed spec, anyone can fork. Under a day from zero to working embed.

Trade-offs: snapshot-based output (each user turn produces a fresh embed), no native device APIs, URL size limits for encoded data. See spec limitations and large data pattern for mitigations.
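The URL size trade-off can be checked before embedding. The sketch below is only an illustration: the 2000-character budget is an assumed conservative figure, not a number from the ERI spec, and the base64url-in-hash encoding and the `fits_in_url` helper are likewise hypothetical. What to do when the check fails is left to the spec's large data pattern.

```python
import base64
import json

# Assumed conservative URL length budget; not taken from the ERI spec.
MAX_URL_LENGTH = 2000


def encoded_url(base: str, data: dict) -> str:
    """Build an embed URL with the data as unpadded base64url JSON in the hash (assumed format)."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    token = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    return f"{base}#data={token}"


def fits_in_url(base: str, data: dict, limit: int = MAX_URL_LENGTH) -> bool:
    """Return True when the fully encoded URL stays within the length budget."""
    return len(encoded_url(base, data)) <= limit


small = {"price": 99.5, "quantity": 3}
large = {"rows": list(range(1000))}

small_ok = fits_in_url("https://example.com/embed", small)
large_ok = fits_in_url("https://example.com/embed", large)
```

Base64 inflates the payload by about a third, so a result that looks modest as JSON can still exceed the budget once encoded.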

Convinced? Try the demo, read the team adoption guide, or grab the copy-paste template.

AI Agents output text. ERI lets them output interactive UI — embed any web app inside ChatGPT, Claude, or Gemini with a single skill.md. Open spec, MIT licensed. 2234839.github.io/eri-spec
