Mentra Live — Spatial interaction for open-eye AR.

Open-eye AR design probe — interruption cost, spatial anchoring, and AI overlays when the wearer can verify part of the scene and must trust the rest.

Build log — active

This probe is in progress.

The sections below capture what’s been worked through so far — the design problems that are well-defined, the decisions that have been made, and the questions still open. This page updates as the work develops rather than waiting until it’s complete.

01 — The Constraint

Open-eye means attention is always split.

Mentra Live is open-eye AR — the display sits on top of the real world rather than replacing it. That changes the design problem entirely. On a phone or closed-eye headset, the interface is the whole scene. Here, the interface competes with the scene. Every overlay is an interruption, and the wearer can always look past it.

That’s both a constraint and a kind of safety. An overlay that earns attention is useful. One that doesn’t is just noise at eye level. The core design question becomes: how does a spatial UI know when it has earned interruption and when it should stay out of the way?

This is different from the “glanceable” problem in closed displays. Glanceability is about speed. The open-eye problem is about threshold — whether an overlay should exist at all in a given moment, not just how fast it can be parsed.
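One way to make the threshold-versus-speed distinction concrete is a minimal sketch in TypeScript. Everything here is an assumption made for illustration: the names OverlayCandidate, Moment, and shouldExist, the 0-to-1 scores, and the idea that relevance can be reduced to a single number. None of it comes from Mentra Live or any SDK. The point is only that the existence check never consults how fast the overlay can be read.

```typescript
// Hypothetical types for illustration; not part of any Mentra Live API.
interface OverlayCandidate {
  label: string;
  relevance: number; // assumed 0..1 score for how much this matters right now
  parseMs: number;   // assumed estimate of how long the overlay takes to read
}

interface Moment {
  // Assumed 0..1 bar an overlay must clear; higher when the wearer is busy.
  interruptionThreshold: number;
}

// The threshold question: should this overlay exist at all in this moment?
// parseMs is deliberately ignored here. Glanceability is a separate concern
// that only matters once an overlay has earned its place.
function shouldExist(candidate: OverlayCandidate, moment: Moment): boolean {
  return candidate.relevance >= moment.interruptionThreshold;
}

const crossingStreet: Moment = { interruptionThreshold: 0.6 };

// Fast to read, but not worth the interruption.
const quickTrivia: OverlayCandidate = { label: "Fun fact", relevance: 0.2, parseMs: 150 };

// Slower to read, but it has earned the interruption.
const hazard: OverlayCandidate = { label: "Cyclist approaching", relevance: 0.9, parseMs: 1200 };

const triviaShown = shouldExist(quickTrivia, crossingStreet);
const hazardShown = shouldExist(hazard, crossingStreet);
```

In this sketch the quick-to-read overlay is suppressed and the slower one is shown, which is the opposite of what a pure glanceability ranking would do. How relevance and the threshold would actually be estimated is the open part of the problem.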

02 — Trust in Scene

Some facts are verifiable. Most aren’t.

AI-generated content in a live scene behaves differently than on a screen. When a model says “the nearest coffee shop is two blocks north,” the wearer can check: is there a coffee shop there? But when a model says “this building was constructed in 1923,” they can’t verify that from where they’re standing. The scene offers no confirmation.

That distinction, in-scene verifiable vs. model-supplied, matters for what the system should offer and how confident it should appear when offering it. A spatial overlay that presents unverifiable data with the same visual weight as an observable fact misleads through presentation: it uses the credibility of the scene to back up content the scene can’t support.

The working hypothesis: the design should signal epistemic type — not just accuracy — as part of the overlay itself. Not with disclaimers, but with visual and interaction conventions that distinguish what is anchored to the world from what is retrieved from a model.
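As a rough sketch of what carrying epistemic type "as part of the overlay itself" could mean in data terms, the TypeScript below tags each claim with its type and maps that type to a named visual treatment. The names EpistemicType, OverlayClaim, Treatment, and treatmentFor are hypothetical, and the treatments are placeholder tokens rather than actual colors, opacities, or icons.

```typescript
// Hypothetical model for illustration; not part of any Mentra Live API.
type EpistemicType = "scene-verifiable" | "model-supplied";

interface OverlayClaim {
  text: string;
  epistemicType: EpistemicType; // travels with the claim, not added later
}

// Placeholder tokens only. What each treatment looks like is undecided.
type Treatment = "anchored" | "retrieved";

// Every epistemic type must map to a treatment. Adding a new type without
// handling it here becomes a compile-time error under strict settings.
function treatmentFor(claim: OverlayClaim): Treatment {
  switch (claim.epistemicType) {
    case "scene-verifiable":
      return "anchored";
    case "model-supplied":
      return "retrieved";
  }
}

const coffee: OverlayClaim = {
  text: "Nearest coffee shop: two blocks north",
  epistemicType: "scene-verifiable",
};

const built: OverlayClaim = {
  text: "Constructed in 1923",
  epistemicType: "model-supplied",
};

const coffeeTreatment = treatmentFor(coffee);
const builtTreatment = treatmentFor(built);
```

The design choice this illustrates is that the type is a required field on the claim, so a renderer cannot draw an overlay without deciding which convention applies to it.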

03 — Still Open

What hasn’t been resolved yet.

Spatial anchoring vs. distraction

Anchoring an overlay to a real-world object gives it credibility and context. It also ties the display to something the wearer is already attending to, or to something they may not want to look at. The line between “helpful anchor” and “distraction attached to a thing” isn’t obvious in the abstract. Still working through the cases where anchoring helps vs. where it compounds the interruption problem.

Gaze-and-voice without a primary screen

Most gaze-and-voice interaction design assumes a primary display the user is oriented toward. Mentra Live doesn’t have that. The wearer’s gaze is on the world, and their hands may be occupied. The interaction model for initiating and dismissing overlays in that context is still genuinely open, not just an implementation detail.

Signaling epistemic type visually

The hypothesis that overlays should visually distinguish verifiable-in-scene content from model-retrieved content is directionally clear. What that distinction should look like — whether it’s color, opacity, iconography, animation, or something else — is still being worked through. The visual language needs to be fast to read and not itself a distraction.