Build agent-friendly websites

Kasper Kulikowski
Omkar More

Your website has a new type of visitor. Some human users are pivoting from manual navigation to delegating goal-oriented journeys to AI agents. Those autonomous systems can interpret input, plan, and execute actions on behalf of a user.

However, many of our websites are designed to be beautiful for humans, with complex hover states, shifting layouts, and fluid motion. For agents, these same designs can be functionally broken.

How agents view your site

Agents don't look at your website on a monitor. They operate on a machine-readable representation of your site. The quality of this representation determines their performance.

Agents can view your website in three primary ways: screenshots, raw HTML, and the accessibility tree.

Screenshots

The agent takes a snapshot of the rendered page and uses a vision model to identify elements. Based on the screenshot, the agent can recognize that a search bar at the top-right is a global search, while a box in the middle is likely a form field. Visual cues can be helpful, as agents can use color, size, and proximity to determine importance. A big "Delete" button will likely be interpreted with more caution than a small "Help" link. However, analyzing screenshots can be slow and expensive (in terms of tokens used), which makes it better suited as a backup for when the structure is confusing.

HTML

The agent analyzes the DOM and reads the HTML. It understands how elements are nested, the logical hierarchy of the DOM tree, attributes like IDs and classes that define structure, and raw data strings that form the site's informational backbone. This helps the agent understand the relationship between elements. If a "Buy Now" button is inside a product container, the agent assumes that button belongs to that specific product.
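The containment reasoning described above can be sketched as a walk up the tree. This is only an illustration: the `DomNode` shape, the `closestWithClass` helper, the `product` class name, and the `data-sku` attribute are hypothetical names chosen for the example, not something a particular agent is known to use.

```typescript
// Hypothetical, minimal stand-in for a DOM node (not the browser's real Node type).
interface DomNode {
  tag: string;
  attrs: Record<string, string>;
  text?: string;
  parent?: DomNode;
}

// Walk up from a node's parent and return the first ancestor carrying the given class.
function closestWithClass(node: DomNode, className: string): DomNode | undefined {
  let current: DomNode | undefined = node.parent;
  while (current) {
    const classes = (current.attrs["class"] ?? "").split(/\s+/);
    if (classes.includes(className)) return current;
    current = current.parent;
  }
  return undefined;
}

// <article class="product card" data-sku="A1">
//   <div class="actions"><button>Buy Now</button></div>
// </article>
const product: DomNode = { tag: "article", attrs: { class: "product card", "data-sku": "A1" } };
const actions: DomNode = { tag: "div", attrs: { class: "actions" }, parent: product };
const buyButton: DomNode = { tag: "button", attrs: {}, text: "Buy Now", parent: actions };

// The button is attributed to the product container that encloses it.
const owner = closestWithClass(buyButton, "product");
```

Here `owner` is the `article` node, so the "Buy Now" button is associated with the product whose `data-sku` is `A1`. If the button sat outside any product container, the lookup would return `undefined` and the relationship would be ambiguous.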

Accessibility tree

The accessibility tree is a browser-native API that distills the DOM into what's most important: the roles, names, and states of interactive elements. It's the page's semantic summary, used by assistive technology. For an AI agent, it functions as a high-fidelity map that ignores the visual "noise" of CSS to focus on pure utility. By interpreting this tree, an agent can learn the functional intent of every toggle, slider, and input field.
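To make the idea of roles, names, and states concrete, here is a minimal sketch of how an agent might reduce such a tree to a list of interactive elements. The `AXNode` shape and the `INTERACTIVE_ROLES` set are simplifying assumptions for this example. Real browser accessibility trees expose many more properties, and the role list here is an illustrative subset.

```typescript
// Hypothetical, simplified shape of an accessibility-tree node.
interface AXNode {
  role: string;                      // e.g. "button", "textbox", "generic"
  name: string;                      // accessible name, e.g. "Add to cart"
  states?: Record<string, boolean>;  // e.g. { disabled: false, checked: true }
  children?: AXNode[];
}

// Roles this sketch treats as interactive (illustrative subset, not exhaustive).
const INTERACTIVE_ROLES = new Set(["button", "link", "textbox", "checkbox", "slider", "switch"]);

// Depth-first walk that collects interactive nodes in document order.
function listInteractive(node: AXNode, out: AXNode[] = []): AXNode[] {
  if (INTERACTIVE_ROLES.has(node.role)) out.push(node);
  for (const child of node.children ?? []) listInteractive(child, out);
  return out;
}

const page: AXNode = {
  role: "main",
  name: "",
  children: [
    {
      role: "generic", // a purely presentational wrapper adds no semantics
      name: "",
      children: [
        { role: "textbox", name: "Search products" },
        { role: "button", name: "Add to cart", states: { disabled: false } },
      ],
    },
    { role: "switch", name: "Dark mode", states: { checked: true } },
  ],
};

// One "role: name" line per interactive element.
const summary = listInteractive(page).map((n) => `${n.role}: ${n.name}`);
```

The wrapper with the `generic` role drops out of `summary`, while the textbox, button, and switch remain. That is the sense in which the tree ignores visual noise and keeps functional intent.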

Combined modalities

Relying on a single input creates a semantic gap. For example, in the DOM, an agent might see a <div> without knowing you've actually configured it as a functional button with CSS and JavaScript. With screenshots, an agent may identify where that button sits on the screen but remain unaware of the button's destination or the action it's designed to trigger.

Modern agents, therefore, combine multiple modalities. They use the DOM and accessibility tree to get a clean, structured list of interactive elements, and then cross-reference that with a visual rendering to understand layout, grouping, and visual cues.
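The cross-referencing step might look roughly like the sketch below. It assumes, purely for illustration, that the structured view and the visual view share a common `id` per element, and that the visual pass reports a bounding box plus a `visible` flag. The type and function names are hypothetical.

```typescript
// Structured view: what the DOM and accessibility tree report.
interface StructuredElement { id: string; role: string; name: string; }

// Visual view: where the element was rendered, and whether it was actually visible.
interface VisualBox { id: string; x: number; y: number; width: number; height: number; visible: boolean; }

interface Candidate extends StructuredElement { box: VisualBox; }

// Keep only elements confirmed by both views, ordered top-to-bottom, then left-to-right.
function mergeModalities(elements: StructuredElement[], boxes: VisualBox[]): Candidate[] {
  const byId = new Map<string, VisualBox>();
  for (const b of boxes) byId.set(b.id, b);

  const out: Candidate[] = [];
  for (const el of elements) {
    const box = byId.get(el.id);
    if (box && box.visible) out.push({ ...el, box });
  }
  return out.sort((a, b) => a.box.y - b.box.y || a.box.x - b.box.x);
}

const elements: StructuredElement[] = [
  { id: "buy", role: "button", name: "Buy now" },
  { id: "search", role: "textbox", name: "Search" },
  { id: "promo", role: "button", name: "Claim offer" },
];

const boxes: VisualBox[] = [
  { id: "buy", x: 600, y: 400, width: 120, height: 40, visible: true },
  { id: "search", x: 900, y: 20, width: 200, height: 32, visible: true },
  { id: "promo", x: 0, y: 0, width: 300, height: 80, visible: false }, // covered by an overlay
];

const actionable = mergeModalities(elements, boxes).map((c) => c.name);
```

In this sketch the covered "Claim offer" button is dropped even though it exists in the structured view, and the remaining elements come back in approximate reading order: "Search" first, then "Buy now".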

Our job is to provide clean signals across all these channels.

Build agent-friendly websites

To help agents navigate your website, consider the following:

  • All necessary actions, taken by a human or agent, should be clearly reflected in the interface.
  • Ensure a stable layout. Agents that take screenshots will likely be confused if your website layout is constantly shifting, for example, when the "Add to cart" button on a product page is in a different location for each product category.
  • Avoid "ghost" elements or transparent overlays that might hide interactive elements. The visual analysis done by the agent might discard nodes that are covered, even if the overlay covering them appears transparent.
  • Design actionable elements with semantic HTML. Prefer <button> and <a> tags over modified <div> and <span> elements. Agents recognize these as interactive.
    • If you cannot use semantic HTML, always give the element the appropriate role and tabindex. For example, <div role="button" tabindex="0">.
  • Set cursor: pointer in CSS on actionable elements. It is a strong signal of actionability.
  • Add the for attribute on <label> tags to link them to inputs. This helps the AI agent understand the purpose of a field, because the label text is explicitly attached to the input it describes.
  • Make sure any interactive elements required to continue the user journey have a visible area larger than 8 square pixels, to avoid being filtered out by visual analysis.
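Several of the checks above can be expressed as a small audit function. The following is a rough sketch, not a complete tool: the `ElementInfo` shape and `auditElement` helper are hypothetical, the list of natively interactive tags is a simplification (for instance, an <a> without an href is not actually focusable), and the element data would have to be collected from the rendered page by some other means.

```typescript
// Hypothetical summary of one element that a user journey depends on.
interface ElementInfo {
  tag: string;            // lowercase tag name, e.g. "button" or "div"
  role?: string;          // value of the role attribute, if any
  tabindex?: number;      // value of the tabindex attribute, if any
  cursor: string;         // computed CSS cursor value
  visibleWidth: number;   // visible width in CSS pixels
  visibleHeight: number;  // visible height in CSS pixels
}

// Tags treated as natively interactive in this sketch (a simplification).
const NATIVE_INTERACTIVE = new Set(["button", "a", "input", "select", "textarea"]);

function auditElement(el: ElementInfo): string[] {
  const warnings: string[] = [];
  if (!NATIVE_INTERACTIVE.has(el.tag)) {
    // Non-semantic elements need an explicit role and tabindex.
    if (!el.role) warnings.push("missing role");
    if (el.tabindex === undefined) warnings.push("missing tabindex");
  }
  if (el.cursor !== "pointer") warnings.push("cursor is not pointer");
  // The visible area should be larger than 8 square pixels.
  if (el.visibleWidth * el.visibleHeight <= 8) {
    warnings.push("visible area is 8 square pixels or less");
  }
  return warnings;
}

const semanticButton = auditElement({
  tag: "button", cursor: "pointer", visibleWidth: 120, visibleHeight: 40,
});

const bareDiv = auditElement({
  tag: "div", cursor: "default", visibleWidth: 2, visibleHeight: 2,
});

const patchedDiv = auditElement({
  tag: "div", role: "button", tabindex: 0, cursor: "pointer", visibleWidth: 24, visibleHeight: 24,
});
```

The semantic <button> passes with no warnings, the bare <div> triggers all four, and the <div> with a role, tabindex, pointer cursor, and a larger visible area passes again. Checks for layout stability and covering overlays are left out because they need rendering information that this sketch does not model.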

Next steps

Everything we suggest to make a site "agent-ready" also makes sites better for humans.

Making websites agent-friendly is an incentive to recommit to the web's foundational principles: building well-structured, accessible, and semantic websites.

  • Read about WebMCP, a proposed web standard for helping websites interact with agents, and sign up for the early preview program to start experimenting.
  • Audit your accessibility tree: use existing tools to ensure your site's hierarchy is machine-readable and stable.