Testing Guide — April 2026

Agent Simulation Testing
Test Your Site With a Real LLM

Static checks tell you whether agents can read your site. Simulation testing tells you whether agents can actually get things done on it. That is the difference between passing a compliance checklist and passing a real-world test.

What is Agent Simulation Testing?

Agent simulation testing is the practice of sending a real LLM (like Claude or GPT-4o) to interact with your website and measuring whether it can complete specific tasks. Instead of checking for the presence of files or the validity of schemas, it answers a more fundamental question: can an AI agent actually use your site?

The process works like human QA testing, except the tester is an AI. You define tasks such as "Find the pricing for the Pro plan," "Start a free trial," or "Locate the API documentation," and the agent attempts each one. The output is a per-task result (completed, partially completed, or failed) with a detailed trace of what the agent tried, where it got stuck, and why it succeeded or failed.

This approach catches problems that static analysis misses. Your llms.txt might be perfect, your schemas valid, your robots.txt open — but if your pricing page is a client-rendered React component with no server-side content, the agent sees an empty page and cannot answer the most basic question about your product.

How Agent Simulation Works

The simulation follows a structured pipeline:

Context Gathering

The agent reads your llms.txt, homepage, and sitemap to build an understanding of your site structure — just like a real agent would in the wild.

Task Assignment

The system assigns tasks representing common agent use cases: finding pricing, locating documentation, identifying contact methods, starting a trial, and comparing features.

Autonomous Navigation

The agent navigates your site by following links, reading content, and attempting to extract the information needed to complete each task. It does not use a browser; it reads the raw content agents actually see.

Completion Assessment

Each task is scored as completed, partially completed, or failed. The agent provides evidence (extracted text, URLs visited) to justify its assessment.

Trace Report

You receive a full trace showing every page the agent visited, what it extracted, where it got stuck, and recommendations for improvement.
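
The five stages above can be sketched as a small data model plus a loop. This is only an illustration of the shape of the pipeline, not how AX Audit is implemented: the names `Status`, `TaskResult`, `run_simulation`, `summarize`, and `stub_agent` are hypothetical, and the stub stands in for a real LLM call.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Status(Enum):
    COMPLETED = "completed"
    PARTIAL = "partially completed"
    FAILED = "failed"


@dataclass
class TaskResult:
    task: str
    status: Status
    evidence: list = field(default_factory=list)      # extracted text
    urls_visited: list = field(default_factory=list)  # navigation trace


# Hypothetical agent interface: (site context, task) -> TaskResult.
Agent = Callable[[str, str], TaskResult]


def run_simulation(context: str, tasks: list, agent: Agent) -> list:
    """Run every task through the agent and collect the per-task results."""
    return [agent(context, task) for task in tasks]


def summarize(results: list) -> dict:
    """Count results per status for a one-line report."""
    counts = {status.value: 0 for status in Status}
    for result in results:
        counts[result.status.value] += 1
    return counts


def stub_agent(context: str, task: str) -> TaskResult:
    # Stand-in for a real LLM: "completes" a task only if the task's
    # last word appears in the gathered context.
    keyword = task.split()[-1].lower()
    if keyword in context.lower():
        return TaskResult(task, Status.COMPLETED, [f"found '{keyword}'"], ["/"])
    return TaskResult(task, Status.FAILED, [], ["/"])
```

Swapping `stub_agent` for a function that calls an actual model keeps the rest of the loop unchanged, which is the main reason to keep the agent behind a plain callable.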

What Simulation Testing Catches

Simulation testing reveals problems that are invisible to static analysis:

Client-side rendering gaps

Pages that look fine in a browser but return empty HTML to agents. Common with SPAs that defer all rendering to JavaScript.
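
A rough way to spot this gap is to measure how much readable text the raw HTML contains before any JavaScript runs. The sketch below uses only the Python standard library; the function names and the 200-character threshold are illustrative assumptions, not a standard.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text an agent can read without executing JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_client_rendered(html: str, min_chars: int = 200) -> bool:
    # min_chars is an arbitrary illustrative threshold; tune it per page type.
    return len(visible_text(html)) < min_chars
```

Feed it the response body from a plain HTTP fetch of the page (no headless browser). A page whose body is little more than an empty root element and a script bundle will be flagged.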

Broken navigation paths

Links that work for humans (via client-side routing) but do not resolve when an agent follows the raw href attribute.
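
One quick check is to list every anchor in the raw HTML and flag the ones an agent cannot follow from the `href` alone. The sketch below only inspects the shape of each link (missing `href`, fragment-only, `javascript:`, or a non-HTTP scheme); it does not request the target, so a followable link can still return an error. The names `AnchorCollector` and `classify_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class AnchorCollector(HTMLParser):
    """Record the raw href (or None) of every <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def classify_links(html: str, base_url: str) -> dict:
    """Split anchors into ones an agent can follow and ones it cannot."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    report = {"followable": [], "unfollowable": []}
    for href in collector.hrefs:
        raw = (href or "").strip()
        # No href, fragment-only, or javascript: links rely on client-side code.
        if not raw or raw.startswith("#") or raw.lower().startswith("javascript:"):
            report["unfollowable"].append(raw or "(no href)")
            continue
        absolute = urljoin(base_url, raw)
        if urlparse(absolute).scheme in ("http", "https"):
            report["followable"].append(absolute)
        else:
            report["unfollowable"].append(raw)
    return report
```

Anchors that appear under `unfollowable` but work in a browser are usually wired up through click handlers, which is exactly the case described above.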

Missing or ambiguous pricing

Pricing pages that use interactive sliders, toggles, or custom calculators that agents cannot operate. The agent sees the page but cannot extract a price.

Gated content without signals

Content behind login walls without any indication in the HTML that authentication is required. Agents see a blank page or redirect loop.
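
To see what an agent runs into, you can follow redirects by hand and record how the chain ends. The sketch below takes a `fetch` callable so it can be exercised without a network; that callable, its return shape, and the outcome labels are assumptions made for illustration. It also assumes `Location` values are already in the same form as the URLs being compared.

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher: returns (status_code, Location header or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(url: str, fetch: Fetch, max_hops: int = 10) -> dict:
    """Follow redirects one hop at a time and report how the chain ends."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status in REDIRECT_CODES and location:
            if location in chain:
                # We have been here before: the agent would spin forever.
                return {"outcome": "redirect loop", "chain": chain + [location]}
            chain.append(location)
            continue
        if status in (401, 403):
            # An explicit status code is a signal an agent can act on.
            return {"outcome": "auth required (explicit signal)", "chain": chain}
        return {"outcome": f"final status {status}", "chain": chain}
    return {"outcome": "too many redirects", "chain": chain}
```

A gated URL that ends in `auth required (explicit signal)` is telling agents what they need to know; one that ends in a loop, or in a 200 with an empty body, is not.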

Conflicting information

Pages that say one thing in visible text and another in JSON-LD. Agents that find contradictions lose confidence in your site.
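
For prices, this kind of mismatch can be checked mechanically. The sketch below pulls every `price` value out of the page's JSON-LD and reports those that never appear as a dollar amount in the visible text. It is a simplification under stated assumptions: it only recognizes plain amounts such as `$49` or `$49.00`, it ignores currency and billing period, and the helper names are hypothetical.

```python
import json
import re
from html.parser import HTMLParser


class PageParts(HTMLParser):
    """Separate JSON-LD script bodies from ordinary page text."""

    def __init__(self) -> None:
        super().__init__()
        self._mode = "text"  # "text", "jsonld", or "skip"
        self._buffer = []
        self.jsonld = []     # one string per JSON-LD <script>
        self.text = []       # text outside <script> and <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            self._mode = "jsonld" if script_type == "application/ld+json" else "skip"
            self._buffer = []
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._mode == "jsonld":
                self.jsonld.append("".join(self._buffer))
            self._mode = "text"

    def handle_data(self, data):
        if self._mode == "jsonld":
            self._buffer.append(data)
        elif self._mode == "text":
            self.text.append(data)


def to_number(value):
    """Return value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def find_prices(node):
    """Yield every value stored under a "price" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "price":
                yield value
            else:
                yield from find_prices(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_prices(item)


def price_conflicts(html: str) -> list:
    """Return JSON-LD prices that never appear as a dollar amount in the text."""
    parts = PageParts()
    parts.feed(html)
    parts.close()
    shown = {float(m) for m in re.findall(r"\$\s?(\d+(?:\.\d+)?)", " ".join(parts.text))}
    declared = set()
    for block in parts.jsonld:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is a separate problem
        declared.update(n for n in map(to_number, find_prices(data)) if n is not None)
    return sorted(declared - shown)
```

An empty list means every structured price has a matching amount in the text; anything else is a candidate contradiction worth checking by hand.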

Dead-end pages

Pages with no internal links, no schema, and no clear next step. Agents arrive and cannot figure out what to do next.

How to Test Your Site Manually

You can run a basic agent simulation yourself using any LLM with web browsing capabilities. Open ChatGPT, Claude, or Perplexity and ask it to complete tasks on your site:

"Go to [your-site.com] and tell me how much the Pro plan costs."
"Find the API documentation for [your-site.com] and show me how to authenticate."
"What does [your-company] do? Find the answer from their website, not your training data."
"Walk me through starting a free trial on [your-site.com]."

If the agent struggles, hallucinates, or gives wrong answers, those are the same failures real users will experience when they ask AI assistants about your product. The difference is that you will never see those failures in your analytics, because the agent never clicks through to your site. The user just gets a wrong answer and moves on.
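
If you repeat this check across several sites or assistants, it helps to generate the prompts from one template list so the wording stays identical between runs. A minimal sketch, with `PROMPT_TEMPLATES` and `build_prompts` as hypothetical names and the prompt text taken from the examples above:

```python
PROMPT_TEMPLATES = [
    "Go to {site} and tell me how much the Pro plan costs.",
    "Find the API documentation for {site} and show me how to authenticate.",
    "What does {company} do? Find the answer from their website, not your training data.",
    "Walk me through starting a free trial on {site}.",
]


def build_prompts(site: str, company: str) -> list:
    """Fill the site and company names into each manual test prompt."""
    return [t.format(site=site, company=company) for t in PROMPT_TEMPLATES]


if __name__ == "__main__":
    # example.com and "Example Inc" are placeholders.
    for prompt in build_prompts("example.com", "Example Inc"):
        print(prompt)
```

Paste each prompt into the assistant unchanged and note, per prompt, whether the answer was correct, partially correct, or wrong.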

Automated Simulation in AX Audit

The AX Audit includes automated agent simulation as part of the Agent Task Completion dimension (20% of your total AX score). When you run an audit, we send a real LLM to attempt standardized tasks on your site and measure success.

The automated simulation is faster and more consistent than manual testing. It uses the same task set across all sites, which makes scores comparable. It also runs on every re-audit, so you can track improvement over time.

Sites that score well on static checks (Discoverability, Parsability, Schema) but poorly on Task Completion have a clear signal: the content is there, but the user experience for agents is broken. This usually points to rendering issues, navigation problems, or content that is technically present but practically unusable.

Task Completion and Your AX Score

Agent Task Completion carries the highest weight (tied with Discoverability at 20%) in the AX scoring model. This is intentional — all the technical optimization in the world is meaningless if agents cannot actually accomplish their goals on your site. A perfect score on the other five dimensions with a zero on Task Completion still means your site is not agent-ready.
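
The arithmetic behind that last point is short. Assuming the total is a weighted sum of dimension scores on a 0 to 100 scale (an assumption; only the 20% weight comes from the text above), a zero on Task Completion caps the total at 80 even when everything else is perfect:

```python
TASK_COMPLETION_WEIGHT = 0.20  # stated above; the weighted-sum model is assumed


def max_total_without_task_completion(weight: float = TASK_COMPLETION_WEIGHT) -> float:
    """Best possible total (0-100) when Task Completion scores zero."""
    return round(100 * (1 - weight), 2)


if __name__ == "__main__":
    print(max_total_without_task_completion())  # 80.0
```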

Run a Free Agent Simulation

The AX Audit sends a real LLM to attempt tasks on your site and reports what succeeded, what failed, and why. See how your site performs when the visitor is an AI agent.

Free audit with live LLM simulation. No signup required.
