Browser agent failure patterns

Most browser-agent failures are not explained by a stack trace alone. These are the failure shapes BrowserTrace is designed to preserve in one local timeline before you export a public-safe report.

Browser Use icon-only target mismatch

The screenshot shows the right plus icon, but the agent clicks a nearby toolbar button because the intended control has no stable accessible name. Tooltip text may appear only after hover and may not be associated with the actual clickable element.

Trace evidence to keep

Screenshot and candidate bounding boxes for the intended icon, nearby controls, and the clicked element.
Live button HTML, not only the intended fixed markup.
Accessibility snapshot before hover and after hover.
Model output and action label that explain why the wrong target ranked higher.
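
The evidence above can be reduced to a small comparison record. The following is a minimal sketch, not BrowserTrace's schema: the `Candidate` class, its field names, and the element ids are hypothetical and only illustrate storing the intended and clicked boxes side by side.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Candidate:
    """One clickable candidate captured at decision time (hypothetical schema)."""
    element_id: str
    accessible_name: str  # empty when the control is icon-only
    box: tuple  # (x, y, width, height) in CSS pixels

    def center(self):
        x, y, w, h = self.box
        return (x + w / 2, y + h / 2)


def mismatch_evidence(intended, clicked):
    """Summarize how far the clicked element was from the intended one."""
    (ix, iy), (cx, cy) = intended.center(), clicked.center()
    return {
        "intended_id": intended.element_id,
        "clicked_id": clicked.element_id,
        "same_target": intended.element_id == clicked.element_id,
        "center_distance_px": round(hypot(cx - ix, cy - iy), 1),
        "intended_has_name": bool(intended.accessible_name.strip()),
    }


# Illustrative values: an unnamed plus icon next to a named toolbar button.
plus_icon = Candidate("el-17", "", (200, 10, 24, 24))
toolbar_button = Candidate("el-18", "Share", (232, 10, 24, 24))
evidence = mismatch_evidence(plus_icon, toolbar_button)
print(evidence)
```

A short distance combined with `intended_has_name` being false suggests the ranking problem comes from the missing accessible name rather than from the screenshot.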

Related community case: browser-use/browser-use#4801. BrowserTrace guide: Debug Browser Use failures.

Browser Use new-tab desync

A click or Enter action can open a new tab while the agent keeps reasoning from the stale page context. The symptom is repeated retries against the old page even though the expected element exists in the new tab.

Trace evidence to keep

Page ids, tab indexes, URL, and title before and after the action.
pages_before, pages_after, and new_pages with probe status.
The action id that created the new page and the focused page before and after the step.
Recommended recovery action such as switch_tab, stored as evidence instead of inferred from later retries.
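
One way to capture this evidence is to diff the tab list around each action. The sketch below assumes a hypothetical `PageInfo` record and a `diff_pages` helper; only the field names `pages_before`, `pages_after`, `new_pages`, and the `switch_tab` recommendation come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageInfo:
    page_id: str
    tab_index: int
    url: str
    title: str


def diff_pages(pages_before, pages_after, focused_page_id, action_id):
    """Compare tab snapshots taken around one action and record the result."""
    known = {p.page_id for p in pages_before}
    new_pages = [p for p in pages_after if p.page_id not in known]
    # A new tab that the agent is not focused on is the desync signature.
    desynced = any(p.page_id != focused_page_id for p in new_pages)
    return {
        "action_id": action_id,
        "pages_before": [p.page_id for p in pages_before],
        "pages_after": [p.page_id for p in pages_after],
        "new_pages": [p.page_id for p in new_pages],
        "focused_page_id": focused_page_id,
        "recommended_recovery": "switch_tab" if desynced else None,
    }


before = [PageInfo("p1", 0, "https://example.com/list", "List")]
after = before + [PageInfo("p2", 1, "https://example.com/item/7", "Item 7")]
step = diff_pages(before, after, focused_page_id="p1", action_id="act-12")
print(step["new_pages"], step["recommended_recovery"])
```

Storing the recommendation at the step that created the tab means a reviewer does not have to infer it from the retries that follow.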

Related community case: browser-use/browser-use#4758. BrowserTrace guide: Debug new-tab desync.

Browser Use multi-step form drift

One long browser-agent run over a complex form can hide the first bad boundary. The visible failure may be a later submit, while the useful evidence is an earlier dependent field, delayed validation message, or checkpoint that should have stopped the run.

Trace evidence to keep

Canonical form payload outside the agent plus the fields assigned to the current segment.
URL, title, field labels, visible validation errors, submit state, retry count, and checkpoint id after each step.
Screenshot reference, selected element summary, and model/tool output for the action that moved the form forward.
Whether recovery resumed from the last checkpoint or restarted the whole form.
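
Finding the first bad boundary becomes a simple scan once each step leaves a checkpoint. This is a sketch under assumptions: the `Checkpoint` fields and the rule "any visible validation error marks a bad boundary" are illustrative choices, not a documented BrowserTrace format.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    checkpoint_id: str
    step: int
    url: str
    validation_errors: list = field(default_factory=list)
    retry_count: int = 0


def first_bad_boundary(checkpoints):
    """Return the earliest checkpoint with a visible validation error, or None."""
    for cp in sorted(checkpoints, key=lambda c: c.step):
        if cp.validation_errors:
            return cp
    return None


run = [
    Checkpoint("cp-1", 1, "https://example.com/form?page=1"),
    Checkpoint("cp-2", 2, "https://example.com/form?page=2", ["State is required"]),
    Checkpoint("cp-3", 3, "https://example.com/form?page=3", ["Submit failed"], retry_count=2),
]
bad = first_bad_boundary(run)
print(bad.checkpoint_id if bad else "no bad boundary")
```

Here the visible failure is the submit at step 3, but the scan points at the dependent field at step 2, which is where recovery would resume.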

Related community case: browser-use/browser-use#4476. BrowserTrace guide: Debug multi-step form drift.

Browser Use local HTML upload navigation

A local HTML file can be described in the task prompt as an attachment to inspect, but the model may turn that attachment context into a navigation target. The failure looks like a bad URL action even though the useful evidence starts before step 0, when file metadata first becomes model-visible.

Trace evidence to keep

Task prompt and model-visible file or attachment context before the first browser step.
Local filename, extension, MIME type, and redaction state when safe to log.
Raw model action before validation plus parsed action type and rejected URL or upload target.
Guard or watchdog block reason, allowed-domains state, and recovery recommendation.
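
A guard that records why an action was rejected might look like the sketch below. The `check_navigation` function, the reason strings, and the `ALLOWED_DOMAINS` set are hypothetical; the example does not reproduce Browser Use's own watchdog or allowed-domains logic.

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"example.com"}  # hypothetical allow-list


def check_navigation(raw_action, attachment_names):
    """Classify a raw navigate action before it reaches the browser."""
    url = raw_action.get("url", "")
    parsed = urlparse(url)
    host = parsed.hostname or ""
    last_segment = parsed.path.rsplit("/", 1)[-1]
    if last_segment in attachment_names or host in attachment_names:
        reason = "attachment_used_as_navigation_target"
    elif parsed.scheme == "file":
        reason = "file_scheme_blocked"
    elif host not in ALLOWED_DOMAINS:
        reason = "domain_not_allowed"
    else:
        reason = None
    return {
        "raw_action": raw_action,  # kept as emitted, before validation
        "action_type": raw_action.get("type"),
        "rejected_url": url if reason else None,
        "block_reason": reason,
    }


attachments = {"report.html"}
blocked = check_navigation({"type": "navigate", "url": "file:///tmp/report.html"}, attachments)
bare = check_navigation({"type": "navigate", "url": "report.html"}, attachments)
allowed = check_navigation({"type": "navigate", "url": "https://example.com/login"}, attachments)
print(blocked["block_reason"], bare["block_reason"], allowed["block_reason"])
```

Checking attachment names before the scheme keeps the more specific reason, so the trace shows that the model confused an attachment with a destination rather than only that a `file:` URL was refused.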

Related community case: browser-use/browser-use#4794. BrowserTrace guide: Debug local HTML upload navigation.

Browser Use remote CDP hang

A remote browser session can look connected while one CDP request never returns. Screenshot capture, DOM snapshot, browser-state collection, and recovery can then consume the whole timeout window, or block other sessions that are waiting on the same lock.

Trace evidence to keep

Event id, browser/session/target id, and CDP method name.
Request id, start/end/duration, timeout, result, or transport error.
Websocket ping/pong timestamps near the stalled method.
Event-bus lock wait, acquire, and release timing, plus the recovery decision.
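
Per-request timing can be recorded by wrapping each call in a timeout. The sketch below uses only `asyncio` from the standard library; `timed_cdp_call` and the record fields are hypothetical, and the `send` argument stands in for whatever client actually issues the CDP request.

```python
import asyncio
import time


async def timed_cdp_call(method, send, timeout_s):
    """Await one CDP-style request and record timing even if it never returns."""
    record = {"cdp_method": method, "timeout_s": timeout_s}
    start = time.monotonic()
    try:
        record["result"] = await asyncio.wait_for(send(), timeout=timeout_s)
        record["status"] = "ok"
    except asyncio.TimeoutError:
        record["status"] = "timeout"
    except ConnectionError as exc:
        record["status"] = "transport_error"
        record["error"] = repr(exc)
    record["duration_s"] = round(time.monotonic() - start, 3)
    return record


async def _never_returns():
    await asyncio.sleep(3600)  # stands in for a stalled request


async def _fast():
    return {"nodeId": 1}


stalled = asyncio.run(timed_cdp_call("Page.captureScreenshot", _never_returns, 0.05))
healthy = asyncio.run(timed_cdp_call("DOM.getDocument", _fast, 0.05))
print(stalled["status"], healthy["status"])
```

Recording the method name next to the timeout outcome shows which request stalled, which a session-level "connected" flag cannot do.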

Related community case: browser-use/browser-use#4579. BrowserTrace guide: Debug remote CDP hangs.

Screenshot blob poisons model context

A screenshot blob can break future model turns when raw image bytes are copied into conversation history. The safer boundary is an artifact record plus compact metadata, with pixels passed only when the next model call needs a typed image content block.

Trace evidence to keep

artifact_path or artifact id for durable local debugging.
Image dimensions, media type, digest, status, and error metadata.
Whether the image was sent to the model, omitted, resized, or redacted.
Public export redaction state for screenshots, URLs, and model I/O.
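
The artifact-plus-metadata boundary can be sketched as follows. The `store_screenshot` helper and its field names are assumptions for illustration; the point is only that the returned record holds a path and a digest, never the image bytes.

```python
import hashlib
import tempfile
from pathlib import Path


def store_screenshot(image_bytes, width, height, artifact_dir, sent_to_model=False):
    """Write pixels to disk and return compact metadata without raw bytes."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    path = Path(artifact_dir) / f"{digest[:16]}.png"
    path.write_bytes(image_bytes)
    return {
        "artifact_path": str(path),
        "media_type": "image/png",
        "width": width,
        "height": height,
        "byte_length": len(image_bytes),
        "sha256": digest,
        "status": "stored",
        "sent_to_model": sent_to_model,
    }


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder bytes with a PNG signature; not a decodable image.
    fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 32
    record = store_screenshot(fake_png, 1280, 720, tmp)
    # Only this short text entry goes into conversation history.
    history_entry = {"role": "tool", "content": f"screenshot {record['sha256'][:12]} stored"}
    print(history_entry["content"])
```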

Related community case: browser-use/browser-use#4742. BrowserTrace guide: Keep browser artifacts out of long-term model context.

Stagehand custom-tool replay gap

A cache can replay normal page actions while skipping a custom_tool that mutated page state, filled credentials, or ran side-effectful code. Replay and diagnostic tracing need different contracts.

Trace evidence to keep

Tool name, serialized args, stable tool-call or step id, status, and error.
Whether the tool is replay-safe, replay-blocked, or manually rehydrated.
URL/page id, observation id, screenshot id, result summary, and error context.
Redaction boundary for credentials and sensitive tool args.
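
A trace record can carry the replay contract explicitly. In this sketch, `record_tool_call`, `plan_replay`, the policy strings, and `SENSITIVE_KEYS` are hypothetical names; it does not describe Stagehand's cache format.

```python
SENSITIVE_KEYS = {"password", "token", "secret"}  # hypothetical redaction list
REPLAY_POLICIES = {"replay-safe", "replay-blocked", "manually-rehydrated"}


def redact_args(args):
    """Replace values of sensitive keys before the args are serialized."""
    return {k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else v for k, v in args.items()}


def record_tool_call(step_id, tool_name, args, replay_policy, status="ok", error=None):
    if replay_policy not in REPLAY_POLICIES:
        raise ValueError(f"unknown replay policy: {replay_policy}")
    return {
        "step_id": step_id,
        "tool_name": tool_name,
        "args": redact_args(args),
        "replay_policy": replay_policy,
        "status": status,
        "error": error,
    }


def plan_replay(steps):
    """Return the replayable prefix and the first step that blocks replay."""
    replayable = []
    for step in steps:
        if step["replay_policy"] == "replay-blocked":
            return replayable, step["step_id"]
        replayable.append(step["step_id"])
    return replayable, None


steps = [
    record_tool_call("s1", "act", {"instruction": "open login"}, "replay-safe"),
    record_tool_call("s2", "fill_credentials", {"user": "demo", "password": "hunter2"}, "replay-blocked"),
    record_tool_call("s3", "act", {"instruction": "click submit"}, "replay-safe"),
]
prefix, blocked_at = plan_replay(steps)
print(prefix, blocked_at)
```

Stopping at the blocked step makes the gap visible instead of replaying later actions against page state the custom tool never produced.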

Related community case: browserbase/stagehand#1558. BrowserTrace guide: Debug custom tool replay gaps.

Stagehand semantic verification boundary

A semantic verification layer around act can prevent unsafe or ambiguous actions, but the verifier result needs to be visible as its own action boundary instead of collapsed into a boolean.

Trace evidence to keep

Action proposal with instruction, action type, selector, role, text, and confidence when available.
Target evidence: URL, screenshot id, DOM snapshot id, candidate elements, and semantic endpoint evidence.
Verification result with verifier type, status, reason, and whether execution was allowed.
Execution outcome: executed, blocked, escalated, failed, URL after, and error.
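
The difference between a boolean and an action boundary is easier to see in code. The sketch below assumes hypothetical `Proposal`, `Verification`, and `Outcome` records and a `run_step` wrapper; the verifier and executor are plain callables supplied by the caller.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Proposal:
    instruction: str
    action_type: str
    selector: str
    confidence: Optional[float] = None


@dataclass
class Verification:
    verifier_type: str
    status: str
    reason: str
    execution_allowed: bool


@dataclass
class Outcome:
    state: str  # executed, blocked, or failed
    url_after: Optional[str] = None
    error: Optional[str] = None


def run_step(proposal, verify, execute):
    """Keep proposal, verification, and outcome as three separate records."""
    verification = verify(proposal)
    if not verification.execution_allowed:
        outcome = Outcome("blocked")
    else:
        try:
            outcome = Outcome("executed", url_after=execute(proposal))
        except Exception as exc:
            outcome = Outcome("failed", error=repr(exc))
    return {
        "proposal": asdict(proposal),
        "verification": asdict(verification),
        "outcome": asdict(outcome),
    }


def reject_generic_selectors(p):
    if p.selector == "button":
        return Verification("selector-specificity", "rejected", "selector matches many buttons", False)
    return Verification("selector-specificity", "passed", "selector is specific", True)


trace = run_step(
    Proposal("click Delete account", "click", "button", 0.91),
    reject_generic_selectors,
    execute=lambda p: "https://example.com/after",
)
print(trace["verification"]["reason"], trace["outcome"]["state"])
```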

Related community case: browserbase/stagehand#1880. BrowserTrace guide: Debug semantic verification boundaries.

Skyvern action confidence gap

Confidence is useful diagnostic metadata, but it is not authorization and does not prove a click, submit, or extraction target was correct. High confidence can still pair with stale context, the wrong target, or a consequence larger than the model expected.

Trace evidence to keep

Action proposal with target evidence, model rationale, and confidence value.
Authorization decision with policy/scope checks, approvals, or blocks.
Execution result with state delta, error, retry, or rollback decision.
VNC screenshot and CDP DOM/console/network slices linked to the same workflow step.
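
Keeping confidence out of the authorization path can be sketched as below. The `authorize` function, the scope names, and the decision strings are hypothetical and do not describe Skyvern's policy model; confidence is copied into the record for diagnosis but never read by the decision.

```python
def authorize(action, allowed_scopes, needs_approval):
    """Decide from policy only. Confidence is recorded, never consulted."""
    if action["scope"] not in allowed_scopes:
        decision, reason = "blocked", "scope_not_allowed"
    elif action["type"] in needs_approval:
        decision, reason = "pending_approval", "human_approval_required"
    else:
        decision, reason = "allowed", "policy_ok"
    return {
        "step_id": action["step_id"],
        "decision": decision,
        "reason": reason,
        "confidence": action.get("confidence"),
    }


high_confidence_submit = {"step_id": "st-4", "type": "submit", "scope": "payments", "confidence": 0.97}
low_confidence_search = {"step_id": "st-5", "type": "click", "scope": "search", "confidence": 0.41}

decision_a = authorize(high_confidence_submit, {"search"}, {"submit"})
decision_b = authorize(low_confidence_search, {"search"}, {"submit"})
print(decision_a["decision"], decision_b["decision"])
```

The high-confidence submit is blocked and the low-confidence search click is allowed, which is the separation the section argues for.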

Related community cases: Skyvern-AI/skyvern#5637 and Skyvern-AI/skyvern#3260. BrowserTrace guide: Debug action confidence and authorization.

Skyvern VNC/CDP debug integration

Visual VNC evidence and CDP browser-state evidence are much harder to use when they are stored as separate logs. Link them to the same task, workflow, and step ids so a reviewer can see which visual window each DOM, console, or network slice explains.

Trace evidence to keep

Connect/probe start, success, timeout, cleanup, and retry or recovery decision.
VNC screenshot or recording artifact ids linked to CDP DOM snapshot or selected-element summaries.
Task id, workflow id, step id, URL, frame/page id, action/tool name, status, and error.
Redaction state for screenshots, URLs, headers, cookies, and form values before public export.
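
Linking the two evidence streams amounts to grouping by a shared key. The sketch below assumes each record already carries `task_id`, `workflow_id`, and `step_id`; the `artifact_id` and `slice_id` fields and both helper functions are hypothetical.

```python
from collections import defaultdict


def link_evidence(vnc_artifacts, cdp_slices):
    """Group VNC artifacts and CDP slices under the same (task, workflow, step) key."""
    linked = defaultdict(lambda: {"vnc": [], "cdp": []})
    for art in vnc_artifacts:
        key = (art["task_id"], art["workflow_id"], art["step_id"])
        linked[key]["vnc"].append(art["artifact_id"])
    for sl in cdp_slices:
        key = (sl["task_id"], sl["workflow_id"], sl["step_id"])
        linked[key]["cdp"].append(sl["slice_id"])
    return dict(linked)


def unlinked_steps(linked):
    """Steps that have only one kind of evidence and need a closer look."""
    return sorted(key for key, ev in linked.items() if not ev["vnc"] or not ev["cdp"])


vnc = [
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s1", "artifact_id": "shot-1"},
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s2", "artifact_id": "shot-2"},
]
cdp = [
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s1", "slice_id": "dom-1"},
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s1", "slice_id": "net-1"},
]
linked = link_evidence(vnc, cdp)
print(unlinked_steps(linked))
```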

Related community case: Skyvern-AI/skyvern#3260. BrowserTrace guide: Debug VNC and CDP evidence together.

Skyvern multi-session VNC control drift

In local and self-hosted deployments, a live VNC view can become disconnected from the workflow session it is supposed to explain. The browser may still be reachable through CDP while the VNC stream has no frames, points at a shared display, or loses manual Take Control state after reconnect.

Trace evidence to keep

VNC stream identity, CDP target identity, workflow/task/session/page ids, and redacted display id.
Manual-control lease state: agent/manual owner, acquire/renew/release timestamps, persisted-across-reconnect flag, and release reason.
Isolation metadata showing whether the run had its own X display, container, or browser context.
Failure cause category: no VNC server, connected without frames, stale stream, manual-control lease lost, or display conflict.
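
The failure cause categories above can be assigned by a small ordered check. This is a sketch under assumptions: the observation keys, the check order, and the `stale_after_s` threshold are illustrative and would need to match whatever the deployment can actually observe.

```python
def classify_vnc_failure(obs):
    """Map raw observations to one failure cause category, most basic cause first."""
    if not obs["vnc_server_reachable"]:
        return "no_vnc_server"
    if obs["display_id"] in obs["displays_used_by_other_runs"]:
        return "display_conflict"
    if obs["frames_received"] == 0:
        return "connected_without_frames"
    if obs["seconds_since_last_frame"] > obs["stale_after_s"]:
        return "stale_stream"
    if obs["lease_owner_before_reconnect"] == "manual" and obs["lease_owner_after_reconnect"] != "manual":
        return "manual_control_lease_lost"
    return "ok"


base = {
    "vnc_server_reachable": True,
    "display_id": "display-a",  # redacted display id
    "displays_used_by_other_runs": set(),
    "frames_received": 120,
    "seconds_since_last_frame": 0.4,
    "stale_after_s": 5,
    "lease_owner_before_reconnect": "manual",
    "lease_owner_after_reconnect": "manual",
}
lost_lease = dict(base, lease_owner_after_reconnect="agent")
shared_display = dict(base, displays_used_by_other_runs={"display-a"})
print(classify_vnc_failure(lost_lease), classify_vnc_failure(shared_display))
```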

Related community case: Skyvern-AI/skyvern#4392. BrowserTrace guide: Debug multi-session VNC and Take Control drift.

Persistent browser recovery before first screenshot

Some custom computer-use agents fail during persistent browser recovery before a screenshot or URL exists because profile reuse, lock files, stale process detection, or CDP attach timing prevents a browser session from becoming usable.

Trace evidence to keep

session_mode, browser/session/target id, and redacted profile id.
Launch, connect, CDP attach/probe timing, timeout, and error fields.
Detected process ids, approval source, recovery action, and final connection state.
Redaction state for local profile paths, process details, URLs, and screenshots.
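
Because no screenshot or URL exists yet, the record has to be built from launch and attach timing alone. The sketch below is hypothetical throughout: `record_recovery`, `redacted_profile_id`, the salt, and the state names are illustrative, and `attach` stands in for the real connection routine.

```python
import hashlib
import time


def redacted_profile_id(profile_path, salt="local-only-salt"):
    """Stable short id so runs can be correlated without logging the path."""
    return hashlib.sha256((salt + profile_path).encode("utf-8")).hexdigest()[:12]


def record_recovery(session_mode, profile_path, attach, timeout_s):
    record = {
        "session_mode": session_mode,
        "profile_id": redacted_profile_id(profile_path),
        "phase": "cdp_attach",
        "timeout_s": timeout_s,
    }
    start = time.monotonic()
    try:
        attach(timeout_s)
        record["final_state"] = "connected"
    except TimeoutError as exc:  # must precede OSError, its parent class
        record.update(final_state="attach_timeout", error=str(exc))
    except OSError as exc:
        # OSError messages often contain paths, so keep only the type name.
        record.update(final_state="launch_failed", error=type(exc).__name__)
    record["duration_s"] = round(time.monotonic() - start, 3)
    return record


def _stalled_attach(timeout_s):
    raise TimeoutError(f"no CDP response within {timeout_s}s")


recovery = record_recovery("persistent", "/home/alice/.config/agent-profile", _stalled_attach, 5)
print(recovery["final_state"], recovery["profile_id"])
```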

BrowserTrace guide: Debug persistent browser session recovery.

Public-safe sharing

Run browsertrace export <run_id> --public -o public.html before attaching a trace to a public issue or community thread. The public export omits model input/output, screenshots, and URLs by default.
