Browser agent failure patterns
Most browser-agent failures are not explained by a stack trace alone. These are the failure shapes BrowserTrace is designed to preserve in one local timeline before you export a public-safe report.
Browser Use icon-only target mismatch
The screenshot shows the right plus icon, but the agent clicks a nearby toolbar button because the intended control has no stable accessible name. Tooltip text may appear only after hover and may not be associated with the actual clickable element.
Trace evidence to keep
- Screenshot and candidate bounding boxes for the intended icon, nearby controls, and the clicked element.
- Live button HTML, not only the intended fixed markup.
- Accessibility snapshot before hover and after hover.
- Model output and action label that explain why the wrong target ranked higher.
Related community case: browser-use/browser-use#4801. BrowserTrace guide: Debug Browser Use failures.
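As a minimal sketch of how the bounding-box evidence above could be checked, the following compares the click point against each candidate box and records whether the intended control had an accessible name. The `Candidate` record, the `classify_click` helper, and the element ids are hypothetical and are not part of BrowserTrace or Browser Use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One clickable element seen at click time (hypothetical record shape)."""
    element_id: str
    accessible_name: Optional[str]  # None when the control has no stable name
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def classify_click(candidates, intended_id, click_x, click_y):
    """Return evidence describing which candidate actually received the click."""
    hit = next((c for c in candidates if c.contains(click_x, click_y)), None)
    intended = next((c for c in candidates if c.element_id == intended_id), None)
    return {
        "intended_id": intended_id,
        "clicked_id": hit.element_id if hit else None,
        "mismatch": hit is None or hit.element_id != intended_id,
        "intended_has_name": bool(intended and intended.accessible_name),
    }


# Illustrative data: an unnamed plus icon next to a named toolbar button.
candidates = [
    Candidate("plus-icon", None, 100, 10, 24, 24),
    Candidate("toolbar-share", "Share", 130, 10, 24, 24),
]
evidence = classify_click(candidates, "plus-icon", click_x=140, click_y=20)
```

A record like this makes the mismatch visible without rereading the screenshot: the click landed inside the neighbouring box, and the intended icon had no name for the model to match on.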
Browser Use new-tab desync
A click or Enter action can open a new tab while the agent keeps reasoning from the stale page context. The symptom is repeated retries against the old page even though the expected element exists in the new tab.
Trace evidence to keep
- Page ids, tab indexes, URL, and title before and after the action.
- pages_before, pages_after, and new_pages with probe status.
- The action id that created the new page and the focused page before and after the step.
- Recommended recovery action such as switch_tab, stored as evidence instead of inferred from later retries.
Related community case: browser-use/browser-use#4758. BrowserTrace guide: Debug new-tab desync.
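The page comparison described above can be sketched as a small diff over the page lists captured around one action. The field names pages_before, pages_after, new_pages, and switch_tab come from the list above. The `diff_pages` helper, the page dict shape, and the `recommended_recovery` key are assumptions made for illustration.

```python
def diff_pages(pages_before, pages_after, focused_page_id):
    """Compare page lists captured before and after one action.

    Each page is a dict with at least 'page_id', 'url', and 'title'
    (an assumed shape, not a BrowserTrace schema).
    """
    before_ids = {p["page_id"] for p in pages_before}
    new_pages = [p for p in pages_after if p["page_id"] not in before_ids]
    new_ids = {p["page_id"] for p in new_pages}
    # Desync: a page appeared, but the agent is still focused on an old one.
    desync = bool(new_pages) and focused_page_id not in new_ids
    return {
        "pages_before": pages_before,
        "pages_after": pages_after,
        "new_pages": new_pages,
        "focused_page_id": focused_page_id,
        "recommended_recovery": "switch_tab" if desync else None,
    }


before = [{"page_id": "p1", "url": "https://example.test/list", "title": "List"}]
after = before + [{"page_id": "p2", "url": "https://example.test/item", "title": "Item"}]
step_evidence = diff_pages(before, after, focused_page_id="p1")
```

Storing the recommendation at the step where the tab opened means a reviewer does not have to infer it from the retries that follow.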
Browser Use remote CDP hang
A remote browser session can look connected while one CDP request never returns. Screenshot capture, DOM snapshot, browser-state collection, and recovery can then consume the whole timeout window, or block other sessions that are waiting on a shared lock.
Trace evidence to keep
- Event id, browser/session/target id, and CDP method name.
- Request id, start/end/duration, timeout, result, or transport error.
- Websocket ping/pong timestamps near the stalled method.
- Event-bus lock wait/acquire/release timing and recovery decision.
Related community case: browser-use/browser-use#4579. BrowserTrace guide: Debug remote CDP hangs.
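One way to capture the per-request timing listed above is to wrap each request in a deadline and always emit a record, whether the call returns, times out, or fails at the transport. This is a sketch using only `asyncio` from the standard library. The `timed_cdp_call` helper and its record keys are hypothetical, and `send` stands in for whatever coroutine actually talks to the browser.

```python
import asyncio
import time


async def timed_cdp_call(method, send, timeout_s):
    """Await one request with a deadline and return a timing record.

    `send` is any zero-argument coroutine function standing in for the
    real CDP transport call (an assumption for this sketch).
    """
    record = {"method": method, "timeout_s": timeout_s, "start": time.monotonic()}
    try:
        record["result"] = await asyncio.wait_for(send(), timeout=timeout_s)
        record["status"] = "ok"
    except asyncio.TimeoutError:
        record["status"] = "timeout"
    except ConnectionError as exc:
        record["status"] = "transport_error"
        record["error"] = str(exc)
    record["end"] = time.monotonic()
    record["duration_s"] = record["end"] - record["start"]
    return record


async def _never_returns():
    await asyncio.sleep(3600)  # simulates a request that hangs


async def _returns_quickly():
    return {"data": "ok"}


stalled = asyncio.run(timed_cdp_call("Page.captureScreenshot", _never_returns, 0.05))
healthy = asyncio.run(timed_cdp_call("DOM.getDocument", _returns_quickly, 0.05))
```

Because the record is written on every path, a hang shows up as an explicit timeout entry with the method name attached rather than as a gap in the log.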
Screenshot blob poisons model context
A screenshot blob can break future model turns when raw image bytes are copied into conversation history. The safer boundary is an artifact record plus compact metadata, with pixels passed only when the next model call needs a typed image content block.
Trace evidence to keep
- artifact_path or artifact id for durable local debugging.
- Image dimensions, media type, digest, status, and error metadata.
- Whether the image was sent to the model, omitted, resized, or redacted.
- Public export redaction state for screenshots, URLs, and model I/O.
Related community case: browser-use/browser-use#4742. BrowserTrace guide: Keep browser artifacts out of long-term model context.
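The artifact-record boundary described above might look like the following sketch, which reads the dimensions from a PNG header and stores a digest instead of the bytes. Only artifact_path is a name from the list above. The `artifact_record` helper and the other keys are assumptions, and the sample bytes are a bare PNG header built for illustration, not a complete image.

```python
import hashlib
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def artifact_record(png_bytes, artifact_path, sent_to_model=False):
    """Describe a PNG screenshot without embedding its bytes."""
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then big-endian width and height.
    if png_bytes[:8] != PNG_SIGNATURE or png_bytes[12:16] != b"IHDR":
        return {"artifact_path": artifact_path, "status": "error",
                "error": "not a PNG"}
    width, height = struct.unpack(">II", png_bytes[16:24])
    return {
        "artifact_path": artifact_path,
        "media_type": "image/png",
        "width": width,
        "height": height,
        "digest": "sha256:" + hashlib.sha256(png_bytes).hexdigest(),
        "status": "ok",
        "sent_to_model": sent_to_model,
    }


# Header-only sample: signature plus an IHDR chunk for a 1280x720 image.
sample = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
          + struct.pack(">IIBBBBB", 1280, 720, 8, 6, 0, 0, 0))

history = []  # stands in for conversation history
history.append(artifact_record(sample, "artifacts/step-7.png"))
```

Only the compact record enters the history. The pixels stay on disk under the artifact path and can be loaded when a later model call actually needs an image block.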
Stagehand custom-tool replay gap
A cache can replay normal page actions while skipping a custom_tool that mutated page state, filled credentials, or ran side-effectful code. Replay and diagnostic tracing need different contracts.
Trace evidence to keep
- Tool name, serialized args, stable tool-call or step id, status, and error.
- Whether the tool is replay-safe, replay-blocked, or manually rehydrated.
- URL/page id, observation id, screenshot id, result summary, and error context.
- Redaction boundary for credentials and sensitive tool args.
Related community case: browserbase/stagehand#1558. BrowserTrace guide: Debug custom tool replay gaps.
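To illustrate why replay and diagnostic tracing need different contracts, the sketch below walks cached steps and refuses to silently replay or skip a custom tool that is not marked replay-safe. The `Step` record, the `plan_replay` helper, and the policy strings are hypothetical and do not describe Stagehand's cache format.

```python
from dataclasses import dataclass


@dataclass
class Step:
    step_id: str
    kind: str  # "page_action" or "custom_tool"
    name: str
    # One of "replay-safe", "replay-blocked", "manually-rehydrated".
    replay_policy: str = "replay-safe"


def plan_replay(steps):
    """Split cached steps into those replayed as-is and those needing attention."""
    replayed, needs_attention = [], []
    for step in steps:
        if step.kind == "custom_tool" and step.replay_policy != "replay-safe":
            needs_attention.append((step.step_id, step.name, step.replay_policy))
        else:
            replayed.append(step.step_id)
    return replayed, needs_attention


cached = [
    Step("s1", "page_action", "goto"),
    Step("s2", "custom_tool", "fill_credentials", "replay-blocked"),
    Step("s3", "page_action", "click"),
]
replayed, needs_attention = plan_replay(cached)
```

Surfacing the blocked step explicitly is the point: a replay that skips s2 would reach s3 against page state that was never set up.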
Stagehand semantic verification boundary
A semantic verification layer around act can prevent unsafe or ambiguous actions, but the verifier result needs to be visible as its own action boundary instead of collapsed into a boolean.
Trace evidence to keep
- Action proposal with instruction, action type, selector, role, text, and confidence when available.
- Target evidence: URL, screenshot id, DOM snapshot id, candidate elements, and semantic endpoint evidence.
- Verification result with verifier type, status, reason, and whether execution was allowed.
- Execution outcome: executed, blocked, escalated, failed, URL after, and error.
Related community case: browserbase/stagehand#1880. BrowserTrace guide: Debug semantic verification boundaries.
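A minimal sketch of keeping the verifier result as its own boundary follows: the proposal, the verification, and the outcome are recorded as three separate parts of the step instead of one boolean. `VerificationResult`, `run_step`, and the toy `selector_verifier` are hypothetical names, not Stagehand APIs.

```python
from dataclasses import dataclass, asdict


@dataclass
class VerificationResult:
    verifier_type: str
    status: str  # e.g. "passed", "failed", or "ambiguous"
    reason: str
    execution_allowed: bool


def run_step(proposal, verify, execute):
    """Verify a proposed action, then execute it only if allowed."""
    verification = verify(proposal)
    if verification.execution_allowed:
        outcome = {"status": "executed", "url_after": execute(proposal)}
    else:
        outcome = {"status": "blocked", "url_after": None}
    return {
        "proposal": proposal,
        "verification": asdict(verification),
        "outcome": outcome,
    }


def selector_verifier(proposal):
    # Toy verifier: refuse actions that carry no selector to check against.
    if proposal.get("selector"):
        return VerificationResult("selector_present", "passed",
                                  "selector supplied", True)
    return VerificationResult("selector_present", "ambiguous",
                              "no selector in proposal", False)


trace = run_step(
    {"instruction": "click Delete account", "action_type": "click", "selector": None},
    selector_verifier,
    execute=lambda p: "https://example.test/after",
)
```

With this shape a reviewer can tell a blocked action from a failed one, and can read the verifier's reason without rerunning it.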
Skyvern action confidence gap
Confidence is useful diagnostic metadata, but it is not authorization and does not prove a click, submit, or extraction target was correct. High confidence can still pair with stale context, the wrong target, or a consequence larger than the model expected.
Trace evidence to keep
- Action proposal with target evidence, model rationale, and confidence value.
- Authorization decision with policy/scope checks, approvals, or blocks.
- Execution result with state delta, error, retry, or rollback decision.
- VNC screenshot and CDP DOM/console/network slices linked to the same workflow step.
Related community cases: Skyvern-AI/skyvern#5637 and Skyvern-AI/skyvern#3260. BrowserTrace guide: Debug action confidence and authorization.
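The separation between confidence and authorization can be sketched as a policy check that records the confidence value but never reads it when deciding. The `authorize` helper, the scope names, and the decision strings are assumptions for illustration and are not Skyvern behaviour.

```python
def authorize(action, allowed_scopes, approval_required):
    """Decide from policy alone; confidence is recorded but never consulted."""
    if action["scope"] not in allowed_scopes:
        decision, reason = "blocked", "scope not permitted"
    elif action["type"] in approval_required and not action.get("approved_by"):
        decision, reason = "needs_approval", "human approval missing"
    else:
        decision, reason = "allowed", "policy checks passed"
    return {
        "decision": decision,
        "reason": reason,
        "confidence": action["confidence"],  # kept as diagnostic metadata only
    }


high_confidence_submit = {"type": "submit", "scope": "billing", "confidence": 0.97}
authz = authorize(
    high_confidence_submit,
    allowed_scopes={"search", "billing"},
    approval_required={"submit"},
)
```

Here a 0.97 confidence still yields a decision that waits for approval, which is the distinction the trace needs to preserve.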
Skyvern VNC/CDP debug integration
Visual VNC evidence and CDP browser-state evidence are much harder to use when they are stored as separate logs. Link them to the same task, workflow, and step ids so a reviewer can see which visual window each DOM, console, or network slice explains.
Trace evidence to keep
- Connect/probe start, success, timeout, cleanup, and retry or recovery decision.
- VNC screenshot or recording artifact ids linked to CDP DOM snapshot or selected-element summaries.
- Task id, workflow id, step id, URL, frame/page id, action/tool name, status, and error.
- Redaction state for screenshots, URLs, headers, cookies, and form values before public export.
Related community case: Skyvern-AI/skyvern#3260. BrowserTrace guide: Debug VNC and CDP evidence together.
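As a rough sketch of the linking described above, visual artifacts and CDP slices can be grouped under a shared (task id, workflow id, step id) key, which also exposes steps that have only one kind of evidence. The `link_by_step` helper and the record keys are hypothetical.

```python
from collections import defaultdict


def link_by_step(vnc_artifacts, cdp_slices):
    """Group VNC artifacts and CDP slices under a shared step key."""
    steps = defaultdict(lambda: {"vnc": [], "cdp": []})
    for artifact in vnc_artifacts:
        key = (artifact["task_id"], artifact["workflow_id"], artifact["step_id"])
        steps[key]["vnc"].append(artifact["artifact_id"])
    for cdp_slice in cdp_slices:
        key = (cdp_slice["task_id"], cdp_slice["workflow_id"], cdp_slice["step_id"])
        steps[key]["cdp"].append(cdp_slice["slice_id"])
    # Steps with only one kind of evidence cannot be cross-checked.
    unlinked = sorted(k for k, v in steps.items() if not v["vnc"] or not v["cdp"])
    return dict(steps), unlinked


vnc = [
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s1", "artifact_id": "shot-1"},
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s2", "artifact_id": "shot-2"},
]
cdp = [
    {"task_id": "t1", "workflow_id": "w1", "step_id": "s1", "slice_id": "dom-1"},
]
linked, unlinked = link_by_step(vnc, cdp)
```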
Skyvern multi-session VNC control drift
In local and self-hosted deployments, a live VNC view can become disconnected from the workflow session it is supposed to explain. The browser may still be reachable through CDP while the VNC stream has no frames, points at a shared display, or loses manual Take Control state after reconnect.
Trace evidence to keep
- VNC stream identity, CDP target identity, workflow/task/session/page ids, and redacted display id.
- Manual-control lease state: agent/manual owner, acquire/renew/release timestamps, persisted-across-reconnect flag, and release reason.
- Isolation metadata showing whether the run had its own X display, container, or browser context.
- Failure cause category: no VNC server, connected without frames, stale stream, manual-control lease lost, or display conflict.
Related community case: Skyvern-AI/skyvern#4392. BrowserTrace guide: Debug multi-session VNC and Take Control drift.
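The failure cause categories above could be assigned from a few observed signals, as in this sketch. The category strings come from the list above. The `classify_vnc_failure` helper, its input signals, the check order, and the staleness threshold are all assumptions. In particular, treating a failed connection as "no VNC server" is a simplification.

```python
def classify_vnc_failure(vnc_connected, frames_received, seconds_since_last_frame,
                         display_shared, lease_owner_before, lease_owner_after,
                         stale_after_s=10.0):
    """Map observed signals to one failure cause category, or None if healthy."""
    if not vnc_connected:
        return "no VNC server"
    if display_shared:
        return "display conflict"
    if frames_received == 0:
        return "connected without frames"
    if seconds_since_last_frame > stale_after_s:
        return "stale stream"
    if lease_owner_before == "manual" and lease_owner_after != "manual":
        return "manual-control lease lost"
    return None


# A stream that connected but never delivered a frame.
no_frames = classify_vnc_failure(True, 0, 0.0, False, "agent", "agent")
# A healthy stream where manual control did not survive a reconnect.
lease_lost = classify_vnc_failure(True, 120, 0.5, False, "manual", "agent")
```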
Persistent browser recovery before first screenshot
Some custom computer-use agents fail during persistent browser recovery, before any screenshot or URL exists. Profile reuse, lock files, stale process detection, or CDP attach timing can each prevent the browser session from becoming usable.
Trace evidence to keep
- session_mode, browser/session/target id, and redacted profile id.
- Launch, connect, CDP attach/probe timing, timeout, and error fields.
- Detected process ids, approval source, recovery action, and final connection state.
- Redaction state for local profile paths, process details, URLs, and screenshots.
BrowserTrace guide: Debug persistent browser session recovery.
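A recovery record along the lines above can be sketched without leaking the local profile path by replacing it with a salted hash. Only session_mode is a name from the list above. The `redacted_profile_id` and `recovery_record` helpers, the attempt dict shape, and the state strings are hypothetical.

```python
import hashlib


def redacted_profile_id(profile_path, salt):
    """Stable, non-reversible id so runs can be compared without the path."""
    digest = hashlib.sha256((salt + profile_path).encode("utf-8")).hexdigest()
    return "profile-" + digest[:12]


def recovery_record(session_mode, profile_path, attach_attempts, salt):
    """Summarize recovery attempts made before any page state exists.

    Each attempt is a dict with 'duration_s' and 'error' (None on success).
    """
    last = attach_attempts[-1] if attach_attempts else None
    connected = last is not None and last["error"] is None
    return {
        "session_mode": session_mode,
        "profile_id": redacted_profile_id(profile_path, salt),
        "attach_attempts": attach_attempts,
        "final_connection_state": "connected" if connected else "failed",
    }


attempts = [
    {"duration_s": 5.0, "error": "attach timeout"},
    {"duration_s": 5.0, "error": "profile locked by another process"},
]
record = recovery_record("persistent", "/home/alice/.config/agent-profile",
                         attempts, salt="local-only-salt")
```

The same path and salt always produce the same id, so repeated failures against one profile remain visible in a public export even though the path itself is omitted.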
Public-safe sharing
Use browsertrace export <run_id> --public -o public.html before attaching a trace to a public issue or community thread. The public export omits model input/output, screenshots, and URLs by default.