Create a Browser Agent
Build an agent that navigates websites, extracts data, and fills forms — with built-in stealth mode and web search.
Prerequisites
pip install 'gantrygraph[browser]'
playwright install chromium
Step 1 — Minimal browser agent
from gantrygraph import GantryEngine
from gantrygraph.perception import WebPage
from gantrygraph.actions import BrowserTools
from langchain_anthropic import ChatAnthropic
web = WebPage(url="https://news.ycombinator.com", headless=True)
agent = GantryEngine(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    perception=web,
    tools=[BrowserTools(web_page=web)],
    max_steps=20,
)
result = agent.run("Find the top 5 stories and return their titles and links.")
print(result)
Passing the same WebPage instance to both perception= and BrowserTools(web_page=web) ensures they share one browser tab. The agent sees a screenshot and the accessibility tree on every step.
Step 2 — Stealth mode (default on)
Both WebPage and BrowserTools ship with stealth=True by default. This sets a realistic
Chrome user-agent, patches navigator.webdriver to undefined, populates navigator.plugins
and navigator.languages, and passes --disable-blink-features=AutomationControlled at
launch. Click and fill actions also add small random delays to mimic human timing.
# stealth=True is the default — no changes needed for most sites
web = WebPage(url="https://example.com", headless=True, stealth=True)
tools = [BrowserTools(web_page=web, stealth=True)]
Turn it off only if you're testing against a local server where fingerprinting doesn't matter:
web = WebPage(url="http://localhost:3000", headless=True, stealth=False)
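To make the stealth behaviour described in this step more concrete, the sketch below assembles comparable settings with plain Playwright. It is an approximation under stated assumptions, not gantrygraph's actual implementation: the user-agent string, the delay range, the stubbed plugin list, and the helper names (stealth_launch_args, stealth_init_script, human_delay, open_stealth_page) are all hypothetical.

```python
import random
import time

# Hypothetical value for illustration; gantrygraph's real default may differ.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def stealth_launch_args() -> list:
    """Chromium flag that disables the AutomationControlled Blink feature."""
    return ["--disable-blink-features=AutomationControlled"]


def stealth_init_script() -> str:
    """JavaScript injected before page scripts to mask common automation tells."""
    return """
    Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
    Object.defineProperty(navigator, 'languages', { get: () => ['en-US', 'en'] });
    Object.defineProperty(navigator, 'plugins', { get: () => [1, 2, 3] });
    """


def human_delay(low: float = 0.05, high: float = 0.25) -> float:
    """Sleep for a small random interval and return the time slept."""
    pause = random.uniform(low, high)
    time.sleep(pause)
    return pause


def open_stealth_page(url: str) -> str:
    """Open url with the settings above and return the page title."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, args=stealth_launch_args())
        context = browser.new_context(user_agent=USER_AGENT)
        context.add_init_script(stealth_init_script())
        page = context.new_page()
        page.goto(url)
        human_delay()  # pause briefly, as a person would before acting
        title = page.title()
        browser.close()
        return title
```

Calling open_stealth_page("https://example.com") requires the [browser] extra (or Playwright itself) and an installed Chromium; the three helper functions run without either.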
Step 3 — Web search (search engines block bots — use the API instead)
Google, Bing, and DuckDuckGo detect headless browsers and block them with CAPTCHAs regardless of
stealth patches. For search queries, use WebSearchTool, which calls the
Tavily search API instead: it returns structured results and needs no browser.
pip install 'gantrygraph[search]'
from gantrygraph import GantryEngine
from gantrygraph.actions import BrowserTools, WebSearchTool
from langchain_anthropic import ChatAnthropic
import os
agent = GantryEngine(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    tools=[
        WebSearchTool(api_key=os.environ["TAVILY_API_KEY"]),
        BrowserTools(),
    ],
    max_steps=20,
)
result = agent.run(
    "Search for 'Python async best practices 2024' and open the top result."
)
The agent calls web_search to get results, then uses browser_navigate to open the page it
wants to read in depth. Get a free Tavily key (1 000 queries/month) at
tavily.com.
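Two small helpers can make this flow easier to debug outside the agent. The sketch below fails early with a readable message when the API key is missing, and picks the best URL from a search response. The helper names are hypothetical, and the response shape (a "results" list whose items carry "title", "url", and "score") is an assumption modelled on Tavily-style payloads, not something WebSearchTool is confirmed to return.

```python
import os
from typing import Optional


def require_env(name: str) -> str:
    """Return an environment variable or fail with an actionable message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set {name} before starting the agent.")
    return value


def top_result_url(response: dict) -> Optional[str]:
    """Return the URL of the highest-scoring result, or None if there are none.

    Assumes a payload shaped like {"results": [{"title", "url", "score"}, ...]}.
    """
    results = response.get("results", [])
    if not results:
        return None
    best = max(results, key=lambda r: r.get("score", 0.0))
    return best.get("url")
```

With these in place, WebSearchTool(api_key=require_env("TAVILY_API_KEY")) raises a clear RuntimeError instead of a bare KeyError when the variable is unset.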
Preset shortcut — pass search_api_key= and everything is wired up automatically:
import os
from gantrygraph.presets import browser_agent
from langchain_anthropic import ChatAnthropic
agent = browser_agent(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    start_url="https://example.com",
    search_api_key=os.environ["TAVILY_API_KEY"],
)
Step 4 — Scrape without perception
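from gantrygraph import GantryEngine
from gantrygraph.actions import BrowserTools
from langchain_anthropic import ChatAnthropic
# No perception= argument: the agent relies on tool output only
agent = GantryEngine(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    tools=[BrowserTools(headless=True)],
    max_steps=10,
)
result = agent.run(
    "Go to https://pypi.org/project/gantrygraph/ and return the latest version number."
)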
Skipping perception= omits screenshots from every loop step, which cuts token cost
significantly for pure-extraction tasks.
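As a rough back-of-the-envelope check on that saving, the sketch below uses the approximation from Anthropic's vision documentation that an image costs about (width × height) / 750 tokens. The 1280×720 viewport is an assumption, not a documented gantrygraph default, and the estimate ignores image resizing and any re-sending of earlier screenshots.

```python
import math


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Approximate image tokens as (width * height) / 750, rounded up."""
    return math.ceil(width_px * height_px / 750)


def screenshot_overhead(steps: int, width_px: int = 1280, height_px: int = 720) -> int:
    """Tokens spent on one screenshot per step; the viewport size is an assumption."""
    return steps * estimate_image_tokens(width_px, height_px)


print(screenshot_overhead(steps=10))  # 12290 with the assumed 1280x720 viewport
```

Under these assumptions a 10-step run spends roughly 12 000 tokens on screenshots alone, before counting the accessibility tree.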
Step 5 — Fill a form
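from gantrygraph import GantryEngine
from gantrygraph.perception import WebPage
from gantrygraph.actions import BrowserTools
from langchain_anthropic import ChatAnthropic
web = WebPage(url="https://myapp.example.com/login", headless=False)
agent = GantryEngine(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    perception=web,
    tools=[BrowserTools(web_page=web)],
    max_steps=15,
)
agent.run("Log in with username 'admin' and password 'secret', then go to the dashboard.")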
Set headless=False while developing so you can watch the agent interact with the page.
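Hard-coding credentials in the task string is fine for a demo, but in a real project you will probably want them out of source control. One possible approach, sketched here, builds the task from values read at runtime. The helper name build_login_task and the environment variable names APP_USERNAME and APP_PASSWORD are hypothetical.

```python
import os


def build_login_task(username: str, password: str) -> str:
    """Compose the task prompt without embedding credentials in source code."""
    if not username or not password:
        raise ValueError("Both username and password are required.")
    return (
        f"Log in with username {username!r} and password {password!r}, "
        "then go to the dashboard."
    )


# Example usage (APP_USERNAME and APP_PASSWORD are hypothetical names):
# task = build_login_task(os.environ["APP_USERNAME"], os.environ["APP_PASSWORD"])
# agent.run(task)
```

The credentials still travel to the model provider as part of the prompt, so use a low-privilege test account where possible.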
Complete example — scrape + save
from gantrygraph import GantryEngine
from gantrygraph.perception import WebPage
from gantrygraph.actions import BrowserTools, FileSystemTools
from langchain_anthropic import ChatAnthropic
web = WebPage(url="https://github.com/trending/python?since=weekly", headless=True)
agent = GantryEngine(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    perception=web,
    tools=[
        BrowserTools(web_page=web),
        FileSystemTools(workspace="/tmp/results"),
    ],
    max_steps=30,
)
result = agent.run(
    "Extract all repository names and star counts from the trending page "
    "and save them as JSON to trending.json."
)
print(result)
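Because the model decides the exact shape of the file it writes, it can be worth validating the output before using it downstream. The sketch below assumes the agent produced a JSON array of objects with "name" and "stars" keys. That schema and the helper name load_trending are hypothetical, so adjust them to whatever your prompt asks for.

```python
import json
from pathlib import Path


def load_trending(path: str) -> list:
    """Load the agent's output and check it matches the assumed schema."""
    rows = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(rows, list):
        raise ValueError("Expected a JSON array of repositories.")
    for row in rows:
        # Each entry must be an object carrying both assumed keys.
        if not isinstance(row, dict) or "name" not in row or "stars" not in row:
            raise ValueError(f"Unexpected entry: {row!r}")
    return rows
```

For the example above, the call would be load_trending("/tmp/results/trending.json"), assuming the file lands directly inside the workspace directory.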
Browser tools reference
| Tool | What it does |
|---|---|
| browser_navigate | Open a URL |
| browser_click | Click a CSS or XPath selector |
| browser_click_text | Click any button/link by its visible text label; more robust than CSS selectors on dynamic pages and consent banners |
| browser_fill | Type text into an input field |
| browser_get_text | Return visible text from an element or the whole page |
| browser_get_url | Return the current URL |
| browser_scroll | Scroll the page ("down", "up", "top", "bottom") |
| browser_evaluate | Execute a JavaScript expression and return the result |
| browser_wait_for_selector | Wait until a CSS/XPath selector becomes visible |
Stability options
| Parameter | Default | Description |
|---|---|---|
| max_steps | 50 | Hard cap on act-node executions |
| max_consecutive_errors | 5 | Stop early if the same error repeats; catches infinite CAPTCHA / redirect loops |
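To illustrate what "stop early if the same error repeats" could mean in practice, here is a minimal sketch of a consecutive-error counter. It is not gantrygraph's implementation: the class name is hypothetical, and whether the engine compares error messages or simply counts failures is not stated here, so this sketch compares messages.

```python
class ConsecutiveErrorGuard:
    """Signal a stop once the same error message repeats `limit` times in a row."""

    def __init__(self, limit: int = 5) -> None:
        self.limit = limit
        self._last_error = None
        self._streak = 0

    def record(self, error) -> bool:
        """Record a step outcome (None means success); return True when the run should stop."""
        if error is None:
            # A successful step resets the streak.
            self._last_error, self._streak = None, 0
            return False
        if error == self._last_error:
            self._streak += 1
        else:
            # A different error starts a new streak.
            self._last_error, self._streak = error, 1
        return self._streak >= self.limit
```

With limit=3, three identical CAPTCHA errors in a row end the run, while an intervening success or a different error resets the count. This bounds the cost of a blocked page well below max_steps.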
Variants
- Visible browser for development: WebPage(url="...", headless=False)
- Firefox or WebKit: WebPage(url="...", browser_type="firefox")
- Accessibility tree only (no screenshot): WebPage(url="...", include_screenshot=False)
- Screenshot only (no accessibility tree): WebPage(url="...", include_accessibility=False)
- No stealth (local dev): WebPage(url="...", stealth=False)
Troubleshooting
ImportError: BrowserTools requires the [browser] extra — run pip install 'gantrygraph[browser]' && playwright install chromium.
CAPTCHA on Google / Bing / DuckDuckGo — search engines block headless browsers. Use WebSearchTool with a Tavily API key instead; see Web search.
TimeoutError on page load — the default wait_until="domcontentloaded" can be slow on heavy pages. Call browser_wait_for_selector first or add a BudgetPolicy(max_wall_seconds=60).
Agent clicks wrong element — use browser_click_text with the exact button label, or use browser_evaluate to click via JavaScript.
Agent loops on a blocked page — set max_consecutive_errors=3 so the engine stops early instead of exhausting max_steps.
Next: Web search · Connect external services with MCP · Read and write files