GantryGraph v0.2.0 is now live on PyPI

Build AI agents that
use your computer. In Python.

GantryGraph gives developers a simple, highly customizable way to build AI agents that operate the OS — desktop, browser, and your own tools. Built on LangGraph. MIT licensed. Yours to extend.

MIT License · Python 3.10+ · LangGraph-native · MCP-ready

// 04 PILLARS

Built for developers. Built for control.

Designed so any Python developer can spin up a custom AI agent that drives the computer — and shape every piece of how it runs. No magic, no hidden complexity, no vendor lock-in.

Secure by default · Zero-Trust

The agent is born locked-down. Use GuardrailPolicy to block domains, limit file-system roots, and require human approval for sensitive actions. Production-grade guardrails, not promises.
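
To make the locked-down idea concrete, here is a minimal stand-alone sketch of the kind of check a guardrail policy performs. It does not import GantryGraph: `SketchPolicy`, `blocked_domains`, and `allowed_roots` are hypothetical names invented for illustration, not the actual GuardrailPolicy API.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse


@dataclass(frozen=True)
class SketchPolicy:
    """Hypothetical stand-in for a guardrail policy (not GantryGraph's API)."""

    blocked_domains: frozenset = frozenset()
    allowed_roots: tuple = ()  # empty means no file access at all

    def url_allowed(self, url: str) -> bool:
        host = (urlparse(url).hostname or "").lower()
        # Block each listed domain and any of its subdomains.
        return not any(
            host == d or host.endswith("." + d) for d in self.blocked_domains
        )

    def path_allowed(self, path: str) -> bool:
        target = Path(path).resolve()
        # resolve() collapses "..", so traversal outside a root is rejected.
        return any(
            target.is_relative_to(Path(root).resolve())
            for root in self.allowed_roots
        )
```

With no roots configured, every file path is refused, which is the opt-in behavior the paragraph above describes.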

Model-agnostic vision

Not locked into Claude. Pass any Vision LLM that speaks the LangChain interface — Claude, GPT-4o, Gemini, or local models. Perception is a first-class plug-in.

Infinite tooling via MCP

Native Model Context Protocol support. Connect your agent to GitHub, Slack, Postgres, or any MCP server via MCPConnector. Custom tools integrate as LangChain BaseTool subclasses.

Cloud-native & headless

Designed for servers. Playwright runs headless in Docker/k8s. Async streaming via astream_events() maps cleanly to WebSockets and remote UIs.
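
The streaming claim can be pictured as an async iterator of events being forwarded to a transport. The sketch below uses only the standard library. `fake_agent_events` stands in for the agent's event stream, and the event dictionary shape is an assumption rather than GantryGraph's actual payload. In a real service, `send` would be something like a WebSocket's text-send coroutine.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable


async def fake_agent_events() -> AsyncIterator[dict]:
    """Stand-in event source; a real agent would yield these from its stream."""
    for step in ("observe", "think", "act", "review"):
        await asyncio.sleep(0)  # yield control, as real I/O would
        yield {"event": "on_step", "name": step}


async def forward_events(
    events: AsyncIterator[dict],
    send: Callable[[str], Awaitable[None]],
) -> int:
    """Serialize each event to JSON, push it to the transport, return the count."""
    sent = 0
    async for event in events:
        await send(json.dumps(event))
        sent += 1
    return sent


async def demo() -> list:
    outbox: list = []

    async def send(message: str) -> None:  # plays the role of a WebSocket send
        outbox.append(message)

    await forward_events(fake_agent_events(), send)
    return outbox
```

Because the forwarder only needs an async iterator and an awaitable `send`, the same shape works for server-sent events or a message queue.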

// HOW IT WORKS

A ReAct loop, engineered for the OS.

User intent flows through the Gantry Engine, branches across desktop, browser, and local tools, then streams results asynchronously back to your UI.

User Intent: "Find the pricing page"
Gantry Engine: LangGraph state machine (observe → think → act → review), GuardrailPolicy · BudgetPolicy
DesktopScreen: pyautogui · mss
WebPage: playwright · stealth
FileSystemTools: read · write · shell
MCPConnector: github · slack · pg
Async Stream: events → WebSocket / UI
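
As a rough illustration of how a bounded observe → think → act → review loop can terminate, the sketch below collapses each iteration into one callback and stops on completion, on a step budget, or after several consecutive errors. It is not the engine's implementation. `run_loop`, `LoopResult`, and both limits are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    steps: int
    stopped_because: str  # "done", "budget", or "errors"


def run_loop(
    act: Callable[[int], str],
    max_steps: int = 10,
    max_consecutive_errors: int = 3,
) -> LoopResult:
    """Run one `act` callback per step; it returns "ok", "error", or "done"."""
    errors_in_a_row = 0
    for step in range(1, max_steps + 1):
        outcome = act(step)
        if outcome == "done":
            return LoopResult(step, "done")
        # Review phase: reset the counter on success, bump it on failure.
        errors_in_a_row = errors_in_a_row + 1 if outcome == "error" else 0
        if errors_in_a_row >= max_consecutive_errors:
            return LoopResult(step, "errors")
    # The step budget ran out before the task finished.
    return LoopResult(max_steps, "budget")
```

Counting only consecutive errors lets the agent recover from an occasional failed click while still breaking out of a repeating failure.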

// USE CASES

Browser, desktop, research — one library.

Real production patterns you can ship today.

browser_automation.py
from langchain_anthropic import ChatAnthropic  # provides the chat model passed as llm=

from gantrygraph import GantryEngine
from gantrygraph.perception import WebPage
from gantrygraph.actions import BrowserTools

# Headless browser session pointed at the page the agent should read
web = WebPage(url="https://news.ycombinator.com", headless=True)

agent = GantryEngine(
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    perception=web,
    tools=[BrowserTools(web_page=web)],
)

# Scrape top stories with screenshots on every step
result = agent.run("Find the top 5 stories and their links")
Stealth mode — UA patch, webdriver undefined, random delays
browser_click_text bypasses CSS-selector failures on dynamic DOM
Combine with WebSearchTool to search + browse in one agent

// COMPARISON

How GantryGraph stacks up.

An objective technical comparison against other Computer-Use frameworks. Every claim is verifiable in the docs.

Capability | GantryGraph | OpenHands | browser-use | Anthropic Demo
Integration model | Python library, drop into any service | Standalone platform / Docker | Browser-only library | Single-script demo
Scope | Full OS · browser · MCP · local fns | Coding-agent IDE | Web pages only | Desktop screen + bash
Zero-trust guardrails~~
Model-agnostic LLM~
Native MCP support
Async event streaming~
Headless / Docker-ready
Bot-detection resistance~~
Anti-loop consecutive errors~
License | MIT | MIT | MIT | sample code

✓ supported · ~ partial / requires custom code · — not supported

// FAQ

Questions, answered.

Common questions from engineers evaluating GantryGraph.

GantryGraph is a Python library you drop into any service — not a standalone platform. Its scope spans the entire OS (desktop apps, browsers, local Python functions, and MCP servers) instead of being limited to coding workflows or web pages. GuardrailPolicy, native MCP support, and async event streaming come built-in rather than as add-ons.
No lock-in. GantryEngine accepts any LangChain chat model — Claude, GPT-4o, Gemini, Pixtral, Qwen2-VL, or any local model exposed over an OpenAI-compatible endpoint. Pass it as llm= at construction time.
GantryGraph is zero-trust by design. Use GuardrailPolicy to control which domains can be visited, which shell commands are allowed, and which actions require human approval via approval_callback. You explicitly opt in to capabilities, not out of them.
Yes. Playwright runs headless inside any container. The agent loop is fully async, so it maps cleanly to FastAPI WebSocket endpoints and remote UIs.
BrowserTools ships with stealth=True by default — it patches navigator.webdriver, sets a realistic Chrome UA, populates navigator.plugins and navigator.languages, and adds random delays to clicks. For search queries, use WebSearchTool (Tavily API) instead of navigating to search engines directly.
Pass an MCPConnector instance in tools=[]. Tools are auto-discovered from the server. Connect to GitHub, Slack, Postgres, Notion, arXiv, or any custom MCP server with one import.
Yes — MIT licensed on GitHub and PyPI. Free forever for the core library.
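
The opt-in model from the guardrails answer above (allowed shell commands plus human approval) can be sketched in a few lines. This is an illustration only. `gate_shell`, `ALLOWED_COMMANDS`, and `NEEDS_APPROVAL` are hypothetical, and the callback signature shown for `approval_callback` is an assumption rather than the library's documented one.

```python
import shlex
from typing import Callable

ALLOWED_COMMANDS = {"ls", "cat", "git"}  # hypothetical allowlist
NEEDS_APPROVAL = {"git"}                 # allowed, but only with sign-off


def gate_shell(command: str, approval_callback: Callable[[str], bool]) -> bool:
    """Return True only if the command may run under this sketch policy."""
    parts = shlex.split(command)
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        return False  # anything not explicitly listed is denied
    if parts[0] in NEEDS_APPROVAL:
        return bool(approval_callback(command))  # a human (or a stub) decides
    return True
```

For example, `gate_shell("rm -rf /", lambda c: True)` is refused even when the callback approves, because the command was never opted in.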

Build your own agent.
Make it yours.

Open source · MIT license · Python 3.10+