ClawSight 0.9.2 is out

Make your life easy while working with AI agents on your Mac.

ClawSight gives your OpenClaw and Hermes agents real-time context from your Mac: what app you are in, what is on screen, and what you want to do, so they stop guessing and start acting.

How It Works

One hotkey. Full context. Better results.

ClawSight turns what is happening on your Mac into something your agent can actually use.

01

Press Control + Option

Instantly capture your screen, the active window, and what you are trying to do.

02

Rich context is built

Screenshot plus active app, structured UI elements, conversation history, and your voice.

03

Sent to your chosen agent

The enriched request is delivered to OpenClaw or Hermes so your agent can understand the task.

Why It Matters

Most agents are flying blind. We fix that.

Structured context, not just screenshots

We send the screenshot plus clean accessibility data, element positions, app state, and recent conversation history.
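As a rough illustration of what "structured context" could look like, here is a minimal Python sketch. Every name in it (`UIElement`, `ContextPayload`, the field names, the JSON layout) is a hypothetical stand-in chosen for this example. It is not ClawSight's actual payload format, which this page does not specify.

```python
import base64
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class UIElement:
    """Hypothetical shape for one accessibility element and its position."""
    role: str      # e.g. "button" or "text_field"
    label: str     # human-readable name of the element
    x: float       # top-left corner, in screen points
    y: float
    width: float
    height: float


@dataclass
class ContextPayload:
    """Hypothetical bundle: screenshot plus the structured data around it."""
    active_app: str
    screenshot_png: bytes
    ui_elements: List[UIElement] = field(default_factory=list)
    conversation: List[str] = field(default_factory=list)
    voice_transcript: str = ""

    def to_json(self) -> str:
        # asdict() converts nested dataclasses to plain dicts.
        data = asdict(self)
        # Raw bytes are not JSON-serializable, so encode the image as base64.
        data["screenshot_png"] = base64.b64encode(self.screenshot_png).decode("ascii")
        return json.dumps(data)


def example_payload() -> ContextPayload:
    # Illustrative values only; the bytes below are not a real image.
    return ContextPayload(
        active_app="Telegram",
        screenshot_png=b"placeholder-image-bytes",
        ui_elements=[UIElement("button", "Send", 900.0, 640.0, 48.0, 32.0)],
        conversation=["Can you review the draft?"],
        voice_transcript="Reply that I will review it tonight",
    )
```

The point of the sketch is that the agent receives named elements with positions alongside the image, rather than having to infer everything from pixels.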

Makes computer use actually reliable

Agents with action capabilities perform dramatically better when they understand what they are looking at.

One interface. Your choice of agent.

Route requests to OpenClaw or Hermes depending on the task. You stay in control.

Built for real workflows

Designed for people who already use OpenClaw or Hermes and want their agents to stop guessing.

Real Examples

What becomes possible when your agents can actually see your screen

Reply to messages with full context

Instead of you posting screenshots to Telegram, your agent sees the thread, the sender, and the exact message you want to respond to.

Navigate complex interfaces

Find the right button, field, or menu item even in dense apps where normal agents get lost.

Turn screen + voice into action

Say what you want while looking at the screen. The agent receives both your intent and the exact state of the interface.

Seen in the wild

From caveman screenshots to ClawSight

Too many AI workflows still rely on dragging random screenshots into chat and hoping the model figures it out. ClawSight upgrades that mess into structured screen context, active-app state, and voice intent your agent can actually use.

Same human workflow. Far less guesswork. Much better odds your agent does the right damn thing.

[Image: a post joking that people are still dropping screenshots into chat manually instead of using structured screen context.]
Real screenshot. Real pain. ClawSight fixes the caveman part.

Trust & Transparency

Runs on your Mac. Sends to the agent you choose.

ClawSight runs locally on your Mac. When you trigger it, it captures a screenshot and structured UI context, then sends that to the agent platform you choose: OpenClaw or Hermes.

Voice is transcribed before being sent. By default, we use AssemblyAI for transcription, which means audio leaves your Mac. You can switch to on-device Apple Speech if you prefer everything to stay local.
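The transcription choice described above can be pictured as a single setting with a privacy consequence. The Python sketch below is only an illustration: `Transcriber`, `VoiceSettings`, and `audio_leaves_mac` are hypothetical names invented for this example, not ClawSight's real settings or API.

```python
from dataclasses import dataclass
from enum import Enum


class Transcriber(Enum):
    # Hypothetical identifiers for the two options named on this page.
    ASSEMBLYAI = "assemblyai"      # cloud transcription, the stated default
    APPLE_SPEECH = "apple_speech"  # on-device transcription


@dataclass(frozen=True)
class VoiceSettings:
    # Defaults to the cloud option, matching the default described above.
    transcriber: Transcriber = Transcriber.ASSEMBLYAI

    @property
    def audio_leaves_mac(self) -> bool:
        # Only the cloud option sends audio off the machine.
        return self.transcriber is Transcriber.ASSEMBLYAI
```

With the default, `VoiceSettings().audio_leaves_mac` is true. Switching to `Transcriber.APPLE_SPEECH` makes it false, which is the "everything stays local" case.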

We do not store or review your screens or requests. We make the connection between your Mac and your agents.

Private Beta

Ready to give your agents eyes and ears?

ClawSight is currently in private beta for Mac users who run OpenClaw or Hermes and use Telegram.

Start 7-Day Standard Trial

Not Sure Yet?

Pick the layer that matches how hands-on you want to be.

Standard

$29/mo

ClawSight Capture only.

Premium

$99/mo

Capture plus training community access.

Catalyst

$20,000/yr

Done-with-you agent design, buildout, and deployment.

For do-it-yourself training, choose Premium. For done-with-you training, choose Catalyst.