Your agent is blind.

Sense is an MCP server that lets AI agents see your desktop. No more copying error messages, describing layouts, or screenshotting UI bugs. Just say "take a look" — and it does.

You're the bottleneck.

Every time your AI agent needs visual context, you stop what you're doing. You screenshot. You crop. You paste. You describe what you're looking at in words that are never quite right.

"The button is slightly off. No, more to the left. The padding is wrong — not that padding, the inner one."

Words are lossy. Screenshots are manual. And every token you spend describing what's on screen is a token not spent solving the problem.

Glance. Read. Zoom.

Sense doesn't flood your context window with high-res screenshots. Your agent decides how closely it needs to look.

Glance

A low-resolution capture to get the lay of the land. Cheap. Fast. Usually enough.

Read

OCR extracts visible text without sending an image at all. Error messages, console output, UI labels — as raw text. Tiny payload.

Zoom

When detail matters, the agent crops into a specific region at higher resolution. Pixel-level precision, only where it's needed.
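
One way to picture the Glance and Zoom steps is as plain coordinate arithmetic: downscale the whole screen for the glance, then map a box picked on that small image back to full-resolution pixels for the zoom. The sketch below is an illustration only, not Sense's implementation. `Region`, `glance_size`, `zoom_region`, and the 512-pixel cap are all assumed names and values.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangle in pixels (hypothetical type, not part of Sense)."""
    x: int
    y: int
    width: int
    height: int


def glance_size(screen_w: int, screen_h: int, max_side: int = 512) -> tuple[int, int]:
    """Return downscaled dimensions whose longer side is at most max_side.

    The aspect ratio is preserved and small screens are never upscaled.
    The 512-pixel default is an assumption chosen for illustration.
    """
    scale = min(1.0, max_side / max(screen_w, screen_h))
    return max(1, round(screen_w * scale)), max(1, round(screen_h * scale))


def zoom_region(box: Region, glance_w: int, glance_h: int,
                screen_w: int, screen_h: int) -> Region:
    """Map a box chosen on the low-res glance back to full-resolution pixels."""
    sx = screen_w / glance_w
    sy = screen_h / glance_h
    # Round outward so the crop never cuts off part of the selected area,
    # then clamp to the screen bounds.
    left = max(0, math.floor(box.x * sx))
    top = max(0, math.floor(box.y * sy))
    right = min(screen_w, math.ceil((box.x + box.width) * sx))
    bottom = min(screen_h, math.ceil((box.y + box.height) * sy))
    return Region(left, top, right - left, bottom - top)
```

In this sketch, a 2560x1440 screen is glanced at 512x288, and a 40x20 box on the glance becomes a 200x100 crop at full resolution, so only that small region is captured in detail.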

What your agent can do

See

Screen and window capture — full frame or any region. Works even on occluded windows.

Read

OCR text extraction from any part of the screen. Get content without the image cost.

Click

Click at precise coordinates. Screen-absolute or window-relative.
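
The difference between the two coordinate modes comes down to an offset: a window-relative point is measured from the window's top-left corner, and adding the window's position on screen gives the screen-absolute point. A minimal sketch of that conversion follows, assuming a hypothetical `WindowRect` type and `to_screen` helper that are not part of Sense's documented interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowRect:
    """Position and size of a window on screen, in pixels (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int


def to_screen(rect: WindowRect, rel_x: int, rel_y: int) -> tuple[int, int]:
    """Convert a window-relative point to screen-absolute coordinates.

    Raises ValueError if the point lies outside the window, which guards
    against clicking on whatever happens to sit next to it.
    """
    if not (0 <= rel_x < rect.width and 0 <= rel_y < rect.height):
        raise ValueError(f"({rel_x}, {rel_y}) is outside the window")
    return rect.left + rel_x, rect.top + rel_y
```

Window-relative coordinates stay valid when the window moves, which is why they tend to be the safer choice when an agent clicks inside a specific app.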

Type

Send text and key combinations. Ctrl+C, Enter, Tab — anything.

Scroll

Scroll within any window at any position.

Manage

List open windows, bring them to the foreground, target by process or title.
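
Targeting by process or title can be thought of as filtering the list of open windows. The sketch below shows one plausible matching rule: a case-insensitive substring match on the title and a case-insensitive exact match on the process name. `WindowInfo` and `find_windows` are hypothetical names, and the matching rules are assumptions rather than Sense's actual behavior.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class WindowInfo:
    """One entry in a window listing (hypothetical shape)."""
    handle: int
    title: str
    process: str


def find_windows(windows: Iterable[WindowInfo],
                 title: Optional[str] = None,
                 process: Optional[str] = None) -> list[WindowInfo]:
    """Return the windows that match every filter that was supplied.

    title:   case-insensitive substring of the window title
    process: case-insensitive exact process name
    With no filters, every window is returned.
    """
    matches = []
    for window in windows:
        if title is not None and title.lower() not in window.title.lower():
            continue
        if process is not None and process.lower() != window.process.lower():
            continue
        matches.append(window)
    return matches
```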

Privacy first.

Everything runs locally on your machine. Screen captures, OCR, input — all processed on-device. Nothing is sent to a server. Your screen data never leaves your computer.

  • No server-side processing
  • Screen data never leaves your device
  • Captures stay between you and your agent

How people use Sense

Fix what you can see

Say there's a visual bug in your app. Instead of writing "the card component has a 2px gap on the right side when the viewport is below 768px", you say "look at the app window, the cards look off." The agent sees it, understands it, fixes it.

Implement from Figma

Your designer hands you a Figma file. Point the agent at the Figma window. It reads the specs, sees the layout, and writes code that matches — without you translating design into words.

Verify its own work

The agent makes a change. Instead of asking you "does it look right?", it checks itself: it captures the window, compares the result to the intent, and iterates. You stay in flow.
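
The capture, compare, iterate cycle can be sketched as a bounded loop. Everything here is hypothetical: `verify_loop` and its three callables stand in for whatever the agent actually does to edit code, capture a window, and judge the result, and the cap of three rounds is an arbitrary choice.

```python
from typing import Callable, TypeVar

Shot = TypeVar("Shot")


def verify_loop(apply_change: Callable[[int], None],
                capture: Callable[[], Shot],
                looks_right: Callable[[Shot], bool],
                max_rounds: int = 3) -> bool:
    """Apply a change, capture the result, and retry until it looks right.

    Returns True as soon as a capture passes the check, or False once
    max_rounds attempts have been used up.
    """
    for attempt in range(1, max_rounds + 1):
        apply_change(attempt)   # e.g. edit the stylesheet
        shot = capture()        # e.g. a capture of the app window
        if looks_right(shot):   # e.g. compare against the stated intent
            return True
    return False
```

Capping the number of rounds keeps a stuck agent from burning captures indefinitely; on failure it can hand the problem back to you.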

Pricing

Free

$0
  • 30 captures per day
  • 3 API keys
  • All tools included

Pro

$5/month
  • Unlimited captures
  • 10 API keys
  • All tools included

Better together

Sense gives your agents eyes. Superlite lets you run them in parallel.

See what your agent's been missing.

Get your API key and start in under a minute.
