A triage agent that runs before you wake: stars what matters, labels and archives the noise, unsubscribes from bulk at the protocol level — and reports in one daily briefing you answer in one line. Calibrated on 20 years of your own starred mail, improving with every reply you send it.
Before: the inbox is an anxiety surface. Hundreds of unread messages, the important ones drowning in newsletters and recap mails, a years-deep backlog you'll "sort out someday". Every glance costs focus; nothing has been decided.
After: the inbox is a processed queue. What matters is starred and waiting, noise is labeled and archived but still searchable, bulk is unsubscribed, and one morning briefing surfaces the few cases that genuinely need a human call.
Each new message is evaluated top to bottom — first match wins. Sender-specific overrides sit above domain rules, so one noisy address can't ride a trusted domain into the star queue.
| Action | What it means | What triggers it |
|---|---|---|
| Label + archive | Useful but noisy — labeled, out of the inbox, searchable forever | Known recap and notice senders (e.g. meeting-recorder summaries → label fathom); per-sender overrides — evaluated before star rules |
| Star + keep | The human queue — what you actually open | Whitelisted domains (company, tax/legal, property, key clients), individual trusted senders, money/legal subject keywords (urgent, contract, invoice, term sheet…), replies to threads you started |
| Leave alone | Looks like bulk, isn't — never filtered, never auto-starred | Banking and payment alerts, vendor billing, calendar mail, identity/security senders — and any subject carrying a 2FA or verification code |
| Unsubscribe | Bulk with a clean exit | List-Unsubscribe header present + bulk-looking sender (noreply/news/marketing localparts, ESP trace headers) |
| Filter future | Bulk with no exit — a filter archives the sender's future mail | Bulk pattern, no usable unsubscribe header, sender not on the careful list |
| Default | The resting state — differs per account | Work account: leave it in the inbox. Personal account: archive it (cleanup mode) |
Same ladder, different defaults. The work account errs toward visible — anything unmatched stays put. The personal account runs in 20-year cleanup mode and errs toward empty — anything unmatched is archived, never deleted, always searchable.
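The ladder can be sketched as a single first-match-wins function. This is a minimal illustration, not the agent's actual code: the `Message` and `Rules` shapes, the action strings, and the `triage` name are all assumptions, the rung order simply follows the table, and the "replies to threads you started" trigger is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str                         # full address, e.g. "news@client.example"
    subject: str
    looks_bulk: bool = False            # assumed precomputed: noreply/news localpart, ESP headers
    has_list_unsubscribe: bool = False  # assumed precomputed from the raw headers


@dataclass
class Rules:
    # Hypothetical containers for the approved ruleset.
    archive_senders: dict = field(default_factory=dict)   # address -> label
    star_senders: set = field(default_factory=set)
    star_domains: set = field(default_factory=set)
    star_keywords: tuple = ()
    leave_alone_senders: set = field(default_factory=set)
    careful_senders: set = field(default_factory=set)


def triage(msg, rules, account):
    """Walk the ladder top to bottom and return the first matching action."""
    sender = msg.sender.lower()
    domain = sender.rsplit("@", 1)[-1]
    subject = msg.subject.lower()

    # 1. Per-sender label-and-archive overrides sit above every star rule.
    if sender in rules.archive_senders:
        return "label:" + rules.archive_senders[sender] + "+archive"
    # 2. Star: trusted sender, whitelisted domain, or money/legal keyword.
    if (sender in rules.star_senders
            or domain in rules.star_domains
            or any(k in subject for k in rules.star_keywords)):
        return "star"
    # 3. Leave alone: banking, billing, calendar, identity senders.
    if sender in rules.leave_alone_senders:
        return "leave"
    # 4. Unsubscribe: bulk with a clean exit.
    if msg.looks_bulk and msg.has_list_unsubscribe:
        return "unsubscribe"
    # 5. Filter future: bulk with no exit, unless the sender is on the careful list.
    if msg.looks_bulk:
        return "pending" if sender in rules.careful_senders else "filter"
    # 6. Default differs per account: work stays visible, personal is archived.
    return "keep" if account == "work" else "archive"
```

Because the override check runs first, a newsletter address on a whitelisted domain is labeled away instead of starred, while every other sender on that domain still lands in the star queue.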
Subject keywords catch the long tail — payment, legal, deadline, cancellation terms in both your languages. But the heavy lifting is sender-level: who you reply to beats what words they used.
Before the agent touches a single message, a read-only calibration pass mines what you already told your inbox over two decades — every star is a labeled training example you didn't know you were creating.
Pull all starred mail (paginated year by year when the archive is deep), with full headers: sender, domain, subject, date, thread position. Exclude stars on your own sent mail — those are follow-up bookmarks, not importance signals, and they poison the rules.
Sample thousands of recent unstarred messages as the denominator. Without it, "invoice" looks important just because it's frequent. With it, every keyword is measured as lift: how much more often it appears in starred vs ordinary mail.
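Lift is just a ratio of two rates. A small sketch, assuming subjects arrive as plain strings and that a simple case-insensitive substring match is good enough; the `keyword_lift` name and the handling of a zero baseline are illustrative choices, not taken from the agent.

```python
def keyword_lift(keyword, starred_subjects, baseline_subjects):
    """How much more often a keyword shows up in starred mail than in ordinary mail."""
    if not starred_subjects or not baseline_subjects:
        raise ValueError("both samples must be non-empty")
    kw = keyword.lower()
    starred_rate = sum(kw in s.lower() for s in starred_subjects) / len(starred_subjects)
    baseline_rate = sum(kw in s.lower() for s in baseline_subjects) / len(baseline_subjects)
    if baseline_rate == 0:
        # Assumption: a keyword never seen in the baseline has unbounded lift
        # if it appears in starred mail at all, and zero lift otherwise.
        return float("inf") if starred_rate > 0 else 0.0
    return starred_rate / baseline_rate
```

A keyword that appears in half of the starred subjects and a quarter of the baseline has a lift of 2: frequent, but not yet distinctive.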
Domains starred 5+ times → always-star candidates. Individual senders starred 3+ times (not already covered by a domain) → whitelist. Subject keywords in 10+ starred mails and at ≥3× lift → keyword rules. Replies to threads you started: ≥70% star rate → automatic rule, <30% → dropped, in between → ask-every-time. Domains seen 20+ times with zero stars → archive candidates, surfaced but never auto-applied.
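The thresholds above translate directly into a few small predicates. The function names and return strings here are hypothetical; only the numbers come from the text. Treating a reply thread with no history as "ask" is an added assumption.

```python
def domain_candidate(star_count, seen_count):
    """Classify a domain from its star count and total appearances."""
    if star_count >= 5:
        return "always-star"
    if seen_count >= 20 and star_count == 0:
        return "archive-candidate"   # surfaced in the report, never auto-applied
    return None


def sender_candidate(star_count, covered_by_domain):
    """Individual senders qualify at 3+ stars unless a domain rule already covers them."""
    return star_count >= 3 and not covered_by_domain


def keyword_qualifies(starred_count, lift):
    """Keywords need both volume (10+ starred mails) and at least 3x lift."""
    return starred_count >= 10 and lift >= 3.0


def reply_thread_rule(starred, total):
    """Decide how replies to threads you started are handled, from their star rate."""
    if total == 0:
        return "ask"                 # assumption: no history means ask every time
    rate = starred / total
    if rate >= 0.70:
        return "auto-star"
    if rate < 0.30:
        return "drop"
    return "ask"
```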
On top: 90 days of sent-mail analysis — whitelist the senders you actually answer, not just the ones you starred years ago.
Calibration writes a report and sends a briefing: sample sizes, top domains, top keywords, suggested rules. Nothing acts until you reply "approved" — the daily run literally refuses to start without the approval file on disk. Hand-written rules stay a floor; calibration adds, never overrides.
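The approval gate and the "floor" rule are both a few lines each. A minimal sketch under assumptions: the approval file's path and format are not specified in the text, so the caller passes any path, and rules are modeled as a plain sender-to-action dict.

```python
from pathlib import Path


class NotApproved(RuntimeError):
    """Raised when the daily run starts without a signed-off ruleset."""


def require_approval(approval_file):
    """Refuse to continue unless the approval file exists on disk."""
    if not Path(approval_file).is_file():
        raise NotApproved(f"no approval file at {approval_file}; reply 'approved' first")


def merge_rules(hand_written, calibrated):
    """Calibration adds rules; hand-written entries win on any conflict."""
    merged = dict(calibrated)
    merged.update(hand_written)
    return merged
```

Calling `require_approval` as the first statement of the daily job means an unapproved ruleset fails loudly before any message is read.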
Done when: the ruleset reflects your behavior, and you've signed off on it
A scheduled job processes the last 24 hours of inbox mail (never spam, never trash, never anything already starred). Each message walks the ladder; labels, stars and archives are applied. On the personal account, a historical-cleanup batch of the oldest unread runs alongside.
A single Slack DM: counts per account, the handful of subjects it starred, pending unsubscribes tagged U1, U2…, and borderline cases tagged B1, B2… — each with a real opinion, not hedging. An empty day says so in one line.
Replies are commands. Silence is fine: undecided cases simply resurface tomorrow. Replying to actual emails stays yours — the agent files and flags, it never writes to your contacts on your behalf.
Every star/skip/keep decision is appended to a learning log. Monthly recalibration re-mines everything, weighting your explicit replies 3× heavier than passive star history. A borderline rule collapses into a firm keep-or-drop once 20+ decisions land with a ≥70% majority. Teaching the system is a by-product of reading one DM.
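The collapse rule for borderline cases can be sketched as below. Assumptions: decisions are stored as the strings "star" and "skip", the 20+ threshold counts raw decisions, and the 3× weighting of explicit replies is left out because the text does not say how it interacts with that count.

```python
def resolve_borderline(decisions, min_decisions=20, majority_pct=70):
    """Return "keep" or "drop" once the evidence is decisive, else None."""
    total = len(decisions)
    if total < min_decisions:
        return None                      # not enough decisions yet
    stars = sum(1 for d in decisions if d == "star")
    skips = total - stars
    verdict, top = ("keep", stars) if stars >= skips else ("drop", skips)
    # Integer arithmetic avoids float rounding right at the 70% boundary.
    return verdict if top * 100 >= majority_pct * total else None
```

Until the function returns a verdict, the case keeps showing up in the briefing as a borderline item.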
Done when: the morning briefing takes under a minute, and most days you reply nothing
"unsub U1 U3": execute those unsubscribes. "keep U2": never touch that sender again. "star B1, skip B2": teach the borderline rules. That's the whole interface: a one-line Slack reply, or silence.
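A reply grammar this small needs very little parsing. The sketch below assumes commands are a verb followed by one or more tags, separated by spaces or commas; the `parse_reply` name and the choice to reject unknown tokens outright are illustrative, not the agent's documented behavior.

```python
import re

VERBS = {"unsub", "keep", "star", "skip"}
TAG = re.compile(r"^[UB]\d+$")          # U1, U2... for unsubscribes; B1, B2... for borderline


def parse_reply(text):
    """Turn a one-line reply into (verb, tag) pairs, in order."""
    commands = []
    verb = None
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue                     # an empty reply is silence, not an error
        if token.lower() in VERBS:
            verb = token.lower()
        elif verb is not None and TAG.match(token.upper()):
            commands.append((verb, token.upper()))
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return commands
```

Rejecting anything unrecognized is the conservative choice here: a typo should produce a question back, never a guessed unsubscribe.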
The historical backlog melts on a leash: warm-up at 200 oldest-unread per day → only after 5 days under a 1% error rate does it shift to full at 500/day → complete when the backlog drains. Throughput is earned by a clean error record, not assumed.
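The ramp can be expressed as one function of the recent error record. A sketch under two assumptions the text does not settle: the five clean days must be the five most recent, and a bad day drops the batch back to warm-up size.

```python
def daily_batch_size(recent_error_rates, warmup=200, full=500,
                     clean_days=5, max_error=0.01):
    """Pick today's backlog batch size from the per-day error rates so far."""
    recent = recent_error_rates[-clean_days:]
    earned = len(recent) == clean_days and all(rate < max_error for rate in recent)
    return full if earned else warmup
```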
Every action lands in a daily file: action, sender, subject, reason — plus stats and borderline reasoning. Not bureaucracy: it's what makes the system debuggable. Two of the three rule patches below came straight out of reading one day's log.
Unsubscribes use the List-Unsubscribe mail header exclusively: one-click POST (RFC 8058) preferred, mailto: fallback. Links inside email bodies are never clicked — that's where the phishing lives. The one-word protocol unsubscribe is the only mail the agent ever sends.
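Choosing between the two exits is a header-only decision. The sketch below uses Python's standard `email` parser on a raw message and only plans the action; actually sending the POST or the mail is left out. The `unsubscribe_plan` name and its return shape are assumptions. Under RFC 8058, one-click applies when an HTTPS URI in `List-Unsubscribe` is accompanied by `List-Unsubscribe-Post: List-Unsubscribe=One-Click`.

```python
import re
from email import message_from_string


def unsubscribe_plan(raw_message):
    """Return ("one-click-post", url), ("mailto", uri), or None. Never reads the body."""
    msg = message_from_string(raw_message)
    header = msg.get("List-Unsubscribe", "")
    targets = re.findall(r"<([^>]+)>", header)      # URIs are wrapped in angle brackets
    https = [t for t in targets if t.lower().startswith("https://")]
    mailto = [t for t in targets if t.lower().startswith("mailto:")]
    post = msg.get("List-Unsubscribe-Post", "").strip().lower()

    if https and post == "list-unsubscribe=one-click":
        return ("one-click-post", https[0])         # preferred: RFC 8058 one-click
    if mailto:
        return ("mailto", mailto[0])                # fallback: protocol mail
    return None                                     # no clean exit: candidate for a filter
```

An HTTPS link without the one-click header is deliberately not followed in this sketch, on the assumption that it would need a browser-style visit rather than a protocol-level request.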
Bulk senders without a clean exit get a mail filter that auto-archives their future sends. Created idempotently — existing filters are checked first, so three mails from the same sender in one run don't stack three duplicate filters.
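Idempotent creation comes down to checking a set before each write and updating it after. In this sketch `existing_filters` is assumed to be a list of dicts with a `from` key and `create_filter` is a hypothetical callable standing in for whatever mail API call creates the filter.

```python
def ensure_archive_filters(senders, existing_filters, create_filter):
    """Create at most one auto-archive filter per sender; return the ones created."""
    known = {f["from"].lower() for f in existing_filters}
    created = []
    for sender in senders:
        key = sender.lower()
        if key in known:
            continue                 # already filtered, before or during this run
        create_filter(key)
        known.add(key)               # remember it so repeats in this run are skipped
        created.append(key)
    return created
```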
Some platform addresses carry both bulk digests and real humans — community platforms, social networks, code-hosting notifications all send invites and mentions from the same noreply address that sends spam-grade digests. These live on a careful list: never auto-filtered, always surfaced as a pending decision.
For 7 days after install, every unsubscribe surfaces for explicit approval instead of executing, which protects the three newsletters you actually love from an over-eager bulk pattern. From day 8, auto-execute begins for anything you haven't keep-listed.
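The supervised window is a date comparison plus the keep-list. A sketch assuming the install day counts as day 1; the function name and return strings are illustrative.

```python
from datetime import date


def unsubscribe_mode(install_date, today, sender, keep_list, supervised_days=7):
    """Return "never", "ask", or "auto" for a pending unsubscribe."""
    if sender.lower() in keep_list:
        return "never"                                   # keep-listed senders are untouchable
    day_number = (today - install_date).days + 1         # assumption: install day is day 1
    return "ask" if day_number <= supervised_days else "auto"
```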
Done when: bulk mail unsubscribes itself, and nothing you wanted ever disappeared
The agent acts on your mail every day, so the safety rules are absolute, and the gotchas below are real patches, each traceable to a logged false positive in the first days of operation.
A whitelisted client domain carried both a key contact and a newsletter sender — the newsletter got starred for days. Fix: drop the domain rule, whitelist the human, label-and-archive the newsletter address. Sender rules now evaluate before domain rules.
The tax keyword faithfully starred a platform vendor's "tax and price updates" notices. Fix: a sender-level label-and-archive override that beats any keyword. Same lesson from a membership org: real correspondence stars, the same org's noreply digests get labeled away.
The Gmail CLI's metadata format returned zero headers on this version — silently. Unsubscribe detection needs headers, so everything fetches raw. Cost a debugging session; now one line in the runbook.
Identity senders and anything carrying a verification or 2FA code look exactly like bulk — noreply sender, templated subject. They're hard-coded leave-alone: a triage system that archives your login codes gets uninstalled the same day.
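A subject-line guard for login codes might look like the sketch below. The phrase list is an illustrative assumption, not the agent's actual pattern, and a real deployment would need the equivalents in both of your languages.

```python
import re

# Hypothetical phrase list: extend it with whatever your providers actually send.
CODE_WORDS = re.compile(
    r"\b(2fa|two[- ]factor|verification|one[- ]time|otp|passcode|"
    r"security code|login code|sign[- ]in code)\b",
    re.IGNORECASE,
)


def is_verification_mail(subject):
    """True when the subject looks like it carries a 2FA or verification code."""
    return bool(CODE_WORDS.search(subject))
```

Any match short-circuits to leave-alone, whatever the sender looks like.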
Rate-limited? Process what landed, resume tomorrow. One account's auth expires? Surface it, skip that account, still run the other — a real run did exactly this. Unhandled exception? Error trace to disk plus a DM. The run never silently skips, and it doesn't pause for vacations.
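Per-account isolation is the core of that behavior: one account failing must not stop the other. A minimal sketch in which `process_account`, `notify`, and `write_trace` are hypothetical callables supplied by the caller and `AuthExpired` is an illustrative exception type; rate-limit handling inside a single account is not modeled.

```python
import traceback


class AuthExpired(Exception):
    """Illustrative: raised when an account's credentials no longer work."""


def run_all(accounts, process_account, notify, write_trace):
    """Process every account independently; surface every failure, skip none silently."""
    results = {}
    for account in accounts:
        try:
            results[account] = process_account(account)
        except AuthExpired:
            results[account] = "skipped: auth expired"
            notify(f"{account}: auth expired, account skipped")
        except Exception:
            results[account] = "failed"
            write_trace(account, traceback.format_exc())   # error trace to disk
            notify(f"{account}: unhandled error, trace saved")
    return results
```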
A read-only harness mirroring every rule ran against the live inbox before the skill was allowed to act. Then: ship narrow, read the audit log, patch. Two dated rule versions in the first week — each one a logged false positive turned into a fix.
Copy the bootstrap prompt
The button below puts it on your clipboard.
Paste it into Claude Code
With Gmail access (CLI or MCP) and a Slack DM route set up.
Approve the calibration
It mines your stars read-only first; nothing touches a message until you reply "approved".
Calibration is read-only — the system earns autonomy one approval at a time.