A daily drafting pipeline that mines your real day — dictations, pair-programming prompts, meetings, build logs — drafts the one strongest post in your calibrated voice, and refuses to ship until a separate judge agent scores it at the viral bar.
The engine runs every weekday at 17:00 with zero input required. It mines the last 7 days of your actual thinking, picks the single strongest moment, drafts it against a measured voice fingerprint, fights it through an adversarial QA loop — and delivers one Slack message you answer with one word.
Content depends on the founder remembering to post. The best ideas — said out loud in meetings, dictated into tools, typed into a coding agent — die where they happened. Drafts that do get written sound like a generic ghostwriter, and a multi-draft experiment just overwhelmed the review channel until it got ignored.
Every weekday: one draft, mined from real work, in a voice calibrated on the actual corpus, scored by a separate judge agent against posts that demonstrably went viral. Below the bar, it gets reworked or honestly skipped. Plus a Sunday newsletter, a nightly performance sweep, and a monthly voice recalibration.
This wasn't designed on a whiteboard. Each stage exists because the previous one hit a wall — the timeline below is the build order to copy.
42 LinkedIn posts shipped manually over 7 months, ~6 per month, median 1,503 characters. No system yet — but those 42 posts became the calibration set everything else is built on. You can't calibrate a voice you haven't published.
LinkedIn's data export, parsed into a structured JSON file plus a human-readable markdown file: 42 posts with dates and lengths, plus a 30,000-line DM history flagged as the unguarded-voice sample. First wall discovered immediately: the Basic export contains only posts with uploaded media, so text-only posts are invisible (the Complete archive is the fix).
A voice-fingerprint file derived from the 42 posts, cross-checked against 60 YouTube video transcripts and 184 top-decile inspiration pieces: sentence length, numbers-per-post, paragraph rhythm, a 5-pattern hook taxonomy, vocabulary tiers, CTA patterns — every claim with a count behind it (full method below).
A weekly active-theme.md file that biases the engine toward a 5-day narrative arc (built for a dogfooding sprint where every business action doubled as content capture), and 11 entertainment patterns mined from the top-decile corpus. Every draft must carry at least one of them (for example surprise, reversal, self-deprecation, a punchy contrarian line, or vivid detail) or it gets rewritten.
Three corrections at once. The voice-debrief trigger was retired (it depended on daily input that never came, and its retry crons spammed the channel) — the engine now mines the day unconditionally. Dictation history and Claude Code prompts became the highest-priority sources. Output dropped to one draft per day on a rolling 7-day window.
And the QA gate arrived: a genuinely separate judge agent with its own context, scoring every draft on a 7-dimension rubric against viral benchmark anchors. It was proven the same week in a logged demo: a competent draft scored 5.1 (FAIL), and three surgical fixes later it passed at 8.6.
A full re-sync of the ~44-creator corpus via the inspiration engine (cookbook 04), then a teardown of the three benchmark LinkedIn creators (Welsh, Martell, Latka) and the top YouTube hooks into a frameworks file. Wired into drafting and QA as a hard rule: a draft weaker than the median top post for its pillar gets rewritten — the median is the floor, not the ceiling.
The voice file was re-derived with a hierarchy flip: all-day dictation history (70,844 tokens) is ground truth for the thinking voice; published posts are the edit constraint, not the source. Words the first pass had dismissed as wishful turned out to be frequent in real dictation and went back in. A monthly recalibration cron re-derives the whole file, and a trusted editor's thread feedback now auto-revises drafts, with no approval gate, between the draft landing and the final ship call.
The owner's entire daily workload: read one draft at 17:00, then reply "ship it", "edit: <changes>" or "kill: <reason>" in the thread. Everything below happens unattended.
| Step | What the engine does |
|---|---|
| 1 · Refresh corpus | Re-pulls the Airtable inspiration tables, applies the quality floor (videos: ≥1,000 views and ≥50 likes; posts: ≥100 likes), dedupes, keeps the top 10% per creator — so new creators and fresh winners flow in automatically. |
| 1.5 · Check theme | If an active-theme file exists and hasn't expired, it overrides source weighting and sets the week's pillar and format. |
| 2 · Pull the day | Rolling 7-day window across sources, priority-ranked: dictation history and Claude Code prompts highest, then build-log, meetings, intentions, content-ideas, call transcripts, briefings, inbox log. Already-used moments are deduped via source hashes stored with every shipped post. |
| 3 · Score & pick ONE | Every candidate moment scored 1–10 on four axes: specificity, pillar fit (Transition > System > Contrarian), voice authenticity, pipeline potential. Only the single highest-scoring moment gets drafted. The runner-up is held, never posted. |
| 4 · Draft | Against the voice file: hook from the taxonomy, pillar tagged, numbers-dense, one CTA. Framework match from the inspiration file (Tension→Reframe, System + before/after, anti-trends proof). |
| 5 · Style check | 10-point checklist with auto-fail rewrites: any em dash, unknown hook pattern, >12-word average sentences, walls of text, zero numbers, corporate filler, hedges. |
| 6 · Safety check | Forbidden-topics filter (details in the quality bar section). Reject the angle, not just the sentence. |
| 6.5 · QA gate | The separate judge agent scores it; the writer reworks on the judge's top-3 fixes; loop until PASS or the stall ladder is exhausted. |
| 7 · Deliver | One Slack message to the content channel: pillar, hook pattern, CTA, source attribution, the full ready-to-paste draft, and the reply protocol. |
| 8 · Handle the reply | ship it → saved with frontmatter, 7-day performance check queued. edit: → applied, reposted in-thread, diff logged. kill: → archived + appended to the negative-training file. No reply → one nudge at 22:00, expire after 24h, no further nags. |
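The reply protocol in step 8 could be parsed along these lines. This is a minimal sketch, not the engine's actual handler: `Reply` and `parse_reply` are hypothetical names, and the case-insensitive matching and whitespace trimming are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    action: str                    # "ship", "edit", "kill", or "unknown"
    payload: Optional[str] = None  # requested changes or kill reason


def parse_reply(text: str) -> Reply:
    """Map a thread reply onto the three documented commands."""
    cleaned = text.strip()
    lowered = cleaned.lower()
    if lowered == "ship it":
        return Reply("ship")
    for prefix, action in (("edit:", "edit"), ("kill:", "kill")):
        if lowered.startswith(prefix):
            # Keep the original casing of whatever follows the prefix.
            return Reply(action, cleaned[len(prefix):].strip())
    # Anything else is left for a human to interpret (assumption).
    return Reply("unknown", cleaned)
```

Returning an explicit "unknown" action, rather than guessing, keeps an ambiguous reply from shipping or killing a draft by accident.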
Mon–Fri 17:00 — the daily LinkedIn pipeline. Sun 15:00 — newsletter draft (600–1,200 words synthesizing the week's shipped and killed drafts). Daily 02:00 — performance sweep on posts due for their 7-day check. 1st of month 03:00 — full voice recalibration. OOO on the calendar = the daily run skips itself, says so once, and stays quiet.
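The schedule above maps onto standard five-field cron expressions. A sketch, assuming a cron-compatible scheduler running in the owner's local timezone; the job names and the `should_run_daily` helper are hypothetical.

```python
# Standard 5-field cron syntax: minute hour day-of-month month day-of-week.
# Job names are hypothetical; the times are taken from the schedule above.
SCHEDULE = {
    "daily_linkedin_pipeline": "0 17 * * 1-5",  # Mon-Fri 17:00
    "sunday_newsletter": "0 15 * * 0",          # Sun 15:00
    "performance_sweep": "0 2 * * *",           # daily 02:00
    "voice_recalibration": "0 3 1 * *",         # 1st of month 03:00
}


def should_run_daily(owner_is_ooo: bool) -> bool:
    """The daily job skips itself when the calendar shows the owner as OOO."""
    return not owner_is_ooo
```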
Airtable pull fails → draft from the last cached corpus and say so. Vault unreachable → narrower source pool, noted in the draft footer. Nothing scores high enough → one honest line ("Nothing strong enough today, skipping"), no forced draft. Slack send fails → fall back to messaging the owner directly with the draft body.
The voice file is not a style wishlist — it's a measurement. The full guide stays private; the method is reproducible on anyone's corpus:
Published posts (the LinkedIn export), spoken transcripts (60 video SRTs), unguarded messaging history, and — the richest layer — all-day dictation history (1,360 dictations, 70,844 tokens captured across every app). Each surface plays a different role.
Extract hard numbers: median post length and percentiles, words per sentence, paragraphs per post, numbers-per-post, punctuation habits (zero em dashes across all 42 posts — so the rule is enforced, not invented). Then classify every hook into a taxonomy with frequencies, and tier the vocabulary: core words with counts, promoted words, wishful words, and thinking-voice fillers that must be stripped before publishing.
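A few of these measurements can be sketched with the standard library alone. This is an illustrative approximation, not the private method: the function name, the sentence-splitting regex, and the definition of "a number" (any run starting with a digit) are all assumptions.

```python
import re
import statistics

EM_DASH = "\u2014"


def fingerprint(posts: list[str]) -> dict:
    """Compute a handful of the voice metrics named above."""
    sentence_lengths = []
    for post in posts:
        # Naive split: sentence punctuation followed by whitespace, or newlines.
        sentences = [s for s in re.split(r"[.!?]+\s+|\n+", post) if s.strip()]
        sentence_lengths.extend(len(s.split()) for s in sentences)
    return {
        "median_chars": statistics.median(len(p) for p in posts),
        "words_per_sentence": statistics.mean(sentence_lengths),
        "median_paragraphs": statistics.median(
            len(re.split(r"\n\s*\n", p.strip())) for p in posts
        ),
        "median_numbers": statistics.median(
            len(re.findall(r"\d[\d,.]*", p)) for p in posts
        ),
        "em_dashes": sum(p.count(EM_DASH) for p in posts),
    }
```

Each value is a count or a statistic over the corpus, which is what lets a rule like the em-dash ban be stated as a measured fact rather than a preference.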
The key calibration insight. What you dictate all day is how you actually think; what you published is how you edit. The first voice pass dismissed several words as "wishful" — dictation data proved they're frequent in real thinking and they went back into the active vocabulary. Drafts translate the thinking voice into the publishing constraint: strip filler, preserve exact numbers and named systems, keep the energy.
The voice file is read before every draft. Hook from the taxonomy, length from the distribution, at least one exact number, the line-break rhythm — and the 10-point style check auto-fails anything off-fingerprint.
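Three of the auto-fail checks are mechanical enough to sketch. The code below covers only the em-dash, average-sentence-length and zero-numbers rules; hook patterns, filler and hedge detection depend on the private voice file and are omitted. `style_failures` is a hypothetical name.

```python
import re

EM_DASH = "\u2014"
MAX_AVG_SENTENCE_WORDS = 12  # from the style checklist


def style_failures(draft: str) -> list[str]:
    """Return the names of the mechanical checks the draft fails."""
    failures = []
    if EM_DASH in draft:
        failures.append("em dash")
    sentences = [s for s in re.split(r"[.!?]+\s+|\n+", draft) if s.strip()]
    if sentences:
        avg_words = sum(len(s.split()) for s in sentences) / len(sentences)
        if avg_words > MAX_AVG_SENTENCE_WORDS:
            failures.append("average sentence too long")
    if not re.search(r"\d", draft):
        failures.append("zero numbers")
    return failures
```

An empty list means the draft clears these three checks; any entry would trigger a rewrite under the auto-fail rule.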
The 1st-of-month job re-derives the file from the latest export, the latest 60 transcripts, every winning-pattern note, and the negative-training file. Killed drafts append to the negative-training file the moment they die; winning patterns are only written back after a shipped post clears the 7-day engagement threshold (≥30 comments or ≥100 reactions or ≥5 DMs). The voice file learns from what actually traveled, not from what felt good.
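The write-back condition is a simple disjunction gated on the 7-day window. A sketch with hypothetical names (`ShippedPost`, `should_write_back`); the thresholds are the ones stated above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ShippedPost:
    shipped_on: date
    comments: int
    reactions: int
    dms: int


def clears_threshold(post: ShippedPost) -> bool:
    """Any one of the three signals is enough."""
    return post.comments >= 30 or post.reactions >= 100 or post.dms >= 5


def should_write_back(post: ShippedPost, today: date) -> bool:
    """Winning patterns are recorded only after the 7-day check is due."""
    check_due = today >= post.shipped_on + timedelta(days=7)
    return check_due and clears_threshold(post)
```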
The QA gate is two genuinely separate agents — not two roles in one head. The writer drafts and reworks; the judge runs in its own isolated workspace and sees only the draft and the source moment, never the writer's reasoning. That separation kills the self-justification blind spot. Both read the same rubric, hook framework, corpus and voice file, so the critique is legitimate and actionable.
| # | Dimension | Weight | What it measures |
|---|---|---|---|
| 1 | Hook strength | 25% | Does line 1 stop the scroll? Scored against the hook framework's 6 proven patterns. |
| 2 | Specificity | 15% | Concrete numbers, names, dollars, dates. The viral median is 10 numbers per post. |
| 3 | Value payoff | 15% | Can the reader act on it today? Lesson + concrete action. |
| 4 | Voice authenticity | 15% | Sounds like the owner. Zero AI-slop, zero em dashes, zero filler. Any em dash caps this at 4. |
| 5 | Emotional / contrarian charge | 10% | Surprise, reversal, vulnerability or hot take — the reason people comment. |
| 6 | Single-idea clarity | 10% | One idea, nameable in one sentence. Not five. |
| 7 | Structure & scannability | 10% | Short paragraphs, line-break rhythm, never a wall. |
Weighted total ≥ 8.0 · hook ≥ 8 (a weak hook kills the post no matter how good the body) · voice ≥ 7 (off-voice defeats the entire purpose) · no dimension below 6 · zero em dashes (binary auto-fail). Why 8.0: the benchmark posts that actually went viral score 8.5–9.5 on this rubric. Below 8.0 a post is competent but won't travel. The judge calibrates against three anchor posts from the corpus — a two-sentence contrarian punch (6,977 likes), a real-time news open with exact numbers (24,015 likes), a sensory vulnerability open (6,804 likes). The draft doesn't need their like counts; it needs their score range.
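The pass criteria combine a weighted total with per-dimension floors. The sketch below uses the weights from the rubric table; the dimension keys, the `evaluate` function and the rounding to two decimals are assumptions made for illustration.

```python
EM_DASH = "\u2014"

# Weights from the rubric table; the short keys are hypothetical.
WEIGHTS = {
    "hook": 0.25, "specificity": 0.15, "value": 0.15, "voice": 0.15,
    "charge": 0.10, "clarity": 0.10, "structure": 0.10,
}


def evaluate(scores: dict, draft: str) -> tuple[float, list[str]]:
    """Return (weighted total, failure reasons). No reasons means PASS."""
    scores = dict(scores)
    has_em_dash = EM_DASH in draft
    if has_em_dash:
        scores["voice"] = min(scores["voice"], 4)  # em dash caps voice at 4
    total = round(sum(WEIGHTS[d] * scores[d] for d in WEIGHTS), 2)
    reasons = []
    if total < 8.0:
        reasons.append("weighted total below 8.0")
    if scores["hook"] < 8:
        reasons.append("hook below 8")
    if scores["voice"] < 7:
        reasons.append("voice below 7")
    if min(scores.values()) < 6:
        reasons.append("a dimension is below 6")
    if has_em_dash:
        reasons.append("em dash")
    return total, reasons
```

Note that a draft can reach 8.0 overall and still fail: a hook of 7 with every other dimension at 9 totals 8.5 but trips the hook floor.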
"I've been spending a lot on AI lately…" — a competent draft a normal tool would have posted. The judge: hook buries the number (4/10), one number total (5/10), no concrete action (4/10), flat charge (4/10). Three fixes returned, most impactful first — specific enough to act on without guessing.
Fixes applied literally: exact dollar figure in the first four words, the real before/after headcount, the soft CTA replaced with a standalone contrarian close. Six concrete anchors, hook 9/10, voice 9/10. Two passes, no human involved until the draft was already at the viral bar.
Style: em dashes, setup-before-payoff hooks, corporate openers, buried ledes, AI-slop phrases, hedges, walls of text, zero numbers.
Mechanics (from the inspiration frameworks): comment-gating, "like + comment + reshare", DM-bait, stacked P.S. lead magnets, 24-hour scarcity. All are proven to drive engagement for others and rejected as off-register. YouTube: any script not opening with a proven hook pattern in the first sentence, or opening with an "in this video…" preamble.
Drafts are rejected outright — different angle, not a reword — if they contain individual salaries or departure terms, monthly revenue figures, client names (always anonymized to "a client", "an 8-figure operator I coach"), deal financials, or anything about the agent infrastructure's internals. The post describes the outcome, never the implementation. Raw dictation and coding prompts are the richest sources and the leakiest — they get the strictest pass.
Three reworks with no gain doesn't mean stop — it means change the move: (1) re-angle the same moment (3 structurally different framings, scored fresh), (2) re-mine the source for the missing specific (the number that breaks a plateau is usually in the source, never invented), (3) reader-simulation — a third agent role-plays the ICP reading it cold, (4) ask the owner ONE surgical question ("what did the AI catch this week that a human missed? One sentence unlocks it"), (5) only then skip.
Every run logs predicted scores per dimension. The nightly sweep joins them to real 7-day outcomes and classifies: false positive (scored 9.0, flopped), false negative (shipped at 7.5 with a warning, went viral), or calibrated hit. A dimension showing consistent bias across 3+ posts gets re-weighted at the monthly recalibration. And once 10+ of the owner's own posts clear the threshold, his winners replace the inspiration creators as the benchmark anchors.
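The nightly classification can be sketched as a comparison between the predicted verdict and the real outcome. Assumptions: "predicted pass" means a weighted total of at least 8.0, "traveled" means the post cleared the 7-day engagement threshold, and any agreement between the two (including a low score that flopped) counts as a calibrated hit. The function name is hypothetical.

```python
PASS_BAR = 8.0  # the weighted-total bar used by the QA gate


def classify_outcome(predicted_total: float, cleared_threshold: bool) -> str:
    """Join a logged prediction to its 7-day outcome."""
    predicted_pass = predicted_total >= PASS_BAR
    if predicted_pass and not cleared_threshold:
        return "false_positive"   # scored high, flopped
    if not predicted_pass and cleared_threshold:
        return "false_negative"   # shipped with a warning, traveled anyway
    return "calibrated_hit"       # prediction and outcome agree
```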
Exit rule: passed → ship. Converged at 7.0–7.9 → post it flagged with the score and the ceiling, owner decides. Below 7.0 → skip honestly — a skip beats a forgettable post under your name.
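The exit rule reduces to a three-way decision once the loop has converged. A sketch with hypothetical names; it assumes that a draft which fails the gate but still totals 7.0 or more is treated as "flagged" even when the failure comes from a per-dimension floor rather than the total.

```python
def exit_decision(passed: bool, converged_total: float) -> str:
    """Decide what happens to a draft after the QA loop ends."""
    if passed:
        return "ship"
    if converged_total >= 7.0:
        # Delivered with its score and ceiling; the owner makes the call.
        return "post_flagged"
    return "skip"
```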
Wall: v1 waited for a daily voice debrief from the owner. He didn't deliver one — and the hourly retry crons polluted the review channel with status spam.
Fix: mine the day unconditionally. The debrief became optional bonus signal; the retry crons were retired. Never build a pipeline whose trigger is human discipline.
Wall: 2–3 draft messages per day overwhelmed the channel — the owner stopped opening it entirely.
Fix: exactly ONE message per day. Ties break on pipeline potential; the runner-up is held for tomorrow's scoring, never posted. One strong draft beats three never read.
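Picking one moment with the stated tie-break could look like the sketch below. The source names the four axes and the tie-break but not how the axes combine, so summing them is an assumption; `Moment` and `pick_one` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Moment:
    source: str
    specificity: int          # each axis is scored 1-10
    pillar_fit: int
    voice_authenticity: int
    pipeline_potential: int

    @property
    def total(self) -> int:
        # Assumption: unweighted sum of the four axes.
        return (self.specificity + self.pillar_fit
                + self.voice_authenticity + self.pipeline_potential)


def pick_one(candidates: list[Moment]) -> tuple[Optional[Moment], Optional[Moment]]:
    """Return (winner, runner_up); ties break on pipeline potential."""
    ranked = sorted(candidates,
                    key=lambda m: (m.total, m.pipeline_potential),
                    reverse=True)
    winner = ranked[0] if ranked else None
    runner_up = ranked[1] if len(ranked) > 1 else None
    return winner, runner_up
```

Only the winner is drafted; the runner-up is returned so it can be held for the next day's scoring rather than posted.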
Wall: one agent writing and scoring its own draft passes itself — self-justification is structural, not a prompt problem.
Fix: a separate judge agent with isolated context that never sees the writer's reasoning. If the judge is unreachable, the writer self-scores as a fallback — and the delivered post is visibly tagged so the owner knows the independent gate didn't run.
Wall: LinkedIn's Basic data export surfaces only posts with uploaded media. Text-only posts from the same window are simply absent, silently skewing the calibration corpus.
Fix: request the Complete archive (takes ~24h) and treat the Basic set as the working corpus until it lands.
Wall: the corpus' Like+Share ratio formula divides by views — and LinkedIn doesn't expose views on posts, so every text post ranks as NaN.
Fix: two-layer ranking — videos by Like+Share, text posts by raw Likes. And on YouTube, apply a ≥2,000-view floor before taking the top N, or tiny-view noise tops the list.
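A sketch of the two-layer ranking, assuming each corpus item is a dict with hypothetical keys `kind`, `likes`, `shares` and `views`, that all videos come from YouTube, and that the video score is (likes + shares) / views.

```python
YOUTUBE_VIEW_FLOOR = 2000


def rank_corpus(items: list[dict], top_n: int) -> tuple[list[dict], list[dict]]:
    """Rank videos by engagement ratio and text posts by raw likes."""
    # Apply the view floor first so tiny-view noise cannot top the list;
    # it also guarantees the ratio never divides by zero.
    videos = [i for i in items
              if i["kind"] == "video" and i["views"] >= YOUTUBE_VIEW_FLOOR]
    texts = [i for i in items if i["kind"] == "text"]
    videos.sort(key=lambda i: (i["likes"] + i["shares"]) / i["views"],
                reverse=True)
    # Text posts expose no view count, so raw likes is the only usable key.
    texts.sort(key=lambda i: i["likes"], reverse=True)
    return videos[:top_n], texts[:top_n]
```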
Wall: a global top-10% cut would be all mega-accounts — smaller creators' proven patterns vanish.
Fix: top 10% per creator, ceil-rounded so every creator contributes at least one piece. Each teaches what works for their audience, weighted equally.
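The per-creator cut with ceiling rounding can be sketched as follows. The `creator` and `score` keys are hypothetical; integer arithmetic is used for the ceiling so floating-point error cannot inflate the count.

```python
from collections import defaultdict

DECILE = 10  # keep one piece in every ten, rounded up


def top_decile_per_creator(pieces: list[dict]) -> list[dict]:
    """Keep the top 10% of each creator's pieces, at least one per creator."""
    by_creator = defaultdict(list)
    for piece in pieces:
        by_creator[piece["creator"]].append(piece)
    kept = []
    for items in by_creator.values():
        keep = (len(items) + DECILE - 1) // DECILE  # integer ceil(n / 10)
        ranked = sorted(items, key=lambda p: p["score"], reverse=True)
        kept.extend(ranked[:keep])
    return kept
```

With this rounding a creator with 3 pieces still contributes 1, and a creator with 25 contributes 3.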
Wall: all-day dictation and coding-agent prompts carry names, internal numbers, unfinished thoughts, infrastructure details and secrets.
Fix: the hard safety gate runs strictest on these sources, and the standing rule applies: publish the outcome ("I built a system that mines my own thinking across 5 tools"), never the implementation.
Copy the bootstrap prompt. The button below puts it on your clipboard.
Paste it into Claude Code. Have your LinkedIn export requested and your daily sources in reach (notes, transcripts, dictation).
Answer its questions. It calibrates your voice fingerprint first, then wires the daily mine → draft → judge → Slack loop.
Pairs with cookbook 04 — the inspiration engine supplies the benchmark corpus.