
We rebuilt our SaaS proof card with a 5-agent design braintrust. Score went 65 to 96.

Brian Farello · 8 min read


If you're a solo SaaS founder still copy-pasting cancellation reasons into a ChatGPT thread, hiring a designer for landing-page reviews, or shipping UI by gut feel and praying it converts, this post is for you.

I redesigned the above-fold proof card on retentioncheck.com this morning. Not by feel. By running it past a 5-agent design braintrust I built into my Claude Code setup. Five personas. Five rounds. Each scored 0 to 100 and gave 2 to 3 surgical fixes per round. The card lifted from 65/100 on round one to 95.6/100 on round five. A 30-point gain in one morning.

This post is the methodology, the per-round diffs, the convergent feedback patterns, and the raw braintrust output. Copy it, fork it, run it on your own landing page. The whole point of doing this in public is that the process should compound for everyone, not just me.

What is an AI design braintrust?

It's a panel of 5 to 13 AI personas, each grounded in the actual public writing and design philosophy of one real expert (Adam Wathan on craft, April Dunford on positioning, etc.). Each one scores your design 0-100 against their lens and returns 2-3 specific, actionable fixes. You ship the convergent fixes, screenshot the new version, and re-dispatch the panel. Repeat until the average score plateaus.

This is different from asking ChatGPT “is my landing page good.” The structure forces specificity: each persona has a different framework, so the feedback is rarely contradictory and almost always actionable. When all 5 say “kill the orange chips,” you know it’s wallpaper, not ranking. When Adam says “kill the card chrome” three rounds in a row before it finally gets applied, the persistent dissent is itself a signal.
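The score-and-synthesize step is simple enough to sketch. Here is a minimal TypeScript illustration; the `Review` shape and the majority rule for “convergent” are my assumptions, not the actual /braintrust skill code:

```typescript
// Hypothetical shape for one persona's round output.
type Review = { persona: string; score: number; fixes: string[] };

// Average the panel and surface the fixes a majority of personas agree on.
function synthesize(reviews: Review[]): { avg: number; convergent: string[] } {
  const avg = reviews.reduce((sum, r) => sum + r.score, 0) / reviews.length;
  const counts = new Map<string, number>();
  for (const r of reviews)
    for (const fix of r.fixes) counts.set(fix, (counts.get(fix) ?? 0) + 1);
  const convergent = [...counts]
    .filter(([, n]) => n > reviews.length / 2)
    .map(([fix]) => fix);
  return { avg, convergent };
}
```

Ship everything in `convergent`, re-screenshot, re-dispatch; stop when `avg` plateaus.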

The panel

Five experts, one role each:

  • Adam Wathan on design / Tailwind / Refactoring UI craft
  • Pieter Levels on indie SaaS mobile-first trust
  • April Dunford on positioning / tool-vs-journalism framing
  • Marc Lou on above-fold conversion craft
  • Greg Isenberg on distribution leverage and shareability

I picked these five because they cover the five lenses that matter for a landing-page proof card: visual craft, indie founder trust, positioning, conversion, and distribution. The full RetentionCheck braintrust roster has 13 experts; I dispatched the 5 most relevant to design.

Round-by-round score trajectory

| Version | Adam | Pieter | April | Marc | Greg | Avg | Δ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| v1 (PNG -> inline card) | 62 | 71 | 58 | 78 | 58 | 65.4 | baseline |
| v3 (CTA flip + Share button + brand mark) | 81 | 84 | 81 | 86 | 81 | 82.6 | +17.2 |
| v4 (D-as-hero + neutral chips) | 87 | 91 | 89 | 92 | 89 | 89.6 | +7.0 |
| v5 (newspaper border-y + 8xl D) | 92 | 94 | 94 | 94 | 93 | 93.4 | +3.8 |
| v6 (compression + meta-copy kill) | 95 | 96 | 95 | 96 | 96 | 95.6 | +2.2 |

Diminishing returns kicked in at v4. The first three rounds did 24 points of lift. The last two rounds did 6. If I’d run a sixth round I’d expect 1 to 2 more points, which is below the friction cost of running another iteration. That’s the natural stopping signal: when the marginal lift stops paying for itself.
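That stopping rule can be written down directly. A small sketch using the per-round panel averages from this post; the 3-point friction cutoff is an assumption you should tune to your own iteration cost:

```typescript
// Per-round panel averages (v1, v3, v4, v5, v6).
const roundAvgs = [65.4, 82.6, 89.6, 93.4, 95.6];

// Stop iterating once the marginal lift falls below a friction threshold.
// minLift = 3 is an assumed cutoff, not a number from the panel itself.
function shouldStop(avgs: number[], minLift = 3): boolean {
  if (avgs.length < 2) return false;
  return avgs[avgs.length - 1] - avgs[avgs.length - 2] < minLift;
}
```

On this data the rule fires after v6 (+2.2) but not after v5 (+3.8), matching where I actually stopped.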

What the panel actually changed

v3: detach the click sink and add a distribution loop

Marc Lou: “The card is wrapped in a Link to /blog. The visual gravity well bleeds clicks down into the blog instead of up to the paste-box, which is your actual primary CTA. Card should whisper ‘here is the artifact’ while the paste-box shouts ‘make yours now.’”

Fix: detach the wrapping Link. Add two explicit CTAs at the bottom: dark pill “Run yours” that anchors back up to the paste-box, plus a muted “Read the teardown” link for the small minority who want the long-form blog.

Greg Isenberg, in the same round: “Share-on-X is missing entirely. Every visitor who screenshots your card should be one click from posting it. Otherwise the loop dies at the artifact.”

Fix: third CTA, outlined “Share on X” pill with a prefilled X intent URL. Quote-first tweet copy (“Notion is having a mid-life crisis” leads, not the metric). Plus inline “RETENTIONCHECK” wordmark in the card eyebrow so screenshots survive an off-site crop.
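The prefilled intent link is a one-liner against X's web-intent endpoint. A sketch; the tweet copy below is illustrative, not the exact text shipped:

```typescript
// Build a prefilled X (Twitter) share-intent URL. Quote-first copy:
// the quote leads, the metric is the receipt.
function xShareUrl(text: string, url: string): string {
  const params = new URLSearchParams({ text, url });
  return `https://x.com/intent/tweet?${params}`;
}

const href = xShareUrl(
  "Notion is having a mid-life crisis. Churn Health: 44/100 (D).",
  "https://retentioncheck.com",
);
```

`URLSearchParams` handles the encoding, so quotes and slashes in the copy survive the round trip.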

v4: kill the ring gauge and the orange chip wallpaper

Adam Wathan: “The 72px circular grade gauge is fine on the live report page; on a proof slot, a 6xl serif ‘44’ next to the quote is a punch in the face. The ring is shadcn demo, the numeral is verdict.”

Fix: dropped the GradeGauge SVG primitive on this slot. Replaced with a typographic text-7xl sm:text-8xl serif “D” in the brand orange, then text-2xl/3xl tabular “44/100” as the receipt below.

Pieter Levels, same round: “Three identical orange ‘HIGH’ chips on three driver rows is not a ranking, it’s wallpaper. Confidence belongs on the teardown, not the card.” Marc agreed. Adam agreed.

Fix: stripped the chip color. Each driver row now ends with a neutral muted “88% conf., 85% conf., 82% conf.” suffix. The orange now appears in exactly one place: the grade letter. Hierarchy returns.

v5: kill the card chrome, go newspaper

Adam, third round in a row: “The rounded-xl border bg-card overflow-hidden envelope is what’s making it read like a widget. Your typography is already strong enough to stand without a box. Go border-y border-foreground/10 bg-transparent, no horizontal borders, no radius, no card fill. Reads like a Stripe case-study row instead of a shadcn dashboard tile.”

Three rounds is the persistent-dissent signal. Applied. The transformation was bigger than I expected. The card now reads top-to-bottom as a typographic newspaper proof block, not a SaaS widget.

v6: compression

Marc Lou: “You have ‘retentioncheck.com’ stamped 3 times in 200 vertical pixels. Triple-stamping the URL screams insecurity.” Adam, same round: “Three lines of receipt where the eye wants two.” April: “Eyebrow + body + footer all repeating the brand is the page being nervous.”

Fix: collapsed the score block to two lines (huge D, then one metadata line: “44/100 · Churn Health · Bottom 6 of 10 SaaS we graded”). Killed the 56px logo tile and moved it inline at 16px in the eyebrow. Folded the “30 seconds” claim into the eyebrow itself (“Notion · live grade · 30 seconds”). Single brand mark survives via the wordmark on the right of the eyebrow.

Net deletion: 1 standalone image element, 1 standalone copy block, 2 redundant domain stamps, and 3-to-2 lines on the score caption. Mobile CTAs landed above the fold on a 392px viewport without any sticky trickery, as a free side effect.

Five lessons that travel

If you take nothing else from this post, take these:

  1. Persistent dissent is the signal. When one expert flags the same fix three rounds in a row, they’re right and you’re flinching. Apply it.
  2. Convergent fixes ship; tensions defer. When Adam wanted a 140px gauge and Marc wanted a 64px gauge, the right move was to ask “is this a proof slot or a dashboard?” The answer (proof slot, subordinate to the paste-box CTA) resolved the tension in Marc’s favor and let v4 kill the gauge entirely per Adam’s deeper point.
  3. Diminishing returns hit fast. Rounds 1 to 3 did 24 points. Rounds 4 to 5 did 6. Stop when the marginal lift stops paying for itself.
  4. The artifact is the pitch. By v6 the card had no meta-copy narrating the tool from inside the proof. April’s read: “A category-defining tool doesn’t say ‘computed in 30 seconds’ mid-page. The artifact IS the pitch.” Trust the show, delete the tell.
  5. The process is half the answer. The other half is the willingness to delete what your AI built yesterday because today’s panel said it was wallpaper.

How to run this on your own landing page

If you’re a Claude Code user, the RetentionCheck repo has the /braintrust skill checked in at .claude/skills/braintrust/. It dispatches 5 to 13 expert personas in parallel and synthesizes the panel for you. The roster is organized into 6 roles (Founder Coach, Distribution Architect, Pricing & Monetization, Product Craft, Scope Surgeon, Positioning) with designated captains per role.

If you’re not on Claude Code, the same pattern works in any agent SDK that supports parallel tool calls. The minimal version: for each persona, pass the screenshot of your design + the persona’s lens + the request “score 0-100, give 3 specific fixes.” Then synthesize and ship.
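A sketch of that minimal version, where `callAgent` stands in for whatever your SDK exposes (its signature here is assumed, not any particular SDK's API):

```typescript
type Persona = { name: string; lens: string };
type AgentCall = (prompt: string, screenshot: string) => Promise<string>;

// One call per persona, all in flight at once.
async function dispatchPanel(
  screenshot: string,
  personas: Persona[],
  callAgent: AgentCall,
): Promise<string[]> {
  return Promise.all(
    personas.map((p) =>
      callAgent(
        `You are ${p.name}. Lens: ${p.lens}. ` +
          `Score this design 0-100 and give 3 specific fixes.`,
        screenshot,
      ),
    ),
  );
}
```

Each element of the result is one persona's raw review, ready for the synthesis step.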

If you don’t have an agent SDK: read each persona’s public writing, ask “what would they say about my page,” write the score and the 3 fixes yourself, then iterate. The persona discipline is the part that matters. The parallel dispatch is just the speedup.

What it cost

25 expert-agent calls across 5 rounds (5 personas per round). Roughly 15 minutes of wall-clock time per round, mostly waiting for parallel agent completions. Total: about 75 minutes of iteration time plus 15 minutes of synthesis and code application per round. A morning’s work for a 30-point lift on the single most-viewed module on my landing page.

If you’re a solo SaaS founder and you haven’t shipped a landing-page redesign in 6 months because every consultant quote came back at $4,000 and 3 weeks, this is the unlock. Five experts. Five rounds. One morning. Now you have a card that converges on what real experts would have charged you $4k to tell you.

What I’m doing next

I’m running the same braintrust on the rest of the home page next week. Hero. Trust row. Bento. Founder note. Each section gets the same 5-round panel. I’ll write up the results when they land.

If you want to run RetentionCheck on your own cancellation data while you’re here, the paste box on /try takes plaintext, CSV, or forwarded emails. Returns a Churn Health Score in 30 seconds. No signup for the first analysis.


Brian Farello is the founder of RetentionCheck, an AI churn analysis tool for solo SaaS founders. He builds in public at @brianfofficial.


Frequently Asked Questions

What is an AI design braintrust?

A panel of 5 to 13 AI personas, each grounded in the actual public writing and design philosophy of one real expert (Adam Wathan, April Dunford, etc.). Each scores your design 0-100 against their lens and returns 2-3 specific fixes. You ship the convergent fixes, screenshot the new version, re-dispatch the panel, and iterate until the average score plateaus.

How many rounds do you need to run?

5 rounds is the sweet spot for a single-component redesign. Rounds 1-3 typically deliver 70 to 80% of the total lift; rounds 4-5 polish the remainder. After round 5 the marginal lift drops below 2 points, which is below the friction cost of another iteration.

Which experts should I pick for a landing page review?

For a landing-page proof card or hero, the 5 lenses that matter are: visual craft (Adam Wathan or any design lead), indie SaaS trust (Pieter Levels or Marc Lou), positioning (April Dunford), conversion craft (Marc Lou or Jason Fried), and distribution leverage (Greg Isenberg). Cover those 5 lenses with whatever names you have public writing samples from.

How is this different from asking ChatGPT to review my landing page?

Three things. (1) Persona discipline: each agent has a different framework, so the feedback rarely contradicts and almost always sharpens. (2) Score-as-forcing-function: requiring a 0-100 number forces specificity. (3) Round-over-round iteration: you screenshot the new version and re-dispatch, so the panel is reacting to your applied fixes, not to a frozen snapshot. ChatGPT in a single shot gives you a wall of generic feedback. A 5-round panel gives you a 30-point measurable lift.

Can I run this on something that's not a landing page?

Yes. The same pattern works on pricing pages, onboarding flows, dashboards, email campaigns, blog post drafts, anything visual or copy-heavy. Just swap the persona panel to match the artifact. For pricing, lean on Patrick McKenzie and Rob Walling. For onboarding, lean on Sherry Jiang and Peter Yang. For email, lean on April Dunford and Patrick McKenzie.
