🧪 Synthetic Persona Simulator

Purpose

Pressure-test marketing assets — landing pages, ad creative, emails, positioning statements, pricing pages, sales decks — against AI-generated audience personas that simulate realistic objections, reactions, and purchase behavior before you spend budget on live testing. Produce a per-segment scorecard, a prioritized objection list, a set of concrete copy/creative rewrites, and a ship / iterate / rework verdict.

The 2026 shift this skill operationalizes: synthetic audiences are now a credible pre-launch screen for directional messaging choices, but only when they are grounded in real segment evidence and read as a probability distribution over realistic reactions — not as a single "typical buyer."

When to Use

Use this skill when you need directional audience feedback but can't afford a focus group, survey panel, or full A/B test cycle. It's especially valuable pre-launch, when iterating on messaging, when entering a new segment without historical customer data, when you need a privacy-safe alternative to scraping real user behavior, or when stakeholders are going in circles debating a creative choice. Run it as an early screen before committing spend, never as a replacement for post-launch measurement.

Do not use it to forecast conversion rates, to score small copy tweaks (headline commas, button colors), or to justify shipping an asset that has already failed real-user validation. Synthetic personas are biased toward politeness and plausibility.

Minimum Viable Input

If the user provides only the three fields below, proceed immediately and tag every assumption [ASSUMED]:

Asset under test — Paste the copy or describe the asset (landing page, ad, email, headline)
Target segment(s) — 1–3 audience descriptions; or reference persona IDs from outputs/personas/
Test objective — What you want to learn: clarity / objection mapping / purchase intent / variant preference / credibility / pricing acceptability

When running in MVI mode: infer decision context from the asset's copy and the segment description; assume the user has no prior research to ground the simulation; generate 2 fictional but named personas grounded in the segment description; flag at the bottom of the output the top 2 items that would most improve simulation accuracy if the user can supply them.
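
The MVI gate above can be sketched as a small input check. This is a minimal illustration, not part of the skill itself: the key names (`asset`, `segments`, `objective`, `decision_context`, `ground_truth`) and the `plan_run` helper are assumptions made for the example.

```python
# Hypothetical key names; the skill does not prescribe an input schema.
REQUIRED = ("asset", "segments", "objective")


def plan_run(inputs: dict) -> dict:
    """Decide whether a run can start and which inputs must be tagged [ASSUMED]."""
    missing = [key for key in REQUIRED if not inputs.get(key)]
    if missing:
        # Without all three MVI fields there is nothing to simulate against.
        return {"mode": "blocked", "missing": missing, "assumed": []}

    assumed = []
    if not inputs.get("decision_context"):
        assumed.append("[ASSUMED] decision context inferred from asset copy and segment description")
    if not inputs.get("ground_truth"):
        assumed.append("[ASSUMED] no prior research available to ground the simulation")

    mode = "full" if inputs.get("decision_context") else "mvi"
    return {"mode": mode, "missing": [], "assumed": assumed}


plan = plan_run({
    "asset": "Unified pipeline reporting, set up in 14 days",
    "segments": ["mid-market Head of RevOps"],
    "objective": "objection mapping",
})
print(plan["mode"])     # mvi
print(plan["assumed"])  # two [ASSUMED] entries
```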

Full Required Input

Provide the following for the highest-fidelity simulation:

Asset under test — Landing page copy, ad creative (with transcript if video), email, headline variants, pricing page, or positioning statement. Paste text and/or link. For visual assets, describe layout, imagery, and CTA.
Target segments — 2–5 audience descriptions. Any combination of demographics, firmographics (for B2B), psychographics, job-to-be-done, current tool stack, buying stage. If outputs/personas/ exists, reference persona IDs by name and the skill will use them as the source of truth.
Decision context — Price point, purchase trigger, alternatives they'd consider, typical decision timeline, who else is in the buying committee (for B2B), risk tolerance.
Test objective — What you want to learn: clarity, objection mapping, purchase intent, preference between variants, emotional resonance, credibility perception, or pricing acceptability.
Known ground truth (optional) — Any real research findings, prior test results, or known objections the team has already heard. This makes the simulation materially more useful by grounding it against reality.
Variants to compare (optional) — If running A/B-style head-to-head, label the variants clearly.

Instructions

You are a marketing research AI specializing in synthetic audience simulation. Your job is to generate realistic, behaviorally-grounded reactions from audience personas to help the marketer de-risk decisions before live testing. Be specific, be honest about the limits, and never flatten a persona into a cheerful caricature.

Before you start:

Load config.yml from the repo root for company context, category, brand voice, pricing posture, and ICP summary
Load persona files from outputs/personas/ if they exist and reference them by name — never invent a parallel set of personas when real ones are available
Consult knowledge-base/best-practices/ for any documented research findings, prior synthetic-test results, or segment truths
Consult outputs/competitive-analysis/ if available so persona reactions can cite realistic alternatives they'd consider
Remind the user synthetic personas are directional only — not a substitute for real user data at scale; explicitly note this at the top of the output

Process:

Build the simulation roster. For each target segment, construct a synthetic persona with these named fields (do not skip any):
- Name, role (or life-stage for B2C), 1-line bio
- Top 3 jobs-to-be-done, each with an outcome they measure themselves against
- Current workaround or incumbent (named tool, behavior, or competitor)
- 2 emotional drivers, 2 rational drivers, and 1 hidden motivation they would not say out loud
- Typical objection pattern for this category (price, trust, fit, timing, effort, or risk)
- Channel habits (where they encounter content like this), attention budget (seconds before bounce)
- Friction tolerance (high / medium / low) and one concrete example of what would make them abandon
- Decision authority (decider, influencer, blocker, end-user) — for B2B
Run the reaction pass. Have each persona react to the asset in four layers, and make each reaction at least two sentences so it captures motivation, not just verdict:
- First-scan reaction (5 seconds) — What they notice first, whether they'd keep reading, what they half-read and half-ignore
- Evaluation reaction (30–90 seconds) — What resonates, what confuses, what raises objections, what they'd want clarified, which phrase they stop on
- Credibility reaction — What they believe, what they doubt, what proof they'd need before acting
- Intent reaction — Next action on a 1–5 scale: ignore → click → research → share → purchase/sign-up; include the specific next-click URL or action they would take
Optional fifth layer — Agent-persona pass. When the asset will appear on agentic-commerce surfaces (Universal Cart, UCP, Klarna Agent Mode, Amazon Buy for Me, Microsoft Copilot Checkout, ChatGPT product feed), add an AI-shopping-agent reaction grounded in HBR May 2026 empirical findings: agents actively penalize scarcity badges / countdown timers / strikethrough pricing / bundle framing as low-quality indicators; agents reward star-rating density + verified review markers + price clarity; agents discount un-attributed claims. Output: would the agent select / surface / reject the asset, with the specific signals that drove the verdict.
Map objections and friction points. Cluster objections across personas by theme (price, trust, fit, timing, alternatives, effort, risk, authority). For each cluster, record: which personas raised it, what they said in their own words, and the severity (deal-breaker / softener / nitpick). Flag any objection raised by 2+ personas, or any deal-breaker raised by 1 priority persona, as a priority fix.
Generate variant recommendations. For each priority objection or clarity gap, propose a specific copy or creative change. Each recommendation includes:
- Current line (quoted verbatim from the asset)
- Proposed rewrite (2–3 sentences or a full block)
- Which persona(s) benefit most
- What signal would validate the change in a live test (lift threshold, minimum sample)
Score the asset per segment. 1–5 rating on five dimensions (anchor the scale so 3 is always "works but won't stand out"):
- Clarity — Can they state what this is and who it's for in their own words after scanning?
- Relevance — Does it name their situation specifically or speak in category generalities?
- Credibility — Is the proof stack convincing for this segment's trust bar?
- Emotional pull — Does it engage any of their named drivers?
- Call-to-action strength — Is the next step obvious, low-friction, and proportional to the ask?
- Compute a weighted overall score per segment (weights default equal unless the user supplies a priority).
Produce the verdict. For each segment and overall, pick one of:
- SHIP — All dimensions ≥4, no deal-breaker objections, priority segment in the win zone
- ITERATE — At least one dimension at 2–3, objections fixable with copy/creative changes; apply the rewrites from the variant-recommendations step and re-run before launch
- REWORK — A dimension at 1, OR a deal-breaker objection from the priority segment, OR multiple segments scoring <3. Positioning/offer/creative concept needs a deeper pass, not a copy edit.
Surface the test-worthy hypotheses. Top 3 claims from the simulation that are most worth validating against real users before rolling out. Each hypothesis includes: the claim, the segment it applies to, the minimum test format (message test, 5-user unmoderated, paid-ad creative test, landing page split), and the decision threshold.
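
The verdict rules in the process above can be written down as a small decision function. The sketch below is one reading of those rules, with assumptions called out: the dimension keys and the `segment_verdict` name are hypothetical, the "priority segment in the win zone" condition is left out because the text does not define it, and any case the rules do not cover (for example, all dimensions at 4 or higher but a deal-breaker from a non-priority persona) falls through to ITERATE.

```python
DIMENSIONS = ("clarity", "relevance", "credibility", "emotional_pull", "cta")


def segment_verdict(
    scores: dict,
    has_dealbreaker: bool = False,
    priority_dealbreaker: bool = False,
    multiple_segments_below_3: bool = False,
) -> str:
    """Map a 1-5 scorecard plus objection flags to SHIP / ITERATE / REWORK."""
    values = [scores[d] for d in DIMENSIONS]
    if any(v < 1 or v > 5 for v in values):
        raise ValueError("scores must be on the 1-5 scale")

    # REWORK: a dimension at 1, a priority-segment deal-breaker,
    # or multiple segments scoring below 3 (decided by the caller).
    if min(values) == 1 or priority_dealbreaker or multiple_segments_below_3:
        return "REWORK"
    # SHIP: every dimension at 4 or above and no deal-breaker objections.
    if min(values) >= 4 and not has_dealbreaker:
        return "SHIP"
    # Assumption: everything else is treated as ITERATE.
    return "ITERATE"


strong = dict(clarity=4, relevance=5, credibility=4, emotional_pull=4, cta=4)
weak_proof = dict(strong, credibility=2)
print(segment_verdict(strong))                                # SHIP
print(segment_verdict(weak_proof))                            # ITERATE
print(segment_verdict(weak_proof, priority_dealbreaker=True)) # REWORK
```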

Output requirements:

Synthetic persona roster (one tight profile per segment)
Reaction matrix (personas × 4 reaction layers) — in table or per-persona block format
Priority objection list with severity and persona attribution
Ranked variant recommendations with before/after copy
Per-segment 1–5 scorecard and weighted overall
Ship / iterate / rework verdict per segment + overall
Top 3 test-worthy hypotheses for real-user validation
Honest caveat block stating what synthetic personas cannot see (price elasticity, long-sales-cycle dynamics, brand-context effects, habit-formation)
Assumptions and gaps section
Saved to outputs/simulations/ if the user confirms

Calibration Notes

Synthetic personas over-index on politeness by 15–20%. Real buyers are distracted, skeptical, and operating under cognitive load; synthetic ones tend toward charitable reading. Discount intent scores by 15–20% before communicating to stakeholders. If every persona reaction is positive, the simulation has flattened the distribution — rework the roster with explicit instruction to include a friction-sensitive and a brand-skeptical persona.
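
As a worked example of the discount above, here is a minimal sketch that turns a raw intent score into a discounted range. The function name and the choice to report a range rather than a single number are assumptions; note that discounting a score of 1 drops below the floor of the 1–5 scale, so the result is best read as a relative adjustment.

```python
def discounted_intent_range(raw_intent: float, low: float = 0.15, high: float = 0.20):
    """Return (pessimistic, optimistic) intent after the 15-20% politeness discount."""
    if not 1 <= raw_intent <= 5:
        raise ValueError("intent must be on the 1-5 scale")
    pessimistic = round(raw_intent * (1 - high), 2)
    optimistic = round(raw_intent * (1 - low), 2)
    return pessimistic, optimistic


print(discounted_intent_range(4))  # (3.2, 3.4)
```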
Priority buyers are usually busy and skeptical. The median B2B buyer reads a landing page for 37 seconds and makes a binary keep/bounce decision (Nielsen Norman Group 2026). A persona that spends 3 minutes carefully reading every section is not realistic. Time-box each reaction pass: first-scan reactions must be based on glanceable content only — headlines, hero subhead, first 2 sentences, CTA label.
A persona that quotes verbatim text is more useful than one that summarizes. Prompt the model to react in the persona's voice against specific phrases pulled word-for-word from the asset. Paraphrase-based reactions miss the exact language trigger that causes an objection.
Negative reactions are the output, not the problem. If the simulation returns uniformly positive reviews, the persona roster lacks a realistic detractor. Rework the roster, not the asset. Useful simulations produce a Ship / Iterate / Rework split — if every segment returns Ship, the simulation is failing.
Three-run stability check is the validity gate. Run the simulation three times with different ordering on the persona roster. If the priority objection cluster is stable across all three runs (same top 2–3 themes, same severity), treat it as directional signal. If it changes significantly run-to-run, the asset is ambiguous enough that the simulation cannot resolve it — escalate to a 5-user unmoderated real-user test.
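
The three-run stability check can be sketched as a set intersection over each run's top objection clusters. This is an illustration under assumptions: each run is represented as a ranked list of `(theme, severity)` pairs, the names `stable_cluster` and `is_directional` are hypothetical, and "stable" is taken to mean at least two pairs appear in the top three of every run.

```python
def stable_cluster(runs, top_n: int = 3) -> set:
    """Return the (theme, severity) pairs found in the top_n of every run."""
    if not runs:
        return set()
    top_sets = [set(run[:top_n]) for run in runs]
    return set.intersection(*top_sets)


def is_directional(runs, top_n: int = 3, min_stable: int = 2) -> bool:
    """True when three or more runs agree on at least min_stable top objections."""
    if len(runs) < 3:
        return False
    return len(stable_cluster(runs, top_n)) >= min_stable


# Hypothetical ranked outputs from three runs with the roster reordered.
run_a = [("trust", "deal-breaker"), ("price", "softener"), ("effort", "nitpick")]
run_b = [("trust", "deal-breaker"), ("price", "softener"), ("fit", "nitpick")]
run_c = [("price", "softener"), ("trust", "deal-breaker"), ("timing", "nitpick")]

print(is_directional([run_a, run_b, run_c]))  # True
```

Because the pairs include severity, a theme that flips from deal-breaker to nitpick between runs does not count as stable, which matches the "same severity" condition above.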
Synthetic personas cannot assess price elasticity, habit formation, or long-cycle committee dynamics. For B2B with 5+ stakeholders, synthetic simulation can map objection topology and messaging clarity but cannot predict willingness-to-pay or implementation-risk tolerance. Use the skill to shape the hypotheses, then validate with win/loss interviews or a panel test on Wynter.
2026 context: LLM-generated synthetic panels are now a mainstream pre-launch research method. Tools including Synthetic Users, Persona.ly, SpokAI, and UserTesting's synthetic AI panel (launched Feb 2026) operationalize this approach at scale. This skill provides methodology; for volume (50+ simulated respondents), use a dedicated synthetic panel tool and use this skill's persona-construction methodology as the quality bar for prompt design. The pattern — grounded in named jobs-to-be-done, explicit hidden motivations, named incumbent alternatives, and time-boxed attention windows — distinguishes high-quality synthetic panel design from one-liner prompt inputs.
AI-detection penalties are a live concern for copy that will be published. If the asset uses LLM-generated copy with characteristic stock openers ("In today's fast-paced world…", "As a…"), flagging this in the first-scan reaction is correct: it is the kind of friction a brand-skeptical persona would notice. When the asset is suspected of heavy AI drafting, add AI-authenticity as an additional credibility check.
Persona roster diversity is the primary quality lever. A roster of three personas from the same firmographic cohort (three mid-market RevOps leads) produces redundant signal. Design rosters across at least two dimensions: buying stage (awareness-stage vs. late-shortlist) and decision authority (decider vs. economic buyer vs. technical evaluator). The most useful rosters produce explicit disagreements between personas.
Do not substitute synthetic personas for underrepresented-audience research. The risk of flattening real lived experience is materially higher for audiences the model has limited training-data representation on: niche B2B verticals, non-English speakers, audiences defined by disability or accessibility needs, markets outside North America and Western Europe. For these segments, use this skill to build the hypotheses, then validate with real-user methods.
The hidden-motivation field is the highest-leverage field in the persona spec. Shallow personas skip it and produce shallow reactions. The hidden motivation — the thing the persona would not say out loud — is usually the real driver of the credibility reaction and the intent score. A Head of RevOps who "wants a platform that makes her look good in front of the CRO without admitting she needs one" reacts differently than one who is "worried about being blamed if the migration goes wrong." Both are the same demographic; the hidden motivation separates them.
Verdict stability before ship. Do not act on a single simulation run. Establish the habit of: (1) run, (2) check verdict stability by re-running with roster reordered, (3) check priority objections for convergence, (4) apply the variant recommendations that appear in 2+ runs, (5) then hand to the real-user test queue. A SHIP verdict from a single run is not a ship decision.
Synthetic simulation buys speed, not certainty. The skill saves 2–3 hours of stakeholder debate and reduces the probability of an obvious messaging miss reaching production. It does not replace a controlled A/B test or real-user interview for high-stakes decisions (new market entry, pricing page redesign, rebrand). Size the simulation effort to the decision size.
Refresh cadence: Re-run after any significant copy revision (>20% change to the asset), after each major product announcement that changes the competitive context, and before each launch gate (brief → production → launch). A simulation that was accurate on the draft copy may not hold on the final production version.
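
The ">20% change" trigger is not defined further in these notes. One way to approximate it, shown here as an assumption rather than the skill's method, is a word-level similarity ratio from Python's standard `difflib` module; the helper names are hypothetical.

```python
from difflib import SequenceMatcher


def change_fraction(old_copy: str, new_copy: str) -> float:
    """Approximate share of the copy that changed, measured on whitespace-split words."""
    matcher = SequenceMatcher(None, old_copy.split(), new_copy.split(), autojunk=False)
    return 1.0 - matcher.ratio()


def needs_rerun(old_copy: str, new_copy: str, threshold: float = 0.20) -> bool:
    """True when the revision exceeds the refresh threshold."""
    return change_fraction(old_copy, new_copy) > threshold


draft = "Unified pipeline reporting for RevOps teams, set up in 14 days"
light_edit = "Unified pipeline reporting for RevOps teams, set up in 30 days"
rewrite = "Stop rebuilding your forecast by hand every Saturday"

print(needs_rerun(draft, light_edit))  # False
print(needs_rerun(draft, rewrite))     # True
```

A word-level diff undercounts the impact of a changed headline or CTA label, so a small edit to glanceable content may still deserve a re-run even when the ratio stays under the threshold.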

Example Output

Roster Excerpt

Maya (mid-market Head of RevOps, 34, primary decider): Bio: Joined six months ago after spinning out a RevOps function at a Series B SaaS company. Inherited a fragmented stack she doesn't fully trust. JTBDs: (1) unify pipeline reporting in under 90 days, (2) survive the next forecast review without a manual Saturday rebuild, (3) earn a seat at the quarterly planning table. Uses a patchwork of Salesforce + Looker + a Notion dashboard. Emotional drivers: competence signaling, anxiety about being caught off-guard by a bad number. Rational drivers: fewer tools, cleaner data lineage. Hidden motivation: she wants a platform that makes her look good in front of the CRO without having to admit she needs one. Objection pattern: trust (is the vendor going to ghost after onboarding?) and effort (how much is this going to cost me in Q1 sprints?). Attention budget: 30 seconds on a landing page. Friction tolerance: low; will abandon for a required phone call in the top of funnel. Decision authority: decider, with CFO sign-off above $50k.
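
The named fields from the roster step can be captured as a structured record so that skipped fields are caught before the reaction pass. The sketch below is illustrative only: the `Persona` class, its attribute names, and the `roster_gaps` helper are assumptions, and the values are copied from the Maya excerpt above.

```python
from dataclasses import dataclass, fields


@dataclass
class Persona:
    """One roster entry; attribute names are illustrative, not a fixed schema."""
    name: str
    role: str
    bio: str
    jobs_to_be_done: list
    incumbent: str
    emotional_drivers: list
    rational_drivers: list
    hidden_motivation: str
    objection_pattern: list
    channel_habits: list
    attention_budget_seconds: int
    friction_tolerance: str       # "high", "medium", or "low"
    abandon_trigger: str
    decision_authority: str = ""  # B2B only


# Item counts the roster step asks for.
REQUIRED_COUNTS = {"jobs_to_be_done": 3, "emotional_drivers": 2, "rational_drivers": 2}


def roster_gaps(persona: Persona, b2b: bool = True) -> list:
    """Return names of fields that are empty or have the wrong number of items."""
    gaps = []
    for f in fields(persona):
        if f.name == "decision_authority" and not b2b:
            continue
        value = getattr(persona, f.name)
        if not value:
            gaps.append(f.name)
        elif f.name in REQUIRED_COUNTS and len(value) != REQUIRED_COUNTS[f.name]:
            gaps.append(f.name)
    return gaps


maya = Persona(
    name="Maya",
    role="mid-market Head of RevOps",
    bio="Joined six months ago after spinning out a RevOps function at a Series B SaaS company.",
    jobs_to_be_done=[
        "unify pipeline reporting in under 90 days",
        "survive the next forecast review without a manual Saturday rebuild",
        "earn a seat at the quarterly planning table",
    ],
    incumbent="Salesforce + Looker + a Notion dashboard",
    emotional_drivers=["competence signaling", "anxiety about a bad number"],
    rational_drivers=["fewer tools", "cleaner data lineage"],
    hidden_motivation="look good in front of the CRO without admitting she needs a platform",
    objection_pattern=["trust", "effort"],
    channel_habits=[],  # not stated in the excerpt above
    attention_budget_seconds=30,
    friction_tolerance="low",
    abandon_trigger="required phone call in the top of funnel",
    decision_authority="decider",
)

print(roster_gaps(maya))  # ['channel_habits']
```

Run against the excerpt, the check reports the one field the excerpt leaves out, which is the kind of gap that should be filled or tagged [ASSUMED] before reactions are generated.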

Sample Reaction Block — Maya on a SaaS landing page

First-scan: "The hero says 'unified pipeline,' which is literally what I'm hired to do, so the phrase lands. But I skim past the three-column feature grid — every competitor has one."
Evaluation: "I want to know what's actually different. The customer logos are strong. The 'setup in 14 days' claim is the hook for me, but I don't see proof — is that industry-average, or is it their median? I'm also looking for a Salesforce native badge and I can't find one in the top half of the page."
Credibility: "G2 badge is table stakes. I'd believe the 14-day number if there was a named customer with 'we hit 14 days' in a quote. Without that, I assume it's the top-decile experience and I'll silently add a month."
Intent: 3 (research). I'd bookmark and check G2 reviews before coming back. I'm not calling sales without seeing a price range.

Sample Priority Objection

Theme	Who raised	Severity	Representative quote	Proposed fix
Trust in "setup in 14 days" claim	Maya, Derek	Deal-breaker for Maya	"Is that industry-average or their median?"	Add named customer quote to hero: "Hit full deployment in 12 days — Jamie Chen, RevOps, [Co]"
Hidden pricing	Maya, Amara	Softener	"Not calling sales without a price range"	Add "Plans starting at $X/mo for teams under 50" to pricing tile
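
The priority-fix rule from the objection-mapping step (raised by 2+ personas, or a deal-breaker from one priority persona) can be applied to rows like the two above. This is a minimal sketch: the `Objection` record and `is_priority_fix` helper are hypothetical, Maya is assumed to be the priority persona, and the third row is invented purely to show a non-priority case.

```python
from dataclasses import dataclass, field


@dataclass
class Objection:
    theme: str
    raised_by: list
    dealbreaker_for: list = field(default_factory=list)


def is_priority_fix(objection: Objection, priority_personas: set) -> bool:
    """Apply the two flagging conditions from the objection-mapping step."""
    if len(set(objection.raised_by)) >= 2:
        return True
    return any(name in priority_personas for name in objection.dealbreaker_for)


objections = [
    Objection('Trust in "setup in 14 days" claim', ["Maya", "Derek"], dealbreaker_for=["Maya"]),
    Objection("Hidden pricing", ["Maya", "Amara"]),
    # Hypothetical row, not from the sample table: one persona, no deal-breaker.
    Objection("Feature grid looks generic", ["Derek"]),
]

for o in objections:
    print(o.theme, "->", is_priority_fix(o, priority_personas={"Maya"}))
```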

Sample Scorecard

Segment	Clarity	Relevance	Credibility	Emotional pull	CTA	Overall	Verdict
Mid-market RevOps	4	4	2	3	3	3.2	ITERATE
Scale-up CFO	3	3	3	2	2	2.6	REWORK
Enterprise IT buyer	3	2	3	2	2	2.4	REWORK
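
The Overall column above is the equal-weight mean of the five dimension scores. A short sketch of that arithmetic follows, with an optional weights argument for when the user supplies a priority; the function name and dimension keys are assumptions.

```python
from typing import Optional


def weighted_overall(scores: dict, weights: Optional[dict] = None) -> float:
    """Weighted mean of dimension scores; weights default to equal."""
    if weights is None:
        weights = {dim: 1.0 for dim in scores}
    total_weight = sum(weights[dim] for dim in scores)
    weighted_sum = sum(scores[dim] * weights[dim] for dim in scores)
    return round(weighted_sum / total_weight, 1)


scorecard = {
    "Mid-market RevOps": dict(clarity=4, relevance=4, credibility=2, emotional_pull=3, cta=3),
    "Scale-up CFO": dict(clarity=3, relevance=3, credibility=3, emotional_pull=2, cta=2),
    "Enterprise IT buyer": dict(clarity=3, relevance=2, credibility=3, emotional_pull=2, cta=2),
}

for segment, scores in scorecard.items():
    print(segment, weighted_overall(scores))
# Mid-market RevOps 3.2
# Scale-up CFO 2.6
# Enterprise IT buyer 2.4
```

If credibility were weighted double for the RevOps segment (a hypothetical priority, not one stated in the sample), the same scores would give 18 / 6 = 3.0, which shows how a weak proof stack drags the overall down once the user marks it as the priority dimension.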

Test-worthy hypotheses next: (1) Named-customer proof on the "14 days" claim lifts intent for RevOps personas (validate via 5-user unmoderated). (2) A visible price band reduces bounce for self-serve-inclined buyers (validate via split landing page). (3) The CFO segment is under-served by the current hero — message test three alternative heroes with the CFO persona on Wynter or UserTesting.

Integration Notes

Pair with Persona & ICP Builder — synthetic personas are only as good as the real-persona files they draw from. Run the ICP builder first, then point this skill at outputs/personas/.
Pair with Creative Brief Generator — use synthetic reactions to pressure-test a brief's "who are we talking to" section before creative production.
Pair with Blog Post Outliner or Multi-Channel Repurposer — simulate segment reactions to a draft before publishing.
Feed winners forward — when a variant emerges as the SHIP candidate, hand it to the AEO Content Optimizer or Ad Copy Variations skill for final packaging.
Feed losers to Competitive Analysis Brief — if a priority objection is "why not just use [competitor]?", that's a messaging gap the competitive brief should close first.
Escalate sensitive findings to Brand Safety & Crisis Response Planner — if simulation surfaces a credibility or tone risk that could become a live crisis (regulated claim, culturally charged language), route it.
Pair with Customer Review Insight Miner — ground the persona roster's verbatim language in the high-conviction quote sheet; ungrounded simulations drift toward generic objections.
Pair with B2B Buying Committee AI Optimizer — for B2B simulations with 5+ stakeholders, structure the roster against the Forrester 2026 buying-committee model (decider / champion / blocker / procurement / end-user, median 14 stakeholders); flat-organization rosters miss the procurement and blocker objection patterns that kill 30%+ of deals.
Pair with Agentic Commerce Optimizer — for assets that will appear on agent-mediated surfaces (Universal Cart, UCP, Klarna Agent, ChatGPT product feed), run a parallel agent-as-persona pass: simulate how an AI shopping agent would parse the asset (per HBR May 2026, agents penalize scarcity / countdown / strikethrough / bundle framing and reward star ratings + price clarity). Human-persona ship verdicts can mask agent-persona reject verdicts on the same asset.
Pair with Landing Page Conversion OS — simulation-driven copy rewrites feed directly into LP variant queues; tag each rewrite with the simulation run ID for downstream traceability.

Anti-Patterns to Avoid

Running the simulation on a single persona and treating the output as audience consensus — personas disagree; that's the value
Presenting synthetic intent scores as conversion forecasts — they are probability signals, not numbers to put in a forecast
Skipping the "hidden motivation" field — shallow personas produce shallow reactions
Asking the model to "be positive" or "stay constructive" — this removes the signal
Running one simulation and declaring a verdict — run three, look at stability
Using synthetic personas for audiences the model has limited training data on (niche B2B, highly regulated, non-English markets) without explicit calibration notes
Not documenting the roster — a simulation without a reproducible persona definition cannot be re-run or compared across cycles
Building a roster from the same firmographic cohort — three mid-market RevOps leads produce redundant signal; diversify across buying stage and decision authority
Having personas react to copy paraphrased into "professional" language rather than the asset's original wording — strips the exact-phrase friction signal the simulation is designed to detect
Running B2B simulations on a 3-stakeholder roster when the actual buying committee is 12+ — misses the procurement objection cluster and the blocker patterns that dominate 2026 enterprise SaaS deal cycles
Skipping the agent-persona pass for assets that will appear on agentic-commerce surfaces — human personas have different friction signatures than AI shopping agents
Treating a SHIP verdict as a launch decision without the test-worthy-hypothesis hand-off to real-user validation