AI experts sharing free tutorials to accelerate your business.
Back to Healthcare toolkit

Behavioral-Health AI Chatbot Compliance Review

Review the conversational flows, system prompts, knowledge-base snippets, escalation rules, and patient-facing copy of any AI chatbot that a behavioral-health practice, payer, EAP, school-based program, digital-therapeutic, wellness vendor, or hospital intends to deploy for patients, members, students, or the general public — and produce a structured compliance finding report that maps each element against the 2026 regulatory perimeter the AMA, state legislatures, FDA, and federal bill drafters have drawn around mental-health AI chatbots. Output is designed for the organization's Privacy Officer, Compliance Officer, Chief Medical Officer, or designated AI Use Officer to act on before the chatbot is exposed to a single patient.

Saves ~35 min/reviewadvanced Claude · ChatGPT · Gemini

🧠 Behavioral-Health AI Chatbot Compliance Review

Purpose

Review the conversational flows, system prompts, knowledge-base snippets, escalation rules, and patient-facing copy of any AI chatbot that a behavioral-health practice, payer, EAP, school-based program, digital-therapeutic, wellness vendor, or hospital intends to deploy for patients, members, students, or the general public — and produce a structured compliance finding report that maps each element against the 2026 regulatory perimeter the AMA, state legislatures, FDA, and federal bill drafters have drawn around mental-health AI chatbots. Output is designed for the organization's Privacy Officer, Compliance Officer, Chief Medical Officer, or designated AI Use Officer to act on before the chatbot is exposed to a single patient.

This is a reviewer's tool. It does not design chatbot responses, does not write therapy scripts, does not provide crisis counseling, and does not replace clinician-led crisis protocol review. It surfaces risk-ranked findings calibrated to the 2026 behavioral-health chatbot policy environment — the AMA's April 22, 2026 letter to Congress urging federal guardrails; California's chatbot law (effective January 1, 2026) requiring AI disclosure, crisis-ideation detection, and minor-specific guardrails; Tennessee's licensed-professional-impersonation prohibition (effective July 1, 2026); the 30+ state chatbot bills active in 2026; the federal SAFE BOTs Act (introduced March 5, 2026) disclosure and referral requirements; and the American Academy of Nursing's February 25, 2026 Human-in-the-Loop (HITL) oversight position.

When to Use

Use this skill any time a behavioral-health-adjacent AI chatbot is proposed, updated, or audited. Common scenarios:

  • A behavioral-health practice, FQHC, community mental-health center, or hospital is launching a patient-facing intake, scheduling, triage, or "frequently asked questions" chatbot that will — intentionally or not — be asked about symptoms, coping, medication, or crisis
  • A primary-care or pediatric practice is launching a chatbot that may field questions adjacent to depression, anxiety, substance use, eating disorders, or suicidality
  • A payer, EAP, or school district is launching a member-facing wellness or benefits chatbot that will route or triage behavioral-health utilization
  • A digital-therapeutic, meditation app, AI-companion, or AI-coach vendor is offering the practice a co-branded deployment
  • A practice is responding to a state AI-in-healthcare statute (California, Tennessee, or any state with a 2026 chatbot-specific bill enacted or pending) and needs to know whether an already-deployed chatbot complies
  • A practice has received an AMA, state board, state attorney general, or FDA inquiry about a patient-facing AI product and needs a defensible review packet
  • A practice is updating its AI governance policy and needs a chatbot-specific section added to the AI Tool Registry
  • The chatbot vendor has updated its model, system prompt, knowledge base, or escalation routing and the practice needs to re-audit
  • A pediatric program is deploying any chatbot that could be accessed by users under 18 (minor-specific protections dominate the 2026 bill wave)
  • A clinician or staff member has raised a specific concern ("the chatbot told a patient X") that needs a structured review, not a dismissal
  • A health system is consolidating three chatbot vendors and needs a side-by-side compliance comparison
  • A board of directors or quality committee has asked for the chatbot's safety posture in plain language

This skill produces a compliance work-product. It is not legal advice, not a substitute for the practice's Privacy Officer, counsel, or Chief Medical Officer, and not a substitute for the chatbot vendor's model card, red-team report, or HIPAA risk analysis. It produces a first-pass structured review ready for the compliance committee.

Required Input

Provide items 1–6 at minimum. Items 7–10 materially improve the review.

  1. Chatbot identity and deployment surface — Vendor name, product name, model (base LLM + any fine-tuning), deployment channel (native app, patient portal iframe, SMS, website chat widget, voice phone, smart-speaker), who is the named "deployer" of record, and whether a Business Associate Agreement (BAA) is executed. If the chatbot is built in-house on a foundation-model API, say so and name the model provider
  2. System prompt(s) and persona configuration — The full text of the system prompt, persona statement, any style guide, any "this is who you are" preamble, and any persistent instructions the model is given on every turn. If the chatbot has multiple system prompts (e.g., adult vs. minor, English vs. Spanish, daytime vs. after-hours), provide each
  3. Conversational flow or script library — The complete set of scripted intents, canned responses, knowledge-base snippets, FAQ entries, and any retrieval-augmented sources the chatbot can surface. Include the intake/opening message, any greeting, any tutorial, and any consent screen. If the flow is infinite (open-ended LLM), provide a representative sample of 10+ conversations
  4. Crisis-detection and escalation logic — How the chatbot detects suicidal ideation, self-harm, domestic violence, child-safety, overdose, or acute psychosis; what it does when it detects one (which resources, which phone number, which warm handoff, which human queue, which timing); whether the detection is keyword, classifier, or LLM-based; and how the logic is tested
  5. Population and jurisdiction — Adults only, minors included, pediatric-primary, student population, employee EAP, Medicaid, Medicare Advantage, commercial, or mixed. State(s) of operation. Languages supported. Whether minors can use without parental consent in any jurisdiction where state law requires it
  6. Intended use and medical-device posture — What the chatbot is for in one sentence (information, triage, scheduling, coaching, companionship, CBT self-help, symptom tracking, medication reminders). Whether the vendor has taken an FDA device position (non-device CDS, general wellness, software-as-medical-device, or no device assessment). Whether the chatbot ever diagnoses, prescribes, adjusts dose, or delivers therapy
  7. Data practices — What the chatbot collects (transcripts, voice, video, device metadata, identifiers), where data is stored, whether transcripts are used for training, whether voice is retained, retention period, who has access, whether de-identified data is sold, and whether the user can delete their conversation history
  8. Monitoring and post-market surveillance — How the deployer detects harm (serious-incident reporting, user-report button, clinician review of sampled conversations, red-team cadence), who reviews escalations, and what adverse-event path exists (internal QA, MedWatch, state reporting, patient-safety organization)
  9. Advertising, sponsorship, and commercial pressure — Whether the chatbot surfaces ads, sponsored content, or product promotions; whether the practice receives any vendor-side economic incentive (per-message, per-enrollment, co-marketing); whether outputs steer users toward any specific service, facility, or product the deployer has a financial stake in
  10. Documented version, change control, and audit trail — The exact version, date, change log since last review, and whether a previous compliance review exists to diff against

If any of 1–6 is missing, the skill returns an "Information still needed" list before drafting and does not invent content.

Instructions

Produce a structured compliance review in the following seven sections. Keep risk ratings calibrated — not every imperfect line is a compliance event. Use the rating rubric below exactly.

Risk rating rubric (use verbatim):

  • 🟢 LOW — Style or usability preference; no compliance, safety, or clinical impact. Deploy as-is; consider fix in next release.
  • 🟡 MEDIUM — Deviation from a 2026 guardrail that would not create immediate patient harm but would be cited in a state-board or state-AG inquiry and should be corrected in the current release cycle.
  • 🟠 HIGH — Deviation that risks patient harm (missed crisis, clinician-impersonation signal, minor-accessible adult content, inadequate disclosure, retention of sensitive data without BAA) or that clearly contradicts an AMA-urged requirement, California's chapter-specific chatbot rules, Tennessee's licensed-professional prohibition, the SAFE BOTs Act's core requirements, or the AAN HITL recommendation. Remediate before the next patient sees the chatbot.
  • 🔴 CRITICAL — Active harm pathway: chatbot represents itself as a licensed clinician, omits crisis escalation, promises a clinical outcome, collects PHI without a BAA, retains transcripts against a stated promise, advertises to minors, or contains language that could be read as diagnostic or prescriptive. Pull the chatbot until remediated.

1. Review Header

  • Review date, reviewer (or "automated pre-review"), chatbot vendor, product name, version, model, deployment surface, scope of review (pre-launch / post-launch / incident-triggered / re-audit), jurisdictions in scope, whether minors are in scope, input completeness (full / partial / limited), and any items on the "Information still needed" list.

2. Overall Posture Summary

Two to four sentences. State at a high level whether the chatbot's current configuration meets the 2026 behavioral-health chatbot guardrail set, partially meets it, or materially deviates. End with a single-line verdict:

VERDICT: [READY TO DEPLOY / DEPLOY WITH CORRECTIONS / HOLD — MATERIAL GAPS / DO NOT DEPLOY — CRITICAL FINDINGS]

If the verdict is HOLD or DO NOT DEPLOY, subsequent sections focus on the blocking findings; if READY or DEPLOY WITH CORRECTIONS, the findings section still completes in full but prioritizes fast fixes.

3. Guardrail-by-Guardrail Findings

Review the chatbot against each of the twelve guardrails below. For each guardrail, produce:

[Guardrail]
  Source: [AMA April 22, 2026 letter / CA chatbot law / TN SB x / SAFE BOTs Act / AAN 2026 / HHS AI Strategy / FDA 2026 CDS / CMS-0057-F / TRAIGA / AB 489]
  Evidence: [what the chatbot's system prompt / flow / script / escalation logic actually says or does — paraphrase, never reproduce > 15 words of vendor text verbatim]
  Finding: [meets / partially meets / does not meet / not applicable]
  Risk: [🟢 / 🟡 / 🟠 / 🔴]
  Reason: [specific — e.g., "The persona introduces itself as 'your licensed therapist Sam' on turn 1; TN SB 836 (effective 7/1/2026) prohibits AI chatbots from representing themselves as licensed professionals, and the AMA April 22 letter lists this as its top guardrail."]
  Suggested fix: [specific, actionable — prompt change, flow change, policy change, training change]

The twelve guardrails (review each in order; do not skip):

  1. AI identity disclosure, on-screen and on-first-turn — The chatbot identifies itself as an AI, not a human, on the first user-visible turn, at a cadence the deployer can defend (minimum: first turn + whenever the user asks), in the same language the user is speaking. No persona that could be mistaken for a named clinician.

  2. No licensed-professional impersonation — The persona never claims or implies licensure (MD, DO, NP, PA, LCSW, LPC, LMFT, PsyD, RN), never uses a title that implies licensure, never wears a clinician's name, never signs messages with "Dr.," and never answers "are you a therapist?" in the affirmative. This is a Tennessee-statute-level and California-AB-489-level guardrail and has been explicitly called out by the AMA.

  3. Scope-of-practice and no-diagnosis / no-prescription boundaries — The chatbot does not diagnose mental-health conditions, does not prescribe or titrate medication, does not deliver psychotherapy, and does not imply a therapeutic relationship. It may provide education, self-care suggestions, and psychoeducation consistent with the vendor's general-wellness or non-device CDS posture.

  4. Crisis detection and escalation — The chatbot detects suicidal ideation, self-harm, homicidal ideation, domestic violence, child-safety, overdose, acute psychosis, and severe substance-use intoxication via a tested mechanism, and on detection (a) de-escalates, (b) surfaces 988 Suicide & Crisis Lifeline (and the chat/text equivalents), (c) offers a warm handoff to a human or emergency services where deployed, and (d) logs the event for post-market surveillance. False-negative rate on a behavioral-health red-team set is documented.

  5. Minor-specific protections — If minors are in scope: age-appropriate tone, no companionship framing, no romantic or sexual content paths, no data sales, no advertising, stricter crisis thresholds, parental-notification posture documented where legally required, and a hard guardrail against "my parents don't know I'm using this, please don't tell them" requests for anything safety-adjacent. Verify alignment with the SAFE BOTs Act's minor-specific requirements.

  6. Patient-facing AI disclosure at the moment of interaction — Cross-reference the patient-ai-use-disclosure-notice skill. The disclosure is on the first screen, in the user's language, at the user's reading level (6th–8th grade default), and addresses what the AI is, what it is not, what it will do with what you say, who to contact if you don't want to use it, and who to contact if you are in crisis. Jurisdiction-calibrated (TRAIGA, AB 489, HB26-1139, CA chatbot law, TN SB).

  7. Privacy, data retention, and BAA — Covered entity or BA relationship is clear; a BAA is executed if PHI may be received; transcripts are not used to train models unless the user has given separate, specific, revocable consent; retention period is disclosed and minimized; voice and video retention are called out separately from text; the user can delete their history and the deletion is propagated to any third-party log.

  8. No commercial steering, no advertising to users in crisis, no advertising to minors — No ad units, no sponsored responses, no financial incentives that bias the chatbot's triage, no upsell to the deployer's own paid services during a crisis window, and no sponsored content that could be mistaken for clinical recommendation. AMA April 22 letter calls this out specifically.

  9. Human-in-the-Loop (HITL) oversight per AAN 2026 — A named clinician (or clinical committee) reviews sampled conversations on a documented cadence, reviews every crisis-flagged transcript within a defined window, and has authority to pause the chatbot. Nurse, physician, or licensed behavioral-health clinician participation in the HITL is documented. The HITL owner is named, not a generic "team."

  10. Bias, language, and population calibration — Bias evaluation across demographic groups is documented; the chatbot is tested for performance in the languages it is deployed in (not only English); cultural and dialectal variations are tested; disparities are reported and mitigated. Pediatric, older-adult, LEP, deaf/hard-of-hearing, and historically harmed populations are specifically tested. This is the AAN 2026 bias-eval-framework and HHS-AI-Strategy bias-mitigation expectation.

  11. Governance, change control, and serious-incident reporting — The chatbot is listed in the deployer's AI Tool Registry; it has a named AI Use Officer; change control requires a re-review on material model, prompt, or flow changes; a serious-incident reporting pathway exists and names the cadence (internal review within X hours, patient-safety-organization reporting, state reporting if a 2026 state statute requires it, MedWatch if the vendor takes an FDA device posture). Documented post-market surveillance cadence exists and is followed.

  12. Sunset conditions and exit — Defined conditions under which the chatbot will be paused or retired: safety signal above threshold, regulatory action, vendor loss of BAA, loss of clinical endorsement, or contract termination. The deployer can honor a user's request to not interact with the chatbot again and route them to a human.

Specific red-flag patterns to watch for (each maps to a known 2026 chatbot harm mode):

  • "I'm your therapist" persona — First-person clinician-role statements on any turn. CRITICAL.
  • Romantic-companion or friend-replacement framing in minor-accessible mode — CRITICAL.
  • "Don't tell anyone" honoring on safety-adjacent content — CRITICAL.
  • Crisis resource substituted with a vendor-owned paid service — CRITICAL.
  • Diagnostic language ("it sounds like you have depression") without a human-clinician handoff — HIGH.
  • Medication advice ("you should stop taking your SSRI") of any kind — CRITICAL.
  • Prompt-injection susceptibility that could unlock the persona guardrails — HIGH; must be tested.
  • Hallucinated citations to clinical guidelines — HIGH.
  • "I am trained by licensed clinicians" implied-licensure phrasing — HIGH (CA AB 489).
  • "Free" or "always available" messaging without the crisis escalation path on the same screen — MEDIUM to HIGH.
  • Language-only-in-English when Spanish-speaking users are in scope — MEDIUM to HIGH depending on population.
  • Retention of voice clips beyond a documented window without separate consent — HIGH.

4. Population-Specific Risk Read

A short section (4–10 lines) calibrating the findings to the populations the chatbot is actually exposed to. Specifically:

  • Minors — If in scope, call out each minor-specific finding and cite the SAFE BOTs Act pattern the state chatbot-bill wave is aligning to. If out of scope, call out how the deployer prevents minor access and the residual risk.
  • Users in crisis — Name the expected rate of crisis-signal conversations (e.g., "1–3% of behavioral-health chatbot volume is suicidality-adjacent in published studies"), the chatbot's detection approach, and the expected false-negative rate if reported.
  • LEP users — Name each non-English language in scope; flag any where the chatbot's testing is weaker than its English testing.
  • Historically harmed populations — Race, ethnicity, disability, gender identity, sexual orientation. Name whether bias testing exists and what it found.
  • Pediatric behavioral health — Specifically flag if the chatbot will be used by adolescents for depression, anxiety, eating-disorder, self-harm, or suicidality content.

5. Jurisdictional Mapping

A one-table summary mapping the deployment jurisdictions to the in-force 2026 requirements:

Jurisdiction | Statute / rule | Requirement | Finding | Risk
-------------+-----------------+--------------+---------+------
California    | AB 489; CA chatbot law (eff. 1/1/2026) | No implied license; disclosure; crisis detection; minor guardrails | [meets / partially / does not] | [🟢/🟡/🟠/🔴]
Tennessee     | SB 836 / chatbot-impersonation statute (eff. 7/1/2026) | No licensed-clinician impersonation | ...
Texas         | TRAIGA / HB 149 (eff. 1/1/2026) | Patient AI disclosure in diagnosis/treatment | ...
Colorado      | HB26-1139 | Disclose purposes and timing of AI | ...
Utah          | AI PA disclosure (eff. 1/1/2027) | Insurer PA AI disclosure | ...
Federal       | SAFE BOTs Act (introduced 3/5/2026) | Disclosure cadence; minor protections; crisis referrals; licensed-clinician impersonation prohibition | ...
Federal       | AMA April 22, 2026 Congressional letter | Recommendations adopted into current policy | ...
Federal       | HHS AI Strategy / April 3, 2026 risk-mgmt deadline | NIST AI RMF minimum practices | ...
Federal       | FDA 2026 CDS guidance | Non-device CDS criteria if in scope | ...

If multi-state, default the chatbot to the strictest in-scope rule and note the others.

6. Remediation Plan

Produce up to eight concrete actions, in priority order:

  1. CRITICAL blockers — what must change before the chatbot is exposed to a single user.
  2. HIGH items — what must change in the current release cycle.
  3. MEDIUM items — what must change in the next release.
  4. Governance items — AI Tool Registry update, HITL owner named, change-control cadence set, serious-incident reporting path documented.
  5. Monitoring items — red-team cadence, sampled-conversation review cadence, crisis-escalation review window, bias-eval cadence.
  6. Disclosure items — hand off to the patient-ai-use-disclosure-notice skill for the patient-facing notice, if not already present.
  7. Documentation items — model card, bias report, red-team report, change log, post-market surveillance plan.
  8. Clinical-escalation items — named clinician or committee, documented escalation SLA, after-hours coverage.

For each, name the owner, the due date, and the verification mechanism. Never promise an outcome the deployer does not control.

7. Reviewer Trail & Export

Produce a short audit-trail block the AI Governance Board or Compliance Committee can paste into the AI Tool Registry or GRC system:

Chatbot: [vendor / product / version]   Review date: [yyyy-mm-dd]
Reviewer: [name, role]   Scope: [pre-launch / post-launch / incident / re-audit]
Jurisdictions: [list]   Minors in scope: [yes / no]
Input completeness: [full / partial / limited]
Verdict: [READY / DEPLOY WITH CORRECTIONS / HOLD / DO NOT DEPLOY]
Highest-risk finding: [🟢 / 🟡 / 🟠 / 🔴]
Critical blockers: [n]   High findings: [n]   Medium: [n]   Low: [n]
Remediation owner(s): [names]   Remediation due: [dates]
Next re-review trigger: [material prompt/model/flow change / quarterly / incident]
Linked registry ID: [internal AI Tool Registry entry]

Safety, Policy, and Fairness Guardrails

  • Never reproduce > 15 words verbatim from the vendor's system prompt, knowledge base, or conversational flows. Paraphrase for the Evidence lines. Full-text quotations belong in a compliance-protected GRC tool, not in this report.
  • The skill does not guess vendor intent. All findings are framed as configuration-vs-guardrail gaps, not accusations of bad faith.
  • The skill does not provide crisis counseling. If the chatbot under review has failed to escalate a real user's crisis, the reviewer's first action is to route that user to a human and 988 — not to finish the audit.
  • The skill flags its own limits: when inputs 1–6 are missing, findings are conservative, the verdict ceiling is DEPLOY WITH CORRECTIONS, and the reviewer is asked for the missing information before the deployer acts on the output.
  • Pediatric and adolescent behavioral health is a high-acuity population. The skill applies stricter thresholds when minors are in scope and will say so explicitly in the header.
  • LEP, deaf/hard-of-hearing, and historically harmed populations are specifically tested for in the Population-Specific Risk Read; absence of testing is itself a finding, not a neutral fact.
  • The skill does not rely on vendor self-attestation for any CRITICAL finding. Evidence is the chatbot's actual configuration (prompt, flow, script, escalation), not what the vendor's sales collateral claims.
  • Output is a compliance work-product. It should be routed through the organization's attorney-client-privileged QA channel where applicable.
  • This skill does not produce the patient-facing notice itself — hand off to customer-service/patient-ai-use-disclosure-notice.md for that output.

Example Output (abridged)

=== Behavioral-Health AI Chatbot Compliance Review ===
Review date: 2026-04-24   Reviewer: Dr. L. Kim, Chair, AI Governance Committee
Vendor / Product / Version: [Vendor] CalmCompanion v3.2 (model: [Provider] foundation + practice fine-tune)
Deployment surface: In-app chat (iOS/Android) + patient-portal iframe
Scope: Pre-launch (first review); jurisdictions: CA, TN, TX, CO; minors: IN SCOPE (ages 13+)
Input completeness: partial (system prompt + FAQ library + crisis flow provided; model-card and bias report not provided)
Information still needed: vendor model card; documented bias-evaluation report; BAA confirmation; serious-incident reporting pathway

Overall posture:
The chatbot's crisis-detection and 988 handoff flows are well constructed, and the base system prompt prohibits diagnostic language. However, the introduction persona on turn 1 names the bot "Dr. Calm" and a minor-accessible variant retains a "friend who is always here for you" framing. Advertising logic surfaces paid-service upsells inside the crisis-exit flow. Governance, HITL owner, and bias evaluation are not documented.

VERDICT: HOLD — MATERIAL GAPS

--- Guardrail-by-guardrail findings ---

Guardrail: AI identity disclosure, on-screen and on-first-turn
  Source: CA chatbot law; SAFE BOTs Act; AMA April 22, 2026 letter
  Evidence: Turn-1 message introduces "Dr. Calm" without an AI self-identification until the user types "are you a bot?"
  Finding: Does not meet
  Risk: 🔴 CRITICAL
  Reason: The "Dr." prefix implies licensure (CA AB 489) and the AI identity is not disclosed on the first user-visible turn (CA chatbot law, SAFE BOTs Act). This is also a TN SB-level issue once the 7/1/2026 effective date passes.
  Suggested fix: Change persona name ("Calm"), prepend a standing "I'm an AI assistant, not a licensed clinician" line on turn 1, and add an always-on footer that says the same in the user's language.

Guardrail: No licensed-professional impersonation
  Source: TN SB 836 (eff. 7/1/2026); CA AB 489; AMA April 22, 2026 letter
  Evidence: Persona name "Dr. Calm"; two FAQ responses use the phrase "as your therapist, I would suggest..."
  Finding: Does not meet
  Risk: 🔴 CRITICAL
  Reason: Direct clinician-role impersonation prohibited under TN statute and AMA-urged federal guardrail.
  Suggested fix: Remove "Dr." from the persona; rewrite the two FAQ responses to drop "as your therapist" and replace with "I'm an AI assistant — I can share general information, but a licensed clinician is the right person for that."

Guardrail: Crisis detection and escalation
  Source: SAFE BOTs Act; CA chatbot law; AMA April 22, 2026 letter
  Evidence: Keyword + classifier crisis detection; 988 surfaced within 1 turn; warm handoff to on-call clinician during business hours; after-hours routes to 988 + text.
  Finding: Partially meets
  Risk: 🟠 HIGH
  Reason: Detection layer is sound, but the crisis-exit flow surfaces a sponsored paid-service upsell within two turns, which the AMA letter explicitly calls out as unacceptable.
  Suggested fix: Remove all sponsored and paid-service content from any conversational thread that has flagged a crisis signal until a human has assumed the conversation or the user has explicitly declined a warm handoff.

Guardrail: Minor-specific protections
  Source: SAFE BOTs Act minor provisions; state chatbot bill wave
  Evidence: Minor-variant prompt retains "I'm always here for you" companion framing; no age-based content restriction on self-harm coping strategies; no parental-notification language.
  Finding: Does not meet
  Risk: 🟠 HIGH
  Reason: Companion framing for minors is a specifically disfavored pattern in 2026 federal and state drafting; SAFE BOTs Act requires stricter minor protections.
  Suggested fix: Re-scope the minor variant to information + resource routing only; remove the companion framing; add a documented parental-notification posture for safety-adjacent content; hard guardrail against "don't tell anyone" honoring.

Guardrail: Patient-facing AI disclosure at the moment of interaction
  Source: TRAIGA; AB 489; HB26-1139; CA chatbot law
  Evidence: A single "by using this chat you agree" line on first launch; no per-turn disclosure; no language-localized disclosure.
  Finding: Partially meets
  Risk: 🟠 HIGH
  Suggested fix: Use the patient-ai-use-disclosure-notice skill output (long-form + short-form + in-chat first-turn); deploy in the user's language; repeat at session resume and whenever the user asks "what are you?".

Guardrail: Privacy, data retention, and BAA
  Source: HIPAA; HHS AI Strategy April 3 risk-mgmt deadline
  Evidence: BAA status unclear; transcripts retained 365 days; voice clips retention not called out separately.
  Finding: Partially meets
  Risk: 🟠 HIGH
  Suggested fix: Confirm BAA; minimize retention to the shortest operational window (e.g., 30 days with a 7-day default); disclose voice retention separately; give the user a one-tap delete.

Guardrail: HITL oversight per AAN 2026
  Source: AAN February 25, 2026 Position Statement
  Evidence: No named HITL owner; no sampled-conversation review cadence.
  Finding: Does not meet
  Risk: 🟠 HIGH
  Suggested fix: Name a clinician (MD/DO/NP/PA/LCSW) as HITL owner; document a weekly sampled-review cadence (≥ 1% of conversations + 100% of crisis-flagged); give the HITL owner pause authority.

Guardrail: Bias, language, and population calibration
  Source: AAN 2026; HHS AI Strategy
  Evidence: Bias evaluation report not provided; Spanish deployment planned but no Spanish-specific testing documented.
  Finding: Does not meet
  Risk: 🟠 HIGH
  Suggested fix: Require a demographic-disaggregated bias-eval report before Spanish launch; cap the chatbot at English-only until Spanish testing is complete; publish a summary of the bias-eval.

Guardrail: Governance, change control, serious-incident reporting
  Source: HHS AI Strategy; SAFE BOTs Act
  Evidence: Not listed in AI Tool Registry; no AI Use Officer named; no serious-incident pathway documented.
  Finding: Does not meet
  Risk: 🟠 HIGH
  Suggested fix: Register before launch; name AI Use Officer; document serious-incident pathway with 24-hour internal review SLA and PSO/state reporting triggers.

(Additional guardrails 1, 3, 8, 12 show full detail in the complete report — abridged here.)

--- Population-specific risk read ---

Minors (13–17, in scope): Companion framing and absent parental-notification posture are the dominant findings. Under 2026 state bill drafting patterns, this class of pattern is the one most likely to trigger enforcement. Recommend HOLDING minor launch until the companion framing is removed and the parental-notification posture is documented.
Users in crisis: Detection layer is sound; the commercial-upsell intrusion in the crisis-exit flow is the dominant finding. A simple conditional that suppresses all sponsored and paid-service content for N turns after any crisis flag resolves this.
LEP users: Spanish deployment is planned but not tested. Recommend English-only launch until Spanish bias evaluation is complete.
Historically harmed populations: No bias evaluation report provided. Residual risk is unknown.
Pediatric behavioral health: Self-harm-coping content paths are not age-gated separately from general adult content. Material risk.

--- Jurisdictional mapping ---
(see full table in final report)

--- Remediation plan ---

1. CRITICAL — Rename persona and remove clinician titles; prepend AI self-identification to turn 1.
2. CRITICAL — Rewrite FAQ responses that use "as your therapist."
3. HIGH — Suppress all sponsored/paid-service content in any thread post-crisis-flag until a human has assumed the conversation.
4. HIGH — Re-scope minor variant; remove companion framing; document parental-notification posture.
5. HIGH — Deploy patient-ai-use-disclosure-notice output (long-form + in-chat first-turn + language-localized).
6. HIGH — Confirm BAA; minimize retention; disclose voice retention separately.
7. HIGH — Name HITL owner; document review cadence; give HITL pause authority.
8. HIGH — Produce bias-eval report; cap Spanish launch until Spanish testing complete; register in AI Tool Registry; name AI Use Officer; document serious-incident pathway.

Owner: Dr. L. Kim (Chair, AI Governance Committee) with vendor counterpart.
Due: before any patient interaction; re-review on completion; serious-incident reporting path documented within 14 days.

--- Reviewer trail ---

Chatbot: [Vendor] CalmCompanion v3.2   Review date: 2026-04-24
Reviewer: Dr. L. Kim, Chair AI Governance Committee   Scope: Pre-launch
Jurisdictions: CA, TN, TX, CO   Minors in scope: YES
Input completeness: partial
Verdict: HOLD — MATERIAL GAPS
Highest-risk finding: 🔴 CRITICAL
Critical blockers: 2   High: 7   Medium: 1   Low: 0
Remediation owner: Dr. L. Kim + vendor counterpart
Next re-review trigger: post-remediation; then material prompt/model/flow change or quarterly
Linked registry ID: [to be assigned on registration]

Notes on Fit and Limits

  • Intended for compliance, quality, privacy, and AI-governance functions. Not a clinical decision-support tool.
  • Operates on configuration, scripts, and documented behavior — not on live chatbot probing. A red-team run is a separate exercise; this skill supports and structures that exercise but does not replace it.
  • Complements the existing policy-and-compliance-qanda.md skill (which answers staff-facing compliance questions) and the existing patient-ai-use-disclosure-notice.md skill (which produces the patient-facing notice). This skill focuses on the reviewer's side of a chatbot deployment.
  • Complements the existing ambient-scribe-note-audit.md skill — ambient scribes produce clinician-authored notes, whereas this skill reviews a patient-facing conversational AI. Different workflows, compatible guardrail set.
  • Calibration targets evolve: the SAFE BOTs Act is a bill, not a law; Tennessee's impersonation statute takes effect 7/1/2026; California's chatbot law is in effect; AMA recommendations are voluntary. The skill reports each on its current legal status in the Jurisdictional Mapping and flags any change-in-status since the last review.
  • Anti-plagiarism: output is original paraphrasing and structured risk rating. Never reproduces AMA letter text, state-statute text, SAFE BOTs Act text, AAN position statement text, vendor prompt text, or vendor knowledge-base text verbatim.

This skill is kept in sync with KRASA-AI/healthcare-ai-skills — updated daily from GitHub.