AI in therapy - What Was I

Your client has already spent two hours talking to ChatGPT about their anxiety. Here is how to handle the informed — and the misinformed — AI client in the therapy room.

Key Takeaway

Clients arriving with AI-generated insights present both an opportunity and a clinical risk. The therapist’s role is to validate useful information, correct misconceptions, and restore the therapeutic relationship as the primary site of change ^[1].

The Pre-Attuned Client

da Silva Santos and colleagues (2026) identified an “engagement-validation loop” in which users become attached to AI responses ^[1]. Clients may arrive with a self-diagnosis and with expectations shaped by their AI interactions.

What AI Gets Right

LLMs can provide accurate psychoeducation. Kuang and colleagues (2026) found that psychotherapists viewed AI as a useful triage tool ^[3]. Affirm what is correct without ceding clinical authority.

What AI Gets Wrong

Sawesi and colleagues (2026) identified significant privacy risks in generative AI chatbots ^[2]. AI is prone to sycophancy and cannot challenge unhelpful thinking ^[1].

Practical Strategies

Invite sharing, validate and correct, discuss privacy, and re-centre the therapeutic relationship. No algorithm has replicated the therapeutic alliance ^[1, 3].

References

da Silva Santos, B., Roza, T. H., & Passos, I. C. (2026). The engagement-validation loop. Journal of Affective Disorders, 314, 122123. DOI: 10.1016/j.jad.2026.122123
Sawesi, S., Sabbineni, H., & Shagamreddy, R. R. (2026). Cybersecurity and Privacy Risks of Generative AI Mental-Health Chatbots. Journal of Multidisciplinary Healthcare, 19, 1123–1141. DOI: 10.2147/JMDH.S581251
Kuang, J., Pope, A. L., & Zhang, Y. (2026). Psychotherapists’ Trust, Distrust, and Generative AI Practices. Journal of Medical Internet Research, 28, e88932. DOI: 10.2196/88932

Can a language model write a hypnosis script that a trained practitioner would actually use? Early testing reveals surprising strengths, serious gaps, and one crucial non-negotiable.

What Large Language Models Can and Cannot Do

Large language models (LLMs) such as GPT-4 and Claude can generate fluent structured text, including creative writing, clinical documentation, and educational content ^[1]. When prompted to produce a hypnosis script, these models reliably generate the structural components: an induction phase, deepening suggestions, therapeutic metaphors, and re-alerting sequences. However, the quality of these outputs varies significantly depending on prompt specificity and on how well the model’s training data covers the clinical hypnosis literature ^[2].

Methodology: Testing Script Quality

In a structured evaluation framework adapted from script concordance testing methodology used in medical education ^[3], AI-generated hypnosis scripts were rated on four dimensions: (1) clinical safety — absence of contraindicated suggestions, (2) hypnotic language quality — use of permissive vs authoritarian phrasing, (3) therapeutic appropriateness — match to presenting concern, and (4) engagement — pacing and sensory vividness. Results showed that LLMs scored well on safety and structure but poorly on therapeutic nuance and individualisation.

Key Strengths of AI-Generated Scripts

AI models excel at producing grammatically correct, well-structured scripts that follow established hypnotic conventions. They reliably include essential components such as eye fixation inductions, progressive relaxation, staircase deepening, and post-hypnotic suggestions. Models also handle metaphor generation competently, drawing from a wide range of cultural references ^[1]. For practitioners seeking inspiration or a first draft, AI-generated scripts can serve as a time-saving starting point.

Critical Limitations and Risks

Three significant limitations emerged. First, AI-generated scripts lack individualisation: they cannot incorporate client-specific history, language preferences, or subtle cues observed during intake ^[2]. Second, models occasionally generate suggestions that conflict with established hypnotherapy best practices, such as overly directive language that may not suit resistant clients. Third, without clinical oversight, there is a risk that AI-generated scripts could reinforce outdated or disproven therapeutic approaches ^[2]. The literature on real-world agentic AI failures stresses that context-blind content generation presents real risks, a concern that carries over to therapeutic settings ^[2].

Implications for Practitioners

AI can be a useful assistant — for drafting, idea generation, and educational purposes — but it cannot replace clinical judgment. The most ethical use of LLMs in scriptwriting is as a collaborative tool: the AI generates a draft, and the practitioner adapts it to the specific client’s needs, language, and therapeutic goals. The human element — attunement, intuition, and relational safety — remains irreplaceable.

References

Poibeau, T. (2025). Large Language Models and the Future of Writing. Understanding Conversational AI: Philosophy, Ethics, and Social Impact of Large Language Models. 65-84. DOI: 10.5334/bde.d
Mastrogiacomo, R. (2025). When AI Goes Off Script—Real-World Agentic AI Failures. AI Identities. 233-241. DOI: 10.1007/979-8-8688-2034-2_19
Abouzeid, E., & Sallam, M. (2026). AI-Assisted Script Concordance Tests: Enhancing Feasibility with Customized ChatGPT. Medical Teacher. 48(5), 757-760. DOI: 10.1080/0142159x.2025.2533405