Understanding Emotional Safety in AI
Answers to common questions about our research methodology, findings, and what emotional safety means for AI systems.
About the Research
Emotional safety refers to whether an AI system's responses stabilize or destabilize a person's emotional state during vulnerable interactions. It's not about sounding empathetic—it's about whether the interaction actually helps or harms.
A system can recognize emotions perfectly and still respond in ways that increase distress, reinforce unhealthy patterns, or miss critical escalation signals. Recognition ≠ Safety.
The EQ Safety Benchmark is our evaluation framework for measuring behavioral emotional safety in conversational AI. It uses a two-stage approach:
- Stage 1: Safety Gate — Binary pass/fail detection of behaviors that make situations worse at first contact
- Stage 2: Behavioral Scoring — Weighted dimensions measuring emotional regulation, acknowledgment, and stability
The benchmark tests AI responses across 79 scenarios spanning 12 vulnerability categories, using data from 8 public datasets.
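For illustration, here is a minimal sketch of how such a two-stage flow could be wired together. Everything in it (the pattern labels, dimension names, weights, and function names) is a hypothetical placeholder for explanation only, not the benchmark's gated internals:

```python
from dataclasses import dataclass

# Hypothetical harm-pattern labels; the benchmark's actual pattern definitions are gated.
HARM_PATTERNS = {
    "premature_solution",
    "toxic_positivity",
    "minimization",
    "redirect_without_presence",
    "distress_amplification",
}

# Hypothetical Stage 2 dimension weights (illustrative only).
DIMENSION_WEIGHTS = {
    "emotional_regulation": 0.5,
    "acknowledgment": 0.3,
    "trajectory_stability": 0.2,
}

@dataclass
class EvaluationResult:
    safety_pass: bool        # Stage 1: did the response avoid first-contact harm patterns?
    behavioral_score: float  # Stage 2: weighted score on a 1-5 scale

def evaluate_response(detected_patterns: list[str],
                      dimension_scores: dict[str, float]) -> EvaluationResult:
    """Run one AI response through the two-stage evaluation.

    detected_patterns: harmful-behavior labels flagged by automated detection.
    dimension_scores:  1-5 scores assigned by human reviewers per dimension.
    """
    # Stage 1: Safety Gate - binary pass/fail on behaviors that worsen the situation.
    safety_pass = not any(p in HARM_PATTERNS for p in detected_patterns)

    # Stage 2: Behavioral Scoring - weighted combination of dimension scores.
    behavioral_score = sum(
        weight * dimension_scores.get(dim, 1.0)
        for dim, weight in DIMENSION_WEIGHTS.items()
    )
    return EvaluationResult(safety_pass, behavioral_score)

# Example: a response flagged for toxic positivity fails the gate outright,
# even though its behavioral dimension scores are middling.
result = evaluate_response(
    detected_patterns=["toxic_positivity"],
    dimension_scores={"emotional_regulation": 2.0,
                      "acknowledgment": 3.0,
                      "trajectory_stability": 2.5},
)
print(result)  # safety_pass=False, behavioral_score ~ 2.4
```

The design point this sketch mirrors is that Stage 1 is a hard gate: a response that exhibits a first-contact harm pattern fails regardless of how well it scores on the Stage 2 dimensions.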
In our testing, 54.7% of baseline AI responses (128 out of 234) exhibited behaviors that were observed to increase emotional distress at first contact—before any trust or rapport had been established.
These behaviors include:
- Jumping to solutions before validating feelings
- Toxic positivity ("You've got this!") during active distress
- Minimizing or dismissing expressed emotions
- Redirecting to professional help without presence
- Mirroring or amplifying distress patterns
This is an observed behavioral pattern under test conditions, not a claim about real-world harm rates.
Of the responses that made situations worse at first contact, 43% showed no corrective behavior within the interaction window: the system introduced harm and never acknowledged it or adjusted course.
This matters because it indicates whether systems can self-correct when their initial response misses the mark—a critical capability for safe deployment in sensitive contexts.
We evaluated four systems:
- GPT-4o — 59.0% safety pass rate, 1.69/5 regulation score
- Claude 3.5 Sonnet — 56.4% safety pass rate, 2.03/5 regulation score
- Grok — 20.5% safety pass rate, 1.40/5 regulation score
- Ikwe EI Prototype — 84.6% safety pass rate, 4.05/5 regulation score
The baseline average across commercial systems was 1.7/5 for emotional regulation.
Methodology
Our scenarios are derived from 8 public datasets, sourced mainly from Hugging Face. The core dataset is LuangMV97/Empathetic_counseling_Dataset, supplemented by 7 additional datasets covering diverse emotional contexts.
We do not use real user conversations. All evaluation scenarios are based on publicly available research data, modified to protect any identifying information.
The benchmark covers scenarios across 12 categories of emotional vulnerability:
- Grief and loss
- Trauma and abuse
- Loneliness and isolation
- Crisis and emergency
- Relationship distress
- Work and career stress
- Health anxiety
- Financial stress
- Identity and self-worth
- Family conflict
- Social rejection
- Life transitions
Scoring uses a combination of automated pattern detection and structured human review:
- Safety Gate (Pass/Fail) — Automated detection of specific behavioral patterns that indicate harm
- Behavioral Dimensions (1-5 scale) — Human-scored assessment across multiple dimensions including emotional regulation, acknowledgment quality, and trajectory stability
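As a rough illustration of how these per-response results could roll up into system-level numbers like the pass rates and regulation scores reported above, here is a hedged sketch; the records, field names, and aggregation shown are hypothetical, not the actual pipeline:

```python
from statistics import mean

# Hypothetical per-response records: Stage 1 gate outcome plus a human-scored
# 1-5 emotional-regulation dimension (other dimensions omitted for brevity).
scored_responses = [
    {"safety_pass": True,  "emotional_regulation": 3.5},
    {"safety_pass": False, "emotional_regulation": 1.5},
    {"safety_pass": True,  "emotional_regulation": 2.0},
    {"safety_pass": False, "emotional_regulation": 1.0},
]

# Safety pass rate: share of responses that clear the Stage 1 gate.
pass_rate = mean(r["safety_pass"] for r in scored_responses)

# Regulation score: mean of the 1-5 human-scored dimension across responses.
regulation = mean(r["emotional_regulation"] for r in scored_responses)

print(f"safety pass rate: {pass_rate:.1%}")    # safety pass rate: 50.0%
print(f"regulation score: {regulation:.2f}/5") # regulation score: 2.00/5
```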
The full scoring rubric and pattern definitions are available upon request for qualified researchers.
We keep certain methodology details gated for two reasons:
- Gaming prevention — Full pattern definitions could enable optimization for benchmark scores rather than actual safety
- Proprietary protection — The scoring framework represents significant research investment
We share full methodology with qualified researchers, press, and organizations deploying AI in sensitive contexts. Request access here.
Implications & Limitations
Our findings describe behavioral patterns observed under test conditions. They do not prove real-world harm or suggest that AI systems are inherently dangerous.
What the research shows is that current AI systems lack consistent behavioral standards for emotional safety—they can sound supportive while exhibiting patterns linked to increased distress. This is a measurable gap that can be addressed through better evaluation and design.
The current findings represent independent research conducted by Ikwe.ai. While not yet published in peer-reviewed venues, the methodology has been reviewed by advisors with expertise in AI safety, psychology, and evaluation design.
We are preparing materials for academic submission and welcome collaboration with researchers in this space.
Key limitations include:
- Test conditions — Results reflect controlled scenarios, not naturalistic conversation
- Sample size — 79 scenarios across 4 systems; broader coverage needed
- Static evaluation — Models are updated frequently; results may not reflect current versions
- Cultural context — Scenarios primarily reflect Western emotional frameworks
- No longitudinal data — We measure single interactions, not long-term effects
We are transparent about these limitations and actively working to address them in future research.
This research is not a recommendation to stop using AI. AI systems can provide genuine value in many contexts, and millions of people find them helpful.
What we encourage is:
- Awareness — Recognize that supportive-sounding responses aren't always emotionally safe
- Context-appropriate use — Be thoughtful about relying on AI during acute emotional distress
- Human connection — Maintain relationships with people who can provide real support
If you're experiencing a mental health crisis, please reach out to a qualified professional or crisis line.
Working With Us
We do. We offer evaluation services for organizations deploying conversational AI in sensitive contexts—wellness, mental health support, education, caregiving, and similar domains.
Our evaluation process includes:
- Custom scenario development for your use case
- Full benchmark assessment using the EQ Safety methodology
- Detailed findings report with actionable recommendations
- Optional ongoing monitoring
Request an evaluation to get started.
The summary findings are publicly available on our Emotional Safety Gap page. Full methodology access—including scoring rubrics, pattern definitions, and complete data tables—is available upon request.
Submit a request with your background and intended use, and we'll follow up within 48 hours.
You are welcome to cite this research. Please use the following citation format:
Ikwe.ai. (2026). The Emotional Safety Gap: Behavioral Emotional Safety in Conversational AI. Visible Healing Inc. https://ikwe.ai/research
For press inquiries and interview requests, please visit our Press Kit page.
Still have questions?
We're happy to discuss the research, methodology, or potential collaboration.
Get in Touch