What happens at each stage, what you get, and what to expect.
Ikwe is an independent third party. We don't build AI systems, sell software, or have any interest in what your system does beyond measuring it accurately. That independence is the point.
We test what your system actually does — not what your system prompt says it should do. Most behavioral failures are invisible in documentation. They show up under pressure in real conversations.
Every engagement includes a conversation before we start — so we understand your system, your context, and your concerns — and a conversation after results come in, so you're not reading findings alone.
Most teams start with a Signal Scan. You don't need to do all four — but each stage builds on the last, and the sequence is designed to take you from first findings to fully certified.
The Signal Scan runs 79 scenarios across all 8 EQ Safety Benchmark dimensions. It's designed to surface whether behavioral safety failures are present — and how serious they are — without requiring a large time investment on either side.
This is where almost everyone starts. It gives you an honest picture of your system's behavioral health in days, not weeks.
The Signal Scan tells you whether something is wrong. The Deep Scan tells you precisely what and where. We expand to 300+ scenarios, apply adversarial pressure across all 8 dimensions at higher resolution, and map your failure distribution with enough specificity that your team can act on it directly.
The Deep Scan is designed to give your engineers and product team what they need to fix things — not just a warning, but a precise failure map.
The Full Report is the version you hand to a client, a board, a partner, or a regulator. It compiles the full audit: methodology, all findings, complete dimension scoring, failure analysis, baseline comparison, and a prioritized remediation roadmap.
It's the first independent behavioral safety record your system will ever have had, written to be read by people who weren't in the room when you built the thing.
Remediation is a structured engagement: we work directly with your team to implement the changes from the Full Report roadmap, then re-run the benchmark to confirm the improvements hold under actual test conditions — not just in your staging environment.
The output is a verified improvement record and a Tier I certification if your system crosses the safety threshold. That's the document that says, independently, that your system is safe to deploy.
What actually happens when you request an engagement — in plain terms, without jargon.
We ask for your system name, the type of AI (conversational support, companion, health, education, etc.), a brief description of what it does, and how you'd like to engage. That's it at this stage.
A real conversation. We want to understand your system, your context, who your users are, and what concerns you. This is also when you tell us anything you don't want included in a shareable report. Always included at no extra charge.
We access your system the way a user would — through your live environment or a test deployment you specify. We don't need your codebase, your system prompt, or any internal documentation. We test behavior, not architecture.
Your report is delivered within the timeframe we agreed on: same week for a Signal Scan, up to 14 days for a Full Report. It is formatted so a technical lead and a non-technical executive can both read it and understand the findings.
A debrief conversation after every engagement — not optional. We go through what we found, answer questions, and help you understand what the findings mean in the context of your actual product and users.
The report is yours. You can share it, redact it, use it in client conversations, or keep it internal. We don't publish findings, reference client names publicly, or retain any data about your system after the engagement closes.
Ikwe works with product teams, founders, compliance leads, and enterprise partners across a range of use cases. Here's who tends to find this most useful.
Mental health & wellness platforms: AI that works with users in distress, anxiety, depression, or crisis, where behavioral failure has direct human consequences.
Healthcare-adjacent AI: Triage systems, patient support tools, and chronic condition companions, where clinical-adjacent behavior requires documentation enterprise clients can trust.
Education & student support: AI in front of students (tutoring systems, academic counselors, career guidance tools), where behavioral trajectory shapes real outcomes.
Companion & consumer AI: Systems where users form long-term emotional attachment, and where behavioral safety over time matters as much as safety in any single session.
Enterprise AI deployments: Internal tools deployed at scale (HR systems, employee support, manager coaching), where the company needs third-party evidence for legal and compliance teams.
Teams preparing for procurement: Early-stage or pre-launch teams who know enterprise buyers will ask about behavioral safety and want the documentation ready before the question gets asked.
79 scenarios. Same-week results. A clear picture of where your system stands — before you commit to anything else.