The reasons AI projects fail in production are not exotic. They're the same seven failure modes, over and over. Most failed projects fall into four or five of them. Most successful ones fall into none.
The seven failure modes
- 01 No eval suite. 'The demo looks good' is the success criterion. The model drifts and nobody notices.
- 02 One giant prompt nobody can change. Every fix for one path breaks three others.
- 03 Cost surprises. The naive implementation retrieves the entire knowledge base on every call.
- 04 No real integration. The AI lives in a sandbox; it can't read from the real CRM or write to the real ERP.
- 05 Hallucinations treated as edge cases. A grounded answer should be the default expectation.
- 06 One model, no fallback. A rate limit at 2am takes the entire feature offline.
- 07 No kill-switch. Bad behavior on a customer's data, no way to disable without a deploy.
What we build instead
- Eval suite of 100+ cases running in CI on every prompt change.
- Prompts versioned in the codebase; every change is rollback-safe.
- Cost telemetry tags every call with feature, tenant, user.
- Real integration with at least one customer system from week 1.
- Structured outputs that force grounded answers; refusal when confidence is low.
- Fallback chain to a secondary model on rate limit.
- Per-tenant kill-switch accessible to support without a deploy.
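
Three of the items above (cost tags on every call, a fallback chain on rate limit, and a per-tenant kill-switch) can sit in one thin wrapper around the model call. The following is a minimal sketch under stated assumptions, not a description of any particular stack: `Gateway`, `KillSwitch`, `RateLimitError`, `FeatureDisabled`, and the two stub model functions are hypothetical names invented for illustration, and the in-memory set and list stand in for whatever flag store and telemetry sink a real system would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


class RateLimitError(Exception):
    """Hypothetical error a model client raises when it is rate limited."""


class FeatureDisabled(Exception):
    """Raised when the kill-switch is on for the calling tenant."""


@dataclass
class KillSwitch:
    # Assumption: in production this set would live in a store that support
    # can edit at runtime (a database row or flag service), not in memory.
    disabled_tenants: Set[str] = field(default_factory=set)

    def disable(self, tenant: str) -> None:
        self.disabled_tenants.add(tenant)

    def enable(self, tenant: str) -> None:
        self.disabled_tenants.discard(tenant)

    def is_disabled(self, tenant: str) -> bool:
        return tenant in self.disabled_tenants


ModelFn = Callable[[str], str]


@dataclass
class Gateway:
    models: List[Tuple[str, ModelFn]]  # (name, callable) in priority order
    kill_switch: KillSwitch
    call_log: List[Dict[str, str]] = field(default_factory=list)

    def complete(self, prompt: str, *, feature: str, tenant: str, user: str) -> str:
        # Kill-switch check happens before any model is called.
        if self.kill_switch.is_disabled(tenant):
            raise FeatureDisabled(tenant)
        last_error = None
        for name, model in self.models:
            try:
                answer = model(prompt)
            except RateLimitError as exc:
                last_error = exc  # fall through to the next model in the chain
                continue
            # Tag the successful call with feature, tenant, user, and model.
            self.call_log.append(
                {"feature": feature, "tenant": tenant, "user": user, "model": name}
            )
            return answer
        raise RuntimeError("all models were rate limited") from last_error


# Stub models for illustration: the primary is always rate limited.
def primary(prompt: str) -> str:
    raise RateLimitError("429 from primary")


def secondary(prompt: str) -> str:
    return "fallback: " + prompt


gateway = Gateway(
    models=[("primary", primary), ("secondary", secondary)],
    kill_switch=KillSwitch(),
)
```

With these stubs, `gateway.complete("hi", feature="search", tenant="acme", user="u1")` is served by the secondary model and logged with its tags. After `gateway.kill_switch.disable("acme")`, the same call raises `FeatureDisabled` without reaching any model. Putting the check at the top of `complete` is what lets a flag flip, rather than a deploy, take one tenant offline.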