Monitoring AI systems — CCA-F Exam Prep

Real story
[Illustration: split screen. On one side, a customer support dashboard reads 'AI Agent: Online. Status: Healthy.' with green lights everywhere. On the other, a viral Twitter thread with 2,400 retweets: 'Your AI support agent just told me to microwave my laptop to fix the WiFi. Thread:']

The monitoring dashboard said 'Healthy.' Twitter said the AI was telling customers to microwave their laptops.

The AI agent was online. Response times were fast. Error rate was 0%. By every metric on the dashboard, the system was perfect.

But the dashboard only tracked uptime, latency, and HTTP errors. It didn't track response quality. Nobody was monitoring what the AI actually said. For 6 hours, the agent gave increasingly bizarre advice. The first sign was a viral tweet.

The system was up. The system was fast. The system was hallucinating. And nobody knew until customers told Twitter.
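
The gap in this story can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual dashboard. Every name, threshold, and the 30% flagged rate below is a hypothetical assumption, and the sketch does not say how a response gets flagged (human review, an automated grader, or something else). It shows only that a status built from uptime, latency, and HTTP errors can read 'Healthy' while a status that also includes a sampled quality signal does not.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Metrics for one monitoring window. All fields are hypothetical."""
    uptime_ok: bool
    p95_latency_ms: float
    http_error_rate: float        # fraction of requests that returned an HTTP error
    flagged_response_rate: float  # fraction of sampled responses flagged as bad


def operational_status(stats: WindowStats,
                       max_latency_ms: float = 2000.0,
                       max_error_rate: float = 0.01) -> str:
    """What the dashboard in the story tracked: uptime, latency, HTTP errors."""
    ok = (stats.uptime_ok
          and stats.p95_latency_ms <= max_latency_ms
          and stats.http_error_rate <= max_error_rate)
    return "Healthy" if ok else "Unhealthy"


def overall_status(stats: WindowStats, max_flagged_rate: float = 0.05) -> str:
    """Same operational checks, plus an assumed response-quality signal."""
    if operational_status(stats) != "Healthy":
        return "Unhealthy"
    if stats.flagged_response_rate > max_flagged_rate:
        return "Degraded: response quality"
    return "Healthy"


# Illustrative window resembling the incident: online, fast, 0% HTTP errors,
# but (assumed figure) 30% of sampled responses flagged.
incident = WindowStats(uptime_ok=True, p95_latency_ms=400.0,
                       http_error_rate=0.0, flagged_response_rate=0.30)

print(operational_status(incident))
print(overall_status(incident))
```

The two functions disagree on the same window, and that disagreement is the point: the operational view cannot see what the agent actually said.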