Prompt caching — CCA-F Exam Prep

L2.28|Prompt caching

1/12

Person

A startup CTO looking at a Claude API bill on her laptop. The number: $12,400/month. She's highlighting one line item: 'System prompt tokens: 78% of total input cost.' Behind her, a whiteboard with 'Launch budget: $3,000/mo' crossed out. Office, morning light.

Maria's AI startup was burning $12,400 a month on Claude API calls. 78% was the same system prompt, repeated.

Her system prompt was 4,000 tokens. Her app made 50,000 API calls per day. That's 200 million system prompt tokens per month -- the same 4,000 tokens, sent 1.5 million times. She was paying full price every time.

Then she enabled prompt caching. The system prompt was cached on the first call. Every subsequent call reused the cache. Her bill dropped from $12,400 to $1,800.

Same product. Same quality. 85% less cost. She just stopped paying for the same tokens over and over.