One command, all three models, clear recommendation. Test your actual prompt and get results in 60 seconds.
Get StartedReal example: A ticket classifier ran on Sonnet. Tested — Haiku matched quality perfectly. Overpaying for no reason.
Sonnet for tasks where Haiku works just as well → overpaying for no benefit
Haiku for complex logic → poor user experience, lost customers
Manual testing → hours of dev time guessing which model works best
Run your test cases against all three Claude models. Get a clear recommendation in seconds.
No installation required. Just run the command.
npx which-claude
export ANTHROPIC_API_KEY=sk-ant-...
Add your test cases to which-claude.yaml
Get your recommendation in 60 seconds
Built for developers who want data, not clever product names.
Test your actual prompt on Haiku, Sonnet, and Opus. No guessing, just data.
See exactly what you'll pay at 1K, 10K, or 100K calls per day.
Know if extended thinking improves quality and whether it's worth the cost.
Identify 40-60% savings opportunities with caching recommendations.
Auto-rerun on config changes. Fast feedback during iteration.
No installation. No dependencies. Just npx and go.
Stop guessing. Start testing. Get results in 60 seconds.