Stop guessing which Claude model to use

One command, all three models, clear recommendation. Test your actual prompt and get results in 60 seconds.

Get Started
60s Time to first result
74% Average cost savings
0 Setup required
$0.00003 Spent on the name

Choosing wrong costs you

Real example: a ticket classifier was running on Sonnet. When tested, Haiku matched its quality on every case, so the extra spend bought nothing.

💸

Too expensive

Sonnet for tasks where Haiku works just as well → overpaying for no benefit

⚠️

Too cheap

Haiku for complex logic → poor user experience, lost customers

⏱️

Too slow

Manual testing → hours of dev time guessing which model works best

See the difference instantly

Run your test cases against all three Claude models. Get a clear recommendation in seconds.

$ npx which-claude
which-claude · Email tone classifier · 3 cases
┌────────┬─────────┬────────┬────────┬───────────┐
│ Model  │ Quality │ Avg ms │ Tokens │ Cost / 1K │
├────────┼─────────┼────────┼────────┼───────────┤
│ Haiku  │ 3/3     │ 507    │ 181    │ $0.061    │
│ Sonnet │ 3/3     │ 1161   │ 184    │ $0.232    │
└────────┴─────────┴────────┴────────┴───────────┘
Use HAIKU — All models scored 3/3. Haiku saves 74% vs Sonnet.
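
The recommendation and the 74% figure can be reproduced from the sample table. The sketch below is not which-claude's actual logic. It assumes a simple rule (take the cheapest model among those tied for the best pass rate), and the names `RESULTS`, `recommend`, and `savings_pct` are hypothetical.

```python
# Rows copied from the sample output: (model, cases passed, cases total, cost per 1K in USD).
RESULTS = [
    ("Haiku", 3, 3, 0.061),
    ("Sonnet", 3, 3, 0.232),
]


def recommend(results):
    """Assumed rule: cheapest model among those tied for the best pass rate."""
    best_rate = max(passed / total for _, passed, total, _ in results)
    top = [row for row in results if row[1] / row[2] == best_rate]
    return min(top, key=lambda row: row[3])


def savings_pct(cheap_cost, expensive_cost):
    """Percentage saved by paying cheap_cost instead of expensive_cost."""
    return round((1 - cheap_cost / expensive_cost) * 100)


winner = recommend(RESULTS)
sonnet_cost = {name: cost for name, _, _, cost in RESULTS}["Sonnet"]
print(f"Use {winner[0].upper()}: saves {savings_pct(winner[3], sonnet_cost)}% vs Sonnet")
```

With the table's numbers, 1 - 0.061 / 0.232 is about 0.737, which rounds to the 74% shown in the recommendation line.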

Get started in 30 seconds

No installation required. Just run the command.

npx which-claude

1. Set API key

export ANTHROPIC_API_KEY=sk-ant-...

2. Create config

Add your test cases to which-claude.yaml

3. Run it

Get your recommendation in 60 seconds

Everything you need

Built for developers who want data, not clever product names.

Empirical testing

Test your actual prompt on Haiku, Sonnet, and Opus. No guessing, just data.

💰

Cost projection

See exactly what you'll pay at 1K, 10K, or 100K calls per day.

🧠

Thinking mode

Know if extended thinking improves quality and whether it's worth the cost.

📦

Prompt caching

Get caching recommendations that flag where prompt caching could cut costs by 40-60%.

👀

Watch mode

Auto-rerun on config changes. Fast feedback during iteration.

🚀

Zero setup

No installation. No dependencies. Just npx and go.
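
To make the cost projection feature concrete, here is a rough sketch of the arithmetic at 1K, 10K, and 100K calls per day. It assumes that "Cost / 1K" in the sample output means USD per 1,000 calls and that a month has 30 days. The function names are hypothetical and do not come from which-claude itself.

```python
# Assumption: "Cost / 1K" in the sample output is USD per 1,000 calls.
COST_PER_1K_CALLS = {"Haiku": 0.061, "Sonnet": 0.232}
CALLS_PER_DAY = (1_000, 10_000, 100_000)
DAYS_PER_MONTH = 30  # assumption, for a rough monthly figure


def daily_cost(cost_per_1k: float, calls_per_day: int) -> float:
    """USD per day at the given call volume."""
    return cost_per_1k * calls_per_day / 1_000


def projection(cost_per_1k: float) -> dict:
    """Map each daily volume to a (USD per day, USD per month) pair."""
    return {
        calls: (
            daily_cost(cost_per_1k, calls),
            daily_cost(cost_per_1k, calls) * DAYS_PER_MONTH,
        )
        for calls in CALLS_PER_DAY
    }


for model, cost in COST_PER_1K_CALLS.items():
    for calls, (per_day, per_month) in projection(cost).items():
        print(f"{model:<6} {calls:>7,} calls/day  ${per_day:8.2f}/day  ${per_month:9.2f}/month")
```

Under these assumptions, the gap between models scales linearly with volume, so a difference that looks negligible at 1K calls per day becomes the dominant line item at 100K.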

Ready to find your model?

Stop guessing. Start testing. Get results in 60 seconds.