Question 1

How much does PromptEval cost?

Accepted Answer

PromptEval has four plans: Free ($0/month: 3 web evaluations per month plus 10 eval API lint requests per month, no credit card required), Basic ($9 USD/month: 30 credits per month), Pro ($19 USD/month: unlimited web evaluations plus slug serving, a CI regression gate with a GitHub Action, and the full eval API), and Team ($49 USD/month: everything in Pro plus workspaces with roles, a production approval workflow, and an audit log). The REST eval API is available on every plan, with a monthly managed quota plus unlimited BYOK (bring your own key) usage.

Question 2

What is included in the free plan?

Accepted Answer

The free plan includes 3 prompt evaluations per month, a versioned prompt library (up to 5 prompts with unlimited versions per prompt), and daily challenge access. No credit card required.

Question 3

What is the difference between Pro and Team?

Accepted Answer

Pro ($19/month) gives unlimited web usage plus the platform features: slug serving (put a prompt in production without a redeploy), a CI regression gate with an official GitHub Action, the full eval API, and batch A/B testing. Team ($49/month) adds team governance: workspaces with roles, a production approval workflow, an audit log, library export, and a larger API quota (250 lint requests per month).

Question 4

Can I cancel my PromptEval subscription anytime?

Accepted Answer

Yes. You can cancel anytime from the customer portal with no questions asked. Your access continues until the end of the billing period.

Question 5

Does PromptEval require an API key?

Accepted Answer

No API key is required for prompt evaluation, token optimization, the production iterator, or the versioned library. The Playground and Batch A/B Test features use BYOK (bring your own key) for Anthropic or OpenAI, so you pay only for what you use there.

Question 6

Is the score reliable? How is it calculated?

Accepted Answer

The score is built from 8 sub-criteria across 4 dimensions (clarity, specificity, structure, robustness), each evaluated at temperature 0 against an explicit, anchored rubric: below 60 means serious gaps, above 85 means genuinely robust. It is then adjusted by ±8 for technical factors such as instruction positioning and system/user separation. The same prompt always returns the same score, so the result is reproducible rather than a subjective opinion.
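To make the arithmetic concrete, here is a minimal Python sketch of how such a score could be assembled. It is an illustration only, not PromptEval's actual implementation. It assumes each sub-criterion is scored from 0 to 100, all eight are weighted equally, the technical adjustment is clamped to the ±8 range, and the final score is clamped to 0-100. The names `overall_score` and `band`, and the two placeholder sub-scores per dimension, are hypothetical.

```python
from statistics import mean


def overall_score(sub_scores: dict[str, list[float]], adjustment: float = 0.0) -> float:
    """Equal-weight mean of all sub-scores plus a technical adjustment.

    Assumptions (not confirmed by the source): sub-scores are on a 0-100
    scale, weights are equal, and the adjustment is clamped to [-8, +8].
    """
    all_scores = [s for scores in sub_scores.values() for s in scores]
    if len(all_scores) != 8:
        raise ValueError("expected 8 sub-criteria scores")
    adj = max(-8.0, min(8.0, adjustment))
    return max(0.0, min(100.0, mean(all_scores) + adj))


def band(score: float) -> str:
    """Map a score to the two rubric anchors stated in the answer above."""
    if score < 60:
        return "serious gaps"
    if score > 85:
        return "genuinely robust"
    return "in between"  # the source does not name this middle range


# Hypothetical sub-scores: two placeholder sub-criteria per dimension.
example = {
    "clarity": [80, 90],
    "specificity": [70, 75],
    "structure": [85, 80],
    "robustness": [60, 70],
}

score = overall_score(example, adjustment=5)
print(score, band(score))
```

Because the function is deterministic, the same inputs always give the same output. In the real product, that property would also depend on the temperature-0 evaluation step being stable.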

Question 7

What do you do with my prompts?

Accepted Answer

Your prompts are sent to Claude for evaluation and discarded after processing. We never use your prompts to train AI models. You own your data.

From first score to prompt in production.