Precision and Recall
The two metrics that capture how a model fails — flagging too many false alarms versus missing too many real cases — and why choosing between them is a business decision, not a technical one.
Precision and recall are two metrics that describe different ways a classification model can fail. Precision measures how often the model is right when it flags something: if it flags 100 transactions as fraudulent and 80 actually are, precision is 80%. Recall measures how much of the real problem it catches: if there were 200 fraudulent transactions and the model flagged 160 of them, recall is 80%. Improving one typically worsens the other. A model tuned to catch more fraud will also generate more false alarms, and a model tuned to avoid false alarms will miss more fraud. The tuning knob is usually a decision threshold, the score above which the model flags a case, and where to set that threshold is not a technical question.
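To make the arithmetic concrete, here is a minimal Python sketch. The first part reproduces the 80% figures from the paragraph above. The second part uses a small set of made-up model scores and labels (purely illustrative, not real data) to show how moving the threshold shifts the two metrics in opposite directions. The function names are assumptions chosen for this example.

```python
def precision(tp: int, fp: int) -> float:
    """Share of flagged cases that were real: TP / (TP + FP)."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of real cases that were flagged: TP / (TP + FN)."""
    actual = tp + fn
    return tp / actual if actual else 0.0


def confusion_counts(scores, labels, threshold):
    """Count true positives, false positives, and false negatives
    when every score at or above `threshold` is flagged."""
    tp = fp = fn = 0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1
        elif flagged:
            fp += 1
        elif is_fraud:
            fn += 1
    return tp, fp, fn


# Figures from the text: 80 of 100 flagged transactions are fraud,
# and 160 of 200 fraudulent transactions are caught.
text_precision = precision(tp=80, fp=20)
text_recall = recall(tp=160, fn=40)

# Hypothetical model scores and true labels (1 = fraud), for illustration only.
SCORES = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
LABELS = [1, 1, 0, 1, 0, 1, 0, 0]


def sweep(thresholds=(0.85, 0.50, 0.25)):
    """Return (threshold, precision, recall) for each candidate threshold."""
    rows = []
    for t in thresholds:
        tp, fp, fn = confusion_counts(SCORES, LABELS, t)
        rows.append((t, precision(tp, fp), recall(tp, fn)))
    return rows


if __name__ == "__main__":
    print(f"text example: precision={text_precision:.2f} recall={text_recall:.2f}")
    for t, p, r in sweep():
        print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, lowering the threshold from 0.85 to 0.25 raises recall from 0.50 to 1.00 while precision falls from 1.00 to about 0.57.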
The precision-recall trade-off is a business decision in disguise. False positives and false negatives have different costs depending on context: blocking a legitimate customer transaction carries a different cost than missing a fraudulent one, and both carry reputational and financial consequences. Technical teams often set default thresholds without input from the business on what those costs actually are. When executives ask only about accuracy and ignore precision and recall, they're letting the technical team make an implicit business judgment about which type of mistake is more acceptable.
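One way to make that implicit judgment explicit is to attach a cost to each kind of mistake and see which threshold is cheapest. The sketch below assumes a simple linear cost model, with made-up scores, labels, and cost figures that are purely illustrative. In practice the cost numbers would have to come from the business, and real costs may not be this simple.

```python
# Hypothetical model scores and true labels (1 = fraud), for illustration only.
SCORES = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
LABELS = [1, 1, 0, 1, 0, 1, 0, 0]
THRESHOLDS = [0.85, 0.50, 0.25]


def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of mistakes at a threshold under an assumed linear cost model."""
    # False positive: flagged (score >= threshold) but not actually fraud.
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    # False negative: not flagged but actually fraud.
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp * cost_fp + fn * cost_fn


def cheapest_threshold(scores, labels, thresholds, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest total cost."""
    return min(
        thresholds,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


if __name__ == "__main__":
    # Scenario A (assumed): missed fraud is far costlier than a false alarm.
    print(cheapest_threshold(SCORES, LABELS, THRESHOLDS, cost_fp=5, cost_fn=100))
    # Scenario B (assumed): blocking a good customer is far costlier.
    print(cheapest_threshold(SCORES, LABELS, THRESHOLDS, cost_fp=100, cost_fn=5))
```

The same model and the same data yield opposite threshold choices in the two scenarios: the low threshold wins when missed fraud is expensive, and the high threshold wins when false alarms are. The only thing that changed is the cost assumption, which is a business input rather than a modeling one.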
Read next
Related concepts
Accuracy
The most widely reported AI performance metric — and one of the easiest to be misled by.
Model Evaluation (Operations and Deployment)
How teams determine whether a model actually works — and the reason 'it works in testing' is often the most dangerous thing anyone says before launch.
Classification (Foundations)
Teaching a model to sort things into categories — and learning why the wrong kind of wrong can be more costly than no AI at all.