Rule Quality in ezrules: Precision and Recall by Outcome-Label Pair

Teams usually know which rules fire often. The harder part is knowing whether those rules are right.

In ezrules v0.19, the new Rule Quality view makes that explicit by comparing rule outcomes against ground-truth labels.

The practical problem

Rules return outcomes such as HOLD, RELEASE, or CANCEL. Labels record business ground truth such as FRAUD, NORMAL, or CHARGEBACK.

Those vocabularies are intentionally separate, so quality analysis must evaluate outcome-label pairs, for example HOLD against FRAUD or RELEASE against NORMAL.

Each pair can have different precision/recall behavior.
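To make the pair idea concrete, here is a minimal sketch of how precision and recall could be computed for one outcome-label pair. It assumes the usual definitions: precision is the share of labeled events with the given outcome that also carry the given label, and recall is the share of events with that label that received that outcome. The `pair_metrics` function and the `(outcome, label)` tuple layout are illustrative assumptions, not the ezrules implementation.

```python
from typing import Iterable, Tuple


def pair_metrics(
    events: Iterable[Tuple[str, str]], outcome: str, label: str
) -> Tuple[float, float]:
    """Return (precision, recall) for one outcome-label pair.

    `events` holds (rule_outcome, ground_truth_label) tuples for labeled
    events only. This layout is an assumption made for illustration.
    """
    events = list(events)
    # Events where the rule returned `outcome` AND the label is `label`.
    hits = sum(1 for o, l in events if o == outcome and l == label)
    # All labeled events where the rule returned `outcome`.
    predicted = sum(1 for o, _ in events if o == outcome)
    # All labeled events carrying `label`.
    actual = sum(1 for _, l in events if l == label)
    precision = hits / predicted if predicted else 0.0
    recall = hits / actual if actual else 0.0
    return precision, recall


# Made-up sample data: (rule outcome, ground-truth label).
sample = [
    ("HOLD", "FRAUD"),
    ("HOLD", "FRAUD"),
    ("HOLD", "NORMAL"),
    ("RELEASE", "FRAUD"),
    ("RELEASE", "NORMAL"),
]
hold_fraud = pair_metrics(sample, "HOLD", "FRAUD")
release_normal = pair_metrics(sample, "RELEASE", "NORMAL")
```

The same five events yield different numbers for the two pairs, which is why the metrics are reported per pair rather than per rule.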

Before ranking rules, analysts can use Analytics to confirm that labels have been arriving over a useful time window.

Figure: 30-day label analytics charts

Open Rule Quality from the sidebar.

You get:

This helps analysts answer:

Figure: Rule quality report with ranked rules and pair metrics

Low-volume pairs can produce noisy rankings, because metrics computed from a handful of labeled events swing widely. The page includes a Min support filter to exclude them.

Under the hood, this maps to:

Raise the threshold when you want stable operational ranking; lower it for exploratory analysis.
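The effect of a minimum-support threshold can be sketched as follows. The `PairStats` record and the `filter_by_support` helper are hypothetical names, the numbers are made up, and "support" is assumed here to mean the number of labeled events behind a pair's metrics; the actual ezrules parameter and definition may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PairStats:
    rule: str
    outcome: str
    label: str
    support: int  # assumed: count of labeled events behind the metrics
    f1: float


def filter_by_support(pairs: List[PairStats], min_support: int) -> List[PairStats]:
    """Keep only pairs backed by at least `min_support` labeled events."""
    return [p for p in pairs if p.support >= min_support]


# Made-up rows for illustration.
pairs = [
    PairStats("rule_a", "HOLD", "FRAUD", support=3, f1=1.0),
    PairStats("rule_b", "HOLD", "FRAUD", support=250, f1=0.72),
    PairStats("rule_c", "CANCEL", "CHARGEBACK", support=40, f1=0.55),
]

stable = filter_by_support(pairs, min_support=50)      # operational view
exploratory = filter_by_support(pairs, min_support=1)  # exploratory view
```

In this example `rule_a` has a perfect F1 built on only three events, so it drops out of the operational view but remains visible for exploration.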

Configured outcome-label pairs keep the report focused on mappings the team actually reviews.

Figure: Curated rule quality pair settings

The bombardment script now supports in-line labeling of a small random percentage of evaluated events:

uv run python scripts/bombard_evaluator.py \
  --api-key <api_key> \
  --token <access_token> \
  --fraud-rate 0.01

That gives you continuous traffic plus a trickle of labeled events for quality monitoring, without waiting for manual CSV uploads.
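As a rough illustration of what labeling a small random percentage means, the sketch below marks each event independently with a fixed probability. It is not the script's actual code: the `label_sample` function, the choice of `FRAUD` as the assigned label (guessed from the flag name), and the seeding are all assumptions.

```python
import random
from typing import List, Optional


def label_sample(n_events: int, rate: float, seed: int = 0) -> List[Optional[str]]:
    """Assign a label to roughly `rate` of the events, None to the rest."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return ["FRAUD" if rng.random() < rate else None for _ in range(n_events)]


# With rate=0.01, about 1 in 100 events ends up labeled.
labels = label_sample(n_events=10_000, rate=0.01)
labeled = sum(1 for l in labels if l is not None)
```

Because each event is sampled independently, the labeled count fluctuates around `n_events * rate` rather than matching it exactly.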

1. Run traffic (live or bombardment).
2. Label events (manual, CSV, or bombardment rate).
3. Open Rule Quality and rank by F1.
4. Inspect low-scoring pairs and decide whether to adjust rule logic or mapping assumptions.
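Ranking by F1, as in the steps above, can be sketched with the standard definition of F1 as the harmonic mean of precision and recall. The row layout and the numbers are made up for illustration and do not reflect how ezrules stores or returns its metrics.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up rows: (rule, outcome, label, precision, recall).
rows = [
    ("rule_a", "HOLD", "FRAUD", 0.90, 0.30),
    ("rule_b", "HOLD", "FRAUD", 0.60, 0.60),
    ("rule_c", "CANCEL", "CHARGEBACK", 0.20, 0.80),
]

# Highest F1 first; the tail of this list is where to start inspecting.
ranked = sorted(rows, key=lambda r: f1_score(r[3], r[4]), reverse=True)
```

F1 penalizes imbalance: `rule_a` has the highest precision but ranks below `rule_b` because its recall is low.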

For API details, see the Manager API reference.