Confidence Scoring

Each candidate gets a confidence score. Below 70%? Probably just a bad week. Above 90%? You already knew. We just confirmed it.

Why Confidence, Not Just a Score

A single bad BSA score on a single bad day doesn't mean much. People have bad days. Bad weeks. Sometimes bad months driven by circumstances that have nothing to do with who they are. A confidence score captures something different: how certain the system is that the pattern is real and not situational.

Confidence is built from two things: the severity of recent behavior, and the persistence of the pattern over time.

How It's Calculated

confidence = (avg_recent_score × 0.6) + (bad_day_persistence × 0.4)

where avg_recent_score is the mean BSA score over the last 7 days, and bad_day_persistence is the fraction of all logged days that were bad days.
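As a rough illustration, the formula can be sketched in a few lines of Python. This is a minimal sketch, not the tool's actual implementation. It assumes BSA scores are normalized to the range 0 to 1, that there is one entry per logged day (oldest first), and that each day already carries a "bad day" flag. The function and parameter names are hypothetical. It covers only the base formula, not the adjustments described under What Lowers Confidence.

```python
from statistics import mean

RECENT_WEIGHT = 0.6        # weight on recent severity (from the formula)
PERSISTENCE_WEIGHT = 0.4   # weight on long-run persistence (from the formula)
RECENT_WINDOW_DAYS = 7     # "last 7 days"


def confidence(daily_scores, bad_day_flags):
    """Combine recent severity and persistence into one confidence value.

    daily_scores:  BSA scores, one per logged day, oldest first.
                   Assumed to be normalized to 0..1 (an assumption).
    bad_day_flags: one bool per logged day, True if it was a bad day.
    """
    if not daily_scores or not bad_day_flags:
        raise ValueError("need at least one logged day")

    # Mean BSA score over the most recent logged days (up to 7).
    avg_recent_score = mean(daily_scores[-RECENT_WINDOW_DAYS:])

    # Fraction of all logged days that were bad days.
    bad_day_persistence = sum(bad_day_flags) / len(bad_day_flags)

    return (avg_recent_score * RECENT_WEIGHT) + (bad_day_persistence * PERSISTENCE_WEIGHT)
```

Under these assumptions the result is also in the range 0 to 1, so multiplying by 100 gives the percentage used in the ranges on this page.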

The 60/40 split is deliberate. Recent behavior matters more than historical behavior — people can change — but history still counts. Someone who scores badly 80% of the time over six months and badly this week is a very different case from someone who scored badly this week for the first time.
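To make the contrast concrete, here is a small worked example. The numbers are invented for illustration: both people have the same hypothetical average score of 0.85 this week (on an assumed 0 to 1 scale) and roughly six months (180 days) of logs. The helper name is also hypothetical.

```python
def confidence_from_parts(avg_recent_score, bad_day_persistence):
    # The 60/40 weighting from the formula above.
    return (avg_recent_score * 0.6) + (bad_day_persistence * 0.4)


avg_recent_score = 0.85  # hypothetical: same bad week for both people

# Person A: bad on 80% of logged days over six months.
long_pattern = confidence_from_parts(avg_recent_score, 0.80)

# Person B: the 7 bad days this week are the only bad days in 180 logged days.
first_bad_week = confidence_from_parts(avg_recent_score, 7 / 180)

print(f"{long_pattern:.0%}")    # 83%
print(f"{first_bad_week:.0%}")  # 53%
```

The same week produces very different results: the recent component contributes 51 points to both, and the gap comes entirely from persistence.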

What the Ranges Mean

0 – 40%

Not flagged

Not enough signal to act on. Either the behavior isn't frequent enough, or the history is too short to distinguish pattern from noise. Keep logging if you're unsure. Give it more time.

40 – 70%

Monitor

Something is there but it's not clear yet. This is the most important range to sit with. Don't act. Don't dismiss. Watch, log, and let the pattern resolve itself. Either it gets worse and confirms, or it improves and resets.

70 – 90%

Review: strong signal

The threshold is met and confidence is high enough to warrant a review. This is where most cases land. Run --dry-run, review the evidence, and then make a decision. Don't skip that step.

90%+

Flag: high confidence

You already knew. The system is telling you what you've known for a while. The question isn't whether the pattern is real — it's what you're going to do about it. There are very few false positives at this level.
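The four ranges can be read as a simple lookup. The sketch below is one way to express it, not the tool's own code. The ranges above share their endpoints (40, 70, 90), so this version assumes a boundary value belongs to the higher band. The function name and the short labels are hypothetical.

```python
def confidence_band(confidence_pct):
    """Map a confidence percentage (0-100) to the band names on this page.

    Assumption: a value exactly on a boundary (40, 70, 90) falls into the
    higher band. The page does not say which side the boundaries belong to.
    """
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence_pct >= 90:
        return "Flag"
    if confidence_pct >= 70:
        return "Review"
    if confidence_pct >= 40:
        return "Monitor"
    return "Not flagged"
```

For example, confidence_band(83) returns "Review" and confidence_band(53) returns "Monitor".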

The 90% Phenomenon

In practice, scores above 90% almost never come as a surprise. When we've asked users to describe what they felt when they saw a 90%+ result, the most common response is relief — not shock. The score didn't tell them something new. It told them they weren't imagining it.

That's the real value of confidence scoring at the high end. Not information. Validation.

What Lowers Confidence

Several things push confidence down even when the consecutive-day count is above the threshold:

Context flags — Logging evaluations with "deadline week" or "personal crisis" active reduces signal weight. A difficult month can keep confidence suppressed even through bad behavior, which is intentional.
Single-source signals — If you're the only person logging this subject, confidence is slightly discounted. A second perspective would help.
Short history — The persistence component of the formula requires meaningful history. Three bad days in a row in the first week of tracking won't hit 90%.
Inconsistent patterns — Good days interspersed with bad days lower the persistence fraction and therefore lower confidence, even if the bad days are genuinely bad.

Confidence Is Not a Verdict

High confidence means the pattern is real. It doesn't automatically mean you should act on it. Some patterns are real and still manageable. Some relationships are worth maintaining despite genuine friction. The confidence score tells you what's happening — what you do with that is still entirely up to you.

That's why there's a --dry-run.
