⚙️

Configurable Thresholds

Three consecutive bad days is the default. Dealing with extended family? Bump it to 5. Working in finance or running a startup? 2 might be generous enough.

Why Three

The default threshold of three consecutive bad days wasn't chosen arbitrarily. It reflects a specific philosophy about what constitutes a pattern versus noise:

Noise

One bad day is noise. Everyone has them. External pressure, poor sleep, a bad phone call before they walked in the door. A single incident says almost nothing about a person.

Pattern Forming

Two consecutive bad days is worth noting. Something is going on. But it's still not enough to act on — it could still be circumstantial, and one good day sits between you and a clean bill of health.

Lifestyle

Three in a row is a choice. At this point, the behavior is no longer being overridden by their better nature on a given day. It's what you're getting. The question is whether it's acceptable.

Adjusting for Context

The right threshold depends heavily on the relationship, the environment, and what the stakes are.

| Context | Threshold | Rationale |
|---|---|---|
| Personal friendships | 3 (default) | Balanced. Enough grace, enough signal. |
| Romantic relationships | 2–3 | Intimacy amplifies patterns. Bad days hit harder and more often. |
| Professional (peer) | 3–4 | Work stress is real. More room for situational behavior. |
| Professional (power over you) | 2 | Power asymmetry makes patterns more damaging. Act sooner. |
| Finance / law / politics | 2 | High-stakes environments where bad-faith actors cause outsized harm. |
| Extended family | 4–5 | Social complexity and obligations warrant more patience. |
| Nuclear family | Use judgment | The tool is one input. Professional support is another. |
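
If you switch contexts often, the suggestions above can be kept as a small lookup. The following is only a sketch: the key names, the `threshold_for` helper, and the `lenient` switch are hypothetical and not part of the tool. Falling back to the default of 3 for unlisted contexts is also an assumption.

```python
from typing import Optional

# Suggested consecutive-day thresholds per context, stored as (low, high).
# None stands for "use judgment": no numeric preset is suggested.
# All key names here are hypothetical labels for the contexts listed above.
PRESETS = {
    "personal_friendship": (3, 3),
    "romantic": (2, 3),
    "professional_peer": (3, 4),
    "professional_power_over_you": (2, 2),
    "finance_law_politics": (2, 2),
    "extended_family": (4, 5),
    "nuclear_family": None,
}

DEFAULT_THRESHOLD = 3


def threshold_for(context: str, lenient: bool = False) -> Optional[int]:
    """Return a consecutive-day threshold, or None when judgment is called for."""
    if context not in PRESETS:
        # Assumption: unlisted contexts fall back to the default.
        return DEFAULT_THRESHOLD
    preset = PRESETS[context]
    if preset is None:
        return None
    low, high = preset
    # Pick the stricter (lower) end of the range unless asked to be lenient.
    return high if lenient else low
```

Choosing the lower end of a range by default errs on the side of acting sooner; pass `lenient=True` to take the upper end instead.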

The Reset Asymmetry

The flag threshold (the score at which a day starts counting as bad) is 0.55. The reset threshold (the score a day has to get down to before the counter resets) is 0.30. This gap of 0.25 is intentional.

Without the gap, a subject could game the system with minimal compliance — one slightly-below-threshold day every few days to prevent the streak from building. The lower reset threshold requires genuinely decent behavior to clear a flag, not just technically-not-flagging behavior.

In practice: if someone is being bad enough to score 0.54 on their "good" days, the counter doesn't reset. That's still a problem. You need to get to 0.30 — which requires actually being decent.
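
The asymmetry can be sketched as a counter with two thresholds. This is an illustration under stated assumptions, not the tool's actual implementation: the `StreakCounter` and `first_flag_day` names are hypothetical, both comparisons are assumed to be inclusive, and a day that scores between the two thresholds is assumed to leave the counter unchanged.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StreakCounter:
    """Counts bad days using separate flag and reset thresholds."""

    flag_threshold: float = 0.55   # a score at or above this counts as a bad day
    reset_threshold: float = 0.30  # a score at or below this clears the streak
    streak: int = 0

    def record(self, score: float) -> int:
        """Apply one day's score and return the updated streak length."""
        if score >= self.flag_threshold:
            self.streak += 1
        elif score <= self.reset_threshold:
            self.streak = 0
        # Assumption: a score strictly between the two thresholds is neither
        # bad enough to count nor good enough to reset, so nothing changes.
        return self.streak


def first_flag_day(scores: Iterable[float], consecutive_days: int = 3) -> Optional[int]:
    """Return the 0-based index of the day the streak reaches the limit, or None."""
    counter = StreakCounter()
    for day, score in enumerate(scores):
        if counter.record(score) >= consecutive_days:
            return day
    return None
```

With scores of `[0.60, 0.60, 0.54, 0.60]`, the 0.54 day does not clear the streak, so the fourth day brings the count to three. Replace the 0.54 with 0.25 and the counter resets, so nothing is flagged.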

Tuning the Score Thresholds

Beyond consecutive days, two score thresholds are configurable:

--day-threshold N — the BSA score at which a day counts as "bad." Default 0.55. Lower this if you want to catch subtler patterns; raise it if you want to filter out more noise.
--confidence-min N — the minimum confidence score to actually flag someone. Default 0.70. Raise this for fewer, higher-certainty flags. Lower it if you want to surface borderline cases for manual review.
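
To show how the two options might fit together, here is a minimal `argparse` sketch. Only the flag names and defaults come from the text above. The program name, the `unit_interval` validator, the `should_flag` helper, and the rule that a flag needs both a long enough streak and enough confidence are assumptions made for illustration. It also assumes that scores lie between 0 and 1.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and require it to lie in [0, 1] (assumed score range)."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{text!r} is not between 0 and 1")
    return value


def build_parser() -> argparse.ArgumentParser:
    # "threshold-demo" is a placeholder program name, not the real CLI.
    parser = argparse.ArgumentParser(prog="threshold-demo")
    parser.add_argument("--day-threshold", type=unit_interval, default=0.55,
                        metavar="N", help="score at which a day counts as bad")
    parser.add_argument("--confidence-min", type=unit_interval, default=0.70,
                        metavar="N", help="minimum confidence needed to flag")
    return parser


def should_flag(bad_streak: int, confidence: float,
                confidence_min: float, consecutive_days: int = 3) -> bool:
    """Assumed rule: flag only when the streak and the confidence both qualify."""
    return bad_streak >= consecutive_days and confidence >= confidence_min


# Example: catch subtler bad days, but demand more certainty before flagging.
args = build_parser().parse_args(["--day-threshold", "0.50",
                                  "--confidence-min", "0.80"])
```

With these settings, a three-day streak at confidence 0.75 would not be flagged under the assumed rule, because it falls short of the raised minimum of 0.80.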
