
The 'don't cry wolf' philosophy — what we don't flag

By Alejandro Taubas | February 18, 2025 | 6 min read

Here is a thing that happened to a parent in our beta program. (Synthetic composite, not a real person — but we've heard versions of this from multiple households.) Their baby tracking app sent them four alerts in a single morning: a notification that the baby hadn't slept in three hours, a "feeding reminder," a flag that the diaper count was below the daily average, and a "milestone check" asking if the baby had started cooing.

They turned off all notifications that afternoon. Two weeks later, their baby had a genuine sustained feeding refusal that was worth calling their pediatrician about. They didn't get an alert. They almost missed it.

This is the don't-cry-wolf problem, and it is the reason Xoul's pediatric flag system was the hardest part of the product to build.

How notification fatigue happens

Baby tracking apps are built by people who want to be useful. Useful, in the product mindset, often means more features, more information, more alerts. If the app flags a diaper gap, it seems like it's paying attention. If it flags that feeding is slightly below yesterday's average, it seems like it cares. The more alerts, the more the app seems worth having.

The problem is that each unnecessary alert costs something. It costs the parent two minutes of attention, a small spike of anxiety, and — critically — a small reduction in how much they trust the next alert. After the fourth false alarm in a week, parents do what anyone does with a system that cries wolf: they stop listening. They turn off notifications. They glance at alerts and assume they're probably nothing.

"An alert system that's right 80% of the time is worse than no alert system, because the 20% miss happens when parents have already tuned out."

When the alert that actually matters arrives — when there's a genuine sustained volume drop, or an inconsolable crying episode that's been going on for three hours — it looks exactly like every other alert. There's no way to know this one is different. So it gets the same response as the others: a glance, a mental note of "probably fine," and back to sleep.

What we actually flag — and why just seven

When we designed the flag system, we started with a longer list. We looked at what other apps flagged. We read literature on pediatric warning signs. We talked to parents about what had made them call their pediatrician, and what in retrospect hadn't needed a call.

Then we cut. Hard.

We kept flags only when all three of the following were true:

The pattern is observable from a routine log (we're tracking feeds, sleep, and diapers — not vitals, not temperature measured by a device)
The pattern, if real and sustained, is something a pediatrician would want to know about at the next available conversation
The pattern is not likely to occur as a false positive during normal infant variation
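
The three-part test above can be read as a simple filter over candidate flags. The sketch below is illustrative only: the `CandidateFlag` type, its field names, and the way each example candidate is scored are assumptions made for this post, not Xoul's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateFlag:
    """A hypothetical candidate flag, scored against the three criteria."""
    name: str
    observable_from_log: bool       # criterion 1: visible in feeds, sleep, diapers
    pediatrician_would_want: bool   # criterion 2: worth raising if real and sustained
    false_positive_prone: bool      # criterion 3, inverted: fires during normal variation


def keep_flag(candidate: CandidateFlag) -> bool:
    """Keep a flag only when all three criteria hold."""
    return (
        candidate.observable_from_log
        and candidate.pediatrician_would_want
        and not candidate.false_positive_prone
    )


# Example scoring (assumed for illustration).
candidates = [
    CandidateFlag("sustained gap in wet diapers", True, True, False),
    CandidateFlag("single feed slightly below average", True, False, True),
    CandidateFlag("temperature measured by a device", False, True, False),
]

kept = [c.name for c in candidates if keep_flag(c)]
print(kept)
```

The point of writing it as a strict `and` is that a candidate failing any one criterion is cut, no matter how well it does on the other two.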

That gave us seven flags: feeding refusal beyond a sustained threshold in newborns, significant volume drop from baseline, extended inconsolable crying, a sharp sustained sleep disruption combined with other markers, a sustained gap in wet diapers, a caregiver-noted fever alongside disrupted patterns, and multiple concurrent pattern disruptions happening together.

These aren't diagnoses. They're pattern observations that, in combination or sustained over time, suggest it's worth talking to your pediatrician. Xoul doesn't tell you what's wrong. It tells you that something in the pattern has changed in a way that crossed our threshold, and you should consider making a call.
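
One way to picture "pattern observation, not diagnosis" in code is a fixed set of seven flags whose only output is a prompt to consider a call. This is a minimal sketch: the names `PediatricFlag` and `observation_message`, and the message wording, are hypothetical.

```python
from enum import Enum


class PediatricFlag(Enum):
    """The seven flags named in this post (identifiers are hypothetical)."""
    FEEDING_REFUSAL = "feeding refusal beyond a sustained threshold in newborns"
    VOLUME_DROP = "significant volume drop from baseline"
    INCONSOLABLE_CRYING = "extended inconsolable crying"
    SLEEP_DISRUPTION = "sharp sustained sleep disruption combined with other markers"
    WET_DIAPER_GAP = "sustained gap in wet diapers"
    FEVER_WITH_DISRUPTION = "caregiver-noted fever alongside disrupted patterns"
    CONCURRENT_DISRUPTIONS = "multiple concurrent pattern disruptions"


def observation_message(flag: PediatricFlag) -> str:
    """Describe what changed in the pattern, without saying what is wrong."""
    return f"Pattern observed: {flag.value}. Consider talking to your pediatrician."


print(observation_message(PediatricFlag.WET_DIAPER_GAP))
```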

What we deliberately don't flag

Mild diaper rash is common and not flagged. Occasional spit-up is normal in the first year and not flagged. Brief fussiness under 90 minutes is essentially constant in infants and not flagged. A single feed that's slightly below average is noise and not flagged. A single night of worse-than-average sleep is not flagged.

These exclusions are as important as the inclusions. Every time we leave a normal variation unflagged, we're preserving the credibility of the flags that do fire. The system only works if the alerts mean something. They only mean something if we're disciplined about not using them for everything.
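
The difference between a single low feed (noise) and a sustained drop (signal) can be sketched as a check that only fires when every recent feed falls well below the earlier baseline. The function name, window sizes, and the 0.6 ratio below are placeholder assumptions chosen for illustration, not Xoul's thresholds.

```python
from statistics import mean


def sustained_volume_drop(volumes_ml, baseline_n=8, recent_n=4, drop_ratio=0.6):
    """Return True only if each of the last `recent_n` feeds is below
    `drop_ratio` times the mean of the `baseline_n` feeds before them.

    All parameter values are illustrative placeholders.
    """
    if len(volumes_ml) < baseline_n + recent_n:
        return False  # not enough history to establish a baseline
    baseline = mean(volumes_ml[-(baseline_n + recent_n):-recent_n])
    recent = volumes_ml[-recent_n:]
    return all(v < drop_ratio * baseline for v in recent)


steady = [90] * 12
one_low_feed = [90] * 11 + [40]
sustained = [90] * 8 + [40, 45, 50, 40]

print(sustained_volume_drop(steady))        # False
print(sustained_volume_drop(one_low_feed))  # False: a single dip is noise
print(sustained_volume_drop(sustained))     # True: every recent feed is low
```

Requiring all recent feeds to be low, rather than their average, is what keeps one unusually small feed from tripping the flag.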

The epistemics of not knowing

There is a real discomfort in building a system that doesn't tell you everything. Parents in the newborn phase are anxious by design — their brains are wired to treat infant signals as important. When the app stays quiet, it can feel like the app isn't working.

We made a deliberate choice to carry that discomfort rather than resolve it with more alerts. A quiet Xoul is a Xoul that hasn't detected anything worth surfacing. That's different from a quiet Xoul that stopped paying attention. The flags are running in the background on every entry.

And if your gut says something is off, that's worth more than anything an app can tell you. Xoul is not a medical device. It is a personal organizer for your household. Nothing in this app constitutes medical advice. If your gut says call the pediatrician, call the pediatrician. Don't wait for an alert that may never come because what you're noticing doesn't show up in a log.

The app does the pattern-watching. Your instincts do the intuition. They're not the same thing, and both are useful.

Why we talk about this publicly

Most apps don't explain their alert logic because explaining it invites scrutiny. If you know that the flag threshold for feeding refusal is 8 hours in a newborn, you might argue it should be 6 hours, or you might argue that the threshold should vary by age. You'd be having a legitimate conversation about the product design.
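
To make that debate concrete, here is a minimal sketch in which the threshold is an explicit parameter rather than a hidden constant. The 8-hour and 6-hour figures are the ones used as examples above; the function and constant names are hypothetical, and none of these values should be read as clinical guidance.

```python
from datetime import datetime, timedelta

# Example figure from the paragraph above, not a recommendation.
FEEDING_GAP_THRESHOLD = timedelta(hours=8)


def feeding_gap_exceeded(last_feed, now, threshold=FEEDING_GAP_THRESHOLD):
    """Return True when the time since the last logged feed reaches the threshold."""
    return now - last_feed >= threshold


last_feed = datetime(2025, 2, 18, 0, 0)
seven_hours_later = datetime(2025, 2, 18, 7, 0)

print(feeding_gap_exceeded(last_feed, seven_hours_later))  # False at 8 hours
print(feeding_gap_exceeded(last_feed, seven_hours_later,
                           threshold=timedelta(hours=6)))  # True at 6 hours
```

The same seven-hour gap is quiet under one threshold and a flag under the other, which is exactly the kind of trade-off worth arguing about in the open.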

We think that conversation is worthwhile. We'd rather you understand what Xoul flags and why, and make an informed decision about whether that matches your household's risk tolerance, than have you use a system you don't understand and either over-trust or under-trust it.

The design philosophy is that Xoul's value comes from the signal-to-noise ratio, not from volume of information. If we get a flag wrong — if our threshold is off in either direction — we want to hear about it. The right way to tell us is [email protected]. We're two engineers. We read all of it.

Xoul is not a medical device. Nothing in this app constitutes medical advice. The flag system is a pattern observation tool, not a clinical screening system. For any health concern about your child, your pediatrician is the right call — not an app.

If you want to see the actual flag list — what we do and don't flag, and the thresholds we use — it's on the Pediatric Flags page. You can also read the longer companion piece: what merits a call versus what's normal.
