Louder is the wrong fix.
The hard-of-hearing cohort doesn't need more volume — they need the consonants back. Why presbycusis breaks intelligibility, and why turning the speaker up makes it worse.
The reflex when a resident says "what?" is to turn the speaker up. It almost never works, and it often backfires. A large share of the residents Companion talks to have presbycusis (age-related hearing loss), and the thing about presbycusis is that it isn't a volume problem. It's a spectrum problem.
What presbycusis actually takes
Presbycusis eats the top of the frequency range first and works down. Vowels live low, roughly 250 to 2000 Hz, and they're loud and long, so they survive. Consonants like s, t, f, sh, and th live high, between 2000 and 8000 Hz, are quiet to begin with, and last only a few dozen milliseconds. Those are precisely the frequencies that are gone. So the resident hears "ay-uh" for "take" and "oo" for "soup." The energy arrives. The meaning is erased.
Now turn the volume up. You amplify the vowels the resident already heard fine, and you amplify them past the point of comfort and into distortion. The consonants she was missing are still missing — there's no receptor left up there to catch them. You've made the device loud and still unintelligible, which is the worst of both. With hearing aids in the room it's worse again: louder input drives the aid into clipping and feedback.
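To make the argument concrete, here is a toy audibility model in Python. Every number in it is an illustrative assumption, not a measurement: the per-band speech levels, the sloping-loss thresholds, and the discomfort ceiling are all hypothetical, and treating the top bands as having no usable hearing at all is a deliberate simplification of "no receptor left up there."

```python
# Toy model: which frequency bands can a listener hear at a given gain?
# All values below are illustrative assumptions, not clinical data.

# Hypothetical speech level per band at the listener's ear, in dB SPL.
SPEECH_DB = {250: 60, 500: 62, 1000: 58, 2000: 50, 4000: 45, 8000: 40}

# Hypothetical sloping-loss hearing thresholds in dB SPL.
# None marks a band with no usable hearing left, so no gain can restore it.
THRESHOLD_DB = {250: 30, 500: 35, 1000: 45, 2000: 60, 4000: None, 8000: None}

# Hypothetical level above which sound becomes uncomfortable.
DISCOMFORT_DB = 95


def audible_bands(gain_db: float) -> list[int]:
    """Bands whose amplified level reaches the listener's threshold."""
    return [
        band
        for band, level in SPEECH_DB.items()
        if THRESHOLD_DB[band] is not None and level + gain_db >= THRESHOLD_DB[band]
    ]


def uncomfortable_bands(gain_db: float) -> list[int]:
    """Bands whose amplified level exceeds the discomfort ceiling."""
    return [band for band, level in SPEECH_DB.items() if level + gain_db > DISCOMFORT_DB]


if __name__ == "__main__":
    for gain in (0, 40):
        print(gain, audible_bands(gain), uncomfortable_bands(gain))
```

Under these assumptions, 40 dB of extra gain adds only the 2000 Hz band to what the listener hears. The 4000 and 8000 Hz bands stay silent, while the three vowel-heavy bands that were already audible are pushed past the discomfort ceiling.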
Clarity, not gain
So the hard-of-hearing cohort gets a TTS profile built around clarity. The levers we actually pull:
- Slower delivery — in our pilot, roughly 15% under the stock rate, so each consonant burst gets a moment to register before the next syllable lands on top of it.
- Crisper consonant articulation rather than stretched vowels, which protects intelligibility without sounding theatrical.
- A mild low-shelf bias that pulls energy out of the very high frequencies the listener can't recover anyway, instead of wasting it on inaudible brightness.
- Short, single-clause sentences, because a missed word is easier to repair when there are fewer words around it.
- Volume held at a comfortable plateau, not pushed, and a clean, prompt repeat on request rather than a louder one.
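Two of the levers above, the slower rate and the short sentences, can be sketched as a small profile object that renders SSML. This is a minimal sketch, not Companion's actual pipeline: `ClarityProfile`, `to_ssml`, `needs_rewrite`, and the 12-word cap are hypothetical names and values. It assumes a TTS engine that follows SSML 1.1 semantics, where a `prosody` rate of `100%` means no change, so `85%` lands roughly 15% under the stock rate. Volume is deliberately left at the engine default.

```python
import re
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class ClarityProfile:
    """Hypothetical settings for the hard-of-hearing TTS profile."""
    rate_percent: int = 85            # about 15% under the stock rate
    max_words_per_sentence: int = 12  # assumed cap, not a figure from the pilot


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text.strip()) if part]


def needs_rewrite(sentence: str, profile: ClarityProfile) -> bool:
    """Flag sentences that are too long to repair easily after a missed word."""
    return len(sentence.split()) > profile.max_words_per_sentence


def to_ssml(text: str, profile: ClarityProfile = ClarityProfile()) -> str:
    """Wrap each sentence in <s> and slow the whole utterance down."""
    sentences = "".join(f"<s>{escape(s)}</s>" for s in split_sentences(text))
    return f'<speak><prosody rate="{profile.rate_percent}%">{sentences}</prosody></speak>'


if __name__ == "__main__":
    print(to_ssml("Would you like soup? Lunch is at noon."))
```

Keeping the rate in one profile object means a repeat on request can reuse the same settings instead of reaching for more volume.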
There's a hardware floor under all of this. The 16 kHz mono PCM we run gives us a usable band up to 8 kHz (half the sample rate), which covers the 2000 to 8000 Hz range where the consonants we're fighting for live, so we're not synthesizing brightness we can't even carry through the pipe.
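The arithmetic behind that floor is the Nyquist limit: a sampled signal can represent frequencies only up to half its sample rate. A minimal check in Python, with hypothetical helper names and the consonant band taken from the figures earlier in this post:

```python
SAMPLE_RATE_HZ = 16_000
CONSONANT_BAND_HZ = (2_000, 8_000)  # range quoted earlier for s, t, f, sh, th


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def band_fits(band_hz: tuple[int, int], sample_rate_hz: int) -> bool:
    """True if the top of the band is at or below the Nyquist limit."""
    _low, high = band_hz
    return high <= nyquist_hz(sample_rate_hz)


if __name__ == "__main__":
    print(nyquist_hz(SAMPLE_RATE_HZ))
    print(band_fits(CONSONANT_BAND_HZ, SAMPLE_RATE_HZ))
    print(band_fits(CONSONANT_BAND_HZ, 8_000))  # a narrowband rate would not fit
```

In practice, anti-aliasing filters roll off somewhat below the theoretical limit, so the very top of the band arrives attenuated. The check is a ceiling, not a guarantee.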
The result we care about is the resident in room 214B hearing the question the first time. She doesn't say "what?" She doesn't decide the device is broken and turn back to the wall. Slower and clearer beat louder every time, and our next voice update fits the low-shelf curve per-room to the speaker hardware actually deployed there.