Engineering · February 28, 2026 · 3 min read

Logging a symptom can't leave dead air in the conversation.

Calling a tool — record a symptom, page a nurse — in the middle of a spoken exchange, without the awkward silence that exposes the machinery underneath.

Companion is most useful at the exact moment it has to do something other than talk. A resident says her knee is swollen and warm — that needs to become a structured event and, depending on the rules for that wing, a nurse alert. In a text chatbot, a tool call is invisible: the UI shows a spinner. In a spoken conversation there is no spinner. There is only silence, and silence in the middle of a sentence sounds like the device just died.

The dead-air problem

The naive flow is: resident speaks, model decides to call log_symptom, we suspend generation, hit Firestore and the alerting path, wait for the result, then resume. The tool round-trip might be a few hundred milliseconds, sometimes more if a downstream service is slow. During that window Companion says nothing. To the resident it reads as "I said something serious and the thing went quiet," which is the worst possible response to a health complaint.

Talk first, act underneath

The fix is to separate the human-facing acknowledgment from the machine-facing work. Most of our care tools are write-and-confirm, not read-and-depend: the model doesn't need the tool's result to keep the conversation going; it just needs to act. So we let it speak the acknowledgment immediately while the tool executes in the background.

The model emits a spoken acknowledgment ("I'll make a note of that and let your nurse know"), which starts playing right away.
In parallel, the Go server executes the tool call: write the structured event, trigger the alert per that wing's escalation rules.
If the tool succeeds, nothing further is needed — the resident already heard the confirmation and the conversation never stalled.
If it fails, that becomes its own escalation through a different channel, never a mid-sentence error read aloud to the resident.

This only works because we are honest about which tools can run optimistically. "Log a symptom" and "flag for follow-up" are fire-and-confirm: safe to acknowledge before the write lands, because the write is designed not to fail silently. A tool whose answer the model needs in order to speak the next sentence is a different shape and gets a different, more careful treatment: a brief grounded filler while it waits, rather than a confident reply built on data that hasn't arrived.

There is a discipline cost. The model has to be steered to acknowledge in plain, calm language and never to invent a result it hasn't gotten back. And anything clinical it records is still just an event — a nurse reviews it before it counts. But the resident in 214B never feels the machinery. She mentions her knee, Companion answers like a person who's writing it down while still looking at her, and the nurse's pager goes off down the hall before the sentence is even finished.

conversational-ai · tools · realtime

