Realtime Voice AI AGENTS Will Explode in 2025 | SHOWCASE
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Real-time voice AI agents are moving from demos to practical business workflows—using function calling to check availability, confirm bookings, and write results into a database during a live phone conversation. The showcase centers on an “AI Dental” receptionist that answers calls, gathers appointment details, queries a scheduling database for open slots, rejects unavailable times, and then books the customer automatically.
The system is built around a real-time API plus a tightly controlled system message that defines the agent’s role and the exact tools it can use. When a caller asks for a specific time—like “9 a.m. tomorrow”—the agent triggers a function call that checks the database for that exact date and slot. If the requested slot is already taken, the agent responds with an apology and offers alternatives by calling another function to list available slots. Once the caller selects an open time (for example, “11 a.m.”), the agent confirms the appointment, collects contact information and any special requirements, and records the booking.
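The two tool calls described above can be sketched as plain functions plus a small dispatcher. This is a minimal illustration, not the creator's code: the in-memory `SCHEDULE`, the function names `check_availability`, `list_available_slots`, `book_slot`, and `handle_tool_call`, and the `date`/`slot` argument names are all assumptions made for the example. A real deployment would query the scheduling database instead.

```python
from datetime import date

# Hypothetical in-memory schedule standing in for the scheduling database:
# date -> {slot label: name of the booked customer, or None if the slot is open}.
SCHEDULE = {
    date(2025, 1, 15): {"09:00": "Existing Patient", "10:00": None, "11:00": None},
}


def check_availability(day: date, slot: str) -> bool:
    """Return True only if that exact date/slot exists and is unbooked."""
    slots = SCHEDULE.get(day, {})
    return slot in slots and slots[slot] is None


def list_available_slots(day: date) -> list[str]:
    """Return every open slot for the given date, sorted by time label."""
    return sorted(s for s, who in SCHEDULE.get(day, {}).items() if who is None)


def book_slot(day: date, slot: str, customer: str) -> bool:
    """Book the slot if it is still free and report whether that succeeded."""
    if not check_availability(day, slot):
        return False
    SCHEDULE[day][slot] = customer
    return True


def handle_tool_call(name: str, args: dict) -> dict:
    """Route a tool call requested by the model to the matching function."""
    day = date.fromisoformat(args["date"])
    if name == "check_availability":
        return {"available": check_availability(day, args["slot"])}
    if name == "list_available_slots":
        return {"slots": list_available_slots(day)}
    raise ValueError(f"unknown tool: {name}")
```

In this sketch, a request for the taken 09:00 slot returns `{"available": False}`, which is the point where the agent would apologize and call `list_available_slots` to offer alternatives.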
A key operational detail is the pipeline that turns a live call into structured data. The conversation is recorded and saved as an MP3 file, then transcribed using Whisper. From the transcript, structured outputs extract the fields needed to fill the appointment schema: first and last name, appointment date, chosen slot, contact email/phone, and special requests (such as “calm background music” or even a “beer” to be offered during the visit). Those extracted values are then saved to the database and shown on a dashboard, demonstrating end-to-end automation from voice interaction to persistent scheduling records.
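The last two stages of that pipeline (validating the extracted fields, then persisting them) might look like the sketch below. It assumes the transcription and structured-output steps have already produced a plain dictionary; the `Appointment` dataclass, its field names, the `appointments` table, and the use of SQLite are hypothetical choices for illustration, not details from the video.

```python
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class Appointment:
    """Hypothetical appointment schema mirroring the fields listed above."""
    first_name: str
    last_name: str
    appointment_date: str  # ISO date string, e.g. "2025-01-15"
    slot: str
    contact: str           # email or phone
    special_requests: str = ""


REQUIRED = ("first_name", "last_name", "appointment_date", "slot", "contact")


def to_appointment(extracted: dict) -> Appointment:
    """Validate the extracted fields and build an Appointment."""
    missing = [field for field in REQUIRED if not extracted.get(field)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Appointment(
        **{field: extracted[field] for field in REQUIRED},
        special_requests=extracted.get("special_requests") or "",
    )


def save_appointment(conn: sqlite3.Connection, appt: Appointment) -> int:
    """Insert the appointment and return the new row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS appointments ("
        "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
        "appointment_date TEXT, slot TEXT, contact TEXT, special_requests TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO appointments (first_name, last_name, appointment_date, "
        "slot, contact, special_requests) VALUES (:first_name, :last_name, "
        ":appointment_date, :slot, :contact, :special_requests)",
        asdict(appt),
    )
    conn.commit()
    return cur.lastrowid
```

Rejecting records with missing required fields before the insert is one way to keep imperfect extractions out of the dashboard; optional fields such as special requests default to an empty string.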
The agent’s behavior also depends heavily on instruction-following. The system message instructs the agent to greet and identify the caller’s needs, collect required information, use the “list available slots” tool when asked for options, use the “check availability” tool when a specific time is requested, and always read back the confirmed date and time. The creator notes that some fields can be buggy (special requirements and certain extracted details aren’t always perfect yet), but the overall booking loop works reliably enough to be compelling.
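A system message following those rules could be drafted roughly as below. The wording is a hypothetical reconstruction from the summary above, not the creator's actual prompt, and the snake_case tool names are assumed. The small helper only checks that every tool the agent is expected to use is actually named in the message.

```python
# Hypothetical system message; the real prompt from the video is not shown here.
SYSTEM_MESSAGE = """\
You are the receptionist for a dental clinic.
1. Greet the caller and identify what they need.
2. Collect the required details: name, date, time, contact information,
   and any special requirements.
3. When the caller asks which times are open, call list_available_slots.
4. When the caller requests a specific time, call check_availability.
5. Always read back the confirmed date and time before ending the call.
"""

# Assumed tool names, matching the two tools described in the text.
TOOL_NAMES = ("list_available_slots", "check_availability")


def missing_tools(message: str, tool_names=TOOL_NAMES) -> list[str]:
    """Return the tool names that the system message never mentions."""
    return [name for name in tool_names if name not in message]
```

A check like `missing_tools(SYSTEM_MESSAGE)` can run in a test suite so that later prompt edits do not silently drop a tool instruction.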
Beyond scheduling logic, the showcase highlights voice and personality flexibility. Using the real-time API playground, different voices (e.g., “Sage” and “Ash”) can be selected, and the agent’s spoken style can be adjusted with voice configuration settings such as accent and pacing. Quick call snippets show the receptionist persona shifting while still performing the same appointment workflow.
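Outside the playground, a voice swap might be expressed as a session configuration message. The sketch below assumes the real-time API accepts an OpenAI-style `session.update` event with `voice` and `instructions` fields; treat those field names as assumptions to verify against the current API reference. The helper `build_session_update` and the idea of appending style notes to the instructions are illustrative only.

```python
import json


def build_session_update(voice: str, style_notes: str, base_instructions: str) -> str:
    """Build a JSON event that changes voice and speaking style only.

    The booking instructions are passed through unchanged, so the workflow
    stays the same while the persona shifts.
    """
    event = {
        "type": "session.update",
        "session": {
            "voice": voice,
            "instructions": f"{base_instructions}\n\nSpeaking style: {style_notes}",
        },
    }
    return json.dumps(event)
```

For example, `build_session_update("sage", "calm, slow pacing", base_prompt)` and `build_session_update("ash", "upbeat, brisk pacing", base_prompt)` would yield two personas that share the same booking instructions.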
The takeaway for 2025 is less about a single dental use case and more about a repeatable pattern: real-time voice input, tool-based function calling for database-backed decisions, structured extraction for reliable record-keeping, and configurable voice/personality for user experience. With more testing before production, the approach points to a broader wave of voice agents that can handle customer interactions—answering questions, checking availability, and completing transactions—without human intervention.
Cornell Notes
A real-time voice AI receptionist automates dental appointment bookings by combining a real-time API with function calling and a database-backed scheduling workflow. During a call, the agent identifies the requested date/time, checks availability for exact slots, lists alternatives when a time is taken, and confirms the chosen appointment. After the conversation, the system records audio, transcribes it with Whisper, and uses structured outputs to extract key fields like name, appointment date, slot, contact info, and special requirements, then saves the booking to a dashboard. The system’s reliability depends on a detailed system message that forces consistent tool use and confirmation behavior. Voice and personality can be swapped by selecting different voices in the real-time API playground.
- How does the agent decide whether a requested appointment time is available?
- What happens when a caller asks for available times instead of a specific slot?
- How does the system turn a phone conversation into a structured appointment record?
- Why does the system message matter so much to the agent’s performance?
- How can the agent’s voice and personality be changed without rewriting the workflow?
Review Questions
- What tool/function calls are triggered when a caller requests a specific time versus when they ask for available slots?
- Describe the end-to-end pipeline from recorded audio to a saved appointment in the database.
- Which extracted fields are required to fill the appointment schema, and how are they obtained from the transcript?
Key Points
1. The appointment workflow relies on real-time voice plus function calling to query a scheduling database for exact date/slot availability.
2. When a requested slot is unavailable, the agent switches to listing open slots and then re-checks availability after the caller chooses a time.
3. A practical pipeline records the call, transcribes it with Whisper, and uses structured outputs to extract appointment fields for database storage.
4. A detailed system message enforces consistent behavior: correct tool selection, required data collection, and read-back confirmation of date and time.
5. Voice and personality can be swapped via real-time API voice configuration while keeping the same booking logic.
6. Special requirements and some extracted details may still be imperfect, so additional testing is needed before production use.