Case Studies

AI Voice Agent - Queensland Automotive Group

Voice Agents · Workflow Automation

Problem

A 400-staff dealership group in Queensland was losing inbound service calls. During business hours, staff were occupied with routine booking enquiries. After hours, calls went entirely unanswered.

Solution

A production AI voice agent handles inbound service booking calls. Two configurations - business hours and after-hours - share a single notification pipeline.

The agent captures customer details, vehicle information, preferred dates, and service location through conversation. Structured data is extracted from the transcript and routed to the booking system.

  • Business hours: Routine booking calls. Transfers to staff for complex enquiries.
  • After-hours: Full booking capture. Structured notifications sent to staff for next-morning follow-up.

Results

  • 50+ calls handled per day
  • Under $500/month operating cost
  • After-hours coverage where none previously existed
  • Structured booking data extracted and routed from every call

Technical Challenges

Integration

The voice AI is a managed service. The complexity is in connecting it to an existing phone system, CRM, booking workflow, and notification pipeline.

SIP bridging between the phone system and the AI platform. Time-based call routing via Twilio. Webhook-driven notifications through n8n. A separate LLM extraction step converting conversational transcripts into structured data.
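The time-based routing decision can be sketched as a pure function. This is an illustrative Python sketch, not the production configuration - the real system makes this decision inside Twilio, and the agent identifiers and opening hours here are assumptions:

```python
from datetime import datetime, time

# Hypothetical agent identifiers; the real system holds these in Twilio/n8n config.
BUSINESS_AGENT = "agent_business_hours"
AFTER_HOURS_AGENT = "agent_after_hours"

def route_call(now: datetime,
               open_time: time = time(8, 0),
               close_time: time = time(17, 30)) -> str:
    """Pick the voice-agent configuration for an inbound call.

    Weekday calls inside opening hours go to the business-hours agent
    (which can transfer to staff); everything else goes to the
    after-hours agent (full booking capture, next-morning notification).
    """
    is_weekday = now.weekday() < 5  # Mon=0 .. Fri=4
    in_hours = open_time <= now.time() < close_time
    return BUSINESS_AGENT if (is_weekday and in_hours) else AFTER_HOURS_AGENT
```

Keeping the rule this small is what lets both configurations share one notification pipeline: only the agent selection differs, not the downstream plumbing.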

Production Debugging

Failures span telephony (Twilio/SIP), speech recognition, and the LLM simultaneously. Reproduction in test environments is not possible - diagnosis is done from production transcripts.

Examples encountered:

  • SIP transfer failures where the agent read a phone number aloud rather than transferring
  • Double transfers creating 3-way calls
  • Speech recognition failing on "Capalaba" - resolved with a custom pronunciation dictionary
  • After-hours notification emails delayed 1.5 hours due to webhook timing

Latency and Model Selection

End-of-turn detection and response latency are the primary factors in conversational quality. The critical metric is time to first token - once generation begins, output keeps pace with text-to-speech. The initial pause is where the experience degrades.

Smaller language models produce faster responses but follow instructions less reliably and handle tool calling poorly. The architectural decision was to remove tool calling from the conversation entirely. Pre-call webhooks inject context. Post-call extraction captures structured data. The model's only responsibility during the call is conversation.
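The post-call extraction step can be sketched as a single completion against the transcript. This is an illustrative Python sketch under assumptions - the field names are invented for the example, and `complete` stands in for whatever text-in/text-out LLM call the pipeline uses:

```python
import json
from typing import Callable

# Fields the booking workflow needs; the names here are illustrative.
EXTRACTION_PROMPT = """Extract the booking details from this service-call transcript.
Return JSON with exactly these keys:
customer_name, phone, vehicle, preferred_date, location.
Use null for anything the caller did not provide.

Transcript:
{transcript}"""

def extract_booking(transcript: str, complete: Callable[[str], str]) -> dict:
    """Post-call extraction: one LLM completion per finished transcript.

    `complete` is any text-in/text-out model call. Because extraction runs
    after the call, the conversational model never has to do tool calling.
    """
    raw = complete(EXTRACTION_PROMPT.format(transcript=transcript))
    data = json.loads(raw)
    required = {"customer_name", "phone", "vehicle", "preferred_date", "location"}
    missing = required - data.keys()
    if missing:
        raise ValueError(f"extraction missing fields: {sorted(missing)}")
    return data
```

The design point is the separation itself: extraction can use a slower, more capable model and can be retried on failure, with no effect on conversational latency.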

Date Calculation

The agent provides callers with the earliest available booking date - 2 business days ahead, excluding weekends and public holidays. Edge cases across holidays, far-future dates, and Saturday cut-offs required a pre-call webhook that calculates the date and injects it as a dynamic variable before each conversation.
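The business-day arithmetic the webhook performs looks roughly like this. A minimal Python sketch, assuming a placeholder holiday set - production would source the Queensland public-holiday calendar rather than hard-code dates:

```python
from datetime import date, timedelta

# Placeholder holiday set for illustration; the real webhook would load
# the current Queensland public-holiday calendar.
PUBLIC_HOLIDAYS = {date(2025, 1, 1), date(2025, 1, 27)}

def earliest_booking_date(today: date, lead_days: int = 2) -> date:
    """Advance `lead_days` business days, skipping weekends and public holidays."""
    d = today
    remaining = lead_days
    while remaining > 0:
        d += timedelta(days=1)
        if d.weekday() < 5 and d not in PUBLIC_HOLIDAYS:
            remaining -= 1
    return d
```

Computing this before the call and injecting the result as a dynamic variable means the model only ever reads a date out, never calculates one - which is exactly the class of task small, fast models get wrong.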

Daily Reporting

Automated daily reports are delivered to the team covering call volume, booking completion rates, transfer rates, customer sentiment, and data capture rates across business hours and after-hours periods. No manual effort required - the reporting pipeline runs from the same structured data the system already produces.
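Because every call already yields a structured record, the report is a straightforward aggregation. An illustrative Python sketch with assumed field names (`booking_complete`, `transferred`, `period`) - the production pipeline runs in n8n:

```python
from collections import Counter

def daily_report(calls: list[dict]) -> dict:
    """Aggregate one day's structured call records into team-facing metrics.

    Each record is the per-call output the system already produces;
    the field names here are assumptions for the example.
    """
    total = len(calls)
    if total == 0:
        return {"total_calls": 0}
    booked = sum(1 for c in calls if c.get("booking_complete"))
    transferred = sum(1 for c in calls if c.get("transferred"))
    by_period = Counter(c.get("period", "unknown") for c in calls)
    return {
        "total_calls": total,
        "booking_rate": round(booked / total, 2),
        "transfer_rate": round(transferred / total, 2),
        "calls_by_period": dict(by_period),
    }
```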

Continuous Improvement

The system has a built-in feedback mechanism. When the agent cannot handle a call, it transfers to staff. Each transfer produces two data points: where the agent reached its limit, and how the staff member resolved the enquiry.

This is a complete feedback loop without additional implementation. The system captures what it could not do and how it should have been done. As the underlying technology improves - faster models, better turn detection, improved speech recognition - the percentage of calls handled without transfer increases.

Stack

ElevenLabs Conversational AI, Twilio SIP, n8n, custom LLM extraction pipeline, Podium API.

Next

Fantasy Faces - AI Portrait Product

Full SaaS product with per-user model fine-tuning, Stripe payments, and async ML pipeline.

Relevant to a problem you're working on?

The first conversation is free.