VOICE · REAL-TIME AGENT
Chinese real-time voice assistant, in production.
End-to-end STT → LLM → TTS at ~600 ms TTFA, with barge-in and multi-turn context. Built on Aliyun ASR + Qwen + CosyVoice TTS. Already deployed across 3 verticals.
LIVE DEMO
Try the voice demo
Talk to it directly in your browser.
All three verticals are clickable — restaurant, salon, clinic. Real pipeline, not a mock.
~ 600 ms TTFA (Time To First Audio)
< 200 ms Barge-in response
zh · yue · en Languages
99.5%+ Production recognition accuracy
One foundation, three live verticals.
Each scenario shipped 0-to-production in 4–6 weeks. Give us a new vertical and your domain corpus, and our forward-deployed engineering (FDE) team replicates the same cadence.
🍽
Front-of-house · reservations
Restaurant Booking
- Restaurant info / hours / cuisine Q&A
- Multi-party · multi-slot table booking
- Special needs (high chair · allergies · private rooms)
- Reschedule / cancellation flow
💇
Stylist matching · scheduling
Salon Appointment
- Stylist recommendation (by style · price · history)
- Service duration + bundling
- Returning-customer identity check
- Reschedule / waitlist + auto-reminders
🏥
Triage · appointments
Clinic Intake
- Symptom → department triage
- Doctor schedule lookup + booking
- Insurance / self-pay / health-check packages
- Follow-up reminders + prep instructions
What's underneath.
STT
Aliyun Paraformer (real-time)
Mandarin + Cantonese + English mix
LLM
Qwen-Max (default)
Swap to GPT / Claude / domestic models
Scenario prompt library + tool calling
TTS
Aliyun CosyVoice
Voice selection · emotion
TRANSPORT
LiveKit Cloud Agent
WebRTC + SSE edge token
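For readers who want to picture how the stages fit together, here is a minimal sketch of one streamed reply turn with barge-in and a TTFA measurement. It is an illustration under stated assumptions, not the production code: `fake_llm`, `fake_tts`, and the `Turn` class are hypothetical stand-ins, and no Aliyun, Qwen, CosyVoice, or LiveKit API is called. The LLM is passed in as a parameter to mirror the swappable-model design listed above.

```python
import asyncio
import time
from typing import AsyncIterator, List, Optional


# Hypothetical stand-ins for the real streaming LLM and TTS services.
async def fake_llm(transcript: str) -> AsyncIterator[str]:
    """Yield reply tokens one at a time, as a streaming LLM would."""
    for token in ["Sure,", " a table", " for two", " at 7pm."]:
        await asyncio.sleep(0.01)
        yield token


async def fake_tts(text: str) -> bytes:
    """Pretend to synthesize one audio chunk for a text fragment."""
    await asyncio.sleep(0.01)
    return text.encode("utf-8")


class Turn:
    """One assistant reply that can be cancelled mid-stream (barge-in)."""

    def __init__(self, llm=fake_llm, tts=fake_tts):
        self.llm = llm            # swappable: any async token generator
        self.tts = tts            # swappable: any async text -> audio function
        self.chunks: List[bytes] = []
        self.ttfa_ms: Optional[float] = None
        self._task: Optional[asyncio.Task] = None

    async def _run(self, transcript: str) -> None:
        start = time.monotonic()  # the user has just finished speaking
        async for token in self.llm(transcript):
            audio = await self.tts(token)
            if self.ttfa_ms is None:
                # Time To First Audio: end of speech to first audio chunk.
                self.ttfa_ms = (time.monotonic() - start) * 1000
            self.chunks.append(audio)

    def start(self, transcript: str) -> asyncio.Task:
        """Begin generating the reply in the background."""
        self._task = asyncio.create_task(self._run(transcript))
        return self._task

    async def barge_in(self) -> None:
        """The user started talking again: stop generating and speaking."""
        if self._task is not None and not self._task.done():
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass


async def demo() -> None:
    turn = Turn()
    turn.start("Book a table for two tonight")
    await asyncio.sleep(0.025)    # the user interrupts partway through
    await turn.barge_in()
    print(f"chunks played before barge-in: {len(turn.chunks)}")


if __name__ == "__main__":
    asyncio.run(demo())
```

Synthesizing audio per token rather than per full reply is what keeps first-audio latency low in this sketch, and running the turn as a cancellable task is one simple way to make barge-in take effect at the next await point.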
YOUR SCENARIO
Want this pipeline on your own business?
Common adaptations: support hotlines · outbound sales · post-sale follow-up · industry advisory hotlines. Our FDE team ships a new vertical in 4–6 weeks.