
Alpha School's AI Avatar Tutors

Real-Time Conversational AI Tutors That Outperform HeyGen

TL;DR

01

Built a fully proprietary real-time conversational AI avatar system from scratch, outperforming HeyGen's capabilities at the time of development

02

Engineered a custom, vendor-agnostic phoneme-to-viseme lip-sync pipeline on Azure Cognitive Services (audio in, facial expressions out), enabling lifelike speech animation for cartoon avatars

03

Deployed across multiple Alpha School products including AskElle and DreamLauncher, with a live interactive demo at personas.alpha.school featuring selectable historical figure personas

The Challenge

Alpha School runs on a radical premise: students spend just two hours per day on AI-driven core instruction, then own the rest of their time for passion projects, physical activity, and self-directed learning. To make that model work, the AI doing the teaching has to be extraordinary. It can't feel like a chatbot reading from a script. It has to feel like a tutor who knows the student, responds naturally, and keeps them engaged.

Existing avatar solutions weren't up to the task. HeyGen and similar platforms offered pre-rendered video loops with limited interactivity. They couldn't hold a real conversation, adapt to a student's current emotional state, or respond dynamically to what was happening in a lesson. For Alpha's vision of AI tutors that millions of students would interact with daily, these tools were a dead end.

Alpha needed a fully custom, real-time conversational avatar system. One that could be integrated into any product across their ecosystem, support thousands of simultaneous student sessions, and deliver the kind of lifelike, responsive interaction that makes students forget they're talking to software.

The technical bar was high. Real-time lip-sync for cartoon avatars is a hard problem. Natural-sounding, emotionally expressive AI voice is a hard problem. Building all of it into a scalable, multi-product platform, while shipping fast enough to keep pace with Alpha's weekly release cadence, made it harder still.

Key Results

01

Outperformed HeyGen on real-time interactivity at time of build

02

Supports thousands of simultaneous avatar sessions

03

Multi-language and multi-resolution support across all devices

04

Live across AskElle and DreamLauncher with full educational context integration

05

Vendor-agnostic lip-sync pipeline enabling seamless TTS provider migration

The Solution

01

A Proprietary Avatar Engine Built From Scratch

Rather than licensing an off-the-shelf avatar platform, AE Studio built a full end-to-end proprietary system designed specifically for Alpha's needs. This gave Alpha complete control over the technology: no vendor dependencies, no feature ceilings, and no licensing constraints as they scaled.

The result is a cartoon-style avatar engine capable of real-time conversational interaction. Students can ask questions mid-lesson, receive immediate responses, and experience dialogue that adapts to what they've said and what the system knows about them. The avatars aren't playing back pre-recorded segments; they're generating responses and animating in real time.

02

Custom Lip-Sync: Phoneme-to-Viseme Pipeline

The most technically demanding piece of the system is lip-sync. Making a cartoon avatar's mouth match spoken audio in real time (accurately, without lag, and across a wide range of TTS voices) requires a custom pipeline.

We built a phoneme-to-viseme engine on top of Microsoft Azure Cognitive Services. The pipeline takes audio as input and outputs the precise facial muscle states (blendshapes and frame positions) needed to animate the avatar's mouth and face accurately for each spoken sound.
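The core of such a pipeline is a mapping from timed phoneme events to visemes and then to blendshape weights. A minimal sketch of that idea, assuming hypothetical table contents and blendshape names (the real tables cover far more phonemes, and Alpha's actual mappings are not specified in this document):

```python
# Minimal sketch of a phoneme-to-viseme step: timed phoneme events from a
# TTS engine are mapped to viseme IDs and then to mouth blendshape weights.
# The tables and names below are illustrative, not Alpha's actual data.

# A reduced phoneme -> viseme table (real tables cover ~40 phonemes).
PHONEME_TO_VISEME = {
    "AA": "open",      # as in "father"
    "IY": "wide",      # as in "see"
    "UW": "round",     # as in "blue"
    "M": "closed",     # bilabial: lips together
    "B": "closed",
    "P": "closed",
    "F": "teeth",      # labiodental: lower lip to upper teeth
    "V": "teeth",
}

# Each viseme drives a set of facial blendshape weights (0.0-1.0).
VISEME_TO_BLENDSHAPES = {
    "open":   {"jawOpen": 0.8, "mouthClose": 0.0},
    "wide":   {"mouthStretch": 0.7, "jawOpen": 0.2},
    "round":  {"mouthPucker": 0.9, "jawOpen": 0.3},
    "closed": {"mouthClose": 1.0, "jawOpen": 0.0},
    "teeth":  {"mouthLowerDown": 0.6, "jawOpen": 0.1},
}

def phonemes_to_frames(events):
    """Turn (phoneme, offset_ms) timing events into timed blendshape frames."""
    frames = []
    for phoneme, offset_ms in events:
        viseme = PHONEME_TO_VISEME.get(phoneme, "closed")  # neutral fallback
        frames.append({"t_ms": offset_ms,
                       "blendshapes": VISEME_TO_BLENDSHAPES[viseme]})
    return frames

# Phoneme timing as a TTS engine might emit it for the word "move".
frames = phonemes_to_frames([("M", 0), ("UW", 120), ("V", 260)])
for f in frames:
    print(f["t_ms"], f["blendshapes"])
```

The animation runtime then interpolates between consecutive frames so mouth shapes blend smoothly rather than snapping from one viseme to the next.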

The architecture is vendor-agnostic by design. The lip-sync layer doesn't care what TTS engine is generating the audio. This meant we could later integrate ElevenLabs for higher-quality voice output (emotion tags, pacing control, style exaggeration, and custom voice cloning) without rebuilding the animation layer.
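One way to picture that boundary: the animation layer depends only on a narrow synthesis interface, so swapping Azure for ElevenLabs means writing one new adapter. A sketch under assumed names (none of these classes are from the actual codebase):

```python
# Sketch of a vendor-agnostic TTS boundary: the lip-sync layer sees only
# SpeechResult, never a vendor SDK. All names here are illustrative.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class SpeechResult:
    audio: bytes                       # raw audio for playback
    phonemes: list[tuple[str, int]]    # (phoneme, offset_ms) timing events

class TTSProvider(Protocol):
    def synthesize(self, text: str) -> SpeechResult: ...

class FakeProvider:
    """Stand-in for what an Azure or ElevenLabs adapter would implement."""
    def synthesize(self, text: str) -> SpeechResult:
        return SpeechResult(audio=b"\x00", phonemes=[("AA", 0)])

def animate(tts: TTSProvider, text: str) -> list[tuple[str, int]]:
    # The animation layer consumes only the provider-neutral result, so
    # migrating TTS vendors never touches this code path.
    return tts.synthesize(text).phonemes

print(animate(FakeProvider(), "hello"))
```

Because every provider adapter returns the same `SpeechResult` shape, the phoneme-to-viseme engine and everything downstream of it stay untouched during a migration.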

03

Expressive Voice: From Azure TTS to ElevenLabs

Early versions of the system used Azure Cognitive Services for text-to-speech. This worked, but the voices were recognizably synthetic: acceptable, not compelling.
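At this markup level, expressiveness is typically controlled with tags wrapped around the text. As an illustration, Azure's SSML exposes speaking-style and prosody elements like the ones below; the specific voice name and style values are examples, not Alpha's actual configuration:

```xml
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <!-- Speaking style and its intensity -->
    <mstts:express-as style="cheerful" styledegree="1.5">
      Great work on that problem!
    </mstts:express-as>
    <!-- Pacing control: slow down for the harder step -->
    <prosody rate="-10%">Now let's check the next step together.</prosody>
  </voice>
</speak>
```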

We built and validated a custom voice POC using ElevenLabs, which offers significantly more expressive output: emotion markers embedded in text, variable pacing, style intensity controls, and the ability to clone specific voices. For an educational context where student engagement depends on how the tutor sounds, this was a meaningful upgrade.

The voice cloning capability opens a particularly interesting design space. Alpha can create avatar tutors with distinct, consistent personalities: voices that feel like a specific character rather than a generic AI.

04

Multi-Persona Architecture: One Base, Infinite Characters

The avatar system is architected around a single base model that can be skinned into any number of distinct personas. This is visible in the live demo at personas.alpha.school, where visitors can switch between historical figures like Abraham Lincoln, each running on the same underlying avatar engine but presenting differently.

For Alpha, this means the same technical infrastructure supports tutors across subjects, grade levels, and product contexts. A math coach, a reading mentor, and a career counselor can all run on the same platform with distinct visual identities, voice styles, and instructional contexts.
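The key design idea is that a persona is data, not a separate system: a skin, a voice, and an instructional framing layered onto one shared engine. A minimal sketch, with all field names and persona entries invented for illustration:

```python
# Sketch of "one base model, many personas": each persona is configuration
# applied to the same engine. All names and values here are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Persona:
    name: str
    skin: str           # texture/rig variant loaded onto the base model
    voice_id: str       # TTS voice, e.g. a cloned character voice
    system_prompt: str  # instructional framing for the conversation layer

PERSONAS = {
    "lincoln": Persona("Abraham Lincoln", "lincoln_v2", "voice-lincoln",
                       "You are Abraham Lincoln, tutoring with patience."),
    "math_coach": Persona("Math Coach", "coach_v1", "voice-coach",
                          "You are an encouraging middle-school math coach."),
}

def start_session(persona_key: str) -> dict:
    p = PERSONAS[persona_key]
    # One engine, parameterized by persona data: adding a character means
    # adding a table entry, not building a new avatar system.
    return {"skin": p.skin, "voice": p.voice_id, "prompt": p.system_prompt}

print(start_session("lincoln")["skin"])
```

Expanding the tutor roster then reduces to authoring new persona entries, which is why the same infrastructure can serve a math coach, a reading mentor, and a career counselor.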

05

Seamless Integration Across Alpha's Product Ecosystem

The avatar system was designed as an embedded component, not a standalone product. It plugs into Alpha's existing courseware and lesson flows, gaining access to each student's learning context: their current unit, recent performance, skill gaps, and goals.

This integration is live in AskElle, Alpha's AI-powered question-and-answer companion, and DreamLauncher, Alpha's platform for helping students identify and pursue their passions. In both contexts, the avatar doesn't just respond to isolated questions; it incorporates the student's broader educational profile into every interaction.
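Conceptually, this kind of context injection folds the student's profile into every conversational turn before it reaches the language model. A sketch with hypothetical field names (the actual profile schema is not specified in this case study):

```python
# Sketch of educational-context injection: the student's profile is
# prepended to each turn so the tutor's answer reflects real performance.
# Field names and the prompt format are illustrative.
def build_prompt(profile: dict, question: str) -> str:
    context = (
        f"Student: grade {profile['grade']}, current unit: {profile['unit']}, "
        f"recent accuracy: {profile['recent_accuracy']:.0%}, "
        f"known gaps: {', '.join(profile['skill_gaps'])}."
    )
    return f"{context}\nStudent asks: {question}\nAnswer as their tutor."

prompt = build_prompt(
    {"grade": 6, "unit": "fractions", "recent_accuracy": 0.72,
     "skill_gaps": ["common denominators"]},
    "Why do we flip the second fraction when dividing?",
)
print(prompt)
```

Because the context travels with every turn, the tutor can reference what the student has been working on and calibrate the difficulty of its explanation rather than answering generically.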

06

Built to Scale: Thousands of Simultaneous Sessions

Alpha's ambition is to educate a billion children. The avatar infrastructure had to be architected with that scale in mind from day one.

The system supports thousands of simultaneous avatar sessions without degradation in response quality or latency. Multi-language support ensures accessibility across geographies. Multi-resolution rendering ensures consistent visual quality across the wide range of devices students use.

Advanced analytics run in parallel with every session, tracking interaction patterns, student response behaviors, and contextual signals that feed back into Alpha's broader personalization engine.

07

Outperforming HeyGen: The Benchmark That Mattered

When AE Studio began building the Alpha avatar system, HeyGen was the most visible avatar platform on the market. We benchmarked against it directly. At the time of development, HeyGen couldn't match what we built, particularly on real-time interactivity and the depth of conversational integration with educational context.

The gap wasn't a minor performance difference. HeyGen's architecture at the time was oriented around pre-rendered video, not live generative conversation. Alpha needed something fundamentally different, and that's what we delivered.

Results

The Full Story

AE Studio delivered a fully proprietary, real-time conversational AI avatar system that is now live across multiple Alpha School products. The system powers student interactions in AskElle and DreamLauncher, with a publicly accessible demo at personas.alpha.school demonstrating the full range of what the platform can do.

The avatar technology outperformed HeyGen, the leading commercial alternative at the time, on real-time interactivity, conversational depth, and integration with educational context. This wasn't a marginal improvement: the architectural difference between pre-rendered avatar video and true real-time generative conversation is fundamental.

The phoneme-to-viseme lip-sync pipeline delivers accurate, low-latency facial animation that holds up across a wide range of voices and speaking styles. The vendor-agnostic design allowed AE to migrate from Azure TTS to ElevenLabs for higher-quality voice output without rebuilding the animation layer, a decision that significantly improved the expressiveness and engagement quality of the tutoring experience.

The multi-persona architecture means Alpha can expand their tutor roster indefinitely without additional infrastructure work. The same base system that runs Abraham Lincoln on the demo site runs every subject-matter tutor across their product line.

The platform scales to thousands of simultaneous sessions with multi-language and multi-resolution support, infrastructure that matches Alpha's goal of reaching a billion students globally.

Conclusion

Alpha School's avatar tutors aren't a feature; they're the delivery mechanism for a new model of education. The goal is for every student to have a tutor that knows them, responds to them in real time, and keeps them engaged across two hours of daily intensive instruction.

Building that required building something that didn't exist. The proprietary avatar engine AE Studio delivered, with its custom lip-sync pipeline, expressive voice integration, multi-persona architecture, and deep product integration, is now the foundation Alpha's AI-education OS runs on. As Alpha pursues its ambition to educate a billion children, the avatar infrastructure scales with them.

Key Insights

1

Building proprietary rather than licensing gives AI-first companies the control they need to scale. Off-the-shelf avatar platforms impose feature ceilings that compound as the product grows.

2

Lip-sync is a harder problem than it looks. A phoneme-to-viseme pipeline that's vendor-agnostic from the start pays dividends when you need to swap TTS providers without rebuilding animation.

3

Voice quality is a meaningful lever for student engagement. Moving from generic TTS to emotionally expressive, stylistically controllable voice output changes how students experience the tutor.

4

A multi-persona architecture is the right abstraction. One base model that skins into infinite characters is far more scalable than building individual avatar systems per use case.

5

Real-time conversational avatars and pre-rendered video loops are fundamentally different products. For educational contexts that require adaptive, contextual interaction, only the former works.

6

Analytics integration from day one creates compounding value. Every session generates data that improves personalization, but only if the infrastructure captures it from the start.

Frequently Asked Questions

How does the system produce real-time conversation with lip-sync?

The system combines a real-time conversational AI layer with a custom animation engine. When a student speaks or submits input, the AI generates a response, passes the text to a TTS engine (currently ElevenLabs), and routes the resulting audio through a phoneme-to-viseme pipeline that translates each spoken sound into precise facial muscle positions for the cartoon avatar. This all happens in under a second, producing the appearance of natural, responsive conversation. The pipeline is vendor-agnostic: the lip-sync layer is decoupled from the TTS engine, which allows the underlying voice technology to be upgraded without rebuilding the animation system.

Why build a proprietary system instead of licensing HeyGen?

HeyGen and similar platforms are designed around pre-rendered video, not real-time generative conversation. For Alpha's use case (tutors that adapt dynamically to each student's current lesson, performance history, and conversational context), pre-rendered video is a dead end: you can't pre-render every possible thing a student might say. Building proprietary also gave Alpha full control over the technology stack: no licensing dependencies, no feature constraints imposed by a third-party roadmap, and no ceiling on how the system can evolve as Alpha's product grows.

Where is the avatar system live today?

The avatar system is integrated into AskElle, Alpha's AI-powered student Q&A companion, and DreamLauncher, Alpha's passion and career exploration platform. A publicly accessible demo is available at personas.alpha.school, where users can interact with multiple historical figure personas, such as Abraham Lincoln, all running on the same underlying avatar engine.

How does the avatar personalize its responses to each student?

Because the avatar is embedded directly into Alpha's product ecosystem rather than running as a standalone tool, it has access to each student's educational profile: their current unit, recent assessment results, skill gaps, learning pace, and goals. This context is passed into every conversational interaction, allowing the avatar to reference what the student has been working on, adjust the difficulty and framing of explanations, and provide feedback that reflects actual performance rather than generic responses.

Does the platform support multiple languages and device types?

Yes. The platform includes multi-language support to serve Alpha's global student population, and multi-resolution rendering to maintain visual quality across the range of devices students use, from tablets to lower-powered hardware in under-resourced schools. The architecture was designed for global scale from the start, supporting thousands of simultaneous sessions without degradation in latency or response quality.

Published: Jan 2026 · Last updated: Feb 2026

