SakuraSensei: Japanese Conversational AI Tutor
Read the full blog post about building SakuraSensei
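SakuraSensei is a context-aware Japanese tutor bot for Telegram, built with LangChain. It combines a custom persona, persistent memory, retrieval-augmented generation (RAG) over several datasets (JLPT, JMDICT, Tatoeba, JaSquad), multi-agent news explanation, and cloze-question generation from YouTube audio transcribed with Whisper and voice activity detection (VAD).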
Technologies Used
AI & Language Models:
- LangChain for conversation orchestration
- Custom persona and memory persistence
- Multi-agent architecture
Data Sources & RAG:
- JLPT vocabulary and grammar datasets
- JMDICT (Japanese-English dictionary)
- Tatoeba example sentences
- JaSquad question-answering dataset
Audio Processing:
- Whisper for speech recognition
- Voice Activity Detection (VAD)
- YouTube audio extraction
Platform:
- Telegram Bot API
- Python backend
Key Features
Conversational Learning:
- Context-aware conversations in Japanese
- Personalized learning experience with memory
- Custom AI persona for engaging interactions
Multi-Dataset RAG:
- Retrieval-augmented generation from multiple Japanese learning resources
- Content matched to the learner's JLPT level
- Example sentences and definitions
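To make the multi-dataset retrieval idea concrete, here is a minimal sketch using only the Python standard library. It is not SakuraSensei's actual retriever: the `Entry` type, the `retrieve` function, the character-bigram scoring, and the toy corpus are all assumptions made for illustration, and a real system would more likely use embeddings and a vector store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    source: str      # dataset label only; no real dataset is loaded here
    text: str
    jlpt_level: int  # 5 = N5 (easiest) ... 1 = N1 (hardest)


def char_bigrams(text: str) -> set[str]:
    """Character bigrams, a crude stand-in for tokenizing unsegmented Japanese."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def retrieve(query: str, corpora: dict[str, list[Entry]],
             learner_level: int, k: int = 3) -> list[Entry]:
    """Merge hits from every corpus, skipping entries harder than the learner's level."""
    query_grams = char_bigrams(query)
    scored = []
    for entries in corpora.values():
        for entry in entries:
            if entry.jlpt_level < learner_level:  # lower number = harder content
                continue
            overlap = len(query_grams & char_bigrams(entry.text))
            if overlap:
                scored.append((overlap, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:k]]


# Toy data written for this example (hypothetical, not taken from the real datasets).
TOY_CORPORA = {
    "JMDICT": [Entry("JMDICT", "好き (suki): liking, fondness", 5)],
    "Tatoeba": [Entry("Tatoeba", "私は猫が好きです。", 5)],
    "JLPT": [Entry("JLPT", "猫が好きなのに飼えない。", 2)],
}

for hit in retrieve("猫が好き", TOY_CORPORA, learner_level=4):
    print(hit.source, hit.text)
```

Filtering by level before ranking keeps an N4 learner from being shown the N2 sentence, even though it matches the query just as well as the easier one.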
News Explanation:
- Multi-agent system for explaining Japanese news
- Breaking down complex articles into learnable content
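One way such a multi-agent flow could be wired is sketched below. The source does not say which agents SakuraSensei uses or how they communicate, so the two agents, their prompts, and the `explain_news` coordinator are hypothetical. The language model is passed in as a plain callable so the sketch runs without any external service.

```python
from typing import Callable

# Assumption: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def summarizer_agent(article: str, llm: LLM) -> str:
    """Hypothetical agent that rewrites the article in simpler Japanese."""
    return llm(f"Summarize this news article in simple Japanese:\n{article}")


def vocabulary_agent(article: str, llm: LLM) -> str:
    """Hypothetical agent that lists difficult words with readings and glosses."""
    return llm(f"List difficult words with readings and English glosses:\n{article}")


def explain_news(article: str, llm: LLM) -> dict[str, str]:
    """Coordinator: runs each agent on the article and collects the results."""
    return {
        "summary": summarizer_agent(article, llm),
        "vocabulary": vocabulary_agent(article, llm),
    }


def echo_llm(prompt: str) -> str:
    """Stand-in model for local testing; returns only the instruction line."""
    return prompt.splitlines()[0]


print(explain_news("日本銀行は金利を据え置いた。", echo_llm))
```

Keeping each agent as a small function with an injected model makes it easy to test the coordination logic separately from the quality of the model's output.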
Interactive Quizzes:
- Cloze-question generation from YouTube videos
- Automated question creation using Whisper transcription
- Real-time practice materials
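The cloze step can be illustrated with a small standard-library sketch. It assumes the transcript has already been produced (for example by Whisper) and that a list of target vocabulary is available. The `ClozeQuestion` type, the `make_cloze_questions` function, and the longest-match rule are illustrative choices, not the project's confirmed method.

```python
import re
from dataclasses import dataclass

BLANK = "＿＿＿"


@dataclass
class ClozeQuestion:
    prompt: str
    answer: str


def make_cloze_questions(transcript: str, vocabulary: list[str]) -> list[ClozeQuestion]:
    """Blank out one known vocabulary word per sentence of the transcript."""
    # Split after Japanese sentence-final punctuation, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[。！？])", transcript) if s.strip()]
    questions = []
    for sentence in sentences:
        candidates = [word for word in vocabulary if word in sentence]
        if not candidates:
            continue  # no target word in this sentence, so no question
        answer = max(candidates, key=len)  # prefer the longest matching word
        prompt = sentence.replace(answer, BLANK, 1).strip()
        questions.append(ClozeQuestion(prompt, answer))
    return questions


# Toy transcript and word list written for this example.
for q in make_cloze_questions("今日は天気がいいです。公園に行きましょう。", ["天気", "公園", "学校"]):
    print(q.prompt, "/", q.answer)
```

Plain substring matching can blank part of a longer word. A morphological analyzer would give cleaner word boundaries, at the cost of an extra dependency.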
Memory & Persistence:
- Conversation history tracking
- User progress monitoring
- Personalized learning paths
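As a rough illustration of per-user conversation persistence, the sketch below stores messages in SQLite via Python's built-in `sqlite3` module. The source does not describe SakuraSensei's storage layer, so the `ConversationStore` class, its table schema, and its method names are hypothetical.

```python
import sqlite3


class ConversationStore:
    """Hypothetical per-user message log backed by SQLite."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "user_id INTEGER NOT NULL, "
            "role TEXT NOT NULL, "
            "content TEXT NOT NULL)"
        )

    def append(self, user_id: int, role: str, content: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
                (user_id, role, content),
            )

    def recent(self, user_id: int, limit: int = 10) -> list[tuple[str, str]]:
        """Return the last `limit` messages for a user, oldest first."""
        rows = self.conn.execute(
            "SELECT role, content FROM messages "
            "WHERE user_id = ? ORDER BY id DESC LIMIT ?",
            (user_id, limit),
        ).fetchall()
        return rows[::-1]


store = ConversationStore()
store.append(1, "user", "こんにちは")
store.append(1, "assistant", "こんにちは！今日は何を勉強しますか。")
print(store.recent(1))
```

Passing a file path instead of `":memory:"` keeps the history across restarts, and the recent messages can be fed back into the prompt so the bot has context from earlier turns.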