Andrew Koh

I am a Machine Learning Engineer and Researcher with a Ph.D. in Computer Science from Nanyang Technological University, Singapore, specializing in audio and language understanding. While my primary focus has been leveraging deep learning to build cross-modal systems for audio captioning, retrieval, and acoustic scene analysis, I maintain a broad interest in the wider AI landscape, staying current with research in computer vision, generative modeling, and reinforcement learning to bring a versatile perspective to my work.
Alongside research, I build applied AI products end-to-end and take on independent and freelance engineering work, including web development and automation tools for small teams and businesses. I enjoy the challenge of identifying a gap and building the product needed to fill it, making sure the final tool is as useful as it is technically sound.
Projects
SakuraSensei: Japanese Conversational AI Tutor (2025)
Context-aware Japanese Telegram bot built with LangChain: custom persona, memory persistence, multi-dataset RAG (JLPT, JMDICT, Tatoeba, JaSquad), multi-agent news explanation, and cloze-question generation from YouTube videos via Whisper + VAD.
FaceChangerGIFBot: Face Swap for GIFs and Clips (2025)
Real-time face swap Telegram bot using ONNX inference. Stripe-integrated subscriptions, content moderation, Cloudflare Tunnel webhooks.
Acoustic Event Detection for Emergency Telephony (2023–2024)
Streaming audio event-detection with custom PyTorch model. Nine acoustic classes, log-mel extraction, VAD, caption generation (SRT/XML/JSON), Dockerized deployment.
Hand Pose Estimation for Rheumatoid Arthritis (2018)
Region Ensemble Network in PyTorch + dlib. Real-time Kinect tracking, geometric augmentation.
Blog
- Building SakuraSensei: Notes From a Japanese Learning Experiment
What started as a joke about my friend became a lesson in product thinking, user behavior, and the quiet satisfaction of finishing something.
- What my Second PhD Paper Taught Me
A personal reflection on an APSIPA paper, focusing on the pressures, constraints, and lessons that shaped the work beyond what appeared in the final publication.
- Building FaceChangerGIFBot: Reflections on Shipping a Side Project
What started as a joke about my friend became a lesson in product thinking, user behavior, and the quiet satisfaction of finishing something.
- The Hidden Struggles of My First Doctoral Paper
A reflective look back at my first published paper at ICASSP 2022. The post moves beyond technical results to discuss the realities of PhD life: GPU shortages, the shift from architecture design to empirical iteration, and the personal motivation behind researching audio tools as a hearing-impaired individual.
Publications
Full list on Google Scholar.
A. Koh, E. S. Chng. APSIPA ASC 2022. DOI
A. Koh, S. Tiwari, E. S. Chng. APSIPA ASC 2022. DOI
A. Koh, F. Xue, E. S. Chng. ICASSP 2022. DOI
B. P. Yap, A. Koh, E. S. Chng. EMNLP 2020 Findings. ACL
Other Work
Telegram Bottle Management System
Engineered an automated workflow to streamline operational processes for Rabbit Hole, significantly improving internal efficiency via Telegram API integration.
Product Launch Landing Page
Developed a conversion-focused landing page with a "rebellious" aesthetic using Tailwind CSS, and integrated the Stripe API for secure payment processing.
Enterprise WordPress Development & Maintenance
Lead developer for high-traffic organizational websites, focusing on security, performance optimization, and UI/UX consistency.