Liang-Yuan “Leo” Wu 吳兩原
I am currently working with Prof. Hua Shen at New York University Shanghai and with Prof. Dhruv “DJ” Jain and the Soundability Lab at the University of Michigan. I hold a Master's degree in Computer Science & Engineering from the University of Michigan and a Bachelor's degree in Electrical Engineering from National Taiwan University.
My research lies at the intersection of Speech and Language Processing and Human-Computer Interaction, with a focus on building human-centered AI systems for sound and speech understanding. I work closely with Deaf and Hard of Hearing (DHH) communities to explore how people perceive, trust, and interact with audio AI systems.
My recent focus includes:
- Verbal sounds (Speech Understanding): Developing adaptive captioning and automatic speech recognition (ASR) systems to improve accessibility in real-world communication. [CARTGPT, MedCaption, EvolveCaptions]
- Non-verbal sounds (Audio Scene Understanding): Leveraging multimodal models to interpret complex environmental and non-verbal sounds, often with human-in-the-loop design, to make auditory environments understandable and actionable for users. [SoundNarratives, SoundWeaver, CapTune]
- Emotional cues in speech and sound: Investigating how emotional content in verbal and non-verbal audio is perceived and processed by recent multimodal models.
I bring hands-on experience in deep learning for speech and audio, full-stack system development, and human-centered evaluation. My work bridges algorithmic advances and mixed-methods user studies to ensure AI systems are not only technically strong but also practically meaningful.
I am applying to Fall 2026 CS PhD programs, focusing on sound and human-centered AI for social good. I am always excited to discuss my research and potential collaborations, so please feel free to reach out!