EvolveCaptions: Real-Time Collaborative ASR Adaptation for DHH Speakers

University of Michigan
ACM SIGACCESS 2025
Teaser figure

Overview of EvolveCaptions. (1) Hearing users correct live captions of the DHH speaker’s voice. (2) The DHH speaker records targeted phrases generated from the corrected terms. (3) The Whisper ASR model is fine-tuned with the recordings and adapts to the speaker over time.

Abstract

Current automatic speech recognition (ASR) systems struggle to reliably recognize the speech of Deaf and Hard of Hearing (DHH) individuals, particularly in real-time communication. Existing personalization methods typically require extensive pre-recorded data and place the entire burden on DHH users. We present EvolveCaptions, a live ASR adaptation system that supports collaborative, in-the-moment personalization. Hearing participants correct ASR errors during conversation, and the system generates short, phonetically relevant phrases for the DHH speaker to record. These recordings are then used to iteratively fine-tune the ASR model. In a preliminary evaluation, EvolveCaptions reduced the word error rate (WER) from 0.53 to 0.27 over four adaptation rounds with minimal user effort. This work introduces a low-effort, socially collaborative method for adapting ASR to diverse DHH voices in real-world settings.
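The reported improvement (0.53 to 0.27) is measured in word error rate: the word-level edit distance (substitutions + deletions + insertions) between the ASR hypothesis and a reference transcript, normalized by the reference length. As background only, a minimal sketch of how WER is computed (illustrative, not the paper's evaluation code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words (standard Levenshtein DP).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion over six words
```

In practice, libraries such as jiwer implement this with additional text normalization; the point here is only what the 0.53 and 0.27 figures quantify.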

Video Presentation

BibTeX

TBD