SoundNarratives: Rich Auditory Scene Descriptions to Support Deaf and Hard of Hearing People

University of Michigan
ACM SIGACCESS 2025
Teaser figure

SoundNarratives delivers semantically rich auditory scene descriptions to enhance sound awareness for deaf and hard of hearing individuals. Unlike existing approaches that only provide sound event labels (dotted-line box), our system offers more detailed descriptions across multiple sound parameters (solid-line box), enabling users to better engage with their surroundings.

Abstract

Sound recognition enhances safety, social interaction, and situational awareness for deaf and hard of hearing (DHH) individuals. However, existing sound recognition technologies primarily classify sounds into predefined categories (e.g., door opening, speech), which fail to capture the full complexity of real-world auditory scenes (e.g., temporal variations, sound transitions, overlapping sound layers). In this work, we introduce SoundNarratives, a real-time system that generates rich, contextual auditory scene descriptions tailored to DHH users. We began by conducting a formative study with 10 DHH participants to identify nine key auditory scene parameters (e.g., sound class, loudness, emotion, semantic description), and used these insights to guide prompt engineering with a state-of-the-art audio language model. A user study with 10 DHH participants demonstrated a significant preference for SoundNarratives over a baseline model, as well as the potential for improved confidence and situational awareness.

BibTeX

TBD