The Influence of Narrative Specificity and Voice Quality When Listening to Audio Descriptions:

A Comparison of the Sighted and the Blind

Authors

DOI:

https://doi.org/10.47476/jat.v7i2.2024.314

Keywords:

audio description, voice quality, narrative specificity, spatiotemporal language, mental imagery

Abstract

Audio description (AD) serves as a vital means to make visual media accessible to non-sighted and visually impaired audiences. This study systematically investigates the impact of narrative specificity and voice quality on imageability and comprehension in both sighted and non-sighted populations. Twenty non-sighted participants, including congenitally blind individuals and those who lost their sight early in life, were compared with a group of 20 sighted participants, matched for verbal working memory capabilities. Participants listened to 50 short event descriptions, describing spatiotemporal relations with varying levels of narrative specificity, presented in both typical and dysphonic voices. After each event description, participants rated their ability to imagine the content, overall comprehension, listening effort, and listening enjoyment. Results indicate that high narrative specificity enhanced imageability in non-sighted individuals, especially for scenarios involving changes in motion, and, to some extent, for visuospatial relations, irrespective of sightedness. Additionally, dysphonic voices increased listening effort and reduced enjoyment for non-sighted participants only. These findings underscore the importance of considering voice quality and narrative specificity in AD for non-sighted users and have implications for both professional audio describers and the development of automated AD systems.

Lay summary

Imagine you're watching a movie but instead of seeing the scenes, you're listening to someone describe them to you. This is what audio description (AD) does - it makes movies, TV shows, and other visual media accessible for people who can't see or have trouble seeing. But not all descriptions are created equal. Think about the difference between someone telling you "a person walks into a room" versus "a tall, anxious man in a red shirt bursts into a sunlit, cluttered room, glancing over his shoulder." The second description paints a much clearer picture in your mind, doesn't it?

This study looked at how specific these descriptions are (like our detailed scene above) and the quality of the voice telling the story, to see how they affect people's ability to imagine and understand what's being described. We worked with 40 people - half of whom have never been able to see or lost their sight when they were very young, and the other half who can see. Everyone listened to 50 short stories about different events. These stories varied in how detailed they were and were told in either a clear voice or a voice that was hard to listen to (a hoarse voice).

After hearing each story, participants judged how well they could picture what was described, how well they understood it, how hard they had to work to listen, and how much they enjoyed listening. The results were interesting. When the stories were really detailed, people who couldn't see were better able to "see" the action in their minds, especially when the story involved changes in movement. For stories about where things were located, the extra detail helped to some extent for everyone, regardless of whether they could see or not. But when the voice telling the story was hard to listen to, it made it tougher for people who couldn't see to follow along and enjoy the story.

What this tells us is that for audio descriptions to be really helpful and enjoyable for everyone, especially those who rely on them, it's important to choose not only the right words but also the right voice. This insight is valuable not just for people who create audio descriptions but also for developing technology that can automatically generate them. Making movies and TV shows more enjoyable and accessible for everyone is the goal, and we hope that this study helps get us there.

Declaration of use of AI: ChatGPT4 was used to construct this lay text based on the abstract of the original manuscript.

Author Biographies

Viveka Lyberg-Åhlander, Åbo Akademi University

Viveka Lyberg Åhlander is a registered speech pathologist and professor of speech-language pathology at Åbo Akademi University in Turku, Finland and at Lund University in Sweden.

Her research focuses on communication and the acoustic environment in educational settings, emphasizing teacher vocal health and student listening effort. She investigates how classroom acoustics, background noise, and vocal load affect teaching and learning. Her intervention studies assess the impact of teacher training on both educator and student well-being and language development. She is currently heading a research project on “the communication supporting workplace” funded by the Swedish Research Council for Health, Working Life and Welfare and one research project on “Singing health in Schools, a Societal matter” along with a twin project on Singing health in bilingual pupils in Finland.

Jana Holsanova, Lund University

Jana Holsanova is Associate Professor in Cognitive Science at Lund University, Sweden. Her research focuses on audio description, multimodality, cognition and communication. She is currently heading a project on Audio description and accessible information funded by TSI Lund University (2021-2024) and, together with Roger Johansson, a research project on reception of audio description (2019-2021). She is Chair of The Swedish Braille Authority, Swedish Agency for Accessible Media (MTM) and Coordinator of the initiative "Audio Description for Accessible Communication". She is the author of Discourse, Vision and Cognition (2008), Myths and Truths About Reading (2010), Image description for accessibility (2019) and editor of Methodologies for Multimodal Research (2012) and Audio description - Research and Practices (2016).

Roger Johansson, Lund University

Roger Johansson is an Associate Professor in Psychology, Lund University. His research expertise revolves around cognition, communication and learning, with a special focus on the relationship between eye movements, mental imagery, episodic memory, event cognition and narrative processing. In his research, he has further engaged in methodological development for investigating such topics using eye-tracking and pupillometry techniques. Together with Jana Holsanova he has been heading a research project investigating principles that underlie successful communication between the sighted and the blind, with a specific focus on audio description (funded by the Swedish Research Council for Health, Working Life and Welfare, grant no. 2019-2021).

Published

2024-12-19

How to Cite

Lyberg-Åhlander, V., Holsanova, J., & Johansson, R. (2024). The Influence of Narrative Specificity and Voice Quality When Listening to Audio Descriptions: A Comparison of the Sighted and the Blind. Journal of Audiovisual Translation, 7(2), 1–25. https://doi.org/10.47476/jat.v7i2.2024.314