Brain Bridge researchers find, for the first time, an intrinsic memorability to voices
In the first study of auditory memorability, researchers at the Brain Bridge Laboratory found that participants were significantly consistent in which voice clips they remembered, and in their memory for speakers across different utterances. The team was also able to reliably predict, from quantifiable voice features, which voices listeners would remember. Their findings have been published in Nature Human Behaviour.
The lab previously showed that people tend to be very similar in which images they remember and forget, so much so that researchers can predict a person's memory for an image from that image alone when it is fed into a neural network. The researchers became curious whether the same effects hold for sound. In particular, they wanted to explore voices, since parallels have been found in brain and behavior between how humans process voices in the auditory domain and how we process faces in the visual domain.
The team — Wilma A. Bainbridge, Assistant Professor in the Department of Psychology; Cambria Revsine, a fourth-year doctoral student in the Psychology program; and Esther Goldberg, a fourth-year Data Science major — focused their research on whether there is an inherent memorability to voices. Are there some very memorable voices and some very forgettable voices? Or do people remember and forget based on their own previous experiences?
“Even though the bulk of our research has been on vision, our memories as people are inherently multisensory, and sound is just as important for us as people functioning in the real world as vision is,” Bainbridge says. “We're always listening to music, podcasts, the news, and conversing with others. But at the same time, within the field of psychology, there's a lot less known about auditory memory than visual memory.”
In focusing on voice memorability, the team collected recognition memory scores from more than 3,000 participants who listened to a sequence of different speakers saying the same sentence. The voices came from TIMIT, a corpus of read speech by American English speakers of different sexes and dialects that was developed in the 1990s and is “very naturalistic,” per Bainbridge. The study drew on TIMIT’s 6,300 recordings: 10 sentences each spoken by 630 people. After controlling for factors like gender, the team had close to 400 clips of different speakers saying the same sentence, which were placed into the online memory experiment.
“We used really unique sentences that sampled the different phonemes we use in English,” Bainbridge says. “For example, one sentence is, ‘Don't ask me to carry an oily rag like that.’ It’s a weird sentence, but it lets you get a sense of what someone's accent or dialect sounds like.”
Study participants would hear the voices all saying this same sentence and were prompted to press a key whenever they recognized a voice that they heard earlier in the stream. The team ran two versions of the experiment with two different sentences (the other: “He had his dark suit in greasy wash water all year.”).
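To make the task concrete, here is a minimal sketch of how a per-voice memorability score could be computed from such a continuous recognition stream. The data, the scoring rule (hit rate on repeats minus false-alarm rate on first presentations), and the function name are all illustrative assumptions; the article does not describe the lab's actual scoring pipeline.

```python
def memorability_scores(trials):
    """Score each voice from a continuous recognition stream.

    `trials` is a sequence of (voice_id, is_repeat, pressed) tuples:
    one entry per clip heard, where `is_repeat` marks a voice heard
    earlier in the stream and `pressed` records a key press. The score
    is the hit rate on repeats minus the false-alarm rate on first
    presentations (one common correction; the published analysis may
    use a different one).
    """
    hits, repeats, fas, firsts = {}, {}, {}, {}
    for voice, is_repeat, pressed in trials:
        if is_repeat:
            repeats[voice] = repeats.get(voice, 0) + 1
            hits[voice] = hits.get(voice, 0) + int(pressed)
        else:
            firsts[voice] = firsts.get(voice, 0) + 1
            fas[voice] = fas.get(voice, 0) + int(pressed)
    return {
        voice: hits[voice] / repeats[voice]
               - fas.get(voice, 0) / max(firsts.get(voice, 1), 1)
        for voice in repeats
    }
```

Averaged over thousands of participants, a score like this yields a single memorability value per voice.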
They found that there were, in fact, voices everyone remembered well, and voices that people forgot. Participants’ memory for a given voice was highly consistent between the two experiments (that is, the two sentences): no matter what sentence was spoken, a voice could be memorable or forgettable. To test these results further, the team ran another experiment in which about 1,500 participants heard a mix of different sentences but had to identify a repeated voice no matter the sentence. Once again, the same voices were remembered, and the same voices were forgotten. The findings suggest there is an intrinsic memorability to voices, no matter what a voice is saying or who is listening.
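That kind of consistency can be illustrated as a rank correlation between the per-voice scores from the two sentence experiments. The scores below are toy numbers standing in for the real data, and the tie-free Spearman formula is one common choice; the study's actual statistics are reported in the paper.

```python
def ranks(values):
    """Map each value to its rank (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rank correlation (no-ties formula)."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Toy per-voice memorability scores from the two sentence experiments.
sentence1 = {"v1": 0.82, "v2": 0.35, "v3": 0.61, "v4": 0.18, "v5": 0.74}
sentence2 = {"v1": 0.79, "v2": 0.40, "v3": 0.55, "v4": 0.22, "v5": 0.70}

voices = sorted(sentence1)
rho = spearman([sentence1[v] for v in voices],
               [sentence2[v] for v in voices])
# A rank correlation near 1 means the same voices are remembered and
# forgotten regardless of which sentence they speak.
```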
For a final analysis, the team used computational software called VoiceSauce, which can piece apart the acoustic features of a voice recording, from pitch to how certain vowels are formed. They also ran an online experiment in which participants labeled each voice for different qualities: how attractive it sounded, how typical it sounded, and so forth. They combined these ratings with features like gender and dialect to see what was most predictive of memorability. Surprisingly, low-level acoustic features were most related to memorability, rather than deeper interpretations like attractiveness, gender, or dialect. The team then built a computational model using the low-level features to predict people’s memories.
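As a sketch of that last step, a model mapping acoustic features to memorability scores might look like the following. Everything here is a toy assumption: the data are synthetic, the three features are placeholders for VoiceSauce's output, and a least-squares linear fit stands in for whatever model class the team actually used, which the article does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for VoiceSauce output: one row per voice, one
# column per low-level acoustic feature (the real feature set differs).
features = rng.normal(size=(40, 3))
true_weights = np.array([0.5, -0.3, 0.1])
memorability = features @ true_weights + rng.normal(scale=0.05, size=40)

# Fit a least-squares linear model from features to memorability scores.
weights, *_ = np.linalg.lstsq(features, memorability, rcond=None)

# Score a new, unseen voice from its acoustic features alone.
new_voice = np.array([0.2, -0.4, 1.0])
predicted = float(new_voice @ weights)
```

With enough voices, the fitted weights recover the underlying relationship, which is what lets a new voice be scored without any listener data.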
“You could imagine, in a future application, that if you want to pick a memorable podcast host, you feed candidates’ voices through our model, and you see who has the most memorable voice,” Bainbridge says. The same could be true for choosing voices for advertising, language learning apps, and more instances where memorability is crucial.

Drawings suggest changes in object memory, but not spatial memory, across time
In another of the lab’s memory studies, published in Cognition, researchers used crowdsourced scoring of hundreds of drawings made from memory to explore changes to memory over time. They found that spatial memory is highly accurate and relatively independent of time, whereas the proportion of objects recalled from a scene and false recall for objects not in the original image are highly dependent on time.
The research was conducted by Bainbridge; Emma Megla, a fourth-year PhD student in the Psychology program; and Samuel R. Rosenthal, a fourth-year triple major in Psychology, Anthropology, and History. Over the last five years, the lab has pioneered a technique in which people draw what's in their memories. From these drawings, the researchers can see which aspects of a memory are sharp and accurate, and which are inaccurate (including false memories). In this most recent study, the team revisited classic questions of how time impacts memory. They asked individuals to draw their memories after looking at images for different amounts of time, and also had people draw their memories after different delays, ranging from five minutes after seeing an image up to a week.
The researchers replicated the classic findings that looking at an image for a longer period benefits memory and that waiting longer after seeing it causes memory to fade. But they also found interesting differences in what information sticks around and what fades. People drew more objects the more time they had to look at an image, and fewer objects the longer the delay; however, their spatial accuracy stayed pretty much the same throughout. Even a week after seeing an image, people accurately placed different objects: if they didn't remember a specific object, they might still draw a blob in the correct place.
“This suggests that there are different systems in the brain possibly for preserving details and object-based information, versus spatial location information,” Bainbridge says. The study also showed that different types of features determined which objects stuck around in memory. When people were given different amounts of time to view an image, an object's meaningfulness drove its memorability. In other words, people first save the most meaningful objects in an image into memory; given more time, they then start to save less meaningful ones.
On the other hand, when there are longer and longer delays after seeing an image, how visually salient or attention-grabbing an object is determines what lasts. Objects that are not so salient are the first to be dropped in memory after longer time delays.