Research


Are there electrophysiological signatures of listening effort?

In one line of research, we work to develop direct measures of neural activity during effortful listening. Here we focus on both brain oscillations (brain waves) in the frequency domain of the electroencephalogram (EEG), and event-related potentials (ERPs), transient changes in electrical activity that are time-locked to a stimulus such as a speech sound.

In a recent study, we varied both the signal-to-noise ratio (SNR) of spoken words and memory load, implemented as the number of digits to be remembered on each trial. The goal was to identify electrophysiological markers of cognitive demand during speech perception. In this dual-task experiment, listeners first viewed a sequence of one (low load), three (medium load), or five (high load) digits. They then heard a spoken word presented in a background of multi-talker babble at an easy, intermediate, or difficult SNR. At the end of each trial, participants were prompted to type the word and then recall the digits.

For the behavioral results, we hypothesized that digit recall accuracy would decrease at more difficult SNRs, and this was the case (see Figure 1). In the electrophysiological data, a main interest was alpha-band (8-13 Hz) power, which prior studies have identified as reflecting listening effort. As shown in Figure 2, alpha power desynchronized during speech perception, indicating listening effort, and did so to the greatest degree for speech at the intermediate SNR. This pattern suggests that listening effort increased from the easy to the intermediate SNR but decreased when the SNR became more difficult. It may reflect a disengagement of neural resources at difficult SNRs that the behavioral measure did not detect.
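The alpha-band measure reported in the figures is an event-related spectral perturbation (ERSP), which expresses power during a period of interest relative to a baseline period, usually in decibels. The following is a minimal sketch of that idea only, not the analysis pipeline used in the study. It assumes a simple FFT-based power estimate, a single channel, and equal-length baseline and post-onset windows. The function names (`band_power`, `alpha_ersp_db`), the 250 Hz sampling rate, and the synthetic signal are all hypothetical choices made for illustration.

```python
import numpy as np

def band_power(segment, fs, f_lo=8.0, f_hi=13.0):
    """Mean power spectral density of a 1-D segment within [f_lo, f_hi] Hz."""
    n = segment.size
    window = np.hanning(n)
    # Remove the mean, taper to limit spectral leakage, then take the FFT.
    spectrum = np.fft.rfft((segment - segment.mean()) * window)
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[in_band].mean()

def alpha_ersp_db(epoch, fs, onset_idx):
    """Alpha power after onset relative to the pre-onset baseline, in dB.

    Negative values correspond to desynchronization (a power decrease).
    """
    baseline = band_power(epoch[:onset_idx], fs)
    active = band_power(epoch[onset_idx:], fs)
    return 10.0 * np.log10(active / baseline)

# Synthetic single-channel epoch (assumed values, not study data):
# a 10 Hz oscillation whose amplitude halves at stimulus onset.
fs = 250                               # sampling rate in Hz (assumption)
t = (np.arange(2 * fs) - fs) / fs      # -1 s to just under +1 s
amplitude = np.where(t < 0, 2.0, 1.0)
epoch = amplitude * np.sin(2 * np.pi * 10.0 * t)

ersp = alpha_ersp_db(epoch, fs, onset_idx=fs)
print(round(ersp, 2))  # about -6.02 dB: halving amplitude quarters the power
```

In practice, ERSP is usually computed with a time-frequency decomposition (for example, wavelets) and averaged over trials, electrodes, and participants. The sketch collapses all of that into a single baseline-versus-active comparison to show what a negative ERSP value means.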

Graph showing that word accuracy decreases as the signal-to-noise ratio (SNR) decreases. X-axis, SNR (categorical axis, with values of 10 dB, 0 dB, -5 dB); y-axis, Word Identification Accuracy; separate lines for digit load conditions (1, 3, or 5 digits).

Figure 1a. Mean word accuracy as a function of signal-to-noise ratio and memory (digit) load. Error bars indicate +/- 1 SE.

Graph showing that digit accuracy decreases as the signal-to-noise ratio (SNR) decreases. X-axis, SNR (categorical axis, with values of 10 dB, 0 dB, -5 dB); y-axis, Digit Recall Accuracy; separate lines for digit load conditions (1, 3, or 5 digits).

Figure 1b. Mean digit recall accuracy as a function of signal-to-noise ratio and memory (digit) load. Error bars indicate +/- 1 SE.

Scalp topography in the alpha frequency range (8-12 Hz) in each condition. Nine scalp maps (for three levels of signal-to-noise ratio and three levels of memory load) show a posterior distribution of alpha-band desynchronization.

Figure 2a. Scalp topography in the alpha frequency range (8-12 Hz) in each condition.

Graphs showing mean alpha ERSP during spoken sentence presentation. X-axis, signal-to-noise ratio (categorical axis, with values of 10 dB, 0 dB, -5 dB); y-axis, ERSP in alpha band (8-13 Hz); separate plots for each digit load condition (1, 3, or 5 digits).

Figure 2b. Mean alpha ERSP during spoken sentence presentation (0-2 sec following sentence onset). Shown is mean ERSP in the alpha band collapsed across region and electrode. Gray lines show individual participants. Error bars show +/- 1 SE.


Can Predictive Context Reduce Listening Effort?

Another line of research examines the role of predictive context in listening effort. People with hearing loss often rely on context clues to understand speech. Although it is well known that predictable context improves speech perception accuracy, whether and how context affects listening effort is not known.

In a behavioral study with young adult participants with normal hearing (Hunter, in press), we found that listening effort was reduced when spoken sentences presented in a background of multi-talker babble were predictable (e.g., “He swept the floor with a broom.”) compared to unpredictable (e.g., “They discussed the broom.”). This study used a sequential dual-task or memory load design, in which each trial began with visual presentation of a short (low load) or long (high load) sequence of to-be-remembered digits. Both words and digits were identified more quickly and accurately in predictable than unpredictable sentence contexts, indicating that predictable context both increased intelligibility and reduced listening effort.

We have also conducted an electrophysiological study (Hunter, 2020) with young adult participants with normal hearing, in which we showed that the effect of sentence predictability on cognition could be observed during speech perception in the power of alpha-band (8-13 Hz) neural oscillations and in the P300 or late positive event-related potential (ERP) (see Figure 3, below). Consistent with our behavioral studies, these EEG and ERP findings indicated that listening effort was reduced when sentences were predictable.

Graph showing ERSP across a trial. X-axis, Time (-2.5 to 1.5 seconds); y-axis, Frequency (3 to 55 Hz). Separate plot for each condition (predictable v. unpredictable sentence and high v. low memory load). Plots show ERD beginning at digit onset and continuing past spoken sentence offset.

Figure 3a. Mean ERSP response across a trial for each level of predictability and memory load. Sentence onset is at 0 ms. Arrows mark the approximate onset of visual digits at ~2 sec before sentence onset. Digits remained on the screen for 1 sec.

Graph of mean ERSP across conditions. X-axis (categorical), sentence predictability; Y-axis, ERSP in alpha band (8-13 Hz). Separate bars for high and low memory load conditions. Graph shows more negative ERSP for unpredictable than predictable sentences, and more negative ERSP for high than low memory load.

Figure 3b. Mean alpha ERSP during spoken sentences (0-2 sec following sentence onset). Shown is mean ERSP in the alpha band collapsed across participant, region, and electrode. Error bars show +/- 1 SE. Pred, predictable; Unpred, unpredictable.

A subsequent behavioral study (Hunter and Humes, under review) extended the finding of reduced listening effort for predictable sentences to older adults with hearing loss and with normal hearing. This was observed for healthy older adults with both low and high working memory capacity (see Figure 4, below), indicating that older adults across a wide range of working memory capacities obtain a short-term boost in available cognitive capacity when sentences are predictable.

Overall, these results suggest that predictive context may be useful as a tool for reducing listening effort.

Word Accuracy Data. X-axis (categorical) shows high and low cognitive load; y-axis, Word Percent Correct; separate lines for predictable and unpredictable sentences, and for participants with high and low working memory capacity; separate graphs for participants with normal hearing and hearing loss.

Figure 4a. Mean word identification accuracy as a function of sentence predictiveness and memory (digit) load. WM, participants' working memory capacity (high or low). Error bars indicate +/- 1 SE.

Digit Accuracy Data. X-axis (categorical) shows high and low cognitive load; y-axis, Digit Percent Correct; separate lines for predictable and unpredictable sentences, and for participants with high and low working memory capacity; separate graphs for participants with normal hearing and hearing loss.

Figure 4b. Mean digit recall accuracy as a function of sentence predictiveness and memory (digit) load. WM, participants' working memory capacity (high or low). Error bars indicate +/- 1 SE.

Word Response Time Data. X-axis (categorical) shows high and low cognitive load; y-axis, Word response time (seconds); separate lines for predictable and unpredictable sentences, and for participants with high and low working memory capacity; separate graphs for participants with normal hearing and hearing loss.

Figure 4c. Mean word identification response time as a function of sentence predictiveness and memory (digit) load. WM, participants' working memory capacity (high or low). Error bars indicate +/- 1 SE.

Digit Recall Response Time Data. X-axis (categorical) shows high and low cognitive load; y-axis, Digit Response Time (seconds); separate lines for predictable and unpredictable sentences, and for participants with high and low working memory capacity; separate graphs for participants with normal hearing and hearing loss.

Figure 4d. Mean digit recall response time (RT) as a function of sentence predictiveness and memory (digit) load. WM, participants' working memory capacity (high or low). Error bars indicate +/- 1 SE.