
Probing auditory scene analysis.

Published on Sep 12, 2014 in Frontiers in Neuroscience. DOI: 10.3389/fnins.2014.00293
Susann Deike (Leibniz Institute for Neurobiology), Susan L. Denham (Plymouth University), and Elyse Sussman (Albert Einstein College of Medicine)
Abstract
In natural environments, the auditory system is typically confronted with a mixture of sounds originating from different sound sources. Sounds emanating from different sources can overlap in time and feature space. The auditory system must therefore continuously decompose competing sounds into distinct, meaningful auditory objects or "auditory streams" associated with the possible sound sources. This decomposition, termed "auditory scene analysis" (ASA) by Bregman (1990), involves two kinds of grouping: grouping based on simultaneous cues (e.g., harmonicity) and grouping based on sequential cues (e.g., similarity of acoustic features over time). Understanding how the brain solves these tasks is a fundamental challenge facing auditory scientists. In recent years, ASA has been broadly investigated across fields of auditory research using a wide range of methods, including studies in different species (Hulse et al., 1997; Fay, 2000; Fishman et al., 2001; Moss and Surlykke, 2001) and computer modeling of ASA (for recent reviews see Winkler et al., 2012; Gutschalk and Dykstra, 2014). Despite these advances, ASA remains a major challenge for auditory research, especially in verifying whether experimental findings transfer to more realistic auditory scenes. This special issue is a collection of 10 research papers and one review paper providing a snapshot of current ASA research. A research paper on visual perception provides a comparative view of modality-specific as well as general characteristics of perception.

One approach to understanding ASA in real auditory scenes is the use of stimulus parameters that produce an ambiguous percept (cf. Pressnitzer et al., 2011). The advantage of this approach is that different perceptual organizations can be studied without varying physical stimulus parameters.
Using a visually ambiguous stimulus and combining real-time functional magnetic resonance imaging with machine learning techniques, Reichert et al. (2014) showed that it is possible to determine the momentary state of a subject's conscious percept from time-resolved BOLD activity. The high classification accuracy of this data-driven approach may be particularly useful for auditory research investigating perception in continuous, ecologically relevant sound scenes. A second advantage of using ambiguous stimuli in experiments on ASA is that their perception can be influenced by intention or task (Moore and Gockel, 2002). By manipulating task requirements, one can mirror real hearing situations in which listeners often need to identify and localize sound sources.

The studies by Shestopalova et al. (2014) and Kondo et al. (2014) examined the influence of motion on stream segregation. In general, and in line with earlier findings, both studies found that separating sound sources in space promoted segregation. Surprisingly, however, the effect of spatial separation on stream segregation was temporally limited and affected by volitional head motion (Kondo et al., 2014), but unaffected by movement of the sound sources or by the presentation of movement-congruent visual cues (Shestopalova et al., 2014). Another study, by Sussman-Fort and Sussman (2014), investigated the influence of stimulus context on the buildup of stream segregation. They found that the buildup was context-dependent, occurring faster under constant than under varying stimulus conditions. Based on these findings, the authors suggested that the auditory system maintains a representation of the environment that is updated only when new information indicates that reanalyzing the scene is necessary.

Two further studies examined the influence of attention on stream segregation. Nie et al. (2014) found that in conditions of weak spectral contrast, attention facilitated stream segregation. Shuai and Elhilali (2014) found that different forms of attention, both stimulus-driven and top-down, modulated the response to a salient event detected within a sound stream.

The special issue also includes two research papers that extend current views on multistability and perceptual ambiguity. The psychophysical study by Denham et al. (2014) showed that streaming sequences can be perceived in many more ways than the traditionally assumed two (integrated vs. segregated organizations) and that the different interpretations continuously compete for dominance. Moreover, despite being highly stochastic, the switching patterns of individual participants could be distinguished from those of others. Hence, perceptual multistability can be used to characterize both general mechanisms and individual differences in human perception. By comparing stimulus conditions that promote one perceptual organization with those causing an ambiguous percept, Dolležal et al. (2014) found specific BOLD responses for the ambiguous condition in higher cognitive areas (i.e., posterior medial prefrontal cortex and posterior cingulate cortex). These regions have been associated with monitoring decision uncertainty (Ridderinkhof et al., 2004) and with higher task demands (Raichle et al., 2001; Dosenbach et al., 2007), respectively. This suggests that perceptual ambiguity may be characterized by uncertainty regarding the appropriate perceptual organization, and by the higher cognitive load this uncertainty imposes.

A second group of research papers within this special issue focused on understanding hearing deficits in older listeners and cochlear implant (CI) users. Gallun et al.
(2013) demonstrated that listeners can be categorized in terms of their ability to use spatial and spectrotemporal cues to separate competing speech streams. They showed that age substantially reduced spatial release from masking, supporting the hypothesis that aging, independent of an individual's hearing threshold, can produce changes in the cortical and/or subcortical structures essential for spatial hearing. Divenyi (2014) compared the signal-to-noise (S/N) ratios at which normal-hearing young and elderly listeners were able to discriminate single-formant dynamics in vowel-analog streams and found that elderly listeners required S/N ratios 15 and 20 dB larger than younger listeners. Since formant transitions are potent cues for speech intelligibility, this result may at least partially explain the well-documented loss of speech intelligibility in babble noise among the elderly. Böckmann-Barthel et al. (2014) pursued the question of whether the time course of auditory streaming differs between normal-hearing listeners and CI users and found that the perception of streaming sequences was qualitatively similar in the two groups. This similarity suggests that stream segregation is not solely determined by frequency discrimination, and that CI users do not simply respond to differences between A and B sounds but actually experience stream segregation.

The review by Bendixen (2014) proposes predictability as a cue for sound source decomposition. Bendixen collected empirical evidence spanning predictive auditory processing, predictive processing in ASA, and methodological aspects of measuring ASA. As a theoretical framework, an analogy with the old-plus-new heuristic for grouping simultaneous acoustic signals was proposed.

Taken together, this special issue provides a comprehensive summary of current research in ASA, relating the approaches and experimental findings to natural listening conditions.
It would be highly desirable for future research on ASA to use more natural stimuli and to test the ecological validity of these findings. With this special issue, we hope to raise awareness of this need.
References (24)
Nie Y. (James Madison University), Zhang Y. (University of Minnesota), and Nelson P. B. (University of Minnesota). Published on Sep 12, 2014 in Frontiers in Neuroscience.
The current study measured neural responses to investigate auditory stream segregation of noise stimuli with or without clear spectral contrast. Sequences of alternating A and B noise bursts were presented to elicit stream segregation in normal-hearing listeners. The successive B bursts in each sequence maintained an equal amount of temporal separation with manipulations introduced on the last stimulus. The last B burst was either delayed for 50% of the sequences or not delayed for the other 50%...
Böckmann-Barthel M. (Otto-von-Guericke University Magdeburg), Deike S. (Leibniz Institute for Neurobiology), + 2 authors, and Verhey J. L. (Otto-von-Guericke University Magdeburg). Published on Jul 21, 2014 in Frontiers in Psychology.
In a complex acoustical environment with multiple sound sources the auditory system uses streaming as a tool to organize the incoming sounds in one or more streams depending on the stimulus parameters. Streaming is commonly studied by alternating sequences of signals. These are often tones with different frequencies. The present study investigates stream segregation in cochlear implant (CI) users, where hearing is restored by electrical stimulation of the auditory nerve. CI users listened to 30-...
Shuai L. and Elhilali M. (Johns Hopkins University). Published on Jul 21, 2014 in Frontiers in Neuroscience.
Selecting pertinent events in the cacophony of sounds that impinge on our ears every day is regulated by the acoustic salience of sounds in the scene as well as their behavioral relevance as dictated by top-down task-dependent demands. The current study aims to explore the neural signature of both facets of attention, as well as their possible interactions in the context of auditory scenes. Using a paradigm with dynamic auditory streams with occasional salient events, we recorded neurophysiologi...
Kondo H. M. (Hamamatsu University), Toshima I., + 1 author, and Kashino M. (Tokyo Institute of Technology). Published on Jun 24, 2014 in Frontiers in Neuroscience.
The perceptual organization of auditory scenes is a hard but important problem to solve for human listeners. It is thus likely that cues from several modalities are pooled for auditory scene analysis, including sensory-motor cues related to the active exploration of the scene. We previously reported a strong effect of head motion on auditory streaming. Streaming refers to an experimental paradigm where listeners hear sequences of pure tones, and rate their perception of one or more subjective so...
Divenyi P. (Stanford University). Published on Jun 12, 2014 in Frontiers in Neuroscience.
Pairs of harmonic complexes with different fundamental frequencies f0 (105 and 189 Hz or 105 and 136 Hz) but identical bandwidth (0.25-3 kHz) were band-pass filtered using a filter having an identical center frequency of 1 kHz. The filter’s center frequency was modulated using a triangular wave having a 5-Hz modulation frequency fmod to obtain a pair of vowel-analog waveforms with dynamically varying single-formant transitions. The target signal S contained a single modulation cycle starting eit...
Dolležal L.-V., Brechmann A. (Leibniz Institute for Neurobiology), + 1 author, and Deike S. (Leibniz Institute for Neurobiology). Published on Jun 6, 2014 in Frontiers in Neuroscience.
Auditory stream segregation refers to a segregated percept of signal streams with different acoustic features. Different approaches have been pursued in studies of stream segregation. In psychoacoustics, stream segregation has mostly been investigated with a subjective task asking the subjects to report their percept. Few studies have applied an objective task in which stream segregation is evaluated indirectly by determining thresholds for a percept that depends on whether auditory streams are ...
Reichert C. (Otto-von-Guericke University Magdeburg), Fendrich R. (Dartmouth College), + 3 authors, and Rieger J. W. Published on May 23, 2014 in Frontiers in Neuroscience.
Perception is an active process that interprets and structures the stimulus input based on assumptions about its possible causes. We use real-time functional magnetic resonance imaging (rtfMRI) to investigate a particularly powerful demonstration of dynamic object integration in which the same physical stimulus intermittently elicits categorically different conscious object percepts. In this study, we simulated an outline object that is moving behind a narrow slit. With such displays, the physic...
Sussman-Fort J. (Albert Einstein College of Medicine) and Sussman E. (Albert Einstein College of Medicine). Published on Apr 29, 2014 in Frontiers in Neuroscience.
Stream segregation is the process by which the auditory system disentangles the mixture of sound inputs into discrete sources that cohere across time. The length of time required for this to occur is termed the ‘buildup’ period. In the current study, we used the buildup period as an index of how quickly sounds are segregated into constituent parts. Specifically, we tested the hypothesis that stimulus context impacts the timing of the buildup and, therefore, affects when stream segregation is det...
Shestopalova L. B. (Russian Academy of Sciences), Bőhm T. M. (Budapest University of Technology and Economics), + 6 authors, and Winkler I. (University of Szeged). Published on Apr 7, 2014 in Frontiers in Neuroscience.
An audio-visual experiment using moving sound sources was designed to investigate whether the analysis of auditory scenes is modulated by synchronous presentation of visual information. Listeners were presented with an alternating sequence of two pure tones delivered by two separate sound sources. In different conditions, the two sound sources were either stationary or moving on random trajectories around the listener. Both the sounds and the movement trajectories were derived from recordings in...
Bendixen A. (University of Oldenburg). Published on Mar 31, 2014 in Frontiers in Neuroscience.
Many sound sources emit signals in a predictable manner. The idea that predictability can be exploited to support the segregation of one source’s signal emissions from the overlapping signals of other sources has been expressed for a long time. Yet experimental evidence for a strong role of predictability within auditory scene analysis (ASA) has been scarce. Recently, there has been an upsurge in experimental and theoretical work on this topic resulting from fundamental changes in our perspectiv...