Cochleogram-based approach for detecting perceived emotions in music

Published on Sep 1, 2020 in Information Processing and Management (impact factor 3.892)
DOI: 10.1016/J.IPM.2020.102270
Authors: Mladen Russo, Luka Kraljevic, and Marjan Sikora (+ 1 author), University of Split
Abstract
Identifying the perceived emotional content of music constitutes an important aspect of easy and efficient search, retrieval, and management of the media. One of the most promising use cases of music organization is the emotion-based playlist, where automatic music emotion recognition plays a significant role in providing emotion-related information that is otherwise generally unavailable. Given the importance of the auditory system in emotion recognition and processing, in this study we propose a new cochleogram-based system for detecting affective musical content. To effectively simulate the response of the human auditory periphery, the music audio signal is processed by a detailed biophysical cochlear model, yielding an output that closely matches the characteristics of human hearing. In the proposed approach, a convolutional neural network (CNN) extracts the relevant music features from cochleogram images, which we construct directly from the response of the basilar membrane. To validate the practical implications of the proposed approach with regard to its possible integration in digital music libraries, an extensive study was conducted to evaluate its predictive performance on different aspects of music emotion recognition. The approach was evaluated on the publicly available 1000 Songs database, and the experimental results showed that it outperformed both common musical features (such as tempo, mode, pitch, clarity, and the perceptually motivated mel-frequency cepstral coefficients (MFCC)) and the official "MediaEval" challenge results on the same reference database. Our findings clearly show that the proposed approach can improve music emotion recognition performance and be used as part of a state-of-the-art music information retrieval system.
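The pipeline described in the abstract (cochlear front-end, then cochleogram image, then CNN) can be sketched in code. The paper uses a detailed biophysical cochlear model; as a simple stand-in, the sketch below approximates per-band basilar-membrane responses with an ERB-spaced gammatone filterbank, a common auditory front-end. All function names and parameter choices here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the paper feeds a CNN with cochleogram images
# derived from a detailed biophysical cochlear model. As a stand-in, this
# sketch approximates basilar-membrane responses with a gammatone filterbank.
import numpy as np

def erb_space(low_hz, high_hz, n_bands):
    """Center frequencies uniformly spaced on the ERB-rate scale."""
    erb = lambda f: 21.4 * np.log10(1.0 + 0.00437 * f)
    inv = lambda e: (10.0 ** (e / 21.4) - 1.0) / 0.00437
    return inv(np.linspace(erb(low_hz), erb(high_hz), n_bands))

def gammatone_ir(fc, fs, duration=0.03, order=4):
    """Impulse response of an (unnormalized) gammatone filter at fc."""
    t = np.arange(int(duration * fs)) / fs
    bw = 1.019 * 24.7 * (0.00437 * fc + 1.0)  # ERB bandwidth at fc
    return t ** (order - 1) * np.exp(-2 * np.pi * bw * t) * np.cos(2 * np.pi * fc * t)

def cochleogram(signal, fs, n_bands=64, frame_s=0.025, hop_s=0.010):
    """Log-compressed per-band frame energies: an (n_bands, n_frames) image."""
    cfs = erb_space(50.0, 0.9 * fs / 2.0, n_bands)
    frame, hop = int(frame_s * fs), int(hop_s * fs)
    n_frames = 1 + (len(signal) - frame) // hop
    img = np.empty((n_bands, n_frames))
    for i, fc in enumerate(cfs):
        # Crude stand-in for the basilar-membrane response in one band.
        bm = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        env = np.abs(bm)
        for j in range(n_frames):
            seg = env[j * hop : j * hop + frame]
            img[i, j] = np.sqrt(np.mean(seg ** 2))  # frame RMS energy
    return np.log(img + 1e-10), cfs

fs = 8000
t = np.arange(int(0.5 * fs)) / fs      # 0.5 s of audio
sig = np.sin(2 * np.pi * 440.0 * t)    # pure A4 tone as a toy input
img, cfs = cochleogram(sig, fs)
print(img.shape)                        # (64, 48): bands x frames
```

In the paper, such images are the input from which a CNN learns emotion-relevant features; in this sketch, `img` would serve as the single-channel input tensor. The log compression mimics the roughly logarithmic loudness sensitivity of hearing.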
  • References (58)
  • Citations (0)
With the advent of digital music and music-streaming platforms, the amount of music available for selection is now greater than ever. Sorting through all this music manually is impossible for anyone. Music recommendation systems reduce human effort by automatically recommending music based on genre, artist, instrument, and user reviews. Although music recommendation systems are widely used commercially, there does not yet exist a perfect recommendation system that can provide the best music recommendation to t...
1 citation
Zheng Wang, Jie Zhou, ..., Yang Yang (University of Electronic Science and Technology of China) · 6 authors
Abstract: With the rapid development of digital equipment and the continuous upgrading of online media, a growing number of people are willing to post videos on the web to share their daily lives (Jelodar, Paulius, & Sun, 2019). Generally, not all video segments are popular with audiences; some may be boring. If we can predict which segment of a newly generated video stream will be popular, audiences can enjoy just that segment rather than watching the whole video to find the funny p...
Gabriele Pergola, Lin Gui, Yulan He (University of Warwick) · 3 authors
Abstract: We propose a topic-dependent attention model for sentiment classification and topic extraction. Our model assumes that a global topic embedding is shared across documents and employs an attention mechanism to derive local topic embeddings for words and sentences. These are subsequently incorporated in a modified Gated Recurrent Unit (GRU) for sentiment classification and extraction of topics bearing different sentiment polarities. Those topics emerge from the words' local topic embedding...
Chloe Lara MacGregor, Daniel Müllensiefen (Goldsmiths, University of London)
Previous research has shown that levels of musical training and emotional engagement with music are associated with an individual’s ability to decode the intended emotional expression from a music performance. The present study aimed to assess traits and abilities that might influence emotion recognition, and to create a new test of emotion discrimination ability. The first experiment investigated musical features that influenced the difficulty of the stimulus items (length, type of melody, inst...
1 citation
Jessica Akkermans, Renee Schapiro (Goldsmiths, University of London), ..., Klaus Frieler (Weimar Institute) · 12 authors
With over 560 citations reported on Google Scholar by April 2018, a publication by Juslin and Gabrielsson (1996) presented evidence supporting performers’ abilities to communicate, with high accuracy, their intended emotional expressions in music to listeners. Though there have been related studies published on this topic, there has yet to be a direct replication of this paper. A replication is warranted given the paper’s influence in the field and the implications of its results. The present ex...
1 citation
Asad Abdi, Siti Mariyam Shamsuddin (Universiti Teknologi Malaysia), ..., Jalil Piran (Sejong University) · 4 authors
Abstract: Sentiment analysis concerns the study of opinions expressed in text. Given the huge number of reviews, sentiment analysis plays a key role in extracting significant information and the overall sentiment orientation of reviews. In this paper, we present a deep-learning-based method to classify a user's opinion expressed in reviews (called RNSA). To the best of our knowledge, a deep learning-based method in which a unified feature set which is representative of word embedding, sentiment kno...
3 citations
Stefan Ehrlich, ..., Gordon Cheng · 4 authors
Emotions play a critical role in rational and intelligent behavior; a better fundamental knowledge of them is indispensable for understanding higher-order brain function. We propose a non-invasive brain-computer interface (BCI) system that feeds back a person's affective state, so that a closed-loop interaction between the participant's brain responses and the musical stimuli is established. We realized this concept technically in a functional prototype of an algorithm that generates continuous a...
Shitao Chen, Songyi Zhang, ..., Nanning Zheng (Xi'an Jiaotong University) · 5 authors
The perception-driven approach and the end-to-end system are two major vision-based frameworks for self-driving cars. However, both make it difficult to incorporate attention and historical information into the autonomous driving process, which are essential for achieving human-like driving. In this paper, we propose a novel model for self-driving cars called the brain-inspired cognitive model with attention. This model comprises three parts: 1) a convolutional neural network for simulat...
10 citations
Sébastien Paquette (Université de Montréal), G.D. Ahmed, ..., Alexandre Lehmann (McGill University) · 6 authors
Abstract: Cochlear implants can successfully restore hearing in profoundly deaf individuals and enable speech comprehension. However, the acoustic signal provided is severely degraded and, as a result, many important acoustic cues for perceiving emotion in voices and music are unavailable. The deficit of cochlear implant users in auditory emotion processing has been clearly established. Yet, the extent to which this deficit and the specific cues that remain available to cochlear implant users are...
3 citations
Y. V. Srinivasa Murthy, Shashidhar G. Koolagudi (National Institute of Technology, Karnataka)
A huge increase in the number of digital music tracks has created the necessity to develop an automated tool to extract the useful information from these tracks. As this information has to be extracted from the contents of the music, it is known as content-based music information retrieval (CB-MIR). In the past two decades, several research outcomes have been observed in the area of CB-MIR. There is a need to consolidate and critically analyze these research findings to evolve future research di...
5 citations