College of Engineering Unit: 
Electrical Engineering and Computer Science
Project Team Member(s): 
Aidan Beery
Physical Location at Expo: 
Community Plaza
Project ID: 
Project Description: 

Predicting a song’s emotive qualities is an active problem in music information retrieval. Music mood analysis applies across many domains, from automatic playlist generation to therapeutic uses. However, most music emotion recognition experiments rely on expensive crowdsourced surveys to manually annotate music samples. An intelligent agent capable of predicting a song’s emotive properties would help accelerate music information retrieval research. Our hypothesis is that the conversations people have online about a given piece of music contain semantic information that a machine learning model can be trained on to predict the emotion evoked by that song.

We evaluate three music emotion datasets: AMG1608, PMEmo, and DEAM. Each provides valence and arousal ratings (indicating a song’s positivity and energy, respectively) collected from survey respondents on crowdsourced annotation platforms such as Amazon Mechanical Turk. For each song in these datasets, we scrape YouTube, Reddit, and Twitter for the top 10 posts referencing its artist name and track title.

From this data, we design two experiments to estimate music emotion directly from social media commentary. First, we focus on extracting individual affective terms from a given comment. Using existing word-emotion dictionaries, we look up the valence, arousal, and sentiment of the words in a song’s comments and compute summary statistics over them to form our feature space. A random forest model is then trained on these features to predict a song’s valence and arousal from the emotive qualities of the comments about it.
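
As a rough illustration of the dictionary-based feature step, the sketch below scores each word of a song's comments against a word-emotion lexicon and summarizes the matches. It is not the project's code: the lexicon entries, the 1-9 scale, the choice of mean and standard deviation as summary statistics, and the names `TOY_LEXICON` and `comment_features` are all assumptions made for the example, and sentiment scores are left out for brevity.

```python
from statistics import mean, pstdev

# Hypothetical toy lexicon: word -> (valence, arousal), assumed 1-9 scale.
# A real experiment would load an existing word-emotion dictionary instead.
TOY_LEXICON = {
    "happy": (8.5, 6.0),
    "calm": (6.9, 1.7),
    "sad": (2.1, 3.5),
    "angry": (2.5, 7.2),
}


def comment_features(comments, lexicon):
    """Summarize lexicon scores of all matched words in a song's comments.

    Returns a dict of summary statistics, or None if no word matched.
    """
    valences, arousals = [], []
    for comment in comments:
        for token in comment.lower().split():
            # Strip surrounding punctuation so "happy!" matches "happy".
            word = token.strip(".,!?;:\"'()")
            if word in lexicon:
                valence, arousal = lexicon[word]
                valences.append(valence)
                arousals.append(arousal)
    if not valences:
        return None
    return {
        "n_matched": len(valences),
        "valence_mean": mean(valences),
        "valence_std": pstdev(valences),
        "arousal_mean": mean(arousals),
        "arousal_std": pstdev(arousals),
    }


# Example: two made-up comments about one song.
features = comment_features(
    ["This song makes me so happy!", "Sad but calm."], TOY_LEXICON
)
print(features)
```

One feature dictionary per song, paired with that song's annotated valence and arousal, could then serve as a training row for a random forest regressor (for instance scikit-learn's `RandomForestRegressor`, named here only as one possible implementation).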

Our second method applies transfer learning to a pre-trained transformer model. DistilBERT is a natural language understanding model pre-trained on a large corpus of English text. Its weights are openly available, so we start from them, and as a result our model requires relatively few epochs to fine-tune on our social media commentary. We provide raw comments as inputs, allowing the model to learn not just from the individual affective terms but also from the context in which they are used and the larger semantic structure of the comment.

Our results are presented in the poster below.

Project Communication Piece(s): 
2022 Research Poster (PDF, 605.91 KB)
This team is open to networking
This team is open to collaboration opportunities