Towards a Perceptual Model of Clarity in Music Mixes

This project investigates the development of models of perceived clarity in music mixes. ‘Clarity’ is a term commonly used by listeners when describing the perceptual qualities of a mixed piece of music, reflecting the perceptual effects of various signal characteristics. Determining the relationship between these objective signal characteristics and the perception of mix clarity allows the clarity attribute to be modelled.

A detailed literature review was conducted, both to highlight potentially important signal features and to outline potential modelling methods. These observations were then confirmed and better understood through an exploratory investigation of music stimuli rated in controlled subjective listening tests.

Three novel approaches to mix clarity prediction are proposed, based on inter-band relationship (IBR & IBR MR), signal separation and component masking (L2PM MC & L3PM MC), and a semi-supervised convolutional neural network (CNN) classifier.

The proposed models were evaluated using a second perceptually relevant data set elicited from a controlled listening test. This showed that the IBR model had a strong relationship to the perceptual data (r = 0.6 & rho = 0.6079), which was improved using a novel parameter-optimised multi-resolution approach, IBR MR (r = 0.7635 & rho = 0.8024). The L2PM MC (r = −0.6103 & rho = −0.7964) and L3PM MC (r = −0.6483 & rho = −0.6930) models also showed a strong relationship to the perceptual data. As a classifier, the CNN approach showed good accuracy (80%) on unseen data. When coupled with an evaluation of the learned representation, this suggested that the CNN model had learned, from the perceptual data, a representation in agreement with, and as comprehensive as, the aforementioned models.
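For readers unfamiliar with the two statistics quoted above, the following is a minimal sketch of how a Pearson correlation (r) and a Spearman rank correlation (rho) between model output and listener ratings can be computed. It is not the evaluation code used in the project: the function names and the example values (`predicted`, `rated`) are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical model outputs and mean listener ratings (not real data).
predicted = [0.21, 0.35, 0.40, 0.58, 0.62, 0.80]
rated = [2.0, 3.5, 3.0, 4.5, 6.5, 6.0]

print(f"r   = {pearson_r(predicted, rated):.4f}")
print(f"rho = {spearman_rho(predicted, rated):.4f}")
```

A model whose output falls as perceived clarity rises yields negative coefficients, so for such a model it is the magnitude rather than the sign that indicates the strength of the relationship.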

An objective model of mix clarity perception would be useful as a measure to supplement the judgement of engineers producing music, and as a target in automatic mixing/mastering systems to guide them towards perceptually meaningful results. Indeed, to the author's knowledge, this work presents the first perceptually relevant objective models of musical mix clarity.

PhD project: 2019 – 2023

Researcher: Dr Andrew Parker

Supervisor: Dr Steve Fenton


Parker, A. and Fenton, S. (2021) ‘Musical Mix Clarity Prediction Using Decomposition and Perceptual Masking Thresholds’, Applied Sciences (Switzerland), 11(20).

Parker, A., Fenton, S. and Lee, H. (2018) ‘Development of a Real-Time Punch Meter Plugin’, In: Proceedings of the 4th Workshop on Intelligent Music Production, 14 Sep 2018, Huddersfield.
