The issue of auditory segregation of simultaneous sound sources has been addressed in speech research but has received less attention in musical acoustics. In the perception of concurrent speech, or of speech in noise, time-frequency masking has often been used as a research tool. In this work, an extension of time-frequency masking, leading to the removal of spectro-temporal overlap between sound sources, was applied to musical instruments playing together. The perception of the original mixture was compared with the perception of the same mixture with all spectral overlap electronically removed. Experiments differed in the method of listening (headphones or a loudspeaker), the sets of instruments mixed, and the populations of participants. The main findings were: (i) in one of the experimental conditions the removal of spectro-temporal overlap was imperceptible, (ii) perception of the effect increased when removal of spectro-temporal overlap was performed in larger time-frequency regions rather than in small ones, (iii) perception of the effect decreased in loudspeaker listening. The results support both the multiple looks hypothesis and the "glimpsing" hypothesis known from speech perception.
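The core manipulation can be illustrated with a minimal sketch. The function below removes spectro-temporal overlap between two magnitude spectrograms by assigning each contested time-frequency bin exclusively to the louder source and zeroing the other. The thresholding rule and the winner-takes-all assignment are illustrative assumptions, not the exact procedure used in the experiments.

```python
import numpy as np

def remove_overlap(mag_a, mag_b, threshold=1e-3):
    """Remove spectro-temporal overlap between two magnitude spectrograms.

    A bin counts as overlapping when both sources exceed `threshold`.
    Each overlapping bin is kept only in the louder source's spectrogram.
    Hypothetical sketch; the paper's electronic removal may differ.
    """
    overlap = (mag_a > threshold) & (mag_b > threshold)
    out_a, out_b = mag_a.copy(), mag_b.copy()
    a_wins = overlap & (mag_a >= mag_b)   # bins where source A is louder
    out_a[overlap & ~a_wins] = 0.0        # A loses these contested bins
    out_b[a_wins] = 0.0                   # B loses these contested bins
    return out_a, out_b
```

After this step the two spectrograms occupy disjoint time-frequency regions, so resynthesised signals can be mixed with no spectral overlap.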
Independent Component Analysis (ICA) can be used for single-channel audio separation if a mixed signal is transformed into the time-frequency domain and the resulting matrix of magnitude coefficients is processed by ICA. Previous works used only frequency (spectral) vectors and the Kullback-Leibler distance measure for this task. New decomposition bases are proposed: time vectors and time-frequency components. The applicability of several different distance measures between components is analysed. An algorithm for clustering of components is presented. It was tested on mixes of two and three sounds. The perceptual quality of separation obtained with the proposed distance measures was evaluated by listening tests, indicating the "beta" and "correlation" measures as the most appropriate. The "Euclidean" distance is shown to be appropriate for sounds with varying amplitudes. The perceptual effect of the amount of variance used was also evaluated.
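A minimal sketch of the clustering idea: compute a "correlation" distance between component envelopes (one of the candidate measures), then greedily group components whose distance to a cluster representative is small. The distance definition, the `max_dist` threshold, and the greedy grouping rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def correlation_distance(u, v):
    """'Correlation' distance: 1 - Pearson correlation of two
    component envelopes. One plausible form of the measure named
    in the text; the paper's definition may differ."""
    u = u - u.mean()
    v = v - v.mean()
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def cluster_components(envelopes, max_dist=0.5):
    """Greedy agglomerative grouping: each component joins the first
    cluster whose representative lies within max_dist, otherwise it
    starts a new cluster. Returns lists of component indices."""
    clusters = []  # list of (representative envelope, member indices)
    for i, env in enumerate(envelopes):
        for rep, members in clusters:
            if correlation_distance(rep, env) < max_dist:
                members.append(i)
                break
        else:
            clusters.append((env, [i]))
    return [members for _, members in clusters]
```

Components grouped into the same cluster would then be summed and inverted back to the time domain to reconstruct one separated source.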
Whenever the recording engineer uses stereo microphone techniques, he or she has to consider, besides other acoustic factors, the recording angle resulting from the positioning of microphones relative to the sound sources. The recording angle, the width of the captured acoustic scene, and the properties of a particular microphone technique are closely related. We propose a decision-support method based on the mapping of the actual position of a sound source to its position in the reproduced acoustic scene. This research resulted in a set of localisation curves characterising the four most popular stereo microphone techniques. The curves were obtained by two methods: calculation, based on appropriate engineering formulae, and an experiment consisting of the recording of sources and the estimation of their perceived positions in listening tests. The analysis of the curves yields several conclusions important in recording practice.
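One calculated localisation curve can be sketched as follows for a coincident XY pair of cardioids: the source angle determines an interchannel level difference, which is mapped to a perceived angle via the stereophonic "law of sines". The mic angle, speaker base angle, and the choice of panning law are illustrative assumptions; the paper's engineering formulae and the four techniques it characterises are not reproduced here.

```python
import numpy as np

def xy_cardioid_localisation(source_deg, axis_deg=45.0, speaker_base_deg=30.0):
    """Perceived angle (degrees, positive = right) of a source at
    source_deg, captured by an XY pair of cardioids angled at
    +/-axis_deg and reproduced over speakers at +/-speaker_base_deg.
    Simplified sketch using the stereophonic law of sines."""
    th = np.radians(source_deg)
    # cardioid sensitivity of each capsule toward the source
    g_l = 0.5 * (1.0 + np.cos(th + np.radians(axis_deg)))  # left aimed at -axis
    g_r = 0.5 * (1.0 + np.cos(th - np.radians(axis_deg)))  # right aimed at +axis
    # law of sines: sin(phi) / sin(phi0) = (gR - gL) / (gR + gL)
    ratio = (g_r - g_l) / (g_l + g_r)
    phi = np.arcsin(np.clip(ratio, -1.0, 1.0) * np.sin(np.radians(speaker_base_deg)))
    return float(np.degrees(phi))
```

Sweeping `source_deg` over the recording angle and plotting the result against the true position gives a localisation curve of the kind described in the abstract.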