Sunday, October 16, 2005
Spatialization techniques
Here is what I have compiled so far about current spatialization technology, and I've tried to roughly categorize the papers. As I mentioned in the last meeting, I won't be able to come to this week's meeting, but I would appreciate some feedback on this list.
Spatial Perception
HRTF-based binaural
Cinema-style loudspeaker arrays (5.1, 8-channel)
Ambisonics
Holophonics and Wave-field synthesis
Miscellaneous
Thanks, Grace
Spatial Perception
Spatial perception is frequency dependent: localization cues come mostly from the higher spectral components, so sounds with widely distributed spectra are easier to localize.
Frequency cues for localization seem to be based on experience. In nature, high-frequency sounds are more likely to originate above our heads and low-frequency sounds from below; inverting this mapping degrades localization.
Stationary sounds are easier to locate than moving ones; adding a Doppler shift to a moving source eases its localization.
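As a rough illustration of the Doppler cue mentioned above, here is a small Python sketch of my own (the function name and example values are made up, not taken from any of the papers below) that computes the shifted frequency a stationary listener hears from a moving source:

    import math

    SPEED_OF_SOUND = 343.0  # m/s in dry air at roughly 20 degrees C

    def doppler_frequency(f_source, source_pos, source_vel, listener_pos):
        """Frequency heard by a stationary listener from a moving source.

        Standard Doppler formula f' = f * c / (c + v_r), where v_r is the
        radial velocity of the source (positive when it moves away)."""
        # Vector from the source toward the listener
        dx = [l - s for l, s in zip(listener_pos, source_pos)]
        dist = math.sqrt(sum(d * d for d in dx))
        if dist == 0.0:
            return f_source
        # Radial velocity: positive when the source is receding
        v_radial = -sum(v * d for v, d in zip(source_vel, dx)) / dist
        return f_source * SPEED_OF_SOUND / (SPEED_OF_SOUND + v_radial)

    # A 440 Hz source passing the listener at 20 m/s is heard sharp on
    # approach and flat once it has passed:
    print(doppler_frequency(440.0, (-10.0, 2.0), (20.0, 0.0), (0.0, 0.0)))  # ~467 Hz
    print(doppler_frequency(440.0, (10.0, 2.0), (20.0, 0.0), (0.0, 0.0)))   # ~416 Hz

This approach/recede pitch glide is one of the cues Chowning's moving-source simulation (cited below) reproduces.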
- Malham, CMJ 19(4)
- Malham, D. G. “Approaches to spatialization” Organised Sound 3(2): 167–77.
- J. Chowning, "The simulation of moving sound sources", Journal of the Audio Engineering Society, vol. 19, no. 1, pp. 2-6, 1971.
- F. R. Moore, "A general model for spatial processing of sounds", Computer Music Journal, vol. 7, no. 3, pp. 6-15, 1983.
- M. Kleiner, B.-I. Dalenback, P. Svensson, "Auralization - An overview", Journal of the Audio Engineering Society, vol. 41, no. 11, pp. 861-875, 1993
- Theile, G. and G. Plenge “Localization of Lateral Phantom Sources” JAES 25(4): 196-200.
HRTF-based binaural
These techniques mimic the changes in timbre, delay, and amplitude that occur at the two ears.
They are most successful when personal HRTFs are used to tailor the experience to the individual listener, and when the listener’s head movement is tracked (which helps prevent front-back reversal errors). Measurements of dummy heads such as KEMAR provide an approximate effect. Recordings made in binaural format can be transferred to loudspeaker stereo using interaural crosstalk cancellation, though the result has an extremely small sweet spot. Binaural setups are suitable for Internet concerts, but not for traditional concert settings. HRTF processing also demands far more computation than the other spatialization techniques.
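To make the core operation concrete, here is a minimal sketch of my own (the function name is hypothetical; it assumes a pair of head-related impulse responses of equal length, e.g. taken from a KEMAR measurement set, at the same sample rate as the signal):

    import numpy as np
    from scipy.signal import fftconvolve

    def render_binaural(mono, hrir_left, hrir_right):
        """Place a mono signal at the direction the HRIR pair was measured
        for, by convolving it with the left- and right-ear impulse responses.
        Assumes the two HRIRs have the same length and sample rate as mono.
        Returns an (N, 2) array of left/right samples, peak-normalized."""
        left = fftconvolve(mono, hrir_left)
        right = fftconvolve(mono, hrir_right)
        out = np.stack([left, right], axis=1)
        return out / np.max(np.abs(out))

This only covers the static case: tracking the listener's head means switching between (and interpolating) HRIR pairs as the head turns, which is where filter-interpolation work such as Larcher and Jot's comes in.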
Researchers: Larcher, Jot, Warusfel
- Larcher, Véronique and Jean-Marc Jot “Techniques d'interpolation de filtres audio-numériques: Application à la reproduction spatiale des sons sur écouteurs” [Digital audio filter interpolation techniques: application to spatial sound reproduction over headphones], Congrès Français d'Acoustique, Marseille, France, April 1997
- Martens, William L. “Psychophysical calibration of auditory range control in binaural synthesis with independent adjustment of virtual source loudness” (JASA)
- Miller, Robert E. (Robin) III “Practical system for recording spatially lifelike 5.1 surround sound and 3D fully periphonic reproduction” (JASA)
- “Individualized HRTFs using computer vision and computational acoustics” JASA 108(5): 2597
- Cooper, D. H., and Bauck, J. L. 1989. "Prospects for transaural recording". J. Audio Eng. Soc. 37(1/2).
Cinema-style loudspeaker arrays (5.1, 8-channel)
Phantom sound images are created between loudspeakers. When there is at least an 18 dB difference in power between loudspeakers, the sound source appears to be located near the louder speaker. The perceived width of the sound image varies with the spectral content of the sound. The optimum angle between adjacent speakers is 60 degrees, meaning that a minimum of six speakers is needed to cover the two-dimensional space surrounding the listener; fewer speakers produce less stable images. The speaker layout cannot change between the studio monitoring setup and the concert hall.
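As a sketch of the underlying amplitude-panning idea (my own simplification; VBAP, listed just below, generalizes this to gain vectors and to 3-D speaker triplets), a constant-power pan between the pair of adjacent speakers that brackets the source direction looks roughly like this:

    import math

    def pan_pairwise(source_angle_deg, speaker_angles_deg):
        """Pairwise constant-power panning over a ring of loudspeakers.
        Returns one gain per speaker; only the two speakers bracketing the
        source direction get non-zero gain.  A simplified 2-D relative of
        VBAP, not the algorithm itself."""
        n = len(speaker_angles_deg)
        gains = [0.0] * n
        order = sorted(range(n), key=lambda i: speaker_angles_deg[i])
        for k in range(n):
            i, j = order[k], order[(k + 1) % n]
            lo = speaker_angles_deg[i]
            span = (speaker_angles_deg[j] - lo) % 360 or 360.0
            offset = (source_angle_deg - lo) % 360
            if offset <= span:
                # Sine/cosine crossfade keeps total power constant
                t = offset / span * (math.pi / 2)
                gains[i] = math.cos(t)
                gains[j] = math.sin(t)
                return gains
        return gains

    # Six speakers at 60-degree spacing, source at 40 degrees:
    print(pan_pairwise(40, [0, 60, 120, 180, 240, 300]))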
- VBAP (Vector-Based Amplitude Panning)
- Jot, Jean-Marc and Olivier Warusfel “Spat~: A Spatial Processor for Musicians and Sound Engineers” CIARM: International Conference on Acoustics and Musical Research, May 1995.
- Jot, J.-M., and Warusfel, O. 1995. "A real-time spatial sound processor for music and virtual reality applications". Proc. 1995 ICMC
- Dérogis, Philippe, René Caussé and Olivier Warusfel “On the Reproduction of Directivity Patterns Using Multi-Loudspeaker Sources” ISMA: International Symposium on Musical Acoustics, 1995
Ambisonics
Ambisonic systems can be either pantophonic (2-D) or periphonic (3-D, i.e. with height). Encoding and decoding are separate, largely independent processes, so an Ambisonic signal can be decoded with little information about how it was encoded. Ambisonic microphones can provide 3-D encoded signals directly. After encoding, each channel of a first-order (B-format) signal (W, X, Y, Z) can be manipulated independently. A pantophonic setup requires 3 channels and at least 4 loudspeakers; a with-height setup requires 4 channels and at least 8 loudspeakers.
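To make the encode/decode separation concrete, here is a minimal first-order sketch of my own (coefficient conventions differ between Ambisonic formulations, so treat the exact gains as illustrative rather than definitive):

    import math

    def encode_bformat(sample, azimuth, elevation):
        """First-order (B-format) encoding of a mono sample.  W is the
        omnidirectional component (conventionally scaled by 1/sqrt(2));
        X, Y, Z are figure-of-eight components.  Angles in radians,
        azimuth measured anticlockwise from straight ahead."""
        w = sample / math.sqrt(2)
        x = sample * math.cos(azimuth) * math.cos(elevation)
        y = sample * math.sin(azimuth) * math.cos(elevation)
        z = sample * math.sin(elevation)
        return w, x, y, z

    def decode_square(w, x, y):
        """Naive pantophonic decode of W/X/Y to four loudspeakers at 45,
        135, 225 and 315 degrees (a square around the listener); each feed
        is a virtual cardioid aimed at its speaker."""
        speaker_azimuths = [math.radians(a) for a in (45, 135, 225, 315)]
        return [0.5 * (math.sqrt(2) * w + x * math.cos(az) + y * math.sin(az))
                for az in speaker_azimuths]

Note that the decoder only needs to know its own speaker positions, nothing about how the signal was encoded, which is the independence described above.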
- Fellgett, Peter. “Ambisonics. Part One: General System Description”. Studio Sound, 1: 20-22, 40, August 1975.
- Gerzon, Michael A. “Periphony: With-Height Sound Reproduction”. Journal of the Audio Engineering Society, 21(1): 2-10, 1973.
- Gerzon, Michael A. “Ambisonics. Part Two: Studio Techniques”. Studio Sound, pages 24-30, October 1975
- Malham, David 'Higher order Ambisonic systems for the spatialisation of sound' Proceedings, ICMC99, Beijing, October 1999
- Malham, D and Myatt, A “3-D Sound Spatialization using Ambisonic Techniques” CMJ 19(4) 1995
- D.G. Malham “Homogeneous And Nonhomogeneous Surround Sound Systems”.
- Daniel, Jérôme, Jean-Bernard Rault and Jean-Dominique Polack “Ambisonics Encoding of Other Audio Formats for Multiple Listening Conditions” AES Convention, 1998
- David Malham at the University of York
Holophonics and Wave-field synthesis
Multiple loudspeakers are used to reproduce secondary sources of the wave front (see Huygens’ Principle). Because the actual sound field is reconstructed, this technique creates an acceptable image over a much larger listening area, but spatial aliasing becomes a problem above a frequency set by the loudspeaker spacing.
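As a back-of-the-envelope illustration of that limit (my own sketch, using the commonly quoted rule of thumb f_alias ≈ c / (2 · Δx · sin θ_max)):

    import math

    SPEED_OF_SOUND = 343.0  # m/s

    def wfs_aliasing_frequency(speaker_spacing_m, max_incidence_deg=90.0):
        """Approximate frequency above which spatial aliasing appears in a
        WFS loudspeaker array: f_alias = c / (2 * dx * sin(theta_max)).
        theta_max = 90 degrees gives the worst case, c / (2 * dx)."""
        return SPEED_OF_SOUND / (2.0 * speaker_spacing_m
                                 * math.sin(math.radians(max_incidence_deg)))

    # A spacing of 17 cm, for example, puts the aliasing limit near 1 kHz:
    print(wfs_aliasing_frequency(0.17))  # ~1009 Hz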
- Acoustic rendering with wave field synthesis, Marinus M. Boone
- Wave field synthesis: A promising spatial audio rendering concept, Günther Theile and Helmut Wittek, Acoust. Sci. & Tech. 25, 6 (2004)
- Wave Field Synthesis and Analysis Using Array Technology, Diemer de Vries and Marinus M. Boone, Proc. 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, Oct. 17-20, 1999
- Nicol, R. and M. Emerit, 1999
Miscellaneous
- Hyper-dense transducer array (Malham)
- Hybrid speaker-headphone approach?
- Huopaniemi, J. 1999. Virtual acoustics and 3-D sound in multimedia signal processing
- Organised Sound, Issue 3(2)
- Computer Music Journal Issue 19(4)
- Jot, Jean-Marc “Synthesizing Three-Dimensional Sound Scenes in Audio or Multimedia Production and Interactive Human-Computer Interfaces” 5th International Conference: Interface to Real & Virtual Worlds, Montpellier, France, May 1996
- Belin, Pascal, Bennett Smith, L. Thivard, Sophie Savel, Séverine Samson and Yves Samson “The functional anatomy of sound intensity change detection” Society for Neuroscience, 1997.
- Belin, Pascal, Stephen McAdams, Bennett K. Smith, Sophie Savel, Lionel Thivard and Séverine Samson “The functional anatomy of sound intensity discrimination” Journal of Neuroscience, 1998.
- Jullien, Jean-Pascal and Olivier Warusfel “Technologies et perception auditive de l'espace” [Technologies and auditory perception of space], Cahiers de l'Ircam (5), March 1994
- Jot, Jean-Marc “Efficient models for reverberation and distance rendering in computer music and virtual audio reality” ICMC: International Computer Music Conference, September 1997.