Close your eyes and listen to the sounds around you. How well can you tell where they're coming from? Pretty well, hopefully! How do we do that? And how could we use a computer to simulate moving sound so that, for example, we can make a car go screaming across a movie screen or a bass player seem to walk over our heads?
Humans have a pretty complicated system for perceptually locating sounds,
involving, among other factors, the relative loudness of the sound in
each ear, the time difference between the sounds arrival in each
ear, and the difference in frequency content of the sound as heard by
each ear. How would a "cyclaural" (the equivalent of a "cyclops")
hear? Most attempts at spatializing, or localizing, recorded sounds make
use of some combination of factors involving the two ears on either side
of the head.
Simulating Sound Placement
Simulating a loudness difference is pretty simple: if someone standing
to your right says your name, their voice is going to sound louder in
your right ear than in your left. The simplest way to simulate this volume
difference is to increase the volume of the signal in one channel while
lowering it in the other. You've probably used the pan
or balance knob on a car stereo or boombox, which does exactly this. Panning
is a fast, cheap, and fairly effective means of localizing a signal, although
it can often sound artificial.
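To make this concrete, here is a minimal sketch of panning in Python with NumPy. The function name pan_stereo and the equal-power (sine/cosine) pan law are our own illustrative choices, not something prescribed by the text; a plain linear crossfade would also work, but the equal-power law keeps the apparent loudness roughly constant as the sound moves across the stereo field.

```python
import numpy as np

def pan_stereo(mono, position):
    """Pan a mono signal into a stereo pair.

    position: -1.0 (hard left) .. 0.0 (center) .. +1.0 (hard right).
    Uses an equal-power (sine/cosine) pan law so perceived loudness
    stays roughly constant as the sound moves.
    """
    theta = (position + 1.0) * np.pi / 4.0   # map -1..+1 onto 0..pi/2
    left = np.cos(theta) * mono
    right = np.sin(theta) * mono
    return np.stack([left, right], axis=-1)  # shape: (samples, 2)

# Example: pan a one-second 440 Hz tone halfway to the right.
sr = 44100
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
stereo = pan_stereo(tone, 0.5)
```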
Interaural Time Delay (ITD)
Simulating a time difference is a little trickier, but it adds a lot to the realism of the localization. Why would a sound reach your ears at different times? After all, aren't our ears pretty close together? We're generally not even aware that this is true: snap your finger on one side of your head, and you'll think that you hear the sound in both ears at exactly the same time.
But you don't. Sound moves at a specific speed, and it's not
all that fast (compared to light, anyway): about 345 meters/second. Since
your fingers are closer to one ear than the other, the sound waves will
arrive at your ears at different times, if only by a small fraction of
a second. Since most of us have ears that are quite close together, the
time difference is very slight: too small for us to consciously "perceive."
Let's say your head is a bit wide: roughly 25 cm, or a quarter
of a meter. It takes sound around 1/345 of a second to go 1 meter, which
is approximately 0.003 second (3 thousandths of a second). It takes about
a quarter of that time to get from one ear of your wide head to the other,
which is about 0.0007 second (0.7 thousandths of a second). That's
a pretty small amount of time! Do you believe that our brains perceive
that tiny interval and use the difference to help us localize the sound?
We hope so, because if there's a frisbee coming at you, it would
be nice to know which direction it's coming from! In fact, though,
the delay is even smaller, because your head's smaller than 0.25 meter
(we just rounded it off for simplicity). The technical name for this delay
is interaural time delay (ITD).
To simulate ITD by computer, we simply need to add a delay to one channel of the sound. The longer the delay, the more the sound will seem to be panned to one side or the other (depending on which channel is delayed). The delays must be kept very short so that, as in nature, we don't consciously perceive them as delays, just as location cues. Our brains take over and use them to calculate the position of the sound. Wow!
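Here is a minimal sketch of that idea, again in Python with NumPy. The name itd_pan and the extra_distance parameter are our own hypothetical framing: we convert a path-length difference (at most roughly the width of the head) into a whole number of samples of delay for the far ear.

```python
import numpy as np

SPEED_OF_SOUND = 345.0  # meters/second, as in the text

def itd_pan(mono, sr, extra_distance):
    """Simulate ITD by delaying one channel of a stereo pair.

    extra_distance: how much farther, in meters, the sound travels
    to reach the far ear (at most roughly the width of the head).
    """
    delay_seconds = extra_distance / SPEED_OF_SOUND
    delay_samples = int(round(delay_seconds * sr))
    # Zero-pad so both channels end up the same length.
    near = np.concatenate([mono, np.zeros(delay_samples)])
    far = np.concatenate([np.zeros(delay_samples), mono])
    return np.stack([near, far], axis=-1)  # shape: (samples, 2)

# Check against the numbers above: a 0.25 m path difference gives a
# delay of 0.25 / 345, about 0.0007 second, or roughly 32 samples
# at a 44,100 Hz sampling rate.
```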
Modeling Our Ears and Our Heads
That the ears perceive and respond to a difference in volume and arrival
time of a sound seems pretty straightforward, albeit amazing. But what's
this about a difference in the frequency content of the sound? How could
the position of a bird change the spectral makeup of its song? The answer:
your head!
Imagine someone speaking to you from another room. What does the voice
sound like? It's probably a bit muffled or hard to understand. That's
because the wall through which the sound is traveling, besides simply
cutting down the loudness of the sound, acts like a low-pass filter.
It lets the low frequencies in the voice pass through while attenuating,
or muffling, the higher ones.
Your head does the same thing. When a sound comes from your right, it must first pass through, or go around, your head in order to reach your left ear. In the process, your head absorbs, or blocks, some of the high-frequency energy in the sound. Since the sound didn't have to pass through your head to get to your right ear, there is a difference in the spectral makeup of the sound that each ear hears. As with ITD, this is a subtle effect, although if you're in a quiet room and you turn your head from side to side while listening to a steady sound, you may start to perceive it.
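A crude way to sketch this head-shadow effect in code is to run the far-ear channel through a simple low-pass filter. The one-pole filter below, and the 4 kHz cutoff in the usage note, are our own illustrative assumptions; the head's actual filtering is far more complicated and direction-dependent.

```python
import numpy as np

def one_pole_lowpass(signal, sr, cutoff_hz):
    """A simple one-pole low-pass filter.

    Only a rough stand-in for head shadowing: it attenuates high
    frequencies while letting low ones through, as a wall (or a
    head) does.
    """
    a = np.exp(-2.0 * np.pi * cutoff_hz / sr)  # feedback coefficient
    out = np.zeros(len(signal))
    prev = 0.0
    for i, x in enumerate(signal):
        prev = (1.0 - a) * x + a * prev  # mostly old value, a bit of new
        out[i] = prev
    return out

# For a sound on the right, filter only the left (far-ear) channel,
# e.g. left = one_pole_lowpass(left, 44100, 4000.0).
```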
Modeling this by computer is easy, provided you know something about
how the head filters sounds (what frequencies are attenuated and by how
much). If you're interested in the frequency response of the human
head, there are a number of published sources for the data,
since they are used by, among others, the government for all sorts of
things (flight simulators, for example). Researcher and author Durand
Begault has been a leading pioneer in the design and implementation of
what are called head-related transfer functions: frequency response
curves for different locations of sound.
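Once you have such measured responses, spatializing a sound amounts to convolving it with a pair of impulse responses, one per ear. Below is a minimal sketch, assuming hrir_left and hrir_right are equal-length arrays taken from a published HRTF data set at the same sample rate as the input; the function name apply_hrtf is our own.

```python
import numpy as np

def apply_hrtf(mono, hrir_left, hrir_right):
    """Spatialize a mono signal with a pair of head-related impulse
    responses (the time-domain form of an HRTF), one per ear.

    For a single source direction, the HRIR pair encodes all three
    cues discussed above: loudness difference, time delay, and the
    head's filtering.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=-1)
```

Changing the apparent direction of the source then just means switching to the HRIR pair measured at the new direction.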
What Are Head-Related Transfer Functions (HRTFs)?
Figure 5.10 This illustration
shows how the spectral contents of a sound change depending on which direction
the sound is coming from. The body (head and shoulders) and the time-of-arrival
difference that occurs between the left and right ears create a filtering
effect.
Figure 5.11 The binaural dummy head recording system includes an acoustic baffle with the approximate size, shape, and weight of a human head. Small microphones are mounted where our ears are located.
This recording system is designed to emulate the acoustic effects of the human head (just as our ears might hear sounds) and then capture the information on recording media.
A number of recording equipment manufacturers make these "heads," and they often have funny names (Sven, etc.).
Thanks to Sonic Studios for this photo.
Not surprisingly, humans are extremely adept at locating sounds in two
dimensions (the horizontal plane). We're great at figuring out the source
direction of a sound, but not its height. When a lion is coming at us,
it's nice of evolution to have provided us with the ability to know,
quickly and without much thought, which way to run. It's perhaps
more of a surprise that we're less adept at locating sounds in the
third dimension, or more accurately, along the "up/down" axis.
But we don't really need this ability. We can't jump high enough
for that perception to do us much good. Barn owls, on the other hand,
have little filters on their cheeks, making them extraordinarily good
at sensing their sonic altitude. You would be good at sensing
your sonic altitude, too, if you had to catch and eat, from the
air, rapidly running field mice. So if it's not a frisbee heading
at you more or less in the two-dimensional plane, but a softball headed
straight down toward your head, we'd suggest a helmet!