Physiology of the Auditory System


Key Points

  • The external ear funnels acoustic signals into the ear and plays an important role in sound localization.

  • The middle ear matches the impedance between the air-filled external environment and the fluid-filled inner ear.

  • The inner ear has two mobile windows, the oval and round windows. A so-called third window associated with an abnormality of the inner ear (e.g., superior canal dehiscence or large vestibular aqueduct syndrome) can cause vestibular symptoms as well as conductive hearing loss as a result of a change in inner ear impedance.

  • A traveling wave is initiated on the basilar membrane as sound energy enters the cochlea.

  • The basilar membrane acts as a series of frequency filters and maps frequency-specific information (tonotopic organization).

  • The amount of sound energy that reaches the cochlea can be altered by changes in middle or inner ear impedance.

  • Basilar membrane displacement causes deflection of the stereocilia on hair cells and results in depolarization or hyperpolarization of the hair cells.

  • Hair cells transduce mechanical energy (acoustic energy) to an electrochemical signal that is propagated as an action potential along the auditory nerve.

  • Inner hair cells are innervated by type I auditory afferents; they conduct sound information to the cochlear nucleus.

  • Outer hair cells contract with depolarization and elongate with hyperpolarization to alter the mechanical properties of the basilar membrane.

  • The spiral ganglion cell is the first-order neuron of the auditory system and is a bipolar neuron that sends processes both peripherally and centrally.

  • The auditory nerve carries information about which fibers are responding to sound as well as the rate and timing pattern of each fiber.

  • The cochlear nucleus is the first relay station for all ascending auditory information that originates in the ear; it receives inputs from the spiral ganglion cells via the auditory nerve.

  • Tonotopic organization is preserved in the central auditory pathway.

  • The auditory brainstem processes interaural timing and amplitude differences between the ears to determine the location of a sound source.

  • Binaural hearing improves the signal-to-noise ratio of a sound source in background noise using three mechanisms: the head shadow effect, binaural squelch, and summation.

  • The primary auditory cortex is located on the Heschl gyrus and is tonotopically organized. This region processes complex auditory signals to enable language comprehension.

  • The auditory association cortex is located lateral to the primary auditory cortex and is part of the language reception area known as the Wernicke area.

  • Descending pathways to the ear (auditory efferents) modulate the response of the middle ear (acoustic reflex) and inner ear (medial olivocochlear [OC] reflex) to certain types of sounds.

Through a fundamental understanding of normal auditory physiology, otolaryngologists can correlate structural changes with pathologic disturbances in the auditory system. In this system, both passive and active mechanisms work synergistically to provide sensitive hearing in humans. This chapter summarizes current knowledge of how the peripheral and central auditory pathways process sound and speech.

Sound and Its Measurement

Sound is produced by a disturbance in particle density, where a particle is made up of many molecules of the sound-propagating medium; this is triggered by a sound-producing body or sound source. If this disturbance in particle density travels through an elastic medium, such as air or fluid, sound propagation occurs. For example, as a tuning fork—a sound source—is struck, it causes air particles adjacent to it to vibrate. The vibration of these air particles causes other nearby air particles to vibrate, thereby enabling sound propagation. The velocity of sound propagation in dry air is about 340 m/s at room temperature, whereas sound travels at around 1500 m/s in water.

One of the most important concepts in the study of acoustics is simple harmonic motion (Fig. 128.1), which is a periodic motion that undulates around a null point with equal amplitudes, similar to a sine function. The frequency of a simple harmonic motion is the number of cycles per second and is measured in hertz (Hz). The period of a cycle is the inverse of its frequency (1/f) and represents the duration of a single cycle. The amplitude is the maximum amount of displacement from the null point in one direction. The sound produced by simple harmonic motion is called a pure tone. However, in an everyday environment, most sound sources do not produce sounds that follow simple harmonic motion, and any vibration that does not follow simple harmonic motion is said to be complex. If the complex vibration has a repetitive periodic pattern, it always produces tones. If the complex vibration has no repetitive pattern, it results in noise.

Fig. 128.1, Simple harmonic motion is a periodic motion that undulates around a null point with equal amplitudes. The amplitude is the maximum amount of displacement from the null point in one direction; the frequency of a simple harmonic motion is the number of cycles per second and is measured in hertz; the period of a cycle is the inverse of its frequency (1/f) and represents the duration of a single cycle.
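The relationship among amplitude, frequency, and period described above can be illustrated with a minimal Python sketch. The 440-Hz frequency and unit amplitude are arbitrary example values chosen for illustration, and the function names are hypothetical.

```python
import math


def pure_tone(amplitude, frequency_hz, t_seconds):
    """Displacement from the null point at time t for simple harmonic motion."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t_seconds)


def period_s(frequency_hz):
    """Duration of one cycle: the inverse of the frequency (1/f)."""
    return 1.0 / frequency_hz


# Arbitrary example values, not taken from the chapter.
frequency_hz = 440.0
amplitude = 1.0

T = period_s(frequency_hz)

# Sample one full cycle at nine equally spaced instants (0, T/8, ..., T).
samples = [pure_tone(amplitude, frequency_hz, k * T / 8) for k in range(9)]

for k, value in enumerate(samples):
    print(f"t = {k}/8 of a period, displacement = {value:+.3f}")
```

The displacement starts at the null point, reaches the positive amplitude a quarter of the way through the period, and returns to the null point at the end of the cycle.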

One way to quantify sound is by its intensity, which is cumbersome to measure directly. However, sound pressure, which is proportional to the square root of intensity, is relatively easy to measure and is the most common way of quantifying sound. Sound pressure, measured in pascals (Pa, equal to newtons per square meter [N/m²]), represents the force per unit area that vibrating sound particles exert on a surface.

Because the human ear is capable of perceiving a huge dynamic range of sound intensity (about 10¹²-fold), a convenient way of expressing sound intensity is to take the base-10 logarithm of the ratio of two sound intensities, with the sound intensity of interest as the numerator and a reference sound intensity as the denominator, and multiply it by 10. This calculation defines the decibel scale, and the formula for determining decibels (dB) for sound intensity is

$$\mathrm{dB} = 10 \log_{10}\left(\frac{J}{J_r}\right)$$

where J is the intensity of the sound of interest and Jr is the intensity of the reference sound. Because pressure varies as the square root of intensity, it is necessary to square the pressures in the decibel formula when sound pressure is being used. Therefore the formula for determining decibels for sound pressure is

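$$\mathrm{dB} = 10 \log_{10}\left(\frac{P^2}{P_r^2}\right) = 10 \log_{10}\left(\frac{P}{P_r}\right)^2 = 20 \log_{10}\left(\frac{P}{P_r}\right)$$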

where P is the sound pressure of interest and Pr is the reference sound pressure. For example, if the sound of interest has 10 times the pressure of the reference sound, its level is 20 dB above the reference. If the sound of interest has 100 times the pressure, its level is 40 dB above the reference. The most commonly used reference sound pressure is 20 µPa; decibel values expressed relative to this reference are referred to as sound-pressure level (SPL). Another reference that is occasionally used is hearing level, which takes as its reference the threshold sound pressure at a specific frequency as measured in normal subjects; this threshold sound pressure varies across the frequency range.
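The two decibel formulas and the worked examples above can be reproduced with a short Python sketch. The function names are hypothetical; the 20-µPa reference is the SPL reference given in the text.

```python
import math

# 20 µPa, the reference pressure for dB SPL stated in the text.
REFERENCE_PRESSURE_PA = 20e-6


def intensity_db(intensity, reference_intensity):
    """Decibels from an intensity ratio: 10 * log10(J / Jr)."""
    return 10.0 * math.log10(intensity / reference_intensity)


def pressure_db(pressure_pa, reference_pa=REFERENCE_PRESSURE_PA):
    """Decibels from a pressure ratio: 20 * log10(P / Pr)."""
    return 20.0 * math.log10(pressure_pa / reference_pa)


# Ten times the reference pressure: about 20 dB above the reference.
level_10x = pressure_db(10 * REFERENCE_PRESSURE_PA)

# One hundred times the reference pressure: about 40 dB above the reference.
level_100x = pressure_db(100 * REFERENCE_PRESSURE_PA)

# A 10^12-fold intensity range corresponds to about 120 dB.
dynamic_range_db = intensity_db(1e12, 1.0)

print(round(level_10x, 1), round(level_100x, 1), round(dynamic_range_db, 1))
```

Because intensity varies as the square of pressure, a 10-fold pressure ratio is a 100-fold intensity ratio, and both formulas return the same 20 dB for that sound.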

Impedance

On a general level, impedance can be thought of as the impediment to movement. In the study of acoustics, impedance is defined as the ratio of the acoustic pressure to the volume velocity generated by the acoustic pressure. Imagine an acoustic stimulus striking an elastic membrane, such as the tympanic membrane (TM); the greater the sound pressure of the acoustic stimulus, the larger the motion of the membrane and the higher the velocities achieved by that motion. The exact relationship between the pressure of the acoustic stimulus and the membrane volume velocity is governed by acoustic impedance. Acoustic impedance has three components: stiffness, resistance (damping), and mass. If the membrane were stiffer than normal, the volume velocity generated by the acoustic stimulus would be decreased. Similarly, if the mass of the membrane were increased, it would be reasonable to also expect the volume velocity generated by the acoustic stimulus to decrease. In addition, a small amount of sound energy is lost as a result of the damping effect of the system. In a simple acoustic resonator, the impedance of a stiffness varies inversely with frequency and dominates the acoustic impedance at low frequencies, whereas the impedance of a mass increases with frequency and dominates at high frequencies. When the acoustic impedance is at its lowest point (that is, at the frequency where the stiffness and mass components of the acoustic impedance cancel each other out), the system is said to be in resonance.
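The frequency dependence of the stiffness and mass components can be sketched in Python using the standard lumped-element model of a simple resonator, in which the impedance is Z = R + j(ωM − K/ω). This model and the parameter values below are assumptions made for illustration; they are arbitrary numbers in arbitrary units and are not measurements of the ear.

```python
import math


def impedance(frequency_hz, stiffness, resistance, mass):
    """Complex impedance of a simple resonator: R + j(omega*M - K/omega)."""
    omega = 2 * math.pi * frequency_hz
    return complex(resistance, omega * mass - stiffness / omega)


def resonant_frequency_hz(stiffness, mass):
    """Frequency at which the mass and stiffness terms cancel."""
    return math.sqrt(stiffness / mass) / (2 * math.pi)


# Hypothetical parameter values in arbitrary units (illustration only).
K, R, M = 1.0e6, 50.0, 1.0e-2

f0 = resonant_frequency_hz(K, M)

for label, f in (("low", f0 / 10), ("resonance", f0), ("high", f0 * 10)):
    z = impedance(f, K, R, M)
    print(f"{label:>9}: f = {f:9.1f} Hz, |Z| = {abs(z):8.1f}, reactance = {z.imag:+9.1f}")
```

Below resonance the reactance is negative (stiffness dominated), above resonance it is positive (mass dominated), and at resonance only the resistive term remains, so the impedance magnitude is at its minimum.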

External Ear

The external ear is composed of the pinna and the external auditory canal. The external ear serves to funnel sound from the external environment into the ear. The peculiar shape of the pinna and the external auditory canal gives rise to specific resonant frequencies as these structures are struck by sound: the concha has a resonant frequency of around 5300 Hz, and the external auditory canal has a resonant frequency of around 3000 Hz. The external ear plays an important role in sound localization, which is achieved by two major mechanisms: interaural time difference and interaural amplitude difference. Because the left and right ears are located at the opposite sides of the head, the amount of time it takes for a sound stimulus to arrive at each individual ear is governed by the distance from the sound source to that particular ear: the farther the distance, the longer it takes for the sound stimulus to arrive. As a cue for sound localization, the differences in the arrival times of the sound stimulus between the two ears can be used, as can the differences in amplitude perceived by the two ears. This difference in amplitude is further increased by the so-called head shadow effect, in which sound coming from one side is attenuated by the head as the sound travels to the contralateral ear. The head shadow effect in binaural hearing helps to improve the signal-to-noise ratio in adverse listening environments; one ear can be closer to the source of sound or speech while the contralateral ear is exposed to the background noise. It has been shown that the interaural time difference is important for low-frequency sound localization, whereas the interaural amplitude difference is important for higher frequencies.
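The dependence of interaural time difference on source direction can be sketched with a deliberately simplified geometric model in Python. The model treats the two ears as points separated by a fixed distance and ignores the bending of sound around the head; the 0.18-m ear separation is a hypothetical round number, and the speed of sound is the 340 m/s value for air given earlier in the chapter.

```python
import math

# Speed of sound in dry air at room temperature, as given in the text.
SPEED_OF_SOUND_M_S = 340.0


def interaural_time_difference_s(azimuth_deg, ear_separation_m=0.18):
    """Arrival-time difference between the ears for a distant source.

    azimuth_deg is 0 for a source straight ahead and 90 for a source
    directly to one side. The ear separation is a hypothetical value,
    and diffraction around the head is ignored.
    """
    path_difference_m = ear_separation_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S


for azimuth in (0, 30, 60, 90):
    itd_us = interaural_time_difference_s(azimuth) * 1e6
    print(f"azimuth {azimuth:2d} degrees: ITD is about {itd_us:.0f} microseconds")
```

Under these assumptions the time difference is zero for a source straight ahead and grows to roughly half a millisecond for a source directly to one side.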

Middle Ear Mechanics

The middle ear is composed of the TM, the ossicles (malleus, incus, and stapes), and the stapedius and tensor tympani muscles. The TM has a conical shape, and its medial surface is coupled to the manubrium of the malleus. As a sound stimulus enters the external auditory canal, it causes the TM to vibrate. The malleus, which is coupled to the TM, vibrates in response to the motion of the TM. This causes the entire ossicular chain to vibrate and results in sound transmission to the inner ear via the stapes footplate. This pathway of sound transmission is referred to as ossicular coupling. The ossicular chain has two synovial joints that are mobile, the incudomalleal and incudostapedial joints. The ossicular chain vibrates along an axis that projects through the head of the malleus and the body of the incus in an anteroposterior direction (Fig. 128.2). The stapes, the smallest bone in the body, transmits the output of the middle ear into the inner ear through the oval window.

Fig. 128.2, Schematic of the middle ear system.

The inner ear is filled with fluid; if a sound stimulus strikes the fluid directly, most of the acoustic energy will be reflected because the impedance of fluid is much greater than that of air. The pathway of sound transmission to the inner ear in the absence of the ossicular system is referred to as acoustic coupling. It has been shown that the difference between ossicular coupling and acoustic coupling is about 60 dB, which is the maximal amount of hearing loss expected in patients with ossicular discontinuity. The middle ear plays an important role in the process of impedance matching between the air-filled middle ear and the fluid-filled inner ear to allow for efficient sound transmission. The most important factor in the middle ear's impedance-matching capability is the area ratio between the TM and the stapes footplate (see Fig. 128.2). The human TM has a surface area approximately 20 times larger than that of the stapes footplate (69 mm² vs 3.4 mm², respectively). If all the force applied to the TM were to be transferred to the stapes footplate, the force per unit area would be 20 times larger (26 dB) on the footplate than on the TM. A second mechanism for impedance matching is called the lever ratio, which refers to the difference in length between the manubrium of the malleus and the long process of the incus. Because the manubrium is slightly longer than the long process of the incus, a small force applied to the long arm of the lever (manubrium) results in a greater force on the short arm of the lever (incus long process). In humans, the lever ratio is about 1.31 to 1 (2.3 dB). In theory, the combined effects of the area ratio and the lever ratio give the middle ear output a gain of about 28 dB; in reality, the middle ear sound pressure gain is only about 20 dB. This is mostly because the TM does not move as a rigid diaphragm. In fact, at higher frequencies, it vibrates in a complex manner, with multiple areas that vibrate differently. Therefore the effective area of the TM involved with impedance matching is smaller than its total area. Nevertheless, the 20-dB middle ear sound-pressure gain helps to facilitate sound transmission from the air-filled middle ear into the fluid-filled inner ear.
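The theoretical middle ear gain can be checked with a brief Python sketch that converts the area ratio and the lever ratio to decibels using the pressure formula given earlier (20 log10 of the ratio). The areas and lever ratio are the values quoted in the text; the function and variable names are hypothetical. Note that the sketch uses the unrounded ratio 69/3.4, so the area term comes out slightly above the 26 dB obtained from a ratio of exactly 20.

```python
import math

# Values quoted in the text.
TM_AREA_MM2 = 69.0
FOOTPLATE_AREA_MM2 = 3.4
LEVER_RATIO = 1.31


def ratio_to_db(pressure_ratio):
    """Convert a pressure ratio to decibels: 20 * log10(ratio)."""
    return 20.0 * math.log10(pressure_ratio)


area_ratio = TM_AREA_MM2 / FOOTPLATE_AREA_MM2
area_gain_db = ratio_to_db(area_ratio)
lever_gain_db = ratio_to_db(LEVER_RATIO)

# Pressure ratios multiply, so their decibel values add.
theoretical_gain_db = area_gain_db + lever_gain_db

print(f"area ratio  {area_ratio:5.1f} : 1 -> {area_gain_db:4.1f} dB")
print(f"lever ratio {LEVER_RATIO:5.2f} : 1 -> {lever_gain_db:4.1f} dB")
print(f"theoretical total        -> {theoretical_gain_db:4.1f} dB")
```

The total is an idealized upper bound that assumes the whole TM moves as a rigid piston, which is why the measured gain of about 20 dB is lower.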

Inner Ear Physiology

The inner ear is enclosed in a bony cavity called the otic capsule, and it has two mobile windows, one oval and one round. The inner ear serves the important functions of hearing and balance. The portion of the inner ear that deals with hearing is the cochlea, whereas the portion of the inner ear that deals with maintaining balance is collectively known as the vestibular organs: the semicircular canals, utricle, and saccule. The cochlea is shaped like a snail and has a spiral configuration with about two and a half turns (Fig. 128.3A). The center portion of the spiral is called the modiolus. The portion of the cochlea that is closest to the oval window is referred to as the base, whereas the portion of the cochlea that is farthest away from the oval window is referred to as the apex. The cochlea is a fluid-filled space with three compartments known as the scala tympani, scala media, and scala vestibuli (see Fig. 128.3B). The scala tympani and the scala media are separated by the basilar membrane, whereas the scala media and the scala vestibuli are separated by the Reissner membrane. The scala tympani and the scala vestibuli join together at the apex of the cochlea to form the helicotrema.

Fig. 128.3, (A) Histologic section showing a normal human cochlea (hematoxylin and eosin stain). The cochlea is shaped like a snail and has a spiral configuration with about two and a half turns. The center portion of the spiral is called the modiolus. The portion of the cochlea that is closest to the oval window is the base, and the portion farthest away from the oval window is the apex. (B) Higher magnification of this histologic section shows the organ of Corti. The cochlea is a fluid-filled space with three compartments: the scala tympani, scala media, and scala vestibuli. The scala tympani and scala media are separated by the basilar membrane; the scala media and scala vestibuli are separated by the Reissner membrane. The scala tympani and scala vestibuli join together at the apex of the cochlea to form the helicotrema. The scala media contains the organ of Corti, which rests on the basilar membrane. The organ of Corti and the basilar membrane together are sometimes referred to as the cochlear partition. The stria vascularis plays an important role in maintaining the electrochemical environment of the cochlea.

The scala media contains the organ of Corti, which rests on the basilar membrane. Taken together, the organ of Corti and the basilar membrane are referred to as the cochlear partition. The organ of Corti has an arch at its center, called the arch of Corti, formed by the inner and outer pillar cells. The inner hair cells are flask-shaped cells that rest on the side of the inner pillar cells, whereas the outer hair cells are cylindrical cells that rest on the side of the outer pillar cells. In the human cochlea, about 3000 inner hair cells extend in a single row from the base to the apex, whereas about 12,000 outer hair cells are arranged in three or four rows. The hair cells derive their names from the hair-like projections apparent on their apical surface termed stereocilia, which play an important role in the signal-transduction properties of the hair cells.

The scala vestibuli and scala tympani are filled with perilymph, which has a composition similar to that of the extracellular fluid (high in sodium and low in potassium; Fig. 128.4A). The scala media is filled with endolymph, which has a similar composition to the intracellular fluid (low in sodium, high in potassium). The unique electrolyte composition of the scala media sets up a large electrochemical gradient, called the endocochlear potential, which is +60 to +100 mV relative to the perilymph (Fig. 128.5). The maintenance of such a large electrochemical gradient is accomplished by the stria vascularis, which resides on the outer wall of the scala media, away from the modiolus. The stria vascularis contains multiple active ion channels and maintains the chemical composition of the endolymph and its positive electrical potential.

Fig. 128.4, Mechanoelectrical transduction of the auditory signal depends on the recycling of potassium ions in the organ of Corti. (A) Schematic cross-sectional view of the human cochlea. The scala media (cochlear duct) is filled with endolymph, and the scala vestibuli and tympani are filled with perilymph. The endolymph of the scala media bathes the organ of Corti, located between the basilar and tectorial membranes and containing the inner and outer hair cells. A relatively high concentration of potassium in the endolymph of the scala media relative to the hair cell creates a cation gradient maintained by the activity of the epithelial supporting cells, spiral ligament, and stria vascularis. (B) Cells contain stereocilia along their apical surface and are connected by tip links. The potassium gradient is essential to enable depolarization of the hair cell following influx of potassium ions in response to mechanical vibration of the basilar membrane, deflection of stereocilia, displacement of tip links, and opening of gated potassium channels. Depolarization results in calcium influx through channels along the basolateral membrane of the hair cell, which causes degranulation of neurotransmitter vesicles into the synaptic terminal and propagates an action potential along the auditory nerve. Gap junction proteins between the hair cells (potassium channel, yellow) and epithelial supporting cells (connexin channels, red) allow for the flow of potassium ions back to the stria vascularis, where they are pumped back into the endolymph.

Fig. 128.5, Schematic showing the electrochemical environment of the cochlea.

As sound energy travels through the external and middle ears, it causes the stapes footplate to vibrate. This vibration results in a compressional wave in the inner ear fluid, which travels across the scala vestibuli, around the helicotrema, and across the scala tympani toward the round window; therefore an inward motion of the stapes causes an outward motion of the round window. However, as this compressional wave travels across the scala vestibuli, the pressure in the scala vestibuli is higher than that in the scala tympani. This sets up a pressure gradient, which causes the cochlear partition to vibrate. Georg von Békésy first described the vibration of the cochlear partition in cadaveric human cochleas. He demonstrated that as the cochlear partition is deflected by the compressional wave created by the stapes footplate vibration, it sets up a traveling wave on the basilar membrane, which travels from the base of the cochlea to its apex (Fig. 128.6). In addition, von Békésy found that the basilar membrane varied in its stiffness along its length, with higher stiffness near the base and lower stiffness near the apex. This property of the basilar membrane allows it to respond to various frequencies differently, such that the amplitude of the traveling wave peaks (resonates) at a specific place along the basilar membrane, with the higher frequencies at the base and the lower frequencies toward the apex. This enables the basilar membrane to act as a series of filters that respond to specific sound frequencies at specific locations along its length. In other words, the basilar membrane is tonotopically tuned to different frequencies along its length. His seminal work on cochlear mechanics earned von Békésy the Nobel Prize in Physiology or Medicine in 1961.

Fig. 128.6, Schematic showing sound propagation in the cochlea.
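The place-to-frequency mapping of the basilar membrane can be illustrated with a short Python sketch. The chapter does not give a formula for this map; the sketch assumes the Greenwood function, a commonly cited empirical fit, with the constants usually quoted for the human cochlea (A = 165.4, a = 2.1, k = 0.88). These constants and the function name are assumptions introduced here for illustration only.

```python
def greenwood_frequency_hz(relative_distance_from_apex, A=165.4, a=2.1, k=0.88):
    """Approximate characteristic frequency at a place on the basilar membrane.

    relative_distance_from_apex runs from 0.0 (apex) to 1.0 (base).
    The constants are commonly quoted human values for the Greenwood
    function and are an assumption, not data from this chapter.
    """
    x = relative_distance_from_apex
    if not 0.0 <= x <= 1.0:
        raise ValueError("relative distance must be between 0 and 1")
    return A * (10 ** (a * x) - k)


# Walk from the apex (low frequencies) to the base (high frequencies).
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{x:4.2f} of the way from apex to base: about {greenwood_frequency_hz(x):8.0f} Hz")
```

Under this assumed fit, the apex responds best to frequencies of a few tens of hertz and the base to frequencies of around 20 kHz, consistent with the base-to-apex ordering described above.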

Even though the cochlea is usually thought of as having two mobile (oval and round) windows, the possibility of a third mobile window has been proposed. Typically, an air-bone gap shown on audiologic testing is associated with some form of middle ear pathology. More recently, a select group of patients has been described who have air-bone gaps on audiologic testing but no middle ear pathology on intraoperative exploration. It has been shown that the air-bone gap in this group of patients can be explained by a pathologic “third window.” Perhaps the best-studied example of this phenomenon is seen in superior canal dehiscence (SCD) syndrome (Fig. 128.7). Patients with SCD often complain of autophony, aural fullness, sound- and/or pressure-induced vertigo, and hearing loss. It is thought that the dehiscence in the superior semicircular canal acts as a third mobile window of the inner ear that shunts acoustic energy away from the cochlea, resulting in a decreased sensitivity to air-conducted sound and the air-bone gap seen on audiologic testing. This third window is also theorized to decrease cochlear input impedance at the oval window, which increases the pressure gradient across the cochlear partition and results in hypersensitivity to bone-conducted sound. Repair or plugging of the SCD results, in some cases, in closure of the preoperative air-bone gap (see Fig. 128.7). The third-window hypothesis has also been used to explain the air-bone gap associated with other temporal bone anomalies, such as an enlarged vestibular aqueduct and other inner ear malformations.

Fig. 128.7, An inner ear impedance change caused by superior canal dehiscence is associated with an air-bone gap on audiometric testing. (A through F) Clinical images from a 28-year-old woman with autophony, hearing loss, and sound- and pressure-induced oscillopsia and dizziness. Left-sided superior canal dehiscence (SCD) syndrome was diagnosed. (A and B) Temporal bone CT scans reformatted to Stenvers view (perpendicular to the superior canal, A) and Pöschl view (parallel to the superior canal, B) of the left ear show a 4-mm dehiscence (arrowheads). (C and D) Intraoperative photos of left SCD during middle fossa craniotomy and dural retraction. Note the exposed membranous labyrinth of the superior canal and surrounding dehiscent tegmen in C, which correlate with radiology in A and B. Plugging of the left SCD with bone wax is shown in D. Temporalis fascia and a split-calvarial bone graft were also used to reconstruct the dehiscent middle fossa floor. Preoperative audiogram (E) and post-SCD plugging audiogram (F) demonstrate closure of the air-bone gap associated with SCD. ANSI, American National Standards Institute; SSC, superior semicircular canal.

As the cochlear partition is deflected in response to the compressional wave initiated by the stapes, it causes a shearing force between the stereocilia of the hair cells and the tectorial membrane. This causes a deflection of the hair cell stereocilia, which are arranged in orderly rows by height (see Fig. 128.4B), and the tips of these stereocilia are connected from one row to the next by elastic filaments called tip links. Deflection of the stereocilia in the direction of the tallest row is thought to stretch these tip links. The stretch of the tip links then causes the opening of stretch-sensitive cationic channels located on the stereocilia (Fig. 128.8). Because there is a large electrochemical gradient across the apical surface of the hair cells, with a large positive endocochlear potential on one side and a large negative intracellular potential on the other, the opening of these stretch-sensitive cationic channels on the stereocilia causes a large influx of cationic current, which leads to hair cell depolarization. As the stereocilia are deflected away from the tallest row, the tip links relax and thereby decrease the probability of the ion channel opening; this leads to hyperpolarization of the hair cell. It is important to note that the relationship between the degree of stereocilia deflection and hair cell depolarization/hyperpolarization is neither symmetrical nor linear. In fact, stereocilia deflection in the depolarization direction produces a greater response than deflection in the hyperpolarization direction (Fig. 128.9). The deflection of the hair cell stereocilia and the resulting hair cell depolarization or hyperpolarization represent an important step in the signal transduction process of the hair cell by converting a mechanical signal (inner ear fluid wave) into an electrochemical signal. It has recently been shown that the transmembrane channel–like proteins 1 and 2 (TMC1 and TMC2) play an integral part in this mechanotransduction process.
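The asymmetric, nonlinear relationship between stereocilia deflection and hair cell response can be illustrated with a simple Python sketch. The chapter does not specify a mathematical form for this relationship; the sketch assumes a two-state Boltzmann function for channel open probability, a common textbook idealization. The half-activation and slope values are hypothetical numbers chosen so that only a small fraction of channels is open at rest, and positive deflection is taken to mean deflection toward the tallest row.

```python
import math


def open_probability(deflection_nm, half_activation_nm=30.0, slope_nm=12.0):
    """Assumed two-state Boltzmann model of transduction channel opening.

    Positive deflection is toward the tallest row of stereocilia.
    Both parameters are hypothetical illustration values.
    """
    return 1.0 / (1.0 + math.exp(-(deflection_nm - half_activation_nm) / slope_nm))


def response_change(deflection_nm):
    """Change in open probability relative to the resting (undeflected) bundle."""
    return open_probability(deflection_nm) - open_probability(0.0)


# Equal and opposite deflections produce unequal responses.
excitatory = response_change(+50.0)   # toward the tallest row (depolarizing)
inhibitory = response_change(-50.0)   # away from the tallest row (hyperpolarizing)

print(f"resting open probability: {open_probability(0.0):.3f}")
print(f"change for +50 nm: {excitatory:+.3f}")
print(f"change for -50 nm: {inhibitory:+.3f}")
```

In this idealization the response to deflection toward the tallest row is much larger than the response to an equal deflection in the opposite direction, and the response saturates for large deflections in either direction.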

Fig. 128.8, Schematic showing the role of tip links in hair cell signal transduction.

Fig. 128.9, Generation of receptor potentials in both inner hair cell (IHC, upper panel) and outer hair cell (OHC, lower panel) in response to a sound stimulus of 84 dB sound-pressure level. The sound stimulus changes the receptor potential according to each hair cell's input-output curve (left). Note that the input-output curves for both types of hair cells are nonlinear.

Because potassium is the major cation in the endolymph, it is believed that potassium current plays an important role in triggering the signal transduction process in hair cells. Once inner hair cells are depolarized, voltage-gated calcium channels open. These channels are clustered in several “hot spots” along the basolateral surface of the inner hair cells, where synaptic contacts with primary afferent auditory nerve fibers are located. The calcium current mediated by these voltage-gated ion channels is important for triggering neurotransmitter release across the synapse, which leads to activation of the auditory nerve fibers. The neurotransmitter involved in this process has not been definitively identified but is believed to be a molecule closely related to glutamate.

Unlike the inner hair cell, an outer hair cell can change its length in response to voltage changes: it contracts with depolarization and elongates with hyperpolarization. The “molecular motor” associated with rapid changes in outer hair cell length is a voltage-dependent integral membrane protein called prestin. The change in outer hair cell length in response to voltage changes is believed to add energy to the motion of the basilar membrane through a mechanical feedback scheme. In other words, the outer hair cell acts as a cochlear “amplifier” that augments the signals transmitted into the inner ear by the stapes vibration. The importance of prestin in hearing is further supported by the finding that in animals in which prestin has been knocked out or altered, hearing sensitivity and frequency selectivity are impaired.

Because different regions of the basilar membrane are tonotopically tuned to specific frequencies, and because the hair cells reside on top of the basilar membrane, it is logical to assume that the hair cells from different regions are also tonotopically tuned to specific frequencies. Indeed, the frequency tuning curves for both outer and inner hair cells have been recorded in guinea pigs in response to various frequencies, and the hair cells in different regions along the basilar membrane are tonotopically tuned to specific frequencies that correspond to the tonotopic arrangement of the basilar membrane (Fig. 128.10). The frequency to which a hair cell is most sensitive is called its characteristic frequency. As will be discussed, this tonotopic arrangement is essential for the processing of auditory information, and it is preserved throughout the entire auditory pathway.

Fig. 128.10, Tuning curves of the basilar membrane (BM) and the inner hair cells (IHCs) and outer hair cells (OHCs) at a basal location in the guinea pig cochlea. The tuning curve plots the sound-pressure level (SPL) required to produce a fixed level of response at a given location along the cochlear partition. The required sound level is lowest when the sound stimulus is at its characteristic frequency. The tuning curves for the basilar membrane and the inner and outer hair cells at the same location on the cochlear partition are very similar (similar characteristic frequency).
