Key Points

  • Management of patients with voice disorders is most effectively performed by a multidisciplinary team that includes a laryngologist, a speech-language pathologist, and a vocal pedagogue. All patients with voice disorders should receive the same level of care regardless of their level of voice use.

  • Voice and speech are produced according to the source-filter theory of voice production. In this theory, the vocal folds provide a harmonic sound source that is filtered by the vocal tract. During filtering of the sound source, the vocal tract attenuates and amplifies certain frequencies of the harmonic sound source to create sounds characteristic of speech.

  • Regardless of vocal style, patients all use the same basic mechanism to produce voice. The intrinsic laryngeal musculature contracts to bring the vocal folds into a nearly closed position, setting the vocal folds into a position that allows vibration to occur, while the abdominal, thoracic, and/or pelvic musculature is used to control the rate and volume of air intake and egress to drive the vocal folds into vibration. The patient uses the extrinsic laryngeal muscles to position the larynx within the neck, and this positioning controls the length of the vocal tract. The patient uses the pharyngeal muscles, tongue, lips, and teeth to control the size and shape of the vocal tract. Together, the length and shape of the vocal tract determine which frequencies of the harmonic sound source are attenuated and which are resonated.

  • Vocal efficiency is achieved when patients are able to produce their desired vocal output with minimal input.

  • Laryngeal stroboscopy is currently the best method for evaluating patterns of vocal fold vibration. To be used most effectively, laryngeal stroboscopy should be performed at multiple vocal pitches and intensities to enable study of the vocal fold vibratory patterns under different requirements.

  • Voice analysis with acoustic and aerodynamic measures may be helpful to quantify or measure voice disorders; however, perceptual analysis of the voice by a trained observer remains the most sensitive test to objectify vocal changes.

  • Patient-reported outcome measures have also become standard quality-of-life indicators for tracking change in voice quality, as perceived by the patient.

  • Often, vocal professionals/performance voice users complain of a change in vocal effort or slight change in voice quality before other people and trained clinicians can identify a change. “Hoarseness,” a term often used to describe voice changes, is, therefore, inadequate. Clinicians must strive to describe the voice changes in terms of the specific alterations in effort of production and in vocal quality (roughness, breathiness, weakness, and strain) as well as in terms of which portion of the vocal range is most disturbed.

  • Physical examination of patients with voice disorders should include palpation of the anterior cervical structures in the region of the larynx and tongue base for tension and tenderness at rest and during phonation. Excess tension or tenderness indicates excessive extralaryngeal muscular effort for voice production. Findings of tension or tenderness may be secondary to habitual inefficient vocal mechanisms or to compensation for suboptimal laryngeal vibration caused by loss of mucosal pliability or loss of laryngeal closure.

  • Benign, nonneoplastic vocal fold lesions are most commonly a response to injury. The most common source of injury is the use of excessive, loud, and inefficient voice-production methods.

  • Even if distinct anatomic abnormalities are identified within the larynx, the patient with a voice complaint should be evaluated by a speech-language pathologist and/or a vocal pedagogue to assess the efficiency of voice production and ability to improve the voice through behavioral changes.

  • Voice rest means complete vocal silence. It is useful primarily as a diagnostic tool. By eliminating vocal behaviors, it allows the clinician to assess what portion of the acute vocal changes is caused by the patient's abusive vocalization patterns. The period of voice rest rarely needs to exceed 1 week. Voice rest may also be useful in patients with acute injuries or illness to allow quicker recovery, or in the postoperative setting.

  • Speech-language pathologists and vocal pedagogues help patients modify their voice use through vocal hygiene and direct voice therapy techniques.

  • Surgical intervention is appropriate if malignancy is suspected or if the patient has been compliant with speech therapy recommendations, but the voice has not improved.

Care of patients who use their voices professionally requires knowledge and skills not easily mastered within the field of otolaryngology alone. It is part of the discipline of performing arts medicine. The laryngologist enlists the expertise of speech-language pathologists and vocal pedagogues to retrain and rehabilitate the professional voice patient. A team approach is mandatory and has been strengthened over the past decade by the establishment of several multidisciplinary voice centers.

Professional voice patients are a diverse group. Limiting the definition to performance voice users such as singers and actors is too narrow. All people who depend on speaking or singing skills for employment—salesmen, receptionists, telephone operators, lawyers, clergy, teachers, politicians, public speakers, fitness instructors, and most physicians—should be considered professional voice users because all of them place diverse yet significant demands on their voices.

Singers and actors, performing vocal professionals, put the greatest demand on the vocal apparatus. The extraordinary amount of practice and performance stress they are under exceeds that of any other type of vocal professional. They are often highly trained and push their voices to their physical limits. No other patients are as sensitive to subtle changes in their vocal abilities. Singers and actors with voice disorders often challenge the most experienced laryngologist. The knowledge and expertise gained from managing such patients can, and should, be generalized to care for other professional and nonprofessional voice users with voice disorders. Although professional voice users (and in particular, performance vocalists) may demand expedited workup for voice issues, it is incumbent upon the laryngologist not to treat voice problems in nonprofessional voice users differently from those in professional voice users. Guidelines to the contrary are, in these authors’ opinions, misguided. While this chapter is aimed toward the otolaryngologist treating performance vocalists, its lessons can, and should, be applied to all patients with voice complaints.

Anatomic Considerations

Voice and voice use patterns can be affected by emotional status and general health. Therefore, in the evaluation of the patient with a voice disorder, the entire body and psyche should be considered. The body itself is the vocal instrument, and the larynx is its most sensitive part. Altered function in nearly any area of the vocal professional's body can result in vocal changes. The larynx, therefore, should not be evaluated as an isolated entity.

Sound generation of any type requires (1) a power source, (2) a vibrator, and (3) a resonator. The lungs are the power supply; the larynx is the vibratory source; and the supraglottal vocal tract (supraglottic larynx, pharynx, oral cavity, and, potentially, the nasal cavity) is the resonator, which shapes the sound into words and song. The sound of the voice is affected by changes in any of these three systems, which should be regarded as a unit during evaluation of the professional voice patient.

Laryngeal function depends on extrinsic and intrinsic laryngeal musculature. The extrinsic laryngeal muscles alter the position of the larynx, which, in turn, can affect the length of the vocal tract resonator. Classically trained singers use the extrinsic musculature to stabilize the larynx within the neck when singing. The intrinsic laryngeal muscles allow delicate control of adduction, abduction, and tension of the vocal folds.

Within the larynx, the human vocal folds are unique structures with no correlates in any other animal species. Hirano, who contributed greatly to the understanding of the laminar structure of the human vocal fold, described the cover-body theory of vocal fold vibration. The vocal fold is covered by a layer of stratified squamous epithelium. The subepithelial tissue, the lamina propria, is divided into superficial, intermediate, and deep layers. The superficial layer, often called the Reinke space, is composed of fibroblasts that produce proteins and glycoproteins to form an extracellular matrix of loose connective tissue; the intermediate layer is composed chiefly of elastin fibers; and the deep layer is composed primarily of collagen fibers. Collagen fibers from the deep layer blend into the underlying thyroarytenoid muscle, which forms the main bulk of the vocal fold (Figs. 58.1 and 58.2).

Fig. 58.1, Cross section of a true vocal fold stained for elastin (black) and collagen (yellow). The trilaminar arrangement of the lamina propria is shown. Nonkeratinized squamous epithelium forms the mucosal layer over the superficial portion of the lamina propria. The black arrow indicates the superficial portion; the red arrow indicates the intermediate layer, which is rich in elastin; and the blue arrow points to the deep layer, which is rich in collagen (Movat stain, ×40).

Fig. 58.2, Cross section of a true vocal fold under higher magnification.

According to the cover-body theory of vocal fold vibration, the cover is composed of the overlying epithelium combined with the superficial layer of the lamina propria. The intermediate and deep layers of the lamina propria, known as the vocal ligament, form a transition zone, and the body is composed primarily of the thyroarytenoid muscle. The contrasting masses and physical properties of the vocal fold cover and body cause them to move at different rates as air passes between the vocal folds. This movement, or vibration, creates sound at the level of the vocal folds by disturbing the local pressure equilibrium within the area of the glottis. The sound, a buzz-like tone, is modulated and radiated by the supraglottal vocal tract into audible speech or song.

Blood vessels enter the vocal fold anteriorly and posteriorly, and vessels run parallel to the longitudinal axis of the fold. This arrangement allows the cover to vibrate over the body without placing excessive stretch or shearing forces on the vessels. Electron microscopy has shown that several arteriovenous shunts are present in the vocal fold microcirculation. These shunts may allow autoregulation of blood flow to this area.

Gray and colleagues began to identify the contents of the basement membrane zone and the lamina propria. The basement membrane zone is a complex area that anchors the epithelium to the superficial layer of the lamina propria (Fig. 58.3). It is the site of tremendous shearing forces that occur during vocal fold vibration. Excessive shear forces can disrupt the basement membrane zone and lead to the development of infiltrates in this area, a process important in the formation of vocal fold lesions. In the superficial layer of the lamina propria, collagen type III and VII fibers intertwine. This arrangement anchors the basement membrane zone to the superficial layer of the lamina propria, yet allows passive stretch during vibration (Fig. 58.4).

Fig. 58.3, Arrangement of structures visible with the electron microscope from the basal cell to the fibers of the superficial layer of the lamina propria. AF, Anchoring fibers; AFL, anchoring filaments; AP, attachment plaques; DP, subbasal dense plate.

Fig. 58.4, The basement membrane and basal lamina. Anchoring fibers attach to the lamina densa of the basement membrane, and type III collagen fibers pass through the loops of the anchoring fibers.

Immunohistochemical analysis has also been used to study the basement membrane zone and extracellular matrix of the lamina propria. In diseased states, which correlate clinically with vocal fold nodules, the basement membrane zone is widened significantly. In lesions that are clinically labeled polyps, collagen type IV within the basement membrane zone appears less pronounced than in the healthy state. Perhaps this relative weakness predisposes patients to polyp formation under phonotraumatic stress.

Voice Production

Vocalization begins with air, or the power supply: the lungs supply the essential energy for sound production by presenting the larynx (oscillator) with a stream of air. The diaphragm; the intercostal, back, and abdominal musculature; and the elastic recoil of the chest wall work in concert during inspiration and expiration to control the release of air. Classically trained singers use the abdominal and thoracic musculature to regulate exhalation and tend to use a greater percentage of total lung capacity than nonclassically trained singers to produce sound in a more efficient manner. This enhanced efficiency of air propulsion to the larynx is a key difference between trained and untrained voice users.

As the diaphragm relaxes and the chest wall recoils to a resting state, air is pushed through the nearly closed vocal folds. The air passage at the glottal level is smaller than the air passage of the trachea and subglottis; hence, pressure in the region of the glottis drops as the velocity of the air column increases. The relative vacuum created by this drop in pressure draws the pliable rima glottal tissues of the membranous vocal fold region together, a phenomenon known as the Bernoulli effect. After the membranous vocal folds close at the glottal level, the air column from the lungs and trachea continues to flow into the subglottal region. The rising subglottal air pressure forces the vocal folds back open. The vocal folds (or rima glottal tissues) open from inferiorly to superiorly (inferior to superior lip) to form an alternating convergent and divergent glottal configuration. The aerodynamic forces of the air column and the inherent myoelastic properties of the vocal folds, particularly in the region of the vocal fold cover, are responsible for the repeated opening and closing of the rima glottal tissues that pulses the air column as it flows out of the glottis. These disruptions in the steady state of the tracheal air pressure by glottal activity result in sound production. The sound produced by the vibratory source has a buzz-like quality. In professional voice production, glottal sound production can be further complicated by voluntary muscular activity that can influence the intensity and frequency characteristics of the glottal sound before its presentation to the supraglottal vocal tract.
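The pressure drop at the narrowed glottis can be illustrated with the continuity relation (velocity = flow / area) and Bernoulli's relation for steady, incompressible, frictionless flow. The sketch below is a simplification: real glottal airflow is unsteady and viscous, and the airflow, cross-sectional areas, and air density used here are assumed round numbers chosen for illustration, not measurements from this chapter.

```python
AIR_DENSITY = 1.2  # kg/m^3, assumed approximate density of air


def velocity(flow_m3_per_s, area_m2):
    """Mean air velocity from the continuity relation: v = Q / A."""
    return flow_m3_per_s / area_m2


def bernoulli_pressure_drop(flow_m3_per_s, wide_area_m2, narrow_area_m2,
                            rho=AIR_DENSITY):
    """Pressure drop (Pa) between a wide and a narrow section of the airway.

    Idealized Bernoulli relation: p_wide - p_narrow = 0.5 * rho * (v_n^2 - v_w^2).
    """
    v_wide = velocity(flow_m3_per_s, wide_area_m2)
    v_narrow = velocity(flow_m3_per_s, narrow_area_m2)
    return 0.5 * rho * (v_narrow ** 2 - v_wide ** 2)


# Hypothetical values for illustration only.
flow = 2.0e-4          # m^3/s (0.2 L/s)
trachea_area = 2.0e-4  # m^2 (2 cm^2)
glottis_area = 1.0e-5  # m^2 (0.1 cm^2)

drop = bernoulli_pressure_drop(flow, trachea_area, glottis_area)
print(f"velocity in trachea: {velocity(flow, trachea_area):.1f} m/s")
print(f"velocity at glottis: {velocity(flow, glottis_area):.1f} m/s")
print(f"pressure drop at glottis: {drop:.1f} Pa")
```

With these assumed numbers the air moves twenty times faster through the glottis than through the trachea, and the pressure at the glottis falls accordingly, which is the relative vacuum that draws the pliable tissues together.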

The intensity of the sound source is related directly to subglottic pressure—that is, as subglottal pressure increases, sound intensity also increases. Humans can alter subglottal pressure and, therefore, sound intensity by two methods. The first and probably more efficient method is to modify the force of the expelled air from the trachea. This is accomplished through activation of the abdominal and thoracic musculature to increase the amount of air inspired and then, partially through elastic recoil properties of the thoracic cavity and partially through voluntary muscular activity, to control the rate of air egress. The varied regional schools of classical singing all emphasize different areas of muscular control to accomplish this phenomenon; however, the effect is the same in that the percentage of air used during singing is greater. The second method used to control subglottal pressure is to modify the force of vocal fold adduction. This method is somewhat less efficient. Increasing the force of laryngeal closure through activity of the thyroarytenoid, lateral cricoarytenoid, and interarytenoid muscles achieves greater resistance to the glottal opening, which, in turn, raises subglottal pressure and increases sound intensity; however, frequency of vocal fold vibration is directly related to tension within the vibratory system. Therefore, if sound intensity is controlled by the addition of tension in the vibrating system, the frequency of vibration can be inadvertently affected.

Well-trained vocal professionals can independently modulate the frequency characteristic of the source signal from vocal fold vibration through voluntary behaviors. They do so through adjustments in cricothyroid, thyroarytenoid, lateral cricoarytenoid, and interarytenoid muscle activity. When activated, the cricothyroid muscle elongates the vocal fold, thereby tensing the cover and elevating the frequency of vibration. In trained singers, this action alone is sufficient to achieve the first octave of pitch from the fundamental frequency. Further control of the amount of tension is accomplished by balancing these cricothyroid contraction forces against thyroarytenoid, lateral cricoarytenoid, and interarytenoid muscle forces to keep the vocal folds in an appropriate position for phonation. Unopposed cricothyroid muscle contraction leads to an increase in the glottal width, which negatively affects the vibratory cycle. In addition, fine control of this mechanism allows the blending of the registers of the singing voice for a smoother transition between what singers term the “chest” and “head” voice regions. Inappropriate or unbalanced changes lead to what is perceived as voice breaks. Although these breaks may be unappealing in a classically trained singer, they can be used for stylistic effects in commercial singing voice production. The yodel is probably the most commonly appreciated stylistic technique using the break in registers to produce a desired sound.

The sound source signal produced by vocal fold oscillation has a fundamental vibratory rate termed the fundamental frequency. Owing to the characteristics of the larynx as a natural vibrator, the sound that is produced has harmonic qualities; that is, as the vocal fold tissue vibrates to disturb the local air pressure, the pressure waves created are refracted. Refracted pressure waves that are out of phase with the fundamental frequency cancel each other out. Waves that are in phase with the fundamental frequency, on the other hand, are radiated along with it; these in-phase waves may be faster or slower than the fundamental frequency by a whole-number multiple, and they create the harmonic or subharmonic frequencies produced by the laryngeal sound source. Again, each harmonic is a whole-number multiple of the fundamental frequency. This harmonic sound source is presented to the supraglottal vocal tract. In turn, the supraglottal vocal tract, on the basis of its physical characteristics (length, shape, and size of the opening at the distal end), amplifies or attenuates particular regions in the source harmonic spectrum.
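The whole-number relationship between the fundamental frequency and its harmonics can be made concrete with a few lines of arithmetic. This is a minimal sketch; the function names are hypothetical, and the 220 Hz fundamental is an arbitrary example value rather than a figure from the text.

```python
def harmonic_series(f0_hz, count):
    """First `count` harmonics: whole-number multiples of the fundamental."""
    return [n * f0_hz for n in range(1, count + 1)]


def subharmonic_series(f0_hz, count):
    """First `count` subharmonics: the fundamental divided by 2, 3, ..."""
    return [f0_hz / n for n in range(2, count + 2)]


f0 = 220.0  # Hz, arbitrary example fundamental frequency
print(harmonic_series(f0, 5))     # fundamental plus the next four harmonics
print(subharmonic_series(f0, 2))  # first two subharmonics
```

The first entry of the harmonic list is the fundamental itself; every later entry is an exact integer multiple of it.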

The harmonic frequencies that are amplified are referred to as formant regions. They shape the output from the sound source into sounds appreciated as vocal communication. Through spectral analysis of the voiced signal, we can measure four or five formant regions significant in vocal sound production. The first two of these regions are primarily responsible for vowel determination, whereas the third, fourth, and fifth formant regions color the sound or provide timbre. Vocal professionals, particularly classically trained singers, are able to alter the characteristics of the vocal tract to modulate or shift these formant regions. When the third through fifth formant regions are brought closer together by voluntary changes in the characteristics of the vocal tract, they amplify one another to form a ring, termed the singer's formant. This formant region, in the range of 2300 to 3200 cycles per second, is detected by the human auditory system preferentially over other frequencies; this allows the singer to be heard and understood above the sound of an orchestra or other instruments. Appropriate use of these principles may give a professional voice user greater vocal efficiency, that is, greater radiated output with less physical effort. A trained vocal professional provides an aesthetically pleasing sound quality for the listener by modulating the formant regions of the sound produced by (1) altering the length of the vocal tract through actions of the abdominal, thoracic, and cervical musculature; (2) altering the shape of the vocal tract through the action of the pharynx, tongue, jaw, and lips; and (3) altering the size of the distal opening primarily through the actions of the jaw and lips. The purpose of all vocal training, either commercial or classical, is to teach the performer to control these vocal subsystems to produce the desired, and aesthetically pleasing, sound.
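Because the vocal tract can only amplify frequencies that the source actually contains, it can be useful to see which harmonics of a sung pitch fall inside the singer's formant range quoted above (2300 to 3200 cycles per second). The sketch below is illustrative only: the function name is hypothetical, the example pitches are arbitrary, and treating the band limits as inclusive is an assumption.

```python
import math

SINGERS_FORMANT_HZ = (2300.0, 3200.0)  # range quoted in the text


def harmonics_in_band(f0_hz, band=SINGERS_FORMANT_HZ):
    """Return (harmonic number, frequency) pairs that lie within the band."""
    low, high = band
    first = math.ceil(low / f0_hz)   # lowest harmonic number at or above `low`
    last = math.floor(high / f0_hz)  # highest harmonic number at or below `high`
    return [(n, n * f0_hz) for n in range(first, last + 1)]


# Arbitrary example pitches, one octave apart.
for f0 in (220.0, 440.0):
    print(f0, harmonics_in_band(f0))
```

For these two example pitches, the lower note places more harmonics inside the band than the note an octave above it, simply because its harmonics are spaced more closely.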

From this simplified discussion of voice science and the source-filter theory of voice production, the reader should understand that voice and speech production for complex human communication involves the interplay of several subsystems of the human body. These subsystems are the (1) respiratory tract, (2) abdominal cavity and abdominal wall musculature, (3) anterior cervical musculature, (4) paraspinous musculature, (5) pharyngeal musculature, and (6) laryngeal musculature. The pelvic cavity and pelvic musculature may also be recruited, depending on the patient's respiratory technique. Changes in any one of these subsystems can affect the vocal output, and coordination of these subsystems for complex vocalization is not an innate activity. As with any complex activity or sporting skill, such as a golf or tennis swing, natural ability is affected by a person's muscular set pattern, which, in turn, is determined in part by the individual's genetics and epigenetics.

Laryngeal Stroboscopy

First reported by Oertel in 1878, stroboscopic examination of the larynx has become a standard tool for the modern laryngologist. Stroboscopy is necessary to evaluate the vibratory patterns of the vocal folds that occur too rapidly to be visualized by the unaided human eye. Stroboscopy is not the modality of choice for visualizing pharyngeal motion, vocal fold abduction and adduction, or lesions of the pharynx and larynx. According to Talbot's law, the retina is able to resolve only five images per second; thus, images presented to the retina for less than 0.2 seconds (5 images/s) persist and are fused together by the visual cortex to produce apparent motion. Because the vocal folds vibrate at a rate of 75 to 1000 cycles per second, even the slowest vibratory patterns cannot be visualized without assistance. During stroboscopy, the larynx is visualized with a xenon light source because the characteristics of xenon light allow rapid on-and-off bursts. In this manner, the larynx is visualized for only brief periods in the range of 1/1000 second. These brief images, sampled from various points across many vibratory cycles, are then fused together to provide apparent slow motion of the laryngeal vibratory tissue. In modern stroboscopic equipment, the rate of laryngeal vibration is sensed by a microphone and is used to control the rate of xenon light firing. When the rate of visual sampling of the laryngeal image is out of phase with the rate of vibration, the laryngeal tissue appears to move. When the sampling rate is in phase with the vibratory rate, the laryngeal tissue appears to stand still.
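The in-phase and out-of-phase behavior described above can be sketched numerically. Each flash illuminates the vocal folds at some fraction of their vibratory cycle; if that fraction never changes the image appears frozen, and if it advances a little with every flash the folds appear to move slowly. This is a simplified model that assumes perfectly periodic vibration and instantaneous flashes; the function names and the example rates are hypothetical and do not describe any particular stroboscopy unit.

```python
def flash_phases(f0_hz, flash_hz, n_flashes):
    """Fraction of the vibratory cycle (0 to 1) illuminated by each flash."""
    return [(f0_hz * k / flash_hz) % 1.0 for k in range(n_flashes)]


def apparent_frequency(f0_hz, flash_hz):
    """Apparent cycles per second of the fused image (0 means a frozen image)."""
    remainder = f0_hz % flash_hz
    return min(remainder, flash_hz - remainder)


# Example: vocal folds vibrating at an assumed 200 cycles per second.
frozen = flash_phases(200.0, 200.0, 5)       # flash rate equals vibratory rate
slow_motion = flash_phases(200.0, 198.5, 5)  # flash rate slightly offset

print(frozen)
print([round(p, 4) for p in slow_motion])
print(apparent_frequency(200.0, 198.5), "apparent cycles per second")
```

In the first case every flash lands on the same point of the cycle, so the tissue appears to stand still. In the second case each flash lands slightly later in the cycle than the one before, and the fused images show the vibration at a small fraction of its true rate.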

Stroboscopy permits observation of the vibratory action of the vocal folds, which is not possible with continuous light examination (Fig. 58.5). As previously described, this vibratory action is responsible for sound production. Therefore, by using stroboscopy, the examiner can observe how small lesions alter the normal laryngeal vibratory pattern and glottal closure. The significance of a given lesion can then be determined.

Fig. 58.5, Vocal fold vibration.

In addition to providing information about vibratory status, examinations captured in video format can be reviewed for comparison with previous examinations and for consultation. This information improves accuracy in the diagnosis of vocal problems. Ideally, a baseline laryngeal stroboscopic examination should be performed in each professional voice patient while the health and voice are good. The findings can be compared with the vocal fold appearance during dysphonic states, and conclusions regarding the effects of vibration patterns on the cause of dysphonia can be made.

Recorded laryngeal stroboscopic examinations can be used to follow changes in the glottal vibratory pattern over days, weeks, and years. This process, known as interval examination, helps determine the effects of behavioral, medical, and surgical interventions on the larynx. Changes in laryngeal stroboscopy findings can be shown and documented on videotape and in computerized formats and still prints.

Interpretation of laryngeal stroboscopy requires knowledge of the stroboscopic appearance of the healthy larynx phonating at various frequencies and intensities. A regular format for evaluation also enables a more objective interpretation of this subjective test. Standardized checklists for laryngeal stroboscopy interpretation are available, and evaluation criteria include symmetry, amplitude, periodicity, mucosal wave propagation, and glottal closure (Table 58.1). These vibratory characteristics are evaluated at a comfortable loudness level and modal speech frequency. In patients with voice complaints, it is beneficial to perform laryngeal stroboscopy during high and low pitch and loud and soft phonation. This approach provides additional data and is required to fully examine the vibratory characteristics. If a patient is having difficulty at a particular point in the vocal range, stroboscopy and laryngoscopy must be performed while the patient phonates within the troubled range. With this approach, the clinician may observe subtle vibratory changes that may be the source of the patient's vocal difficulties.

TABLE 58.1
Interpretation of Laryngovideostroboscopy

Criteria | Result
Symmetry | Normal; Side to side; Teeter-totter; Vertical O not symmetric
Amplitude | Right equals left; Right is greater than left; Left is greater than right; Both decreased
Periodicity | Yes, consistent; Yes, inconsistent; No, inconsistent; No, consistent
Mucosal wave (right) | Right normal; Right great; Right abnormal pattern; Right decreased; Right adynamic (where)
Mucosal wave (left) | Left normal; Left great; Left abnormal pattern; Left decreased; Left adynamic (where)
Closure | Complete, long; Complete, short; Small posterior gap; Large posterior gap; Slit; Elliptic; Half elliptic; Hourglass; Asymmetric hourglass; Other
Recording quality (1 = Poor, 4 = Great) | Focus ____; Size ____; Brightness ____; Color ____
Notable feature | ____
Videotape number | ____
Verbal diagnosis | ____
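A checklist such as Table 58.1 lends itself to a simple structured record, which makes interval examinations easier to compare. The sketch below is one possible representation, not a standard format: the dictionary keys and the `validate_exam` helper are hypothetical names, the ratings are transcribed from the table, and the mucosal wave and recording-quality fields are omitted for brevity.

```python
# Allowed ratings transcribed from Table 58.1 (subset of criteria).
CHECKLIST = {
    "symmetry": {"Normal", "Side to side", "Teeter-totter",
                 "Vertical O not symmetric"},
    "amplitude": {"Right equals left", "Right is greater than left",
                  "Left is greater than right", "Both decreased"},
    "periodicity": {"Yes, consistent", "Yes, inconsistent",
                    "No, inconsistent", "No, consistent"},
    "closure": {"Complete, long", "Complete, short", "Small posterior gap",
                "Large posterior gap", "Slit", "Elliptic", "Half elliptic",
                "Hourglass", "Asymmetric hourglass", "Other"},
}


def validate_exam(exam):
    """Return a list of problems found in one recorded examination."""
    problems = []
    for criterion, allowed in CHECKLIST.items():
        value = exam.get(criterion)
        if value is None:
            problems.append(f"{criterion}: missing")
        elif value not in allowed:
            problems.append(f"{criterion}: unrecognized rating {value!r}")
    return problems


# Hypothetical example record.
example_exam = {
    "symmetry": "Normal",
    "amplitude": "Both decreased",
    "periodicity": "Yes, consistent",
    "closure": "Small posterior gap",
}
print(validate_exam(example_exam))
```

Restricting each criterion to a fixed set of ratings keeps recordings made on different dates directly comparable.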

Symmetry refers to the paired appearance of the vocal folds, which are mirror images of each other during glottal vibration. Any difference in the mechanical properties of the vocal folds—mass, tension, pliability of the superficial layer of the lamina propria or mucosa, elasticity, position, or inflammation—can alter symmetry. Asymmetry of vibration can result in dysphonia.

Amplitude of vibration refers to the lateral excursion of the midmembranous portion of the vocal fold during vibration. This movement is normally one-third to one-half of the width of the visible fold. As with symmetry, lesions that affect the mass, tension, or pliability of the vocal fold alter amplitude. Vocal pitch and vocal intensity also alter vibratory amplitude. Vocal folds vibrating at a high pitch are stiffer and thinner, so their vibratory amplitude is reduced both in absolute terms and relative to the visible size of the vocal fold. On the other hand, when volume is increased, particularly by an increase in the force of expelled air, the vibratory amplitude increases as well. This phenomenon occurs at all pitches of phonation.

Periodicity, or the regularity of successive glottal cycles, is ascertained by synchronizing the stroboscopic flash with the frequency of vocal fold vibration. The vocal folds are visualized at approximately the same point in each cycle. This maneuver “freezes” the image or makes the vocal folds appear to be standing still. Any perceived motion of the folds indicates aperiodicity, and any alteration in the balance of the vocal folds and the lungs can result in aperiodic vibrations. During a single phonation, vibratory cycles can range from periodic to aperiodic. Therefore, it may be helpful to determine whether the vibratory pattern is completely periodic, mostly periodic, mostly aperiodic, or completely aperiodic.
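The four descriptive categories can be expressed as a simple rule over a series of observations. This is a hedged sketch rather than a validated scoring method: the function name is hypothetical, each observation is assumed to be the examiner's yes/no judgment of whether the synchronized image appeared frozen, and placing the boundary between "mostly periodic" and "mostly aperiodic" at one half is an assumption.

```python
def describe_periodicity(observations):
    """Map a list of True (periodic) / False (aperiodic) judgments to a label."""
    if not observations:
        raise ValueError("at least one observation is required")
    fraction_periodic = sum(observations) / len(observations)
    if fraction_periodic == 1.0:
        return "completely periodic"
    if fraction_periodic == 0.0:
        return "completely aperiodic"
    # Assumed cutoff: half or more periodic observations counts as "mostly periodic".
    return "mostly periodic" if fraction_periodic >= 0.5 else "mostly aperiodic"


# Hypothetical judgments from one sustained phonation.
print(describe_periodicity([True, True, True, False]))
print(describe_periodicity([False, False, True, False]))
```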

Mucosal wave propagation has both vertical and horizontal components. The vertical component, known as the vertical phase, is visualized during stroboscopic examination along the medial surface of the vocal fold. During vocal fold vibration, two distinct ridges appear to form due to characteristics of the mucosa and the vocalis muscle. These ridges are referred to as the upper and lower masses or lips of the vocal fold. The position of the upper lip is determined by the reflection of the vocal fold mucosa as it turns from horizontal to vertical over the superior surface of the vocal fold; the position of the upper lip is relatively fixed by the physical characteristics of the vocal fold. The position of the lower lip is determined by the changes in the physical properties of the mucosa that covers the vocal folds relative to those of the mucosa that covers the respiratory tract. Mucosa is technically defined as epithelium and submucosal tissue. Respiratory mucosa consists of a columnar epithelial layer supported by a relatively thin submucosal layer with occasional mucus-producing cells and minor salivary glands; vocal fold mucosa, however, consists of a stratified nonkeratinized epithelium supported by a relatively thickened and specialized submucosa (see Anatomic Considerations). The transition between the two types of epithelia is known as the inferior arcuate line or the conus elasticus.

As the air pushes through the glottis, the specialized submucosa of the vocal fold separates from the underlying structure. This is the point of mucosal upheaval. As the vocal fold mucosa is tensed through the action of the cricothyroid muscle, the submucosal tissue is thinned three-dimensionally in the same manner that a rubber band thins as it is stretched. This thinning causes the lower lip of the vocal fold to move in a cephalad direction relative to the upper lip. In this manner, the tension in the vocal fold is increased and the mass of vocal fold tissue available for vibration is reduced to allow pitch elevation. Because the lower lip moves closer to the upper lip, the vertical phase and the time difference between closure of the lower lip and upper lip regions, known as the vertical phase difference, are also reduced.

In short, with tensing of the vocal fold for elevation of pitch, the vocal fold cover thins in three dimensions, and the time difference between closings of the lower and upper lips of the vocal fold—the vertical phase difference—is reduced. This action, which can be witnessed under stroboscopic light examination, is a critical feature in professional voice patients. Often a small lesion or stiffness along the medial surface of the vocal fold will become noticeable only as the vocal fold is stiffened by elevation of pitch, which limits the vibratory motions of the vocal fold to the superficial region of the cover. This is one of the first areas injured by prolonged or excessive phonation. It is visualized as a reduction in the distinctness of the upper and lower mass formation from one vocal fold to the other.

The horizontal phase of vocal fold vibration has been described stroboscopically as a “ripple of light across the superior surface” of the vocal fold. It is a reflection of light either from the upper lip of the vocal fold as it travels in a medial to lateral direction or from motion of the mucosa created by a shockwave as the two upper lips meet during closure; this wave is similar to the wave that moves across the surface of a pond after disturbance of the water by a pebble. Lesions that stiffen the mucosa and reduce its pliability lead to loss of this light reflex. This is an important characteristic when visualized under stroboscopic light examination, particularly when the vocal folds are compared with each other at various pitches of phonation. Lesions that fill the superficial layer of the lamina propria and abut or infiltrate the vocal ligament tend to restrict or eliminate both components of the mucosal wave. On the other hand, small- to moderate-sized lesions limited to the superficial portion of the superficial layer of the lamina propria usually allow propagation of the wave, although it may be decreased and asymmetric. Finally, large and exophytic lesions may disrupt the mucosal vibratory characteristic, even if they do not infiltrate deeply into the lamina propria, by altering the glottal shape and impairing glottal closure.

Closure of the membranous glottis is vital to laryngeal efficiency. Men usually have complete glottal closure, whereas up to 70% of women normally show a small posterior glottal gap. This glottal gap, however, is considered normal only when it extends from the vocal process of the arytenoid posteriorly. This region from the vocal process to the posterior commissure, referred to as the cartilaginous glottis, is not typically important in phonation unless the closure deficiency is large enough to create alterations in closure of the membranous portions of the rima glottic tissue. Berry and colleagues determined that the most efficient glottal output occurs when the vocal folds are approximately 1 mm apart at the region of the vocal process. Glottic closure patterns can be described as complete, long or short, small or large posterior gap, slit, elliptic, hourglass, or asymmetric hourglass. Closure can be altered by a mass lesion, scarring, muscular tension, and neurologic abnormalities, which become clinically significant when they involve deficiencies of closure at the membranous vocal fold level.

High-speed digital video is currently a research instrument but it holds great promise for the evaluation of the vibrational function of the vocal folds, especially in patients with severe aperiodic dysphonia in whom stroboscopy is of limited value.

Voice Analysis

Several methods can be used to quantify voice or to measure vocal vibration. No single test is considered the gold standard for documenting vocal fold function, and all tests have significant limitations. In addition, intrapatient and interpatient variability exists. Therefore, in professional voice patients, perceptual analysis by a trained observer and patient satisfaction with vocal outcome are often the most useful indicators of a successful intervention. Most laryngologists consider objective and semiobjective voice analysis to be important, particularly regarding preoperative and postoperative voice documentation. Little agreement exists as to the optimal tests and their performance, relative importance, or interpretation.

Acoustic Measures

Acoustic analysis has been used to objectively document voice and compare preoperative and postoperative surgical results. Acoustic measures include fundamental frequency, perturbation or cycle-to-cycle variation in frequency and amplitude, and maximal phonation range. Comparison of interval examinations requires a high-quality microphone and recording system with strict, standardized recording techniques and patient tasks. Although several computer-integrated acoustic analysis systems are available, they are of limited benefit for the average patient. The reliability of acoustic measures, secondary to variation in patient effort, is limited. In addition, the validity of acoustic measures designed to evaluate periodic vibration is questionable in dysphonic voices because dysphonia results from aperiodic vibration.
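The perturbation measures named above can be illustrated with a minimal sketch. It assumes that cycle-by-cycle periods and peak amplitudes have already been extracted from a recording; the function names and the sample values are hypothetical, and the formula shown is the common "local" definition (mean absolute difference between consecutive cycles divided by the mean value), not the algorithm of any particular commercial analysis system.

```python
from statistics import mean


def mean_fundamental_frequency_hz(periods_ms):
    """Mean fundamental frequency from glottal cycle periods given in milliseconds."""
    if not periods_ms:
        raise ValueError("at least one cycle is required")
    return 1000.0 / mean(periods_ms)


def local_perturbation_percent(values):
    """Cycle-to-cycle variation: mean absolute difference between
    consecutive cycles, divided by the mean value, as a percentage.
    Applied to periods this is local jitter; applied to peak
    amplitudes it is local shimmer."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical values for a short sustained vowel (illustration only).
periods_ms = [5.00, 5.04, 4.97, 5.02, 4.99]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]

f0_hz = mean_fundamental_frequency_hz(periods_ms)
jitter_pct = local_perturbation_percent(periods_ms)
shimmer_pct = local_perturbation_percent(amplitudes)
```

Because both measures presuppose identifiable cycles, the sketch also makes the validity concern concrete: in a severely aperiodic voice the list of periods itself cannot be extracted reliably.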

Spectrometry

Spectrometry provides a visual display of vocal harmonics and noise. In spectral analysis of sound, time is plotted on the horizontal axis against frequency on the vertical axis, and intensity is shown by the shading of the display. This display shows the impact of resonance (formant structure) and articulation on the laryngeal buzz. Spectral analysis can evaluate and compare resonance changes and may be useful in documenting vocal alterations after surgical procedures on the pharynx. Some laryngologists have found it to be valuable in singers and other professional voice patients.
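As an illustration of how such a display is built, the following sketch computes a short-time Fourier spectrogram with NumPy. It is a simplified example under stated assumptions: the frame length, hop size, and Hann window are arbitrary choices, and the test signal is a synthetic harmonic buzz at 200 Hz rather than a recorded voice.

```python
import numpy as np


def spectrogram(signal, sample_rate, frame_len=1024, hop=256):
    """Short-time Fourier levels in dB.
    Returns frame times (s), bin frequencies (Hz), and a 2-D array
    whose rows are frequency bins and whose columns are time frames."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one analysis frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    level_db = 20.0 * np.log10(magnitude + 1e-12)  # small offset avoids log(0)
    times_s = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    freqs_hz = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return times_s, freqs_hz, level_db.T


# Synthetic harmonic source: 200 Hz fundamental plus four weaker harmonics.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
buzz = sum(np.sin(2 * np.pi * 200.0 * k * t) / k for k in range(1, 6))

times_s, freqs_hz, level_db = spectrogram(buzz, sample_rate)
# Frequency bin with the highest average level across all frames.
strongest_hz = freqs_hz[np.argmax(level_db.mean(axis=1))]
```

Plotting `level_db` with `times_s` on the horizontal axis and `freqs_hz` on the vertical axis gives the familiar display, in which harmonics appear as horizontal bands and noise as diffuse shading between them.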

Electroglottography

Electroglottography (EGG) measures the efficiency of glottal closure by graphically recording the contact time of the vocal folds. It shows the opening and closing rates of the vocal folds, which are not well visualized by stroboscopy. EGG is performed by the passage of a low-voltage, high-frequency current between two electrodes placed on either side of the patient's neck. It measures the electrical impedance, which varies with opening and closing of the glottis; some clinicians consider this measure objective and reproducible. EGG may provide clinically useful information when combined with laryngeal stroboscopy or other measurements of laryngeal function.
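One way the contact time is often summarized is as a contact quotient, the fraction of a vibratory cycle during which the folds are judged to be in contact. The sketch below uses a simple threshold criterion. It assumes that one cycle of the EGG waveform has already been isolated and that larger sample values mean greater contact; the 50% criterion level and the sample values are assumptions for illustration, and other criterion levels are also in use.

```python
def contact_quotient(cycle_samples, criterion=0.5):
    """Fraction of one EGG cycle spent above a threshold placed at
    `criterion` of the peak-to-peak amplitude.
    Assumes larger values indicate greater vocal fold contact."""
    if not cycle_samples:
        raise ValueError("cycle is empty")
    low, high = min(cycle_samples), max(cycle_samples)
    if high == low:
        raise ValueError("flat signal: no contact pattern to measure")
    threshold = low + criterion * (high - low)
    in_contact = sum(1 for s in cycle_samples if s > threshold)
    return in_contact / len(cycle_samples)


# Hypothetical single cycle, 10 samples (illustration only).
cycle = [0.0, 0.1, 0.9, 1.0, 0.8, 0.2, 0.0, 0.0, 0.0, 0.0]
cq = contact_quotient(cycle)
```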

Aerodynamic Measures

Aerodynamic studies are based on the fluid mechanics of airflow and involve the measure of airflow, volume, and pressure. Some are derived by analogy with Ohm's law, in which laryngeal resistance is equal to subglottal pressure divided by airflow.
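The Ohm's law relationship can be written as a one-line calculation. The sketch below is illustrative only: the units (cm H2O for pressure, liters per second for flow) are a common convention assumed here, and the numeric values are hypothetical rather than normative.

```python
def laryngeal_resistance(subglottal_pressure_cmh2o, airflow_l_per_s):
    """Laryngeal resistance = subglottal pressure / airflow,
    returned in cm H2O per (L/s)."""
    if airflow_l_per_s <= 0:
        raise ValueError("airflow must be positive")
    return subglottal_pressure_cmh2o / airflow_l_per_s


# Hypothetical measurements in one patient before and after an intervention.
resistance_before = laryngeal_resistance(8.0, 0.40)
resistance_after = laryngeal_resistance(8.0, 0.20)
```

At the same driving pressure, halving the airflow doubles the computed resistance, which is the kind of within-patient change that these measures are best suited to track.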

Normative aerodynamic data are very broad, which makes comparisons among patients almost useless; however, changes in measures after intervention in individual patients are quite useful, particularly in the evaluation of changes in laryngeal closure.

Standard pulmonary function testing may be used to objectively evaluate the lungs. Mild obstructive or restrictive pulmonary disorders may be the basis of a patient's vocal fatigue or dysphonia. Bronchodilator trials and methacholine challenge may rule out cough-variant asthma and other types of reactive airway disease.

Subglottal pressure is usually measured indirectly rather than by tracheal puncture or esophageal balloon. It is measured via oral pressure when the glottis is open. The oral pressure achieves equilibration with the subglottal pressure across the open vocal folds for voiceless stop consonants, such as /p/ and /t/.
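A minimal sketch of the indirect estimate follows. It assumes a task in which the patient repeats a syllable that begins with a voiceless stop, so that each vowel is flanked by two oral pressure peaks, and it takes the mean of the two flanking peaks as the estimate for the vowel between them. The task, the averaging rule, and the sample values are assumptions made for illustration.

```python
def estimate_subglottal_pressure(oral_peaks_cmh2o):
    """One estimate per vowel: the mean of the oral pressure peaks
    recorded during the voiceless stops on either side of it."""
    if len(oral_peaks_cmh2o) < 2:
        raise ValueError("at least two stop-consonant peaks are required")
    return [
        (before + after) / 2.0
        for before, after in zip(oral_peaks_cmh2o, oral_peaks_cmh2o[1:])
    ]


# Hypothetical oral pressure peaks (cm H2O) from three consecutive stops.
estimates = estimate_subglottal_pressure([6.0, 7.0, 6.5])
```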

Maximum phonation time is the average duration of phonation sustained on one breath while voicing the vowel /a/ at a comfortable pitch and loudness after deep inspiration. It is highly variable but can reasonably estimate laryngeal competence and glottal closure.

Mean airflow rate (airflow volume divided by phonation time) of a sustained vowel /a/ is occasionally tested. In general, low flow rates suggest laryngeal hyperfunction, obstruction, or primary pulmonary disorders. Higher values imply abnormalities in glottal competence that allow air loss.
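The two calculations described above, maximum phonation time and mean airflow rate, are simple enough to sketch directly. The function names, units (seconds and milliliters), and trial values below are hypothetical and are not normative data.

```python
from statistics import mean


def maximum_phonation_time_s(trial_durations_s):
    """Average sustained /a/ duration across repeated trials, in seconds."""
    if not trial_durations_s:
        raise ValueError("at least one trial is required")
    return mean(trial_durations_s)


def mean_airflow_rate_ml_per_s(airflow_volume_ml, phonation_time_s):
    """Mean airflow rate = airflow volume / phonation time."""
    if phonation_time_s <= 0:
        raise ValueError("phonation time must be positive")
    return airflow_volume_ml / phonation_time_s


# Hypothetical values for one patient (illustration only).
mpt_s = maximum_phonation_time_s([18.0, 20.0, 22.0])
flow_ml_per_s = mean_airflow_rate_ml_per_s(3000.0, mpt_s)
```

For a fixed volume of air, a shorter phonation time yields a higher mean flow rate, which is why the two measures tend to move together when glottal closure is incomplete.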

Perceptual Analysis

For evaluation of the professional voice, the “trained” ear remains the most discerning instrument. Perceptual improvement or degradation of the professional voice, as judged by the performer, the manager, other performers, the laryngologist, the speech-language pathologist, and the vocal pedagogue, is critical. To make perceptual vocal analysis more objective, vocal characteristics can be evaluated independently in a systematic manner. In addition, judges are trained in vocal evaluation to decrease subjective bias; however, agreement on the terminology of vocal characteristics is not universal. Hirano proposed the so-called GRBAS scale (for grade, rough, breathy, asthenic, and strained), and this scale is widely used; however, Sundberg and Kreiman and associates, who performed research in this area, concluded that the clinical application of perceptual analysis was difficult, and it remains so even today.

Voice Outcomes

Because of daily variation in vocal measures, vocal quality is frequently best judged by measures of patient satisfaction or direct comparisons of voice at different times. Patient satisfaction can be assessed through direct questioning or with specially designed questionnaires for rating perceived vocal problems. Voice can be directly compared with the use of taped samples. When dates and other identifying factors are removed from the samples, a panel of judges can evaluate qualitative changes over time in a blinded manner. This approach is useful to determine the effect of a specific therapeutic intervention.

Outcomes research for judging patient satisfaction with voice and treatment is a valuable tool for studying vocal disorders. A Voice Handicap Index has been developed to quantify a patient's perception of his or her voice and its change in response to therapy. In addition, Cohen and associates have validated a Voice Handicap Index specifically designed to measure self-perceived vocal handicaps among patients who sing. Both of these self-rating questionnaires are useful for measuring patients’ self-perceptions about voice and the changes in voice after medical, behavioral, or surgical intervention. These surveys are inherently and deliberately subjective. It should not be surprising that patients with higher occupational voice demands tend to report higher levels of voice impairment independent of acoustic and perceptual voice data.
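Scoring a self-rating questionnaire of this kind reduces to summing item responses and comparing totals over time. The sketch below assumes the format of the original 30-item Voice Handicap Index, in which each item is rated from 0 to 4 for a total of 0 to 120; other versions, including the singing version, use different item counts, so the item count is a parameter. The responses shown are hypothetical.

```python
def questionnaire_total(responses, n_items=30, max_item_score=4):
    """Sum of item ratings after checking item count and rating range.
    Defaults assume a 30-item form rated 0 to 4 (total 0 to 120)."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r < 0 or r > max_item_score for r in responses):
        raise ValueError("item rating out of range")
    return sum(responses)


def change_in_total(pre_responses, post_responses, **kwargs):
    """Post-treatment total minus pre-treatment total.
    A negative value means less self-perceived handicap."""
    return questionnaire_total(post_responses, **kwargs) - questionnaire_total(
        pre_responses, **kwargs
    )


# Hypothetical responses before and after therapy (illustration only).
pre = [3] * 10 + [2] * 20
post = [1] * 30
delta = change_in_total(pre, post)
```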
