Amino acids, peptides, and proteins


Abstract

Background

Amino acids are not only the building blocks of proteins, but they also play diverse roles in the provision of energy and the formation of a number of other important biomolecules, including hormones, neurotransmitters, and signaling molecules. The polymers of amino acids, peptides and proteins, orchestrate and control the vast array of human physiologic and biochemical processes. The catalog of amino acids, peptides, and proteins in various biological fluids is a target-rich environment for the detection of pathologic states.

Content

This chapter first describes the chemistry, metabolism, transport, and analysis of amino acids. Polymers of amino acids may be relatively short (peptides) or long (proteins). The human genome contains the information to dictate formation of approximately 20,000 polypeptides, but the actual diversity of the human proteome and peptidome is many times greater. Proteome diversity arises from linear amino acid sequence and an array of modifications that include acylation, phosphorylation, glycosylation, and isoprenylation. Systems of short peptides, larger protein monomers, and multimeric protein complexes are the tools that orchestrate and control human physiologic and biochemical processes. Proper synthesis, folding, subcellular targeting, and catabolism of proteins and peptides are therefore essential for human health. Analysis of biologic fluids including blood, urine, and cerebrospinal fluid using chemical, immunologic, and mass spectrometric methods enables informed diagnosis and therapy in a multitude of disease states.

Introduction

Amino acids, peptides, and proteins are crucial for virtually all biologic processes. Amino acids serve as structural subunits of peptides and proteins but also play diverse roles in metabolism, neurotransmission, and intercellular signaling. Peptides serve as autocrine and endocrine signaling molecules that control appetite, vascular tone, and electrolyte homeostasis, as well as carbohydrate and mineral metabolism. Proteins, longer peptide chains with molecular mass typically greater than approximately 6000 Daltons (Da), serve as (1) intracellular and extracellular structural components, (2) biologic catalysts, (3) mediators of contractility and motility, (4) agents of molecular assembly, (5) ion channels and pumps, (6) molecular transporters, (7) mediators of immunity, and (8) components of intracellular and intercellular signaling networks.

The human genome contains more than 20,000 open reading frames that encode proteins. The actual number of proteins is far greater, however, because of alternative splicing of messenger RNA (mRNA), somatic recombination, mutation, proteolytic processing, and post-translational modification. The proteome represents the complete set of proteins in an organism or compartment of an organism such as the plasma space. Efforts to catalog the proteome include those by the Human Proteome Organization ( hupo.org ), the National Center for Biotechnology Information ( ncbi.nlm.nih.gov ), the Swiss Institute of Bioinformatics ( expasy.org ), and the Healthy Human Individual’s Integrated Plasma Proteome Database ( bio.informatics.iupui.edu/HIP2 ). Most databases were designed mainly to assist with peptide and protein identification, but efforts have shifted to characterizing the abundance of specific protein components in healthy and diseased populations, the usual basis for diagnostic applications.

This chapter begins with a discussion of the chemistry, metabolism, and analysis of amino acids. Inherited disorders of amino acid metabolism are discussed in Chapter 60 . A description of the chemistry and biochemistry of the peptide bond is then followed by a description of several clinically relevant peptide systems and methods for in vitro assessment. The protein narrative begins with an account of protein structure and cellular compartmentalization followed by discussion of co- and post-translational modifications. Constituents of the proteome in body fluids are also addressed, followed lastly by a description of methods for specific and global assessment of the proteome for clinical purposes. More in-depth treatment of other specific proteins and protein networks may be found in Chapters 32 (serum enzymes), 78 (enzymes of the red blood cell [RBC]), 33 (tumor markers), 36 (lipoproteins), and 77 (hemoglobin), as well as other chapters dedicated to the specific pathophysiology of cardiac, liver, renal, bone, pituitary, thyroid, and adrenal disease. In-depth treatment of measurement modalities for amino acids, peptides, and proteins such as electrophoresis, chromatography, mass spectrometry (MS), and immunoassay may be found in Chapters 18 , 19 , 20 , and 26 , respectively.

Amino acids

Amino acids were likely among the first organic molecules to emerge from the mix of methane, hydrogen, ammonia, and water in earth’s primordial atmosphere. Only 20 of the hundreds of known amino acids account for the vast majority of residues in human polypeptide chains. Their structure and molecular properties are summarized in Table 31.1 . These 20 along with dozens of non–protein-forming amino acids are critical to the form and function of the human body. Disrupted amino acid metabolism is not surprisingly associated with a multitude of pathologic processes.

TABLE 31.1
Structure and Chemical Properties of the 20 Proteogenic Amino Acids

| Amino Acid | MW (Da) | pK1 | pK2 | pK3 | pI | HI |
| --- | --- | --- | --- | --- | --- | --- |
| Alanine (ALA, A) | 89.09 | 2.4 | 9.7 | | 6.0 | 1.8 |
| Arginine (ARG, R) | 174.20 | 2.2 | 9.0 | 12.5 | 10.8 | −4.5 |
| Asparagine (ASN, N) | 132.12 | 2.0 | 8.8 | | 5.4 | −3.5 |
| Aspartate (ASP, D) | 133.10 | 2.1 | 9.8 | 3.9 | 2.9 | −3.5 |
| Cysteine (CYS, C) | 121.16 | 1.7 | 10.8 | 8.3 | 5.1 | 2.5 |
| Glycine (GLY, G) | 75.07 | 2.3 | 9.6 | | 6.0 | −0.4 |
| Glutamate (GLU, E) | 147.13 | 2.2 | 9.7 | 4.3 | 3.2 | −3.5 |
| Glutamine (GLN, Q) | 146.15 | 2.2 | 9.1 | | 5.7 | −3.5 |
| Histidine (HIS, H) | 155.16 | 1.8 | 9.2 | 6.0 | 7.6 | −3.2 |
| Isoleucine (ILE, I) | 131.17 | 2.4 | 9.7 | | 6.0 | 4.5 |
| Leucine (LEU, L) | 131.17 | 2.4 | 9.6 | | 6.0 | 3.8 |
| Lysine (LYS, K) | 146.19 | 2.2 | 9.0 | 10.5 | 9.7 | −3.9 |
| Methionine (MET, M) | 149.21 | 2.3 | 9.2 | | 5.8 | 1.9 |
| Phenylalanine (PHE, F) | 165.19 | 1.8 | 9.1 | | 5.5 | 2.8 |
| Proline (PRO, P) | 115.13 | 2.1 | 10.6 | | 6.1 | 1.6 |
| Serine (SER, S) | 105.09 | 2.2 | 9.2 | | 5.7 | −0.8 |
| Threonine (THR, T) | 119.12 | 2.6 | 10.4 | | 6.5 | −0.7 |
| Tryptophan (TRP, W) | 204.23 | 2.5 | 9.4 | | 5.9 | −0.9 |
| Tyrosine (TYR, Y) | 181.19 | 2.2 | 9.2 | 10.5 | 5.7 | −1.3 |
| Valine (VAL, V) | 117.17 | 2.3 | 9.6 | | 6.0 | 4.2 |

HI, Hydropathy index; MW, molecular weight; pK, acid ionization constant; pI, isoelectric point.

Basic biochemistry

Amino acids are organic compounds containing both an amino group (–NH2) and a carboxyl group (–COOH) or another acidic group such as a sulfonate group (–SO3−). In a majority of biologically relevant amino acids, the amine moiety is primary (–NH2), but in some (e.g., sarcosine and proline) it is secondary (–NH–). Proline, whose amine nitrogen is part of a ring, is traditionally referred to as an imino acid. The amino acids that occur in protein are α-amino acids, in which the amino group, the carboxyl group, a hydrogen atom, and a side chain (R) are all attached to the α carbon: H2N–CH(R)–COOH.

The R group represents the unique side chains responsible for the chemical properties of individual amino acids. Not all biologic amino acids are α amino acids. β amino acids such as β-alanine and taurine, as well as γ-amino acids such as γ-aminobutyric acid (GABA) also play key biochemical roles ( Fig. 31.1 ).

FIGURE 31.1, Planar structures of rare or unusual, naturally occurring amino acids.

With the exception of glycine, all α amino acids contain four distinct moieties asymmetrically arranged around the α carbon. As a consequence, amino acids may exist as mirror images (enantiomers) referred to as the D or L configuration. With few exceptions, the biologically relevant amino acids exist in the L configuration. Small quantities of D amino acids occur in physiologic fluids but typically do not have specific functions. An exception is D serine, which represents 5 to 20% of total serine in cerebrospinal fluid (CSF) and may serve as a neurotransmitter. Amino acids with the D configuration occur in some bacterial products, foods, and pharmaceuticals. D amino acid oxidases in liver and kidney convert D amino acids to ketoacids, which can be further metabolized. L amino acids in proteins undergo slow racemization to a DL mixture over many years. Aspartic acid undergoes the most rapid racemization, and this rate can be used to estimate the time of synthesis of proteins with very slow turnover, such as ocular lens proteins or intervertebral collagen, in which half-lives may exceed 50 years. Two amino acids, threonine and isoleucine, have a second asymmetric carbon, and their stereoisomers are referred to as allothreonine and alloisoleucine. The latter compound has utility in the diagnosis of maple syrup urine disease (Chapter 60 and Online Mendelian Inheritance in Man; https://www.omim.org/entry/248600 ).

In addition to the 20 well-known protein-forming amino acids, a number of unusual amino acids are recovered from protein hydrolysates. For example, 4-hydroxyproline and 5-hydroxylysine are found in collagen lysates, and desmosine and isodesmosine are recovered in elastin hydrolysates. Citrulline may also be recovered secondary to deimination of arginine. These amino acids are formed by post-translational mechanisms because no codon is responsible for their incorporation into growing polypeptides. Selenocysteine is a special case of an amino acid synthesized on a specific transfer RNA and incorporated into a few sites in only about 25 proteins that include members of the thyroid hormone deiodinase and glutathione peroxidase families. Some of these unusual amino acids are shown in Fig. 31.1 .

Acid-base properties of amino acids depend on the amino and carboxyl groups attached to the α carbon and on the basic or acidic groups occurring on some side chains (R). At a physiologic pH near 7.4, the α-carboxyl group is ionized and carries a negative charge, and the α-amino group is protonated and carries a positive charge. Molecules existing simultaneously as cations and anions are referred to as zwitterions or ampholytes; for an α-amino acid, the zwitterion may be written +H3N–CH(R)–COO−.

The pH at which ionizable groups exist equally as charged and uncharged forms is referred to as the pK. Amino acids thus have two or more pKs—one for the carboxyl, one for the amino group, and an additional one in the presence of an ionizable side chain. The isoelectric point (pI) is the pH at which an amino acid or other molecule has a net charge of 0. For a typical neutral amino acid such as glycine, the pI of 5.97 is midway between the pK of 2.34 for the carboxylic acid and the pK of 9.60 for the amino group. The pKs of amino acid side chains in proteins vary somewhat from those in free amino acids because of the influence of neighboring amino acids. The buffering capacity of ionizable groups is primarily in a pH range within ±1 unit of the pK for the respective groups. Amino acids and proteins therefore have a limited buffering capacity near physiologic pH. Glycine, for example, is used as a buffer near pH 2.5 or 9.5. The imidazole side chain of histidine is an exception with a pK near 6.0.
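The relationship between pK values and the isoelectric point can be sketched numerically. The following is a minimal illustration, not a validated method: it assumes each ionizable group behaves independently according to the Henderson-Hasselbalch relationship, and the function names are hypothetical. It uses the glycine pK values quoted above and the rounded histidine values from Table 31.1.

```python
def fraction_protonated(pH, pK):
    """Fraction of a group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10 ** (pH - pK))


def net_charge(pH, acidic_pKs, basic_pKs):
    """Average net charge of a molecule at a given pH.

    Acidic groups (e.g., carboxyl) are neutral when protonated and carry -1
    when deprotonated; basic groups (e.g., amino, imidazole) carry +1 when
    protonated and are neutral when deprotonated.
    """
    negative = sum(1.0 - fraction_protonated(pH, pK) for pK in acidic_pKs)
    positive = sum(fraction_protonated(pH, pK) for pK in basic_pKs)
    return positive - negative


def isoelectric_point(acidic_pKs, basic_pKs, lo=0.0, hi=14.0, tol=1e-6):
    """Find the pH of zero net charge by bisection.

    Net charge falls monotonically as pH rises, so bisection on [0, 14]
    converges as long as the charge changes sign within that interval.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, acidic_pKs, basic_pKs) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Glycine: carboxyl pK 2.34, amino pK 9.60 (values from the text)
glycine_pI = isoelectric_point(acidic_pKs=[2.34], basic_pKs=[9.60])

# Histidine: carboxyl 1.8, amino 9.2, imidazole 6.0 (rounded table values)
histidine_pI = isoelectric_point(acidic_pKs=[1.8], basic_pKs=[9.2, 6.0])

print(round(glycine_pI, 2), round(histidine_pI, 2))
```

For glycine the numerical result agrees with the midpoint of the two pK values, and for histidine it falls close to the midpoint of the imidazole and amino pK values. At a pH equal to its pK a group is half protonated, which is why buffering is strongest within about 1 pH unit of the pK.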

The structural diversity of side chains permits formation of proteins with a variety of structures and functions. Side chain diversity is dictated not only by pK but by size and hydrophobicity as well. Amino acids with longer aliphatic or aromatic side chains such as isoleucine, leucine, and phenylalanine have greater hydrophobicity than those with shorter side chains such as the methyl group found in alanine. Neutral amino acids with polar groups such as hydroxyl or amide groups in their side chains are more hydrophilic. Acidic amino acids have side chains with carboxylic acids, and basic amino acids have side chains with amino, guanidino, or imidazole groups. The thiol side chain (–SH) of cysteine oxidizes easily and may become linked to other molecules via disulfide bonds. In plasma, cysteine occurs as cystine (cysteine homodimer linked via a disulfide) or as a mixed disulfide with heterogeneous thiol compounds, albumin, or other proteins.

With some exceptions, amino acids are water soluble and stable in plasma. The most soluble amino acids, such as glycine, alanine, arginine, serine, and threonine, tend to have small side chains or side chains with polar or ionizable moieties. Less soluble amino acids such as phenylalanine, tyrosine, leucine, and tryptophan tend to have larger, nonpolar aliphatic or aromatic side chains. Amino acid solubility is rarely limiting in vivo except in some metabolic disorders. Deposition of tyrosine crystals in the eye and skin is common in tyrosinemia, particularly type II ( https://www.omim.org/entry/276600 ). Likewise, cystine may crystallize in the renal parenchyma in patients with cystinuria ( https://omim.org/entry/220100 ). Structural and chemical details for the 20 protein-forming amino acids are displayed in Table 31.1.

Amino acid supply and transport

Amino acids participate in many metabolic pathways in addition to serving as substrates for protein synthesis. In the healthy state, women require approximately 46 g/day and men approximately 56 g/day of dietary protein (0.8 g/kg body weight), and substantial increases in demand occur during growth and in many disease states (see Chapter 46 ). Dietary protein is digested by proteases in the stomach (e.g., pepsin) and small intestine (e.g., trypsin, chymotrypsin) to yield amino acids (see Chapter 52 ). Endogenous protein turnover serves as another source of free amino acids. Eight amino acids used for protein synthesis (isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine) are not synthesized by humans and therefore are considered “essential” constituents of the diet. Meat, milk, eggs, and fish contain a full range of essential amino acids. Gelatin is deficient in tryptophan, and some plant sources of protein may be additionally deficient in lysine or methionine. Therefore diets based on a single source of plant protein may be deficient in some amino acids. When liver function is compromised, cysteine and tyrosine become essential because they are not produced from their usual precursors, methionine and phenylalanine, respectively. Arginine may be conditionally essential as well because endogenous rates of synthesis may be insufficient to meet requirements in adults under metabolic stress or in growing children.

Requirements for dietary protein to maintain nitrogen balance increase in infancy and childhood when there are increased demands for growth. Daily requirements increase to as much as 3.5 to 4 g protein/kg body weight for premature infants, for example. Protein demand is also increased in pregnancy, lactation, and states of protein loss or catabolic states (e.g., burn patients). Persistent negative nitrogen balance results in a number of undesirable phenotypic features. A diet severely deficient in protein and consisting primarily of high-starch foods can lead to kwashiorkor, a disorder characterized by decreased serum albumin, immune deficiency, edema, ascites, growth failure, apathy, and many other symptoms (see Chapter 46). Marasmus results when protein and energy sources such as carbohydrates are deficient, causing wasting of muscles and subcutaneous tissues. Albumin or prealbumin concentrations are sometimes used to assess adequacy of the amino acid supply. The shorter biologic half-life of prealbumin compared with albumin (2 vs. 20 days) makes it a valuable marker for acute dietary assessment.
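The weight-based requirements quoted in this section reduce to simple arithmetic. The sketch below is illustrative only and is not a dosing tool. The reference body weights of 57.5 kg and 70 kg are back-calculated from the 46 and 56 g/day figures at 0.8 g/kg and are not stated in the text, and the 1.5 kg infant weight is a hypothetical example.

```python
def daily_protein_requirement_g(body_weight_kg, g_per_kg=0.8):
    """Daily protein requirement in grams for a given body weight.

    The default of 0.8 g/kg is the healthy adult figure quoted in the text.
    """
    return body_weight_kg * g_per_kg


# Reference weights implied by 46 g/day and 56 g/day at 0.8 g/kg (assumption)
woman_g = daily_protein_requirement_g(57.5)
man_g = daily_protein_requirement_g(70.0)

# Hypothetical 1.5 kg premature infant at the 3.5 to 4 g/kg range from the text
infant_low_g = daily_protein_requirement_g(1.5, g_per_kg=3.5)
infant_high_g = daily_protein_requirement_g(1.5, g_per_kg=4.0)

print(round(woman_g, 1), round(man_g, 1), infant_low_g, infant_high_g)
```

Per kilogram, the premature infant requirement is roughly four to five times the healthy adult figure.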

Homeostasis of cellular amino acid concentrations is dependent on supply, catabolism, and excretion. Supply and excretion are regulated by a series of transport systems with overlapping substrate specificity, strategic tissue expression, and polarized cellular distribution. Amino acids are derived from dietary protein precursors through the action of proteolytic enzymes in the stomach and small intestine that produce shorter oligopeptides and individual amino acids. Enteral absorption of di- and tripeptides is mediated by a single proton-coupled transport system termed peptide transporter 1 or PEPT1 (encoded by SLC15A1). Transport of individual amino acids across the intestinal and renal epithelium and the blood–brain barrier is far more specialized.

Early biochemical characterization of amino acid transport was technologically limited to studies of the plasma membrane. These broad-specificity transport systems function as co-transporters, exchange transporters, or facilitative transporters. Nomenclature was based on substrate specificity, co-transport requirements, and sensitivity to inhibitors. By convention, capital letters indicate a requirement for Na+, and lowercase descriptors imply the lack of Na+ dependence. Systems A and ASC are responsible for Na+-mediated symport of neutral species with small side chains. System L facilitates exchange of amino acids with large, hydrophobic side chains. Cationic amino acid transport is mediated by a system termed y+. System y+L facilitates exchange of neutral amino acids with cationic species. System B0 catalyzes Na+-mediated transport of branched-chain and aromatic amino acids, and system b0 enables exchange of amino acids with bulky neutral and cationic side chains. Finally, system X mediates transport of anionic amino acids, and system xc facilitates transmembrane exchange of glutamate and cystine. These systems act in a coordinated way to achieve amino acid homeostasis.

Distinct transport systems cooperate to achieve net amino acid transport across epithelial barriers. For example, transcellular transport of cationic amino acids across the renal brush border is achieved by a combination of two exchange systems, b0 and y+L. On the apical surface, positively charged amino acids are imported in exchange for an uncharged amino acid via system b0, which uses the transmembrane electrical potential to drive transport. Efflux across the basolateral membrane is also achieved by exchange, in this case via system y+L, which uses the driving force of the transmembrane sodium gradient. Lack of proper polarized expression in appropriate tissues results in transport disorders such as cystinuria ( https://omim.org/entry/220100 ) and lysinuric protein intolerance ( https://omim.org/entry/227100 ).

Intracellular amino acid compartments are maintained by a distinct set of transport systems. These systems are important for metabolizing, sequestering, and recycling various amino acids. Substrate concentrations in the urea cycle are regulated in part via the mitochondrial ornithine-citrulline antiporter. Defects in this transport system lead to HHH (hyperornithinemia, hyperammonemia, homocitrullinuria) syndrome ( https://omim.org/entry/238970 ) (see Chapter 60). Reclamation of amino acids from lysosomal protein digestion also relies on transport systems. Defects in the CTNS gene, for example, inhibit lysosomal cystine transport and result in the clinical disorder known as cystinosis ( https://omim.org/entry/219750 ). Finally, neuronal vesicles must concentrate synaptic transmitters to achieve interneuronal communication. Two of these transmitters, glycine and glutamate, are actively transported across vesicle membranes.

The rather coarse biochemical definition of amino acid transport is being redefined as the genetic basis for these systems is clarified. Table 31.2 summarizes transporter gene families and their connection to functional transport systems. In some cases, functional characterization of these gene families is incomplete, and many of these transporter families have not been linked definitively to human disease. Genes of the SLC (solute carrier) family encode integral transmembrane spanning proteins that catalyze amino acid transport. The SLC1 family, for example, encodes transporters primarily responsible for transport of anionic amino acids that are particularly important in brain and neural tissues. Neutral amino acid transporters (e.g., System A) are encoded by the SLC38 family. Mitochondrial transport systems are expressed by the SLC25 gene family. The SLC3 and SLC7 gene families encode a wide array of heterodimeric transporters, including the aforementioned b0 and y+L transport systems. Each of these transporters contains a membrane-targeting subunit encoded by one of two SLC3 genes, disulfide linked to one of at least a dozen SLC7 gene products that dictate transport specificity. Metabolic flux of amino acid carbon is critically dependent on the proper expression and regulation of transport and enzyme-catalyzed chemical transformation.

TABLE 31.2
Genetic Basis of Selected Amino Acid Transport Systems

| Gene Family | System | Expression | Substrates | Disease Link |
| --- | --- | --- | --- | --- |
| SLC1 | X, ASC | Brain, gut, kidney, liver | D, E, A, S, T, C, N, Q | Dicarboxylic aminoaciduria |
| SLC3A1-2 | y+, L, y+L, b0 | Broad | R, K, H, M, L, A, C, L, I, V, cystine | Cystinuria |
| SLC6 | B0 | Brain, kidney, gut, liver | F, Y, L, I, V, P, C, A, Q, S, H, G, M | Hartnup disorder; iminoglycinuria |
| SLC7 (A1-14) | y+, y+L, b0, ASC, x | Broad | R, K, cystine, ornithine | Lysinuric protein intolerance |
| SLC15 | a | Brain, kidney, gut, immune cells | H | None known |
| SLC16 | a | Gut, kidney | F, Y, W | None known |
| SLC17 | a | Neurons | E | None known |
| SLC25 | ASP/GLU and ORN/CIT antiporters | Broad (mitochondria) | D, E, ornithine, citrulline | Type II citrullinemia; HHH syndrome |
| SLC32 | a | Neurons | G, γ-aminobutyrate | None known |
| SLC36 | a | Brain, intestine, kidney | P, G, W | None known |
| SLC38 | A, N | Broad | Q, A, N, C, H, S, T | None known |
| SLC43 | L | Liver, kidney, gut, muscle | L, I, V | None known |
| CTNS (SLC66) | a | Broad | Cystine | Cystinosis |

HHH, Hyperornithinemia, hyperammonemia, homocitrullinuria.

a Transport properties not classified by classical naming convention.

Amino acid metabolism

Amino acids serve as scaffolds for the synthesis of many hormones, nucleotides, lipids, signaling molecules, and metabolic intermediates that play a role in energy production. As portrayed in Fig. 31.2, the transformation of amino acid carbon to energetic intermediate typically begins with transamination. Excess nitrogen is excreted as urea. Resulting α-ketoacids may enter the Krebs cycle; undergo conversion to ketone bodies, fatty acids, or glucose; or be completely oxidized to CO2 depending on cellular energy demands. A vast array of enzyme networks has evolved to orchestrate demand for amino acids. Information regarding the substrates, products, kinetics, and inhibitors of these enzymes may be found in multiple databases, including BRENDA ( brenda-enzymes.org ), ExPASy-enzyme ( enzyme.expasy.org ), and ExplorEnz ( enzyme-database.org ). Pathway databases include KEGG ( genome.jp/kegg ), GenMAPP ( genmapp.org ), and BioCyc ( biocyc.org ).

FIGURE 31.2, A generalized scheme of amino acid metabolism in the liver. After transamination, amino acid carbon may be used in the Krebs cycle directly or transformed into other respiratory fuels such as glucose and ketones. Waste nitrogen is disposed of via the urea cycle. ATP , Adenosine triphosphate.

Glucose, fatty acids, and ketones are primary respiratory substrates in humans. These substrates generate adenosine triphosphate (ATP) via the mitochondrial Krebs cycle. Amino acids play two key roles in energy provision. First, amino acids are converted to Krebs cycle intermediates to maintain the activity of the cycle through a process called anaplerosis. Glutamine and glutamate, for example, are converted to α-ketoglutarate via loss of the side chain amide nitrogen (glutamine) and the α-amino group. Fumarate may be derived from asparagine and aspartate, and succinate is derived from methionine, threonine, and valine. Second, amino acids may be mobilized to generate fuels for a variety of organ systems. Five amino acids (isoleucine, leucine, lysine, phenylalanine, and tyrosine) may be converted to ketones. All of the amino acids except for leucine may be used to produce glucose. Therefore, in times of high energy demand and limiting fuel sources, flux of amino acid carbon through proper pathways becomes an important source of respiratory fuel.

Excess tissue nitrogen is disposed of as urea, which contains two moles of nitrogen per mole (see also Chapter 60 ). Urea production is limited to the liver, so selected amino acids, primarily glutamine and alanine, serve to shuttle excess nitrogen to the liver. Nitrogen in the form of ammonium ion is first converted to carbamoyl phosphate, which is transferred to ornithine to form citrulline. Aspartic acid and citrulline are condensed to form argininosuccinic acid, which, in turn, is cleaved to arginine and fumaric acid. Arginase hydrolyzes arginine to urea and ornithine to allow the cycle to repeat. Urea is usually viewed simply as waste, but it is also the primary contributor to the high osmolality in the renal medulla and enables maximal urinary concentrating ability.
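Because each mole of urea carries two moles of nitrogen, urea and urea nitrogen concentrations are related by a fixed stoichiometric factor. The sketch below illustrates that arithmetic using standard atomic masses. The function names are hypothetical, and the conversion assumes that all measured nitrogen is urea nitrogen.

```python
# Standard atomic masses (g/mol)
CARBON, HYDROGEN, NITROGEN, OXYGEN = 12.011, 1.008, 14.007, 15.999

# Urea is CH4N2O: one carbon, four hydrogens, two nitrogens, one oxygen
NITROGEN_ATOMS_PER_UREA = 2
UREA_MOLAR_MASS = CARBON + 4 * HYDROGEN + NITROGEN_ATOMS_PER_UREA * NITROGEN + OXYGEN


def urea_nitrogen_mass_fraction():
    """Fraction of the mass of urea that is nitrogen."""
    return NITROGEN_ATOMS_PER_UREA * NITROGEN / UREA_MOLAR_MASS


def urea_mmol_per_l(urea_nitrogen_mg_per_dl):
    """Convert urea nitrogen (mg/dL) to urea (mmol/L).

    mg/dL times 10 gives mg/L; dividing by the mass of nitrogen per mole of
    urea (2 x 14.007 g/mol) gives mmol of urea per liter.
    """
    return urea_nitrogen_mg_per_dl * 10.0 / (NITROGEN_ATOMS_PER_UREA * NITROGEN)


print(round(UREA_MOLAR_MASS, 3))                # molar mass of urea
print(round(urea_nitrogen_mass_fraction(), 3))  # nitrogen is just under half of the mass
print(round(urea_mmol_per_l(14.0), 2))          # hypothetical urea nitrogen of 14 mg/dL
```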

Amino acids are precursors for many hormones and signaling molecules. Tyrosine provides a scaffold for thyroxine, dopamine, and adrenaline synthesis. Tryptophan is a precursor of serotonin and melatonin. The potent vasodilator nitric oxide (NO) is produced from arginine. Glycine, aspartate, glutamine, and serine contribute atoms to purine and pyrimidine synthesis. Glycine and arginine are precursors for creatine synthesis.

Creatine synthesis and many other biochemical processes rely on a series of single-carbon transfer reactions mediated by serine, glycine, histidine, and methionine. Transfer of fully oxidized carbonyl carbon (=C=O) to molecules such as propionyl CoA (to form methylmalonyl CoA) is mediated by biotin. Glycine, serine, and histidine contribute less oxidized carbon units such as methylidyne (=CH–) and methylene (–CH2–) groups that enable purine and pyrimidine synthesis via folate derivatives. Folate also mediates transfer of methyl (–CH3) groups to homocysteine to form methionine. The resulting methionine is, in turn, activated to S-adenosylmethionine, which becomes a methyl donor to a vast array of substrates including DNA, RNA, histones, choline, and catecholamines. Folate and single-carbon metabolism are essential to cell growth and division. Folate deficiency in a developing embryo can lead to death or severe neurologic birth defects. The use of folate antimetabolites such as methotrexate has been a mainstay in the treatment of cancer for many decades. These pathways are treated in more detail in Chapter 39.

Amino acid concentrations

Plasma amino acid concentrations collectively span four orders of magnitude from very low micromolar quantities (e.g., β-alanine, cystathionine) to near 1 mmol/L (e.g., glutamine, glycine). With protein intake of 1 to 2 g/kg, daily variation of approximately 30% in healthy adults has been observed. Concentrations of both essential and nonessential amino acids vary in a coordinated way, suggesting that mechanisms beyond diet and enteral extraction are responsible. Amino acid concentrations tend to peak between noon and 8 pm with a nadir between midnight and 4 am. After an ingested protein bolus, dietary amino acids rise and tend to return to preprandial levels in 3 to 6 hours. Therefore, determination of “fasting” amino acid concentrations requires extended periods of dietary abstinence.

Most amino acids in blood undergo glomerular filtration but are efficiently reabsorbed in proximal renal tubules by previously described saturable transport systems. Increased renal excretion of amino acids ( aminoaciduria ) results from filtration of excessive plasma concentrations, generalized tubular impairment, or heritable defects in amino acid transport systems. Glycine tends to be most abundant in normal urine followed by histidine, glutamine, and serine. Increased concentrations of proteogenic amino acids in plasma tend to precipitate only mildly elevated excretion because of efficient reabsorption. Other amino acids that accumulate in plasma secondary to metabolic errors (e.g., argininosuccinate, homocitrulline) demonstrate pronounced excretion because of the absence of specific tubular mechanisms enabling reclamation from the filtrate.

With the exception of glutamine, CSF amino acid concentrations are typically less than 10% of those found in plasma. This high plasma to CSF gradient suggests active net brain to blood transport across the blood–brain barrier. Glutamine concentrations in CSF are generally equal to those in plasma, suggesting a bidirectional facilitative transport process. Insofar as CSF concentrations reflect synaptic concentrations, regulation of neurotransmitter amino acid concentrations is critical for normal neural action potential propagation. Glutamate is the most abundant amino acid in the brain and is the primary excitatory transmitter. Glycine and GABA are the predominant inhibitory transmitters. Lumbar puncture to access the CSF amino acid pool must be done with great care to avoid overestimation of central amino acid concentrations secondary to contamination with peripheral blood.

Assessment of amino acid concentrations in blood, urine, and spinal fluid has been historically applied to the detection of inborn errors of metabolism. These are comprehensively covered in Chapter 60. Aside from the measurement of homocysteine as a marker of vitamin B12 and folate status, clinical applications of amino acid measurement beyond metabolic diseases are limited. Future applications may include assessment of immunity using tryptophan and its metabolites such as kynurenine. Increased plasma concentrations of α-aminobutyric acid may be useful in detecting liver regeneration. Branched-chain amino acid concentrations may be early indicators of diabetes, while a combination of phenylalanine, glutamate, and alanine has some value to predict the onset of preeclampsia. Finally, quantitation of arginine and its dimethylated derivatives (asymmetric and symmetric dimethylarginine) may have utility in assessing endothelial function. These applications require further clinical validation.

Analysis of amino acids

For decades, the standard method of amino acid analysis was cation-exchange chromatography with postcolumn spectrophotometric or fluorescent detection of various primary amine derivatives. Derivatizing agents have included dansyl chloride, o-phthalaldehyde, and ninhydrin. The ninhydrin approach developed by Stein and Moore in the 1950s was initially applied to determination of amino acid content of protein hydrolysates and then adapted for profiling of free amino acids in deproteinized body fluids. Other systems commercialized by Beckman (Brea, CA) and Hitachi (Tokyo, Japan) were large floor models and required as long as 2 to 3 hours to quantitate 30 to 50 physiologic amino acids in a single patient specimen. These systems have given way to smaller bench-top systems using ninhydrin (Biochrom, Cambridge, United Kingdom) or fluorescent (quinolyl- N -hydroxysuccinimidyl) amine derivatives (Waters, Milford, MA) that still require 90 to 120 minutes for full sample analysis. In addition to long cycle times, these methods are subject to interference from co-eluting amines, leading to overestimation of some amino acid concentrations. Common co-eluting compounds include methionine with homocitrulline, phenylalanine with aminoglycosides, and histidine with gabapentin.

MS is increasingly being adopted as the method of choice for amino acid profiling. Newborn screening programs quantitate amino acid butyl esters derived from dried blood spots using flow-injection MS protocols. These methods do not use chromatographic separation and so cannot distinguish between isomeric or isobaric amino acids such as leucine, isoleucine, alloisoleucine, and hydroxyproline. Liquid chromatography–tandem MS (LC-MS/MS) methods for detection of amino acids in plasma and other body fluids have also been developed (see Chapters 19 and 20). Some of these use amine-targeted derivatives, others target the carboxyl group, and some others use no chemical derivatization. Advantages of MS-based techniques include improved analytic specificity, a three- to four-order-of-magnitude dynamic range, and rapid (20-minute) throughput. MS methods may also be optimized for profiling multiple molecular species in addition to amino acids. Such approaches promise to improve the scope of metabolic disorders detectable with a single patient specimen in a single analytic run.
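The isomeric and isobaric problem described above can be illustrated by computing masses from elemental formulas. This is a minimal sketch that considers the underivatized amino acids, whereas newborn screening methods measure butyl esters. The elemental formulas and monoisotopic atomic masses are standard values rather than data from this chapter, and the function names are hypothetical.

```python
# Monoisotopic masses of the most abundant isotopes (Da)
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental compositions of the underivatized amino acids
FORMULAS = {
    "leucine": {"C": 6, "H": 13, "N": 1, "O": 2},
    "isoleucine": {"C": 6, "H": 13, "N": 1, "O": 2},
    "alloisoleucine": {"C": 6, "H": 13, "N": 1, "O": 2},
    "hydroxyproline": {"C": 5, "H": 9, "N": 1, "O": 3},
}


def monoisotopic_mass(formula):
    """Sum the monoisotopic atomic masses for an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def group_by_nominal_mass(formulas):
    """Group compounds by monoisotopic mass rounded to the nearest integer."""
    groups = {}
    for name, formula in formulas.items():
        groups.setdefault(round(monoisotopic_mass(formula)), []).append(name)
    return groups


groups = group_by_nominal_mass(FORMULAS)
leucine_mass = monoisotopic_mass(FORMULAS["leucine"])
hydroxyproline_mass = monoisotopic_mass(FORMULAS["hydroxyproline"])

print(groups)
print(round(leucine_mass, 4), round(hydroxyproline_mass, 4))
```

Leucine, isoleucine, and alloisoleucine share one elemental formula (isomers) and therefore have identical masses at any resolution. Hydroxyproline differs from them by only a few hundredths of a dalton (isobaric at unit resolution), which is why chromatographic separation or another orthogonal step is needed to tell these species apart.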

Peptides

This section describes the basic biochemistry of peptides. In general, the term peptide applies to relatively short polymers of amino acids with molecular weights less than 6000 Da (<~50 amino acid residues). The chemistry of the peptide bond and the physical characteristics of the peptide backbone are discussed in this section along with a number of clinically relevant peptide systems.

Peptide bond

A peptide bond, also referred to as an amide bond, is formed between the α-nitrogen atom of one amino acid and the carbonyl carbon of a second (diagrammed below).

So-called isopeptide bonds are amide bonds that involve a side-chain amine or side-chain carboxyl group rather than the α-amine or α-carboxyl group. In glutathione, for example, the γ-carboxyl group of glutamic acid is linked to the α-amino group of cysteine. During translation, peptide bonds are formed from the amino (N) to the carboxyl (C) terminus by removal of water (also referred to as dehydration or condensation), a reaction catalyzed by RNA (referred to as a ribozyme) that forms part of the ribosome. Peptides are also synthesized in vitro for therapeutic and experimental purposes. Such chemical peptide synthesis proceeds from the C to the N terminus using N-protected amino acids and is mediated by N,N′-dicyclohexylcarbodiimide. In this scheme, the nucleophilic amine group reacts with a carbodiimide:carbonyl intermediate, resulting in the formation of a new peptide bond and dicyclohexylurea. Dicyclohexylurea is insoluble in most solvents and can be easily removed from the maturing peptide. Cleavage of peptide bonds may be nonspecifically achieved by acid hydrolysis or accomplished specifically by a host of proteolytic enzymes with affinity for bonds between specific amino acid residues. These protease systems are described later in this chapter.
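
Because each peptide bond is formed with loss of one water molecule, the mass of a linear peptide equals the sum of its residue masses plus a single water. The sketch below illustrates this bookkeeping under stated assumptions: the residue masses are rounded monoisotopic values, and the function name is hypothetical.

```python
# Monoisotopic residue masses in Da (free amino acid minus one water), rounded.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # H2O retained at the free N and C termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide (one-letter codes)."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


# Glutathione (Glu-Cys-Gly). The gamma-linked isopeptide bond also releases
# one water, so the same arithmetic applies.
glutathione = peptide_mass("ECG")
print(f"Glutathione: {glutathione:.4f} Da")
```

The result for glutathione (about 307.08 Da) is the same whether glutamate is joined through its α- or γ-carboxyl group, which is one reason mass alone cannot identify an isopeptide linkage.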

Electron sharing in the amide bond (also known as the ω bond) is delocalized, effectively preventing rotation about this bond. This bond is fixed in one plane. Conformational flexibility of the peptide backbone results entirely from rotation about the axes of the two bonds to the α-carbon. Angles of rotation about these bonds are referred to as Ramachandran angles. The angle of rotation about the nitrogen to α-carbon bond is referred to as the Φ angle, and the angle of rotation about the α-carbon to carbonyl carbon bond is referred to as the Ψ angle (below).

Theoretically, free rotation about these bonds allows angles ranging from −180 to 180 degrees. In reality, steric and energetic factors limit the possible combinations. These bond angles play a key role in dictating the secondary structure of proteins. For example, values of Φ and Ψ in α-helices are approximately −60 and −45 degrees, respectively. Secondary structure is addressed more extensively later in this chapter.
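
Rotation angles of this kind are torsion (dihedral) angles defined by four consecutive backbone atoms. The following is a minimal sketch of how such an angle could be computed from atomic coordinates; the helper names and the test coordinates are hypothetical and do not come from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Phi uses atoms C(i-1), N(i), CA(i), C(i); psi uses N(i), CA(i), C(i), N(i+1).
origin, along_z = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = dihedral_degrees((1.0, 0.0, 0.0), origin, along_z, (1.0, 0.0, 1.0))
trans = dihedral_degrees((1.0, 0.0, 0.0), origin, along_z, (-1.0, 0.0, 1.0))
quarter = dihedral_degrees((1.0, 0.0, 0.0), origin, along_z, (0.0, 1.0, 1.0))
print(cis, trans, quarter)
```

Applying the same function to each residue of a solved structure yields the paired Φ and Ψ values that are plotted against each other to show which backbone conformations are sterically allowed.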

Peptide heterogeneity and analysis

Assessment of circulating peptide concentrations has a number of limitations. In the absence of enzymatic activity, peptide measurements have been historically limited to immunologic techniques (discussed in more detail in Chapter 26). Antibodies may recognize linear sequence epitopes or discontinuous, conformational epitopes. These epitopes typically involve 10 to 20 amino acids binding exposed areas of 600 to 1000 Å². Measurement of short peptides (<20 to 30 amino acids) is therefore limited to single-site, competitive assays that lack the analytic specificity of two-site (sandwich) immunoassays. The molecular specificity issues may be addressed using MS as an alternative. Small peptides may be ionized via electrospray (ESI) or matrix-assisted laser desorption (MALDI) and interfaced to tandem quadrupole mass analyzers (see Chapter 20).

Absolute analytic specificity is not always ideal when applied to biologic peptide systems. Peptide populations may consist of species with a variable number of amino acid residues possessing sometimes unknown biologic potency. Hepcidin, for example, is an iron transport regulatory peptide that circulates principally as a 25–amino acid peptide but also as shorter peptides of 22 and 20 amino acids with diminished biologic activity (see Chapter 40 ). Likewise, dozens of truncated forms of the mature 32–amino acid B-type natriuretic peptide ranging from 24 to 31 amino acids are detectable in heart failure patients (see Chapter 48 ). Some of these truncated forms are present in vivo, and others likely develop in vitro. Thus narrowly targeted MS assays may exclude active peptide species and run the risk of underestimating bioactive peptide. Cross-reactive immunoassays, on the other hand, may stoichiometrically detect both active and inactive peptide, thus running the risk of overestimating bioactive peptide concentrations. Examples of several important biologic peptide systems and their analytic considerations follow.

Selected clinically relevant peptide systems

Pro-opiomelanocortin system

The pro-opiomelanocortin ( POMC ) gene on chromosome 2 is expressed primarily in the pituitary gland, arcuate nucleus of the hypothalamus, and skin melanocytes. The gene produces a 241–amino acid prohormone that can yield as many as 10 distinct biologically active peptides depending on patterns of cleavage in specific tissue types. The POMC peptides have diverse effects on glucose and electrolyte homeostasis (via adrenocorticotropic hormone [ACTH]), body mass and appetite (via lipotropins and melanocortins), pigmentation (also via melanocortins), and pain (via endorphins).

Clinical exploitation of this complex peptide system is currently limited to the impact of ACTH on the adrenal gland and subsequent feedback by cortisol. Measurement of circulating ACTH concentrations may be used to clarify the mechanism of adrenal disease. For example, Cushing syndrome may result from autonomous adrenal function (ACTH low) or may be fueled by ectopic ACTH production (ACTH high). Likewise, Addison disease may result from adrenal failure (ACTH high) or pituitary failure (ACTH low). The pathophysiology of this axis is treated more extensively in Chapters 55 and 56 . Full-length ACTH consists of 39 amino acid residues and circulates with a half-life ranging from 10 to 15 minutes. The biologic activity of ACTH is contained in residues 1 to 18, and the length of the C-terminus mediates circulating half-life. Two-site immunoassays typically target the extreme N and C termini to avoid detection of shorter, inactive circulating species. This approach, however, can lead to cross-reactivity with longer precursor forms of ACTH such as pro-ACTH and the full POMC gene product. These precursors typically circulate at concentrations that are five times greater (5 to 50 pmol/L) than ACTH (1 to 10 pmol/L). ACTH assays are poorly standardized.

Natriuretic peptides

This peptide family consists of atrial natriuretic peptide (ANP), B-type natriuretic peptide (BNP, formerly brain natriuretic peptide), and C-type natriuretic peptide (CNP). ANP and BNP are highly expressed in cardiac tissue relative to CNP, which is expressed at low concentrations in a broad variety of tissue types. The mature forms of these peptides contain a 17–amino acid loop stabilized by an intramolecular disulfide bond. Each peptide acts via a specific guanylate cyclase-coupled receptor to promote sodium and water excretion, blunt activation of the renin–angiotensin system, and decrease vascular resistance. Circulating concentrations of ANP and BNP (but not CNP) increase rapidly in response to increased cardiac filling pressures that are characteristic of heart failure. Clinical measurement of BNP has become a widely used tool to detect heart failure and monitor its progression.

B-type natriuretic peptide is synthesized as a 108–amino acid precursor (pro-BNP) that is cleaved upon cellular release to the active 32–amino acid peptide and an N-terminal fragment (NT-proBNP), which lacks biologic activity. The diagnostic and prognostic role of these peptides is treated extensively in Chapter 48. Two-site immunoassays for both BNP and NT-proBNP are commercially available and widely used. NT-proBNP circulates at higher concentration than BNP by virtue of its longer biologic half-life (~60 vs. ~20 minutes). Recent evidence suggests that the BNP detected immunologically in heart failure is not the bioactive form of BNP. Using MS, almost no mature 32–amino acid peptide was detected in heart failure patients despite the significant presence of immunoreactive BNP. Further studies suggest that the immunoreactive BNP in the plasma of heart failure patients may be attributed to higher molecular weight forms such as proBNP that exhibit a fraction of the bioactivity of the mature peptide. This new analytic information may prompt a reevaluation of the role of BNP in the pathophysiology, diagnosis, and treatment of heart failure.
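
The link between half-life and circulating concentration can be made explicit with a simple kinetic sketch. It assumes a one-compartment model with first-order elimination, equimolar release of BNP and NT-proBNP, and an identical volume of distribution for both; these are simplifying assumptions for illustration, and the function name is hypothetical.

```python
import math


def steady_state_level(secretion_rate, half_life_min, volume=1.0):
    """Steady-state concentration for constant secretion and first-order loss.

    At steady state, input equals elimination: rate = k * C * V,
    where k = ln(2) / half-life.
    """
    k = math.log(2) / half_life_min
    return secretion_rate / (k * volume)


# Equimolar release (arbitrary units) with the approximate half-lives from the text.
nt_probnp = steady_state_level(1.0, half_life_min=60.0)
bnp = steady_state_level(1.0, half_life_min=20.0)
ratio = nt_probnp / bnp
print(f"Predicted NT-proBNP:BNP molar ratio: {ratio:.1f}")
```

Under these assumptions the predicted molar ratio equals the ratio of the half-lives. Measured ratios in patients can differ because clearance mechanisms, renal function, and assay cross-reactivity are not captured by this model.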

Hepcidin

Hepcidin was initially described as an antimicrobial peptide. The role of hepcidin in iron metabolism was noted by Nicolas et al. in 2001. The mature 25–amino acid molecule is derived from a 60–amino acid precursor expressed from 3 exons of the HAMP gene on chromosome 19. The tightly looped structure of hepcidin is stabilized by four intramolecular disulfide bonds. The biologic activity of hepcidin is mediated via its interaction with ferroportin, the transport protein that mediates iron transport from duodenal enterocytes and macrophages. Hepcidin binding promotes the internalization and degradation of ferroportin, thus inhibiting mobilization of iron stores. Physiologic states such as chronic inflammation are characterized by microcytic anemia with paradoxically adequate iron stores known as anemia of chronic disease. Increased hepcidin expression is at the pathologic root of this condition. In addition to its diagnostic role in differentiating iron deficiency anemia from the anemia of chronic disease, hepcidin measurement may aid in the treatment of hemochromatosis, transfusion-associated iron overload, and anemia associated with chronic renal failure.

Despite the important role of hepcidin in iron metabolism, its clinical use remains infrequent partly because of difficulties associated with its measurement. Antibodies toward hepcidin have been difficult to develop because of its small, compact size and because it is highly conserved across multiple species. Immunoassays have largely been limited to single-site, competitive formats that cross-react significantly with shorter versions of the molecule (22-, 20-mers). This can lead to overestimation of circulating hepcidin compared with MS techniques when applied to patients with renal failure in whom shorter hepcidin peptides tend to accumulate in the plasma. Improvements in the molecular specificity and harmonization of hepcidin determination will further clarify its role in both normal and pathologic physiology and also promise to enhance its clinical utility. For additional discussion on hepcidin, see Chapter 40 .

Angiotensins

Renin is secreted by the afferent arterioles of the kidney in response to decreased flow, pressure, and sodium delivery to the renal juxtaglomerular apparatus. Renin acts on circulating angiotensinogen to initiate the formation of vasoactive peptides that act to reestablish glomerular flow. The N-terminal decapeptide cleaved from the 452–amino acid angiotensinogen molecule by renin is referred to as angiotensin I. Angiotensin-converting enzyme (ACE) cleaves 2 C-terminal residues from angiotensin I to form the octapeptide, angiotensin II, which promotes contraction of vascular smooth muscle and stimulates proximal tubular sodium reabsorption to increase blood pressure. ACE inhibitors (e.g., captopril, enalapril, quinapril) are important pharmacologic tools used to treat hypertension (see also Chapter 56 ).

Although the renin–angiotensin–aldosterone axis is an important therapeutic target, it is a far more infrequent target for diagnostic laboratory studies. Plasma renin activity is assessed to explore the possibility of renovascular hypertension. In this condition, unilateral restriction of renal blood flow results in inappropriate release of renin and severe hypertension. In nonpathologic states, renin activity in the plasma is normally very low (<10 ng/mL/h) and is typically measured by assessing the production of angiotensin I from endogenous angiotensinogen after a long (>12 hours) incubation period. Angiotensin I generation is most commonly monitored via a competitive immunoassay with the potential to cross-react with shorter peptides. MS approaches that address peptide specificity and stability may mitigate these analytic limitations.
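
The units of plasma renin activity (ng/mL/h) follow directly from how the result is derived: the angiotensin I generated during incubation is divided by the incubation time. The sketch below shows this arithmetic; the subtraction of an unincubated blank, the function name, and the example values are assumptions for illustration and do not describe a specific laboratory protocol.

```python
def plasma_renin_activity(ang1_incubated_ng_ml, ang1_blank_ng_ml, incubation_h):
    """Angiotensin I generation rate in ng/mL/h.

    ang1_blank_ng_ml is the angiotensin I measured in an aliquot that was not
    incubated (assumed here to represent preexisting peptide).
    """
    if incubation_h <= 0:
        raise ValueError("incubation time must be positive")
    return (ang1_incubated_ng_ml - ang1_blank_ng_ml) / incubation_h


# Hypothetical specimen: 20 ng/mL after an 18-hour incubation, 2 ng/mL in the blank.
pra = plasma_renin_activity(20.0, 2.0, incubation_h=18.0)
print(f"Plasma renin activity: {pra:.2f} ng/mL/h")
```

Long incubation times improve sensitivity at low renin activity because the measured difference grows with time, but they also increase the opportunity for degradation of angiotensin I, which is one of the stability concerns noted above.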

Endothelins

Endothelins (ETs) are peptides with 21 amino acids derived from the vascular endothelium (ET-1), intestinal and renal tissue (ET-2), and neural tissue (ET-3). ET-1 is produced from a 203–amino acid precursor (preproendothelin) and smaller, 30– to 40–amino acid “big” ET molecules that are inactive. ET-1 is a potent vasoconstrictor and may mediate pathology associated with diabetic nephropathy and hypertension. Increased circulating concentrations of ET after myocardial infarction suggest a negative survival prognosis. The reliability of these observations using currently available immunoassays and other potential clinical applications for ET measurement remain unclear.

Vasopressin

Vasopressin (arginine vasopressin [AVP]), also known as antidiuretic hormone (ADH), is a nonapeptide stored in and secreted from the posterior pituitary gland (see also Chapter 55 ). Its primary target organ is the distal convoluted tubule and collecting duct, where it acts to promote water reabsorption. ADH circulates at very low concentrations (<40 pmol/L) and has a very short half-life (15 to 20 minutes), making routine diagnostic measurement impractical. Copeptin, a prohormone form of ADH, exhibits a longer half-life and is an attractive alternative target for measurement. Diabetes insipidus (DI) may result from faulty secretion (central DI) or from end-organ resistance (nephrogenic DI). Head injury, tumors, and some medications may also induce pathologic secretion of ADH, resulting in fluid overload referred to as the syndrome of inappropriate antidiuretic hormone secretion (SIADH). A synthetic analog referred to as DDAVP (1-desamino, 8-D-AVP) is used therapeutically to treat DI and some forms of coagulopathy. DDAVP stimulates release of von Willebrand factor from endothelial cells and extends the half-life of circulating factor VIII, thereby mediating improvements of circulating hemostatic factors associated with various forms of von Willebrand disease and hemophilia A.

Glutathione

Glutathione consists of a glutamate residue linked to cysteine via its γ-carboxyl rather than the α-carboxyl group and followed by a conventional peptide bond between cysteine and glycine. This ubiquitous tripeptide is the most abundant intracellular thiol (1 to 10 mmol/L) and circulates in the blood at micromolar concentrations. The cellular ratio of reduced glutathione (GSH) to oxidized glutathione (GSSG) ranges from 10 to 100. Intracellular glutathione performs a variety of important functions. It plays an important role in maintaining the proper ratio of oxidized to reduced forms of metabolically important thiols such as coenzyme A. It also provides reducing equivalents that detoxify reactive oxygen species such as peroxides (catalyzed by glutathione peroxidase). Through the activity of glutathione-S-transferase, glutathione also serves to detoxify other xenobiotic compounds via formation of a thioether derivative, which can then be excreted. Amines and peptides are transported across the plasma membrane via the γ-glutamyl moiety of glutathione, a reaction catalyzed by γ-glutamyl-transpeptidase (see also Chapter 32 ). The tripeptide is then regenerated through the concerted action of enzymes in the so-called γ-glutamyl cycle ( Fig. 31.3 ).

FIGURE 31.3, Transmembrane transport of amino acids (AAs) using the γ-glutamyl cycle. Three-letter AA abbreviations are used. Extracellular AA is transferred to glutathione via activity of membrane-bound γ-glutamyl transpeptidase (1). AA is released in the cytoplasm via the activity of γ-glutamyl cyclotransferase (2), which also results in the formation of pyroglutamate (5-oxoproline). Cysteine and glycine generated via dipeptidase activity (3) are recycled with pyroglutamate to reform glutathione via successive activities of 5-oxoprolinase (4), γ-glutamyl-cysteine synthetase (5), and glutathione synthetase (6). ADP, Adenosine diphosphate; ATP, adenosine triphosphate; CYS, cysteine; GLU, glutamate; GLY, glycine.

Determination of circulating GSH and GSSG is not routinely called for in clinical practice because the site of action is intracellular. Nonetheless, a variety of techniques to measure glutathione have been used. Measurement of total glutathione requires prior reduction of the sample to release all oxidized forms. The simplest techniques employ the colorimetric detection of free thiol using 5,5′-dithio-bis-2-nitrobenzoic acid (DTNB). Reaction of DTNB with thiols results in the formation of 2-nitro-5-thiobenzoate, which absorbs with high extinction (~14,000 L mol⁻¹ cm⁻¹) at 410 nm. Other techniques use derivatization and stabilization of GSH followed by high-performance liquid chromatography or MS. Inborn errors of glutathione metabolism such as glutathione synthetase deficiency are detected via the accumulation of components of the γ-glutamyl cycle such as pyroglutamic acid (5-oxoproline).
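
Colorimetric thiol results of this kind are obtained from absorbance through the Beer–Lambert relationship, A = ε × c × l. The following sketch applies the approximate extinction coefficient quoted above for the DTNB chromophore; the function name, the 1-cm path length, the dilution argument, and the example absorbance are illustrative assumptions.

```python
def thiol_concentration_umol_l(absorbance, epsilon=14000.0, path_cm=1.0, dilution=1.0):
    """Thiol concentration in µmol/L from a blank-corrected absorbance.

    epsilon is in L mol^-1 cm^-1 (approximate value for the DTNB product);
    dilution is the factor by which the specimen was diluted in the reaction.
    """
    molar = absorbance / (epsilon * path_cm)  # mol/L in the cuvette
    return molar * 1e6 * dilution


# Hypothetical reading: blank-corrected absorbance of 0.28 in a 1-cm cuvette.
conc = thiol_concentration_umol_l(0.28)
print(f"Free thiol: {conc:.1f} µmol/L")
```

One mole of chromophore is released per mole of thiol, so the calculation reports total reactive thiol rather than glutathione specifically; other thiols in the specimen contribute to the signal.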

Gastrointestinal peptide hormones

Enteral organs produce a number of peptides that mediate important digestive processes. Assessment of these peptides is indicated in rare digestive disorders. In addition to enteral effects, many of these peptides also impact the central nervous system. A better understanding of their function in the interplay between brain, gut, and microbiome may clarify the pathologic basis of multiple disorders including obesity, anxiety, depression, bowel disease, and various forms of neurodegeneration. These peptides are still largely assessed by immunologic techniques, but improvements to analytic specificity would likely be achieved using MS. Select peptides are considered briefly here but treated in detail in Chapter 52.

Gastrin is produced as a larger precursor (preprogastrin, progastrin) by G cells in the pyloric antrum of the stomach. Precursor forms are cleaved to peptides of multiple lengths including the most potent 17–amino acid version. Gastrin stimulates acid production by the parietal cells of the stomach and is overproduced in Zollinger-Ellison syndrome (gastrinoma) with a worldwide incidence of 0.5 to 3 cases per million population annually.

Cholecystokinin (CCK) is expressed as a 115–amino acid precursor and cleaved into progressively shorter fragments of 39, 33, 22, 12, and 8 amino acids that also possess biological activity (see also Chapter 52). CCK is secreted by the duodenal mucosa and stimulates gall bladder contraction and secretion of pancreatic enzymes.

Secretin is synthesized as a 120–amino acid precursor by the duodenal mucosa. The mature 27–amino acid peptide is released in response to acid and stimulates pancreatic bicarbonate release, which neutralizes the proximal intestinal environment and ensures proper pancreatic enzyme function. Recombinant secretin is used to stimulate and assess pancreatic secretory function.

Vasoactive intestinal peptide (VIP) is a 28–amino acid peptide secreted by enteric neurons throughout the gut. It shares structural similarity with glucagon and secretin. In the gut, VIP acts to increase pancreatic water and electrolyte secretion, gut motility, and hepatic bile flow. VIP also impacts coronary vasodilation and myocardial contractility. In the brain, VIP is an important mediator of circadian rhythms.

Ghrelin is a 28–amino acid peptide that requires covalent modification by octanoic acid for full activity. Ghrelin is produced in the stomach and is best known as a potent stimulus of growth hormone production. Ghrelin itself is metabolized to obestatin, a 24–amino acid peptide that antagonizes the function of ghrelin.

Proteins

The structural diversity of proteins may be described using the following features:

  • 1.

    Primary structure is the linear sequence of amino acids in a peptide or protein. Post-translational modifications of amino acids contribute to increased diversity.

  • 2.

    Secondary structure describes the nature of the peptide backbone dictated by the peptide bond angles described earlier and stabilized by hydrogen bonds. Examples of secondary structure include α-helix, β-sheet, and β-turn. An α-helix has about 3.6 residues per turn and is stabilized by hydrogen bonds between the C=O group of one residue and the N–H group of the fourth following amino acid. A β-sheet involves hydrogen bonds between the peptide bonds of adjacent peptide chains arranged in parallel or antiparallel configurations. Random coils refer to segments of peptide that lack defined secondary structure.

  • 3.

    Tertiary structure refers to the folding of the polypeptide chain and elements of secondary structure into a compact three-dimensional (3D) shape. Folding is a complex process driven by energy minimization of intramolecular and solvent interactions. Hydrophobic groups tend to fold into the interior with less exposure to solvent, while charged and polar sidechains tend to be located on the surface. The 3D structure is stabilized by intramolecular hydrogen bonds, van der Waals forces, and hydrophobic interactions. Disulfide bonds between cysteine residues also stabilize 3D structure. Denaturation of protein refers to unfolding that occurs with temperature change or in the presence of organic solvents, detergents, or reagents that disrupt hydrogen bonds. Limited denaturation can be reversible, but extensive unfolding and denaturation of proteins often lead to irreversible aggregation and precipitation.

  • 4.

    Quaternary structure refers to the incorporation of two or more polypeptide chains or subunits into a larger multimeric unit. Examples range from the relatively simple creatine kinase, a dimer of M and/or B subunits, to branched chain α-ketoacid dehydrogenase, which is a heteromeric complex of 12 E1, 24 E2, and 6 E3 subunits.

  • 5.

    Ligands and prosthetic groups provide additional functional and structural elements, such as metals in metalloenzymes, heme in hemoglobin and cytochromes, and lipids in lipoproteins. Proteins without their associated ligands are often referred to as apoproteins (e.g., apotransferrin without iron, apolipoproteins without lipid).

Physical properties of proteins

The diverse structural features of proteins result in unique physical properties that can be exploited for analysis. For example, tyrosine and tryptophan residues absorb light at 280 nm, and the abundance of these amino acids determines the extinction coefficient of a peptide or protein. A pure protein may therefore be quantitated using its absorbance at 280 nm (A280). Some prosthetic groups such as heme also possess intrinsic absorbance that may be monitored to assess the presence of specific proteins. Automated clinical analyzers assess the presence of hemoglobin at 540 to 570 nm, for example, to detect hemolyzed plasma or serum specimens. Ionizable groups exert a strong effect on physical properties depending on the pH of the surrounding solution. Differing physical properties serve as the basis of methods to separate proteins. Some important characteristics include the following:

  • 1.

    Differential solubility . The solubility of proteins is affected by pH, ionic strength, temperature, and the characteristics of the solvent. Changing solvent pH affects the net charge of a protein. Changing ionic strength affects the hydration and solubility of proteins. “Salting-in” and “salting-out” procedures were early methods for separating and characterizing protein. Albumin, for example, stays in solution at high concentrations of ammonium sulfate that precipitate globulins. Addition of organic solvents and polyethylene glycol is also useful for differential precipitation. Fractional precipitation of plasma with ethanol, using protocols developed by Cohn and coworkers, enables isolation of plasma fractions that are enriched in immunoglobulins, α- and β-globulins, or albumin (fraction V). Polyethylene glycols induce precipitation by steric exclusion and therefore preferentially precipitate large proteins or complexes.

  • 2.

    Molecular size . Separation of small and large molecules is commonly achieved by differential migration through molecular filters. Examples are size exclusion chromatography (also known as gel filtration), ultracentrifugation, and electrophoresis. These techniques may be used under conditions when proteins and peptides are in native globular states or under denaturing conditions. Addition of reducing agents allows separation of disulfide-linked components.

  • 3.

    Molecular mass . Advances in MS allow the determination of masses of peptides and proteins with increasing accuracy. Peptides and proteins can be ionized by MALDI or by ESI (see Chapter 24 ).

  • 4.

    Electrical charge . Ion-exchange chromatography, isoelectric focusing (IEF), and electrophoresis separate peptides and proteins based on charge (see Chapters 18 and 19 ).

  • 5.

    Surface adsorption . The affinity of peptides and proteins for a variety of physical surfaces may also be used as the basis for separation. Reverse-phase chromatography, for example, exploits the interaction of hydrophobic molecular moieties with hydrophobic surfaces (C8 or C18 alkyl chains) when the ratio of water to organic solvent is high but not when organic content is increased (see Chapter 19 ).

  • 6.

    Affinity chromatography . Specific ligands, antibodies, and other recognition molecules have been used to separate peptides or proteins selectively (see Chapter 19 ).
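
The quantitation of a pure protein by absorbance at 280 nm, mentioned at the start of this section, can be sketched as follows. The per-residue coefficients (about 5500 for tryptophan, 1490 for tyrosine, and 125 per disulfide, in L mol⁻¹ cm⁻¹) are commonly used empirical estimates and are an assumption here, as are the function names and the toy sequence, which does not represent a real protein.

```python
def extinction_280(sequence, n_disulfides=0):
    """Estimated molar extinction coefficient at 280 nm (L mol^-1 cm^-1)."""
    return (5500 * sequence.count("W")
            + 1490 * sequence.count("Y")
            + 125 * n_disulfides)


def molar_concentration(a280, sequence, n_disulfides=0, path_cm=1.0):
    """Concentration in mol/L from the Beer-Lambert law, A = epsilon * c * l."""
    epsilon = extinction_280(sequence, n_disulfides)
    if epsilon == 0:
        raise ValueError("no Trp, Tyr, or disulfide: A280 is not informative")
    return a280 / (epsilon * path_cm)


toy_sequence = "ACDWYKYW"  # hypothetical peptide with 2 Trp and 2 Tyr
epsilon = extinction_280(toy_sequence)
conc = molar_concentration(0.699, toy_sequence)
print(f"epsilon = {epsilon} L/mol/cm, concentration = {conc * 1e6:.1f} µmol/L")
```

The estimate applies only to a pure protein of known sequence. In a mixture such as serum, each protein contributes according to its own tryptophan and tyrosine content, so A280 cannot be converted to a single molar concentration.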

Protein formation

Folding

Proteins are synthesized by ribosomes reading from the 5′-end of mRNA. Triplet codons in mRNA are matched with complementary sequence in transfer RNA carrying specific amino acids. Protein synthesis begins with an AUG codon encoding methionine, and the polypeptide chain is synthesized from the N terminus. During translation, the initiator methionine is typically cleaved and the resulting N-terminal residue commonly acetylated. Although 80 to 90% of proteins carry an N-terminal acetyl group, the function of this modification is not entirely clear, but it may play a role in stabilizing the growing peptide chain.
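
The reading of triplet codons from the initiator AUG toward the 3′ end can be sketched in a few lines. The example uses the standard genetic code; the function name and the short mRNA string are hypothetical, and the sketch ignores untranslated-region context, initiator methionine removal, and acetylation.

```python
BASES = "UCAG"
# Standard genetic code, ordered by first, second, then third base (U, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first stop codon (one-letter codes)."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


peptide = translate("GGAUGGCUUGGUAAGC")
print(peptide)
```

The first residue of the output is always methionine, and the sequence is written from the N terminus to the C terminus, matching the direction of synthesis.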

Instructions for folding are largely contained in the primary amino acid sequence of the growing polypeptide chain. The rate of elongation (typically 5 to 10 amino acids per second in eukaryotes) may have a significant impact on folding. The use of rare codons, secondary structural elements of the mRNA, and polybasic stretches may dictate pauses in translation and enhance formation of secondary structural elements. Folding begins as the chain is elongated and still associated with the ribosome, assisted by a family of proteins referred to as chaperones. The function of chaperones was originally ascribed to a group of proteins called “heat shock proteins” that prevented protein denaturation and aggregation in response to heat and other extreme environmental conditions.

Many gene products that share common 3D features have arisen from common ancestral genes. The serpin (serine proteinase inhibitor) superfamily consists of more than 1000 related proteins in different organisms. Humans have 36 serpins, 29 of which are protease inhibitors and 7 of which lack protease inhibitor function. Serpins that act as protease inhibitors in plasma include α1-antitrypsin (AAT), α1-antichymotrypsin, α2-antiplasmin, antithrombin, C1 inhibitor, heparin cofactor II, protein C inhibitor, and plasminogen activator inhibitor-1 (PAI-1). Serpins without known protease inhibitor function are cortisol-binding globulin, thyroxine-binding globulin, angiotensinogen, intracellular proteins, heat shock protein 47, and the tumor suppressor maspin. Serpins illustrate how a common structure motif may be adapted to multiple functions. Other examples of plasma protein families are the albumin and lipocalin families. The albumin family includes albumin, α-fetoprotein, and afamin. The lipocalin family includes several plasma proteins such as α1-acid glycoprotein (AAG), retinol-binding protein, apolipoprotein D, α1-microglobulin, prostaglandin D synthase (β-trace), β-lactoglobulin, neutrophil gelatinase-associated lipocalin (NGAL), inter-α-trypsin inhibitor, and C8 γ-chain. Lipocalins generally have a barrel-shaped structure that is well suited to serve as a carrier for small molecules.

Protein folding is an error-prone process, and many molecular chaperones work to refold, prevent aggregation, or degrade misfolded proteins. Several heat shock proteins that increase in response to a variety of stresses are molecular chaperones. Increased accumulation of misfolded proteins induces an adaptive mechanism—the unfolded protein response. This response increases production of chaperones and slows general protein synthesis to allow more time to fold new proteins.

Despite these protective mechanisms, several families of age-related, genetic, and infectious diseases appear connected to disorders of protein folding and protein aggregation. Prion diseases are infectious diseases in which the transmissible protein agent may catalyze misfolding of endogenous proteins. In Alzheimer disease, deposits of amyloid may contribute to pathogenesis. Polyglutamine diseases result from genetic expansion of repeat units encoding glutamine and are associated with Huntington disease and other neurodegenerative disorders. These expanded polyglutamine sequences tend to aggregate as β-sheets. TDP-43 proteinopathies include amyotrophic lateral sclerosis (Lou Gehrig disease), resulting from aggregation of transactive response DNA-binding protein 43 (TDP-43). Several inherited disorders related to mutations in specific proteins probably result from problems in protein folding. In AAT deficiency, hepatic injury results from aggregation and accumulation of misfolded protein. The most common cause of cystic fibrosis is a single amino acid deletion (ΔF508), which results in rapid degradation of the cystic fibrosis transmembrane conductance regulator (CFTR). Accumulation of misfolded proteins has been suggested as a pathogenic mechanism contributing to vascular, cardiac, and β-cell failure in diabetes. Small molecule therapeutics capable of modulating protein folding have shown some promise in mitigating disease caused by abnormal protein aggregation.

Targeting

As originally outlined by Lingappa and Blobel, proteins that are secreted, located in vesicular compartments, or oriented on the external surface of cell membranes usually contain a hydrophobic N-terminal signal peptide about 15 to 30 amino acids in length. Signal peptides interact with signal recognition particles (SRPs) and mediate interaction with the endoplasmic reticulum (ER). Nascent peptide chains are inserted through the membrane of the ER as the protein is synthesized. Signal peptides of most secretory proteins are removed even before synthesis of the entire protein chain is completed. Co-translational membrane retention may be achieved via an uncleaved signal sequence, by one or more hydrophobic transmembrane domains, or by lipid modifications such as N-myristoylation.
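
The hydrophobic character of an N-terminal signal peptide can be illustrated with a simple hydropathy scan. The sketch below uses the Kyte–Doolittle hydropathy scale and a sliding window; the window length, function names, and toy sequence are assumptions for illustration, and the approach is far cruder than dedicated signal peptide prediction tools.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average hydropathy of a peptide segment."""
    return sum(HYDROPATHY[residue] for residue in segment) / len(segment)


def most_hydrophobic_window(sequence, window=10):
    """Return (start index, mean hydropathy) of the most hydrophobic window."""
    best_start, best_score = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        score = mean_hydropathy(sequence[start:start + window])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Toy sequence (not a real protein): hydrophobic N terminus, then a polar stretch.
signal_like = "MKALLLLALLAVAGA"
polar_tail = "DEKRSNQDEK"
start, score = most_hydrophobic_window(signal_like + polar_tail)
print(f"Most hydrophobic 10-residue window starts at residue {start + 1}")
```

In this toy example the most hydrophobic window falls within the first 15 residues, the pattern expected for a cleavable signal peptide. A hydrophobic stretch located internally rather than at the N terminus would instead suggest a transmembrane domain.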

Newly synthesized proteins ultimately reside in a number of membranous or soluble compartments, including the nucleus, lysosome, peroxisome, mitochondrion, or plasma membrane. Plasma membrane sorting is further complicated in polarized epithelial cells where proteins may be targeted to basolateral or apical environments. In the so-called “secretory” pathway, proteins are shuttled via membrane-bound vesicles bearing COP II (coat protein II) from the ER through the Golgi apparatus. Intra-Golgi transport and retrograde Golgi to ER transport are mediated by COP I vesicles. Upon fusion of a vesicle with a specific membrane, soluble components are extruded, and lipid-associated components take up residence as stable membrane components. Sorting in polarized epithelia is mediated by association of proteins with unique membrane domains. For example, proteins anchored to the membrane via a glycosylphosphatidylinositol (GPI) anchor tend to cluster in cholesterol- and sphingolipid-rich domains called lipid rafts that are selectively sorted to apical surfaces. Proteins destined for mitochondria contain a unique N-terminal targeting sequence that mediates their interaction and import into the proper submitochondrial location (e.g., outer membrane, inner membrane, matrix, intermembrane space).

Post-translational modifications

Acetylation

Eighty to 90% of eukaryotic cellular proteins are acetylated. Acetyl-CoA is the typical substrate for a variety of acetyl transferase enzymes localized to the nucleus, cytoplasm, and mitochondria. Most acetylation occurs co-translationally on the N-terminal α-amino group of methionine or another exposed N-terminal residue after excision of the initiator methionine. Acetylation targeting the ε-amino group of lysine is post-translational and reversible. Common targets of reversible lysine acetylation include histones, cytoskeletal proteins, mitochondrial proteins, and proteins controlling cell growth, including the tumor suppressor p53. Acetylation therefore contributes to control of gene expression, cell motility and division, metabolism, and oncogenesis.

Fatty acylation

The activity and localization of a variety of proteins may be modulated by covalent attachment of fatty acyl chains. Co-translational attachment of myristate to an N-terminal glycine residue has been previously mentioned as a mechanism for membrane association. The most common acylation of eukaryotic proteins involves thioester linkage of palmitate to membrane-proximal cysteine residues. S-palmitoylation reversibly controls localization to membrane microdomains such as lipid rafts and thus regulates interaction of proteins with signaling and other effector molecules. Examples of palmitoylated proteins include caveolin-1, some members of the SRC protein kinase family, NO synthase (NOS), β-adrenergic receptor, and transferrin receptor. Ghrelin, a potent growth hormone secretagogue, is modified by covalent attachment of an octanoyl moiety at serine 3 of the polypeptide. Only octanoylated ghrelin promotes growth hormone release.

Phosphorylation

Reversible phosphorylation may impact as many as one third of all human cellular proteins. O-phosphorylation occurs at serine, threonine, and tyrosine residues. The human genome encodes approximately 1000 kinases, enzymes responsible for phosphorylation, and about 500 phosphatases responsible for removal of covalent phosphate groups. Detailed treatment of reversible phosphorylation is beyond the scope of this chapter but, in general, serine and threonine phosphorylation acutely modifies enzyme activity (e.g., glycogen phosphorylase) and subcellular localization (e.g., cAMP-dependent protein kinase). Tyrosine phosphorylation, on the other hand, regulates a plethora of signaling pathways (e.g., mitogen-activated protein kinase, Janus kinase pathways), in part by providing docking sites for proteins that propagate a transmembrane signal, such as those of the SRC kinase family (lyn, lck, fyn). Mitochondria contain members of a primitive kinase family that modulate flux through the pyruvate dehydrogenase and branched-chain α-ketoacid dehydrogenase complexes via a unique phosphohistidine intermediate.

Prenylation

Isoprenoid compounds such as farnesyl pyrophosphate (15 carbons) and geranylgeranyl pyrophosphate (20 carbons) are hydrophobic moieties formed from 3-hydroxy-3-methylglutaryl CoA (HMG-CoA) and mevalonate via HMG-CoA reductase. These groups modify more than 300 members of the human proteome via enzymatic attachment to a cysteine residue in a so-called “CaaX” motif, where C is cysteine, aa represents aliphatic amino acids such as glycine or alanine, and X is typically serine, methionine, glutamine, alanine, or threonine. Isoprenylation regulates membrane and molecular association of a number of proteins important for signal transduction (H-Ras, K-Ras), vesicular trafficking (Rab2, Rab3a), cytoskeletal function (RhoA, RhoB), and the integrity of the nuclear membrane (lamin A).
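The CaaX definition given above can be expressed as a short pattern match against the last four residues of a protein. The following is a minimal sketch and not a validated prenylation predictor. It assumes that "aliphatic" means glycine, alanine, valine, leucine, or isoleucine, and it restricts X to the five residues listed in the text. The function name and the toy sequences are hypothetical.

```python
import re

# Assumption: the "aa" positions accept G, A, V, L, or I. The text names only
# glycine and alanine as examples, so this set is an illustrative choice.
ALIPHATIC = "GAVLI"
# X residues as listed in the text: serine, methionine, glutamine, alanine, threonine.
X_RESIDUES = "SMQAT"

CAAX_PATTERN = re.compile(rf"C[{ALIPHATIC}]{{2}}[{X_RESIDUES}]")


def has_caax_motif(sequence: str) -> bool:
    """Return True if the last four residues fit the CaaX pattern above."""
    c_terminus = sequence.strip().upper()[-4:]
    return CAAX_PATTERN.fullmatch(c_terminus) is not None


if __name__ == "__main__":
    # Hypothetical toy sequences used only to exercise the pattern.
    for toy in ("MTEYKLVVVGCVLS", "MTEYKLVVVGCVLK", "CVLSMTEYKLVV"):
        print(toy, has_caax_motif(toy))
```

The match is anchored to the C-terminus because the motif is defined by its position at the end of the chain. A sequence that contains the same four residues internally is not reported. Known prenylated proteins whose motifs fall outside the residue sets assumed here would also be missed.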

Glycosylphosphatidylinositol anchor

The GPI anchor is a glycoglycero-phospholipid construct that mediates membrane attachment for a variety of proteins. The anchor is presynthesized in the ER and then transferred to the target protein via a C-terminal hydrophobic signal sequence. After modification, this hydrophobic sequence is removed, leaving a protein that is uniquely membrane anchored via interdigitation of two fatty acyl chains with a single membrane leaflet. The purpose of the GPI anchor remains unclear, although it has been proposed that such lipid-anchored proteins diffuse in the lateral plane of biologic membranes more rapidly than transmembrane proteins. GPI-anchored proteins also uniquely associate with cholesterol- and glycosphingolipid-rich plasma membrane domains referred to as lipid rafts and caveolae. Notable examples of GPI-anchored proteins include decay accelerating factor (DAF, CD55), membrane inhibitor of reactive lysis (MIRL, CD59), alkaline phosphatase, 5′-nucleotidase, and glypican family members. Defects in the PIG-A gene product that mediates GPI synthesis are responsible for paroxysmal nocturnal hemoglobinuria (PNH). PNH is characterized by abnormal complement-mediated lysis of erythrocytes deficient in CD55 and CD59.
