Molecular Endocrinology, Endocrine Genetics, and Precision Medicine


Introduction

The study of the endocrine system has undergone a dramatic evolution since the 1990s, from the traditional physiologic studies that dominated the field for many years to the discoveries of molecular endocrinology and endocrine genetics. At the present time, the major impact of molecular medicine on the practice of pediatric endocrinology relates to diagnosis and genetic counseling for a variety of inherited endocrine disorders. In contrast, the direct therapeutic application of this new knowledge is still in its infancy. Endocrine oncology has greatly benefited from the application of new drugs that were designed to target specific mutations in, for example, thyroid cancer. A notable recent therapeutic advancement that followed the identification of the molecular basis of an endocrine disorder is the development of the monoclonal antibody burosumab, directed against the fibroblast growth factor-23 protein, to treat X-linked hypophosphatemic rickets. This chapter is an introduction to the basic principles of molecular biology, common laboratory techniques, and some examples of the recent advances made in clinical pediatric endocrinologic disorders, with an emphasis on endocrine genetics. Most new diagnostic testing, pharmacogenetics, and molecular therapies are discussed in the disease-specific chapters of this book; this chapter includes only examples that illustrate the principle or strategy under discussion.

Basic molecular tools

Isolation and Digestion of DNA and Southern Blotting

The human chromosome comprises a long double-stranded helical molecule of deoxyribonucleic acid (DNA) associated with different nuclear proteins. As DNA forms the starting point of the synthesis of all the protein molecules in the body, molecular techniques using DNA have proven to be crucial in the development of diagnostic tools to analyze endocrine diseases. DNA can be isolated from any human tissue, including circulating white blood cells. About 200 μg of DNA can be obtained from 10 to 20 mL of whole blood, with the efficiency of DNA extraction being dependent on the technique used and the method of anticoagulation used. The extracted DNA can be stored almost indefinitely at an appropriate temperature. Furthermore, lymphocytes can be transformed with the Epstein-Barr virus (or other means) to propagate indefinitely in cell culture as “immortal” cell lines, thus providing a renewable source of DNA. For performing molecular genetic studies, transformed lymphoid lines are routinely the tissue of choice, because a renewable source of DNA obviates the need to obtain further blood from the family. Fibroblast-derived cultures can also serve as a permanent source of DNA or ribonucleic acid (RNA) (once transformed), but they have to be derived from surgical specimens or a biopsy. It should be noted that, because the expression of many genes is tissue specific, immortalized lymphoid or fibroblastoid cell lines cannot be used to analyze the abundance or composition of messenger RNA (mRNA) for a specific gene. Hence, studies involving mRNA necessitate the analysis of the tissue(s) expressing the gene, as outlined in the section on “RNA Analysis.” More recently, the problem of limited amounts of DNA obtainable from certain sources has been circumvented by the utilization of the polymerase chain reaction (PCR), a versatile way to faithfully multiply segments of the original DNA.

DNA is present in extremely large molecules; the smallest autosomal chromosome (chromosome 22) has about 50 million base pairs and the entire haploid human genome is estimated to comprise 3 to 4 billion base pairs. This extreme size precludes the analysis of DNA in its native form in routine molecular biology techniques. The techniques for identification and analysis of DNA became feasible and readily accessible with the discovery of enzymes termed restriction endonucleases. These enzymes, originally isolated from bacteria, cut DNA into smaller fragments on the basis of specific recognition sites that vary from two to eight base pairs in length. The term restriction refers to the function of these enzymes in bacteria. A restriction endonuclease destroys foreign DNA (such as bacteriophage DNA) by cleaving the DNA at specific sites, thereby “restricting” the entry of foreign DNA into the bacterium. Several hundred restriction enzymes with different recognition sites are now commercially available. Because the recognition site for a given enzyme is fixed, the number and sizes of fragments generated for a particular DNA molecule remain consistent with the number of recognition sites and provide predictable patterns after separation by electrophoresis.
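The predictability of a restriction digest can be illustrated with a minimal sketch. It assumes a linear DNA molecule written as a plain string and uses the EcoR I recognition sequence GAATTC, which the enzyme cuts after the first G. The function name `digest` and the short test sequence are hypothetical and chosen only for illustration.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site after which the enzyme
    cuts (EcoR I cuts G^AATTC, so the offset is 1).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


# Hypothetical 25-base sequence containing two EcoR I sites.
dna = "TTAGGAATTCCGATCGGAATTCAAT"
fragments = digest(dna, "GAATTC", 1)
print([len(f) for f in fragments])
```

Because the recognition site is fixed, running the same digest on the same molecule always yields the same set of fragment lengths, which is what makes the banding pattern after electrophoresis reproducible.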

The analysis of a few hundred base pairs of DNA in the region of interest is difficult when DNA from all human chromosomes is cut and separated on the same gel. These limitations are circumvented by the technique of Southern blotting (named after its originator, Edward Southern). Southern blotting involves digestion of DNA and separation by electrophoresis through agarose. After electrophoresis, the DNA is transferred to a solid support (such as nitrocellulose or nylon membranes), enabling the pattern of separated DNA fragments to be replicated onto the membrane (Fig. 2.1). The DNA is then denatured (i.e., the two strands are physically separated), fixed to the membrane, and the dried membrane is mixed with a solution containing the DNA probe. A DNA probe is a fragment of DNA that contains a nucleotide sequence specific for the gene or chromosomal region of interest. For purposes of detection, the DNA probe is labeled with an identifiable tag, such as radioactive phosphorus (e.g., ³²P) or a chemiluminescent moiety; the latter has now almost exclusively replaced radioactivity. The process of mixing the DNA probe with the denatured DNA fixed to the membrane is called hybridization, the principle being that there are only four nucleic acid bases in DNA, adenine (A), thymine (T), guanine (G), and cytosine (C), and that they always remain complementary on the two strands of DNA, with A pairing with T and G pairing with C. Following hybridization, the membrane is washed to remove the unbound probe and exposed to an x-ray film, either in a process called radioautography (also referred to as autoradiography) to detect radioactive phosphorus or in a process used to detect the chemiluminescent tag. Only those fragments that are complementary and have bound to the probe containing the DNA of interest will be evident on the x-ray film, enabling the analysis of the size and pattern of these fragments. As routinely performed, the technique of Southern analysis can detect a single copy gene in as little as 5 μg of DNA, the DNA content of about 10⁶ cells.
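The base-pairing rule that underlies hybridization can be sketched in a few lines. This is an illustrative model only: it treats hybridization as an exact match between the probe's reverse complement and a denatured fragment, ignoring stringency and partial matches. The function names and the sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with seq (A with T, G with C), read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_binds(probe: str, fragment: str) -> bool:
    """A single-stranded probe can hybridize to a single-stranded fragment
    if the fragment contains the probe's reverse complement."""
    return reverse_complement(probe) in fragment


# Hypothetical denatured fragments fixed to the membrane.
membrane = ["ATGCCGTA", "GGATTACA", "TTTTCCCC"]
probe = "TGTAATCC"
print([probe_binds(probe, fragment) for fragment in membrane])
```

Only the fragment containing the complementary sequence retains the labeled probe after washing, which corresponds to the single band seen on the film.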

Fig. 2.1, Southern blot. Fragments of double-stranded deoxyribonucleic acid (DNA) are separated by size by agarose gel electrophoresis. To render the DNA single stranded (denatured), the agarose gel is soaked in an acidic solution. After neutralization of the acid, the gel is placed onto filter paper, the ends of which rest in a reservoir of concentrated salt buffer solution. A sheet of nitrocellulose membrane is placed on top of the gel and absorbent paper is stacked on top of the nitrocellulose membrane. The salt solution is drawn up through the gel by the capillary action of the filter paper wick and the absorbent paper towels. As the salt solution moves through the gel, it carries along with it the DNA fragments. Because nitrocellulose binds single-stranded DNA, the DNA fragments are deposited onto the nitrocellulose in the same pattern that they were placed in the agarose gel. The DNA fragments bound to the nitrocellulose are fixed to the membrane by heat or ultraviolet irradiation. The nitrocellulose membrane with the bound DNA can then be used for procedures, such as hybridization to a labeled DNA probe. Techniques to transfer DNA to other bonding matrices, such as nylon, are similar.

Restriction Fragment Length Polymorphism

Restriction fragment length polymorphism (RFLP) is a technique that is currently rarely used but is widely present in endocrine genetic literature, as a number of endocrine genetic discoveries over the last 2 to 3 decades were based on this technique. The number and size of DNA fragments resulting from the digestion of any particular region of DNA form a recognizable pattern. Small variations in a sequence among unrelated individuals may cause a restriction enzyme recognition site to be present or absent; this results in a variation in the number and size pattern of the DNA fragments produced by digestion with that particular enzyme. Thus this region is said to be polymorphic for the particular enzyme tested—that is, an RFLP ( Fig. 2.2 ). The value of RFLP is that it can be used as a molecular tag for tracing the inheritance of the maternal and paternal alleles. Furthermore, the polymorphic region analyzed does not need to encode the genetic variation that is the cause of the disease being studied, but only to be located near the gene of interest. When a particular RFLP pattern can be shown to be associated with a disease, comparing the offspring’s RFLP pattern with the RFLP pattern of the affected or carrier parents can determine the likelihood of an offspring inheriting the disease. The major limitation of the RFLP technique is that its applicability for the analysis of any particular gene is dependent on the prior knowledge of the presence of convenient (“informative”) polymorphic restriction sites that flank the gene of interest by at most a few kilobases. Because these criteria may not be fulfilled in any given case, the applicability of RFLP cannot be guaranteed for the analysis of a given gene.

Fig. 2.2, Restriction fragment length polymorphism (RFLP). A, Schematic illustration. A and B represent two alleles that display a polymorphic site for the restriction enzyme EcoR I. EcoR I will cut deoxyribonucleic acid (DNA) with the sequence “GAATTC”; hence, allele B will be cut by EcoR I at three sites to generate two fragments of DNA, whereas allele A will be cut by EcoR I only twice and not at the site (indicated by horizontal bar) where nucleotide G (underlined) replaces the nucleotide A present in allele B. Following digestion, the DNA is size-fractionated by agarose gel electrophoresis and transferred to a membrane by Southern blot technique (see Fig. 2.1 for details). The membrane is then hybridized with a labeled DNA probe, which contains the entire sequence spanned by the three EcoR I sites. Radioautography of the membrane will detect the size of the DNA fragments generated by the restriction enzyme digestion. In this particular illustration, both parents are heterozygous and possess both A and B alleles. Matching the pattern of the DNA bands of the offspring with that of the parents will establish the inheritance pattern of the alleles. For example, if allele A represents the abnormal allele for an autosomal recessive disease, then examination of the Southern blot will establish that (from left to right) the first offspring (B/B) is homozygous for the normal allele, the second offspring (A/A) is homozygous for the abnormal allele, and the third offspring (A/B) is a carrier. B, RFLP analysis of the DQ-beta gene of the human leukocyte antigen (HLA) locus. Genomic DNA from the members of the indicated pedigree was digested with restriction enzyme Pst I, size-fractionated by agarose gel electrophoresis, and transferred to nitrocellulose membrane by Southern blot technique. 
The membrane was then hybridized with a complementary DNA probe specific for the DQ-beta gene; the excess probe was removed by washing at appropriate stringency and was analyzed by radioautography. The sizes of the DNA fragments (in kilobases, kb) are indicated on the right. The pedigree chart indicates the polymorphic alleles (a, b, c, d) and the bands on the Southern blot corresponding to these alleles (a [5.5 kb], b [5.0 kb], c [14.0 kb], d [4.5 kb]) indicate the inheritance pattern of these alleles.

Polymerase Chain Reaction

PCR is a technique that was developed in the 1980s and revolutionized molecular biology (Fig. 2.3). PCR allows the selective exponential amplification of a desired fragment of DNA from a complex mixture of DNA that theoretically contains at least a single copy of the target fragment. In the typical application of this technique, some knowledge of the DNA sequences in the region to be amplified is necessary, so that a pair of short (approximately 18–25 bases in length) specific oligonucleotides (“primers”) can be synthesized. The primers are synthesized in such a manner that they define the limits of the region to be amplified. The DNA template containing the segment that is to be amplified is heat denatured, such that the strands are separated, and then cooled to allow the primers to anneal to the respective complementary regions. The enzyme Taq polymerase, a heat-stable enzyme originally isolated from the bacterium Thermus aquaticus, is then used to initiate synthesis (extension) of DNA. The DNA is repeatedly denatured, annealed, and extended in successive cycles in a machine called the thermocycler that permits this process to be automated. In the usual assay, these repeated cycles of denaturing, annealing, and extension result in the synthesis of approximately 1 million copies of the target region in about 2 hours. To establish the veracity of the amplification process, the identity of the amplified DNA can be analyzed by electrophoresis, hybridization to RNA or DNA probes, digestion with informative restriction enzyme(s), or direct DNA sequencing. The relative simplicity combined with the power of this technique has resulted in widespread use of this procedure and has spawned a wide variety of variations and modifications that have been developed for specific applications. From a practical point of view, the major drawback of PCR is its susceptibility to cross-contamination of the target DNA. This drawback is the direct result of the extreme sensitivity of the method, which permits amplification from one molecule of the starting DNA template. Thus unintended transfer of amplified sequences to items used in the procedure will amplify DNA in samples that do not contain the target DNA sequence (i.e., a false-positive result). Cross-contamination should be suspected when amplification occurs in negative controls that did not contain the target template. One of the most common modes of cross-contamination is via aerosolization of the amplified DNA during routine laboratory procedures, such as vortexing, pipetting, and manipulation of microcentrifuge tubes. Meticulous care to experimental technique, proper organization of the PCR workplace, and inclusion of appropriate controls are essential for the successful prevention of cross-contamination during PCR experiments.
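The arithmetic behind the exponential amplification can be made explicit with a short sketch. It assumes the idealized model in which each cycle multiplies the number of target copies by (1 + efficiency), with an efficiency of 1.0 corresponding to perfect doubling. The function names and the efficiency parameter are illustrative assumptions; real reactions plateau and rarely double perfectly in every cycle.

```python
import math


def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized yield: each cycle multiplies the copy number by (1 + efficiency)."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(target_copies: float, initial_copies: float = 1.0,
                  efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches target_copies in this model."""
    return math.ceil(math.log(target_copies / initial_copies) / math.log(1.0 + efficiency))


print(pcr_copies(1, 20))         # perfect doubling from one template molecule
print(cycles_needed(1_000_000))  # cycles to reach about 1 million copies
```

Under perfect doubling, a single template molecule reaches roughly 1 million copies after 20 cycles. The same arithmetic explains why a single contaminating molecule is enough to produce a false-positive result.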

Fig. 2.3, Polymerase chain reaction (PCR). A pair of oligonucleotide primers (solid bars), complementary to sequences flanking a particular region of interest (shaded, stippled bars), are used to guide deoxyribonucleic acid (DNA) synthesis in opposite and overlapping directions. Repeated cycles of DNA denaturation, primer annealing, and DNA synthesis (primer extension) by DNA polymerase enzyme result in an exponential increase in the target DNA (i.e., the DNA sequence located between the two primers) such that this DNA segment can be amplified 10⁶ to 10⁷ times after 30 such cycles. The use of a thermostable DNA polymerase (i.e., Taq polymerase) allows for this procedure to be automated. Inset: The amplified DNA can be used for subsequent analysis (i.e., size-fractionation by agarose gel electrophoresis).

In general, PCR applications are either directed toward the identification of a specific DNA sequence in a tissue or body fluid sample or used for the production of relatively large amounts of DNA of a specific sequence, which then are used in further studies. Examples of the first type of application are common in many fields of medicine, such as in microbiology, wherein the PCR technique is used to detect the presence of DNA sequences specific for viruses or bacteria in a biological sample. Examples of such an application in pediatric endocrinology include the use of PCR of the SRY gene for detecting Y chromosome material in patients with karyotypically defined Turner syndrome and the rapid identification of chromosomal gender in cases of fetal or neonatal sexual ambiguity ( Fig. 2.4 ).

Fig. 2.4, Detection of SRY gene–specific sequence in Turner syndrome by polymerase chain reaction (PCR) amplification and Southern blot. SRY-specific primers were used in PCR to amplify deoxyribonucleic acid (DNA) from patients with 45,X karyotype. The amplified DNA was size-fractionated by agarose gel electrophoresis and transferred to a membrane by Southern blotting. The membrane was then hybridized to labeled SRY-specific DNA and autoradiographed. From left to right: amplified male DNA (lane 1); amplified DNA from patients with 45,X karyotype (lanes 2–5); amplified female DNA (lane 6); negative control with no DNA (lane 7); serial dilution of male DNA (lanes 8–13).

Most PCR applications, both as research tools and for clinical use, are directed toward the production of a target DNA or the complementary DNA of a target RNA sequence. The DNA that is made (“amplified”) is then analyzed by other techniques, such as DNA sequencing.

RNA Analysis

The majority (> 95%) of the chromosomal DNA represents noncoding sequences. These sequences harbor regulatory elements, serve as sites for alternate splicing, and are subject to methylation and other epigenetic changes that affect gene function. However, at present most disease-associated mutations in human genes have been identified in coding sequences. An alternate strategy to analyze mutations in a given gene is to study its mRNA, which is the product (via transcription) of the remaining 5% of chromosomal DNA that encodes proteins. In addition, because the mRNA repertoire is cell and tissue specific, the analyses of the mRNA sequences provide unique information about tissue-specific proteins produced in a particular organ or tissue.

There are many techniques for analyzing mRNA. The oldest, widely used in the past although now rarely used, is Northern blotting (so named because it is based on the same principle as the Southern blot). In Northern blotting, RNA is denatured by treating it with an agent, such as formaldehyde, to ensure that the RNA remains unfolded and in the linear form. The denatured RNA is then electrophoresed and transferred onto a solid support (such as nitrocellulose membrane) in a manner similar to that described for the Southern blot. The membrane with the RNA molecules separated by size is probed with the gene-specific DNA probe labeled with an identifiable tag that, as in the case of Southern blotting, is either a radioactive label (e.g., ³²P) or, more commonly, a chemiluminescent moiety. The nucleotide sequence of the DNA probe is complementary to the mRNA sequence of the gene and is hence called complementary DNA (cDNA). It is customary to use labeled cDNA (and not labeled mRNA) to probe Northern blots because DNA molecules are much more stable and easier to manipulate and propagate (usually in bacterial plasmids) than mRNA molecules. The Northern blot provides information regarding the amount (estimated by the intensity of the signal on radioautography) and the size (estimated by the position of the signal on the gel in comparison to concurrently electrophoresed standards) of the specific mRNA. Although the Northern blot technique represents a versatile and straightforward method to analyze mRNA, its limited sensitivity and lengthy protocol are major drawbacks, and it has now been supplanted by more sensitive and less time-consuming techniques that are discussed later.

One of the most sensitive methods for the detection and quantitation of mRNA currently available is the technique of quantitative reverse transcriptase (RT)-PCR (qRT-PCR). This technique combines the unique function of the enzyme reverse transcriptase with the power of PCR. qRT-PCR is exquisitely sensitive, permitting analysis of gene expression from very small amounts of RNA. Furthermore, this technique can be applied to a large number of samples or many genes (multiplex) in the same experiment. These two critical features endow this technique with a measure of flexibility unavailable in more traditional methods, such as Northern blot or solution hybridization analysis. The first step in qRT-PCR analysis is the production of DNA complementary (cDNA) to the mRNA of interest. This is done by using enzymes with RNA-dependent DNA polymerase activity that belong to the RT group of enzymes (e.g., Moloney murine leukemia virus [MMLV] or avian myeloblastosis virus [AMV] reverse transcriptase). The RT enzyme, in the presence of an appropriate primer, will synthesize DNA complementary to RNA. The second step in the qRT-PCR analysis is the amplification of the target DNA, in this case the cDNA synthesized by the RT enzyme. The specificity of the amplification is determined by the specificity of the primer pair used for the PCR amplification. To establish the veracity of the amplification process, the identity of the amplified DNA can be analyzed by electrophoresis, hybridization to RNA or DNA probes, digestion with informative restriction enzyme(s), or direct DNA sequencing.

Whereas the detection of a specific mRNA by this technique is relatively straightforward, the precise quantitation of the mRNA in a given sample is more complicated. Because the production of DNA by PCR involves an exponential increase in the amount of DNA synthesized, relatively minor differences in any of the variables controlling the rate of amplification will cause a marked difference in the yield of the amplified DNA. In addition to the amount of template DNA, the variables that can affect the yield of the PCR include the concentration of the polymerase enzyme, magnesium, nucleotides (dNTPs), and primers. The specifics of the amplification procedure, including cycle length, cycle number, and the annealing, extension, and denaturing temperatures, also affect the yield of DNA. Because of the multitude of variables involved, routine RT-PCR is unsuitable for performing a quantitative analysis of mRNA. To circumvent these pitfalls, alternate strategies have been developed. One technique for determining the concentration of a particular mRNA in a biological sample is a modification of the basic PCR technique called competitive RT-PCR. This method is based on the coamplification of a mutant DNA that can be amplified with the same pair of primers being used for the target DNA. The mutant DNA is engineered in such a way that it can be distinguished from the DNA of interest by either size or the inclusion of a restriction enzyme site unique to the mutant DNA. The addition of equivalent amounts of this mutant DNA to all the PCR reaction tubes serves as an internal control for the efficiency of the PCR process, and the yield of the mutant DNA in the various tubes can be used for the equalization of the yield of the DNA by PCR. For accurate quantitation of the DNA of interest, the concentrations of the mutant and target templates should be nearly equivalent. Because the use of mutated DNA for normalization does not account for the variability in the efficiency of the RT enzyme, a variation of the original method has been developed. In this modification, competitive mutated RNA transcribed from a suitably engineered RNA expression vector is substituted for the mutant DNA in the reaction before initiating the synthesis of the cDNA. Competitive RT-PCR can be used to detect changes of the order of two- to threefold of even very rare mRNA species. The major drawback of this method is its susceptibility to inaccurate results because of the contamination of samples with the mRNA of interest. In theory, as the technique is based on PCR, contamination by even one molecule of the mRNA of interest can invalidate the results. Hence, scrupulous attention to laboratory technique and setup is essential for the successful application of this technique.
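The logic of using a coamplified competitor as an internal control can be sketched numerically. The example assumes the idealized case in which target and competitor amplify with the same efficiency within a tube, so that the ratio of their yields equals the ratio of their starting amounts. All copy numbers, efficiencies, and function names are hypothetical.

```python
import math


def amplify(copies: float, cycles: int, efficiency: float) -> float:
    """Idealized PCR yield with a per-cycle efficiency between 0 and 1."""
    return copies * (1.0 + efficiency) ** cycles


def estimate_target(competitor_input: float, target_yield: float,
                    competitor_yield: float) -> float:
    """Infer the starting target amount from the yield ratio, assuming the
    target and the competitor were amplified with equal efficiency."""
    return competitor_input * target_yield / competitor_yield


true_target = 5000.0       # unknown in practice; fixed here for the simulation
competitor_input = 4000.0  # known amount of mutant template added to each tube

# Two tubes whose amplification efficiency differs slightly.
for efficiency in (0.80, 0.95):
    target_yield = amplify(true_target, 30, efficiency)
    competitor_yield = amplify(competitor_input, 30, efficiency)
    estimate = estimate_target(competitor_input, target_yield, competitor_yield)
    print(efficiency, f"{target_yield:.3e}", round(estimate))
```

The absolute yields in the two tubes differ by roughly an order of magnitude, yet the estimate of the starting target amount is the same in both, which is the rationale for normalizing to the competitor.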

In general, two types of methods are used for the detection and quantitation of PCR products: the “end-point” measurements of products and the newer “real-time” techniques. End-point determinations (e.g., the competitive RT-PCR technique described earlier) analyze the reaction after it is completed, whereas real-time determinations are made during the progression of the amplification process. In general, the real-time approach is more accurate and is currently the preferred method. Advances in fluorescence detection technologies have made the use of real-time measurement possible for routine use in the laboratory. One of the popular techniques that takes advantage of real-time measurements is the TaqMan (fluorescent 5′ nuclease) assay (Fig. 2.5). The unique design of TaqMan probes, combined with the 5′ nuclease activity of the PCR enzyme (Taq polymerase), allows direct detection of PCR product by the release of a fluorescent reporter during the PCR amplification by using specially designed machines (ABI Prism 5700/7700). The TaqMan probe consists of an oligonucleotide synthesized with a 5′-reporter dye (e.g., FAM; 6-carboxy-fluorescein) and a downstream, 3′-quencher dye (e.g., TAMRA; 6-carboxy-tetramethyl-rhodamine). When the probe is intact, the proximity of the reporter dye to the quencher dye results in suppression of the reporter fluorescence, primarily by Förster-type energy transfer. During PCR, forward and reverse primers hybridize to a specific sequence of the target DNA. The TaqMan probe hybridizes to a target sequence within the PCR product. The Taq polymerase enzyme, because of its 5′-3′ nuclease activity, subsequently cleaves the TaqMan probe. The reporter dye and the quencher dye are separated by cleavage, resulting in increased laser-stimulated fluorescence of the reporter dye as a direct consequence of target amplification during PCR. This process occurs in every cycle and does not interfere with the exponential accumulation of product. Both primer and probe must hybridize to the target for amplification and cleavage to occur. The fluorescence signal is generated only if the target sequence for the probe is amplified during PCR. Because of these stringent requirements, nonspecific amplification is not detected. Fluorescent detection takes place through fiber optic lines positioned above optically nondistorting tube caps. Quantitative data are derived from a determination of the cycle at which the amplification product signal crosses a preset detection threshold. This cycle number is inversely related to the logarithm of the amount of starting material (the more template present, the earlier the threshold is crossed), thus allowing for a measurement of the level of specific mRNA in the sample. An alternate machine (LightCycler) also uses fluorogenic hydrolysis or fluorogenic hybridization probes for quantification in a manner similar to the ABI system.
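Quantitation from the threshold cycle is usually done against a standard curve, and a minimal sketch of that calculation follows. It relies on the fact that the threshold cycle falls linearly as the logarithm of the starting amount rises, so samples with more template cross the threshold earlier. The standards, threshold-cycle values, and function names are hypothetical, and the straight-line fit is an ordinary least-squares fit written out by hand.

```python
import math


def fit_standard_curve(standards: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares fit of threshold cycle against log10(starting copies).

    standards is a list of (starting_copies, threshold_cycle) pairs.
    Returns (slope, intercept) of the fitted line.
    """
    xs = [math.log10(copies) for copies, _ in standards]
    ys = [ct for _, ct in standards]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical tenfold dilution series of a known template.
standards = [(1e2, 33.2), (1e3, 29.8), (1e4, 26.4), (1e5, 23.0)]
slope, intercept = fit_standard_curve(standards)
efficiency = 10 ** (-1.0 / slope) - 1.0  # per-cycle efficiency implied by the slope
print(round(slope, 2), round(intercept, 1), round(efficiency, 2))
print(round(copies_from_ct(28.1, slope, intercept)))
```

The negative slope reflects the inverse relationship: each tenfold increase in starting template lowers the threshold cycle by a little more than three cycles in this hypothetical data set.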

Fig. 2.5, Fluorescent 5′ nuclease (TaqMan) assay. Three synthetic oligonucleotides are used in a fluorescent 5′ nuclease assay. Two oligonucleotides function as “forward” and “reverse” primers in a conventional polymerase chain reaction (PCR) amplification protocol. The third oligonucleotide, termed the TaqMan probe, consists of an oligonucleotide synthesized with a 5′-reporter dye (e.g., FAM; 6-carboxy-fluorescein) and a downstream, 3′-quencher dye (e.g., TAMRA; 6-carboxy-tetramethyl-rhodamine). When the probe is intact, the proximity of the reporter dye to the quencher dye results in suppression of the reporter fluorescence, primarily by Förster-type energy transfer. During PCR, forward and reverse primers hybridize to a specific sequence of the target deoxyribonucleic acid (DNA). The TaqMan probe hybridizes to a target sequence within the PCR product. The Taq polymerase enzyme, because of its 5′-3′ exonuclease activity, subsequently cleaves the TaqMan probe. The reporter dye and the quencher dye are separated by cleavage, resulting in increased fluorescence of the reporter dye as a direct consequence of target amplification during PCR. Both primers and probe must hybridize to the target for amplification and cleavage to occur. Hence the fluorescence signal is generated only if the target sequence for the probe is amplified during PCR. Fluorescent detection takes place through fiber optic lines positioned above the caps of the reaction wells. Inset: The two distinct functions of the enzyme Taq polymerase: the 5′-3′ synthetic polymerase activity and the 5′-3′ polymerase-dependent exonuclease activity.

MicroRNA

One of the significant advances in the early 2000s in the field of RNA biology is the discovery of small (20–30 nucleotide) noncoding RNAs. In general, there are two categories of small noncoding RNAs: microRNA (miRNA) and small interfering RNA (siRNA). miRNAs are expressed products of an organism’s own genome, whereas siRNAs are synthesized in the cells from foreign double-stranded RNA (e.g., from viruses or transposons or from synthetic DNA introduced into the cell to study the function of a particular gene or process). In addition, there are differences in the biogenesis of these two classes of small nucleotide RNAs. These differences notwithstanding, the overall biological effect of these small nucleotide RNAs is translational repression or target degradation and gene silencing by binding to complementary sequences on the 3′ untranslated region of target mRNA; positive regulation of gene expression via such a mechanism is distinctly uncommon. The complexity of the phenomenon is increased by the fact that in a cell- or tissue-specific context, a single miRNA can target multiple mRNAs and more than one miRNA can recognize the same mRNA target to amplify and strengthen the translational repression of the target gene. It is estimated that this phenomenon is present in several cell types and that the human genome codes for more than 1000 miRNAs that could target 60% to 70% of mammalian genes. miRNA-mediated events have been implicated in the regulation of cell growth and differentiation, apoptosis, and other cellular processes. To date, the major impact of the discovery of miRNA has been in the fields of developmental biology, organogenesis, and cancer. miRNA and miRNA-related events (e.g., proteins involved in miRNA processing) have been directly implicated in only a small number of nonneoplastic endocrine disorders (e.g., DiGeorge syndrome and X-linked mental retardation). It is predicted that as we learn more about the basic biology of this process, small nucleotide noncoding RNAs will be implicated in the pathogenesis of a wider spectrum of endocrine diseases.
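The idea of an miRNA recognizing complementary sequences in a 3′ untranslated region can be sketched as a simple string search. The sketch makes an assumption that goes beyond the text above: it uses the commonly described "seed" region (nucleotides 2 to 8 of the miRNA) and requires a perfect match, which is a deliberate simplification of real target recognition. The miRNA and untranslated-region sequences and the function names are illustrative only.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_site(mirna: str) -> str:
    """Sequence on the mRNA (5' to 3') that pairs with miRNA nucleotides 2-8.

    Assumption: only the seed region is considered and pairing is perfect.
    """
    seed = mirna[1:8]
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seed))


def find_target_sites(mirna: str, utr: str) -> list[int]:
    """Zero-based positions in the 3' UTR where the seed-match site occurs."""
    site = seed_match_site(mirna)
    positions = []
    pos = utr.find(site)
    while pos != -1:
        positions.append(pos)
        pos = utr.find(site, pos + 1)
    return positions


mirna = "UGAGGUAGUAGGUUGUAUAGUU"         # illustrative 22-nucleotide miRNA
three_prime_utr = "AAGCUACCUCAUUGCUACCUCGG"  # hypothetical 3' UTR fragment
print(seed_match_site(mirna), find_target_sites(mirna, three_prime_utr))
```

Because the matching element is only a few nucleotides long, the same site can occur in many different mRNAs, which is consistent with a single miRNA having multiple targets.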

Detection of mutations in genes

Changes in the structural organization of a gene that impact its function involve deletions, insertions, or transpositions of relatively large stretches of DNA, or more frequently single-base substitutions in functionally critical regions. High throughput or next-generation sequencing (NGS) has revolutionized the identification of mutations in genes.

Direct Methods

DNA sequencing is the current gold standard for obtaining unequivocal proof of a point mutation. However, DNA sequencing has its limitations and drawbacks. A clinically relevant problem is that current DNA sequencing methods do not reliably and consistently detect all mutations. For example, in many cases where the mutation affects only one allele (heterozygous), the heights of the peaks of the bases on the fluorescent readout corresponding to the wild-type and mutant allele are not always present in the predicted (1:1) ratio. This limits the discerning power of “base calling” computer protocols and results in inconsistent or erroneous assignment of DNA sequence to individual alleles. Because of this limitation, clinical laboratories routinely determine the DNA sequence of both strands to provide independent confirmation of the absence or presence of a putative mutation. DNA sequencing can be labor intensive and expensive, although advances such as pyrosequencing (discussed later) have made it technically easier and cheaper.
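The difficulty that unequal peak heights pose for base calling can be illustrated with a toy caller. This is not the algorithm used by any particular sequencing software; it is a hypothetical rule that reports a heterozygous position only when the second-highest peak reaches a fixed fraction of the highest. The threshold value, peak heights, and function name are assumptions made for illustration.

```python
def call_base(peaks: dict[str, float], min_secondary_ratio: float = 0.3) -> str:
    """Call one position from four fluorescence peak heights.

    Returns a single base, or two bases joined by '/' when the secondary
    peak is at least min_secondary_ratio of the primary peak.
    """
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (primary, primary_height), (secondary, secondary_height) = ranked[0], ranked[1]
    if primary_height > 0 and secondary_height / primary_height >= min_secondary_ratio:
        return "/".join(sorted([primary, secondary]))
    return primary


# Heterozygous position with peaks close to the predicted 1:1 ratio.
print(call_base({"A": 1000, "G": 950, "C": 20, "T": 10}))
# Same heterozygous position, but the mutant peak is disproportionately low.
print(call_base({"A": 1000, "G": 250, "C": 20, "T": 10}))
```

In the second case the toy caller reports only the wild-type base, showing how a skewed peak ratio can cause a heterozygous mutation to be missed and why independent confirmation is sought.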

Although the first DNA sequences were determined with a method that chemically cleaved the DNA at each of the four nucleotides, the enzymatic or dideoxy method developed by Sanger and colleagues in 1977 became the most commonly used for routine purposes ( Fig. 2.6 ). This method uses the enzyme DNA polymerase to synthesize a complementary copy of the single-stranded DNA (“template”) whose sequence is being determined. Single-stranded DNA can be obtained directly from viral or plasmid vectors that support the generation of single-stranded DNA or by partial denaturing of double-stranded DNA by treatment with alkali or heat. The enzyme DNA polymerase cannot initiate synthesis of a DNA chain de novo but can only extend a fragment of DNA. Hence the second requirement for the dideoxy method of sequencing is the presence of a “primer.” A primer is a synthetic oligonucleotide, 15 to 30 bases long, whose sequence is complementary to the sequence of the short corresponding segment of the single-stranded DNA template. The dideoxy method exploits the observation that DNA polymerase can use both dNTP and 2′,3′-dideoxynucleoside triphosphates (ddNTPs) as substrates during elongation of the primer. Whereas DNA polymerase can use dNTP for continued synthesis of the complementary strand of DNA, the chain cannot elongate any further after addition of the first ddNTP, because ddNTPs lack the crucial 3′-hydroxyl group. To identify the nucleotide at the end of the chain, four reactions are carried out for each sequence analysis, with only one of the four possible ddNTPs included in any one reaction. The ratio of the ddNTP and dNTP in each reaction is adjusted so that these chain terminations occur at each of the positions in the template where the nucleotide occurs. 
To enable detection, the newly synthesized DNA is labeled, either by including radioactively labeled dATP in the reaction mixture (for the older manual methods, which are read by radioautography) or, most commonly at present, by using fluorescent dye terminators (in automated techniques). The newly synthesized DNA strands are separated by high-resolution denaturing polyacrylamide gel electrophoresis in manual methods or by capillary electrophoresis in automated sequencers. Fluorescent detection methods have enabled automation and enhanced throughput. In capillary electrophoresis, DNA molecules are driven to migrate through a viscous polymer by a high electric field to be separated on the basis of charge and size. Although this technique is based on the same principle as that used in slab gel electrophoresis, the separation is done in individual glass capillaries rather than gel slabs, facilitating loading of samples and other aspects of automation. Whereas manual methods allow the detection of about 300 nucleotides of sequence information with one set of sequencing reactions, automated methods using fluorescent dyes and laser technology can analyze 7500 or more bases per reaction. To sequence larger stretches of DNA, it is necessary to divide the large piece of DNA into smaller fragments that can be individually sequenced. Alternatively, additional sequencing primers can be chosen near the end of the previous sequencing results, allowing the initiation point of new sequence data to be moved progressively along the larger DNA fragment.
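The logic of the four-reaction dideoxy method can be made concrete with a small Python sketch. It is an idealized model that assumes termination occurs exactly once at every possible position and ignores enzyme kinetics; the function names and the 12-base template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_lengths(template_3to5, primer_len):
    """Return, for each ddNTP reaction, the lengths of the terminated chains.

    The template is written 3'->5' so that the new strand, which grows
    5'->3', can be read off in the same index order.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    lanes = {"A": [], "C": [], "G": [], "T": []}
    # Positions covered by the primer are not sequenced.
    for position in range(primer_len, len(new_strand)):
        # A chain ending here appears only in the lane with the matching ddNTP.
        lanes[new_strand[position]].append(position + 1)
    return lanes


def read_gel(lanes):
    """Read the sequence from the shortest to the longest fragment."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = termination_lengths("TACGGCATTAGC", primer_len=4)
print(lanes)            # {'A': [8, 9], 'C': [5, 11], 'G': [6, 12], 'T': [7, 10]}
print(read_gel(lanes))  # CGTAATCG
```

Reading the bands in order of increasing fragment length across the four lanes recovers the sequence of the newly synthesized strand beyond the primer.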

Fig. 2.6, Deoxyribonucleic acid (DNA) sequencing by the dideoxy (Sanger) method. A 5’-end–labeled oligonucleotide primer with sequence complementarity to the DNA that is to be sequenced (DNA template) is annealed to a single-strand of the template DNA. This primer is elongated by DNA synthesis initiated by the addition of the enzyme DNA polymerase in the presence of the four 2’-deoxynucleoside triphosphates (dNTPs) and one of the 2’,3’-dideoxynucleoside triphosphates (ddNTPs); four such reaction tubes are assembled to use all the four ddNTPs. The DNA polymerase enzyme will elongate the primer using the dNTPs and the individual ddNTP present in that particular tube. Because ddNTPs are devoid of the 3’ hydroxyl group, no elongation of the chain is possible when such a residue is added to the chain. Thus each reaction tube will contain prematurely terminated chains ending at the occurrence of the particular ddNTP present in the reaction tube. The concentrations of the dNTPs and the individual ddNTP present in the reaction tubes are adjusted so that the chain termination takes place at every occurrence of the ddNTP. Following the chain elongation-termination reaction, the DNA strands synthesized are size-separated by acrylamide gel electrophoresis and the bands visualized by radioautography.

One of the seminal technological advances has been the introduction of microarray-based methods for the detection and analysis of nucleic acids. Microarrays contain thousands of oligonucleotides deposited or synthesized in situ on a solid support, typically a coated glass slide or a membrane. In this technique, a robotic device is used to print DNA sequences onto the solid support. The DNA probes immobilized on the microarray slide as spots can be cloned cDNA or gene fragments (expressed sequence tags [ESTs]), or oligonucleotides corresponding to known genes or putative open reading frames. The arrays are hybridized with fluorescent targets prepared from RNA extracted from the tissue/cells of interest; the RNA is labeled with fluorescent tags, such as Cy3 and Cy5. The prototypic microarray experimental paradigm consists of comparing mRNA abundance in two different samples. One fluorescent target is prepared from control mRNA, and the second target, with a different fluorescent label, is prepared from mRNA isolated from the treated cells or tissue under investigation. Both targets are mixed and hybridized to the microarray slide, resulting in target gene sequences hybridizing to their complementary sequences on the microarray slide. The microarray is then excited by laser, and the fluorescent intensity of each spot is determined; the relative intensities of the two colored signals on individual spots are proportional to the amounts of specific mRNA transcripts in each sample ( Fig. 2.7 ). Analysis of the fluorescent intensity data yields an estimate of the relative expression levels of the genes in the test sample and the control sample. Microarrays enable individual investigators to perform large-scale analyses of model organisms and to customize arrays for special genome applications.
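The comparison of two fluorescent channels is commonly summarized as a log ratio per spot. The following Python sketch shows one simple way to do this; the function names, the pseudocount, the twofold cutoff, and the intensity values are assumptions made for illustration, and background subtraction and normalization, which real analyses require, are omitted.

```python
import math


def log2_expression_ratios(treated, control, pseudocount=1.0):
    """Return log2(treated / control) intensity for each spot in both channels."""
    ratios = {}
    for spot, treated_signal in treated.items():
        if spot not in control:
            continue  # spot was not measured in both channels
        ratios[spot] = math.log2(
            (treated_signal + pseudocount) / (control[spot] + pseudocount)
        )
    return ratios


def classify(ratios, threshold=1.0):
    """Label each spot using a hypothetical twofold (log2 = 1) cutoff."""
    labels = {}
    for spot, ratio in ratios.items():
        if ratio >= threshold:
            labels[spot] = "up"
        elif ratio <= -threshold:
            labels[spot] = "down"
        else:
            labels[spot] = "unchanged"
    return labels


# Made-up intensities for three spots.
treated = {"spot_1": 799.0, "spot_2": 99.0, "spot_3": 499.0}
control = {"spot_1": 199.0, "spot_2": 399.0, "spot_3": 499.0}
ratios = log2_expression_ratios(treated, control)
print(ratios)            # {'spot_1': 2.0, 'spot_2': -2.0, 'spot_3': 0.0}
print(classify(ratios))  # {'spot_1': 'up', 'spot_2': 'down', 'spot_3': 'unchanged'}
```

A log scale is used so that a fourfold increase and a fourfold decrease are symmetric around zero, corresponding to spots dominated by one or the other dye.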

Fig. 2.7, A, Complementary deoxyribonucleic acid (cDNA) microarray. Fluorescently labeled cDNA targets from adrenocorticotropic hormone (ACTH)-independent bilateral macronodular adrenal hyperplasia (Cy3) and ACTH-dependent hyperplasia (Cy5) were hybridized to glass slides containing genes involved in oncogenesis. Following laser activation of the fluorescent tags, fluorescent signals from each of the DNA “spots” are captured and subjected to analysis. B, Magnified view of the microarray platform displaying the fluorescent signals: green (Cy3) and red (Cy5), with yellow representing overlap of these two colors.

The method of choice for global expression profiling depends on several factors, including technical aspects, labor, price, time, and effort involved, and, most important, the type of information that is sought. Technical advances in the development of expression arrays, their abundance and commercial availability, and the relative speed with which analysis can be done are all factors that make arrays more useful in routine applications. In addition, array content can now be readily customized to cover anything from gene clusters and pathways of interest to the entire genome: some studies examine series of tissue-specific transcripts or genes known to be involved in a particular pathology; others directly use arrays covering the whole genome. Another factor that needs to be considered before embarking on any high-throughput approach is whether individual or pooled samples will be investigated. Using series of pooled samples reduces the price, the time spent, and the number of experiments to the most affordable level. Investigating individual samples, however, is important for identifying unique expression ratios in a given type of tissue or cell. Microarray-based techniques have limitations; for example, like direct DNA sequencing methods, they cannot reliably and consistently detect heterozygous mutations. Furthermore, microarrays cannot be used to detect insertions of multiple nucleotides without exponentially increasing the number of oligonucleotides that must be immobilized on the glass slides.

A more reliable technique in mutation identification is pyrosequencing, which is based on an enzymatic real-time monitoring of DNA synthesis by bioluminescence ( Fig. 2.8 ). Pyrosequencing is performed by the addition of dNTPs individually, in a predefined dispensation order, so that the nascent nucleotide chain is extended one nucleotide residue per dispensation event. Detection of nucleotide sequence is performed by way of a chain of enzymatic reactions involving the activities of DNA polymerase, apyrase, ATP sulfurylase, and luciferase, respectively, allowing for the incorporation of complementary nucleotide, degradation of unused dNTP, generation of luciferase-substrate from pyrophosphate and adenosine 5’-phosphosulfate, and emission of light from the ATP-driven conversion of luciferin to oxyluciferin. Incorporation of a particular nucleotide is displayed graphically in the form of a chart recording of nucleotide dispensation event versus the intensity of emitted light. This cascade of enzyme reactions is quantitative, in that increased light intensity is produced upon incorporation of multiple nucleotides.
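The relationship between dispensation order, light intensity, and sequence can be sketched in Python as follows. This is an idealized model in which each incorporated nucleotide yields exactly one unit of light and unused dNTP is fully degraded before the next dispensation; the function names and the short template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pyrogram(template_3to5, dispensation_order):
    """Return (nucleotide, light units) for each dispensation event.

    The template is written 3'->5' so the nascent strand is read in index
    order. A run of identical bases is incorporated in a single event and
    gives a proportionally stronger signal.
    """
    to_synthesize = "".join(COMPLEMENT[base] for base in template_3to5)
    position = 0
    signals = []
    for nucleotide in dispensation_order:
        incorporated = 0
        while (position < len(to_synthesize)
               and to_synthesize[position] == nucleotide):
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


def decode(signals):
    """Rebuild the synthesized sequence from the recorded signals."""
    return "".join(nucleotide * units for nucleotide, units in signals)


signals = pyrogram("CCTAAAG", "ACGT" * 3)
print(signals)
print(decode(signals))  # GGATTTC
```

In this toy run, the third dispensation (G) produces two units of light and the eighth (T) produces three, which is how a chart recording distinguishes single bases from homopolymer runs.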

Fig. 2.8, Steps in pyrosequencing. (1) Deoxyribonucleic acid (DNA) extraction. (2) Genomic DNA shearing. (3) Fragment end repair. (4) Adenine ligation. (5) Adaptor ligation. (6) PCR amplification. (7) Fragment binding, via the adaptors, to complementary DNA segments anchored to a solid surface. (8) Clonal cluster formation through bridge amplification. (9) High-resolution images capture. (10) Alignment of multiple reads to a reference genome.

Pyrosequencing, introduced in the early 2000s, provided the background for the explosion of new techniques collectively known as high-throughput NGS or massively parallel sequencing. NGS provides longer read length and a lower price per base of sequencing compared with Sanger sequencing. NGS is based on the uncoupling of the traditional nucleotide-identifying enzymatic reaction from the image capture; doing so in an ever-speedier way allows for essentially unlimited capacity. The first discoveries of gene mutations for endocrine diseases exploiting NGS were published in 2011. Currently, many similar systems are being used for NGS. Illumina® workflows, for example, include four basic steps. The first consists of the random fragmentation of the DNA (or cDNA) sample, followed by 5′ and 3′ adapter ligation. These short adapter sequences encompass binding sites, indices (necessary for performing multiplex reactions), and segments complementary to the oligos fixed on a flow cell. Adapter-ligated fragments are then PCR amplified and gel purified to generate a “library.” The library is then loaded into the flow cell, where its fragments are captured, at a dilute concentration, on oligos that are complementary to the library adapters and immobilized on a solid support. Each fragment is then further amplified into distinct, clonal clusters through bridge amplification. “Bridges” are formed when the free adapter sequence at the uncaptured end of each fragment bends over and anneals to an adjacent complementary immobilized oligo. A polymerase then generates a complementary strand. After denaturation, both strands bind again, pairing with other complementary primers anchored to the floor of the plate. This step is repeated a number of times, generating millions of copies of each fragment.
When this “cluster generation” is complete, the templates are ready for “sequencing.” Here also, as in conventional pyrosequencing, the reversible terminator–based method detects single bases as they are incorporated into DNA template strands, generating light emissions under excitation with a laser. Each colony of distinct fragments is sequenced through the capture of high-resolution images that reflect the base-by-base addition of nucleotides. In contrast to conventional pyrosequencing, all four reversible terminator–bound dNTPs are present during each sequencing cycle, each labeled with a different fluorescent dye. Natural competition minimizes incorporation bias, which reduces raw error rates. Finally, a sophisticated computer program aligns the newly identified sequence reads to a reference genome sequence; in the case of multiplex reactions, the reads are grouped by their index segments according to DNA source. The extremely high number of identical sequences guarantees the correctness of the obtained results and allows the exclusion of those that are only sporadically represented. This dramatic rise in data output, together with the progressive reduction in sequencing cost, paved the way for “personalized medicine,” in which the genome of each individual can be easily obtained and compared with normal genome sequences to detect known or new disease-linked mutations that may be present.
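Two of the computational steps mentioned above, grouping multiplexed reads by index and relying on many identical reads to outvote sporadic errors, can be sketched in Python. This is a deliberately simplified illustration and not the procedure of any specific vendor’s software: it assumes that the index is the first few bases of each read and that reads within a group are already aligned and of equal length. All names and sequences are hypothetical.

```python
from collections import Counter, defaultdict


def demultiplex(reads, index_to_sample, index_len):
    """Group reads by sample using a leading index; drop unknown indices."""
    groups = defaultdict(list)
    for read in reads:
        sample = index_to_sample.get(read[:index_len])
        if sample is not None:
            groups[sample].append(read[index_len:])  # strip the index
    return dict(groups)


def consensus(aligned_reads):
    """Return the most frequent base at each column of pre-aligned reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


reads = ["ACTTGCA", "ACTTGCA", "ACTTGGA", "GTCCATG", "GTCCATG", "NNAAAAA"]
groups = demultiplex(reads, {"AC": "sample_1", "GT": "sample_2"}, index_len=2)
print(groups["sample_1"])             # ['TTGCA', 'TTGCA', 'TTGGA']
print(consensus(groups["sample_1"]))  # TTGCA
```

The single read carrying G at the fourth position is outvoted by the two reads carrying C, which is the sense in which sporadically represented sequences are excluded.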

Semiconductor- and nanotechnology-based systems are currently in use in massive sequencing efforts and promise an even cheaper and faster way of determining mutations and other abnormalities of the human genome.

A requirement of all high-throughput screening approaches is confirmation of findings (expression level of a given gene/sequence) by other independent methods. Usually, a select group of genes is tested; these genes are picked from the series of sequences that were analyzed either because they were found to have significant changes or because of their particular interest with regard to their expression in the studied tissue or their previously identified relationship to pathology or developmental stage. The confirmation process attempts to support the findings on three different levels: (1) reliability of the high-throughput experiment (for this purpose the same samples examined by the microarrays are used); (2) trustworthiness of the observations in general (to achieve this, a larger number of samples is examined, assessment of which by high-throughput approaches is often unaffordable price- or labor-wise); and (3) verification of the expression changes at the protein level. A commonly used confirmatory technique is qRT-PCR. For verification at the protein level, immunohistochemistry (IHC) and Western blot are the two most commonly chosen techniques. IHC is not quantitative but has the advantage of allowing for the observation of the exact localization of a signal within a cell (cytoplasmic versus nuclear) and the tissue (identifying histologically the tissue that is stained). Modern Western blot methods require a smaller amount of protein lysate than older techniques and have the advantage of offering high-resolution quantitation of expression without the use of radioactivity.
