Molecular Tools in Cancer Research


Summary of Key Points

  • Our understanding and treatment of cancer have always relied heavily on parallel developments in biologic research. Molecular biology provides the basic tools to study genes involved with cancer growth patterns and tumor suppression. An advanced understanding of the molecular processes governing cell growth and differentiation has revolutionized the diagnosis, prognosis, and treatment of malignant disorders.

  • This introductory chapter relates basic principles of molecular biology to emerging perspectives on the origin and progression of cancer. It also explains newly developed laboratory techniques, including whole-genome analysis, expression profiling, and refined genetic manipulation in, and use of, genetically diverse animal models. Together these provide the conceptual and technical background necessary to grasp the central principles and new methods of current cancer research.

Since the last edition of this book was published, advances in our understanding of the basic mechanisms of cancer have continued to inform and refine clinical approaches to prevention and therapy. New prognostic and predictive markers derived from molecular biology can now pinpoint specific genetic changes in particular tumors or detect occult malignant cells in normal tissues, leading to improved technologies for tumor screening and early detection. Diagnostic approaches have expanded from morphologic criteria and single-gene analysis to whole-genome technologies and single-cell genomics imported from other biologic disciplines. A new systemic vision of cancer is emerging, in which the importance of individual mutation has been superseded by an appreciation for higher-order organization and individual genetic background, disrupted by complex interactions of disease-associated factors and gene-environmental parameters that affect tumor cell behavior. Results from these cross-disciplinary investigations underscore the complexity of carcinogenesis and have profoundly influenced the design of strategies for both cancer prevention and advanced cancer therapy.

This overview will serve as a foundation of conceptual and technical information for understanding the exciting new advances in cancer research described in subsequent chapters. Since the discovery of oncogenes, which provided the first concrete evidence of cancer's genetic basis, applications of advanced molecular techniques and instrumentation have yielded new insights into normal cell biology. A basic fluency in molecular biology and genetics has become a necessary prerequisite for clinical oncologists because many of the new diagnostic and prognostic tools currently in use rely on these fundamental principles of gene, protein, and cell function.

Our Unstable Heredity

Cancer genetics has classically relied on the candidate-gene approach, detecting acquired or inherited changes in specific genetic loci accumulated in a single cell, which then proliferates to produce a tumor composed of its identical clonal progeny. During the early steps of tumor formation, mutations that lead to an intrinsic genetic instability allow additional deleterious genetic alterations to accumulate. These genetic changes confer selective advantages on tumor cell clones by disrupting control of cell proliferation. The identification of specific mutations that characterize a tumor cell has proved invaluable for analyzing the neoplastic progression and remission of the disease. The emergence of cancer cells is a byproduct of the necessity for continuous cell division and DNA replication to maintain organ functionality throughout the life cycle.

The highly heterogeneous nature of tumors, each composed of multiple cell types, led to the formulation of the “cancer stem cell” hypothesis, which posits that only a subpopulation of cancer cells is able to maintain self-renewal, unlimited growth, and capacity for differentiation into other, more specialized cancer cell types. Cancer stem cells display bona fide stem cell markers, in contrast to other cancer cells present in the tumor, which do not have tumorigenic potential. In fact, fewer than 1 in 10,000 cells present in human acute myeloid leukemia are capable of reinitiating a new tumor when transplanted into animals. Cancer stem cells have been identified in many solid tumors in the brain, colon, ovaries, prostate, and pancreas, suggesting that more effective cancer therapies would target these self-renewing cells, rather than the tumor as a whole. The cancer stem cell concept differs from the original clonal evolution hypothesis, which states that every cell in a tumor mass is capable of self-renewal and differentiation, and suggests that detecting and targeting subtle genetic and epigenetic differences that distinguish cancer stem cells may provide a more effective avenue to intervention in disease progression.

Heterogeneity can also arise as a result of stochastic mutational events that lead to cancer progression. Clastogenic insults to the genome, or genomic instability due to aberrant gene regulation, could lead to loss of heterozygosity (LOH) of tumor suppressor genes such as TP53, RB1, or BRCA, and can also lead to tumor heterogeneity and change in disease progression. Furthermore, activation of DNA or RNA editing enzymes in tumors could lead to kataegis, a DNA hypermutation process, and increase tumor heterogeneity. Although there are molecular biology tools currently available to detect aberrant but stable genomes, the latter processes that lead to genomic instability make diagnosis and prognosis more challenging.

Detecting Cancer Mutations

Methods for mutation detection all rely on the manipulation of DNA, the basic building block of heredity in the cell. DNA consists of two long strands of polynucleotides that twist around each other clockwise in a double helix (Fig. 1.1). Nucleic acid bases attached to the sugar groups of each strand face each other within the helix, perpendicular to its axis. These comprise only four bases: the purines adenine and guanine (A and G) and the pyrimidines cytosine and thymine (C and T). During assembly of the double helix, stable pairings of nucleotides from either strand are made between A and T, or between G and C. Each base pair forms one of the billions of rungs in the long, unbroken ladder of DNA forming a chromosome.
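The pairing rule described above (A with T, G with C) can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory tool: the function names and the example sequence are hypothetical. The reverse step reflects the fact that the two strands of the helix run in opposite directions, so the partner strand is conventionally read back to front.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of `strand`, in the same order."""
    return "".join(PAIRING[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposing strand, read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    example = "ATGGCC"  # hypothetical sequence, for illustration only
    print(complement(example))          # TACCGG
    print(reverse_complement(example))  # GGCCAT
```

Because each base has exactly one partner, applying reverse_complement twice returns the original strand, which is the property that lets either strand serve as a template for the other.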

Figure 1.1, DNA structure. Deoxyribonucleic acid (DNA) is the cell's genetic material, contained in single compacted strands comprising chromosomes within the cell nucleus. In the DNA double helix, the two intertwined components of its backbone, composed of sugar (deoxyribose) and phosphate molecules, are connected by pairs of molecules called bases. The sequence of four bases (guanine, adenine, thymine, and cytosine) in the DNA helix determines the specificity of genetic information. The bases face inward from the sugar-phosphate backbone and form pairs with complementary bases on the opposing strand for specific recognition. The arrangement of chemical groups is unique for each base pair, allowing base pairs to be specifically targeted by transcription factors, polymerases, restriction enzymes, and other DNA-binding proteins.

The functional unit of inherited information in DNA, the gene, is most often represented by a discrete section of sequence necessary to encode a particular protein structure. Gene expression is initiated by forming a copy of the gene, messenger RNA (mRNA), constructed base by base from the DNA template by a polymerase enzyme. Once transcribed, an mRNA transcript is modified and the processed product is transported out of the nucleus. In the cytoplasm, proteins are then synthesized, or translated, in macromolecular complexes called ribosomes that read the mRNA sequence and convert the nucleic acid code, based on three-base segments or codons, into a 20–amino acid code to form the corresponding protein.
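The flow from gene to mRNA to protein can be illustrated with a short Python sketch. It makes several simplifying assumptions that are not in the text: the input is the coding strand (so transcription is written as a simple T-to-U substitution rather than template-directed synthesis), reading starts at the first base, and processing steps such as splicing are ignored. The codon table is the standard genetic code written compactly; the function names and example sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Write the coding strand as mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read three-base codons from the first base; stop at the first stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCTGGTAA"  # hypothetical coding sequence
    mrna = transcribe(gene)
    print(mrna)             # AUGGCCUGGUAA
    print(translate(mrna))  # MAW
```

The table maps 64 codons onto 20 amino acids plus stop signals, which is why several codons encode the same amino acid.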

Although these canonical processes drive gene expression in all normal cells, cancer cells defy the rules. For instance, uracils, which are normally found in RNA, can be detected in the DNA of cancer cells because of their high mutation rates. Paradoxically, these deviations from the norm allow the development of molecular biology tools to better diagnose and predict tumor progression.

Generating Diversity With Alternate Splicing

In higher organisms, most protein coding gene sequences are interrupted by stretches of noncoding DNA sequences, called introns. In the nucleus, these introns are removed after mRNA transcription to produce a continuous chain of coding sequences, or exons, that subsequently undergo translation into protein. The splicing process requires absolute precision because the deletion or addition of a single nucleotide at the splice junction would throw the three-base coding sequence out of frame, or lead to exon skipping or addition, creating abnormal proteins.

The dramatic increase in genetic complexity conferred by alternate RNA splicing is underscored by the multiple splice patterns of many medically relevant genes, in which different combinations of exons are chosen for the final mRNA transcript, such that one gene can encode many different proteins (Fig. 1.2). The choice of protein isoform to be expressed from a gene with multiple splicing possibilities is a decision that can be perturbed in disease. Errors in splicing mechanisms have been associated with a large group of cancers. These include mutations in the tumor suppressor gene TP53 (p53) in more than 12 different types of cancer, mutL homolog 1 protein (MLH1) mutation in hereditary nonpolyposis colorectal cancer, and several transcription factors and cell signaling and membrane proteins. When mutations in the splicing site lead to insertion of novel sequences in the mRNA, the encoded protein can be used as a potential clinical marker, as seen for the transcription factor NRSF in small cell lung cancer. Owing to their unique expression in cancer cells, these markers can be further explored as new cancer-specific therapeutic targets.
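How a single gene can give rise to several transcripts can be sketched by enumerating exon combinations. The gene model below is entirely hypothetical (exon names, sequences, and which exons are optional are invented for illustration), and real splice-site choice is regulated rather than exhaustive. The length check also illustrates why splicing must be precise: a one-nucleotide error at a junction leaves the sequence out of the three-base reading frame.

```python
from itertools import product

# Hypothetical gene model: (name, sequence, constitutive?) for each exon.
EXONS = [
    ("E1", "ATGGCC", True),   # constitutive: always kept
    ("E2", "GATTAC", False),  # cassette: may be skipped
    ("E3", "TTGCAA", False),  # cassette: may be skipped
    ("E4", "GGCTAA", True),   # constitutive: always kept
]


def splice_isoforms(exons):
    """Yield (exon_names, joined_sequence) for every keep/skip choice of cassette exons."""
    choices = [(True,) if constitutive else (True, False) for _, _, constitutive in exons]
    for keep in product(*choices):
        kept = [(name, seq) for (name, seq, _), k in zip(exons, keep) if k]
        yield tuple(name for name, _ in kept), "".join(seq for _, seq in kept)


def in_frame(sequence: str) -> bool:
    """True when the sequence length is a whole number of three-base codons."""
    return len(sequence) % 3 == 0


if __name__ == "__main__":
    for names, sequence in splice_isoforms(EXONS):
        print("-".join(names), sequence, in_frame(sequence))
```

With two optional exons this toy gene yields four isoforms; each additional optional exon doubles that number.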

Figure 1.2, RNA splicing. Alternate splicing produces multiple related proteins, or isoforms, from a single gene.

Genomics of Cancer

The complete set of DNA sequences carried on all the chromosomes is known as the genome. Although the general map of the genome is shared by all members of a species, the recent sequencing of thousands of individual human genomes has given rise to the new field of genomics, providing us with new tools to reveal the more subtle variations that arise between individuals. These variations are critical, both as a natural engine driving heterogeneity within a species, and as a source of predisposition to cancer types. The most common forms of human genetic variations, or alleles, arise as single-nucleotide polymorphisms (SNPs). Because these allelic dissimilarities are abundant, inherited, and dispersed throughout the genome, SNPs can be used to track racial diversity, personal traits, and susceptibility to common forms of cancer (Fig. 1.3). Commercial entities have developed tools that can detect thousands of SNPs with relatively little sample material. Platforms such as MegaMUGA or GigaMUGA can allow mammalian genetic mapping that can aid in a number of diagnoses and can distinguish between predictive and prognostic markers.

Figure 1.3, Determining cancer susceptibility with single-nucleotide polymorphisms (SNPs). Millions of SNPs exist between individuals, as depicted by the red arrows and the SNP density map of human chromosome 11 (right). By contrast, point mutations, deletions, insertions, and rearrangements between normal tissues and tumors or between primary and secondary tumors probably number in the tens to hundreds (or potentially thousands), as depicted by the spectral karyotype image at the bottom of the figure. Because the constitutional genetic polymorphisms are present in all the tissues of the body, it might be possible to distinguish differences in metastatic versus nonmetastatic tumors and in nontumor tissues before they ever happen to develop a solid tumor.

How do SNPs arise between individuals? One source of variation in DNA sequence derives from deviations in the strict base-pairing rule underlying the structure, storage, retrieval, and transfer of genetic information. The duplicated genetic information in the two strands of DNA not only permits the repair of a damaged coding sequence but also forms the basis for the replication of DNA. During cell division, polymerase enzymes unwind the DNA strands and copy them, using the base sequences as a template for constructing a new helix so that the dividing cell passes its entire genetic content on to its progeny. Errors in this process are rare, and person-to-person differences comprise only about 0.1% of the human genome. SNPs are inherited if they occur in the germline. Many genetically inherited variations occur in regions that neither encode protein nor alter the regulation of nearby genes. Given the disruptive effects even subtle genetic changes may have on cell function, it is important to distinguish SNPs that represent true mutations from benign polymorphisms.
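At its simplest, spotting single-base differences between two individuals amounts to comparing sequences position by position. The sketch below assumes the two sequences are already aligned and of equal length, so insertions and deletions are out of scope; the function name and sequences are hypothetical. A difference between two sequences is only a candidate site: calling it a polymorphism requires seeing it across a population.

```python
def find_single_base_differences(reference: str, sample: str):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned to the
    same length; this sketch does not handle insertions or deletions.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()), start=1)
        if r != s
    ]


if __name__ == "__main__":
    reference = "ATGGCCTGGTAA"  # hypothetical reference sequence
    sample = "ATGGCTTGGTAA"     # hypothetical individual, one base changed
    differences = find_single_base_differences(reference, sample)
    print(differences)                    # [(6, 'C', 'T')]
    print(len(differences) / len(reference))
```

The ratio printed on the last line is the per-base difference rate for this toy pair; the figure of about 0.1% quoted above is the corresponding rate between two human genomes.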

Our ability to monitor hundreds of thousands of SNPs simultaneously is one of the most important advances in modern medical genetics. Relatively simple genotyping technologies for SNP detection rely largely on the polymerase chain reaction (PCR). In procedures that use this reaction, two chemically synthesized single-stranded DNA fragments, or primers, are designed to match chromosomal DNA sequences flanking the segment in which an SNP is positioned. With the addition of nucleotide building blocks and a heat-stable DNA polymerase, the primer pairs initiate synthesis of new DNA strands, using the chromosomal material as a template; the amplified products are known as amplicons. Each successive copying cycle, initiated by “melting” the resulting double-stranded products with heat, doubles the number of DNA segments in the reaction (Fig. 1.4). The technique is exceptionally sensitive; millions of identical DNA copies can be generated in a matter of hours with PCR by using a single DNA molecule as the starting material.
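The arithmetic behind this sensitivity is simple doubling: after n ideal cycles, one starting molecule becomes 2^n copies. The sketch below works through that calculation. The efficiency parameter is an assumption added for illustration (1.0 means perfect doubling, and real reactions fall short of it), and the function names are hypothetical.

```python
def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies present after `cycles` rounds; efficiency 1.0 is perfect doubling."""
    return starting_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(target_copies: float, starting_copies: int = 1,
                    efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles whose yield meets or exceeds the target."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    cycles = 0
    copies = float(starting_copies)
    while copies < target_copies:
        copies *= 1.0 + efficiency  # each cycle multiplies the current copy number
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(copies_after(20))            # 1048576.0
    print(cycles_to_reach(1_000_000))  # 20
```

Under the ideal-doubling assumption, about 20 cycles take a single molecule past one million copies, which at a few minutes per cycle is consistent with the "matter of hours" stated above.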

Figure 1.4, Amplification of DNA by polymerase chain reaction (PCR). The DNA sequence to be amplified is selected by primers, which are short, synthetic oligonucleotides that correspond to sequences flanking the DNA to be amplified. After an excess of primers is added to the DNA, together with a heat-stable DNA polymerase, the strands of both the genomic DNA and the primers are separated by heating and allowed to cool. A heat-stable polymerase elongates the primers on either strand, thus generating two new, identical double-stranded DNA molecules and doubling the number of DNA fragments. Each cycle takes just a few minutes and doubles the number of copies of the original DNA fragment.

Other novel methods for large-scale SNP detection include single-nucleotide primer extension, allele-specific hybridization, oligonucleotide ligation assay, and invasive signal amplification, which detect polymorphisms directly from genomic DNA without the requirement of PCR amplification. The International HapMap Project was established with the objective of identifying those variations (commonly thought to be on the order of 10 million in our genome) in the human population. This project is already in its third phase (HapMap3), now including both SNPs and copy number variations observed in 1184 samples from 11 different human populations. Regardless of the method used to characterize them, the collective SNPs in a selected genomic region characterize a haplotype, or specific combination of alleles at multiple linked genetic loci along a chromosome that are inherited together.

Even when the SNPs within a given haplotype are not directly involved in a disease, they provide markers for clonality and for the loss or rearrangement of specific chromosomal segments in growing tumors. In the human nucleus, each of the 23 pairs of tightly compacted chromosomes has a characteristic size and structure, and a distinctive base sequence that carries unique protein coding information. Other noncoding DNA sequences are used for directing the transcription of neighboring genes, through complex regulatory circuits involving protein binding and modification of the DNA itself, or shifting of its chromosomal packaging. Although genomic instability is generally considered a consequence of tumor formation rather than the initial trigger of cancer, the loss, gain, or rearrangement of chromosomal segments through deletion or translocation is a common form of neoplastic mutation, as protein coding segments from different genes are combined or regulatory sequences are brought into new proximity to genes they do not normally control, as seen in chronic myeloid leukemia (CML). In CML, recombination events lead to the fusion of BCR and ABL genes (Philadelphia chromosome). This results in constitutive activation of the fused gene, leading to loss of proliferative control in myeloid cells and consequently cancer. Gross changes in DNA arrangement can be detected by cytogenetic analysis of chromosomal features on metaphase spreads. Although fluorescence in situ hybridization (FISH) provides greater resolution by localizing specific chromosomal DNA sequences corresponding to fluorescently labeled probes (Fig. 1.5), and can be used to track specific alterations in chromosomal structure where known genes are involved, spectral karyotyping (SKY) is a powerful and more general tool that could aid diagnosis of cancer genomes. With each fluorescently labeled chromosome assigned a specific color, translocations and additions are revealed as multicolored chromosomes, or large deletions as pieces of missing chromosomes.

Figure 1.5, Detection of chromosomal translocations. Fluorescence in situ hybridization (FISH) technology uses a labeled DNA segment as a probe to search homologous sequences in interphase chromosomes for the t(9;22)(q34;q11) translocation, associated with chronic myeloid leukemia. On the left, patient nuclei were hybridized with probes for chromosome 9 (labeled with SpectrumRed fluorophore) and chromosome 22 (labeled with SpectrumGreen).

The plethora of data arising from genome-wide association studies using currently available techniques poses particular challenges to cancer researchers. Discerning the causal genetic variants among genotype-phenotype associations requires extensive replication, control for underlying genetic differences in population cohorts, and consistent classification of clinical outcomes. New technologies must be met with equivalently sophisticated and rigorous analytic methodologies for the true genetic cause of cancer to be teased out from our variable and often unstable heredity.
