Cancer develops due to an accumulation of mutations in DNA. For many years, it has been widely accepted that the development and progression of cancer are associated with alterations in the DNA sequence of the cancer cell genome, such as base substitutions, short insertions and deletions, homozygous deletions, and amplifications and fusions (translocations) of genetic material. Improved understanding of the genetic mechanisms that initiate or drive cancer progression has set the stage for the development of personalized cancer treatment. Although the total catalog of critical “driver” alterations known to promote oncogenesis of a given cancer type may be large, the number of driver alterations contributing to an individual patient’s solid tumor is typically low and unpredictable. Direct sequencing of the tumor cell DNA is necessary to identify which alterations drive an individual patient’s disease. The identification and targeting of specific mutations that have arisen in a tumor continue to show great promise as a means to increase the efficacy of both targeted therapies and immunotherapies for breast cancer patients.
Several decades passed after the structure of DNA was discovered before the sequence of human DNA began to be elucidated. It was not until 1977 that Frederick Sanger developed the Sanger method of rapid DNA sequencing. In 1986, Leroy Hood introduced the first semiautomated DNA sequencing machine, and in 1987 the first fully automated sequencing machine, the ABI 370, was introduced. Shortly thereafter, the sequencing of human complementary DNA (cDNA) ends, known as expressed sequence tags, began in the laboratory of Craig Venter. These technical advances culminated with the first publications of the human genome sequence in 2001.
At the time of its publication in 2001, the initial near-complete draft of the human genome had required more than 12 years of sequencing at multiple laboratories at a cost of more than $3 billion USD. Since then, a continuous demand for more rapid and low-cost sequencing has driven the development of novel approaches designed to parallelize the sequencing process. These new massively parallel or next-generation strategies, in comparison with the traditional Sanger method and other methods, have increased sequencing rates by orders of magnitude and driven down the cost per base significantly.
Traditional DNA sequencing methods that have been used to characterize clinical cancer specimens and inform treatment decisions are highly sensitive, although they are often limited in their scope to known mutational hotspots. Although these methods are targeted and quick, they have disadvantages: false negatives, limits on the types of alterations that can be identified, and missed opportunities to identify other potential drivers. Next-generation methods can sequence a much larger set of alleles simultaneously, providing a scale and breadth of analysis that were not previously possible ( Table 36.1 ).
Category | Methodology | Description | SNVs | Indels | CNVs | Rearr. | Clinical considerations |
---|---|---|---|---|---|---|---|
High-throughput NGS techniques | Whole genome sequencing | Whole genome sequencing can be performed using any of the NGS technologies | Y | Y | Y | Y | |
High-throughput NGS techniques | Chain termination | DNA synthesis terminated by random incorporation of fluorescently labeled bases, as in Sanger dideoxy sequencing. Optical detection systems capture nucleotide incorporation after each cycle. Example: Illumina HiSeq | Y | Y | Y | Y a | |
High-throughput NGS techniques | Ion semiconductor detection | Relies on the detection of hydrogen ions released during DNA synthesis. Is unique among NGS technologies because it does not rely on optical measurements. Example: Ion Torrent | Y | Y | Y | Y* | |
Low-throughput PCR-based techniques | Sanger dideoxy sequencing | DNA synthesis terminated by random incorporation of fluorescently labeled bases. DNA fragments separated by capillary electrophoresis to determine sequence. | Y | Y | N | N | |
Low-throughput PCR-based techniques | Allele-specific PCR | Primers span codon of interest and probes detect specific mutation. | Y | N | N | N | |
Low-throughput PCR-based techniques | Mass spectrometry | Single nucleotide primer extension assays followed by analysis of DNA product using a mass spectrometer. | Y | Y* | N | N | |
Low-throughput PCR-based techniques | Real-time melting curve PCR | Melting curve of DNA measured to identify mutated PCR products, which melt at lower temperatures than wild-type DNA. | Y | Y* | N | N | |
Low-throughput PCR-based techniques | Pyrosequencing | Measures the release of pyrophosphate during nucleotide incorporation. | Y | Y* | N | N | |
The columns SNVs through Rearr. give the variant types detected (Y, yes; N, no): SNVs, single nucleotide variants; indels, short insertions and deletions; CNVs, copy number variants; Rearr., rearrangements.
a It is straightforward to identify rearrangements with specific breakpoints, but variants resulting from unforeseen breakpoints not covered within the bait or primer set would not be detected.
The chain termination sequencing method developed by Sanger and colleagues was the cornerstone procedure used in the original sequencing of the human genome. Contemporary Sanger sequencing uses automated instruments that detect the insertion of fluorescently labeled dideoxynucleotide chain terminators and determine their position in the sequenced product following capillary electrophoresis.
Pyrosequencing differs from the chain termination method by detecting the pyrophosphate released as a complementary DNA strand is synthesized on the single-stranded template to be sequenced. It is also known as a sequencing-by-synthesis method. As DNA polymerase adds each new base to the complementary strand, the released pyrophosphate is converted enzymatically into a chemiluminescent signal, from which the sequencing system determines the corresponding nucleotide of the template strand. The template length that can be sequenced this way is significantly shorter than for Sanger chain termination sequencing. However, pyrosequencing is considered more sensitive than Sanger sequencing and reports the percentage of the input DNA that harbors the specific mutation. Thus, pyrosequencing is most often applied in clinical settings for short-length hotspot sequencing of specific codons within the gene of interest.
First introduced in 2005, the 454 next-generation sequencing (NGS) platform encapsulates a single DNA template strand in a droplet of a water-in-oil emulsion along with a primer-coated bead. The sequencing instrument is organized into picoliter wells, each of which holds an individual template-carrying bead. Pyrosequencing is performed and nucleotide incorporation is visualized in a luciferase system with a fiber-optic coupled imaging camera. The strengths of the 454 NGS platform are relatively fast instrument run times and long read lengths; its limitations are the high cost of reagents and high error rates in genetic regions rich in homopolymer repeats.
A variety of closed commercial polymerase chain reaction (PCR)–based systems have been developed to perform DNA sequencing and have shown high sensitivity for mutation detection with a reduced risk of sample contamination. Allele-specific real-time PCR determines the sequence of preidentified hotspots in the cancer cell genome. Primer and probe sets are designed to detect the mutations of clinical interest, and will not detect other mutations, deletions, or translocations involving related genes. This method is reported to detect a KRAS mutation present in as little as 1% of the total DNA extracted from the formalin-fixed paraffin-embedded (FFPE) specimen.
Analysis of the melting curve of the DNA products of PCR amplification can determine the presence of a specific DNA mutation of interest. The method is based on the principle that wild-type DNA melts at a higher temperature than mutated DNA: a heterozygous mutation produces two melting peaks (the wild-type peak plus a lower-temperature peak), whereas a homozygous mutation produces a single lower-temperature peak. A variation of this method is the PCR Clamp method, which uses a peptide nucleic acid probe to block amplification of the wild-type DNA within a sample so that a specific mutation can be detected. Although this method has high sensitivity, it cannot calculate the percentage of mutated DNA.
Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) has been applied to clinical samples for DNA sequencing with very high resolution and sensitivity, especially for the detection of somatic point mutations in cancer samples and single nucleotide polymorphisms in germline DNA. This type of DNA genotyping is the backbone of the Sequenom MassARRAY® system.
The first organization to develop a second, or next-generation, approach to DNA sequencing was Lynx Therapeutics. Its massively parallel signature sequencing (MPSS) platform was a microsphere (bead)-based system that read nucleotides in groups of four via an adapter ligation and adapter decoding strategy. Through a merger with the Solexa Corporation, which was subsequently acquired by Illumina, Inc., this bead-based approach using reversible dye terminators and short read lengths was adapted for use in a flow cell with eight individual lanes, the surfaces of which are coated with oligonucleotide anchors. In this approach, unincorporated nucleotides are washed away after each cycle, with the remaining DNA extended one nucleotide at a time. Subsequent system cycles take place after digital images are captured of the fluorescently labeled nucleotides and the terminal 3′ blocker is chemically removed from the DNA. By a process called bridge amplification, DNA templates are amplified in the flow cell by “arching” over and hybridizing to an adjacent anchor oligonucleotide. A number of technical issues, particularly those involving aberrant nucleotide incorporation rates, place major responsibility on the bioinformatics systems and computational biologists to correctly interpret the raw sequencing data produced by the Illumina systems. The Illumina technique is currently the most widely used NGS platform, and Illumina currently markets three major clinical instruments: the HiSeq 2500, the HiSeq 3000/4000, and the MiSeq. The HiSeq platforms can sequence up to 1 trillion bases in about 3 days or approximately 10 billion bases in a rapid run mode that takes as few as 7 hours. The MiSeq is a much less expensive, lower-capacity instrument used for rapid turnaround (it can sequence 500 million bases in 4 hours).
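The throughput figures quoted above can be put on a common scale with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the genome length of roughly 3.1 billion bases is an assumed approximation that does not come from the text.

```python
def bases_per_hour(total_bases: float, run_hours: float) -> float:
    """Average sequencing rate over a whole run, in bases per hour."""
    if run_hours <= 0:
        raise ValueError("run_hours must be positive")
    return total_bases / run_hours


def mean_genome_coverage(total_bases: float, genome_size: float = 3.1e9) -> float:
    """Mean fold-coverage if every base mapped uniformly to one genome.

    genome_size is an ASSUMED approximate haploid human genome length;
    it is not a figure taken from the chapter.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Run sizes and durations as quoted in the text.
hiseq_high_output = bases_per_hour(1e12, 3 * 24)  # 1 trillion bases in ~3 days
hiseq_rapid = bases_per_hour(10e9, 7)             # 10 billion bases in 7 hours
miseq = bases_per_hour(500e6, 4)                  # 500 million bases in 4 hours

if __name__ == "__main__":
    for name, rate in [("HiSeq high output", hiseq_high_output),
                       ("HiSeq rapid run", hiseq_rapid),
                       ("MiSeq", miseq)]:
        print(f"{name}: {rate:.3g} bases/hour")
    print(f"Coverage from 1e12 bases: {mean_genome_coverage(1e12):.0f}x")
```

Real runs lose bases to quality filtering, duplicates, and off-target reads, so usable coverage is lower than this idealized estimate.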
Ion semiconductor sequencing is based on the detection of hydrogen ions that are released during the polymerization of DNA, and is the basis for the Ion Torrent™ system. It has been widely adopted in clinical molecular diagnostics laboratories. Incorporation of a nucleotide into the growing complementary DNA strand releases a hydrogen ion that triggers a hypersensitive ion sensor. The technology is now owned by Life Technologies, which claims that its PostLight™ sequencing technology is the first of its kind to eliminate the cost and complexity associated with the optical detection currently used in all other sequencing platforms. The system appears to be used mainly for rapid and affordable short-sequence determination of exons containing hotspot mutations.
Prior to the launch of NGS testing platforms, traditional hotspot DNA sequencing had reached the bedside for the treatment of a variety of tumors including non-small cell lung cancer (NSCLC), colorectal cancer (CRC), hematological malignancies, and melanoma ( Table 36.2 ). Next-generation technologies are also capable of testing for each of these known driver mutations, and have expanded the repertoire of genetic abnormalities that can be evaluated to include copy number changes, such as human epidermal growth factor receptor 2 (HER2) gene amplification in the context of breast or upper gastrointestinal (GI) tumors, and a wider array of variable fusion or rearrangement events, such as those affecting ROS1 or RET in NSCLC.
Genetic event | Disease | Therapies |
---|---|---|
KRAS mutation | CRC | Cetuximab, panitumumab (contraindicated by KRAS mutation) |
KRAS G12C mutation | NSCLC | |
BRAF mutation | Melanoma | Vemurafenib, dabrafenib |
EGFR mutation | NSCLC | Gefitinib, erlotinib, afatinib, osimertinib |
EML4-ALK translocation | NSCLC | Crizotinib, ceritinib, alectinib, lorlatinib, brigatinib |
KIT mutation | GIST/melanoma | Imatinib, sunitinib, regorafenib, pazopanib, avapritinib |
BCR-ABL translocation | CML | Imatinib, dasatinib, nilotinib, bosutinib |
PML-RARA translocation t(15;17) | APL | ATRA, ATO |
HER2 gene amplification | Breast and upper GI cancer | Trastuzumab, pertuzumab, trastuzumab-DM1, trastuzumab-deruxtecan, lapatinib, afatinib, neratinib |
ROS1 fusion | NSCLC | Crizotinib |
RET fusion | NSCLC | Selpercatinib, pralsetinib |
MET Exon 14 splice site mutations | NSCLC | Capmatinib, tepotinib |
FGFR 1–3 mutations | Urothelial bladder cancer | Erdafitinib |
FGFR2 fusions | Cholangiocarcinoma | Pemigatinib |
NTRK1-3 fusions | Pan-cancer | Larotrectinib |
Microsatellite instability (MSI high) | Pan-cancer (excluding hematopoietic malignancies) | Pembrolizumab |
Tumor mutational burden >10 mutations/megabase | Pan-cancer | Pembrolizumab |
The traditional approaches to cancer cell DNA sequencing are compared with the NGS approach in Table 36.3 . The relative cost of the two approaches is of great importance to current and future test providers, consumers, and payers. Although the cost per base sequenced for the traditional approaches is high, these narrow approaches focused on one gene or a few hotspots are often less expensive overall than the cost of an NGS assay that evaluates many hundreds of genes with more expensive reagents and equipment. Without question, the expertise, especially in computational biology, required to perform clinical NGS testing for cancer patients is significantly higher than for traditional sequencing. In daily clinical pathology practice, both traditional and NGS sequencing approaches are challenged by several concerns: what is the best sample to test (e.g., primary vs. metastatic tumor tissue or tumor tissue vs. circulating tumor cells); small sample size, as from fine-needle aspiration (FNA) biopsies; tumoral heterogeneity with respect to genetic abnormalities; and extensive necrosis or samples that feature a very low percent of tumoral DNA compared with noncancerous tissue.
Parameter | Traditional | NGS | Advantage |
---|---|---|---|
Cost (per base) | High | Low | NGS |
Cost (per multiplex multigene “test”) | High | Moderate | Uncertain |
Equipment cost | Moderate | High | Traditional |
Expertise required for sequencing and data analysis | Moderate | High | Traditional |
Can be performed on FFPE samples | Yes | Yes | — |
Challenged by small samples, necrotic tumor, tumoral heterogeneity, and very low percent of tumoral DNA in sample | Yes | Yes | — |
Generally restricted to one gene at a time | Yes | No | NGS |
Can easily sequence hundreds of cancer-related genes in one sample | No | Yes | NGS |
Generally restricted to hotspots only | Yes | No | NGS |
Can easily detect deletions | No | Yes | NGS |
Can easily detect translocations | No | Yes | NGS |
Can easily detect gene copy number alterations | No | Yes | NGS |
Sensitivity | Low | High | NGS |
Turnaround time (single gene) | Shorter | Longer | Traditional |
Turnaround time (per multiplex multigene analysis) | Longer | Shorter | NGS |
The restriction of traditional sequencing to analysis of one gene at a time, and within that gene typically to hotspots (e.g., codons 12 and 13 of exon 2 in the KRAS oncogene), is a significant drawback. NGS platforms allow for large-scale gene sequencing that can both determine the status of mutational hotspots expected in a given clinical situation and discover unexpected sequence abnormalities that could significantly alter the treatment plan. Novel mutations with clinical impact continue to be discovered, even for established cancer genes, but are undetectable using traditional platforms as designed today. By analyzing read counts at given loci, NGS can provide information on gene copy number, identifying homozygous and heterozygous deletions and gene amplifications when traditional sequencing approaches cannot. NGS can also detect translocations that drive therapy selection, such as the EML4-ALK translocation that is the key indication for crizotinib treatment in NSCLC. Furthermore, the sensitivity of NGS can match or exceed traditional approaches when the mutation is present in only a small percentage of the total DNA extracted from the specimen.
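The idea of inferring copy number from read counts can be sketched as follows. This is a minimal illustration, not the algorithm of any particular NGS pipeline: it assumes a matched normal sample is available for comparison, and the function names and the gain/loss thresholds are hypothetical values chosen for the example.

```python
import math


def log2_copy_ratio(tumor_reads: int, normal_reads: int,
                    tumor_total: int, normal_total: int) -> float:
    """log2 of the depth-normalized tumor/normal read-count ratio at one locus.

    Dividing by each sample's total read count corrects for the two
    libraries having been sequenced to different overall depths.
    """
    if normal_reads <= 0 or tumor_total <= 0 or normal_total <= 0:
        raise ValueError("normal_reads and both totals must be positive")
    ratio = (tumor_reads * normal_total) / (normal_reads * tumor_total)
    if ratio == 0:
        return float("-inf")  # no tumor reads at all: consistent with homozygous loss
    return math.log2(ratio)


def call_copy_state(log2_ratio: float, gain: float = 0.58, loss: float = -1.0) -> str:
    """Label a locus using ASSUMED thresholds (not taken from the text).

    In a pure diploid tumor, 0.58 is roughly log2(3/2) (one extra copy)
    and -1.0 is log2(1/2) (one copy lost).
    """
    if log2_ratio >= gain:
        return "gain"
    if log2_ratio <= loss:
        return "loss"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical counts: 400 tumor reads vs. 100 normal reads at equal library size.
    r = log2_copy_ratio(400, 100, 1_000_000, 1_000_000)
    print(r, call_copy_state(r))
```

Production callers additionally correct for GC content, bait efficiency, and tumor purity, none of which is modeled here.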
The rapid analysis of many genes in parallel, as made possible with NGS technology, also facilitates the identification of potentially relevant clinical trials. Knowing a patient’s comprehensive genomic profile can allow for the selection of either a selective trial investigating therapeutic strategies in the limited context of one biomarker and/or disease, or indicate that patients could benefit from enrollment into a basket trial with potentially fewer restrictions on tumor type or molecular profile.
Although the turnaround time for NGS of a multiplex (>100-gene) cancer genome panel is currently longer (4–7 days) than traditional single-gene hotspot sequencing, it is anticipated that this difference will rapidly narrow as NGS technology continues to evolve. Information on the patient’s germline DNA sequences may be needed in a variety of clinical settings to make sense of the tumor cell sequence or to distinguish rare, harmless germline polymorphisms from possibly significant somatic mutations. Finally, in an era of growing demand for a more personalized approach to oncology practice, it is likely that other traditional and emerging cancer cell diagnostics, including slide-based assays (immunohistochemistry [IHC] and fluorescence in situ hybridization [FISH]), analysis of the epigenome using methylation-specific reverse transcription-polymerase chain reaction (RT-PCR), and microRNA profiling, will be combined with tumor cell DNA sequencing to create some form of unified laboratory report.
To deliver NGS results as a clinical assay for patient management, a number of barriers must be overcome, ranging from specimen requirements, cost, and turnaround time to proper analysis and interpretation of the sequencing data.
Clinical NGS performed for solid tumors generally uses FFPE material, although many other tissue samples can be analyzed. Major resection specimens almost always provide an adequate sample, but small-needle biopsies, FNA biopsies, and fluid cell block samples may be limiting. In general, a sample of approximately 15 mm² with a minimal depth of 40 microns is adequate for NGS. For assay systems that measure gene copy number in addition to other mutations, tumor nuclei must account for at least 20% of the total tissue nuclei present. Contamination with noncancerous tissue or high levels of necrosis can affect detection sensitivity. When tumor nuclei proportions are below 20%, the risk of missing a copy number gain or homozygous loss increases rapidly. Macrodissection can often be used on larger specimens to enrich the sample for tumor nuclei.
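Why low tumor content obscures copy number changes can be seen from a simple mixing calculation. The sketch below assumes that the noncancerous cells in the sample are diploid at the locus and that tumor and normal nuclei contribute DNA in proportion to their numbers; both are simplifying assumptions, and the function name is hypothetical.

```python
def expected_copy_ratio(tumor_copies: float, purity: float,
                        normal_copies: float = 2.0) -> float:
    """Expected observed copy ratio (relative to normal) for a mixed sample.

    purity is the fraction of nuclei that are tumor nuclei (0 to 1).
    Assumes normal cells carry normal_copies copies of the locus.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if normal_copies <= 0:
        raise ValueError("normal_copies must be positive")
    mixed = purity * tumor_copies + (1.0 - purity) * normal_copies
    return mixed / normal_copies


if __name__ == "__main__":
    for purity in (1.0, 0.5, 0.2, 0.1):
        amp = expected_copy_ratio(6, purity)      # six-copy amplification
        homdel = expected_copy_ratio(0, purity)   # homozygous deletion
        print(f"purity {purity:.0%}: amplification ratio {amp:.2f}, "
              f"homozygous loss ratio {homdel:.2f}")
```

Under these assumptions, a homozygous deletion in a sample with 20% tumor nuclei lowers the expected ratio only from 1.0 to about 0.8, a shift that is easily lost in assay noise.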
Cancer growth and progression can be driven by many different alteration types, all of which can dysregulate the checks and balances that normally preserve cellular homeostasis: point mutations that selectively alter enzyme activity, genomic rearrangements that create novel oncogenic molecules, copy number gains or losses that dramatically change transcript levels, and small insertions and deletions (indels) with various effects depending on the gene and location of the alteration. Given the variety of driver mutations possible, the collection of oligo baits used for hybrid capture and the algorithms processing the data must be designed to probe for and detect multiple alteration types. Concordance between complementary assays such as IHC and the measurement of copy number gains or losses provided by DNA sequencing illustrates the power of NGS techniques for tumor profiling.
Validating the sensitivity and specificity of an NGS assay is a major challenge for test providers. One approach has relied on the use of HapMap cell lines known to have specific genomic alterations that can be diluted to low mutant allele frequencies (MAFs) and run in parallel with clinical samples. The more traditional approach is to obtain sets of samples with known mutations (as defined by another method or another lab) in each of the genes of interest. However, this approach is generally feasible only for the most commonly mutated genes.
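The relationship between dilution, sequencing depth, and detectability in such a validation can be approximated with a binomial model. The sketch below is a simplification under stated assumptions: reads are treated as independent draws, sequencing error is ignored, the diluent is assumed to be wild-type at the site, and the minimum mutant read count is a hypothetical calling rule rather than one given in the text.

```python
from math import comb


def diluted_maf(source_maf: float, mutant_sample_fraction: float) -> float:
    """Mutant allele frequency after mixing a mutant sample into wild-type DNA.

    Assumes the diluent carries no mutant alleles at this site.
    """
    return source_maf * mutant_sample_fraction


def prob_detect(depth: int, maf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads mutant reads) under Binomial(depth, maf)."""
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must be between 0 and 1")
    p_below = sum(comb(depth, k) * maf**k * (1.0 - maf) ** (depth - k)
                  for k in range(min_alt_reads))
    return 1.0 - p_below


if __name__ == "__main__":
    # A heterozygous variant (MAF 0.5) diluted to 10% of the mix gives MAF 0.05.
    maf = diluted_maf(0.5, 0.10)
    for depth in (100, 250, 500):
        print(depth, round(prob_detect(depth, maf, min_alt_reads=5), 4))
```

Running the loop shows detection probability rising with depth at a fixed MAF, which is why low-frequency validation samples are only informative when sequenced deeply enough.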
Although the proper management of an NGS system requires technical expertise in many areas, the bioinformatics expertise required for proper analysis and interpretation is key. Statistical analysis of system performance, including depth and uniformity of sequencing coverage, is typically performed by the bioinformatics team. The software identifying alterations and determining which are clinically significant often requires local algorithm construction and modifications needed to bring the system to full performance in sensitivity and specificity. The lack of trained bioinformaticians capable of managing NGS data systems software is a major impediment to the development of NGS testing services in many clinical laboratories.
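A coverage summary of the kind a bioinformatics team might compute can be sketched in a few lines. The metric definitions and thresholds here are assumptions made for illustration (a minimum depth of 250 reads, and uniformity defined as the fraction of positions covered to at least 20% of the mean depth); laboratories define and validate their own.

```python
from statistics import mean, median


def coverage_summary(depths, min_depth=250, uniformity_fraction=0.2):
    """Summarize per-position sequencing depth for a targeted region.

    depths: sequence of read depths, one per targeted position.
    min_depth and uniformity_fraction are ASSUMED example thresholds.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    avg = mean(depths)
    n = len(depths)
    return {
        "mean_depth": avg,
        "median_depth": median(depths),
        # Fraction of positions deep enough for confident variant calling.
        "fraction_at_min_depth": sum(d >= min_depth for d in depths) / n,
        # Fraction of positions covered to at least a set share of the mean.
        "uniformity": sum(d >= uniformity_fraction * avg for d in depths) / n,
    }


if __name__ == "__main__":
    example = [0, 500, 500, 1000]  # hypothetical depths with one dropout position
    for key, value in coverage_summary(example).items():
        print(f"{key}: {value}")
```

A high mean depth can hide dropout positions, so the two fraction-based metrics are reported alongside it.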
Rapid growth and cell division, coupled with dysregulated quality control, lead tumors to quickly accumulate mutations, without these changes necessarily conferring an advantage. These passenger mutations can arise in any gene, including well-characterized oncogenes or tumor suppressors, but are not expected to predict response to treatment or prognosis. Distinguishing driver mutations from passengers relies on clinical or experimental observations indicating significance. Many databases have been developed to aid in the understanding and identification of significant mutations found in cancer. Two major initiatives have been designed to map out all the somatic intragenic mutations in cancer: the Cancer Genome Atlas of the National Cancer Institute ( http://cancergenome.nih.gov ) and the Sanger Centre’s Cancer Genome Project ( http://cancer.sanger.ac.uk/cosmic ). The COSMIC database displays the data generated from all published or otherwise publicly available human cancer sequencing efforts, whereas the Cancer Genome Atlas Project has banked information on cancer genomes, as well as transcriptomes and proteomes. The International Cancer Genome Consortium is so far the biggest project to collect human cancer genome data, and is accessible through the ICGC website ( https://dcc.icgc.org ).
The term actionable has been applied to somatic cancer genotyping to indicate when the results of sequencing can direct a specific action on the part of the oncologist. Significant discussion surrounds the general definition of actionable genomic alterations as well as the potential “actionability” of alterations in individual genes. There is universal agreement that alterations associated with a specific approved therapy in that tumor type are actionable. Most investigators also agree that an alteration indicating an approved therapy for a different tumor type is also actionable, although it would require off-label drug use. The most controversial actionability definition concerns alterations directly listed in entry criteria for registered anticancer clinical trials. In some of these alteration-driven trials investigating targeted therapies, the association of the sequence result with the proposed mechanism of action or clinical responses is straightforward and well accepted, while in others the alteration and its link with drug efficacy are not as well established. Given that the US Food and Drug Administration (FDA) continues to approve anticancer drugs based on the tumor's site of origin, a careful pathology review is required to assign the correct diagnosis to the sequence results. Although curation for well-known alterations may be relatively straightforward, as the list of anticancer drugs linked to genomic alterations continues to grow, having a skilled curation team capable of searching the current literature and databases is a critical component of NGS reporting.
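The three levels of actionability described above can be expressed as a small decision rule. This is an illustrative sketch only: the tier names, the function, and the lookup structures are hypothetical, the two approved-therapy entries are a toy subset taken from Table 36.2, and the trial entry is a made-up placeholder. It is not a curated knowledge base.

```python
from enum import Enum


class Actionability(Enum):
    APPROVED_IN_TUMOR_TYPE = "approved therapy in this tumor type"
    APPROVED_IN_OTHER_TUMOR_TYPE = "approved therapy in another tumor type (off-label)"
    CLINICAL_TRIAL_CRITERION = "listed in clinical trial entry criteria"
    NOT_ACTIONABLE = "no known action"


def classify(alteration, tumor_type, approved, trial_alterations):
    """Assign an actionability tier to one alteration.

    approved: dict mapping alteration -> set of tumor types with an approved therapy.
    trial_alterations: set of alterations named in trial entry criteria.
    Both inputs are hypothetical structures supplied by the caller.
    """
    tumor_types = approved.get(alteration, set())
    if tumor_type in tumor_types:
        return Actionability.APPROVED_IN_TUMOR_TYPE
    if tumor_types:
        return Actionability.APPROVED_IN_OTHER_TUMOR_TYPE
    if alteration in trial_alterations:
        return Actionability.CLINICAL_TRIAL_CRITERION
    return Actionability.NOT_ACTIONABLE


# Toy data: two rows from Table 36.2 plus a placeholder trial alteration.
APPROVED = {
    "BRAF mutation": {"Melanoma"},
    "EML4-ALK translocation": {"NSCLC"},
}
TRIALS = {"hypothetical alteration X"}

if __name__ == "__main__":
    print(classify("BRAF mutation", "Melanoma", APPROVED, TRIALS).value)
    print(classify("BRAF mutation", "CRC", APPROVED, TRIALS).value)
    print(classify("hypothetical alteration X", "NSCLC", APPROVED, TRIALS).value)
```

Because the first tier depends on the tumor type, the same alteration can fall into different tiers for different diagnoses, which is why the pathology review mentioned above matters.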