The advent of high-throughput omics technologies in the mid-1990s heralded the emergence of a new paradigm of personalized medicine. Researchers could now study dynamic biological systems, including cancers, in novel and comprehensive ways. Collaborations arose among basic scientists, clinical researchers, bioinformaticians, and biostatisticians, resulting in the development of a multitude of omics-based tests for use in oncology. The speed of technological advancement and the ability to generate vast quantities of data initially outpaced the development of standards for appropriate study design and validation. Indeed, some of the earliest molecular-based tests, including tests for breast cancer, were incorporated into clinical trials at a very early stage in their development. Ultimately, concerns raised by statisticians about the validity of the tests and potential harm to patients led to the termination of the trials and highlighted the need for a rigorous methodological framework for the discovery, validation, and, ultimately, translation of omics-based tests into clinical practice.
Omics is a term encompassing multiple molecular disciplines that measure some characteristic of a large family of cellular molecules such as DNAs, RNAs, proteins, lipids, and metabolites. Genomics, for example, refers to the study of genes and their function. Omics-based tests include both an assay that measures the molecules of interest and a computational model that translates the assay measurements into a clinically actionable result. In general, omics research generates complex, high-dimensional data by measuring many more variables (often orders of magnitude more) than there are samples. This results in a high risk that computational models will overfit the data. Overfitting occurs when a statistical model describes random error or noise rather than a true underlying relationship. The development of omics-based tests for clinical use, therefore, requires a carefully designed and strictly executed series of validation studies using independent sample sets.
In response to concerns raised by the premature incorporation of gene expression–based tests into clinical trials, the Institute of Medicine (IOM) convened a special Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials, charged with identifying appropriate evaluation criteria for omics-based tests and their readiness for inclusion in a clinical trial design. The resulting guidelines are summarized in Table 10.1. Within the discovery phase, a computational model is first developed on a training set of samples, and then the fully specified, locked-down computational model is evaluated on an independent test set of samples. Successful omics-based tests are then transferred from the research laboratory to a clinical laboratory for the validation phase. Here the clinical testing method is developed and optimized, followed by analytic validation (establishing the analytic performance characteristics of the test) and clinical validation (confirming that the test correlates with the clinical outcome of interest in an independent sample set). The last stage involves evaluating the clinical utility of the omics-based test by conducting either a randomized clinical trial or at least two prospective-retrospective studies using archived samples from previous randomized clinical trials.
Table 10.1. Steps for Molecular Test Development and Evaluation

Phase | Setting |
---|---|
Discovery phase | Research laboratory |
Test validation phase | CLIA-certified clinical laboratory |
Evaluation of clinical utility and use | |
Since the completion of the Human Genome Project and the introduction of microscale technologies for molecular analysis of gene expression, accelerated developments in the field of high-throughput sequencing have significantly advanced oncological research, with a particularly large impact on clinical application. Within breast cancer, several gene expression signatures have been developed using microarray and real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR) technologies; several of these are discussed later in this chapter.
DNA microarrays for measuring gene expression levels create a global snapshot of a tissue or cell type’s relative gene expression (transcriptome) at a particular point in time (at which the tissue was harvested). Microarray technology identifies differences in gene expression profiles between normal and abnormal tissues and, through comparison of expression profiles of various cancers, permits identification of differences in expression profiles, such as those that correlate with different clinical outcomes or response to a specific therapy.
There are a number of commercially available microarrays, which can be broadly classified using at least three criteria: (1) length/type of probes (long-probe complementary DNA [cDNA] arrays versus short-probe oligonucleotide arrays); (2) manufacturing method (spotted arrays containing deposited spots of previously synthesized probes versus in situ synthesized arrays where oligonucleotide probes are built directly on the chip); and (3) number of samples simultaneously profiled on the array (single-channel versus multichannel arrays). The steps involved in a microarray experiment are summarized in Fig. 10.1.
The sheer amount of data generated from the array requires the use of specific bioinformatics software tools. Normalization of fluorescence signals is performed to account for variations in labeling, hybridization, and scanning methods, and statistical tools are used to determine which changes are considered significant. The different methods of normalization and statistical analysis can result in significant differences in expression results from different laboratories using the same samples, so comparing microarray results across laboratories requires knowledge of the specific methods used. To facilitate this process, the Minimum Information About a Microarray Experiment (MIAME) checklist has been developed; many journals require compliance with it, along with deposition of all experimental data in a public repository (e.g., Gene Expression Omnibus [GEO] or ArrayExpress), as a condition of publication. Gene expression results obtained from microarray analysis are ideally confirmed using another method of expression profiling, with qRT-PCR being the most common method.
Polymerase chain reaction (PCR) is the workhorse technique of all molecular laboratories. qRT-PCR is an extension of standard PCR that permits quantifying gene expression levels within a sample. The substrate is messenger RNA (mRNA), which in the first step is reverse transcribed to cDNA, followed by standard PCR amplification (Fig. 10.2A). An instrument monitors the presence of PCR products in real time while software performs quantitative analysis. Detection methods can be either nonspecific, such as fluorescent dyes (e.g., SYBR Green), or specific, such as DNA probes (e.g., TaqMan assay). The mechanisms of action of these common detection systems are shown in Fig. 10.2B–C. Quantification may be either absolute (using a calibration curve to relate the PCR signal to the amount of starting mRNA) or relative (measuring the relative change in mRNA expression level of the target gene versus a housekeeping or reference gene).
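As an illustration of relative quantification, the sketch below uses the widely used comparative threshold-cycle (2^-ΔΔCt) calculation. The chapter does not name this method; it is shown here as one common way to express a target gene relative to a reference gene. The function name and the Ct values are hypothetical, and the calculation assumes that the amount of PCR product roughly doubles with every cycle.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample versus a control (2^-ddCt).

    Assumption: amplification efficiency is close to 100%, so one cycle
    of difference in Ct corresponds to a twofold difference in template.
    """
    # Normalize the target gene to the reference gene within each specimen.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized sample value with the normalized control value.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: after normalization, the target gene reaches the
# detection threshold two cycles earlier in the tumor than in the control.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(fold_change)  # 4.0
```

A lower Ct means more starting template, which is why the exponent carries a negative sign.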
Although gene expression microarray platforms are better suited to fresh tissue, PCR is an optimal technique for use on formalin-fixed paraffin-embedded tissue (FFPET), the currency of contemporary pathology laboratories.
There are three commonly used study designs for microarray experiments: class discovery, class comparison, and class prediction (Fig. 10.3).
Class discovery is a hypothesis-independent exploratory analysis in which gene expression profiles of a series of unselected tumor samples are analyzed in an unsupervised manner to determine whether genetically distinct molecular subgroups emerge based on the patterns of gene expression. A commonly used data analysis method is hierarchical clustering, which groups the samples based on the similarity in their pattern of gene expression. The relationship between the samples can be graphically represented in a dendrogram (e.g., Figs. 10.3A and 10.4 ), in which the pattern and length of the branches reflect the relatedness of the samples, with shorter branches indicating more closely related gene expression profiles. Whether or not these groupings have clinical significance is determined subsequently.
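A minimal sketch of the clustering step follows. The chapter does not specify a distance measure or linkage rule, so the choices here (1 minus Pearson correlation, average linkage) are assumptions, and the four expression profiles are invented for illustration. Each recorded merge distance corresponds to the height at which two branches join in a dendrogram.

```python
from itertools import combinations
from statistics import mean


def pearson_distance(x, y):
    """1 - Pearson correlation: near 0 for similar patterns, up to 2 for opposite ones."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering; returns merge steps as (cluster_a, cluster_b, distance)."""
    names = list(profiles)
    dist = {frozenset(pair): pearson_distance(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(names, 2)}

    def linkage(pair):
        # Average of all pairwise sample distances between the two clusters.
        a, b = pair
        return mean(dist[frozenset((i, j))] for i in a for j in b)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical expression values for five genes in four tumor samples.
profiles = {
    "T1": [5, 4, 1, 1, 3],
    "T2": [6, 5, 2, 1, 3],
    "T3": [1, 1, 5, 6, 3],
    "T4": [1, 2, 6, 5, 3],
}
for a, b, d in average_linkage(profiles):
    print(a, b, round(d, 3))
```

In this toy data set, T1 pairs with T2 and T3 pairs with T4 at short distances, and the two pairs join only at a much larger distance, which is the pattern a dendrogram would display as two well-separated branches.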
In contrast, class comparison studies are hypothesis driven and start with two or more predefined groups based on clinically meaningful endpoints, such as patients who develop early metastatic disease versus those who do not, or patients who respond to a particular therapy versus those who progress on treatment. The microarray-derived gene expression profiles of the two groups are compared using supervised analysis methods to determine whether there is a genetic basis for the differences in clinical outcome and, if present, to identify which genes or functional gene groups appear to be involved.
Class prediction is similar to class comparison as a hypothesis-driven, supervised analysis; the objective of class prediction, however, is to use the identified gene expression differences between the classes of interest to develop a multigene algorithm (a predictor or gene signature) that can be applied to expression profiles of samples whose class is unknown to predict the class membership of the new sample (e.g., a particular breast cancer subtype or clinical outcome). Because the genes of interest are already identified, class prediction studies often start with a much more limited set of candidate genes. Whereas class discovery and class comparison are both examples of a top-down method of conducting a microarray experiment, class prediction is considered a bottom-up method for microarray study design.
One of the seminal experiments in applying gene expression profiling to breast cancer, and perhaps the most prominent example of a class discovery microarray study, is the work reported by Perou and colleagues, in which they performed unsupervised hierarchical clustering of the gene expression profiles of 65 breast tissue samples. Using cDNA microarrays representing 8,102 human genes, the authors analyzed the gene expression profiles of malignant and benign breast lesions from a cohort of 43 patients (36 invasive ductal carcinomas, 2 invasive lobular carcinomas, 1 ductal carcinoma in situ, 1 fibroadenoma, and 3 normal breast samples, in addition to a number of biological replicates from the same patient). From the data, they defined a set of intrinsic genes comprising genes that showed significantly greater variation between tumors from different patients than between paired tumor samples from the same patient. Hierarchical cluster analysis using this set of intrinsic genes identified four major groupings, or molecular subtypes, of breast cancer: luminal-like, human epidermal growth factor receptor 2 (HER2) positive, basal-like, and normal breast-like. A subsequent study from the same group, using a larger number of tumors and correlation with outcome data, confirmed the presence of the molecular subtypes of breast cancer and, in addition to the original subtypes, showed that the luminal-like tumors could be divided into at least two subgroups: luminal A and luminal B (Fig. 10.4). The authors also demonstrated that the molecular subtypes were associated with very different clinical outcomes. Ensuing studies using additional microarray data sets confirmed the presence and clinical relevance of breast cancer–intrinsic subtypes (Fig. 10.5). Additional details are provided in Chapter 20.
Since the initial description of intrinsic molecular subtypes in breast cancer, it is now firmly established that breast cancer does not represent a single disease process, and luminal A, luminal B, HER2-enriched, and basal-like have become integrated into the clinical realm. Based on their unique complement of genetic derangements, these molecular subtypes exhibit vastly differing biological behaviors. Both HER2-enriched and basal-like tumors are rapidly proliferative and highly aggressive, yet most responsive to chemotherapy (and anti-HER2–targeted therapy in the case of HER2-positive [HER2+] disease). Relapses occur early, with the majority occurring within 5 years of diagnosis. Luminal tumors, conversely, present a much broader range of behaviors, with favorable but chemoresistant and notoriously late-recurring luminal A tumors on the one hand, and aggressive, variably chemoresponsive luminal B tumors on the other.
In clinical practice, the estrogen receptor (ER)–positive (ER+)/HER2-negative (HER2–) cohort of patients (representing luminal tumors) is the most commonly encountered and often represents the most challenging treatment scenario. Many women within this category are, in fact, overtreated and subjected to the morbidity of cytotoxic chemotherapy for negligible benefit. Identifying those tumors with more aggressive biology that stand to benefit from the addition of chemotherapy, as opposed to those that are adequately treated by endocrine therapy alone, has been the clinical impetus for the development of several gene expression assays and is the current indication in which these assays play a role in clinical decision-making. Despite the multitude of reported prognostic signatures, only a minority of these assays have entered clinical practice (Table 10.2).
Test | Company | Specimen type | Training sample | Number of genes | Output | Indicated patient population | FDA clearance | Guidelines incorporating test | Prospective randomized trial |
---|---|---|---|---|---|---|---|---|---|
MammaPrint | Agendia (Amsterdam, the Netherlands) | Fresh or FFPE | Microarray data from 78 patients (<55 years, node negative, tumor <5 cm, ER+/–, HER2+/–, most without systemic therapy) | 70 | High risk or low risk for 10-year distant recurrence | FDA cleared for women of all ages diagnosed with stage I or II invasive breast cancer, tumor size ≤5.0 cm, ER+/–, HER2+/–, lymph node negative | Yes | | MINDACT |
Oncotype DX | Genomic Health (Redwood City, Calif.) | FFPE | RT-PCR data from 447 FFPE samples from 3 clinical trials (most heavily weighted ER+, HER2+/–, node negative, tam treated; also used node positive, ER+/–, tam treated or chemo treated) | 16 prognostic genes + 5 reference genes | Recurrence score: low, intermediate, or high risk for 10-year distant recurrence | Newly diagnosed breast cancer, stage I–IIIa, ER+, HER2–, node negative or 1–3 positive nodes | No | | |
Prosigna (PAM50) | Veracyte (South San Francisco, Calif.) | FFPE or fresh | For original PAM50 algorithm: | 50 tumor-related genes + 8 reference genes | Risk of recurrence score: low, intermediate, or high risk for 10-year distant recurrence | | Yes | | Embedded correlative science substudy in RxPONDER trial |
MapQuant Dx Genomic grade index | QIAGEN (Marseille, France) | Fresh and FFPE version | Microarray data from 64 ER+ tumor samples (33 histological grade 1 tumors and 31 histological grade 3 tumors) | 97 | High genomic grade vs. low genomic grade; plus equivocal category for clinical test | Patients with histological grade 2 breast cancer | No | None | |
EndoPredict | Sividon Diagnostics GmbH (Köln, Germany) | FFPE | Microarray data and FFPET from 964 tumors from 6 different patient cohorts (ER+, HER2–, tam treated) | 8 cancer genes + 3 control genes | EP and EPclin low risk and high risk for 10-year distant recurrence | Newly diagnosed breast cancer, ER+, HER2–, node negative or 1–3 nodes positive | No | | None |
Breast Cancer Index | Biotheranostics (San Diego, Calif.) | FFPE | | 72-gene H/I ratio + 5-gene molecular grade index | | Patients with newly diagnosed ER+ node-negative breast cancer, and patients who are recurrence free after an initial 5 years of adjuvant endocrine therapy (to assess benefit of extended hormonal therapy) | No | | None |
The first successful breast cancer prognostic signature developed using a top-down approach was the 70-gene signature by van ’t Veer et al. The signature was trained using archived, fresh frozen breast cancer specimens from a cohort of 78 predominantly systemically untreated patients, all younger than 55 years with tumors less than 5 cm, negative nodal status, and a mix of positive and negative ER and HER2 status (Fig. 10.6A). The samples were divided into two groups: those that developed metastatic disease within 5 years and those that remained metastasis-free for at least 5 years. Supervised analysis of approximately 25,000 genes (Agilent oligonucleotide Hu25K microarray, Agilent Technologies, Santa Clara, Calif.) identified approximately 5,000 genes that were significantly regulated, of which 231 genes were significantly correlated with outcome between the two groups. These 231 genes were rank-ordered based on the magnitude of their correlation coefficient. The prognostic classifier was optimized by sequentially adding genes, five at a time, from the top of the list and, at each step, evaluating the accuracy of the classifier using the “leave-one-out” method of cross-validation. The best prognostic accuracy was achieved with 70 genes, which form the basis of the 70-gene signature, subsequently commercialized as the MammaPrint assay (Agendia, Amsterdam, the Netherlands), which stratifies patients as being at high risk or low risk for early metastatic recurrence. Because of the potential for withholding adjuvant chemotherapy from the good prognosis group, the authors replaced their optimal accuracy threshold (the most accurate cutoff point for classifying tumors to the correct outcome group) with an optimized sensitivity threshold, set so that no more than 10% of poor prognosis tumors would be misclassified to the good prognosis group (Fig. 10.6B).
The 70-gene signature was then tested in a cohort of 19 additional patients and correctly classified 17 of the 19 tumors, which was taken as validation of the robustness of the 70-gene prognostic classifier.
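The sensitivity-oriented cutoff described above can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes each tumor has a single classifier score in which higher values look more like the good prognosis group, and the function name and scores are hypothetical.

```python
def sensitivity_threshold(poor_scores, max_miss_rate=0.10):
    """Lowest cutoff that lets at most max_miss_rate of poor-prognosis tumors
    score above it (tumors scoring above the cutoff are called good prognosis).
    """
    # Number of poor-prognosis tumors that may be misclassified as good prognosis.
    # The small epsilon guards against floating-point rounding (e.g., 0.1 * 30).
    allowed = int(max_miss_rate * len(poor_scores) + 1e-9)
    ranked = sorted(poor_scores, reverse=True)
    # Only the `allowed` highest scores lie strictly above this value.
    return ranked[allowed]


# Hypothetical scores for 10 tumors known to have metastasized within 5 years.
poor_scores = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.70]
cutoff = sensitivity_threshold(poor_scores)
missed = sum(score > cutoff for score in poor_scores)
print(cutoff, missed)  # 0.55 1
```

Moving the cutoff this way trades accuracy for sensitivity: more good prognosis tumors are labeled high risk so that fewer poor prognosis tumors are labeled low risk.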
There are some points to consider regarding the development of the 70-gene signature. As stressed throughout this chapter, because omics-based data sets are composed of an extremely large number of molecular measurements relative to a small number of samples, overfitting of the data is a major concern. In the case of the 70-gene signature, the subset of 231 genes with the highest correlation with clinical outcome was selected from the original 25,000 genes using all samples, and only then was cross-validation performed to select the final 70 genes. This type of incomplete cross-validation can lead to significant overfitting, resulting in an omics-based test that performs well in cross-validation but results in much less discriminatory power on subsequent patient samples. The small sample size of both the training and the test sets is also worth mentioning, and certainly increased the risk of overfitting.
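The effect of incomplete cross-validation can be demonstrated with a small simulation. The sketch below is not a reconstruction of the 70-gene analysis: it uses random numbers with no real relationship to the class labels, a simple nearest-centroid classifier, and hypothetical sizes (20 samples, 1,000 genes, 10 selected genes). Selecting genes on all samples before leave-one-out cross-validation typically reports a high accuracy on such noise, whereas repeating the selection inside every fold typically reports an accuracy near chance.

```python
import random


def top_features(X, y, k):
    """Indices of the k genes with the largest absolute difference in class means."""
    def gap(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]


def predict(X_train, y_train, x, features):
    """Nearest-centroid prediction using only the selected genes."""
    def sq_dist(label):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        return sum((x[j] - sum(r[j] for r in rows) / len(rows)) ** 2
                   for j in features)
    return 1 if sq_dist(1) < sq_dist(0) else 0


def loocv_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy with gene selection inside or outside the folds."""
    preselected = None if select_inside_fold else top_features(X, y, k)
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = top_features(X_tr, y_tr, k) if select_inside_fold else preselected
        hits += predict(X_tr, y_tr, X[i], feats) == y[i]
    return hits / len(X)


rng = random.Random(0)
n_samples, n_genes = 20, 1000
# Pure noise: expression values are independent of the class labels.
X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
y = [0, 1] * (n_samples // 2)

incomplete = loocv_accuracy(X, y, k=10, select_inside_fold=False)
complete = loocv_accuracy(X, y, k=10, select_inside_fold=True)
print("selection on all samples:", incomplete)
print("selection inside each fold:", complete)
```

Because the data contain no true signal, any accuracy well above 50% in the first estimate is an artifact of letting the left-out sample influence gene selection.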
Analytic validation of the MammaPrint assay confirms the reproducibility and precision of the test, with a reported maximum variation of 5% in multiple samplings of the same tissue. Clinical validity of the 70-gene signature was tested on a retrospective cohort of primary breast cancer samples from 295 patients (Netherlands Cancer Institute [NKI] data set), including 151 node-negative samples, 61 of which were used in the training of the signature. The study confirmed that the 70-gene signature was an independent predictor of outcome when included in a multivariate survival model together with clinicopathological parameters and therapy ( Fig. 10.7 ), and that it could stratify the prognostic subgroups defined by the St. Gallen and National Institutes of Health (NIH) criteria.
The validation sample, however, was not entirely independent, as the study did include 61 patients used in the training of the predictor, thereby overestimating the discriminatory power of the test. This was well demonstrated in the following validation study that used completely independent samples from node-negative, systemically untreated patients from the TRANSBIG consortium of European cancer centers. The authors also included the 151 node-negative patients from the aforementioned NKI mixed training/validation study as a comparison. The independent samples showed an adjusted odds ratio for time to distant metastasis in the MammaPrint high-risk group of 2.1 as compared with a 6.1 odds ratio in the mixed training/validation cohort; and the odds ratio for overall survival was 2.6 in the independent samples versus 17.5 in the mixed training/validation cohort ( Fig. 10.8 ). This highlights the inflation of discriminatory power seen when one includes training samples into the validation set and points to overfitting of the signature. Length of follow-up likely also contributed to the discrepant odds ratios between the two studies, as the 70-gene signature was shown to be highly time dependent, with better discriminatory power for shorter follow-up times; the median length of follow-up for the mixed training/validation NKI cohort was 6.7 years versus 13.6 years in the independent validation study. The independent TRANSBIG samples did validate MammaPrint to be prognostic, however, and to better risk stratify patients than Adjuvant! Online. (Adjuvant! Online was a popular online decision-making tool commonly used by oncologists to quantify the risk and benefit of adjuvant systemic therapy in a particular breast cancer patient based on clinicopathological factors. It was developed in 2001 and was shut down for updates in 2015. There is currently no information about a relaunch of the website.) 
The MammaPrint assay has further been validated to be prognostic in several additional retrospective cohort studies. Notably, the requirement for fresh tissue precluded development or validation using homogeneously treated clinical trial samples, which are preferable to heterogeneously treated convenience samples, and also prevented the types of prospective-retrospective studies using archived clinical trial material that are a proposed alternative route to generating level 1B evidence and have been used by the newer generation of gene expression signatures.
MammaPrint was initially evaluated in a prospective observational study: the microarRAy prognoSTics in breast cancER (RASTER) study, which was conducted in 16 community hospitals in the Netherlands to assess the feasibility of implementing MammaPrint in a community-based setting and to study the impact of the test result on clinical decision-making with respect to use of adjuvant systemic therapy. At 5 years of follow-up, the distant recurrence–free interval (DRFI) in patients with a MammaPrint low-risk and Adjuvant! Online high-risk classification (n = 124) was 98.4%; 76% of these patients had not received adjuvant chemotherapy. These findings demonstrate that MammaPrint adds prognostic information beyond standard clinicopathological factors and laid the foundation for the international, prospective, randomized phase III MINDACT clinical trial (Microarray In Node-negative and 1 to 3 positive lymph node Disease may Avoid ChemoTherapy). The trial recruited 6,693 patients who were evaluated by both Adjuvant! Online and the 70-gene signature. Patients characterized as low risk in both assessments did not receive chemotherapy, whereas for patients characterized as high risk in both assessments, chemotherapy was advised. Patients with discordant results were randomized to use either the Adjuvant! Online or the 70-gene signature risk classification for treatment decision-making (Fig. 10.9). Five-year outcome data have shown that patients who were classified as low risk by both assessments (n = 2,745) had 5-year distant metastasis–free survival (DMFS) of 97.6% without use of chemotherapy, compared with 90.6% for patients who were classified as high risk by both assessments (n = 1,806) and received chemotherapy. Within the discordant group (n = 2,142), 592 patients were low risk by Adjuvant! Online but high risk by MammaPrint, and 1,550 patients were high risk by Adjuvant! Online but low risk by MammaPrint.
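The allocation logic of the trial, as described above, can be summarized in a few lines. The function name and return strings are illustrative only; the group sizes are those reported in the text, and the final check confirms that they add up to the 6,693 enrolled patients.

```python
def mindact_strategy(clinical_risk, genomic_risk):
    """Treatment strategy for a (clinical, genomic) risk pair, each 'low' or 'high'."""
    if clinical_risk == genomic_risk == "low":
        return "no chemotherapy"
    if clinical_risk == genomic_risk == "high":
        return "chemotherapy advised"
    # Discordant results: randomized to follow either the clinical
    # (Adjuvant! Online) or the genomic (70-gene signature) classification.
    return "randomize between clinical and genomic classification"


# Group sizes reported in the text, keyed by (clinical risk, genomic risk).
group_sizes = {
    ("low", "low"): 2745,
    ("high", "high"): 1806,
    ("low", "high"): 592,
    ("high", "low"): 1550,
}

discordant = group_sizes[("low", "high")] + group_sizes[("high", "low")]
print(discordant)                  # 2142
print(sum(group_sizes.values()))   # 6693
```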
The primary statistical analysis for the trial was based on the subset of this latter cohort (who were high risk based on clinicopathological factors but low risk by molecular profile) who did not receive chemotherapy (n = 644). The 5-year DMFS in this group was 94.7% (95% confidence interval [CI] = 92.5%–96.2%), which met the trial definition of a successful result (prespecified as a 5-year DMFS greater than 92%). Unfortunately, MINDACT was not powered to address the question of whether chemotherapy benefited the patients in the discordant groups. In the intent-to-treat analysis, for the Adjuvant! Online high/MammaPrint low group, 5-year DMFS was 1.5 percentage points higher with chemotherapy (95.9%) than without (94.4%) (hazard ratio [HR] = 0.78, P = .27). For the Adjuvant! Online low/MammaPrint high group, DMFS rates with and without chemotherapy were 95.8% and 95.0%, respectively (HR = 1.17, P = .66). Disease-free and overall survival rates showed similar nonsignificant trends. Overall, however, survival was excellent within the discordant groups, regardless of whether chemotherapy was administered, and any benefit from chemotherapy, if real, was modest at best. Therefore, in the context of an Adjuvant! Online low-risk result, there is no proven added benefit from a MammaPrint assay. Within the Adjuvant! Online high-risk group, 46% of patients in the trial had a MammaPrint low-risk result, which could translate into a significant reduction in chemotherapy prescriptions. Ultimately, the trade-off between a possible small benefit from chemotherapy and the toxicity of chemotherapy remains in the hands of the individual patient and clinician.
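The figures in this paragraph can be checked with simple arithmetic. The sketch below uses only numbers reported in the text; the variable names are illustrative. Comparing the lower confidence bound, as well as the point estimate, with the 92% threshold is an assumption made here as the stricter reading of the success criterion.

```python
# Adjuvant! Online high-risk patients, split by MammaPrint result (from the text).
high_clinical_low_genomic = 1550
high_clinical_high_genomic = 1806
share_low_genomic = high_clinical_low_genomic / (
    high_clinical_low_genomic + high_clinical_high_genomic
)
print(f"{share_low_genomic:.0%}")  # 46%

# Primary analysis: 5-year DMFS without chemotherapy (n = 644).
dmfs, ci_low, ci_high = 94.7, 92.5, 96.2
threshold = 92.0
# Assumption: success is checked on both the estimate and the lower CI bound.
primary_endpoint_met = dmfs > threshold and ci_low > threshold
print(primary_endpoint_met)  # True

# Intent-to-treat comparison in the clinical-high/genomic-low group.
dmfs_with_chemo, dmfs_without_chemo = 95.9, 94.4
absolute_difference = round(dmfs_with_chemo - dmfs_without_chemo, 1)
print(absolute_difference)  # 1.5
```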
Another point to consider is that the 70-gene signature is known to classify nearly all ER-negative (ER–) patients as high risk (96%–100% of patients in prior studies and 96% of the ER/progesterone receptor-negative [PgR–] tumors in the MINDACT trial). However, so does Adjuvant! Online. Consequently, most (96%) of the patients randomized in the MINDACT trial had ER+ tumors because very few discrepancies in classification would be expected for the ER– group. In fact, most (81%) of the patients enrolled in the trial had ER+, HER2– tumors due to inherent enrollment bias (i.e., both the patient and oncologist had to be comfortable with the possibility of withholding chemotherapy in the event of a low-risk result). It may have been beneficial, therefore, to have focused the development and validation of the gene signature on ER+ patients who were receiving endocrine therapy (as was the case for Oncotype DX and EndoPredict [discussed later]). In the MINDACT trial, approximately 5% of tumors classified as low risk by MammaPrint were HER2+. It should also be noted that in trastuzumab-naïve patients, 2% to 22% of HER2+ breast cancers have been shown to have a good-prognosis 70-gene signature; withholding chemotherapy and anti-HER2 agents in this group remains controversial, however, although it may be considered an appropriate option in patients older than 70 years of age.
A preplanned analysis of the MINDACT trial with an updated 8.7 years of follow-up has further reaffirmed the utility of MammaPrint within the group of patients (n = 644) with high clinical risk/low genomic risk, showing that therapeutic de-escalation with omission of chemotherapy yielded an excellent 5-year DMFS of 95.1% (95% CI = 93.1%–96.6%) irrespective of nodal status. The underpowered exploratory analysis by age showed that this gain was seen in patients older than 50 years of age. In patients younger than 50 years, a 5% benefit from the addition of chemotherapy was attributed to chemotherapy-induced suppression of ovarian function. Further studies have shown that the use of MammaPrint affects recommendations for therapeutic escalation or de-escalation and increases the physician’s confidence in making such treatment decisions.
Following the results of the MINDACT trial, a level of evidence (LoE) and grade of recommendation (GoR) of IA have been achieved for this prognostic signature. MammaPrint has been cleared by the US Food and Drug Administration (FDA) and carries the Conformité Européenne (CE) mark. It is performed in the company’s two central laboratories: one in the United States and one in the Netherlands. Originally developed for fresh tissue, MammaPrint received FDA 510(k) clearance in 2007 as an in vitro diagnostic multivariate index assay (IVDMIA). More recently, the assay has been adapted to FFPET, and the FFPE version has also received FDA 510(k) clearance. FDA-cleared indications include use as a prognostic test for women younger than 61 years of age with lymph node–negative, stage I or II invasive breast carcinoma, with tumor size 5.0 cm or smaller and any ER and HER2 status. The use of MammaPrint is recommended in recent guidelines from the American Society of Clinical Oncology (ASCO), National Comprehensive Cancer Network (NCCN), and European Society for Medical Oncology (ESMO) for consideration in hormone receptor–positive patients with high clinical risk (irrespective of nodal status) to identify good-prognosis tumors for which the benefit of chemotherapy is limited.
Recognizing the clinically relevant prognostic information contained within the biology of intrinsic molecular subtypes, Agendia has expanded its breast cancer assays to include molecular subtyping (BluePrint) as well as providing quantitative ER, PgR, and HER2 mRNA expression levels by microarray (TargetPrint). The BluePrint/MammaPrint assay allows for functional molecular subtype classification, which has shown better correlation with treatment response to neoadjuvant therapy than subtype based on standard ER, PgR, and HER2 assessment alone. Most recently, the company has developed kits to allow the MammaPrint and BluePrint tests to be performed on-site at reference laboratories that successfully complete the Agendia Partner Reference Lab certification process. Both BluePrint and TargetPrint are laboratory-developed tests, and neither is part of the FDA clearance for MammaPrint. The American Society of Clinical Oncology and College of American Pathologists (ASCO/CAP) 2013 guidelines reiterate that microarray and gene expression platforms are currently unsuitable for clinical HER2 testing; similarly, hormone receptor testing via gene expression has yet to be clinically validated for directing treatment decisions.
Oncotype DX (Genomic Health, Redwood City, Calif.) differs from its predecessors in several important respects. It is a qRT-PCR test that is performed on FFPE tumor specimens. It has opened the door to permitting use of an incredibly valuable asset: archival paraffin tumor blocks from previous clinical trials. In addition, the Oncotype DX signature was developed using a purely bottom-up approach. Initially, 250 candidate genes were selected from the published literature, genomic databases, and previous DNA microarray studies, including intrinsic subtypes and the 70-gene signature. Corresponding primer sets were created and qRT-PCR was used to generate quantitative expression levels of the genes from 447 FFPE samples from three separate clinical studies, including 233 samples from the tamoxifen-only arm of the National Surgical Adjuvant Breast and Bowel Project B-20 (NSABP B-20) trial. These latter samples, corresponding to ER+, node-negative tumors from tamoxifen-treated women, represented the most relevant patient population and were most heavily weighted in anticipation of validating on the similar NSABP B-14 trial. Genes were selected based on their correlation with recurrence across the studies as well as the consistency of the primer pair performance. The resulting algorithm is a 21-gene signature (16 prognostic genes plus 5 reference genes), which generates a 0–100 range of recurrence score (RS) that classifies patients as low, intermediate, or high risk of recurrence.
The 21-gene signature was clinically validated in a completely independent sample set: patients from the tamoxifen arm of the NSABP B-14 trial (a tamoxifen versus placebo trial for ER+ breast cancers). The study was conducted as a rigorous prospective-retrospective study, with a locked-down computational model and predefined statistical analysis plan including prespecified outcome endpoints and cutoff points for RS. The study confirmed the ability of the signature to distinguish prognostically distinct groups based on risk group assignment. In this study, the rates of distant recurrence were 6.8%, 14.3%, and 30.5% for the low- (RS <18), intermediate- (RS ≥18 and <31), and high-risk groups (RS ≥31), respectively.
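The risk-group assignment used in this validation study can be illustrated with a minimal sketch. It assumes only the RS cutoff points quoted above (low: RS <18; intermediate: RS ≥18 and <31; high: RS ≥31). The function name `rs_risk_group` is hypothetical and is not part of the Oncotype DX assay, and the sketch does not reproduce how the RS itself is calculated from gene expression.

```python
def rs_risk_group(rs: float) -> str:
    """Map a recurrence score (RS, 0-100) to the risk group defined by the
    cutoff points quoted in the text for the NSABP B-14 validation study.

    Illustrative only: the function name and interface are assumptions.
    """
    if not 0 <= rs <= 100:
        raise ValueError("recurrence score must lie in the 0-100 range")
    if rs < 18:
        return "low"           # RS < 18
    if rs < 31:
        return "intermediate"  # RS >= 18 and < 31
    return "high"              # RS >= 31


if __name__ == "__main__":
    for score in (10, 18, 30, 31):
        print(score, rs_risk_group(score))
```

Because the cutoff points were prespecified before the validation samples were analyzed, a fixed mapping of this kind is what "locked down" means in practice: nothing in the classification is tuned on the validation set.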
The second Oncotype DX validation study, designed to assess whether the signature predicts benefit from chemotherapy, was performed using specimens from the NSABP B-20 trial (which randomized women with ER+ breast cancer to tamoxifen only versus tamoxifen plus chemotherapy). The results demonstrated that women assigned a low-risk RS showed no statistically significant difference in distant relapse–free survival (DRFS) with the addition of chemotherapy, whereas within the high-risk group, women treated with chemotherapy had significantly improved DRFS compared with the tamoxifen-only arm (Fig. 10.10). Once again, however, this is an example of a pervasive methodological flaw in earlier validation studies: mixing of training and validation sets. Samples from the NSABP B-20 tamoxifen-only arm were used (and most heavily weighted) in the training of the algorithm. Although it is no surprise that highly proliferative tumors would respond most to chemotherapy (RS most heavily weighs proliferation-associated genes), including training samples in the validation set inflates both the benefit seen with the addition of chemotherapy in this group and the apparent predictive capacity of the Oncotype DX risk stratification. Furthermore, the study included HER2+ tumors, which are not part of the current clinical indication. (At the time, the HER2 story was still unfolding, and trastuzumab had not yet been approved outside the metastatic setting.) In the clinical setting, where HER2+ tumors are excluded from Oncotype DX testing, the proportion of cases falling into the high-risk group is significantly lower than reported within the original NSABP B-14 and B-20 trial cohorts (both of which included HER2+ tumors). One study reported that only 10% of clinical tumor samples were classified as high risk (versus 25% in the validation studies), with a proportional rise in the percentage of cases classified as intermediate RS (40% of clinical samples versus 20% in the validation studies).
An exploratory analysis of the NSABP B-20 study has recently addressed this controversial issue by eliminating HER2+ tumors (defined by a cutoff of ≥11.5 as determined by reverse-transcriptase polymerase chain reaction [RT-PCR]). With the reduced number of events after exclusion of HER2+ tumors, the benefit from adding chemotherapy to tamoxifen versus tamoxifen alone was not statistically significant in the overall NSABP B-20 cohort (DRFS: 93% vs. 90%; P = 0.06); however, a significant benefit from the addition of chemotherapy to tamoxifen was maintained in tumors with an RS ≥31 (HR = 0.18; 95% CI: 0.07–0.47; P <0.001).
Despite these initial methodological shortcomings, Oncotype DX has been successfully clinically validated as a prognostic assay in numerous studies. Analytic validity of the assay has also been demonstrated. Currently, Oncotype DX is the most widely used of the prognostic breast cancer molecular tests in clinical practice. The typical indication for Oncotype DX is a patient with a node-negative, ER+, HER2– tumor for whom the benefit of adjuvant chemotherapy is in question. The role of Oncotype DX in node-positive patients continues to be defined. One of the initial studies used a prospective-retrospective design and archived tumor material from a randomized clinical trial (the Southwest Oncology Group [SWOG] S8814 trial) in postmenopausal women with axillary lymph node–positive, ER+ breast cancer. The results showed that patients with a high RS benefited from the addition of chemotherapy to tamoxifen, whereas patients with a low RS did not show significant benefit. Because HER2+ tumors were not excluded from the analysis, the performance of the assay in the relevant ER+, HER2– population is unclear. A subsequent combined analysis of five studies (including the SWOG S8814 trial) comprising more than 9,000 node-positive patients treated with endocrine therapy alone showed that a low RS reliably identified patients with limited nodal metastasis and a favorable prognosis, although long-term follow-up is awaited.
The benefit of adding chemotherapy to endocrine therapy within the intermediate RS group has been evaluated in two prospective randomized trials: TAILORx and RxPONDER. In the TAILORx trial (Fig. 10.11A), patients with node-negative, hormone receptor–positive, HER2– breast cancer underwent Oncotype DX testing. Women with an RS less than 11 received hormonal therapy, women with an RS greater than 25 received chemotherapy in addition to hormonal therapy, and women in the middle range (11–25) were randomized to chemotherapy plus hormonal therapy versus hormonal therapy alone. These cutoffs differ from the current risk group cutoff points, where the clinical intermediate score is 18 to 30. The study recruited 10,253 eligible women. Initial 5-year outcome results for the low-risk (RS of 0–10) group, which comprised 15.9% of the eligible patient population, showed that after endocrine therapy alone, the 5-year distant recurrence–free rate was 99.3%, freedom from any recurrence was 98.8%, and overall survival was 98%. Although widely anticipated, this is an important result. These data confirm that these very low-risk women are safely treated by endocrine therapy alone and that there is a negligible role for the addition of chemotherapy. More important are the results for the 67.3% of patients who fell into the midrange RS of 11 to 25 and were randomized to chemoendocrine therapy versus endocrine therapy alone. These results were recently reported at 9-year follow-up and showed that endocrine therapy was noninferior to chemoendocrine therapy for invasive disease–free survival (83.3% vs. 84.3%), distant recurrence–free survival (94.5% vs. 95%), and freedom from any recurrence (92.2% vs. 92.9%). Within the high RS range (26–100), where chemotherapy in addition to endocrine therapy was recommended for all women, the estimated 5-year freedom from distant recurrence was 93%, which is better than expected with endocrine therapy alone.
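The RS-based treatment assignment in TAILORx, as described above, can be sketched as a simple mapping. This is an illustration under stated assumptions, not the trial's actual randomization procedure: the function name `tailorx_assignment` and the returned labels are hypothetical, and only the RS ranges given in the text (0–10, 11–25, 26–100) are used.

```python
def tailorx_assignment(rs: int) -> str:
    """Return the TAILORx treatment assignment for an integer recurrence
    score, using the RS ranges quoted in the text.

    Illustrative only: names and labels are assumptions, and eligibility
    criteria (node-negative, hormone receptor-positive, HER2-negative)
    are taken as already satisfied.
    """
    if not 0 <= rs <= 100:
        raise ValueError("recurrence score must lie in the 0-100 range")
    if rs <= 10:
        # Low range (0-10): hormonal (endocrine) therapy alone
        return "endocrine therapy alone"
    if rs <= 25:
        # Middle range (11-25): randomized between the two strategies
        return "randomized: endocrine therapy alone vs chemoendocrine therapy"
    # High range (26-100): chemotherapy in addition to endocrine therapy
    return "chemoendocrine therapy"


if __name__ == "__main__":
    for score in (5, 11, 25, 26):
        print(score, "->", tailorx_assignment(score))
```

Note that an RS of 20 falls in the randomized middle range here, whereas under the clinical cutoff points (18 to 30) the same score would be labeled intermediate risk; the two schemes should not be used interchangeably.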
RxPONDER (Fig. 10.11B) is an ongoing prospective randomized clinical trial of more than 5,000 eligible women, designed to investigate chemotherapy benefit in hormone receptor–positive/HER2– breast cancer with 1 to 3 positive nodes and subjected to Oncotype DX testing. Those with an RS of 0 to 25 are randomized to endocrine therapy alone versus chemotherapy plus endocrine therapy. The results of the interim analysis demonstrate that adjuvant therapy can be safely de-escalated to endocrine therapy alone in postmenopausal women with limited node-positive disease, as no benefit is derived from the addition of chemotherapy. In contrast, among premenopausal women, the addition of chemotherapy reduced the hazard of an invasive event by 46% regardless of clinicopathological variables and yielded superior 5-year overall survival compared with endocrine therapy alone. These findings could possibly be attributed to chemotherapy-induced ovarian suppression, as previously reported in the SOFT/TEXT trial; however, because information on chemotherapy-induced amenorrhea was not collected, this remains speculative. Long-term results with 15-year follow-up data will be reported at a later date.
Note that, unlike the MINDACT trial, neither of these studies compares outcomes when treatment decisions are guided by Oncotype DX with outcomes when decisions are based on traditional clinicopathological variables alone, a comparison widely regarded as the defining feature of a clinical utility study.