Biomarkers for Assessing Risk of Cancer


Cancer is a complex disease involving both environmental and genetic determinants. The majority of cancers are caused by environmental and lifestyle factors, including smoking, alcohol use, infectious agents, occupation, diet, obesity, and lack of physical activity. However, only a small number of cancers, such as lung cancer (smoking) and cervical cancer (human papillomavirus [HPV] infection), have a major environmental risk factor that accounts for the bulk of disease in the population. For most cancers, the exposures have weak effects, and how to better assess the exposures and their effects remains a challenge. On the other hand, even for those cancers with a predominant environmental risk factor, only a small fraction of exposed individuals develop the cancer. For example, it is estimated that only 1 in 10 smokers develops lung cancer, and the rate of developing cervical cancer in high-risk carriers of HPV is even lower. Our understanding of the etiology of cancer in terms of environmental factors and genetic susceptibility is still rather limited, and the interplay among these etiological constituents is poorly understood. In recent years, biomarkers have been playing an increasingly important role in cancer etiological study to help better determine exposures, evaluate effects, and assess susceptibility.

Figure 21-1 shows the spectrum of biomarkers along the continuum of carcinogenic process from environmental exposure to cancer development. Biomarkers of cancer etiology can be classified into three broad categories: biomarkers of exposure, biomarkers of effect, and biomarkers of susceptibility. Biomarkers of exposure indicate the presence of a carcinogenic compound or its biological interactions with cellular molecules, which can further be categorized into biomarkers of internal dose (e.g., carcinogens or their metabolites in bodily fluids, and circulating antibodies to infectious agents) and biomarkers of biologically effective dose (e.g., DNA and protein adducts). Biomarkers of effect are biological indicators of the body’s response to exposure and reflect the interaction between exposure and the human body. Biomarkers of effect encompass a broad array of early biological responses and altered structure and function of cells and tissues, such as chromosomal instability, gene mutation, epigenetic alteration (DNA methylation, histone modification, chromatin remodeling, and microRNA expression), changes in mRNA transcription and protein expression, and altered cell structure and function. Biomarkers of susceptibility (cancer risk) can be derived from each of the steps along the continuum of the carcinogenic process, reflecting interindividual variations in absorbing, distributing, metabolizing (activating and detoxifying), and excreting carcinogens; sensitivity to formation of macromolecule (DNA and protein) adducts; and ability to repair these adducts and eliminate damaged and premalignant cells. This categorization, however, is somewhat arbitrary, and the distinction among the three categories of biomarkers could be blurred. 
For instance, some chromosome instability biomarkers (e.g., micronuclei and chromosome aberrations) can be considered markers of effect as well as markers of susceptibility, because these markers not only indicate the early biological effect of exposure, but also reflect an individual’s ability to metabolize carcinogens and repair DNA damage. Moreover, these markers have certain degrees of genetic heritability and can predict the risk of cancer independent of exposure level. Therefore, the interplay among these three categories of biomarkers is crucial for a complete understanding of cancer etiology. In the next few sections, we highlight some examples of biomarkers of exposure and biomarkers of effect. We also describe in more detail our current knowledge of biomarkers of cancer susceptibility.

Figure 21-1
Biomarkers in cancer etiology
Biomarkers in cancer etiologic studies can be classified into three broad categories: biomarkers of exposure, biomarkers of effect, and biomarkers of susceptibility. Biomarkers of susceptibility (cancer risk) can be derived from each of the steps along the continuum of the carcinogenic process, reflecting interindividual variations in absorption, distribution, metabolism (activating and detoxifying), and excretion of carcinogens; sensitivity to formation of macromolecule (DNA and protein) adducts; and ability to repair macromolecule damage, restore normal cellular functions, and eliminate premalignant cells.

Biomarkers of Exposure

Biomarkers of exposure can provide the most direct evidence of human exposure to a carcinogen as well as internal dose and biologically effective dose, affirm exposure-cancer associations established from traditional epidemiological approaches, and provide biological plausibility for the observed exposure-cancer association.

Biomarkers of Internal Dose

Biomarkers of internal dose measure levels of a carcinogen or its metabolite in human tissues, bodily fluids, and excreta. These biomarkers are not bound to cellular targets but provide a measure of exposure, absorption, metabolism, and excretion. There is generally a good correlation between external exposure and internal dose; however, the involvement of absorption and metabolism and interindividual variation in these processes suggest that the relationship may not always be simple. One of the classical examples of validated biomarkers of internal dose that contributed greatly to elucidating the environmental cause of human cancer is urinary aflatoxin and its metabolites. Aflatoxins have long been suspected to be human hepatic carcinogens, but the strongest evidence came from a prospective nested case-control study in which the authors measured urinary aflatoxin B1 (AFB1), its metabolites AFP1 and AFM1, and DNA adducts (AFB1-N7-Guanine) to assess the relation between aflatoxin exposure and liver cancer. Subjects with liver cancer were more likely to have detectable concentrations of any of the aflatoxin metabolites than controls, and the highest relative risk was for AFP1 (6.2-fold). Moreover, there was a strong interaction between chronic hepatitis B infection and aflatoxin exposure in liver cancer risk. Tobacco-specific metabolites are the most studied biomarkers of internal dose. Cotinine is the main metabolite of nicotine, and the measurement of serum/plasma cotinine offers higher accuracy than self-reports in assessing tobacco smoking. Two prospective studies have reported that higher levels of serum cotinine and urinary cotinine were associated with higher risk of lung cancer. The risk estimate for tobacco smoking and lung cancer derived from serum cotinine might be stronger than that from questionnaire-based studies. 
The highest risk group had a 55-fold increased risk with no clear suggestion of a plateau in risk at high exposure levels, suggesting that analyzing the relationship between serum cotinine and lung cancer risk might contribute to a better quantitative assessment of tobacco-related lung carcinogenesis. Additional prediagnostic urinary metabolites of tobacco carcinogens, such as NNAL (a metabolite of tobacco carcinogen NNK) and PheT (a metabolite of tobacco carcinogen polycyclic aromatic hydrocarbons [PAHs]), have also been associated with increased risk of lung cancer. Other examples of biomarkers of internal dose include hormones or nutrients in body fluids and circulating antibodies to infectious agents (e.g., Helicobacter pylori in gastric cancer, hepatitis B virus [HBV] in liver cancer, and HPV in cervical cancer).
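The fold-increases quoted above are relative risk estimates obtained by comparing biomarker status between cases and controls. As a minimal sketch of that arithmetic, the Python below computes an odds ratio with a Wald 95% confidence interval from a 2×2 table. The function name and the counts are hypothetical and are not taken from any of the studies described here; an unmatched analysis is assumed, whereas real nested case-control studies are usually matched and analyzed with conditional logistic regression.

```python
import math

def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Wald confidence interval for an unmatched 2x2 table."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive for the Wald interval")
    # Cross-product ratio: odds of biomarker detection in cases vs. controls.
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: biomarker detectable in 30 of 50 cases and 40 of 200 controls.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(30, 20, 40, 160)
print(f"OR = {or_estimate:.1f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

With these illustrative counts the interval excludes 1, which is the pattern one would expect when a biomarker of internal dose is clearly associated with subsequent cancer.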

Biomarkers of Biologically Effective Dose

Many carcinogens are metabolically activated and bind to DNA and/or protein to form adducts. DNA adducts have been the most evaluated biomarker of biologically effective dose, and a broad range of different DNA adducts have been measured in human samples using various approaches, including 32P-postlabeling, immunoassays and immunohistochemistry, and mass spectrometry. Because of the difficulty in measuring adducts in target tissues, almost all of the population studies have used easily accessible surrogate samples, typically plasma/serum, white blood cells, or urine. The first prospective study showing a significant association between carcinogen-DNA adducts and subsequent cancer development was the aforementioned aflatoxin study in liver cancer. Men with detectable urinary aflatoxin-DNA adducts (AFB1-N7-Guanine) and no HBV infection exhibited a ninefold increased liver cancer risk compared to men without detectable AFB1-N7-Guanine and no HBV infection. Tobacco smoke contains many different carcinogens, including PAHs, aromatic amines, and nitrosamines. Three prospective studies have evaluated the association of PAH-DNA or related aromatic-DNA adducts with the risk of lung cancer. In a pioneering nested case-control study within the prospective Physicians’ Health Study, the authors measured aromatic-DNA adducts in white blood cells (WBCs) at baseline and found that current smokers who had higher levels of aromatic-DNA adducts in WBCs exhibited an approximately threefold increased risk of lung cancer compared to current smokers with lower adduct concentrations. There were no associations in former and never smokers. Two subsequent prospective studies, a pooled analysis of the three prospective studies, as well as a meta-analysis of nine studies (including retrospective case-control studies) recapitulated the main observation that bulky DNA adducts are associated with elevated lung cancer risk in current smokers after a follow-up of several years. 
Protein adducts, such as hemoglobin (Hb) adducts of tobacco carcinogens and AFB1-albumin adducts, have also been used as biomarkers of biologically effective dose.

Biomarkers of Effect

Biomarkers of effect measure early biological alterations that occur in the time frame between exposure and cancer development and are also known as biomarkers of intermediate endpoints or intermediate biomarkers. Historically, the most commonly studied biomarkers of effect in surrogate tissues are those related to genotoxicity endpoints, such as chromosome aberrations (CAs) and micronucleus (MN) formation in peripheral blood lymphocytes (PBLs). A number of prospective epidemiological studies have reported positive associations between elevated chromosomal aberrations in PBLs and increased cancer risk. Importantly, a joint nested case-control study using subjects from two prospective cohort studies found that CA level in PBLs could predict future cancer development independent of environmental exposures (smoking and occupational exposure), supporting the notion that chromosomal instability markers could serve as both biomarkers of effect and biomarkers of susceptibility. A few prospective studies showed that individuals with higher frequencies of MN in PBLs had 1.5- to 2-fold increased relative risk of cancer compared with the low-frequency subjects.

Among various chromosome aberrations, chromosome translocation is one of the best-established biomarkers of exposure and effect. Translocations have been observed in nearly every cancer type, and a single chromosome translocation can cause cancer; for example, the Philadelphia translocation causes chronic myelogenous leukemia. Chromosome translocations can survive mitosis and are the most persistent of all the different types of chromosome exchanges. PBLs with radiation-induced translocations persisted for several decades in Japanese atomic bomb survivors. In contrast, other types of chromosome exchanges, for instance, dicentrics and acentric fragments, are unstable because they encounter difficulties during mitosis, which results in the affected cells being killed and eventually removed from the population, making them undetectable in long-term atomic bomb survivors.

The mutational analyses of cancers with distinct environmental exposure provide elegant examples of biomarkers of effect. Exposure-specific mutational fingerprints in tumor suppressor genes have been known for many years. The TP53 gene is the most frequently mutated gene in human cancers, and its mutational spectrum varies substantially by tumor site, which is at least partially due to distinct environmental exposure. For example, sunlight (UV exposure)-induced TP53 mutations in skin cancer occur exclusively at dipyrimidine sites, including a high frequency of C-to-T transitions and unique CC-to-TT double base changes that do not happen in other malignancies. In lung cancer, TP53 mutational patterns are different between smokers and nonsmokers, with an excess of G-to-T transversions in smoking-associated cancers. In liver cancer, a unique mutation (Arg249Ser) in TP53 was linked to aflatoxin B1 exposure. Moving beyond sequencing of candidate genes, the recent explosion of whole-genome sequencing (WGS) of cancer provides further evidence for profound effects of environmental exposure on the cancer genome. In the first WGS of a solid tumor genome, Pleasance and colleagues sequenced a malignant melanoma and a lymphoblastoid cell line from the same person. Consistent with observations in the TP53 gene, of the more than 30,000 base substitutions found in the tumor genome compared to the lymphoblastoid genome, about 70% were C-to-T transitions. Of these, 92% occurred at the 3′ base of dipyrimidine sites, much higher than expected for chance occurrence. These mutations are characteristic of UVB-induced DNA lesions. In the WGS of a small-cell lung cancer cell line and a lung adenocarcinoma, the mutational pattern (an excess of G-to-T mutations, an enrichment of CpG dinucleotides in the G-to-T mutation sets, etc.) is also consistent with that of TP53 and reflects the influences of tobacco carcinogens.
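The mutational fingerprints described above amount to simple tallies over a list of base substitutions: which substitution types dominate, and how often C-to-T changes sit at the 3′ base of a dipyrimidine. The Python below is a minimal sketch of such a tally, not the pipeline used in the sequencing studies. The function names and the toy mutation list are hypothetical, and each substitution is assumed to be reported on the strand carrying the mutated base, together with its 5′ neighbor.

```python
from collections import Counter

PYRIMIDINES = {"C", "T"}
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def substitution_class(ref, alt):
    """Purine-to-purine or pyrimidine-to-pyrimidine is a transition; anything else is a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    return "transition" if (ref, alt) in TRANSITIONS else "transversion"

def at_3prime_of_dipyrimidine(five_prime_base, ref):
    """True when the mutated base and its 5' neighbor are both pyrimidines."""
    return five_prime_base in PYRIMIDINES and ref in PYRIMIDINES

def summarize_spectrum(mutations):
    """Summarize an iterable of (five_prime_base, ref, alt) tuples."""
    mutations = list(mutations)
    counts = Counter(f"{ref}>{alt}" for _, ref, alt in mutations)
    c_to_t = [m for m in mutations if (m[1], m[2]) == ("C", "T")]
    n_dipyrimidine = sum(at_3prime_of_dipyrimidine(fp, ref) for fp, ref, _ in c_to_t)
    return {
        "counts": dict(counts),
        "fraction_c_to_t": len(c_to_t) / len(mutations) if mutations else 0.0,
        "fraction_c_to_t_at_dipyrimidine": n_dipyrimidine / len(c_to_t) if c_to_t else 0.0,
    }

# Hypothetical toy data: (5' neighbor, reference base, mutated base).
toy_mutations = [
    ("T", "C", "T"),  # C-to-T at a TC dipyrimidine (UV-like)
    ("C", "C", "T"),  # C-to-T at a CC dipyrimidine (UV-like)
    ("A", "C", "T"),  # C-to-T outside a dipyrimidine
    ("G", "G", "T"),  # G-to-T transversion (tobacco-like)
    ("T", "A", "G"),  # A-to-G transition
]
summary = summarize_spectrum(toy_mutations)
print(summary)
```

In a real analysis the substitutions would come from comparing the tumor genome with a matched normal genome, and substitutions called on the opposite strand would first need to be reverse-complemented before applying the dipyrimidine check.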

With the breathtaking pace of technological advancements in the biomedical field, and with all the high-throughput “omics” technology, the list of potential biomarkers of effect is long and growing, encompassing molecular and functional changes in the epigenome, transcriptome, proteome, metabolome, and so on. There have been numerous investigations in exploring these technologies in epidemiological studies to identify biomarkers of effect in relation to cancer etiology; however, most of these studies are not validated, and the roles of the potential biomarkers in cancer causation are not clear. For biomarkers of effect (intermediate biomarkers), prospective studies are always preferred over retrospective case-control studies to avoid “reverse causation.” Longitudinal evaluation of sequential samples should be the gold standard compared to cross-sectional and single time point analysis. There are many other issues in terms of study design, biospecimen collection and handling, assay reliability, and data reporting that are particularly important in the study of intermediate biomarkers to avoid spurious results.

Biomarkers of Susceptibility

A major focus of cancer etiological study in recent years has been the determination of cancer susceptibility based on genetic variability. The earliest evidence for genetic susceptibility to cancer came from epidemiological observations of increased cancer risk among relatives of cancer patients. The existence of many rare inherited syndromes that predispose patients to increased risks of certain cancers provides other evidence for genetic susceptibility to cancer. A large classical twin study estimated the genetic heritability of most common cancers to be between 20% and 40%. The identification of a large number of genetic susceptibility loci to common cancers by genome-wide association studies (GWAS) provides the strongest evidence for genetic susceptibility to common cancers. Biomarkers of susceptibility have provided significant biological insight into cancer etiology and may become potential targets for preventive and therapeutic interventions. Biomarkers of susceptibility can also improve risk prediction and identify high-risk populations for targeted surveillance, screening, and prevention. Further investigation of gene-environment interactions is critical to advance the understanding of human carcinogenesis and improve the accuracy of cancer risk prediction.

Earlier analyses in cancer susceptibility have mostly focused on family-based linkage studies to identify high-penetrance genes whose mutations (mutation rate typically less than 0.1% in the general population) cause Mendelian cancer-predisposing syndromes. More recent efforts have mainly focused on association studies that compare variant allele frequencies of common single-nucleotide polymorphisms (SNPs) between a large number of cancer cases and unrelated controls. The underlying hypothesis for such an approach is the “common disease–common variant” (often abbreviated CD-CV) hypothesis, which postulates that genetic susceptibility to common diseases such as cancer is largely due to many common variants (typically having frequencies greater than 5% in the general population) with only modest effect conferred by each allele. The competing hypothesis is the “common disease–rare variant” (CD-RV) hypothesis, which suggests that multiple rare variants (typically having frequencies that lie between approximately 0.1% and 1%, the upper limit for Mendelian mutations and the lower limit of SNPs, respectively) of larger effect cause common diseases. However, recent evidence has suggested that these two hypotheses should not be viewed as an “either/or” choice, but rather that both common and rare variants may contribute to common diseases, and the degree of contribution by either variant depends on the particular disease phenotype. In cancer, for example, and as described later, several rare susceptibility variants for breast cancer have been identified by candidate gene sequencing and genotyping. Furthermore, next-generation sequencing studies have identified novel rare susceptibility variants for other cancers.
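The frequency ranges quoted in this paragraph can be written down as a small lookup, which makes the gap between the two hypotheses explicit. The Python below is only a sketch of the approximate cutoffs as stated above (below 0.1%, 0.1% to 1%, above 5%); the function name and the label for the 1% to 5% range, which the paragraph does not assign to either hypothesis, are assumptions made for illustration.

```python
def frequency_class(allele_frequency):
    """Map a population allele frequency (0 to 1) to the range named in the text."""
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    if allele_frequency < 0.001:
        return "Mendelian mutation range (<0.1%)"
    if allele_frequency <= 0.01:
        return "rare variant, CD-RV range (0.1% to 1%)"
    if allele_frequency > 0.05:
        return "common variant, CD-CV range (>5%)"
    # The text gives no label for 1% to 5%; this label is an assumption.
    return "unassigned in the text (1% to 5%)"

for frequency in (0.0005, 0.005, 0.03, 0.20):
    print(f"{frequency:.4f}: {frequency_class(frequency)}")
```

The cutoffs are conventions rather than biological boundaries, which is consistent with the point that the two hypotheses are not an either/or choice.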

Depending on the population frequency of risk alleles and effect size, cancer genetic susceptibility markers can be roughly grouped into three classes: rare high-penetrance mutations, rare low- to moderate-penetrance disease-causing variants, and common low-penetrance SNPs ( Table 21-1 ).

Table 21-1
Three Classes of Biomarkers of Genetic Susceptibility to Cancer
Characteristic | Rare High-Penetrance Mutations | Rare Low- to Moderate-Penetrance Variants | Common Low-Penetrance Variants
Population frequency | Rare, typically <0.1% | Rare, MAF typically between 0.1% and 2% | Common, MAF mostly >10%
Familial aggregation | Yes | No | No
Cancer risk (odds ratio) | ≥10 | Mostly ≥2 | Mostly between 1.1 and 1.5
Population-attributable risk | Very small | Small | High
Functional significance | Direct effect (causal) | Direct effect (causal) | Mostly nonfunctional, in LD with causal variants
Approaches for identification | Linkage analysis followed by positional cloning, targeted sequencing of candidate genes | Candidate gene, exome, or whole genome sequencing of genetically enriched cases, followed by large-scale association study | Association study (candidate gene and GWAS) of unrelated cases and controls
Examples of genes | Breast cancer: BRCA1 and BRCA2; retinoblastoma: RB1; CRC: APC, mismatch repair genes (MSH2, MLH1, MSH6); kidney cancer: VHL | Breast cancer: ATM, CHEK2, BRIP1, PALB2, XRCC2; CRC: MUTYH; ovarian cancer: BRIP1; melanoma: MITF | Breast cancer: FGFR2, CASP8; lung cancer: CHRNA3-CHRNA5, TERT; CRC: TGFBR1, SMAD7, CDH1; bladder cancer: NAT2, GSTM1
CRC, Colorectal cancer; GWAS, genome-wide association studies; LD, linkage disequilibrium; MAF, minor allele frequency.
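The contrast in the table between per-carrier risk and population-attributable risk follows from how the two quantities combine. As a hedged illustration, the Python below applies Levin's formula for the population-attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of the population carrying the risk genotype and RR is the relative risk. The carrier proportions and risks are hypothetical values chosen from within the ranges in the table, the odds ratio is assumed to approximate the relative risk, and each variant is treated in isolation.

```python
def population_attributable_fraction(carrier_prevalence, relative_risk):
    """Levin's formula: share of cases attributable to carrying the risk genotype."""
    if not 0.0 <= carrier_prevalence <= 1.0:
        raise ValueError("carrier_prevalence must be between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative_risk must be positive")
    excess = carrier_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Hypothetical (carrier prevalence, relative risk) pairs, one per class in the table.
scenarios = {
    "rare high-penetrance mutation": (0.001, 10.0),
    "rare moderate-penetrance variant": (0.01, 2.0),
    "common low-penetrance variant": (0.30, 1.3),
}

for label, (prevalence, risk) in scenarios.items():
    paf = population_attributable_fraction(prevalence, risk)
    print(f"{label}: PAF = {paf:.2%}")
```

Under these assumptions the rare high-penetrance mutation accounts for under 1% of cases despite a tenfold risk per carrier, while the common variant with a relative risk of 1.3 accounts for roughly 8%, which mirrors the ordering in the table.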
