Weighing the Benefits and Harms: Screening Mammography in the Balance

Plain Language Summary

Breast cancer affects millions of women worldwide. Screening mammography has the potential to detect breast cancer early, leading to more effective treatment, reduced chance of metastasis, and better survival and quality of life for the patient. Large trials performed in the 1970s and 1980s showed that breast cancer mortality was lower in women who were invited for mammography screening. This led to widespread implementation of breast cancer screening, predominantly in Western Europe and North America. However, the implementation of screening introduced harms in addition to benefits. The major harm of breast cancer screening is overdiagnosis, the detection of a breast cancer that would never have become symptomatic during a woman’s lifetime in the absence of screening. False-positive test results, a positive screening mammography result in a woman who does not have breast cancer, are the most common harm. Over recent decades, a debate has been ongoing as to whether the benefits of breast cancer screening outweigh the harms. In an effort to guide decision-making, a number of organizations have summarized the available evidence in reviews, balance sheets, and screening recommendations, aiming to help women and their physicians make an informed choice about screening. Despite the fact that all reviews are based on evidence from similar sets of trials and observational studies, some have concluded that screening should be stopped, while others recommend continuation of screening activities. In this chapter, we show that the differences between these reviews are at least partly related to choices about which studies to include, which screening strategies to consider, and how screening harms and benefits are defined. 
One of the challenges is that estimates of breast cancer mortality reduction due to screening are generally based on “old trials” and do not consider more recent observational studies that have evaluated mortality reductions using data from modern screening programs. Harms, on the other hand, are almost exclusively assessed based on today’s screening practice. Another factor that complicates the comparability of the reviews is that the benefits and harms are not expressed in the same way, using different measures and populations. Finally, there are differences across countries in the organization of health care, cultural factors, and medicolegal considerations that shift the relative balance of harms and benefits. This is best illustrated in the large difference in the risk of a false-positive screening result, which is much higher in the United States compared to Europe. Future evaluations of the benefits and harms of breast cancer screening should base estimates of breast cancer mortality, overdiagnosis, and false-positives on the same screening setting, including time period, age, screening test, screening frequency, and organization of screening. If the aim is to compare the balance of harms and benefits between countries, it is important to ensure that the balances are indeed comparable. Finally, those creating or applying balance sheets should be aware that additional challenges lie ahead with the implementation of new screening tests and more personalized screening strategies. All of these are likely to affect the current balance of benefits and harms.

Introduction

The primary aim of breast cancer screening is to reduce mortality from the disease, but it is well understood that screening does harm as well as good. Mammography is the preferred screening test for early detection of breast cancer and has been studied in more than 600,000 women in 11 randomized trials over the past 50 years. Over the past several decades, mammographic breast cancer screening has been the subject of controversy, with some questioning whether the benefit in terms of mortality reduction is large enough to justify the recognized harms of screening, in particular overdiagnosis. Others reviewing essentially the same accumulated evidence have concluded that the benefits outweigh the harms.

In light of this debate, this chapter focuses on reviews that evaluate the balance of screening mammography benefits and harms that have been used to guide decision-making or provide recommendations for breast cancer screening. We have selected the reviews to represent evidence from different settings where screening practices vary, but acknowledge that many more reviews could potentially have been included. In this sense, the selected reviews serve as examples to illustrate how researchers and decision-makers reach conclusions on the balance of benefits and harms. This provides the background for a discussion of the methodologies used to determine this information and present results and conclusions.

The balance of the benefits and harms of a breast screening program can be communicated in a number of ways and to a variety of audiences. The format depends on the purpose of the balance and the intended audience. In this chapter, the focus is on benefit/harm balance sheets as presented in the scientific literature rather than communication to individual women or decision-makers. However, these scientific balance sheets should serve as the primary source of information for estimates communicated to women in the target population, health professionals and policy-makers. Outcome measures may be chosen according to the purpose of communication, but by necessity should be based on the same evidence. However, recommendations based on the same evidence may still differ, depending on the relative weight that is placed on different outcomes.

Benefits

Breast Cancer Mortality Reduction

Screening mammography aims to reduce breast cancer mortality through detection and treatment of tumors at an early stage, leading to better survival than for symptomatically detected tumors. As such, there is agreement across all reviews presenting a harm/benefit balance sheet on relative or absolute reductions in breast cancer mortality as the main benefit of screening. Nevertheless, some authors have suggested adopting all-cancer or all-cause mortality as the main outcome measure in order to avoid overestimation of the benefit due to bias in cause-of-death classification. Several studies have explicitly assessed the quality of cause-of-death determination in relation to mammographic screening and have found no significant evidence of bias. Further, the randomized trials of screening mammography were not designed to estimate overall or all-cancer mortality and were thus not powered to adequately estimate these outcome measures. Absence of evidence for an effect on these outcomes can thus not be taken to indicate evidence for absence of an effect. Estimates of screening benefit from randomized trials and observational studies are further detailed in “Estimates of Screening Benefit: The Randomized Trials of Breast Cancer Screening” and “The Importance of Observational Evidence to Estimate and Monitor Mortality Reduction From Current Breast Cancer Screening,” respectively.

Effects of screening on breast cancer mortality can be quantified as lives saved or life-years saved. Evidence reviews typically translate decreases in breast cancer mortality risk into absolute effects in terms of lives saved. However, since absolute risk is higher among older women, most of the benefit accrues at older ages in which the total number of life-years saved may be relatively small. Life-years saved attributable to breast cancer screening have been reported by some cost-effectiveness analyses, but are not generally included in evidence reviews.
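
The translation from a relative risk to an absolute effect can be sketched as follows. This is a minimal illustration of the arithmetic, not the method of any particular review; the function name and the baseline risks are hypothetical values chosen only to show why the same relative reduction yields more lives saved where absolute risk is higher.

```python
def absolute_effect(baseline_risk, relative_risk, cohort_size=1000):
    """Translate a relative risk into an absolute effect.

    baseline_risk: probability of breast cancer death without screening
                   over the period considered (hypothetical input).
    relative_risk: breast cancer mortality RR, screening vs no screening.
    Returns (deaths prevented per cohort_size women,
             number of women per death prevented).
    """
    if not 0 < baseline_risk <= 1:
        raise ValueError("baseline_risk must be in (0, 1]")
    if not 0 <= relative_risk < 1:
        raise ValueError("relative_risk must be in [0, 1) for a protective effect")
    risk_difference = baseline_risk * (1 - relative_risk)
    return risk_difference * cohort_size, 1 / risk_difference


# Hypothetical baseline risks (not taken from any review): the same RR of 0.80
# prevents more deaths where the baseline risk is higher.
for label, baseline in [("lower baseline risk", 0.002),
                        ("higher baseline risk", 0.008)]:
    prevented, needed = absolute_effect(baseline, 0.80)
    print(f"{label}: {prevented:.1f} deaths prevented per 1000, "
          f"about {needed:.0f} women per death prevented")
```

Counting lives saved in this way says nothing about life-years saved, which depend on the age at which each death is averted.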

Other Benefits

Early detection of breast cancer only confers benefit if it is followed by appropriate treatment, resulting in a more favorable outcome than would have been achieved had the treatment been given later in the course of disease or not at all. Early detection and treatment are also expected to improve quality of life for the women diagnosed, since less invasive procedures are more likely to be an option when the tumor is detected at a more favorable stage, for example, breast conserving surgery as opposed to mastectomy. Thus the majority of women participating in screening do not experience a benefit, and even those with a screen-detected cancer benefit only if earlier treatment leads to reduced morbidity and mortality. Although improved quality of life is expected to result from screening for some women, this benefit is not typically considered in balance sheets.

Since the ultimate impact of service screening on breast cancer mortality is inevitably long term, there are several indicators that can assess the performance of a screening program in its early phases and that can also be used to predict whether an effect on breast cancer mortality is likely. These early benefits are mostly related to the stage shift that is introduced with screening and is expected to result in a higher rate of small cancers and a lower rate of advanced cancers. Although these measures are intuitive, there are many methodological challenges associated with their definition and interpretation. Moreover, assumptions need to be made in order to estimate their consequent effect on breast cancer mortality. The latter may explain why these outcomes are not usually considered in the evidence reviews.

Harms

Overdiagnosis and Overtreatment

Overdiagnosis, and resulting overtreatment, is regarded as the major harm of breast cancer screening. In cancer screening, overdiagnosis is defined as the detection of cancers that would not present symptomatically during one’s lifetime in the absence of screening. Although there is ongoing research focusing on identification of overdiagnosed tumors, at present the extent of overdiagnosis can only be estimated on the population level by comparing breast cancer incidence in the presence and absence of screening or by using simulation models.

Overdiagnosis is harmful in two major ways. The first harm is simply due to the unnecessary detection of a breast cancer. This diagnosis transforms women into cancer patients, a transformation that would not have taken place in the absence of screening. The second and major harm of overdiagnosis is overtreatment. Although some tumors will not become life-threatening during the life span of the woman, it is currently not possible to distinguish dangerous from non-life-threatening cancers. As a consequence, women with an overdiagnosed cancer receive unnecessary treatment, referred to as overtreatment. To prevent unnecessary treatment, a few trials have begun to compare usual care with active surveillance, a “wait-and-see” procedure, for cancers that have a high risk of being overdiagnosed, namely low-grade ductal carcinoma in situ (DCIS). In a “wait-and-see” procedure, cancers are treated only if and when they progress.

The extent of overdiagnosis in breast cancer screening remains highly uncertain, with estimates ranging from 0% to 54%. A major reason for this disagreement is the difficulty in estimating overdiagnosis. Overdiagnosis in cancer screening can be estimated with a variety of study designs, including randomized trials, pathological or imaging studies, modeling studies, and observational studies. In breast cancer screening, the most common designs are randomized trials, observational studies, and modeling studies. However, each design has limitations. In randomized trials and observational studies, overdiagnosis is estimated by comparing breast cancer incidence in the presence and absence of screening. Breast cancer screening works by identifying cancers at an earlier, more treatable stage. As a result, after the initiation of screening there will be a transient increase in breast cancer incidence. In the absence of overdiagnosis, this increased incidence will be compensated for by a subsequent decrease in cancer diagnoses. To determine whether the cancers diagnosed during screening were attributable to early detection or overdiagnosis, women should be followed after leaving screening to account for this effect. Ideally, this follow-up should continue until death. In randomized trials, follow-up may be too short to adjust fully for early detection, although women were followed for 15 years after leaving screening in the Canadian trials and in the Malmö trial. Incomplete adjustment leads to overestimation of the extent of overdiagnosis. In observational studies, longer follow-up periods are possible, but there is no comparable population that is not offered screening. As a result, breast cancer incidence in the absence of screening is estimated using extrapolation of prescreening trends, control regions, nonattenders, or adjustment for the effect of screening.
Because unscreened populations may differ from screened populations in characteristics that are also related to breast cancer incidence, all observational studies of overdiagnosis have the potential for bias. The primary limitation of using modeling studies to estimate overdiagnosis is the heavy dependence of overdiagnosis estimates from these studies on modeling assumptions such as lead time.
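
The excess-incidence comparison described above can be sketched in a few lines. All counts below are hypothetical and assume two equally sized arms; the function name and the denominator labels are illustrative only and do not reproduce any trial or review. The sketch shows two points: stopping follow-up at the end of screening counts early-detected cancers as excess, and the same excess gives different percentages depending on the denominator chosen.

```python
def excess_incidence(screened_during, control_during,
                     screened_after=0, control_after=0):
    """Excess cancers in the screened arm, for two equally sized arms."""
    return (screened_during + screened_after) - (control_during + control_after)


# Hypothetical counts for two equally sized arms (illustration only).
screened_during, control_during = 600, 480   # screening period
screened_after, control_after = 380, 450     # follow-up after screening stops

# Stopping at the end of screening counts early-detected cases as excess.
naive_excess = excess_incidence(screened_during, control_during)
# Adding follow-up lets the compensatory drop offset the early peak.
adjusted_excess = excess_incidence(screened_during, control_during,
                                   screened_after, control_after)

# The same excess gives different percentages depending on the denominator.
denominators = {
    "control-arm cancers, whole follow-up": control_during + control_after,
    "screened-arm cancers, whole follow-up": screened_during + screened_after,
    "screened-arm cancers, screening period": screened_during,
}
percentages = {name: 100 * adjusted_excess / count
               for name, count in denominators.items()}

print(f"excess without follow-up: {naive_excess}, "
      f"with follow-up: {adjusted_excess}")
for name, pct in percentages.items():
    print(f"  {pct:.1f}% of {name}")
```

Part of the spread in published overdiagnosis percentages may therefore reflect the choice of denominator and the length of follow-up rather than differences in the underlying excess.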

False-Positives

False-positive test results are the most common harm of screening mammography. Conceptually, a false-positive is defined as a positive screening mammography result in a woman who is cancer free. In the United States and other settings where screening mammography interpretation is performed according to the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) Atlas, a positive examination has been operationalized as a screening mammography initial assessment of 0:Needs Additional Evaluation, 3:Probably Benign, 4:Suspicious, or 5:Highly Suggestive of Malignancy. In organized screening programs such as those in Europe and Australia, a positive screening result is defined by recall for further evaluation. Typically, recalled women who are determined to be cancer free at the end of diagnostic evaluation and for 1 year after the recall are considered to have experienced a false-positive. False-positive results can be subdivided into those receiving further evaluation with imaging only and those that undergo invasive procedures including biopsy or fine needle aspiration.

False-positive test results have a number of negative consequences. The greater the false-positive risk, the lower the efficiency of the screening program and the more unnecessary imaging is performed. This adds to the overall resource use and cost of screening. False-positives also have negative psychological consequences for the affected women. Studies have found that women receiving false-positive test results experience increased anxiety and psychological distress. This anxiety and distress is greater in women who undergo invasive procedures rather than additional imaging only. However, a recent study found that the associated anxiety resolved quickly once the women were determined to be cancer free. The experience of receiving false-positive test results can also be a deterrent to participation in future screening. This has been found to be the case in several of the organized screening programs in Europe and Canada. Conversely, in the United States women receiving false-positive test results have been found to be more likely to return for future screening. Finally, false-positive results lead to additional radiation exposure through subsequent mammography and mammography-guided biopsy. Although radiation exposure due to mammography is small, the aggregate burden among women experiencing repeated false-positives could become large and should be minimized to avoid the increased risk of radiation-induced cancer.

False-positive screening mammography results are common, occurring in approximately 10% of exams in the United States and 1–7% of mammograms in European service screening programs. False-positives are generally more common in younger women and those with dense breasts. Because false-positive mammography results are common, the proportion of women participating in regular screening who receive a false-positive result over the course of their screening participation is large. Most evaluations of the balance of harms and benefits have quantified this harm in terms of the cumulative false-positive risk of screening mammography, defined as the probability that a woman will receive at least one false-positive mammography result over the course of a fixed number of screening mammograms, typically either 10 or the total number recommended by the screening program.
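
The definition of cumulative false-positive risk can be illustrated with a minimal sketch. It assumes that every exam carries the same false-positive risk and that exams are independent, a simplification that real screening programs need not satisfy; published cumulative risks are estimated from observed screening histories and will not generally match this formula. The function name is hypothetical, and the per-exam risks are the approximate values quoted above.

```python
def cumulative_false_positive_risk(per_exam_risk, n_exams):
    """Probability of at least one false-positive result in n_exams.

    Simplifying assumption: exams are independent and share the same
    per-exam false-positive risk.
    """
    if not 0 <= per_exam_risk <= 1:
        raise ValueError("per_exam_risk must be in [0, 1]")
    if n_exams < 0:
        raise ValueError("n_exams must be non-negative")
    return 1 - (1 - per_exam_risk) ** n_exams


# Per-exam risks of 1%, 7%, and 10% over 10 screening mammograms.
for per_exam_risk in (0.01, 0.07, 0.10):
    risk = cumulative_false_positive_risk(per_exam_risk, 10)
    print(f"per-exam risk {per_exam_risk:.0%}: "
          f"cumulative risk over 10 exams {risk:.1%}")
```

Even under these simplifying assumptions, a modest per-exam risk compounds into a substantial cumulative risk, which is why the number of screening rounds matters as much as the per-exam rate.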

Other Harms

A variety of screening mammography harms are not typically explicitly included when evaluating the balance of benefits and harms. When considering the harms of screening, it is important to consider not only direct harms but also harms that result indirectly from downstream effects of the screening process, such as unnecessary treatment resulting from incidental findings. However, evidence reviews have not typically included indirect harms.

Additional direct harms not typically considered by evidence reviews can be divided into harms that are serious but very rare and those that are common but have a generally minor impact. A false-negative screening mammography result is one harm that is very serious but relatively rare. A false-negative occurs when a woman is diagnosed with cancer following a negative screening mammography assessment (in the BI-RADS lexicon, an assessment of 1:Negative or 2:Benign). A negative mammography result could give a woman false reassurance that she is cancer free and may lead to delays in seeking care for new symptoms. Some evaluations of screening mammography include measures of the diagnostic accuracy of screening mammography interpretation, which typically provide an assessment of false-negatives at a single screening round. However, these are not commonly included in evaluations of the balance of harms and benefits. Radiation-induced cancer is another harm that is very serious but considered extremely rare. Radiation-induced cancer may be more of a concern in women with very large breasts or breast augmentation who require extra views at each exam for full coverage of breast tissue. Additional minor harms of screening mammography include the pain of the examination itself. A systematic review found that 28–77% of women report experiencing pain associated with mammography. Pain due to mammography has been found to be associated with discontinuation of screening.

Reviewing the Balance of Harms and Benefits

We have selected a number of influential reviews from North America and Europe to serve as examples for the comparison of reviews of benefits and harms of breast cancer screening. In this section, we will describe the context of these reviews, the general approach adopted by the authors, the sources of data used and the main outcomes of the reviews, both in relative and absolute measures. Tables 3.1–3.3 summarize the data from these reviews for breast cancer mortality, overdiagnosis, and the cumulative risk of false-positives.

Table 3.1

Breast Cancer Mortality in Evidence Reviews

	Study Designs Selected; Included Studies	Intervention (Study Period; Age Groups; Screening Test; Screening Interval)	Relative Effect	Absolute Effect
Cochrane	Randomized trials ● 4/11 eligible trials included as adequately randomized: Malmö I, Canada I and II, UK age trial ● 6/11 eligible trials included as suboptimally randomized: HIP, Malmö II, Kopparberg, Östergötland, Stockholm, Göteborg ● Edinburgh trial excluded	● Adequately randomized trials: trials conducted between 1976 and 1997; 293,135 women in age range 39–69; mammography with/without physical examination and/or self-examination; interval 12–24 months ● Suboptimally randomized trials: trials starting between 1963 and 1981; 355,796 women in age range 38–75; mammography and physical examination; interval 12–33 months	Four trials with adequate randomization did not show a statistically significant reduction in breast cancer mortality at 13 years (relative risk (RR) 0.90, 95% confidence interval (CI) 0.79–1.02); five trials with suboptimal randomization showed a significant reduction in breast cancer mortality with an RR of 0.75 (95% CI 0.67–0.83). The RR for all nine trials combined was 0.81 (95% CI 0.74–0.87) after 13 years. Malmö II trial excluded from 13-year follow-up analysis	Assuming a 15% mortality reduction: For every 2000 women invited for screening throughout 10 years, one will avoid dying of breast cancer
Independent UK Panel	Randomized trials. 10/11 eligible trials included. Edinburgh trial excluded	Trials conducted between 1963 and 1997; 648,931 women in the age range 38–74 years; mammography with/without physical examination and/or self-examination; interval 12–33 months	Metaanalysis of these trials with 13 years of follow-up estimated a 20% reduction in breast cancer mortality in women invited for screening (RR =0.80, 95% CI 0.73–0.89)	● Assuming a 20% mortality reduction: For 10,000 UK women invited to screening from age 50 for 20 years, 43 deaths from breast cancer will be prevented ● For every 10,000 women attending screening from age 50–70 years, about 56 deaths from breast cancer would be prevented
USPSTF	Randomized trials. Selection depends on age: ● 39–49 years, 9/11 eligible trials included: HIP, Malmö I and II, Kopparberg, Östergötland, Canada I, Stockholm, Göteborg, and UK age trial ● 50–59 years, 7/11 eligible trials included: Malmö I and II, Kopparberg, Östergötland, Canada II, Stockholm, and Göteborg ● 60–69 years, 5/11 eligible trials included: Malmö I and II, Östergötland ● 70–74 years, 3/11 eligible trials included: Östergötland, Kopparberg, Malmö I	● 39–49 years: Trials conducted between 1963 and 1997; 609,472 women in the age range 38–75 years; mammography with/without physical examination and/or self-examination; interval 12–33 months ● 50–59 years: Trials starting between 1976 and 1981; 386,551 women in the age range 38–75; mammography with/without physical examination and/or self-examination; interval 12–33 months ● 60–69 years: trials starting between 1976 and 1978; 112,298 women in the age range 43–75; mammography with/without self-examination; interval 18–33 months ● 70–74 years: trial starting in 1978; 92,934 women in the age range 38–75; mammography with/without self-examination; interval 24–33 months	In metaanalysis, RR for breast cancer mortality in women age 39–49 based on nine trials was 0.88 (95% CI 0.73–1.003), for women 50–59 based on seven trials was 0.86 (95% CI 0.68–0.97), based on five trials for women 60–69 was 0.67 (95% CI 0.54–0.83), and for women 70–74 based on three trials was 0.80 (95% CI 0.51–1.28)	● Assuming a 12% mortality reduction: For every 10,000 women screened aged 40–49 for 10 years, four deaths from breast cancer will be prevented ● Assuming a 14% mortality reduction: For every 10,000 women screened aged 50–59 for 10 years, eight deaths from breast cancer will be prevented ● Assuming a 33% mortality reduction: For every 10,000 women screened aged 60–69 for 10 years, 21 deaths from breast cancer will be prevented ● Assuming a 20% mortality reduction: For every 10,000 women screened aged 70–74 for 10 years, 13 deaths from breast cancer will be prevented
American Cancer Society	Metaanalysis of 9 randomized trials (13-year follow-up). Metaanalysis of 14 observational studies: seven incidence-based mortality studies and seven case–control studies. One modeling study providing seven estimates	● Metaanalysis of RCTs: Trials published 1963–1991; 599,090 women age 39–74; screening mammography; interval not reported but available from the original study ● Metaanalysis of population-based observational studies: Studies published 1997–2012; >2 million women age >40; screening mammography; interval not reported but available from the original study ● Model-based estimates from seven models	● See Cochrane review: The RR for all nine trials combined was 0.81 (95% CI 0.74–0.87) after 13 years ● See EUROSCREEN: Pooled estimates of breast cancer mortality reduction among invited women were 0.75 (95% CI 0.69–0.81) in incidence-based mortality studies and 0.69 (95% CI 0.57–0.83) in case–control studies. Estimates for women actually screened were 0.62 (95% CI 0.56–0.69) in incidence-based mortality studies and 0.52 (95% CI 0.42–0.65) in case–control studies, corrected for self-selection. Interval not reported but available from the original study ● Model-based estimates: median reduction of 15% (range 7–23%)	● Assuming a 20% mortality reduction: 1770 women age 40–49, 1087 women age 50–59, and 835 women age 60–69 would need to be screened biennially for 15 years to prevent one breast cancer death ● Assuming a 40% mortality reduction: 753 women age 40–49, 462 women age 50–59, and 355 women age 60–69 would need to be screened biennially for 15 years to prevent one breast cancer death
Canadian Taskforce	Randomized trials. 10/11 eligible trials included. Selection depends on age: ● 39–49 years, 9/11 eligible trials included: HIP, Malmö I and II, Kopparberg, Östergötland, Canada I, Stockholm, Göteborg and UK age trial ● 50–69 years, 8/11 eligible trials included: HIP, Malmö I and II, Kopparberg, Östergötland, Canada II, Stockholm, Göteborg ● 70–74 years, 1/11 eligible trials included: Kopparberg, Östergötland	● 39–49 years: Trials conducted between 1963 and 1997; 211,270 women in the age range 39–74 years; mammography with/without physical examination and/or self-examination; interval 12–33 months ● 50–69 years: Trials starting between 1976 and 1981; 250,274 women in the age range 50–69; mammography with/without physical examination and/or self-examination; interval 12–33 months ● 70–74 years: trial starting in 1978; 17,646 women above 70; mammography with/without self-examination; interval 24–33 months	Pooled estimate of breast cancer mortality reduction from eight trials in women 40–49 of 0.85 (95% CI 0.75–0.96), seven trials of women 50–69 of 0.79 (95% CI 0.68–0.90), and two trials of women 70–74 of 0.68 (95% CI 0.45–1.01). Results are reported after a median follow-up of 11.4 years	● Assuming a 15% mortality reduction: 2108 women aged 40–49 need to be screened to prevent one breast cancer death ● Assuming a 21% mortality reduction: 721 women aged 50–69 need to be screened to prevent one breast cancer death ● Assuming a 32% mortality reduction: 451 women aged 70–74 need to be screened to prevent one breast cancer death
EUROSCREEN	Systematic search of PubMed up to February 2011; European observational studies, ie, trend studies ( n = 17), incidence-based mortality (IBM) studies ( n = 20) and case–control studies ( n = 8)	Studies reporting on programs implemented between 1970 and 2007; including at least some of the age groups between 50–69; mammography; interval 2–3 years; population-based screening programs (study has at least a three years’ overlap with the current regional or national program)	Pooled estimates of breast cancer mortality reduction among invited women were 0.75 (95% CI 0.69–0.81) in incidence-based mortality studies and 0.69 (95% CI 0.57–0.83) in case–control studies. Estimates for women actually screened were 0.62 (95% CI 0.56–0.69) in incidence-based mortality studies and 0.52 (95% CI 0.42–0.65) in case–control studies, corrected for self-selection	Assuming a 25–38% mortality reduction: For every 1000 women screened from age 50 to 69, seven to nine breast cancer deaths are prevented

Table 3.2

Overdiagnosis in Evidence Reviews

	Study Designs Selected; Included Studies	Intervention (Study Period; Age Groups; Screening Test; Screening Interval; Screening Organization)	Relative Effect	Absolute Effect
Cochrane	Randomized trials that did not invite the control group at the end of the screening phase (3/11: Malmö I, Canada I and II) and recent observational studies mentioned in discussion	Trials started screening between 1976 and 1980; 132,214 women in age range 40–69; mammography with/without physical examination and/or self-examination; interval 12–24 months	There were 30% more cancers in the screened groups than in the control groups. Large observational studies support these findings	Assuming 30% overdiagnosis: For every 2000 women invited for screening throughout 10 years, 10 healthy women who would not have had a breast cancer diagnosis if there had not been screening will be diagnosed as cancer patients, and will be treated unnecessarily
Independent UK Panel	Randomized trials that did not invite the control group at the end of the screening phase (3/11: Malmö I, Canada I and II)	Trials started screening between 1976 and 1980; 132,214 women in age range 40–69; mammography with/without physical examination and/or self-examination; interval 12–24 months	The frequency of overdiagnosis was of the order of 11% from a population perspective, and about 19% from the perspective of a woman invited to screening	Assuming 19% overdiagnosis from an individual woman’s perspective: For every 10,000 UK women invited to screening from age 50 for 20 years, 129 cancers will be overdiagnosed
USPSTF	A review of eight trials, a metaanalysis of three trials, a systematic review of 13 individual studies, and 25 primary studies estimating overdiagnosis	Details on intervention factors can be found in the metaanalysis, systematic review of randomized trials and observational studies and the 25 primary studies	The relative overdiagnosis estimate was based on the metaanalysis of three trials. The rate of overdiagnosis was estimated at 19%	No absolute estimate provided
American Cancer Society	Review of observational studies, modeling studies, and trials that did not invite the control group at the end of screening	Details on intervention factors are not reported in the review	The review notes that overdiagnosis estimates range from <5% to >50% with estimates based on modeling studies generally lower than those based on empirical studies	No absolute numbers provided. They conclude that there is good evidence that overdiagnosis does occur but no high-quality evidence on the magnitude of overdiagnosis
Canadian Taskforce	The USPSTF review, a systematic review and four primary studies estimating overdiagnosis	Details on intervention factors can be found in the included systematic review and the four primary studies	The frequency of overdiagnosis ranges from 0.4% to 52% in the included studies. In the main report of the review, the frequency of overdiagnosis ranges from 30% to 52%	For every 1000 women aged 39 years and older who are screened using mammography, five will have an unnecessary lumpectomy or mastectomy as a result of overdiagnosis
EUROSCREEN	Literature review of observational studies that provided estimates of breast cancer overdiagnosis in European population-based mammographic screening programs	Studies reporting on programs implemented between 1970 and 2007; there were 13 primary studies reporting 16 estimates of overdiagnosis in seven European countries (the Netherlands, Italy, Norway, Sweden, Denmark, United Kingdom, and Spain)	Unadjusted estimates ranged from 0% to 54%. Reported estimates adjusted for breast cancer risk and lead time were 2.8% in the Netherlands, 4.6% and 1.0% in Italy, 7.0% in Denmark and 10% and 3.3% in England and Wales. The average estimate of the individual estimates was 6.5% of the incidence in the absence of screening	Assuming 6.5% overdiagnosis: For every 1000 women screened biennially from ages 50 to 51 years until ages 68–69 years and followed up until age 79 years, four cases are overdiagnosed

Table 3.3

Cumulative Risk of False-Positives in Evidence Reviews

	Study Designs Selected; Included Studies	Intervention (Study Period; Age Groups; Screening Test; Screening Interval; Screening Organization)	Relative Effect	Absolute Effect
Cochrane	Observational studies mentioned in the discussion	Details on the intervention are not reported in the review and can be found in the included studies	The cumulative risk of a false-positive result after 10 mammograms ranges from about 20% to 60%	For every 2000 women invited for screening throughout 10 years, it is likely that more than 200 women will experience important psychological distress for many months because of false-positive findings
Independent UK Panel	No quantitative assessment of false-positive risk
USPSTF	Observational studies from the United States and unpublished data from the BCSC	● 1983–1995, 40–69 years old, annual vs biennial screening, United States ● 1994–2006, 169,456 women aged 40–59, annual vs biennial screening, United States ● 1994–2008, 11,474 women with and 9,222,624 women without breast cancer aged 40–74, annual vs biennial vs triennial screening, United States BCSC data was used from women undergoing screening from 2003 to 2011; 405,191 women aged 40–89 years; mammography; one mammogram in the previous 2 years	The observational studies reported a 10-year cumulative risk for false-positive mammography results of 61% for annual and 41% for biennial screening	The BCSC provided the absolute number of false-positives per 1000 women screened per age category: ● 40–49 years: 121.2 ● 50–59 years: 93.2 ● 60–69 years: 80.8 ● 70–79 years: 69.6 ● 80–89 years: 65.2
American Cancer Society	Observational studies from the United States	Details on intervention factors are not reported in the review and can be found in the included studies	The observational studies reported a 10-year cumulative risk for false-positive mammography results of 61% for annual and 41% for biennial screening and for false-positive results leading to a biopsy recommendation of 7% for annual and 5% for biennial screening	No absolute numbers provided
Canadian Taskforce	The USPSTF review and one additional primary study	Details on intervention factors are not reported in the review and can be found in the included studies	Data from the BCSC, as reported in the USPSTF review, gave a cumulative false-positive risk of 49%–77% after 10 screening rounds. The observational studies reported 49.1% and 20.8%	The absolute number of false-positive results per 1000 women screened for a median of 11 years was reported per age group: ● 40–49 years: 327 ● 50–69 years: 282 ● 70–74 years: 212
EUROSCREEN	Systematic review of studies of the cumulative risk of a false-positive result in European screening programs. Four studies were included	Studies published between 1955–2001 were incorporated; 390,000 women starting at ages 50–51 and continuing to ages 68–69; mammography; interval of 2 years; population-based screening program in a European country	Pooled estimates were derived from studies that estimated the risk over 10 years (364,991 women). The estimated cumulative risk of a false-positive screening result in women aged 50–69 undergoing 10 biennial screening tests varied from 8% to 21% in the three studies examined (pooled weighted estimate 19.7%). The cumulative risk of an invasive procedure with benign outcome ranged from 1.8% to 6.3% (pooled weighted estimate 2.9%)	Assuming a 20% false-positive recall and 3% false-positive recall with invasive work-up: For every 1000 women screened biennially from ages 50 to 51 years until ages 68–69 years and followed up until age 79 years, 170 women have at least one recall followed by noninvasive assessment with a negative result, and 30 women have at least one recall followed by invasive procedures yielding a negative result

North America

In North America, screening mammography guidelines come from numerous groups: a variety of independent panels, professional societies, and advocacy groups issue screening recommendations. Several of these conduct their own evaluation of the balance of harms and benefits to support their recommendations, while others rely on existing evaluations. Canadian provinces offer a defined screening mammography program to women age 50–69, but practices for younger and older women vary across provinces. No defined screening program exists in the United States, where it is left to individual providers and patients to make decisions that are informed by recommendations. Guidelines issued by the US Preventive Services Task Force (USPSTF) are particularly influential because they inform Medicare coverage decisions, and many private insurers follow the same coverage practices as the Centers for Medicare & Medicaid Services. Below we summarize the evidence review conducted by the USPSTF as well as that of the Canadian Task Force on Preventive Health Care (Canadian Task Force), a similar independent panel for Canada. Finally, we discuss the American Cancer Society (ACS) guideline statement as an example of an influential North American organization issuing recommendations related to screening mammography.

US Preventive Services Task Force

The USPSTF is an independent panel authorized by the US Congress and supported by the Agency for Healthcare Research and Quality (AHRQ) to make evidence-based recommendations about clinical preventive services. The USPSTF commissioned a review of the evidence on benefits and harms of screening mammography in preparation for an update to their recommendations on screening mammography issued in 2015. The review included evidence on harms and benefits of mammography for all women age 40 years and older. Evidence on harms and benefits of other screening modalities including breast MRI and digital breast tomosynthesis was also reviewed. The USPSTF in conjunction with AHRQ developed the key questions used to structure the review. The review itself was then conducted by an independent contractor sponsored by AHRQ, the Pacific Northwest Evidence-based Practice Center.

The primary benefit of screening mammography as defined by the draft USPSTF evidence review was reduction in breast cancer mortality. Other benefits considered included reductions in all-cause mortality, advanced breast cancer cases, and treatment-related morbidity. Harms were radiation exposure, pain during procedures, patient anxiety and other psychological responses, false-positive and false-negative test results, overdiagnosis, and overtreatment. Evidence on breast cancer mortality was obtained from randomized controlled trials (RCTs) of screening mammography in women age 40 and over. The evidence review identified eight eligible studies by searching the Cochrane Register of Controlled Trials, Cochrane Database of Systematic Reviews, and MEDLINE, as well as by manually reviewing references. Observational studies and systematic reviews were also included, although quantitative evaluation of breast cancer benefits was based on RCTs. A variety of evidence sources were used for evaluating harms. Systematic reviews and metaanalyses were included as well as recently published primary studies. Primary analysis of observational data on screening mammography from the Breast Cancer Surveillance Consortium (BCSC) was conducted to provide information on performance characteristics of screening mammography. Simulation modeling from the Cancer Intervention and Surveillance Modeling Network (CISNET) as well as a new simulation model for radiation exposure were also incorporated.

A metaanalysis of the eight trials included in the draft USPSTF evidence review estimated a relative risk (RR) of breast cancer mortality of 0.88 (95% CI: 0.73–1.00) for women age 39–49 years. Similar estimates were obtained for women 50–59 and 60–69. For women over age 70, three trials met inclusion criteria, but the metaanalysis in this age group had broad confidence intervals, indicating substantial uncertainty in the benefit (RR = 0.80, 95% CI: 0.51–1.28). The evidence review summarized the absolute benefit corresponding to these RRs as the number of breast cancer deaths prevented per 10,000 women screened for 10 years. The number of breast cancer deaths prevented was estimated at 4.1 (95% CI: –0.1 to 9.3) for women aged 39–49, 7.7 (95% CI: 1.6–17.2) for women aged 50–59, 21.3 (95% CI: 10.7–31.7) for women aged 60–69, and 12.5 (95% CI: –17.2 to 32.1) for women aged 70–74. The number needed to invite (NNI) is the number of women who must be invited to participate in screening mammography for 10 years in order to prevent one breast cancer death. For women age 40–49 and 50–59, the NNI was estimated at approximately 2000 women. The draft USPSTF evidence review also included a summary of breast cancer mortality reduction based on observational studies using the results of the EUROSCREEN review (see section: Europe). However, estimates based on observational studies were not incorporated into numerical summaries of breast cancer mortality reduction due to the risk of bias inherent in observational studies.
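The relationship between an absolute benefit and a number needed to invite can be sketched as a simple reciprocal. The snippet below is illustrative only: the function name is hypothetical, the 21.3 figure is taken to belong to the 60–69 age group, and the conversion treats "per 10,000 women screened" as if it were "per 10,000 women invited". The review's own NNI figures may rest on different inputs, so they need not match this conversion.

```python
def number_needed_to_invite(deaths_prevented_per_10000: float) -> float:
    """Reciprocal of the absolute risk reduction, scaled to 10,000 women.

    Hypothetical helper for illustration; not the review's own method.
    """
    if deaths_prevented_per_10000 <= 0:
        raise ValueError("NNI is undefined when no deaths are prevented")
    return 10_000 / deaths_prevented_per_10000


# Point estimates quoted in the text: breast cancer deaths prevented
# per 10,000 women screened for 10 years, by age group.
# Assumption: 21.3 is the estimate for women aged 60-69.
deaths_prevented = {"39-49": 4.1, "50-59": 7.7, "60-69": 21.3, "70-74": 12.5}

nni = {age: round(number_needed_to_invite(d)) for age, d in deaths_prevented.items()}

for age, n in nni.items():
    print(f"Age {age}: roughly {n} women per breast cancer death prevented")
```

Because the confidence intervals for the youngest and oldest groups include zero, the corresponding NNI has no finite upper bound in those groups; the point estimates alone understate that uncertainty.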

The evidence review for overdiagnosis found that estimates varied substantially across studies and methodologies. Included studies consisted of a metaanalysis of five trials, a systematic review of observational studies, and 17 individual studies. A metaanalysis of the three trials considered to be least biased estimated overdiagnosis to be 19% (95% CI: 15–23). In observational and modeling studies using varied methodologies, overdiagnosis estimates ranged from <1% to 54%. The risk of a false-positive mammography result at a single screening round was estimated using data from the BCSC. Estimates ranged from 65 to 121 per 1000 examinations across age groups. Two observational studies of cumulative false-positive risk from the United States were identified that met inclusion criteria. Overall, cumulative false-positive risk after 10 years of annual mammography was estimated at 61%. Risk was higher among women 40–49 with heterogeneously dense breasts (69%) or extremely dense breasts (66%).
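To see how a per-round false-positive risk relates to a cumulative risk, a minimal sketch follows. It assumes that screening rounds are statistically independent and that the per-round risk is constant, which is a simplification introduced here and not the method used in the observational studies cited by the review. The function name is hypothetical.

```python
def cumulative_false_positive_risk(per_round_risk: float, rounds: int) -> float:
    """Probability of at least one false-positive over `rounds` screens.

    Assumes independent rounds with a constant per-round risk
    (an illustrative simplification, not the review's method).
    """
    if not 0.0 <= per_round_risk <= 1.0:
        raise ValueError("per_round_risk must be a probability")
    return 1.0 - (1.0 - per_round_risk) ** rounds


# Per-round range quoted in the text: 65 to 121 false-positives
# per 1000 examinations across age groups.
low = cumulative_false_positive_risk(65 / 1000, rounds=10)
high = cumulative_false_positive_risk(121 / 1000, rounds=10)

print(f"10 annual rounds: {low:.1%} to {high:.1%}")
```

Under these assumptions, ten annual rounds give a cumulative risk of very roughly one half to three quarters, which brackets the 61% reported from observational data. In practice, false-positive results tend to cluster in the same women, and per-round risk changes with age and with the availability of prior images, so the independence assumption should be read as a rough guide only.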

Overall, the evidence review found a significant benefit of screening mammography coupled with relatively frequent harms, with more modest benefits and more common harms in women under 50 years of age. On the basis of this review, the USPSTF issued new draft guidelines in 2015 supporting biennial mammography for women age 50–74 years. Routine screening for women younger than 50 was not recommended.

Canadian Task Force on Preventive Health Care

The Canadian Task Force is an independent panel that makes recommendations on preventive clinical services, similar to the USPSTF in the United States. They commissioned the Evidence Review and Synthesis Centre to undertake a review of screening mammography to support updated recommendations in 2011. Because the USPSTF had conducted an evidence review using similar methodology in support of their 2009 breast cancer screening guidelines, the Canadian Task Force used the USPSTF evidence review and updated this review with studies published in the intervening period. The USPSTF review was used for evidence up to 2008 and updated with additional data through October 2011. The Canadian Task Force review identified breast cancer mortality as the benefit of screening mammography and expressed their results in terms of the number needed to screen (NNS) defined as the number of women who would need to be screened once every 2 years over about 11 years to prevent one breast cancer death. Results from nine trials were included in estimates of the benefit of screening mammography. The estimated NNS for women 40–49 was 2108, while for women 50–69 it was only 721. On the basis of a review of four primary studies and one prior systematic review, the Canadian Task Force estimated the overdiagnosis rate at 5 per 1000 women screened. Unlike the USPSTF review, evidence reviewed on false-positive risk did not incorporate primary data and was expressed only in terms of the number of false-positives associated with one breast cancer death prevented, not as a cumulative false-positive risk. The false-positive risk was found to be highest in the youngest age group at 690 false-positives per breast cancer death averted compared to only 204 in the 50–69 year age group.
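The false-positives-per-death-averted figures appear to be roughly consistent with multiplying the NNS by the cumulative false-positive count per 1000 women reported for the Canadian review in Table 3.3 (327 for ages 40–49 and 282 for ages 50–69). The sketch below shows that arithmetic; it is a reconstruction under that assumption, not a description of how the Canadian Task Force derived its numbers, and the function name is hypothetical.

```python
def false_positives_per_death_averted(nns: float, false_positives_per_1000: float) -> float:
    """Expected false-positives among the women who must be screened
    to avert one breast cancer death (illustrative reconstruction)."""
    return nns * false_positives_per_1000 / 1000


# NNS values from the text; false-positive counts per 1000 women
# screened for a median of 11 years, from Table 3.3.
age_40_49 = false_positives_per_death_averted(nns=2108, false_positives_per_1000=327)
age_50_69 = false_positives_per_death_averted(nns=721, false_positives_per_1000=282)

print(f"Ages 40-49: about {age_40_49:.0f} false-positives per death averted")
print(f"Ages 50-69: about {age_50_69:.0f} false-positives per death averted")
```

The products land within one of the reported 690 and 204; the small differences would be expected if the published figures were computed from unrounded inputs.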

The conclusions of the Canadian Task Force echo those of the USPSTF. The benefit of screening mammography was found to be smaller in younger women and harms were found to be more common. Screening every 2–3 years for women aged 50–74 was recommended. Routine screening was not recommended for women under 50 years of age.
