Evidence-based medicine and intraoperative neurophysiology


Introduction

Evidence-based medicine (EBM) has been described as a search for a science of clinical care. Many of its core concepts and methods in use today can be traced to the field of epidemiology and the pioneering efforts of Archie Cochrane and Alvan Feinstein in the 1970s. Both promoted randomized controlled trials (RCTs) to reduce bias in medical practice. The Cochrane Collaboration was named in honor of Archie Cochrane, in part for his epidemiological studies of tuberculosis in Welsh mining communities. Alvan Feinstein, the godfather of clinical epidemiology, introduced statistics and Bayesian thought to the clinic. Greatly influenced by Feinstein, David Sackett founded the Department of Clinical Epidemiology and Biostatistics at McMaster University in Hamilton, Ontario. His successful promotion of RCTs and systematic reviews, together with this new department, laid the foundation of EBM in North America. Gordon Guyatt was an early graduate of this department and coined the term “evidence-based medicine” in 1991. He continues to be a key proponent of the most widely used metric for assessing quality of evidence, GRADE (Grading of Recommendations Assessment, Development and Evaluation). Guyatt is also a strong advocate of considering both patient values and quality of evidence in guideline recommendations. More recently, dramatic advances in causal science have been increasingly applied in big data analysis, epidemiology, and observational studies in medicine. In summary, important EBM concepts and methods emerged from epidemiology. We can anticipate that both epidemiology and the new field of causal science will improve intraoperative neurophysiological monitoring (ION) practice and secure its evidence base.

Principles of evidence-based medicine

From its beginnings in the 1990s to now, EBM has had three core epistemological principles. These principles capture “how we know” a treatment or practice (ION) has value for patients. First, “all evidence is not created equal.” We have more confidence in the conclusions of studies thought to be at less risk of bias (systematic error) than others. Second, “the totality of evidence is considered.” All studies for an outcome are examined, not just those that give an expected or desired result. Third, “patient values, along with quality of evidence, determine the value of a treatment or practice.” This third principle includes valuations of the opportunity costs (risks, benefits, economic) of a practice and its next best alternative.

All evidence is not created equal

Evidence for improved outcomes boils down to evidence for causation, or causal links, between ION and outcomes. Early EBM thought made rigid distinctions between randomized and nonrandomized studies, as represented in the familiar evidence pyramid and classes (I, II, III, IV) of evidence quality (Fig. 43.1). Current EBM thought has moved from exclusive consideration of research design (randomized vs. observational) to how a study is conducted and the specific sources of bias that limit our confidence in treatment effects.

Figure 43.1, Hierarchical classification of evidence by research design (e.g., Classes I, II, III, and IV) has given way to considerations of study conduct and limitations.

Sources (domains) of bias are different for randomized compared to nonrandomized studies. Furlan et al. described a domain-based tool for RCTs that has been adopted by the Cochrane Collaboration. For observational studies, the ROBINS-I tool was recently developed by a team with experts in causal science. Like the Cochrane tool for RCTs, ROBINS-I uses signaling questions for risk of bias judgments in each domain. ROBINS-I starts with a hypothetical “target” randomized trial for the intervention and outcomes of each observational study, not limited by ethical constraints or practical issues, and then defines risk of bias as a systematic difference between the results of the observational study and those expected from the target trial. Modern domain-based risk of bias assessments recognize that RCTs with important limitations may not support causal links between treatments and outcomes, while observational studies with adjustments for confounding can sometimes provide strong support. The GRADE method for assessing overall quality of evidence provides a useful illustration of how this works.

In GRADE, RCTs start as high-quality evidence. However, with serious limitations (bias, inconsistency, indirectness, imprecision, publication bias), they are downgraded to moderate or low quality (Fig. 43.1). For example, a systematic survey of 40 RCTs of spine surgery interventions found that the results were frequently fragile because of the sample sizes and the small number of outcome events. Adding two events to one of the treatment arms in many of these studies would have eliminated statistical significance! Many of these studies would be downgraded for imprecision by GRADE.
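The fragility described above can be illustrated with a fragility index calculation. The following is a minimal sketch, assuming the common definition (the number of non-events that must be switched to events in the arm with fewer events before a two-sided Fisher exact test loses significance at 0.05). The function names and the trial counts are hypothetical and are not taken from the spine surgery survey.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are as likely as, or less likely than, the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Events added to the arm with fewer events until p >= alpha.

    Returns None when the starting result is not significant.
    """
    def p_value(ea, eb):
        return fisher_exact_two_sided(ea, n_a - ea, eb, n_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return None
    add_to_a = events_a <= events_b  # modify the arm with fewer events
    flips = 0
    while (events_a < n_a) if add_to_a else (events_b < n_b):
        if add_to_a:
            events_a += 1
        else:
            events_b += 1
        flips += 1
        if p_value(events_a, events_b) >= alpha:
            return flips
    return None


# Hypothetical small trial: 8/10 events in one arm, 1/6 in the other.
print(fragility_index(8, 10, 1, 6))  # → 1
```

In this hypothetical trial a single additional event removes statistical significance, which is the kind of result that would raise concern about imprecision.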

Observational studies in GRADE start as low-quality evidence but, if rigorously conducted with adjustments for confounding, can be upgraded to moderate or high quality for large effects, for dose–response relationships, or if residual confounding would be expected to decrease treatment effects. A frequently cited example is the use of warfarin, a blood thinner, to reduce stroke in cardiac valve replacement. A meta-analysis of observational studies with adjustments for suspected confounders found a very strong effect (risk ratio=0.17), which has been considered high-quality evidence that warfarin reduces stroke.
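The starting levels and the up- and downgrading described above can be sketched as simple bookkeeping. This is an illustration only: actual GRADE ratings rest on judgment in each domain rather than arithmetic, the helper names are hypothetical, and the effect-size thresholds (risk ratio below 0.5 for one level, below 0.2 for two levels) are the commonly cited GRADE rules of thumb, stated here as an assumption.

```python
LEVELS = ["very low", "low", "moderate", "high"]
START = {"rct": 3, "observational": 1}  # RCTs start high, observational studies start low


def upgrade_for_effect(risk_ratio):
    """Levels gained for magnitude of effect (assumed rule-of-thumb thresholds)."""
    rr = risk_ratio if risk_ratio <= 1 else 1 / risk_ratio  # treat RR > 1 symmetrically
    if rr < 0.2:
        return 2  # very large effect
    if rr < 0.5:
        return 1  # large effect
    return 0


def grade_rating(design, downgrades=0, upgrades=0):
    """Combine starting level with downgrades and upgrades, clamped to the scale."""
    index = START[design] - downgrades + upgrades
    return LEVELS[max(0, min(len(LEVELS) - 1, index))]


# An RCT with one serious limitation (e.g., imprecision).
print(grade_rating("rct", downgrades=1))  # → moderate

# Observational evidence with a very large effect, as in the warfarin example (RR = 0.17).
print(grade_rating("observational", upgrades=upgrade_for_effect(0.17)))  # → high
```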

The totality of evidence is considered

Science is cumulative, and investigations of causal links between treatments and outcomes consider the totality of evidence available from all studies. The essentials of a study are its patients, interventions, comparison, and outcomes (PICO). Comparative studies (studies with a control group) are typically (but not always) required to support treatment effects. For ION, Outcomes for Patients with a neuromonitoring Intervention are Compared to Outcomes for patients without neuromonitoring or without a particular neuromonitoring modality. Outcomes should be those that people experience (feel physically or mentally) and care about. For ION, these are new or worsened neurologic deficits in the postoperative period and not end-of-surgery monitoring results (e.g., somatosensory evoked potentials (SEPs) or motor evoked potentials (MEPs) at closing compared to preincision baselines).

Totality of evidence is considered by assessing the quality of evidence for studies with comparable PICOs in a systematic review, which can be either narrative or quantitative with a meta-analysis. The overall quality of evidence (another way of saying this is “our overall confidence”) for an intervention effect in a systematic review is then often summarized for specific outcomes in summary of findings (SoF) tables. In GRADE, SoF tables are “outcome centric”: the quality of evidence (high, moderate, low, very low) from the same body of studies may be different for different outcomes. For ION, new neurologic deficits associated with spinal cord injury and those associated with nerve root injury could be separate outcomes in a SoF table. The quality of evidence in a SoF table might also be different for immediate postoperative outcomes compared to those assessed at follow-up. Missing data from unavailable neurologic exams at follow-up (e.g., 3 months) can limit confidence.

A meta-analysis or statistical synthesis is used when the interventions and outcomes of studies are not too diverse. Meta-analyses will typically obtain a weighted average of the treatment effect across studies, with a confidence interval and significance level. Measures of heterogeneity (differences in treatment effect from one study to the next) are also provided. Meta-analyses are useful as a summary of average treatment effects with increased power and precision compared to an individual study. An example of a meta-analysis forest plot for ION with typical statistics and their interpretation is provided in Fig. 43.2.
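The weighted average, confidence interval, significance level, and heterogeneity measures described above can be sketched for dichotomous outcomes as follows. This is a minimal sketch assuming a fixed-effect, inverse-variance pooling of log risk ratios; published forest plots may instead use Mantel-Haenszel or random-effects weights. The function names and the study counts are hypothetical, and every study is assumed to have at least one event in each arm.

```python
from math import erfc, exp, log, sqrt


def log_risk_ratio(e1, n1, e0, n0):
    """Log risk ratio and its standard error for e1/n1 (ION) vs e0/n0 (no ION)."""
    lrr = log((e1 / n1) / (e0 / n0))
    se = sqrt(1 / e1 - 1 / n1 + 1 / e0 - 1 / n0)
    return lrr, se


def two_sided_p(z):
    """Two-sided P value for a standard normal Z statistic."""
    return erfc(abs(z) / sqrt(2))


def pool_fixed_effect(studies):
    """Inverse-variance pooled risk ratio with 95% CI, Z, P, Cochran Q, and I^2."""
    effects = [log_risk_ratio(*s) for s in studies]
    weights = [1 / se ** 2 for _, se in effects]  # more precise studies weigh more
    total_w = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / total_w
    se_pooled = sqrt(1 / total_w)
    z = pooled / se_pooled
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "rr": exp(pooled),
        "ci_low": exp(pooled - 1.96 * se_pooled),
        "ci_high": exp(pooled + 1.96 * se_pooled),
        "z": z,
        "p": two_sided_p(z),
        "q": q,
        "i2": i2,
    }


# Hypothetical studies: (deficits with ION, n with ION, deficits without ION, n without ION).
studies = [(4, 60, 9, 55), (6, 120, 11, 110), (3, 40, 5, 45)]
result = pool_fixed_effect(studies)
print({key: round(value, 3) for key, value in result.items()})
```

As a check on the relationship between Z and P, `two_sided_p(2.23)` is about 0.026, consistent with the rounded values in the Fig. 43.2 caption.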

Figure 43.2, Forest plot of ION versus no ION for spine intradural tumor surgery outcomes of new neurologic deficits (sensory and/or motor). Dichotomous outcomes are shown as risk ratios (squares) with 95% confidence intervals on a log scale. The size of the square represents the weight given a study in calculating average treatment effects (diamonds). The overall effect was significant (Z=2.23, P=.03). Subgroup comparisons were undertaken for differences in pathophysiology, operative risks, and operative plan for intramedullary and extramedullary tumors. The Chi² and I² statistics gave no evidence for heterogeneity, but the small number of studies limits interpretation. Studies which reported continuous outcomes from McCormick scale scores [16, 17] and their standardized mean differences are not shown (RN Holdefer, CN Seubert, J McAuliffe, JL Shils, DM MacDonald, ME Edwards, PF Sturm). ION, intraoperative neurophysiological monitoring.

Patient values and quality of evidence determine the value of a treatment or practice

The Institute of Medicine (United States), as directed by the Medicare Improvements for Patients and Providers Act of 2008, addressed variations in methods for assessing quality of evidence, lack of transparency, and conflicts of interest in the proliferation of guideline statements by professional organizations. In Clinical Practice Guidelines We Can Trust, the Institute of Medicine emphasized that the value of a practice depends on both systematic reviews of the quality of evidence and assessments of its benefits and harms. “Clinical practice guidelines (CPGs) fundamentally rest on appraisal of the quality of relevant evidence, comparison of the benefits and harms of particular clinical recommendations, and value judgments regarding the importance of specific benefits and harms.” Like other medical practices, the value of ION depends on the quality of evidence from systematic reviews and patient valuations of the risks, benefits, and economic costs of ION.

Quality of evidence and practice value are different concepts. A good example is bracing in adolescent idiopathic scoliosis. Bracing continues to be used to prevent progression of high-risk curves despite only low or very low quality of evidence for its effectiveness. Recent Cochrane reviews described the obstacles to better evidence from RCTs. Both parents and physicians judged that the benefits of bracing (less deformity, avoidance of surgery) outweighed any harms. In many instances parents and physicians refused to randomize treatment, and attempts at RCTs of bracing versus simple observation failed.

Like bracing, and as underscored by the Institute of Medicine report, the value of ION depends on an appraisal of its benefits and harms as well as the quality of evidence that supports improved outcomes (Fig. 43.3). Opportunity costs, defined as “benefits lost from the next best alternative,” depend on patient diagnosis and type of surgical procedure and remind us to compare the benefits lost without ION to those lost with ION. The opportunity cost of no ION (benefits lost without ION) for aneurysm, spine intramedullary tumor, and spinal deformity procedures is an expected decreased risk of new, permanent, and severe neurologic deficits and of long-term medical expenses. By comparison, the opportunity cost of ION (benefits lost with ION) is relatively small and includes fewer harms specifically associated with ION (tongue bites, seizures, needle infections, unnecessary changes in surgical strategy from inaccurate ION tests) and no expense for the ION service. For surgical procedures with high opportunity costs, less certainty in evidence for improved outcomes with ION is needed to support its value. The big difference in expected opportunity costs for no ION (vs ION) tends to unbalance or “break” the clinical equipoise required for randomized studies (Fig. 43.3, point C). On the other side, clinical equipoise is tipped in favor of randomized studies when there is substantial uncertainty in the evidence estimating ION effects and the opportunity cost difference (no ION vs ION) is low (Fig. 43.3, point A).

Figure 43.3, ION value depends on opportunity costs and quality of evidence. Data points represent surgical procedures with different incidence and severity of iatrogenic injury. Equipoise is present under the curve. With increasing opportunity costs of the next best alternative (A to C), less quality of evidence is needed to break equipoise for ION value [6]. ION, intraoperative neurophysiological monitoring.

High-quality evidence from RCTs for high-risk surgeries such as those for pediatric spine tumors and spine deformities will not be forthcoming. Informed parents as well as surgeons will refuse to randomize treatment. A chair of spine surgery at a major medical center recently stated, “As to RCT in AIS (adolescent idiopathic scoliosis) you would not get a single pediatric deformity surgeon in the US to agree to that” (Peter Sturm, personal communication). On the other hand, for very frequently performed surgeries with lower risks of new neurologic deficits (e.g., cervical decompression for radiculopathy), a large RCT and the resources required for its conduct may be justified if uncertainty (equipoise) remains after well-conducted, comparative observational studies of ION. Uncertainty may remain when very large benefits are absent (e.g., risk ratios >0.2 for ION), when suspected confounders are difficult to measure, or when estimates are imprecise.

The three EBM principles are most fully developed for comparative studies of interventions and will next be applied to ION efficacy in improving outcomes. Support for ION efficacy requires demonstration of causal links between ION and outcomes, typically from systematic reviews of comparative studies. Then, these same EBM principles will be applied to ION diagnostic test accuracy (DTA) and postoperative outcomes.

Intraoperative neurophysiological monitoring and outcomes: back to basics for causal links

Association of a treatment with outcomes does not demonstrate causation but may instead result from confounding variables that affect both. To untangle association or correlation from causation, statistics has typically relied on randomized assignment of treatments to patients. In such studies, confounding is avoided. Still, children readily learn causal relationships from nonrandomized, observational data. Formally, this requires methods to “neutralize” confounding from unequal representation of prognostic variables in treatment and control groups. Suspected confounders must be accounted for by separating their effects on outcomes from those of the treatment. Showing no statistical difference in suspected confounders between the groups used for comparisons is not sufficient. Multivariate regression or propensity score analysis is typically required to make adjustments for confounding and estimate treatment efficacy.
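The need to separate a confounder's effect from the treatment effect can be shown with a small worked example. The sketch below uses stratification with a Mantel-Haenszel pooled risk ratio as a simpler stand-in for the multivariate regression or propensity score methods named above. All counts are invented for illustration, and the single confounder (preoperative risk, with ION used more often in high-risk patients) is an assumption.

```python
# Each stratum: (deficits with ION, n with ION, deficits without ION, n without ION).
# Hypothetical data: ION is used mostly in high-risk patients.
STRATA = {
    "high risk": (16, 200, 8, 50),
    "low risk": (1, 50, 8, 200),
}


def crude_risk_ratio(strata):
    """Risk ratio that ignores the confounder by collapsing all strata."""
    e1 = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    e0 = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (e1 / n1) / (e0 / n0)


def mantel_haenszel_risk_ratio(strata):
    """Risk ratio adjusted for the stratifying variable (Mantel-Haenszel)."""
    numerator = 0.0
    denominator = 0.0
    for e1, n1, e0, n0 in strata.values():
        total = n1 + n0
        numerator += e1 * n0 / total
        denominator += e0 * n1 / total
    return numerator / denominator


print(crude_risk_ratio(STRATA))            # → 1.0625
print(mantel_haenszel_risk_ratio(STRATA))  # → 0.5
```

In these invented data ION halves the risk of a deficit within each risk stratum, yet the crude comparison shows no benefit because monitored patients were at higher baseline risk. Only the adjusted estimate recovers the within-stratum effect.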

Fundamental problems in causal science are those of identifiability and planning, both of which are pertinent to ION observational studies. Identifiability asks, “Can I obtain reliable estimates of ION treatment effects on outcomes?” Planning asks, “What suspected confounders do I need to measure for adjustments to demonstrate ION effects?” Answers to both of these questions are typically implicit as “background knowledge” for ION comparative studies and are greatly facilitated by a causal model.

Structural causal models (SCMs) and their expression as directed graphs are key advances in causal science which, over the past 10–15 years, have seen increasing use in epidemiology and the health sciences. By making assumed relationships between variables explicit and testable, SCMs can reveal treatment effects in observational studies. SCMs and directed graphs are used in this chapter to convey important ION concepts: ION effects on outcomes are indirect; ION use as a surrogate endpoint constrains estimates of its DTA and efficacy; and estimates of ION treatment effects in observational studies must account for error from confounding. Our general treatment of SCMs will follow that of Judea Pearl.

A SCM and directed graph of ION is shown in Fig. 43.4. In directed graphs, nodes (circles) represent variables and the edges (lines) between nodes represent functions. Edges are often directed to show the path of information flow. The seven variables and their associated edges in this SCM represent commonly understood relationships thought to influence ION effects on postoperative outcomes. Each of the seven (endogenous) variables is influenced by an exogenous variable that is external to and unaffected by the model. Typical values for these exogenous variables are provided in Fig. 43.4. For example, surgical procedure (spine deformity, intramedullary tumor resection) is exogenous to Intraoperative Events and will influence its values (correction, instrumentation, bleeding). ION methods (which include alert criteria) are exogenous to ION and will influence its values (alert or no alert).

Figure 43.4, (A) SCM variables and relationships commonly assumed to affect ION effects on outcomes. Each of the seven endogenous variables in the model is influenced by an external (exogenous) variable (table). ION effects on Post-Op Outcomes are indirect and mediated through the Surgeon (green path). Surgeon responses to ION alerts affect both Post-Op Outcomes and Intra-Op Events (blue path) and limit estimates of ION test accuracy and efficacy. (B) Comparative study of “ION” versus “No ION” is represented. Appropriate surgeon responses to ION alerts are assumed (dashed oval). Nonrandomized studies must take into account effects on Post-Op Outcomes from suspected confounders (Patient Characteristics, Training/Experience/OR adjuncts, red pathways). After adjustments for confounding, outcomes for ION are compared to those for no ION (control) and risk differences or risk ratios obtained. See text for more details. ION, intraoperative neurophysiological monitoring; SCM, structural causal model.

The ION SCM has important omissions (Fig. 43.4A). For example, an anesthesia variable, which influences ION, is not included. Likewise, particular ION methods (alert criteria, model of service delivery), surgeon decision-making, and postoperative outcome measures are kept external to the model and could be included as endogenous variables in a more realistic model. Despite these limitations, this model addresses three concepts that are essential for research investigating ION treatment effects on outcomes.
