The randomized controlled trial (RCT) is the gold standard in evidence-based medicine.
Surgical RCTs can be affected by several types of bias, including informational bias, which occurs owing to a lack of blinding, and response bias, which occurs when a significant percentage of patients with a particular outcome do not provide outcomes data.
Comparative effectiveness research aims to use modern clinical trial methods to determine what works best in actual contemporary practice.
The term clinical equipoise means that there is genuine uncertainty within the expert medical community on the optimal approach for a certain medical condition. Clinical equipoise is necessary for the performance of an RCT.
Many surgical RCTs are limited by crossovers. When more than 20% of patients assigned to one treatment cohort or another choose to cross over to the other treatment arm, the intention-to-treat analysis is compromised.
An ideal clinical trial compares two or more treatment options in a study population and yields broad, generalizable, and durable results that change clinical practice.
The purpose of a clinical trial is to determine the safety, efficacy, or effectiveness of a medical intervention. A clinical trial is, by definition, prospective and has a control group. The number of clinical trials that address diseases and interventions of the spine is increasing (Fig. 176.1). As discussed in Chapter 177, retrospective studies play a significant role in clinical spine surgery research; however, this chapter focuses exclusively on prospective studies.
The randomized controlled trial (RCT) is the gold standard in evidence-based medicine. Very few RCTs have been completed to date to guide spine surgeons. Making matters more complex, of the RCTs that do reach completion, many do not provide a definitive answer to the research question and, as a result, do not change practice. Lack of blinding in most surgical RCTs also introduces an informational bias that can affect outcomes data both from patients and from clinical observers.
The goal of this chapter is to review the methodology behind the clinical trial and to explain how the clinical trial may advance the practice of neurosurgery and spine surgery in particular.
The term comparative effectiveness has become widely used since passage of the American Recovery and Reinvestment Act of 2009, which allocated $1.1 billion for new research projects aimed at improving our understanding of the differences among established medical treatments. Effectiveness differs from efficacy. An intervention is effective if it actually results in a satisfactory outcome when used broadly in the community. For example, an emergency tracheotomy might be efficacious when performed by trained personnel, but it might be ineffective, or even dangerous, when applied by untrained individual first responders. In general, a clinical trial is a carefully controlled scientific study that examines the efficacy, or safety, of an intervention. A clinical trial can evaluate the comparative effectiveness of an intervention compared with an alternative if, and only if, the results are truly “generalizable.” That is, the study population should represent the majority of patients with a particular condition, and, in the case of a surgical trial, the trial surgeons’ skill set should resemble that of an average practitioner. For example, transarticular C1‒C2 screws have been shown to be efficacious in treating C1‒C2 instability, but their effectiveness in the community has not been assessed in large practical trials involving surgeons without highly specialized spine surgery skills.
A prospective cohort study is an investigation that includes two or more groups of similar patients who undergo different treatments and who are then assessed for a specific outcome. In spine surgery, most cohort studies are interventional in nature. The Arbeitsgemeinschaft für Osteosynthesefragen Foundation–sponsored cervical spondylotic myelopathy (CSM) trial is a typical example. In this multicenter effort, patients with CSM underwent either ventral or dorsal surgery and were assessed using validated outcomes instruments at regular intervals. The nonrandomized study accrued 278 unselected patients and therefore provides some comparative effectiveness data. In this study, the baseline degree of myelopathy differed significantly between the treatment groups (ventral and dorsal surgery patients), and therefore differences in outcome observed following ventral and dorsal approaches cannot necessarily be attributed to the surgical approach. In fact, the degree of selection bias in many prospective cohort studies precludes any real conclusion regarding the primary research question. Selection bias might not always be related to the actual pathology. In a different prospective cohort study comparing 36 patients who underwent microsurgical resection with 46 patients who underwent stereotactic radiosurgery for treating vestibular schwannoma, the groups were compared using health-related quality of life (HR-QOL), functional, and radiographic tumor control outcome measures. Although the groups were similar with regard to tumor size preoperatively, the surgical group was significantly younger, thereby raising questions about the validity and generalizability of the results.
Nonrandomized data have been reported to demonstrate significant differences among treatments 56% of the time, whereas RCTs show “significant” differences only 30% of the time. In addition to information bias, a publication bias, or the tendency of authors and editors to favor the publication of studies with “positive” results, may exist. Systematic efforts to compare RCTs and nonrandomized studies on a number of medical and surgical topics have reached different conclusions. For nonrandomized trials to match the validity of an RCT, the inclusion and exclusion criteria must be clear, known prognostic factors should be balanced, and objective outcome assessments should be used.
Nonrandomized experiments might fail to balance important baseline prognostic variables, introducing bias into the results of the trial. Therefore it is generally agreed that RCTs are the gold standard for determining whether one intervention is superior, equivalent, or inferior to an alternative. It is estimated, however, that fewer than 1% of published papers in leading neurosurgical journals are RCTs. Some of the key completed RCTs that address spinal and other neurosurgical conditions discussed in this chapter are listed in Table 176.1.
Randomized Controlled Trial | Objective | N | Primary Outcome Measures | Conclusion |
---|---|---|---|---|
Fairbank et al. | Lumbar fusion vs. intensive rehabilitation for low back pain | 349 | SF-36, VAS, ODI | Lumbar fusion may not provide greater benefit than intensive rehabilitation for low back pain |
Heller et al. | Bryan cervical disc arthroplasty vs. ACDF for cervical disc herniation | 463 | NDI, return to work | Bryan cervical disc arthroplasty is a viable alternative to ACDF for single-level disc disease |
Murrey et al. | ProDisc-C cervical disc arthroplasty vs. ACDF for cervical disc herniation | 209 | NDI, SF-36, VAS | ProDisc-C cervical disc arthroplasty is not inferior to ACDF at 2 years |
Mummaneni et al. | Prestige cervical disc arthroplasty vs. ACDF for cervical disc herniation | 541 | NDI, SF-36, VAS | Prestige cervical disc arthroplasty is not inferior to ACDF at 2 years |
Guyer et al. | Charité lumbar disc arthroplasty vs. lumbar fusion for low back pain | 375 | VAS, ODI, SF-36, reoperation rate | Charité lumbar artificial disc has equivalent outcomes to lumbar fusion, but has a lower reoperation rate |
Weinstein et al. | Discectomy vs. nonoperative management for symptomatic lumbar disc herniation | 501 | SF-36 and ODI | No difference in outcome between surgical and nonoperative management for lumbar herniated disc at 2 years |
Weinstein et al. | Surgery vs. nonoperative management for degenerative lumbar spondylolisthesis | 304 | SF-36 and ODI | No difference in outcome between surgical and nonoperative management for lumbar spondylolisthesis at 2 years |
Weinstein et al. | Surgical decompression vs. nonoperative management for symptomatic lumbar spinal stenosis | 289 | SF-36 | Surgical decompression for lumbar spinal stenosis is superior to nonoperative management of lumbar spinal stenosis at 2 years |
Asymptomatic Carotid Atherosclerosis Study | CEA vs. medical management for asymptomatic carotid artery stenosis | 1662 | Ipsilateral stroke rate | CEA is superior to medical therapy for asymptomatic carotid stenosis greater than 60% |
North American Symptomatic Carotid Endarterectomy Trial | CEA vs. medical management for symptomatic carotid artery stenosis | 595 | Ipsilateral stroke rate | CEA is superior to medical therapy for patients with high-grade carotid stenosis (>70%) |
Patchell et al. | Surgical decompression with radiotherapy vs. radiotherapy alone for symptomatic spinal cord compression from metastasis | 101 | Ability to walk | Surgical decompression with radiotherapy is superior to radiotherapy alone for treating symptomatic cord compression from metastatic disease |
Mantese et al. | CEA vs. CAS for symptomatic and asymptomatic patients with carotid artery stenosis | 2502 | Stroke, MI, death | No significant difference in primary outcome rates following CEA vs. CAS; periprocedural stroke risk was higher for CAS, whereas periprocedural risk for MI was higher for CEA |
Försth et al. | Laminectomy vs. laminectomy + fusion for lumbar stenosis with or without spondylolisthesis | 247 | ODI | Addition of fusion when performing decompression surgery for lumbar spinal stenosis with or without spondylolisthesis is not associated with better clinical outcomes |
Ghogawala et al. | Laminectomy plus fusion vs. laminectomy alone for lumbar spondylolisthesis | 66 | SF-36 PCS | Addition of lumbar spinal fusion to laminectomy was associated with slightly greater but clinically meaningful improvement compared with laminectomy alone for patients with lumbar spondylolisthesis |
There are a number of significant barriers to performing high-quality RCTs in spine surgery. One of these is the heterogeneity of spine diseases—the myriad of symptoms caused by a single spinal anatomic abnormality and the clinical differences between patients with identical radiographic findings. The variation of the back pain population, for example, limits the ability to perform well-designed clinical trials comparing the nonoperative and operative treatments. Fairbank and colleagues performed an RCT (349 patients) comparing surgery with intensive rehabilitation therapy for back pain. Despite using a large sample size with validated and appropriate outcome instruments, it was difficult for the investigators to draw conclusions about the surgical treatment of low back pain because the clinical entity was itself so heterogeneous. Making matters more complex, the most significant variables that result in patient heterogeneity are often unknown. A lumbar RCT comparing the Charité (DePuy, Raynham, MA) lumbar arthroplasty to anterior lumbar interbody fusion for the treatment of low back pain demonstrated statistically significant improvements in multiple outcome measures at 2 years. The study population, however, was not well defined and thus left clinicians to selectively choose ideal candidates for the study. This type of selection bias is particularly common when trials are designed to investigate a new technology and are supported by corporate funding and can, in part, lead to failure by a medical payer system (such as Centers for Medicare and Medicaid Services) to adopt and ultimately pay for the new technology despite its being supported by class I data.
One of the most important barriers to performing a surgical clinical trial is lack of equipoise. This term, popularized by Freedman in a classic 1987 paper, means “genuine uncertainty within the expert medical community” on the optimal approach for a certain medical condition. RCTs are ethical and feasible only when there is clinical equipoise between the treatment arms. Lack of clinical equipoise affected the National Institutes of Health (NIH)–sponsored Spine Patient Outcomes Research Trial (SPORT), an RCT that compared surgery with nonoperative management for symptomatic lumbar disc herniation. The high crossover rate (30% from the nonoperative cohort to the operative cohort within 3 months) suggested that clinicians, patients, or both felt that surgery would provide a greater chance of clinical benefit after 6 weeks of failed conservative management. Conversely, almost as many patients randomized to receive surgery did not have an operation, indicating that patients had strong opinions favoring the role of conservative treatment when symptoms were mild or improving. In retrospect, the lack of clinical equipoise limited the ability of the study to detect better outcomes from surgery.
Another SPORT RCT examined surgical versus nonsurgical treatment for degenerative lumbar spondylolisthesis. Patients were included if they had neurogenic claudication or radicular leg pain, with spinal stenosis and degenerative spondylolisthesis on imaging. These patients were randomized either to nonoperative treatment or to decompressive laminectomy, with or without bilateral single-level fusion, with or without iliac crest bone grafting, and with or without pedicle screw instrumentation. This RCT also demonstrated high rates of crossover attributed to a lack of clinical equipoise. However, the methodology also demonstrated that heterogeneity of treatment can limit the ability to generalize results. In this trial, the underlying assumption that instrumented fusion, noninstrumented fusion, and decompression alone are equivalent may not be true, and therefore the trial does not provide meaningful information about which treatment is optimal for the management of grade I spondylolisthesis.
By strict statistical criteria, an RCT should be analyzed by the intention-to-treat principle; that is, the outcomes are analyzed not by which treatment the patient actually received but rather by which treatment group the patient was randomly assigned to. This approach preserves the integrity of randomization, which theoretically balances both known and unknown confounding risk factors. For example, in the Asymptomatic Carotid Atherosclerosis Study (ACAS), patients randomized to receive surgery were analyzed as such even if they had an angiographic complication after randomization (not related to surgery) or even if they did not undergo surgery at all. When crossover rates are high, the intention-to-treat analysis is less likely to detect a true difference between two treatments. In the SPORT lumbar discectomy trial, the intention-to-treat analysis did not detect any benefit from surgery (the crossover rate was 30%), although the as-treated analysis showed a significant benefit from surgery.
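The dilution of a treatment effect by crossover can be illustrated with a minimal sketch. It assumes, purely for illustration, that surgery adds a fixed number of points to an outcome score and that the decision to cross over is unrelated to prognosis, an assumption that real trials are unlikely to satisfy. The function name, the 10-point effect, and the 30% crossover fractions in each direction are hypothetical values chosen for the example, not trial data.

```python
def expected_itt_difference(true_effect, frac_nonop_to_surgery, frac_surgery_to_nonop):
    """Expected between-arm difference under an intention-to-treat analysis.

    Hypothetical model: every patient who actually has surgery gains
    `true_effect` points, and crossover is unrelated to prognosis.
    """
    # Surgery arm: only patients who actually undergo surgery gain the effect.
    surgery_arm_gain = true_effect * (1.0 - frac_surgery_to_nonop)
    # Nonoperative arm: patients who cross over to surgery also gain the effect.
    nonop_arm_gain = true_effect * frac_nonop_to_surgery
    return surgery_arm_gain - nonop_arm_gain


# Illustrative numbers only: a 10-point true benefit, 30% crossover each way.
no_crossover = expected_itt_difference(10.0, 0.0, 0.0)
with_crossover = expected_itt_difference(10.0, 0.30, 0.30)
print(f"ITT difference without crossover: {no_crossover:.1f}")
print(f"ITT difference with 30% crossover each way: {with_crossover:.1f}")
```

Under these assumptions the expected between-arm difference falls from 10 points to about 4, so a trial sized to detect the full effect could fail to detect it even though the treatment works. An as-treated comparison avoids this dilution but gives up the balance of prognostic factors that randomization provides.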
The validity of a study analysis is also compromised when significant clinical data are missing. Response bias can occur when a subject does not fully complete questionnaires at each time point of the study. If the reasons that subjects do not participate (e.g., anger over surgical outcome) differ between the arms of the study, then a response bias exists. In the first published study regarding SPORT, the degree of missing data was between 24% and 27%.
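One way to see how much missing data of this magnitude can matter is an extreme-case sensitivity check, sketched below. This is a generic illustration and not the method used in SPORT. The function name and the patient counts are hypothetical; the only figure borrowed from the text is a missing-data fraction of roughly 25%.

```python
def success_rate_bounds(successes, responders, missing):
    """Worst- and best-case success proportions when some outcomes are missing."""
    if missing < 0 or not 0 <= successes <= responders:
        raise ValueError("need 0 <= successes <= responders and missing >= 0")
    enrolled = responders + missing
    if enrolled == 0:
        raise ValueError("no patients enrolled")
    worst = successes / enrolled              # every nonresponder counted as a failure
    best = (successes + missing) / enrolled   # every nonresponder counted as a success
    return worst, best


# Hypothetical arm: 100 enrolled, 25 with missing outcomes,
# and 60 of the 75 responders reporting a successful outcome.
worst, best = success_rate_bounds(successes=60, responders=75, missing=25)
print(f"Observed among responders: {60 / 75:.2f}")
print(f"Range once missing outcomes are considered: {worst:.2f} to {best:.2f}")
```

In this hypothetical arm the success rate among responders is 0.80, but the true rate could lie anywhere from 0.60 to 0.85. If the reasons for nonresponse differ between arms, the two arms can sit at opposite ends of their ranges, which is how response bias can create or hide an apparent treatment difference.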
Another difficulty in designing RCTs for spine surgery is the learning curve associated with the clinical application of a new technology. If a practitioner has not performed a procedure with a new technology, it is likely the complication rate will be higher because of the learning curve associated with this technology. There has been a constant evolution of novel spine procedures, exemplified by the interbody fusion techniques. Current techniques for interbody fixation and fusion are changing at such a rapid pace that trials designed today to test these newer technologies might be obsolete, and therefore irrelevant, before the trials’ completion. One RCT compared the use of femoral ring allograft to the use of a titanium cage in circumferential lumbar spine fusion. Clinical outcome was measured by the Oswestry Disability Index (ODI), visual analog scale (VAS), and Short Form–36 (SF-36) with 2-year follow-up. The trial found greater clinical improvements in all outcome scales with femoral ring allograft than with titanium cages. These results, and the higher cost of titanium cages, prompted the authors to state that the use of cages in lumbar fusion was not justified. However, the surgical procedure performed in this study is now rarely performed. This “front-and-back” approach, using dorsal screw fixation in addition to the retroperitoneal ventral approach for placement of interbody graft, has been replaced by a single approach to achieve circumferential fusion. More recent lumbar techniques include minimally invasive transforaminal techniques (that possibly reduce muscle trauma), often supplemented with cages and recombinant bone morphogenetic protein–2. Although this RCT was well designed, its results cannot be applied to more recent minimally invasive lumbar fusion techniques.