Appropriate use criteria define the patients for whom a given medical or surgical procedure is appropriate, that is, those for whom the benefits exceed the risks by a wide enough margin to make the procedure worth doing.
The RAND Corporation/University of California Los Angeles Appropriateness Method uses extensive literature review and an expert panel to classify indications as appropriate, equivocal, or inappropriate.
Appropriateness criteria have been developed for procedures such as cervical fusion and lumbar laminectomy, as well as surgery for degenerative lumbar scoliosis, degenerative lumbar spondylolisthesis, vertebral fragility fractures, and persistent pain following spinal surgery.
There is still a need to develop further appropriateness criteria in spinal surgery to improve evidence-based clinical decision-making.
Surgeon utilization of appropriate use criteria may have future implications in terms of physician reimbursements from Medicare.
Efficient and fair healthcare systems depend on the delivery of appropriate care. Appropriate care means that the health benefits exceed the health risks by a sufficiently wide margin that the procedure is worth doing, exclusive of cost. There is well-documented variation in the rates of surgical procedures in the United States that is not fully explained by disease incidence or patient preferences, indicating that surgical procedures are underused in certain regions and overused in others. Underuse occurs when a patient with a necessary indication does not receive the procedure, whereas overuse occurs when a patient undergoes the procedure for an inappropriate indication. To increase appropriate care and reduce inappropriate care, there must be a clear way to define appropriate care.
Clinical practice guidelines (CPGs) have been developed to serve as tools for healthcare providers in clinical decision-making. The definition of CPGs has evolved over time. The Institute of Medicine previously defined CPGs as “systematically developed statements to assist practitioner and patient decisions about appropriate healthcare for specific clinical circumstances.” The definition has since been updated to “statements that include the recommendations intended to optimize patient care that are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options.” Appropriate use criteria (AUC) are clinical decision-making tools distinct from CPGs. They build upon the evidence-based recommendations in CPGs and attempt to cover any gaps in the guidelines. In areas where data are scarce or the quality of evidence is poor, AUC draw on the experience of practitioners in the field to inform practice. AUC are much more detailed and specific than CPGs and are meant to be applicable to nearly every patient in every clinical scenario imaginable. Table 179.1 highlights the differences between CPGs and AUC.
Clinical Practice Guidelines | Appropriate Use Criteria |
---|---|
Recommendations intended to optimize patient care | Specify when it is appropriate to use a procedure |
Informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options | Based on systematic review of evidence and uses consensus of expert opinion where quality evidence is developing or lacking |
Reflect best practices based on available evidence only | Indicate what is reasonable to do in many specific clinical scenarios |
Advisory based on strength of evidence | Clearly assign scenarios as appropriate or inappropriate indications |
The appropriateness method was developed to determine for which patients certain medical and surgical procedures are appropriate. A widely used and reliable method to produce AUC is the RAND Corporation/University of California Los Angeles (UCLA) Appropriateness Method (RAM). The RAM was developed to systematically assess variation in the use of surgeries by clearly defining which patients should and should not undergo surgical intervention.
Investigators at RAND and UCLA attempted to determine appropriate use for a procedure from the medical literature in order to test their hypothesis at the time that high rates of use of a procedure in a geographic region likely represented “inappropriate” use of the procedure in that region. Upon review of the literature for six different procedures, they realized that the medical literature alone was not enough to make determinations regarding the appropriateness of a procedure. There were unanswered questions that required input from those with clinical experience in treating the condition. Panelists from multiple clinical disciplines could therefore help make judgments about appropriateness where the medical literature was lacking. The investigators also felt that appropriateness criteria should be applicable to almost every patient in every possible situation where a clinician would be considering a certain procedure. Furthermore, the criteria had to be direct, describing the clinical scenarios in enough detail that each specific scenario could be labeled as appropriate or not. “Weasel words” could not be used to keep the determination of appropriateness vague or ambiguous for any situation.
The RAM involves review of literature in conjunction with the clinical judgment of a multidisciplinary panel. Fig. 179.1 shows a flowchart for the RAM. The original description of the method used a nine-member panel; however, panels can now be composed of six to 15 members. Panelists are often carefully selected through a process that seeks to bring the top experts in a field together to make determinations of appropriateness for a procedure or procedures for a certain condition. A set of specific definitions for all relevant but potentially ambiguous terms is provided to the panelists so that all panelists are making decisions from the same frame of reference. Clear definitions for every relevant term also help make application of the appropriateness criteria reproducible for real-life cases.
Panelists are given an extensive literature review that discusses the risks and benefits of a procedure and are then asked to rate the appropriateness of performing the procedure for specific clinical scenarios, using both their clinical judgment and the best available literature. Note that panelists are asked to consider the average patient presenting to the average physician who performs the procedure at the average hospital without considering cost implications when making their determinations of appropriateness.
To be comprehensive, many different scenarios, often hundreds, must be rated. The appropriateness ratings are done on a nine-point scale, with 1 being the lowest (highly inappropriate) and 9 being the highest (highly appropriate). A rating of 5 is given when the risks and benefits are comparable. The panel rates each indication in two rounds, with the second round held after an in-person discussion. In the first round (often performed at home), each panelist rates the appropriateness of the procedure for each clinical scenario. The panelists are then shown all of the group ratings so that each panelist may compare them with his or her own ratings (the Delphi group process method). A moderator leads the in-person discussion and goes through the scenarios one by one. It is important that the moderator be someone who is comfortable with the subject and very familiar with the literature review. Often, the moderator is a physician who has assisted in performing the literature review. It may be wise to choose a moderator who does not perform the procedure(s) being rated, to avoid introducing his or her bias into the discussion.
Discussion focuses on scenarios for which there was a wide range of ratings in the first round. The panel can choose to alter definitions of the terms at this time to suit their clinical judgment if there is disagreement or lack of clarity among panel members. This is also the time to highlight new studies with which some panel members may not be familiar. Panelists may simply disagree, based on their own critical assessment of the literature and their own clinical experience. The moderator does not try to force agreement as panelists prepare for the second round. The second-round ratings are then used for analysis, with the median panel rating used to classify the indications.
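The selection of scenarios for discussion can be illustrated with a minimal Python sketch. It ranks scenarios by the spread (highest minus lowest rating) of the first-round ratings. The function name, the scenario labels, the example ratings, and the `min_spread` threshold are all hypothetical; the RAM itself does not prescribe a particular cutoff for a "wide range" of ratings.

```python
def scenarios_for_discussion(first_round, min_spread=4):
    """Return (scenario, spread) pairs with a wide range of first-round ratings.

    first_round maps a scenario label to the list of panelist ratings (1-9).
    min_spread is an assumed, illustrative threshold, not part of the RAM.
    """
    flagged = []
    for name, ratings in first_round.items():
        spread = max(ratings) - min(ratings)  # range of the panel's ratings
        if spread >= min_spread:
            flagged.append((name, spread))
    # Widest spread first, so the most contested scenarios are discussed first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical first-round ratings from a nine-member panel.
first_round = {
    "scenario_1": [7, 8, 8, 9, 7, 8, 7, 9, 8],
    "scenario_2": [2, 5, 8, 3, 7, 9, 1, 6, 4],
    "scenario_3": [4, 5, 6, 5, 7, 3, 5, 6, 4],
}

for name, spread in scenarios_for_discussion(first_round):
    print(name, spread)
```

In this made-up example, the second scenario has the widest spread and would be discussed first, whereas the first scenario, on which the panel largely agrees, is not flagged.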
Appropriate indications (median rating of 7–9) are those for which the expected benefits of the procedure outweigh the expected harms. Equivocal indications (4–6) are those for which expected benefits and harms are nearly equal or for which there is disagreement among panelists. Inappropriate indications (1–3) are those for which the expected harms outweigh the benefits. Appropriate indications may be further classified as necessary if it would be improper care not to offer the procedure to the patient, there is a reasonable chance the procedure will benefit the patient, and the magnitude of the benefit is not small. This may be done in a third round of ratings. Rarely, in cases of disagreement, indications are considered uncertain. The most commonly used definition of disagreement in a nine-member panel is when three panelists rated a scenario in the lowest tertile, three in the middle tertile, and three in the upper tertile.
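The classification step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the function name and example ratings are hypothetical, the disagreement test used here (at least three ratings of 1 to 3 and at least three ratings of 7 to 9) is an assumed rule that covers the three-three-three split described above but is somewhat broader, and half-point medians such as 6.5, which can arise with an even number of panelists, are assumed to fall in the equivocal band.

```python
from statistics import median


def classify_indication(ratings):
    """Classify one clinical scenario from second-round panel ratings (1-9).

    Returns (label, median_rating, disagreement_flag).
    """
    if not ratings or any(r not in range(1, 10) for r in ratings):
        raise ValueError("ratings must be whole numbers from 1 to 9")

    low = sum(1 for r in ratings if r <= 3)   # ratings in the lowest tertile
    high = sum(1 for r in ratings if r >= 7)  # ratings in the highest tertile
    # Assumed disagreement rule: at least three ratings at each extreme.
    disagreement = low >= 3 and high >= 3

    med = median(ratings)
    if disagreement:
        label = "equivocal"       # disagreement overrides the median band
    elif med >= 7:
        label = "appropriate"
    elif med <= 3:
        label = "inappropriate"
    else:
        label = "equivocal"       # includes assumed handling of 3.5 and 6.5
    return label, med, disagreement


# Hypothetical second-round ratings from a nine-member panel.
print(classify_indication([7, 8, 8, 9, 7, 6, 8, 9, 7]))
print(classify_indication([1, 2, 3, 5, 5, 5, 7, 8, 9]))
print(classify_indication([1, 2, 2, 3, 1, 4, 2, 3, 1]))
```

The first hypothetical panel yields an appropriate indication, the second is equivocal because of disagreement despite a median of 5, and the third is inappropriate.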
The RAM has been studied extensively to test its reliability and validity. The results of the RAM are sensitive to panel composition, with physicians who perform the procedure more enthusiastic about its appropriateness than nonperformers. Kahan et al. found that performers of a procedure tended to rate it higher on the appropriateness scale than physicians in other specialties or primary care providers. Multispecialty panels provided more variation in appropriateness ratings, which led to fewer indications rated as appropriate. However, independent panels with the same composition of panelist specialties generate reproducible results (kappa 0.5–0.7). Test-retest reliability of the same panelists has had a correlation coefficient greater than 0.9. The sensitivity and specificity of the RAM in identifying inappropriate overuse of coronary revascularization have been estimated at 68% and 99%, respectively, whereas its sensitivity and specificity in identifying underuse of the procedure have been estimated at 94% and 97%, respectively.
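For readers less familiar with these statistics, the following Python sketch shows how an agreement statistic and a sensitivity/specificity pair are computed. It assumes that the kappa in question is Cohen's kappa for two panels classifying the same scenarios, which the text does not specify. All labels and counts are hypothetical and are not taken from the studies cited above.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two panels' classifications of the same scenarios."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both panels must classify the same non-empty set")
    n = len(labels_a)
    # Observed agreement: fraction of scenarios given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each panel's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: A = appropriate, E = equivocal, I = inappropriate.
panel_a = ["A", "A", "E", "I", "A", "I", "E", "A", "I", "E"]
panel_b = ["A", "A", "E", "I", "E", "I", "E", "A", "A", "E"]
print(round(cohen_kappa(panel_a, panel_b), 2))

# Hypothetical counts of true/false positives and negatives.
print(sensitivity_specificity(tp=30, fn=10, tn=90, fp=10))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two panels that agree on 8 of 10 scenarios in this made-up example score below 0.8.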
There are several existing AUC that were created using the RAM and pertain to spine surgery. AUC have been published for cervical fusion and lumbar laminectomy procedures. There are also AUC published for the following conditions: degenerative lumbar scoliosis, degenerative lumbar spondylolisthesis, vertebral fragility fractures, and persistent pain after spine surgery. These existing AUC are summarized in Table 179.2. The international consensuses reached using a modified Delphi survey performed by the AOSpine Knowledge Deformity Forum to identify appropriate management for both adolescent idiopathic scoliosis and adult spinal deformity are not included.
Condition/Procedure | Author | Year Published | Summary |
---|---|---|---|
Cervical fusion | North American Spine Society | 2013 | |
Lumbar laminectomy | Porchet et al. | 1995 | |
Degenerative lumbar scoliosis | Chen et al. | 2016 | |
Degenerative lumbar scoliosis | Daubs et al. | 2018 | |
Degenerative lumbar spondylolisthesis | Mannion et al. | 2014 | |
Vertebral fragility fractures | Hirsch et al. | 2018 | |
Persistent pain postoperatively | Tronnier et al. | 2009 | |