Measuring Outcomes of Multidimensional Geriatric Assessment Programs


Although frailty in older adults may be associated with an underlying loss of complexity in many physiologic systems, the clinical conditions and geriatric syndromes that are commonly present in frail older adults are often highly complex. This clinical complexity, including the presence of multiple interacting medical and social concerns, is the challenge and also the joy of geriatrics.

Geriatric services respond to this complexity with comprehensive approaches to assessment, multidisciplinary teams, and multidimensional interventions. Although there may be widespread agreement on the need for comprehensive, multidisciplinary, and multicomponent approaches, there is less agreement on the specific elements of these approaches. It is also not always clear which specific interventions or aspects of care (or combinations thereof) make a difference for an individual patient or for groups of patients; hence the references to the "black box" of geriatrics. Clinical complexity and comorbidity have often meant that frail older adults are excluded from many clinical trials, although there have been recent efforts to rectify this. This exclusion is problematic because the interventions tested, and the results of the studies, are then not relevant or generalizable to many frail older adult patients. Multicomponent interventions have been found to be more effective than single-component interventions for frail older patients, but these types of programs are much more difficult to evaluate in the context of clinical trials. Allore and colleagues have made a distinction between statistical and analytic considerations and clinical considerations in the design of such trials. Statistical or analytic considerations would suggest that one specific intervention should target a single outcome or risk factor, the basis on which power calculations are generally undertaken. Clinically, however, it makes sense for interventions to target more than one outcome or risk factor, and many interventions are likely to have overlapping effects. For studies of interventions for frail older patients, clinical and analytic considerations are particularly at odds.
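The tension between analytic and clinical considerations can be made concrete with a rough sample size calculation. The sketch below is illustrative only and is not drawn from Allore and colleagues: it assumes a two-arm trial comparing means, a standardized effect size of 0.5, the usual normal-approximation formula, and a Bonferroni correction when several outcomes are tested. The function name `n_per_group` and all parameter values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80, n_outcomes=1):
    """Approximate participants needed per arm in a two-arm comparison of means.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_(power))^2 / d^2.
    When several outcomes are tested, alpha is divided by their number
    (Bonferroni correction); this is an assumption made for illustration,
    not the only way to handle multiple outcomes.
    """
    adjusted_alpha = alpha / n_outcomes
    z_alpha = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical scenario: the same effect size, tested on 1, 3, or 5 outcomes.
for k in (1, 3, 5):
    print(f"{k} outcome(s): about {n_per_group(0.5, n_outcomes=k)} per group")
```

Under these assumptions, the required sample grows with each additional outcome that must be tested at a corrected significance level, which is one reason trial designers prefer a single primary outcome even when the intervention plausibly affects several.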

Given the heterogeneity of the patient population and the heterogeneity of clinical interventions, it is not surprising that evidence for the effectiveness of geriatric interventions has been hard to establish. Rubenstein and Rubenstein have closely observed this literature over the years and have pointed out a number of factors associated with an increased likelihood of demonstrating the effectiveness of these interventions. These include appropriate targeting, more intensive interventions, control over longer term management, and a usual care control group. A further consideration is suggested here as an addition to this list: the selection of meaningful and responsive outcome measures. The selection of appropriate outcome measures for geriatric interventions is not straightforward and has been identified as a priority for research. In the early 1990s, a working group of the American Geriatrics Society achieved a consensus on measures appropriate for measuring outcomes of geriatric evaluation and management units. The consensus statement recommended 12 physical outcomes, three psychological and social functioning outcomes, and 17 outcomes related to health care utilization and cost, reflecting concerns about future implementation and funding. The number and variety of these measures reflect the multidimensional nature of geriatric care as well as its potential system impact. Although all these measures may have relevance to specialized geriatric interventions, few, if any, of them would be relevant for all patients. The question therefore becomes how to achieve nonarbitrary dimensionality reduction from multidimensional interventions with multidimensional outcomes.

A more recent attempt to achieve a consensus on outcome measurement for older patients was undertaken by a U.S. National Institute on Aging (NIA) expert panel in 2001. This working group was charged to “recommend the content of a core set of well-validated universal patient-centered outcome measures that could be routinely measured and recorded widely in health care delivery” for older persons with multiple chronic conditions. This group recommended that an initial composite measure, such as the SF-36 or the Patient-Reported Outcomes Measurement Information System 29-item Health Profile (PROMIS-29), be used, with these results forming a basis for targeting additional follow-up measures. This approach has the potential to be more feasible in routine clinical practice, but still may require a fairly large array of outcome measures. The working group was unable to achieve consensus on appropriate follow-up measures in several important assessment domains, including disease burden, cognitive function, and caregiver burden. Also, despite an intention to recommend patient-centered measures, patients were not included in the consensus process, nor were measures proposed to elicit patient preferences and values, which would be fundamental to a patient-centered approach.

Some of the challenges associated with measuring outcomes of multidimensional geriatric interventions can be gauged by reviewing the outcome measures used in randomized controlled trials (RCTs) of these interventions. Relevant studies were identified from selected major systematic reviews and meta-analyses, beginning with the seminal meta-analysis of comprehensive geriatric assessment services published by Stuck and associates in 1993. Other reviews included a review of studies specifically focused on outpatient geriatric assessment, two reviews of studies focused on preventive home visits, and one review that specifically targeted multicomponent interventions. Collectively, these reviews reported results from 56 RCTs (see Appendix Table 38-1). Outcome measures were categorized into mortality, self-rated health, health care utilization, three assessment domains (physical function, cognitive function, and psychosocial outcomes), and an “other” category. These 56 studies are summarized as follows (see Appendix Table 38-1):

  • Physical function was measured in 54 studies, using 77 different measures, of which 23 were statistically significant.

  • Cognitive function was measured in 32 studies, using 12 different measures, of which six were statistically significant.

  • Psychosocial function was measured in 39 studies, using 43 different measures, of which 13 were statistically significant.

  • Self-rated health was measured in 18 studies, using nine different approaches, of which five were statistically significant.

  • Health care utilization outcomes were measured in 46 studies, using 27 different measures, of which 26 were statistically significant.

  • Other outcomes were measured in 32 studies, using 31 different measures, producing statistically significant results in 14 studies.

This review illustrates several points. Geriatric services were associated with statistically significant benefits in each category of outcome measure in at least some studies, but no category of outcome was significantly improved in all studies. None of the studies reported significant improvement in all the outcomes measured. The review also highlights the range of outcomes considered meaningful and plausible for geriatric services. Mortality is a clear end point and amenable to summation and comparison in meta-analyses, but is not necessarily the most meaningful outcome for programs serving a frail clientele for whom life expectancy is limited. Indicators related to health care utilization are of great relevance to the health care system and, although they may relate to an older person's quality of life (e.g., for some older adults their quality of life may be higher in a community setting than in a long-term care home), these are at best indirect measures of quality of life from a patient's perspective. Within each of the other domains, there is further evidence of heterogeneity; each domain has multiple aspects, and a large variety of instruments and approaches have been used to measure these. Even within the “other” category, an outcome such as falls is itself a multifactorial syndrome.

Geriatric Assessment Outcomes and Quality of Life Measures

The assessment domains commonly measured in geriatric intervention studies can be seen as major components of quality of life. If the outcomes commonly targeted by multidimensional geriatric interventions can be considered, collectively, as a reflection of quality of life as the overarching domain of importance, a sufficiently comprehensive quality of life measure could be a good choice as an outcome measure for common use in geriatric intervention studies. A candidate measure is the SF-36, or one of its variants with subsets of items, which has been very widely used as a health-related measure of quality of life. Unfortunately, testing of its use with older adults has not been extensive, and results of these studies have suggested that the utility of this measure with older patients may be limited. A promising measure is the EQ-5D, which summarizes an individual's health-related quality of life as a single index value and also provides a descriptive profile. It has proven to be a valid, reliable, and easy-to-use measure. However, it has also been shown to have limitations, predominantly ceiling effects and poor sensitivity at the top of the scale. A revised five-level version (EQ-5D-5L) has shown promise in addressing these limitations. A few studies have tested the EQ-5D in populations that include older subjects; further work in this area would be welcome.

Despite some promising work in quality of life measurement, the development of any measure that could achieve wide acceptance has been hindered by the lack of a common conceptual or theoretical understanding of the meaning of quality of life and by lack of agreement on its constituent elements. Spitzer has argued that the development of a gold standard measure is possible, even for a subjective construct such as quality of life: “We fail to have a Gold Standard…because no one has made it his or her primary objective to develop a Gold Standard either for measures of health status or for measures of quality of life…I believe Marilyn Bergner and her co-workers have a sufficiently long head start that they deserve support from all the rest of us.” Although Spitzer pointed to the work of Bergner on the Sickness Impact Profile as the best candidate for further development as a gold standard quality of life measure, Bergner turned out not to share this view: “The bitter truth is that there is no gold standard, there is unlikely ever to be one, and it is unlikely to be desirable to have one.”

Standardized Assessment Systems

Another approach that aims at providing a comprehensive assessment of health and social functioning is the use of a standardized assessment system, of which the interRAI minimum data set (RAI or MDS) assessment systems are the most prominent. The interRAI instruments are a comprehensive assessment and problem identification system developed by an international consortium of researchers. The original interRAI assessment was developed for long-term care homes (MDS 2.0) in response to U.S. government regulations (Omnibus Budget Reconciliation Act of 1987) aimed at improving nursing home quality. The interRAI home care assessment instrument (RAI-HC or MDS-HC) has been developed for home care settings. Other versions have been developed for use in mental health, acute care, palliative care, and other settings. RAI assessment items include personal items, referral information, cognition, communication and hearing, vision, mood, behavior, physical functioning, continence, disease diagnoses, preventive health measures, nutrition status, oral health, skin condition, environmental assessment, and formal and informal service use. Specific scales have been derived from RAI assessment items, including measures of activities of daily living (ADLs), cognitive impairment, depression, and pain. Application of the RAI system has been linked with reduced institutionalization and functional decline. The approach to data collection is one of best available information, which may be obtained by interview or observation of the older adult, by an interview of their caregiver (paid or unpaid), or through chart review. Although this approach may suggest the possibility of inconsistent data collection, it should be noted that there has been growing support for outcome measurement that incorporates a variety of perspectives, including self-report, proxy, and objective measures. Briefer screening tools have been developed as part of the RAI system, including the RAI contact assessment. When articulated with the more comprehensive RAI assessments such as the MDS 2.0 and the RAI-HC, the RAI system could thus be seen as an alternative strategy to achieve the aims of the NIA working group mentioned earlier (i.e., a screening tool followed by more in-depth assessment).

An important advantage of the interRAI system is that it allows for consistency in data collection across sites and across types of care settings; the various versions of the RAI instruments use similar questions and data collection approaches. This advantage is particularly strong when contrasted with the alternative practice of trying to achieve consensus on the battery of measures that should be used in clinical practice and outcome evaluation. Even if a particular group achieves consensus on a set of tools (e.g., as noted by Dickinson), another group is likely to agree on a different set (e.g., as noted by Pepersack), and it is unlikely that all members of either group will be consistent in their use of the prescribed measures.

A limitation of the interRAI assessment systems is the same as for other approaches aiming to achieve a comprehensive, multidimensional assessment: not all the assessment areas will point to relevant clinical outcomes for all patients, and it would still be necessary to identify the specific outcomes of interest for a specific intervention or for a specific patient. In the interRAI system, this is addressed to some extent through triggers that identify issues warranting further investigation, referred to as resident assessment protocols (RAPs) or clinical assessment protocols (CAPs).
