Biological variation and analytical performance specifications


Abstract

Background

There are many sources of variation in numerical results generated by examinations performed in laboratory medicine. Some measurands have biological variations over the span of life and others have predictable cyclical or seasonal variations. Most measurands in an individual display random variation around homeostatic set points and this is termed within-subject biological variation. The homeostatic set points vary between individuals and the variation between the set points of different individuals is termed between-subject biological variation. An understanding of these sources of variation is required to enable appropriate application of clinical laboratory measurements.

Content

In this chapter, we explain that numerical estimates of analytical, within-subject, and between-subject biological variation are usually generated by prospective studies: series of specimens from a cohort of individuals are examined, followed by statistical analysis to identify and quantify the different types of variation. Furthermore, sources of evidence-based data on biological variation and tools for appraising the quality of biological variation studies are presented. The chapter also provides an overview of the applications of biological variation data in laboratory medicine, such as the “index of individuality” and the “reference change value”; the latter is used to determine whether changes in serial results from an individual can be explained by analytical and within-subject biological variation alone. Additionally, models for setting analytical performance specifications for imprecision, bias, total error, and measurement uncertainty, which can be created using estimates of within-subject and between-subject variation, are presented.

Introduction

There are many causes of variation that contribute to the uncertainty of any result generated in laboratory medicine. Biological variation is one of the most important sources and should be taken into account in any interpretation made. This chapter is based on a chapter in the previous edition of the Tietz textbook.

There are various types of biological variation. The concentration or activity of some measurands changes over the span of life, some slowly and some more quickly, particularly at times of rapid physiologic development, such as the neonatal period, childhood, puberty, menopause, adulthood, and advanced age. The concentration or activity of measurands can also differ between men and women. This variation is accounted for by the creation of age- and/or sex-stratified (partitioned) reference intervals. A number of measurands have predictable cyclical rhythms in their concentrations. These can be daily (e.g., iron), monthly (e.g., pituitary gonadotrophins in females), or seasonal (e.g., vitamin D) in nature. Knowledge of the expected values throughout these cycles is vital for clinical interpretation, and specimen collection should occur at appropriate times. An absence of rhythm may indicate disease. These types of biological variation are described in detail in Chapter 9.

In 1960, Schneider defined the distribution of values observed in a group as caused by “the effect of a large number of undefined forces acting randomly to displace the values of the individual members of the group away from the true group value.” He described three factors which contribute to the overall variation in a dataset, where one result from each individual has been included.

  • Factors which make for true differences between individuals (interindividual/between-subject)

  • Factors which make for true differences from time to time in each single individual (intraindividual/within-subject)

  • Factors which make for true differences from measurement to measurement of each sample that may be measured (analytical).

Under the assumption of constant and homogeneous within-subject variation, and that the variation of laboratory error is the same for all measurements, he partitioned the observed variance (squared standard deviation [SD]), $\mathrm{SD}^2_{\text{observed}}$, using the formula:

$$\mathrm{SD}^2_{\text{observed}} = \mathrm{SD}^2_{\text{between-subject}} + \mathrm{SD}^2_{\text{within-subject}} + \mathrm{SD}^2_{\text{analytical}}$$

The concept was further elaborated in a series of four articles in Clinical Chemistry titled “Biological and analytic components of variation in long-term studies of serum constituents in normal subjects.” An overview of the analytical, within-subject, and between-subject variation is displayed in Fig. 8.1.

FIGURE 8.1, The relationship between the between-subject (SD G ), within-subject (SD I ), and analytical (SD A ) variation.

As an example, four specimens were taken from each of four individuals at daily intervals, and serum sodium concentration was measured (reference interval: 135 to 147 mmol/L). The results are provided in Table 8.1. It is evident that the results for each individual vary from day to day, which is ascribed to three sources of variation: preanalytical, analytical, and within-subject biological variation. The mean value for each individual is termed the homeostatic set point. In addition, each individual has a different average serum sodium concentration; the variation among the homeostatic set points of individuals is the between-subject variation, whereas the average variation within each individual is the within-subject variation. Generation and subsequent application of numerical data on the components of biological variation are crucial facets of laboratory medicine, and both are described in detail in this chapter.

TABLE 8.1
Serum Sodium Concentration in Four Specimens Collected at Daily Intervals From Each of a Cohort of Four Individuals

| Sodium (mmol/L) | Day 1 | Day 2 | Day 3 | Day 4 |
|---|---|---|---|---|
| Individual 1 | 137 | 139 | 136 | 138 |
| Individual 2 | 144 | 146 | 145 | 144 |
| Individual 3 | 141 | 143 | 142 | 140 |
| Individual 4 | 139 | 138 | 141 | 140 |

Values are measured in millimoles per liter.
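
The decomposition described above can be illustrated with the Table 8.1 values. The following is a minimal sketch, not a recommended analysis: it assumes single (not duplicate) measurements, so the "within" component still contains the analytical variance, and the variable names are chosen here for illustration only.

```python
from math import sqrt
from statistics import fmean, variance

# Table 8.1: serum sodium (mmol/L), four daily specimens per individual.
sodium = {
    "Individual 1": [137, 139, 136, 138],
    "Individual 2": [144, 146, 145, 144],
    "Individual 3": [141, 143, 142, 140],
    "Individual 4": [139, 138, 141, 140],
}
n_per_individual = 4

# Each individual's mean estimates that individual's homeostatic set point.
set_points = {name: fmean(values) for name, values in sodium.items()}

# Pooled within-individual variance (equal group sizes, so a plain average works).
# With single measurements this still includes the analytical variance.
within_var = fmean([variance(values) for values in sodium.values()])

# Variance of the set points, corrected for the share of within-individual
# variance that each four-value mean still carries.
between_var = variance(list(set_points.values())) - within_var / n_per_individual

print("set points:", set_points)
print("within-individual SD (mmol/L):", round(sqrt(within_var), 2))
print("between-individual SD (mmol/L):", round(sqrt(between_var), 2))
```

With only four individuals and four specimens each, these numbers merely illustrate the arithmetic; separating the analytical from the within-subject component requires replicate measurements, as described later in the chapter.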

Terminology

The terms and symbols used throughout this chapter are:

  • SD: standard deviation

  • CV: coefficient of variation

  • SD I /CV I : within-subject biological variation (variation within a single individual estimated as a pooled variation from a [homogenous] group of individuals)

  • SD G /CV G : between-subject biological variation (variation between the homeostatic set points of a group of individuals)

  • SD A /CV A : analytical variation (analytical examination variation).

Generation of data on components of biological variation

The production of data on biological variation resembles the derivation of population-based reference intervals (see Chapter 9), except that, instead of one specimen being taken from each of a large number of reference individuals, at least two specimens are needed from each individual. Biological variation studies are usually undertaken as prospective experimental studies in which a larger number of specimens is taken from a smaller cohort of reference individuals. Estimates are then derived either by a traditional statistical approach, such as analysis of variance (ANOVA) or similar analyses, or by more recently published approaches, such as Bayesian statistics. Additionally, there is renewed interest in basing estimates on fewer specimens per individual from a larger cohort (big data), as in a recently published study that used measurements available from hospital patient cohorts.

Prospective experimental studies

Design of studies

In general terms, this approach recommends that numerical estimates of CV A and both CV I and CV G components of biological variation should be generated using the following experimental approach:

  • Select a group of reference individuals.

  • Take a set of specimens from each of the individuals at regular time intervals while minimizing all sources of preanalytical variation in preparation of the subjects for specimen collection.

  • Transport specimens in a standardized way and store aliquots under controlled conditions until ready for analysis.

  • Undertake the analyses in duplicate while minimizing analytical sources of variation.

This design has been widely used and is very suitable for those measurands that have a low CV I and strict homeostatic control. For example, a typical study design includes 10 specimens collected on a weekly basis from 20 individuals recruited from a smaller cohort of reference individuals. In such a setting, the aim is to have the same number of specimens from each individual (i.e., a balanced design), but in reality, the final dataset is usually unbalanced, with an unequal number of specimens from each individual, because not all participants are able to attend every sampling and some data points may be considered outliers. Fraser and Harris stated that “the components of variation can be obtained from a relatively small number of specimens collected from a small group of subjects over a reasonably short period of time”; however, solid evidence to support this statement was lacking until recently. Basing the design on a small group of subjects has weaknesses, especially when the aim is to evaluate whether subgroups have different within-subject biological variation estimates. If the initial study population consists of 20 subjects, split equally between males and females, each sex subgroup will comprise only 10 subjects. If age groups or ethnicity are to be evaluated as well, the subgroups will be even smaller. If there is a clinical reason to assess subgroups, the study must be designed with this in mind. It is a general concern that the design is often based on convenience, for example, on how many individuals are easily recruited from local hospital staff, which may lead to biological variation estimates that are not representative of the general population.

Other types of laboratory data are often accompanied by confidence intervals (CIs); unfortunately, CIs have rarely been included in the most recently published reports that provide estimates of the components of biological variation. CIs are essential for appraising the results of a study and for comparing results between studies. The determination of CIs for different balanced designs for a two-level nested variance analysis model with varying analytical imprecision has been examined in detail. Data sets based on this model were simulated to calculate the power of different study designs for estimation of CV I. It was found that the reliability of an estimate for CV I and the power are greatly influenced by the study design and by the ratio between CV A and CV I. The study also showed the effects of increasing the number of included individuals and the number of replicates at different levels of imprecision.
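
The influence of study design on the reliability of a CV I estimate can be explored with a small simulation. The sketch below is an illustration under stated assumptions and is not the method of the study cited above: it assumes a balanced design with duplicate measurements, normally distributed effects, and set points scaled to 1 so that SDs equal CVs. The function names and the chosen values (CV A 2.5%, CV I 5%) are hypothetical.

```python
import random
from statistics import fmean

def simulate_cv_i_estimates(n_subjects, n_samples, cv_a, cv_i, n_sim=200, seed=1):
    """Return n_sim estimates of CV_I from simulated balanced studies with duplicates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sim):
        ss_a = 0.0  # sum of squares: duplicates within samples
        ss_i = 0.0  # sum of squares: samples within subjects
        for _ in range(n_subjects):
            sample_means = []
            for _ in range(n_samples):
                true_value = 1.0 + rng.gauss(0.0, cv_i)
                first = true_value + rng.gauss(0.0, cv_a)
                second = true_value + rng.gauss(0.0, cv_a)
                mean = (first + second) / 2
                ss_a += (first - mean) ** 2 + (second - mean) ** 2
                sample_means.append(mean)
            subject_mean = fmean(sample_means)
            ss_i += 2 * sum((m - subject_mean) ** 2 for m in sample_means)
        ms_a = ss_a / (n_subjects * n_samples)        # df = subjects * samples * (2 - 1)
        ms_i = ss_i / (n_subjects * (n_samples - 1))  # df = subjects * (samples - 1)
        # E[ms_i] = var_A + 2 * var_I for duplicates; negative estimates are set to 0.
        estimates.append(max((ms_i - ms_a) / 2, 0.0) ** 0.5)
    return estimates

def central_95_width(estimates):
    """Width of the central 95% range of the simulated estimates."""
    ordered = sorted(estimates)
    low = ordered[int(0.025 * len(ordered))]
    high = ordered[int(0.975 * len(ordered)) - 1]
    return high - low

for n in (10, 20, 40):
    width = central_95_width(simulate_cv_i_estimates(n, 10, cv_a=0.025, cv_i=0.05))
    print(f"{n} subjects: central 95% range width of CV_I estimates = {width:.4f}")
```

Under these assumptions the spread of the simulated estimates narrows as more individuals are included; rerunning with a larger `cv_a` shows how analytical imprecision degrades the estimate.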

Some measurands for which biological variation estimates are sought may be unstable and examinations must therefore be performed soon after the collection of specimens (e.g., for some hematologic measurands, such as mean cell volume, number of erythrocytes and leukocytes per volume). In this case, to obtain the necessary statistically unconfounded estimate of CV I , the CV A is estimated by analyzing all specimens taken at each sampling point, in duplicate. However, this only represents the within-run CV. Thus in addition, quality control materials have to be analyzed between each run to ascertain that variation due to systematic deviations in the examination procedure between each examination is excluded. Using this strategy, the quality control material should preferentially be commutable, meaning that the variations of the analyses of the individual samples and the quality control materials are comparable. Furthermore, it must be assured that the concentrations or activities of the quality control materials are similar to those of the samples from the subjects studied, because CV A often varies with concentration or activity.

Data analysis—traditional approach

The most frequently used method for sample collection and data analysis is detailed by Fraser and Harris, where duplicate analyses are performed on samples from a cohort of individuals. However, before any components of variation are estimated with this type of method, it is important to (1) verify that the individuals are in a steady state; (2) exclude outliers in the data set; and (3) assess whether the within-subject variation (CV I) is homogeneously distributed among the individuals.

Steady state.

The calculation of biological variation data assumes that the individuals assessed are in a “steady state,” that is, the homeostatic set points do not change during the duration of the study. If the population displays a trend over the sampling period, data should be transformed to a “steady state,” for example, by correcting for trends using regression analysis or using other methods such as multiple of medians (MoM) and its natural logarithm.
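
As a minimal sketch of trend correction by regression, the function below fits a least-squares line to values (for example, the group mean at each sampling occasion) and subtracts the fitted slope while keeping the overall mean. This is only one of several possible approaches; the function name is hypothetical, and in practice one would first assess whether the slope differs meaningfully from zero.

```python
from statistics import fmean

def remove_linear_trend(times, values):
    """Fit values = a + slope * time by least squares and remove the slope.

    Returns the slope and the detrended values, which keep the original mean.
    """
    t_mean = fmean(times)
    v_mean = fmean(values)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    detrended = [v - slope * (t - t_mean) for t, v in zip(times, values)]
    return slope, detrended

# Hypothetical group means rising steadily over four weekly samplings.
slope, corrected = remove_linear_trend([1, 2, 3, 4], [10.0, 12.0, 14.0, 16.0])
print(slope, corrected)
```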

Data transformation/normal distribution.

If one wishes to estimate the CI for the components of biological variation, most methods assume that the data are normally distributed. As most biological data are naturally log-normally distributed, the examination and calculations must be performed on the logarithms of the observations. This both helps in extracting the CVs and ensures that the data distribution is closer to normal. It is important to specify that the normality relates to the model effects; that is, both the analytical variation around the true sample value and the individuals’ variation around their homeostatic set point are normally distributed. It does not relate to the total pooled data. Normality can be assessed by pooling the standardized residuals for each level, namely, residuals from replicates (difference between replicates and mean of replicates from each sample), residuals from samples (difference between mean of samples and mean for the individual), and residuals from subjects (differences between individual means and total mean).

Example: Assume observations of 2.25, 2.50, and 2.75 are from individual A and 3.50, 4.00, and 4.50 from individual B. Standardized residuals are generated by first dividing by the mean for each individual. Individual A will have an average of 2.50, so standardized observations will be 0.90, 1.00, and 1.10, and corresponding standardized residuals −0.10, 0.00, and 0.10. Individual B will have an average of 4.00, standardized observations 0.875, 1.00, and 1.125, and corresponding residuals −0.125, 0.00, and 0.125. The pooled standardized residuals will be −0.10, 0.00, 0.10, −0.125, 0.00, and 0.125. The standardized residuals can then be examined using Kolmogorov-Smirnov or Anderson-Darling or other techniques for the assessment of normality.
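
The arithmetic of this example can be reproduced in a few lines. This is a sketch only; the function name is chosen here for illustration, and a formal normality test would need far more than six residuals.

```python
from statistics import fmean

def standardized_residuals(values):
    """Divide each observation by the individual's mean, then subtract 1."""
    mean = fmean(values)
    return [v / mean - 1 for v in values]

individual_a = [2.25, 2.50, 2.75]
individual_b = [3.50, 4.00, 4.50]

# Pool the residuals from both individuals before assessing normality.
pooled = standardized_residuals(individual_a) + standardized_residuals(individual_b)
print([round(r, 3) for r in pooled])
```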

If a log-transformation is applied, it is important to transform the estimated SDs back to CVs afterward. An alternative approach is using the CV-ANOVA as described by Røraas and colleagues. Using this approach, data are transformed by dividing each subject’s measurement values by that subject’s mean value, so a distribution of values around 1 is obtained, and the CV A and CV I can be derived directly, as all subjects have a mean of 1. However, this approach does not lend itself to estimation of CV G .
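
For the back-transformation, a commonly used relation for log-normally distributed data is CV = sqrt(exp(SD²) − 1), where SD is estimated on natural-log-transformed values. The sketch below assumes natural logarithms and log-normality; the function name is hypothetical.

```python
import math

def cv_from_log_sd(sd_ln):
    """Convert an SD estimated on natural-log data to a CV (as a fraction)."""
    return math.sqrt(math.exp(sd_ln ** 2) - 1)

for sd_ln in (0.05, 0.10, 0.50):
    print(f"SD(ln) = {sd_ln:.2f} -> CV = {100 * cv_from_log_sd(sd_ln):.1f}%")
```

For small SDs the CV is close to the SD of the log data, whereas for larger SDs the two diverge, which is why the back-transformation should not be skipped.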

Outliers.

The assessment of outliers is important, because such aberrant values will lead to erroneous estimates of the components of biological variation if the method of Fraser and Harris, or a similar approach, is applied. It is important that this assessment is done using the same measure of variability that is estimated; that is, if CVs are estimated, which is usually the case, all the calculations should be performed using CVs. This can be achieved by normalizing data, for example, through log-transformation as described above. After any transformation, outlier assessments are performed at three levels: (1) between duplicates or replicates, (2) between samples within an individual, and (3) between individuals. For levels 1 and 2, Cochran’s test is typically applied to the CVs, or to the SDs if log-transformed data are used. Failure to remove outliers in the replicates can result in a falsely high CV A and an erroneous CV I estimate, while failure to remove outliers from the results of each individual can result in a falsely increased CV I. Finally, outliers among the mean values of the individuals (level 3) are assessed. A simple strategy is to use Reed’s criterion, where the difference between any mean value and the next highest or lowest value in the series should be less than one-third of the absolute range of all values. Another useful approach for the assessment of outliers between individuals is a simple graphical approach in which the mean values and the range of all these values are plotted for each individual (on the y-axis) against concentration or activity (on the x-axis). An example is provided in Fig. 8.2 and discussed later in this chapter. Failure to exclude outliers among the mean values of the different individuals will result in a falsely large CV G, and because the overall mean value will be different, this may also affect CV A and CV I, depending on the transformation chosen. The number of outliers found, their concentrations, and the number of data points used to derive the components of biological variation should always be reported.
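
The two outlier checks named above can be sketched as follows. This is an illustration with hypothetical function names and data: Cochran’s statistic is computed here as the largest variance divided by the sum of all variances, and the critical value it must be compared with (which depends on the number of groups and replicates) is not computed. Reed’s criterion is applied only to the two extreme values.

```python
from statistics import variance

def cochran_c(groups):
    """Cochran's C: largest group variance divided by the sum of group variances."""
    variances = [variance(g) for g in groups]
    return max(variances) / sum(variances)

def reed_outliers(values):
    """Flag an extreme value whose gap to its neighbor is at least one-third of the range."""
    ordered = sorted(values)
    limit = (ordered[-1] - ordered[0]) / 3
    if len(ordered) < 3 or limit == 0:
        return []
    flagged = []
    if ordered[1] - ordered[0] >= limit:
        flagged.append(ordered[0])
    if ordered[-1] - ordered[-2] >= limit:
        flagged.append(ordered[-1])
    return flagged

# Hypothetical mean values (e.g., in umol/L) for five individuals.
print(reed_outliers([80, 85, 90, 95, 150]))
print(reed_outliers([80, 85, 90, 95, 100]))

# Hypothetical replicate sets; the third has a much larger spread.
print(cochran_c([[1, 2, 3], [4, 5, 6], [0, 10, 20]]))
```

If CVs are being estimated, both functions would be applied to log-transformed or mean-normalized data, in line with the recommendation above.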

FIGURE 8.2, Means and extreme values for serum creatinine in 27 older adults. Note: 100 μmol/L = 1.13 mg/dL.

Homogeneity.

Applications of within-subject biological variation data, particularly for estimation of reference change values (RCVs) (which will be dealt with later), depend on the within-subject variation data (e.g., the variances) being homogeneously distributed. If the data are derived from a study cohort with heterogeneously distributed within-subject estimates, the results may not be representative of the population, except for an “average” individual. Thus, although estimates of CV I and RCVs can be calculated, these are not generalizable to the overall population. To remedy this, stratified analysis by subgroups should be applied if possible, or an alternative approach for deriving biological variation estimates, such as Bayesian statistics, should be used. Variances of specimens drawn from a single population are “homogeneous” by definition; consequently, the ranked cumulative distribution of these variances is distributed around the true variance of the population according to a χ²/df distribution, with degrees of freedom (df) according to the individual sample sizes. It is therefore always necessary to check that this is the case. In contrast, when a series of different variances has a dispersion around the pooled variance that differs from a χ²/df distribution, they are considered to be heterogeneous. Ideally, given normality, a balanced design, and a low CV A compared to CV I, homogeneity can be illustrated by plotting the cumulated ranked fractions of within-subject variations as a function of the within-subject variation estimates on a Rankit scale (Fig. 8.3). If the data are homogeneous, this curve will fit the theoretical distribution of the square root of the pooled variance multiplied by χ²/df. Variance homogeneity can also be tested further by Bartlett’s test. Although Cochran’s test of the variances of the mean values is primarily an outlier test, it can also be used as a test for homogeneity.
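
Bartlett’s test statistic can be computed directly from the within-subject variances. The sketch below uses the standard textbook form of the statistic and hypothetical data; it returns only the statistic, which would then be compared with a χ² distribution with k − 1 degrees of freedom (k being the number of individuals). It assumes approximately normal data and no individual with zero variance.

```python
import math
from statistics import variance

def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances across groups."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]
    df_total = sum(dfs)
    pooled = sum(df * v for df, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction

# Hypothetical results for three individuals: equal spread, then one wide spread.
print(bartlett_statistic([[1, 2, 3], [4, 5, 6], [10, 11, 12]]))
print(bartlett_statistic([[1, 2, 3], [4, 5, 6], [0, 10, 20]]))
```

A statistic near zero is consistent with homogeneous variances; larger values point toward heterogeneity. Because the test is sensitive to departures from normality, it is best applied after the transformation and outlier steps described above.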

FIGURE 8.3, Variance homogeneity plots. Rankit plots show the accumulated fractions as a function of within-subject biological coefficients of variation (CVs). The filled circles represent individual CVs for healthy individuals; the line indicates the expected distribution of measured CV for “true” CV values of 17.3% for porphobilinogen (PBG) (A) and 38.1% for porphyrins (B). The Cochran test gives values of 0.11 for PBG and 0.25 for porphyrins; a value greater than 0.17 indicates heterogeneity. It follows that deriving common RCVs for porphyrins is not recommended.

After testing for homogeneity and outliers, it is important to indicate how many individuals and results had to be removed to obtain homogeneity of the data used to estimate the CV I. This again provides an indication of the representative nature of the data and of its suitability for wide application. It is also recommended to check whether excluded individuals share common traits that can explain their heterogeneity, which can be valuable information for assessing the applicability of the estimated CV I.

Calculation of analytical, within-subject, and between-subject variation.

There are several estimators available: ANOVA, maximum likelihood (ML), restricted maximum likelihood (REML), minimum variance quadratic unbiased estimator (MIVQUE), and weighted analysis of means (WAM). For balanced designs, the choice of estimator is not important; however, for unbalanced designs, the estimators can yield different results and should be chosen carefully. These estimators are available in most statistical software packages using general linear models or generalized linear models. In historical publications, many authors did not use formal ANOVA but instead simply subtracted variances. The assumption is that, because preanalytical sources of variation have been minimized and can be considered negligible, the total CV (CV T) of a set of results from each cohort of individuals includes CV A, CV I, and CV G. Then, because $\mathrm{CV}_T = \sqrt{\mathrm{CV}_A^2 + \mathrm{CV}_I^2 + \mathrm{CV}_G^2}$, the components can be calculated by simple subtraction. However, a correct calculation by this route must account for degrees of freedom; a formal ANOVA is therefore simpler, can accommodate unbalanced designs, and also allows CIs to be calculated. Additionally, in many of these studies, CV A has been estimated from quality control materials, the possibility of outliers has not been assessed, and a normal distribution and homogeneity of data have been assumed; as a consequence, the estimates of the components of biological variation are likely to be less precise.
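
For a balanced design, the ANOVA estimator reduces to simple mean-square arithmetic. The sketch below is a minimal illustration, assuming a fully balanced two-level nested design (replicates within samples within subjects) with no missing data; the function name and data are hypothetical, and unbalanced data would need one of the other estimators listed above in a statistical package.

```python
from statistics import fmean

def nested_anova_components(data):
    """Variance components from a balanced nested design.

    data[i][s] is the list of replicate results for sample s of subject i.
    Returns (grand mean, analytical, within-subject, between-subject variance).
    """
    n_subj, n_samp, n_rep = len(data), len(data[0]), len(data[0][0])
    sample_means = [[fmean(reps) for reps in subject] for subject in data]
    subject_means = [fmean(means) for means in sample_means]
    grand_mean = fmean(subject_means)

    ss_a = sum((x - sample_means[i][s]) ** 2
               for i, subject in enumerate(data)
               for s, reps in enumerate(subject)
               for x in reps)
    ss_i = n_rep * sum((m - subject_means[i]) ** 2
                       for i, means in enumerate(sample_means)
                       for m in means)
    ss_g = n_samp * n_rep * sum((m - grand_mean) ** 2 for m in subject_means)

    ms_a = ss_a / (n_subj * n_samp * (n_rep - 1))
    ms_i = ss_i / (n_subj * (n_samp - 1))
    ms_g = ss_g / (n_subj - 1)

    # Expected mean squares: ms_i = var_A + R*var_I; ms_g = ms_i + S*R*var_G.
    var_a = ms_a
    var_i = max((ms_i - ms_a) / n_rep, 0.0)
    var_g = max((ms_g - ms_i) / (n_samp * n_rep), 0.0)
    return grand_mean, var_a, var_i, var_g

# Hypothetical data: 2 subjects, 2 samples each, duplicate measurements.
data = [[[9, 11], [11, 13]], [[19, 21], [21, 23]]]
mean, var_a, var_i, var_g = nested_anova_components(data)
for label, var in (("CV_A", var_a), ("CV_I", var_i), ("CV_G", var_g)):
    print(f"{label} = {100 * var ** 0.5 / mean:.2f}%")
```

Dividing the SDs by the grand mean, as done here, is only an approximation; applying the same function to log-transformed or mean-normalized data, as discussed earlier, yields CV estimates more directly.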
