Abstract
The development of regional databases and doctors’ desktop programs that accept pathology results from different laboratories should improve patient care by allowing easy assessment of cumulative data. This has the potential to be unnecessarily confusing unless laboratories contributing to the databases provide standardised results and common reference intervals, where this is valid. Results from analytical methods that differ significantly need to be reported in a manner that avoids inappropriate interpretation.
The process of setting reference intervals requires an organisational structure which enables appropriate intervals to be set taking all relevant factors into account, including the opinions of expert clinicians. There must also be criteria for analytical agreement between the laboratories involved based on comparison studies using patient samples.
A network of QA groups across Australasia, with leadership from the AACB and RCPA, should be formed to share the ongoing work of defining reference intervals (RIs) for common tests, and reviewing them as the testing environment changes with the introduction of new techniques and instruments.
Introduction
The AACB-RCPA working party on Common Reference Intervals was formed in 2004. The benefits of common RIs have been outlined in a previous paper in the Clinical Biochemist Reviews.1 More recently the authors have toured Australia and New Zealand delivering the 2006 AACB Current Concepts lectures promoting the advantages, and indeed the necessity, for laboratories throughout Australasia to develop and adopt common RIs. We promoted the concept that the process can initially be developed on a regional basis, to include laboratories where a single patient is likely to have the same test performed and where a common database is most likely to be established. However, expanding the process of standardisation to include all regions throughout Australasia is the ideal approach as it will produce standard RIs that have undergone a much more rigorous peer assessment process to ensure universal acceptance and will also be more acceptable to the large nation-wide private providers.
Although the focus of the AACB lectures was on standardising RIs for chemical pathology tests, there is no reason why the same principles should not be extended to include tests which have a numerical result in all other pathology disciplines, and indeed, this has already begun to happen in some regions.
The Advantages of RI Standardisation
The advantages of common reference intervals to the patient and the clinician are self-evident. Provided the assays themselves give sufficiently close results, common reference intervals are required for laboratories to give the same clinical information in response to a test request. The development of regional databases as the repositories for all information on an individual patient allows clinicians to view laboratory results analysed at multiple sites. If different RIs for each laboratory are to be displayed, then the complexity of the data set on any one patient has the potential to cause unnecessary confusion and interpretative errors. The same applies to doctors’ practice systems receiving atomised results from multiple laboratories.
The appropriate determination of a reference interval is a time-consuming process with considerable exercise of professional judgement in addition to the need for significant amounts of data on which to base the decisions. It is our opinion that it is beyond the resources of all but the largest pathology services to set and review appropriately the reference intervals for the many tests offered by modern laboratories. Indeed there are many “impossible” reference intervals, for example paediatrics and pregnancy stages, where individual laboratories are virtually unable to perform appropriate studies. Additionally differing judgement means that different reference intervals are set in response to the same data, leading to the current state of widely varying RIs across Australia and New Zealand. By working together we can gain from shared data, improve decision making by wider consultation and reduce unimportant variation.
A critical review of the RIs in use within many laboratories is likely to find that many have a dubious or completely unknown source, and may not even be related to the method presently in use. They are often transferred from method to method over the years, without an up-to-date revalidation using sufficient patient samples to attain statistical significance.
A number of recent surveys of reference intervals have been conducted2,3 (GJ personal survey March 2007) and the results have consistently shown wide variation in reference intervals, which is not related to the analytical performance of the assay. The notion that the different intervals in use reflect analytical variation is not supported by data. It is our contention that the current variability in reference intervals is a correctable source of potential medical error which requires urgent attention.
Requirements for Common Reference Intervals
The prerequisites for use of common reference intervals have been elegantly described by Ceriotti in the previous issue of this journal.4 The key requirements in an ideal world are reference measurement systems, traceability of field methods and high quality reference interval studies. As described by Ceriotti these elements “are not easy, fast or straightforward”, yet the need to reach the goal of common reference intervals is urgent. There are, however, activities that can be commenced in the short term that can both provide intervals for use and establish the framework for implementing better processes as they become available. In particular, the adoption of common reference intervals requires an organisational structure to evaluate data and make recommendations, and a method for comparison of field methods using patient samples.
Organisational Structure - The Auckland Region QA Group (ARQAG) Example
For laboratories to collaborate on reference intervals an organisational structure is required to assess data and make decisions. A prime example of successful regional implementation of standardised reference intervals is the work of the ARQAG. This Group consists of ten laboratories in the upper half of the North Island of New Zealand and has been meeting on a monthly basis and working together to standardise reference intervals since 1976. Although this has proved to be a never-ending process of revising RIs as methods, instruments, and reagent formulations change, standard RIs have been in use for most of this period for a high proportion of commonly performed biochemical tests. A similar group has more recently standardised the majority of haematology test RIs.
The process of solving the background technical issues that are required for assay standardisation has led to ARQAG developing into a general problem-solving forum, covering all aspects of analysis for a wide range of instrument types and methods of analysis. Even laboratories in commercial competition table their technical problems at the monthly ARQAG meetings, and the combined resources of the region are used to resolve them, with the primary aim of benefiting all patients in the region. Maintaining confidentiality about an individual laboratory’s poor QA performance with a test has never been an issue, and indeed, confidentiality would inhibit the level of cooperation required to function efficiently. All members of the Group realise that we all have technical problems from time to time, and that every member can contribute equally to their resolution, independent of the size of the laboratory they represent. The quality of analytical performance for the whole group is enhanced with this cooperative and open approach.
Similar QA Groups are being established in other regions and states throughout Australasia, with varying levels of activity.
The next step is to form these Groups into a network across the Australasian region as a whole, so that they can work together and pool information and expertise. This will enable standard reference intervals to be developed that will gain wide acceptance, and will also facilitate the investigation of any specific technical issues arising from this process. In terms of organisational structure the RCPA and AACB are the professional organisations which must provide the leadership to promote these activities and then formally accept and promote any decisions.
Comparison of Methods
The basic requirement for standardising RIs is to ensure that all analytical methods in use produce comparable results on patient specimens. Traceability and reference method systems are the keys to long-term method comparability but current field methods can be assessed for between-method bias by sharing patient samples on a local or nation-wide basis. In some cases, there will be consistently observed differences, but they may not be clinically or statistically significant. Some measurements are however clearly method dependent, and obviously the quoted RIs should reflect this difference.
For the majority of commonly requested tests, ARQAG has found that the method related differences are not clinically significant when the values are close to the critical cut-off limits, either for disease diagnosis or patient management. The differences are usually more apparent at the extremes, where confusion is unlikely to occur. For these tests, quoting different RIs within a single region adds unnecessarily to interpretation difficulties and inhibits cumulative result comparisons.
Using non-patient samples, such as QA or QC material, is not satisfactory for method comparisons, unless all laboratories use identical methodologies or commutability of the material has been established. Results are generally more variable and method dependent when using “artificial” material than is shown by analysing fresh or frozen patient samples. The method differences seen using QA material are commonly due to matrix effects.
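As an illustration of how a between-method comparison on shared patient samples might be summarised, the following minimal sketch calculates the mean percentage difference between two methods and its scatter. The function name, the paired values and the 5% allowable bias are assumptions made for the example and are not criteria used by ARQAG.

```python
from statistics import mean, stdev


def between_method_bias(results_a, results_b):
    """Summarise paired patient-sample results from two methods.

    Returns the mean percentage difference of method B relative to the
    mean of both methods, and the SD of those percentage differences.
    """
    if len(results_a) != len(results_b) or len(results_a) < 2:
        raise ValueError("need at least two paired results")
    pct_diffs = [
        100.0 * (b - a) / ((a + b) / 2.0)
        for a, b in zip(results_a, results_b)
    ]
    return mean(pct_diffs), stdev(pct_diffs)


# Hypothetical paired results on the same patient samples (arbitrary units).
method_a = [4.0, 5.0, 6.0, 8.0, 10.0]
method_b = [4.2, 5.25, 6.3, 8.4, 10.5]

bias_pct, sd_pct = between_method_bias(method_a, method_b)

# Assumed allowable bias; a real limit would be agreed by the laboratories.
ALLOWABLE_BIAS_PCT = 5.0
acceptable = abs(bias_pct) <= ALLOWABLE_BIAS_PCT
```

A proportional difference such as this one shows as a constant percentage bias with little scatter, whereas matrix effects in QA material would typically appear as larger and less consistent differences.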
Important Theoretical Aspects
There is now an accepted body of theoretical knowledge regarding reference intervals which is covered in other publications. These include documents from the IFCC and the Clinical and Laboratory Standards Institute (CLSI) for the performance of reference interval studies and Fraser and others on the criteria for sharing intervals.5–7 There are some areas which need further consideration and agreement amongst the laboratories using the intervals.
The percentage of the reference population to be included in the interval for each analyte should be considered. The IFCC position is to use the central 95%, however other criteria may be indicated such as the lowest 99% for serum troponins.8 Other skewed distributions such as CK or liver enzymes may be best served with a lowest 95% for example.
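The choice between a central and a one-sided interval can be illustrated with a simple non-parametric calculation. The sketch below is an assumption-laden example: the function names are hypothetical and the percentile rule is plain linear interpolation between ranked values, which is not necessarily the ranking rule recommended in the IFCC or CLSI documents.

```python
def percentile(sorted_values, p):
    """Percentile (0 <= p <= 100) of pre-sorted data by linear interpolation."""
    if not sorted_values:
        raise ValueError("no data")
    rank = p * (len(sorted_values) - 1) / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = rank - lower
    return sorted_values[lower] + frac * (
        sorted_values[upper] - sorted_values[lower]
    )


def reference_limits(values, kind="central95"):
    """Return (lower, upper) limits; lower is None for one-sided intervals."""
    data = sorted(values)
    if kind == "central95":   # central 95% of the reference population
        return percentile(data, 2.5), percentile(data, 97.5)
    if kind == "lowest95":    # e.g. a skewed analyte such as CK
        return None, percentile(data, 95.0)
    if kind == "lowest99":    # e.g. serum troponins
        return None, percentile(data, 99.0)
    raise ValueError("unknown interval type: " + kind)


# Hypothetical reference values: the integers 0 to 100.
reference_values = list(range(101))
central = reference_limits(reference_values, "central95")
upper_99 = reference_limits(reference_values, "lowest99")
```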
The reference interval includes the scatter due to analytical and pre-analytical effects in addition to the within- and between-person variation. While attempts are made to minimise these effects, when multiple analysers and methods are used the scatter due to analytical effects will increase and concepts to deal with this must be developed. This is of greatest importance when an interval developed at one laboratory is considered for sharing amongst other laboratories. A multi-centre reference interval study will include these differences as part of its development.
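One way to express this, assuming the sources of variation are independent, is as a sum of variances. The symbols below are introduced for illustration only: between-person (G), within-person (I), pre-analytical (P), within-method analytical (A) and between-method (M) components.

```latex
% Single laboratory and method:
\sigma_{\mathrm{total}}^{2} = \sigma_{G}^{2} + \sigma_{I}^{2} + \sigma_{P}^{2} + \sigma_{A}^{2}

% Several analysers or methods sharing one interval:
\sigma_{\mathrm{total}}^{2} = \sigma_{G}^{2} + \sigma_{I}^{2} + \sigma_{P}^{2} + \sigma_{A}^{2} + \sigma_{M}^{2}
```

Under this assumption a shared interval is widened only to the extent that the between-method term is large relative to the biological terms.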
The criteria for sharing intervals are based on a Gaussian distribution for the reference population. There are however many analytes where this is not the case and criteria must be developed to address these analytes.
Assays with True Analytical Differences
The experience of ARQAG is that there are relatively few common biochemical tests that are clearly method dependent to a degree that is clinically significant, and the majority of these are measured using immunoassay techniques.
Vitamin B12 and folate stand out as not only being method dependent but also varying significantly and unpredictably with changes in reagent batches. Troponin T is clearly not the same as troponin I, and the various troponin I assays are still method dependent, despite moves towards standardisation.9 Free T3 and T4 are method dependent but were found at ARQAG to be remarkably similar up until 2005, when reagent changes once again created significantly different method dependent results. Rheumatoid factor, lactate dehydrogenase, and cortisol have all been shown to be method dependent, despite the universal cut-off values memorised by our clinical colleagues, which remain unchanged despite the variable RIs that different laboratories may quote.
Tumour markers (e.g. CEA, hCG) fall into an important category of their own. Oncology centres usually draw patients from a wide area, which means that they are likely to be visiting more than one laboratory for their monitoring tests. The potential variation in results used for monitoring progression or regression of a tumour may be clinically misleading. For this reason, patients must either be instructed to consistently visit one laboratory for their tests, or preferably, all specimens should be referred to the hospital laboratory directly associated with the regional oncology service.
Results from these method dependent analytes must be identified as such and kept separate in any databases where results are accumulated. At present in New Zealand and Australia Logical Observation Identifiers Names and Codes (LOINC) is the coding system most commonly used to identify individual tests for electronic communication. Unfortunately this system has only a facility to separate results of fundamentally different methods (e.g. immunoassay compared to activity assay) rather than differences between immunoassay manufacturers.
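A minimal sketch of how a cumulative database might keep such results separate is given below. The test names, method group labels and function names are hypothetical, and no real LOINC codes are used; the point is only that method dependent tests are accumulated under a key that includes the method, while other tests are pooled across laboratories.

```python
from collections import defaultdict

# Tests treated as method dependent (examples taken from the text above).
METHOD_DEPENDENT = {"vitamin_b12", "folate", "troponin_i"}


def cumulative_key(test, method_group):
    """Key under which a result is accumulated for one patient."""
    if test in METHOD_DEPENDENT:
        return (test, method_group)   # keep each method in its own series
    return (test, None)               # pool results across methods


def accumulate(results):
    """Group (test, method_group, value) tuples into cumulative series."""
    series = defaultdict(list)
    for test, method_group, value in results:
        series[cumulative_key(test, method_group)].append(value)
    return dict(series)


# Hypothetical results for one patient from two laboratories.
patient_results = [
    ("sodium", "lab_a_method", 139),
    ("sodium", "lab_b_method", 141),
    ("vitamin_b12", "lab_a_method", 250),
    ("vitamin_b12", "lab_b_method", 310),
]
cumulative = accumulate(patient_results)
```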
Setting Reference Intervals
Data Sources
Setting RIs requires a combination of statistical and non-statistical information, which must be combined with professional judgement. The basic building blocks for setting RIs are statistical studies based on measurements from a reference population. Such studies may be performed locally, be sourced from the literature or found in manufacturers’ information. In all cases it is necessary to ensure the population, pre-analytical and analytical issues reflect the practices which will actually be in use.4
A growing source of information for RI studies is “data mining” from the databases of large pathology laboratories. The assumption that underlies the use of this kind of data is that a significant proportion of the results in the database will be unaffected by the reason the patient has sought medical attention and thus will reflect the levels found in the relevant population. As the data does include patients with relevant illness, some data exclusion process is required. The first important exclusion is to restrict the data to ambulant outpatients. This markedly reduces the frequency of the effects of recumbency or of the acute phase response. Other restrictions can be applied based on the results of related tests (e.g. only include free T4 results where the TSH is within the reference interval) or where disease frequency may be expected to be low (e.g. subjects attending a health promotion clinic).
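The exclusion steps described above can be sketched as a simple filter. The record layout, the field names and the TSH limits in this example are assumptions for illustration and are not recommended values.

```python
from dataclasses import dataclass


@dataclass
class ThyroidResult:
    patient_class: str   # hypothetical coding: "outpatient" or "inpatient"
    free_t4: float       # pmol/L
    tsh: float           # mIU/L


# Assumed TSH reference interval, for illustration only.
TSH_RI = (0.4, 4.0)


def select_free_t4_for_ri_study(results, tsh_ri=TSH_RI):
    """Keep free T4 values from ambulant outpatients with TSH within its RI."""
    low, high = tsh_ri
    return [
        r.free_t4
        for r in results
        if r.patient_class == "outpatient" and low <= r.tsh <= high
    ]


# Hypothetical database extract.
extract = [
    ThyroidResult("outpatient", 15.0, 1.8),
    ThyroidResult("outpatient", 32.0, 0.01),   # excluded: TSH outside RI
    ThyroidResult("inpatient", 11.0, 2.0),     # excluded: not an outpatient
    ThyroidResult("outpatient", 13.5, 3.9),
]
selected = select_free_t4_for_ri_study(extract)
```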
Another approach to this problem is to search for underlying statistical distributions in the data using methods such as Bhattacharya analysis.10 This method can identify Gaussian distributions in the midst of other data. For analytes with a clear underlying Gaussian distribution, such as serum sodium and haemoglobin, this is a powerful and robust technique, allowing assessment of tens or hundreds of thousands of subjects with stratification according to sex or age as required. Indeed it could be considered nearly impossible for reference interval studies without significant funding to include more than a few hundred subjects in each decile of age, whereas thousands can be available in this manner.
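The principle behind this kind of analysis can be sketched as follows. For a Gaussian component binned with width h, the difference in log counts between adjacent bins is a straight line with negative slope when plotted against the bin midpoint; the slope gives the SD and the point where the line crosses zero gives the mean. The code is a simplified illustration only: the function names are hypothetical, the linear region is chosen by hand, the data are synthetic, and any grouping correction to the variance is omitted.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def gaussian_component(midpoints, counts, first_bin, last_bin):
    """Estimate mean and SD of one Gaussian component of a histogram.

    Uses equally spaced bins first_bin..last_bin (inclusive), selected by
    the user as the region where the log-difference plot is linear.
    """
    h = midpoints[1] - midpoints[0]
    xs, ys = [], []
    for i in range(first_bin, last_bin):
        if counts[i] > 0 and counts[i + 1] > 0:
            xs.append(midpoints[i])
            ys.append(math.log(counts[i + 1]) - math.log(counts[i]))
    if len(xs) < 2:
        raise ValueError("not enough usable bins")
    slope, intercept = fit_line(xs, ys)
    if slope >= 0:
        raise ValueError("no Gaussian component in the selected region")
    x_zero = -intercept / slope        # where the fitted line crosses zero
    return x_zero + h / 2.0, math.sqrt(-h / slope)


# Synthetic "sodium-like" histogram: mean 140, SD 2.5, 1 mmol/L bins.
midpoints = [130.0 + i for i in range(21)]
counts = [
    10000.0 * math.exp(-((x - 140.0) ** 2) / (2 * 2.5 ** 2))
    for x in midpoints
]
est_mean, est_sd = gaussian_component(midpoints, counts, 5, 15)
```

With real laboratory data the histogram would also contain results from patients with disease, and the linear region would be restricted to the bins dominated by the healthy population.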
Data mining techniques also have the advantage of establishing intervals which reflect the population against which a patient will be compared, i.e. those presenting to a medical practitioner, with the same pre-analytical and analytical issues.
Data from hospital patients may show different RIs for some tests, due to a lower level of physical activity, bed rest, and non-specific effects of illness, but the effects of these should be understood by those managing patients in hospital, and therefore any differences are unlikely to cause confusion.
Other Factors
There is a range of factors to consider when setting reference intervals which do not rely on strict statistical factors. These include rounding to an appropriate number to facilitate memory and to avoid indicating inappropriate accuracy for the interval, the need to consider the use of multiple lot numbers of reagents, multiple instruments and variation in methods, and the trade-off between sensitivity and specificity. No reference interval can be established without including professional judgement from pathologists, scientists and clinicians. The inclusion of our clinical colleagues in the process is vital both for ensuring the clinical validity of the interval developed and as a mark of approval promoting uptake of the interval.
Extensive literature reviews are also an essential part of the process of determining any RIs, with attention to pre-analytical factors such as sample type, delays in sample handling and the likelihood of the need for separate intervals for different sub-populations.
RI Partitioning and Alternate Strategies
Age and sex stratification need to be considered, as do any true racial differences that do not simply reflect a difference in the incidence of an abnormal state.11 This is an additional area of variability, where different laboratories choose for or against having separate intervals for the sexes, or for different age groups. This has its greatest impact in the paediatric age group where a large number of analytes have important physiological variation in the results, with consequent widely different values for the intervals. An additional variable is the choice of age brackets used when defining the reference intervals.
For some analytes the use of clinical decision points is preferred to the use of population reference intervals. This can occur when a significant section of the general population has “non-healthy” values, such as with serum lipids and glucose. Decision points for these analytes have been established on the basis of risk of future adverse events and are established by expert groups such as the NHMRC.12,13 Our key responsibility in the laboratory is to ensure our results match those of the appropriate reference method and that the decision points on our reports are up to date. Other candidates for this expert consensus approach are serum urate, for which a joint working party of the RCPA, AACB and Australian and New Zealand Rheumatology Associations has recently been established, and liver enzymes, where fatty liver disease associated with the high prevalence of diabetes and obesity raises enzyme levels in a significant proportion of the “healthy” population.
Nomenclature and Units
To avoid confusion, it is essential that all Australasian laboratories agree on common names, abbreviations, units and reporting intervals for all tests. Local practice often results in the use of a number of similar names or abbreviations for the same test, which we all understand, but which should be standardised to avoid confusion. This is particularly important as patients become more involved in their own care and such sources as Lab Tests On-Line provide reference information directly to the public.14
Although Australasian laboratories agreed to adopt SI units in the early 1970s, the uptake of these units and their subsequent use has not been complete or universal.15 Further work is required to ensure consistent use of units with recent examples being the decisions to standardise serum creatinine concentration reporting in μmol/L and creatinine clearance in mL/min.16
Common Reference Interval Project
There is much work to be done, and this needs to be shared amongst the State and NZ regional QA Groups with National leadership from the RCPA and AACB through the Common Reference Interval working party.
It has been proposed that initially, a single test be allocated to each regional QA group, or to specially convened committees formed in conjunction with the relevant clinical bodies, for them to derive appropriate reference intervals. Each group will evaluate all the relevant information and make a recommendation for further discussion, evaluation and adoption. True method differences may exclude some manufacturers’ methods from a proposed interval but advances in assay standardisation may resolve this at a later date. This is the start of an inter-laboratory collaboration which is required to serve the doctors and patients who use our services.
References
- 1. Jones GR, Barker A, Tate J, Lim CF, Robertson K. The case for common reference intervals. Clin Biochem Rev. 2004;25:99–104.
- 2. IMEP-17. Trace and minor constituents in human serum. Report to participants. European commission, Institute for Reference Materials and Measurement; 2003.
- 3. Jones G. Urine albumin sampling and reporting. Current Practice in Australasia. Urine protein sampling and reporting. Current Practice in Australasia. The Clinical Biochemist Newsletter; September 2006; pp. 31–3.
- 4. Ceriotti F. Prerequisites for use of common reference intervals. Clin Biochem Rev. 2007;28:115–21.
- 5. Solberg HE. A guide to IFCC recommendations on reference values. J Int Fed Clin Chem. 1993;5:162–5.
- 6. National Committee for Clinical Laboratory Standards. How to define and determine reference intervals in the clinical laboratory: Approved guideline. 2. Wayne, PA: CLSI; 2000. NCCLS document C28-A2.
- 7. Fraser CG. Biological Variation: from principles to practice. Washington DC: AACC Press; 2001.
- 8. Alpert JS, Thygesen K, Antman E, Bassand J-P. Myocardial infarction redefined - a consensus document of the Joint European Society of Cardiology/American College of Cardiology Committee for the Redefinition of Myocardial Infarction. J Am Coll Cardiol. 2000;36:959–69. doi: 10.1016/s0735-1097(00)00804-4.
- 9. Christenson RH, Duh SH, Apple FS, Bodor GS, Bunk DM, Panteghini M, et al. Toward standardization of cardiac troponin I measurements Part II: Assessing commutability of candidate reference materials and harmonization of cardiac troponin I assays. Clin Chem. 2006;52:1685–92. doi: 10.1373/clinchem.2006.068437.
- 10. Bhattacharya CG. A simple method of resolution of a distribution into Gaussian components. Biometrics. 1967;23:115–35.
- 11. Lahti A, Hyltoft Petersen P, Boyd JC, Fraser CG, Jorgensen N. Objective criteria for partitioning Gaussian-distributed reference values into subgroups. Clin Chem. 2002;48:338–52.
- 12. National Health and Medical Research Council. National evidence based guidelines for the Case Detection and diagnosis of type 2 diabetes mellitus. [Accessed 23 August 2007]; http://www.nhmrc.gov.au/publications/synopses/_files/di9.pdf.
- 13. Position Statement on Lipid Management 2005. Heart Lung and Circulation. 2005;14:275–91. doi: 10.1016/j.hlc.2005.10.010.
- 14. Lab Tests Online Australasia. [Accessed 23 August 2007]; http://www.labtestsonline.org.au.
- 15. Jones G. Reporting units for therapeutic drug monitoring: a correctable source of potential clinical error. Med J Aust. 2007;186:420–1. doi: 10.5694/j.1326-5377.2007.tb00977.x.
- 16. The Australasian Creatinine Consensus Working group. Chronic kidney disease and automatic reporting of estimated glomerular filtration rate: a position statement. Med J Aust. 2005;183:142–3. doi: 10.5694/j.1326-5377.2005.tb06958.x.