18F‐FDG PET for the early diagnosis of Alzheimer’s disease dementia and other dementias in people with mild cognitive impairment (MCI)

2015 Jan 28;2015(1):CD010632. doi: 10.1002/14651858.CD010632.pub2

Question	Response and weighting	Explanation
Patient Selection
Was the sampling method appropriate?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Where sampling is used, the designs least likely to cause bias are consecutive sampling or random sampling. Sampling that is based on volunteers or selecting participants from a clinic or research resource is prone to bias.
Was a case‐control or similar design avoided?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Designs similar to case‐control that may introduce bias are those designs where the study team deliberately increase or decrease the proportion of participants with the target condition, which may not be representative. Some case‐control methods may already be excluded if they mix participants from various settings.
Are exclusion criteria described and appropriate?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Study will be automatically graded unclear if exclusions are not detailed (pending contact with study authors). Where exclusions are detailed, the study will be graded as 'low risk' if exclusions are felt to be appropriate by the review authors. Certain exclusions common to many studies of dementia are: medical instability; terminal disease; alcohol/substance misuse; concomitant psychiatric diagnosis; other neurodegenerative condition. Exclusions are not felt to be appropriate if ‘difficult to diagnose’ patients are excluded. Post hoc and inappropriate exclusions will be labelled 'high risk' of bias.
Index Test
Was ¹⁸F‐FDG PET biomarker assessment/interpretation performed without knowledge of clinical dementia diagnosis?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Terms such as “blinded” or “independently and without knowledge of” are sufficient and full details of the blinding procedure are not required. Interpretation of the results of the index test may be influenced by knowledge of the results of the reference standard. If the index test is always interpreted prior to the reference standard, then the person interpreting the index test cannot be aware of the results of the reference standard and so this item could be rated as ‘yes’. For certain index tests the result is objective and knowledge of the reference standard should not influence the result, for example the level of a protein in cerebrospinal fluid; in this instance the quality assessment may be 'low risk' even if blinding was not achieved.
Were ¹⁸F‐FDG PET biomarker thresholds prespecified?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	For scales and biomarkers there is often a reference point (in units or categories) above which participants are classified as 'test positive'; this may be referred to as the threshold, clinical cut‐off, or dichotomisation point. A study is classified at high risk of bias if the authors define the optimal cut‐off post hoc based on their own study data, because selecting the threshold to maximise sensitivity and specificity may lead to overoptimistic measures of test performance. Certain papers may use an alternative methodology for analysis that does not use thresholds and these papers should be classified as not applicable.
Reference Standard
Is the assessment used for clinical diagnosis of dementia acceptable?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Commonly‐used international criteria to assist with clinical diagnosis of dementia include those detailed in DSM‐IV and ICD‐10. Criteria specific to dementia subtypes include, but are not limited to, NINCDS‐ADRDA criteria for Alzheimer’s dementia; McKeith criteria for Lewy Body dementia; Lund criteria for frontotemporal dementia; and the NINDS‐AIREN criteria for vascular dementia. Where the criteria used for assessment are not familiar to the review authors or the Cochrane Dementia and Cognitive Improvement group (‘unclear’) this item should be classified as 'high risk of bias'.
Was clinical assessment for dementia performed without knowledge of the ¹⁸F‐FDG PET biomarker?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Terms such as “blinded” or “independently and without knowledge of” are sufficient and full details of the blinding procedure are not required. Interpretation of the results of the reference standard may be influenced by knowledge of the results of the index test.
Participant Flow
Was there an appropriate interval between ¹⁸F‐FDG PET biomarker and clinical dementia assessment?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	As we test the accuracy of the ¹⁸F‐FDG PET biomarker for MCI conversion to dementia, there will always be a delay between the index test and the reference standard assessments. The time between reference standard and index test will influence the accuracy (Geslani 2005; Okello 2009; Visser 2006), and therefore we will note time as a separate variable (both within and between studies) and will test its influence on the diagnostic accuracy. We have set a minimum mean time to follow‐up assessment of 1 year. If more than 16% of participants have assessment for MCI conversion before 9 months, this item will score ‘no’.
Did all participants get the same assessment for dementia regardless of ¹⁸F‐FDG PET biomarker?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	There may be scenarios where participants who score “test positive” on the index test have a more detailed assessment. Where dementia assessment differs between participants, this should be classified as high risk of bias.
Were all participants who received ¹⁸F‐FDG PET biomarker assessment included in the final analysis?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	If the number of participants enrolled differs from the number of participants included in the 2 x 2 table, then there is the potential for bias. If participants lost to drop‐out differ systematically from those who remain, then estimates of test performance may differ. If there are drop‐outs they should be accounted for; a maximum proportion of drop‐outs to remain at low risk of bias has been specified as 20%.
Were missing ¹⁸F‐FDG PET biomarker results or uninterpretable ¹⁸F‐FDG PET biomarker results reported?	No = high risk of bias Yes = low risk of bias Unclear = unclear risk of bias	Where missing or uninterpretable results are reported, and if there is substantial attrition (we have set an arbitrary value of 50% missing data), this should be scored as ‘no’. If those results are not reported, this should be scored as ‘unclear’ and authors will be contacted.
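As an illustration only, the numeric cut‐offs stated for the interval and drop‐out items in the table above could be applied as in the following minimal Python sketch. It is not part of the review's methods. The function names, the return labels, and the treatment of unreported values as 'unclear' are assumptions made for this example, as is the choice to score a mean follow‐up below 1 year as 'no'; the source states only that 1 year is the minimum.

```python
from typing import Optional

# Cut-offs quoted in the anchoring statements.
MIN_MEAN_FOLLOW_UP_MONTHS = 12    # minimum mean time to follow-up: 1 year
MAX_FRACTION_EARLY = 0.16         # >16% assessed before 9 months scores 'no'
MAX_FRACTION_DROP_OUT = 0.20      # up to 20% drop-outs stays at low risk


def interval_judgement(mean_follow_up_months: Optional[float],
                       fraction_before_9_months: Optional[float]) -> str:
    """Hypothetical helper for the 'appropriate interval' item."""
    # Assumption: values the study does not report lead to 'unclear'.
    if mean_follow_up_months is None or fraction_before_9_months is None:
        return "unclear"
    # Assumption: a mean follow-up below the 1-year minimum scores 'no'.
    if mean_follow_up_months < MIN_MEAN_FOLLOW_UP_MONTHS:
        return "high"
    if fraction_before_9_months > MAX_FRACTION_EARLY:
        return "high"
    return "low"


def drop_out_judgement(n_enrolled: Optional[int],
                       n_in_2x2_table: Optional[int]) -> str:
    """Hypothetical helper for the 'all participants included' item."""
    if n_enrolled is None or n_in_2x2_table is None:
        return "unclear"
    if n_enrolled <= 0 or not 0 <= n_in_2x2_table <= n_enrolled:
        raise ValueError("counts must satisfy 0 <= analysed <= enrolled, enrolled > 0")
    fraction_lost = (n_enrolled - n_in_2x2_table) / n_enrolled
    return "low" if fraction_lost <= MAX_FRACTION_DROP_OUT else "high"
```

A reviewer would still need to judge whether drop‐outs differ systematically from participants who remain; the sketch covers only the stated proportions.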
Anchoring statements to assist with assessment for applicability
Question	Explanation
Were included participants representative of the general population of interest?	The included participants should match the intended population as described in the review question. The review authors should consider population in terms of symptoms, pre‐testing, potential disease prevalence, and setting. If there are clear grounds for suspecting an unrepresentative spectrum, the item should be rated poor applicability.
Index Test
Were sufficient data on ¹⁸F‐FDG PET biomarker application given for the test to be repeated in an independent study?	Variation in technology, test execution, and test interpretation may affect estimates of accuracy. In addition, the background and training/expertise of the assessor should be reported and taken into consideration. If ¹⁸F‐FDG PET biomarker was not performed consistently, this item should be rated poor applicability.
Reference Standard
Was clinical diagnosis of dementia made in a manner similar to current clinical practice?	For many reviews, inclusion criteria and assessment for risk of bias will already have assessed the dementia diagnosis. For certain reviews an applicability statement relating to reference standard may not be applicable. There is the possibility that a form of dementia assessment, although valid, may diagnose a far larger proportion of participants with disease than usual clinical practice. In this instance the item should be rated poor applicability.