The diagnosis of tuberculous meningitis in adults and adolescents: protocol for a systematic review and individual patient data meta-analysis to inform a multivariable prediction model

Tom Boyles; Anna Stadelman; Jayne P Ellis; Fiona V Cresswell; Vittoria Lutje; Sean Wasserman; Nicki Tiffin; Robert Wilkinson

doi:10.12688/wellcomeopenres.15056.1

Wellcome Open Res. 2019 Jan 31;4:19. [Version 1] doi: 10.12688/wellcomeopenres.15056.1

PMCID: PMC7863992 PMID: 33585702

Abstract

Background: Tuberculous meningitis (TBM) is the most lethal and disabling form of tuberculosis. Delayed diagnosis and treatment, which is a risk factor for poor outcome, is caused in part by lack of availability of diagnostic tests that are both rapid and accurate. Several attempts have been made to develop clinical scoring systems to fill this gap, but none have performed sufficiently well to be broadly implemented. We aim to identify and validate a set of clinical predictors that accurately classify TBM using individual patient data (IPD) from published studies.

Methods: We will perform a systematic review and obtain IPD from studies published from 1990 onwards that undertook diagnostic testing for TBM in adolescents or adults using at least one of the following: microscopy for acid-fast bacilli, a commercial nucleic acid amplification test for Mycobacterium tuberculosis, or mycobacterial culture of cerebrospinal fluid. Clinical data that have previously been shown to be associated with TBM, and can inform the final diagnosis, will be requested. The dataset will be divided into training and test/validation datasets for model building. A predictive logistic model will be built using a training set of patients with definite TBM and no TBM. Should it be warranted, factor analysis may be employed, depending on evidence for multicollinearity or the case for including latent variables in the model.

Discussion: We will systematically identify and extract key clinical parameters associated with TBM from published studies and use a ‘big data’ approach to develop and validate a clinical prediction model with enhanced generalisability. The final model will be made available through a smartphone application. Further work will be external validation of the model and test of efficacy in a randomised controlled trial.

Keywords: Tuberculous meningitis, multivariable prediction rule, machine learning, diagnostics

Introduction

Tuberculosis remains a major global health problem, with the most lethal and disabling form being tuberculous meningitis (TBM), of which there are more than 100,000 new cases each year ¹. Mortality is high, particularly in children and patients who are co-infected with HIV-1 ². The diagnosis is often delayed by the insensitive and lengthy culture technique required for disease confirmation, with delayed diagnosis and treatment being important risk factors for poor outcome ¹. Recently introduced nucleic acid amplification tests (NAATs) allow more rapid detection of TBM. Pooled specificities of 98.0% and 90% for Xpert MTB/RIF and Xpert MTB/RIF Ultra, respectively, suggest that they are effective rule-in tests with the potential to speed up diagnosis and reduce unnecessary treatments for alternative conditions in some patients. However, the pooled sensitivities are 71.1% and 90%, respectively, and are even lower for patients with HIV (58% to 81%) ³. Given the extremely high mortality if treatment is withheld from patients with TBM, a negative result is unlikely to be sufficient evidence to withhold treatment in most patients. Improved strategies to rapidly and accurately diagnose TBM are urgently needed ¹.
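To illustrate why a negative NAAT result may not justify withholding treatment, the following minimal sketch applies Bayes' theorem to the pooled estimates quoted above. The pre-test probability of 30% is a hypothetical value chosen only for illustration and does not come from this protocol; the function and variable names are likewise assumptions.

```python
def probability_of_tbm_after_negative_test(pre_test, sensitivity, specificity):
    """Post-test probability of disease given a negative result (Bayes' theorem)."""
    false_negatives = pre_test * (1.0 - sensitivity)
    true_negatives = (1.0 - pre_test) * specificity
    return false_negatives / (false_negatives + true_negatives)


# Hypothetical pre-test probability; NOT a figure from the protocol.
PRE_TEST = 0.30

# Pooled (sensitivity, specificity) as quoted in the text.
POOLED_ESTIMATES = {
    "Xpert MTB/RIF": (0.711, 0.98),
    "Xpert MTB/RIF Ultra": (0.90, 0.90),
}

residual_risk = {
    name: probability_of_tbm_after_negative_test(PRE_TEST, sens, spec)
    for name, (sens, spec) in POOLED_ESTIMATES.items()
}

for name, risk in residual_risk.items():
    print(f"{name}: probability of TBM after a negative result = {risk:.3f}")
```

Under this assumed pre-test probability, about 11% of patients with a negative Xpert MTB/RIF result would still have TBM, which is consistent with the argument that a negative result alone is weak grounds for withholding treatment.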

A major stumbling block in TBM research had been the absence of a single reference standard test or standardised diagnostic criteria. In 2010, a committee of 41 international experts in the field developed consensus case definitions for TBM for use in clinical research ⁴. These case definitions have helped to standardise research but are not appropriate for use in routine clinical care as they depend on variables such as cerebrospinal fluid (CSF) culture results, which can take up to 6 weeks to become positive and may include brain imaging, which is not available in many resource constrained settings.

Another approach to improving rapid diagnosis in TBM, particularly in resource-limited settings where the majority of cases occur, is to develop and validate multivariable prediction models. At least 10 models have been published for the diagnosis of TBM, but a major limitation is that their performance is variable in different populations and settings ¹. A major reason for heterogeneous model performance across different settings and populations is case mix variation, which refers to the distribution of important predictor variables such as HIV status and age, and the prevalence of TBM. Case mix variation across different settings or populations can lead to genuine differences in the performance of a prediction model, even when the true predictor effects are consistent (that is, when the effect of a particular predictor on outcome risk is the same regardless of the study population) ⁵.

Recent studies have shown how big datasets can be used to examine heterogeneity and improve the predictive performance of a model across different populations, settings, and subgroups ⁶⁻⁸. Individual patient data meta-analysis is preferred to aggregate data meta-analysis, as risk scores can be generated and validated, and multiple individual level factors can be examined in combination ⁹.

Objectives

1. Conduct a systematic review to identify studies that applied systematic diagnostic strategies for TBM in adolescents and adults presenting with meningitis
2. Establish an international collaboration among TBM research groups who are willing to provide individual patient data (IPD)
3. Use IPD to develop a clinical prediction model that estimates the probability of TBM in adolescents and adults, based on clinical and laboratory data that are routinely available within 48 hours of initial evaluation

Secondary objectives include an assessment of the number and quality of studies addressing the diagnosis of TBM, as well as an analysis of demographic and clinical characteristics of cases and non-cases of TBM.

Protocol

A systematic review and IPD meta-analysis will be performed according to Preferred Reporting Items for Systematic review and Meta-Analysis of IPD (PRISMA-IPD) guidelines ¹⁰.

Identification of studies

Potentially eligible studies will be identified by an extensive search of electronic databases, manual search of reference lists and by contacting researchers with interest and expertise in meningitis who may have access to unpublished studies.

We have designed a broad search strategy to maximise sensitivity. We will combine medical subject heading (MeSH) and free text terms to identify relevant studies, see Table 1. We will search Medline (accessed via PubMed), Africa-Wide Information and CINAHL (both accessed via EBSCO Host). We will not limit our searches by geographical location. The search will be restricted to studies published after 01 January 1990 and in English. The detailed search strategies will be presented in an online supplementary appendix. Reference lists of the selected articles and reviews will be searched manually to identify additional relevant studies.

Table 1. Proposed search terms.

Search	Query
#1	Search tuberculosis meningitis Field: Title/Abstract
#2	Search “tuberculosis, meningeal”[MeSH]
#3	Search cerebral tuberculosis Field: Title/Abstract
#4	Search “brain tuberculosis” Field: Title/Abstract
#5	Search TBM Field: Title/Abstract
#6	Search ((((tuberculosis meningitis) OR “tuberculosis, meningeal”[MeSH Terms]) OR “cerebral tuberculosis”) OR “brain tuberculosis”) OR TBM
#7	Search “Diagnosis”[Majr]
#8	Search diagnosis or diagnostic Field: Title/Abstract
#9	Search “clinical scores” or “clinical scoring” Field: Title/Abstract
#10	Search “Research Design”[Mesh]
#11	Search predictor* or predictive Filters: Field: Title/Abstract
#12	Search “clinical predict*” Field: Title/Abstract
#13	Search “clinical feature*” Field: Title/Abstract
#14	Search #13 OR #12 OR #11 OR #10 OR #9 OR #8 OR #7 Filters: Humans
#15	Search #14 AND #6 Filters: Humans

Types of studies

Inclusion criteria

Randomized controlled trials, cross-sectional studies, and observational cohort studies
Participants presenting to care with clinical meningitis
Use of at least 1 of microscopy for acid-fast bacilli, commercial nucleic acid amplification test (NAAT) for Mycobacterium tuberculosis or mycobacterial culture of CSF to diagnose TBM
Study includes a minimum of 10 participants aged ≥ 13 years

Exclusion criteria

Case-control studies and case reports/series of patients with confirmed TBM
Participants taking anti-TB drugs at the time of their evaluation
Non-English articles
Studies published before 1990
Full text unable to be located
Studies not in humans

Screening and study selection

Duplicate studies will be removed. Study selection will follow the process described in the Cochrane Handbook of Systematic Reviews and PRISMA-IPD statements ¹⁰. Two investigators will independently screen titles and abstracts to remove irrelevant studies. Full text review will be performed on the remaining studies to determine eligibility. Any disagreements will be resolved by consensus or in consultation with a third reviewer.

Data extraction

Two review authors will independently extract study-level variables onto a proforma: study setting and dates, contact details, inclusion and exclusion criteria, and number of patients. Corresponding authors of studies identified as eligible after full text review will be contacted with a request to provide anonymised individual patient data. IPD for variables that have previously been shown to be predictive of TBM ¹ and competing diagnoses will be requested (Table 2). Investigators will be requested to share their anonymised data after obtaining a signed agreement.

Table 2. Individual patient data that will be requested from authors.

LAM = lipoarabinomannan; NAAT = nucleic acid amplification test.

Clinical data at presentation
• Age *
• Sex *
• Presence of extrapyramidal movements *
• Presence of neck stiffness *
• Duration of symptoms *
• Focal neurological deficit (including cranial nerve palsy) *
• Temperature *
• Glasgow Coma Scale *
• AVPU score *

Laboratory results (blood)
• HIV sero-status *
• Total leukocytes *
• CD4 count *
• Glucose *

Laboratory results (CSF)
• Appearance *
• Total leukocytes *
• Total neutrophils *
• Total lymphocytes *
• Protein *
• Glucose *
• Gram stain *
• Adenosine deaminase activity *
• Bacterial culture
• India ink stain *
• Cryptococcal antigen * and culture
• Microscopy for acid-fast bacilli
• Mycobacterial culture
• NAAT for Mycobacterium tuberculosis
• NAAT for any virus
• Syphilis serology *
• Any other test informing an alternative diagnosis

Laboratory results (urine, sputum and serous effusions)
• Urine LAM *
• Microscopy for acid-fast bacilli *
• Mycobacterial culture
• NAAT for Mycobacterium tuberculosis *

Radiological investigations
• Chest X-ray *
• Abdominal ultrasound scan
• CT brain
• MRI brain

Autopsy
• Histological results from autopsy

*Factors chosen a priori to be used to develop the initial model

Data management

Investigators will be asked to share anonymised individual patient data, preferably electronically using encrypted files and other secure data transfer technologies using standardised data collection forms. Only study collaborators will have access to the combined IPD data available in Box. Box Secure Storage is a cloud storage and collaboration service configured to meet the security standards for HIPAA data. Data will remain stored in Box for the duration of the study and will not be used or sold for any commercial purpose.

Authorship

Authors providing IPD will be asked to nominate co-authors to expand the expertise of the review group, including review of preliminary findings and manuscript authorship. The number of co-authors will depend on the amount of data supplied: 1 author for <100 patients, 2 authors for >100 and <250 patients, and 3 authors for >250 patients.

Quality assessment

Quality assessment in terms of risk of bias and applicability for each included study will be performed according to the QUADAS-2 tool for diagnostic accuracy studies ¹¹. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias.

Data synthesis

1. Review and descriptive analysis of available parameters and data completeness for contributing datasets.

The contributing datasets will be reviewed for sample size, available parameters and data completeness, to inform the selection of a modelling approach. A descriptive analysis will be undertaken to understand similarities and differences between the contributing datasets. Participant characteristics, clinical features, and test results will be summarized for each contributing dataset and compared across datasets using chi-square, t-tests, or non-parametric methods as warranted. Additionally, participant characteristics and clinical features will be further evaluated for heterogeneity via IPD meta-analysis accounting for random effects.
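As one concrete instance of the between-dataset comparisons described above, the sketch below computes a Pearson chi-square statistic for a categorical characteristic tabulated by contributing dataset. The counts are hypothetical and the function name is an assumption; converting the statistic to a p-value would additionally require the chi-square distribution from a statistics package, which is omitted here.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows (one per contributing dataset) holding
    observed counts (one column per category of the characteristic).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the characteristic were distributed
            # identically in every dataset.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: [HIV-positive, HIV-negative] in two datasets.
hiv_status_by_dataset = [
    [30, 70],
    [45, 55],
]

statistic, dof = chi_square_statistic(hiv_status_by_dataset)
print(f"chi-square = {statistic:.2f} on {dof} degree(s) of freedom")
```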

2. Developing a Predictive model

Participants will be categorised as definite TBM if they have one of the following:

At least one of acid-fast bacilli seen in the CSF; Mycobacterium tuberculosis cultured from the CSF; or a CSF positive commercial NAAT
Acid-fast bacilli seen in the context of histological changes consistent with tuberculosis in the brain or spinal cord at autopsy
Culture positive extra-neural TB and no other definitive cause for clinical meningitis

Participants will be categorised as definitely not TBM if they:

Do not fulfil the criteria for definite TBM and either an alternative diagnosis is made or they fully recovered, without antituberculosis chemotherapy, 3 months after admission

Participants will be categorised as possible TBM if they:

Do not meet the criteria for either definite TBM or definitely not TBM
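The three categories above can be expressed as a simple decision rule. The sketch below is one possible encoding, not the authors' implementation: the field names are hypothetical, and it assumes that "an alternative diagnosis is made" and "other definitive cause for clinical meningitis" can be captured by a single flag.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # All field names are illustrative assumptions.
    csf_afb_seen: bool = False
    csf_culture_positive: bool = False
    csf_naat_positive: bool = False
    autopsy_afb_with_tb_histology: bool = False
    extraneural_tb_culture_positive: bool = False
    alternative_diagnosis: bool = False
    recovered_without_tb_treatment: bool = False  # at 3 months after admission


def classify(p):
    """Return 'definite TBM', 'definitely not TBM' or 'possible TBM'."""
    definite = (
        p.csf_afb_seen
        or p.csf_culture_positive
        or p.csf_naat_positive
        or p.autopsy_afb_with_tb_histology
        or (p.extraneural_tb_culture_positive and not p.alternative_diagnosis)
    )
    if definite:
        return "definite TBM"
    if p.alternative_diagnosis or p.recovered_without_tb_treatment:
        return "definitely not TBM"
    return "possible TBM"


examples = [
    Participant(csf_naat_positive=True),
    Participant(alternative_diagnosis=True),
    Participant(),
]
for participant in examples:
    print(classify(participant))
```

Checking the definite criteria first means that a participant with microbiological confirmation in the CSF is never reclassified by a co-existing alternative diagnosis, which matches the order in which the definitions are stated.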

Model development will initially be carried out using participants with either definite TBM or definitely not TBM. The model will then be applied to participants with possible TBM. First, a training dataset will be generated using a proportion of participants from each contributing dataset that are selected at random for inclusion in the combined training dataset. This method ensures that there is representation of each contributing dataset in the development of the TBM diagnostic algorithm. Second, clustering in the data will be explored using a variety of methods including Gaussian Mixture Models and cluster analysis (latent component analysis (LCA), Spectral Clustering, KMeans). This step serves as a tool to elucidate case-mix variation within TBM diagnostic categories (confirmed, probable, possible/suspected, and not-TBM), which will inform TBM diagnostic prediction and TBM prediction model development. Finally, the model will be developed using inputs that have been chosen a priori because they are known to predict TBM, are routinely available to clinicians within 48 hours of admission, and are not used as part of the definition of definite and definitely not TBM (Table 2). The model will be developed using machine learning techniques including logistic regression, classification and regression analysis, and random forest classifier analysis. The training set will be calibrated to optimize the model coefficients for best predictive accuracy using AUC-ROC score.
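The per-dataset random split and the AUC-ROC score mentioned above can be sketched as follows. This is an illustration under stated assumptions: the protocol does not specify the training proportion, so the 70% used here is hypothetical, as are the record layout (a dict with a "study" key) and the function names. The AUC is computed from its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen non-case), not with any particular library.

```python
import random
from collections import defaultdict


def split_by_study(records, train_fraction=0.7, seed=0):
    """Randomly assign a fraction of each study's participants to training.

    train_fraction is a hypothetical default, not a protocol value.
    """
    rng = random.Random(seed)
    by_study = defaultdict(list)
    for record in records:
        by_study[record["study"]].append(record)

    train, test = [], []
    for study in sorted(by_study):
        rows = by_study[study][:]
        rng.shuffle(rows)
        # Keep at least one participant per study in the training set.
        n_train = max(1, round(train_fraction * len(rows)))
        train.extend(rows[:n_train])
        test.extend(rows[n_train:])
    return train, test


def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


# Hypothetical pooled data: three studies of ten participants each.
records = [
    {"study": name, "id": i} for name in ("A", "B", "C") for i in range(10)
]
train, test = split_by_study(records)
print(len(train), "training records;", len(test), "test records")
```

Splitting within each study, instead of across the pooled data, is what guarantees that every contributing dataset is represented in the training set.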

3. Testing the model for internal validity

Using the testing/validation dataset, we will calculate overall sensitivity, specificity, positive predictive value, and negative predictive value to assess the accuracy of the algorithm in predicting TBM. The model will also be validated using ‘internal-external cross-validation’, which is a multiple validation approach that accounts for multiple studies by rotating which studies are used for model development and validation ⁷. Each contributing study will be excluded from the available set, and the remainder will be used to develop the diagnostic model; the excluded study will then be used to validate the model externally. This process will be repeated with each study being omitted in turn, allowing the consistency of the developed model and its performance to be examined on multiple occasions.
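The accuracy measures and the study-rotation scheme described above can be sketched as follows. The record layout (a dict with a "study" key) and the function names are assumptions made for illustration; model fitting itself is left out, and each fold simply yields the development and validation sets that a fitted model would use.

```python
def diagnostic_accuracy(labels, predictions):
    """Sensitivity, specificity, PPV and NPV from binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def internal_external_folds(records):
    """Yield (held_out_study, development, validation), omitting each study in turn."""
    studies = sorted({r["study"] for r in records})
    for held_out in studies:
        development = [r for r in records if r["study"] != held_out]
        validation = [r for r in records if r["study"] == held_out]
        yield held_out, development, validation


# Hypothetical pooled data: three studies of different sizes.
records = (
    [{"study": "A"}] * 4 + [{"study": "B"}] * 3 + [{"study": "C"}] * 5
)
for held_out, development, validation in internal_external_folds(records):
    print(held_out, len(development), len(validation))

metrics = diagnostic_accuracy(
    [1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 0, 1],
)
print(metrics)
```

Computing the accuracy measures separately within each held-out study, before any pooling, is what allows heterogeneity in performance across studies to be examined.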

4. Sensitivity analysis

We will perform sensitivity analyses to explore the contributions of risk of bias on the final model(s) by limiting inclusion in the meta-analysis to the following.

Studies that used consecutive or random selection of participants based on a clinical presentation consistent with TBM
Studies that investigated all patients for TBM regardless of other CSF findings
Studies using CSF mycobacterial culture as the reference standard

Registration

This review is registered with PROSPERO, number CRD42018110501.

Presenting and reporting of results

We will report the results according to the Preferred Reporting Items for a Systematic Review and Meta-analysis of Individual Participant Data Statement (PRISMA-IPD) ¹⁰. This will include a flow diagram to summarise the study selection process and detail the reasons for exclusion of studies screened as full text. We will publish our search strategy and quality-scoring tool as supplementary documents. Quantitative data will be presented in evidence tables of individual studies as well as in summary tables. We plan to report on quality scores and risk of bias for each eligible study. This may be tabulated and accompanied by narrative summaries. A descriptive analysis of the strength of evidence assessment will be reported. The final prediction model(s), that is, the variable-selected model(s) with the highest area under the receiver operating characteristic curve (AUC), will be implemented in a Smart phone application and a Web-based calculator and graphically depicted using nomograms.

Discussion

TBM is a serious public health concern with delayed diagnosis and treatment being important risk factors for poor outcome ¹. At least 10 attempts have been made to develop clinical prediction models to aid the rapid diagnosis of TBM but none have been broadly successful. The aim of this project is to combine data from multiple sources to develop and internally validate a novel clinical prediction model, which will be made easily available as a smart phone application and a Web-based calculator. By combining data from multiple geographical locations and using advanced machine learning techniques it is hoped that we can develop a model that is broadly generalizable around the world. Further work will involve external validation of the model(s) and testing in randomised controlled trials.

Ethics

No specific ethical approval has been sought for this systematic review. Authors who submit IPD will be asked to confirm that the dissemination of anonymised data was included in the original patient consent document.

Data availability

Underlying data

No data is associated with this article.

Reporting guidelines

Figshare: PRISMA-P checklist for The diagnosis of tuberculous meningitis in adults and adolescents: protocol for a systematic review and individual patient data meta-analysis to inform a multivariable prediction model, https://doi.org/10.6084/m9.figshare.7628639.v1 ¹²

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Funding Statement

This work was supported by Wellcome [210772; 104803; 203135; FC0010218].

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

[version 1; peer review: 1 approved, 1 approved with reservations]

References

1. Wilkinson RJ, Rohlwink U, Misra UK, et al.: Tuberculous meningitis. Nat Rev Neurol. 2017;13(10):581–98. doi:10.1038/nrneurol.2017.120
2. Thwaites GE, van Toorn R, Schoeman J: Tuberculous meningitis: more questions, still too few answers. Lancet Neurol. 2013;12(10):999–1010. doi:10.1016/S1474-4422(13)70168-6
3. Boyles TH, Thwaites GE: Appropriate use of the Xpert® MTB/RIF assay in suspected tuberculous meningitis. Int J Tuberc Lung Dis. 2015;19(3):276–7. doi:10.5588/ijtld.14.0805
4. Marais S, Thwaites G, Schoeman JF, et al.: Tuberculous meningitis: a uniform case definition for use in clinical research. Lancet Infect Dis. 2010;10(11):803–12. doi:10.1016/S1473-3099(10)70138-9
5. Riley RD, Ensor J, Snell KI, et al.: External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges. BMJ. 2016;353:i3140. doi:10.1136/bmj.i3140
6. Ahmed I, Debray TP, Moons KG, et al.: Developing and validating risk prediction models in an individual participant data meta-analysis. BMC Med Res Methodol. 2014;14:3. doi:10.1186/1471-2288-14-3
7. Debray TP, Moons KG, Ahmed I, et al.: A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis. Stat Med. 2013;32(18):3158–80. doi:10.1002/sim.5732
8. Jolani S, Debray TP, Koffijberg H, et al.: Imputation of systematically missing predictors in an individual participant data meta-analysis: a generalized approach using MICE. Stat Med. 2015;34(11):1841–63. doi:10.1002/sim.6451
9. Riley RD, Lambert PC, Abo-Zaid G: Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221. doi:10.1136/bmj.c221
10. Stewart LA, Clarke M, Rovers M, et al.: Preferred Reporting Items for Systematic Review and Meta-Analyses of individual participant data: the PRISMA-IPD Statement. JAMA. 2015;313(16):1657–65. doi:10.1001/jama.2015.3656
11. Whiting PF, Rutjes AW, Westwood ME, et al.: QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36. doi:10.7326/0003-4819-155-8-201110180-00009
12. Boyles T: PRISMA-P-checklist.doc. figshare. Figure. 2019. doi:10.6084/m9.figshare.7628639.v1

Wellcome Open Res. 2019 Mar 8. doi: 10.21956/wellcomeopenres.16426.r34913

Reviewer response for version 1

Kym IE Snell ¹

This is a clear and well-written protocol for a systematic review, collection of IPD and IPD meta-analysis for a new diagnosis model for TBM. I think the aim of the study and details relating to the systematic review part are clear, however I have a few comments/questions for clarity and reproducibility, mostly regarding what happens once IPD has been collected.

In the introduction the authors mention that case-mix can affect the predictive performance of a model and that big datasets can be used to examine the heterogeneity and improve the predictive performance. However, I don’t think they really address this issue in the methods or say how they will use the IPD to try improve the performance. Heterogeneity in performance if the predictor effects are consistent suggests differences in case-mix that are not being captured by the predictors in the model. Unless additional variables that are thought to improve the model are included, how will this be addressed? Will the authors consider recalibrating the baseline risk to different populations for example?
For the risk of bias assessment, I suggest using items from PROBAST too (excluding the analysis domain) which has recently been published and relates to prediction modelling studies (Wolff et al., 2019 ¹).
Have the authors considered how much data they would need to acquire to develop new prediction models for TBM e.g. any sample size calculations based on likely event rate and expected number of candidate predictors for consideration in the models, as a target to aim for?
In my experience, one of the biggest difficulties with IPD-MA like these is how different studies record different combinations of variables. Therefore, combining studies for model development can be very difficult and it may be necessary to prioritise certain variables (or combinations of variables) and use a subset of studies with those variables, hence my previous comment regarding sample size. Have the authors considered which variables are of particular interest and what they will do if these are not recorded in individual studies? How will IPD be selected for developing new models as it is unlikely to all be used?
How will missing data be handled? If imputing, will this be done within or across datasets?
It’s not clear if one model will be developed or multiple models (using each of the different modelling approaches). If aiming for a single model, how will it be selected?
Bottom of page 5: I’m not sure what is meant by “Model development will initially be carried out using participants with either definite TBM or definitely not TBM. The model will then be applied to participants with possible TBM.” Can the authors please clarify? Do they mean that possible TBM will be included in the definition of TBM?
I’m also not sure what is meant by “the training set will be calibrated to optimise the model coefficients for best predictive accuracy using AUC-ROC score” (Data synthesis, part 2)? By definition, the model will be calibrated to the development data and is therefore optimised to the data, which can lead to overfitting.
Will clustering by dataset be accounted for in the model development e.g. using a random intercept?
Will calibration of the model be assessed? This is also likely to be heterogeneous in different populations and therefore may need tailoring to different populations. In contrast, the AUC depends on the case-mix and will be lower in more homogeneous populations which doesn’t mean the model doesn’t work well.
I don’t see the point in splitting each dataset for development and validation, especially when some studies are likely to be small (min. sample size of 10 so even fewer events). The internal-external cross-validation is a better approach as it still retains the external validation element and will help evaluate the heterogeneity in performance across datasets.
Have the authors considered the potential for optimism in model development, particularly if they have few events and small sample size overall? Will they consider shrinking the coefficients (in a regression modelling approach) to correct for optimism?
Data synthesis, part 3: What threshold will be selected to calculate measures of diagnostic test accuracy – will this be based on a predicted probability and pre-specified to avoid bias in using ‘optimal’ thresholds? I would also suggest evaluating calibration and discrimination as part of the internal validation.
I would suggest reporting according to the TRIPOD guidelines for the multivariable modelling (Collins et al., 2015 ²).
I would caution against simply developing smart phone apps and web-based calculators unless the model demonstrates good predictive ability. Ideally it should be externally validated first before considering it as a tool for use in practice.

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Partly

Are the datasets clearly presented in a useable and accessible format?

Not applicable

Reviewer Expertise:

Biostatistics, prediction modelling and IPD-MA

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

References

1. Wolff et al.: PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann Intern Med. 2019;170(1):51–58. doi:10.7326/M18-1376
2. Collins et al.: Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594. doi:10.1136/bmj.g7594

Wellcome Open Res. 2019 Aug 21.

Tom Boyles ¹

We would like to thank Professor Snell for her very helpful comments.

Responses drafted by Anna Stadelman.

READ ME: Comments from Professor Snell are in bold-type font. My responses are below each comment.

1. In the introduction the authors mention that case-mix can affect the predictive performance of a model and that big datasets can be used to examine the heterogeneity and improve the predictive performance. However, I don’t think they really address this issue in the methods or say how they will use the IPD to try [to] improve the performance. Heterogeneity in performance if the predictor effects are consistent suggests differences in case-mix that are not being captured by the predictors in the model. Unless additional variables that are thought to improve the model are included, how will this be addressed? Will the authors consider recalibrating the baseline risk to different populations for example?

The objective of assessing heterogeneity will be to ascertain case-mix variation among TBM cases and non-cases to inform model development. To begin, the contributing datasets will be reviewed for sample size, available predictors, and data completeness to inform the selection of a modelling approach. A descriptive analysis will be undertaken to understand similarities and differences between the contributing datasets. Participant characteristics and clinical features will be summarized for each contributing dataset and compared across datasets using chi-square, t-tests, or non-parametric methods as warranted. Then, we will formally evaluate case-mix variation and predictor heterogeneity via IPD meta-analysis using a logistic regression model with stratified intercepts for each study (99). Each TBM predictor will be rotated into the model individually to underscore the baseline predictive value of each in the different contributing datasets. We will also use this method to assess predictor heterogeneity by HIV status, WHO region, and TB burden within each country. For both the informal (descriptive statistics) and formal (IPD meta-analysis) assessment of heterogeneity, predictor estimates and their uncertainty intervals will be used to determine relative significance as opposed to p-values. Uncertainty intervals for each predictor will indicate how reliable the predictor is in terms of its prediction value. Furthermore, looking at p-values will only assess statistical significance, which may not be clinically meaningful.
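A logistic regression with stratified intercepts, as described in the response above, gives each study its own baseline log-odds of TBM while sharing one coefficient for the predictor under evaluation. The sketch below is a minimal illustration on simulated data, not the authors' analysis code: it assumes a single standardised continuous predictor, fits the model by plain gradient ascent in place of a statistics package, and all names, sample sizes and true parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stratified_intercepts(rows, n_iter=1000, learning_rate=0.5):
    """Fit logit(p) = alpha[study] + beta * x by gradient ascent.

    rows: iterable of (study, x, y) with y in {0, 1}.
    Returns (dict of study-specific intercepts, shared coefficient beta).
    """
    rows = list(rows)
    studies = sorted({study for study, _, _ in rows})
    alpha = {study: 0.0 for study in studies}
    beta = 0.0
    n = len(rows)
    for _ in range(n_iter):
        grad_alpha = {study: 0.0 for study in studies}
        grad_beta = 0.0
        for study, x, y in rows:
            error = y - sigmoid(alpha[study] + beta * x)
            grad_alpha[study] += error
            grad_beta += error * x
        # Step along the average log-likelihood gradient.
        for study in studies:
            alpha[study] += learning_rate * grad_alpha[study] / n
        beta += learning_rate * grad_beta / n
    return alpha, beta


# Simulated data: two studies with different baseline risk (case mix)
# but the same true predictor effect. All values are hypothetical.
rng = random.Random(1)
TRUE_INTERCEPTS = {"study_A": -2.0, "study_B": 0.0}
TRUE_BETA = 1.0
rows = []
for study, intercept in TRUE_INTERCEPTS.items():
    for _ in range(300):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(intercept + TRUE_BETA * x) else 0
        rows.append((study, x, y))

intercepts, beta = fit_stratified_intercepts(rows)
print("study-specific intercepts:", intercepts)
print("shared predictor coefficient:", beta)
```

Because the intercepts absorb differences in baseline prevalence between studies, the shared coefficient reflects the within-study association of the predictor with TBM; rotating each candidate predictor into such a model one at a time corresponds to the procedure outlined in the response.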

Subsequently, we will employ methods of model development that take into account the heterogeneity observed in the IPD meta-analysis. These methods include, but are not limited to, classification and regression trees, supervised and unsupervised machine learning, Latent component analysis, etc. There may be other sources of heterogeneity that become evident during model development which will also be included in the development of the clinical prediction rule.

2. For the risk of bias assessment, I suggest using items from PROBAST too (excluding the analysis domain) which has recently been published and relates to prediction modelling studies (Wolff et al., 2019 ¹ ).

Thanks for the recommendation.

3. Have the authors considered how much data they would need to acquire to develop new prediction models for TBM e.g. any sample size calculations based on likely event rate and expected number of candidate predictors for consideration in the models, as a target to aim for?

Sample size is difficult to calculate in the context of developing a prediction model. However, the size of the development dataset and the number of predictors in the final model affect the statistical power to distinguish TBM cases from non-cases. The more individual participants we include, the more information we have to inform model development, specifically the parameterization of the predictors and the variability explained. More data (i.e. more individual participants) allow the individual predictors to be estimated more precisely and give a better chance of capturing the variability in TBM case presentations. Ultimately, we will make every effort to acquire as many datasets as possible and limit the number of predictors to those that explain the most variability in TBM diagnosis.

4. In my experience, one of the biggest difficulties with IPD-MA like these is how different studies record different combinations of variables. Therefore, combining studies for model development can be very difficult and it may be necessary to prioritise certain variables (or combinations of variables) and use a subset of studies with those variables, hence my previous comment regarding sample size. Have the authors considered which variables are of particular interest and what they will do if these are not recorded in individual studies? How will IPD be selected for developing new models as it is unlikely to all be used?

We have included in the table which variables are of interest in the development of the model(s) (marked with an *). We consider these variables to be the most important for model development and inclusion of data into model development will be contingent on the representation of these variables in the individual contributing datasets. We will conduct a sensitivity analysis with all the individual contributing datasets, regardless of variable inclusion, so as to assess any bias introduced into the model by excluding certain datasets.

5. How will missing data be handled? If imputing, will this be done within or across datasets?

We will not impute any missing data. We will request all the diagnostic data available from investigators and any missingness on an individual level may ultimately end up excluding that particular individual from model development.

6. It’s not clear if one model will be developed or multiple models (using each of the different modelling approaches). If aiming for a single model, how will it be selected?

We will create multiple models with the development dataset and compare the fit across models via bootstrap, k-fold cross validation, and internal-external cross-validation.

7. Bottom of page 5: I’m not sure what is meant by “Model development will initially be carried out using participants with either definite TBM or definitely not TBM. The model will then be applied to participants with possible TBM.” Can the authors please clarify? Do they mean that possible TBM will be included in the definition of TBM?

The model(s) will be developed with confirmed TBM and non-TBM cases, and we will test the model(s) on probable and possible TBM cases as part of the sensitivity analysis.

8. I’m also not sure what is meant by “the training set will be calibrated to optimise the model coefficients for best predictive accuracy using AUC-ROC score” (Data synthesis, part 2)? By definition, the model will be calibrated to the development data and is therefore optimised to the data, which can lead to overfitting.

Sorry for the confusion. Suggested revision in Version 2.0 of the protocol.

9. Will clustering by dataset be accounted for in the model development e.g. using a random intercept?

Yes, this will be the aim of the IPD meta-analysis. Each predictor of interest will be rotated into a model predicting TBM that has a random intercept for each contributing dataset. The aim of this will be to ascertain heterogeneity in predictor strength, which will be accounted for in the final model(s). However, the overall aim is to develop a model(s) that accounts for heterogeneity by region, HIV status, and other known causes of heterogeneity in TBM cases. Therefore, we are hoping that including these predictors in the final model(s) will account for most of the variation in predictor strength that may be introduced by individual contributing datasets.

10. Will calibration of the model be assessed? This is also likely to be heterogeneous in different populations and therefore may need tailoring to different populations. In contrast, the AUC depends on the case-mix and will be lower in more homogeneous populations which doesn’t mean the model doesn’t work well.

Yes, model(s) calibration will be assessed and you bring up important points about the metrics for calibration and discrimination.
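One simple way to look at calibration separately in each population is the observed/expected (O/E) ratio, together with a grouped comparison of mean predicted risk and observed TBM frequency. The sketch below is illustrative only: the predicted probabilities, outcomes, and study labels are invented, and the authors do not state which calibration measures they will use.

```python
def observed_expected_ratio(probs, labels):
    """Observed number of TBM cases divided by the number the model expects."""
    return sum(labels) / sum(probs)


def calibration_table(probs, labels, n_groups=2):
    """Split patients into risk groups and compare mean predicted vs observed risk.

    Returns a list of (mean_predicted, observed_rate, group_size) tuples,
    ordered from the lowest-risk to the highest-risk group.
    """
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    table = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, obs_rate, len(chunk)))
    return table


# Invented predictions (probability of TBM) and outcomes for two hypothetical studies.
predictions = {
    "study_A": ([0.1, 0.2, 0.6, 0.8], [0, 0, 1, 1]),
    "study_B": ([0.2, 0.4, 0.5, 0.9], [0, 0, 0, 1]),
}

oe = {study: observed_expected_ratio(p, y) for study, (p, y) in predictions.items()}
table_a = calibration_table(*predictions["study_A"])

for study, ratio in oe.items():
    # A ratio above 1 means more cases were observed than predicted; below 1, fewer.
    print(study, "O/E =", round(ratio, 2))
```

An O/E ratio that differs markedly between studies would suggest that the model intercept needs tailoring to the population, which is the reviewer's point about heterogeneous calibration.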

11. I don’t see the point in splitting each dataset for development and validation, especially when some studies are likely to be small (min. sample size of 10 so even fewer events). The internal-external cross-validation is a better approach as it still retains the external validation element and will help evaluate the heterogeneity in performance across datasets.

Agreed. We will revise our internal validation approach to include bootstrap, k-fold cross-validation, and internal-external cross-validation. Bootstrap validation tells us more about the validity of predictor variable selection in algorithm development, which is useful for assessing how well our predictors perform for TBM diagnosis within different samples. Simulations have demonstrated that bootstrap is the best approach to internal validation as it appropriately reflects all sources of model uncertainty, especially in predictor variable selection (113). We will then use k-fold cross-validation to assess the modelling approach and the accuracy of model fit. The resulting c-statistic will convey overall model optimism and accuracy of model fit. The predictive model(s) will be further validated using ‘internal-external cross-validation’, a multiple-validation approach that accounts for multiple studies by rotating which studies are used for model development and which for validation (99).
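The rotation at the heart of internal-external cross-validation can be sketched as follows: each study is held out in turn, a model is fitted on the remaining studies, and discrimination (the c-statistic) is evaluated on the held-out study. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are simulated, the study labels are hypothetical, and a plain two-predictor logistic regression fitted by gradient ascent stands in for the final model.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Predicted probability for one patient; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, n_iter=300, lr=0.5):
    """Plain logistic regression (intercept plus slopes) fitted by gradient ascent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - predict(w, xi)
            grad[0] += resid
            for j, xj in enumerate(xi):
                grad[j + 1] += resid * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def c_statistic(scores, labels):
    """Probability that a random case scores higher than a random non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        return None  # undefined without both outcomes present
    concordant = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                concordant += 1.0
            elif c == k:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def internal_external_cv(data):
    """Hold out each study in turn, fit on the rest, validate on the held-out study."""
    results = {}
    for held_out in data:
        X_train = [x for s, (X, _) in data.items() if s != held_out for x in X]
        y_train = [yi for s, (_, y) in data.items() if s != held_out for yi in y]
        w = fit_logistic(X_train, y_train)
        X_test, y_test = data[held_out]
        results[held_out] = c_statistic([predict(w, x) for x in X_test], y_test)
    return results


# Simulated IPD from four hypothetical studies with different baseline risk.
rng = random.Random(1)
data = {}
for study, intercept in {"study_A": -1.5, "study_B": -1.0,
                         "study_C": -0.5, "study_D": 0.0}.items():
    X, y = [], []
    for _ in range(150):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        p = sigmoid(intercept + 1.2 * x[0] + 0.8 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    data[study] = (X, y)

cv_results = internal_external_cv(data)
for study, c in cv_results.items():
    print("held out", study, "c-statistic:", round(c, 2))
```

The spread of the held-out c-statistics across studies gives a direct view of heterogeneity in performance, which a single random split of the pooled data would hide.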

12. Have the authors considered the potential for optimism in model development, particularly if they have few events and small sample size overall? Will they consider shrinking the coefficients (in a regression modelling approach) to correct for optimism?

High optimism, indicative of overfitting, can be corrected via shrinkage and we will consider this approach if overfitting is evident. However, we do not anticipate that we will encounter overfitting due to the data reduction step described in Version 2.0 of the protocol.
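For reference, one widely used form of shrinkage for regression-based prediction models is the heuristic uniform shrinkage factor, shown below. It is offered as an illustration of the idea and is not necessarily the method the authors would apply; the symbols are introduced here and do not appear in the protocol.

```latex
\hat{s} \;=\; \frac{\chi^2_{\text{model}} - p}{\chi^2_{\text{model}}},
\qquad
\hat{\beta}_j^{\,\text{shrunk}} \;=\; \hat{s}\,\hat{\beta}_j
\quad (j = 1, \dots, p)
```

Here \chi^2_{\text{model}} is the likelihood-ratio chi-square of the fitted model against the intercept-only model and p is the number of estimated predictor parameters; the intercept is then re-estimated so that the mean predicted risk matches the observed prevalence. For example, a model with \chi^2_{\text{model}} = 80 and p = 8 would give \hat{s} = (80 - 8)/80 = 0.9, i.e. each coefficient is reduced by 10%.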

13. Data synthesis, part 3: What threshold will be selected to calculate measures of diagnostic test accuracy – will this be based on a predicted probability and pre-specified to avoid bias in using ‘optimal’ thresholds? I would also suggest evaluating calibration and discrimination as part of the internal validation.

We will assess optimism, calibration, and discrimination as part of the internal validation approach and have discussed this process further in Version 2.0 of the protocol. As for determining a pre-specified predictive threshold for defining TBM versus not, there is little information in the literature to inform an appropriate cutoff for TBM. Prior prediction models have used ROC curve to determine an optimal cutoff. Furthermore, this is the first study to include data from different populations world-wide. As such it is difficult to pre-specify the optimal predictive cutoff.
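To make the threshold question concrete, the sketch below computes sensitivity and specificity at a pre-specified predicted-probability cutoff and, for comparison, finds the ROC-derived cutoff that maximises Youden's J (sensitivity + specificity - 1). Youden's J is used here only as one common example of an ROC-based rule; the source does not say which rule prior models used. The probabilities, outcomes, and the 0.5 cutoff are invented for illustration.

```python
def sens_spec(probs, labels, threshold):
    """Sensitivity and specificity when TBM is predicted for prob >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(probs, labels):
    """Return (threshold, J, sensitivity, specificity) maximising Youden's J.

    Every distinct predicted probability is tried as a candidate cutoff;
    the lowest cutoff wins if several share the maximum J.
    """
    best = None
    for t in sorted(set(probs)):
        sens, spec = sens_spec(probs, labels, t)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


# Invented predicted probabilities of TBM and confirmed outcomes (1 = TBM).
probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

fixed_sens, fixed_spec = sens_spec(probs, labels, 0.5)  # pre-specified cutoff
best_t, best_j, best_sens, best_spec = youden_threshold(probs, labels)

print("pre-specified 0.5: sensitivity", fixed_sens, "specificity", fixed_spec)
print("data-driven cutoff:", best_t, "sensitivity", best_sens, "specificity", best_spec)
```

Because the data-driven cutoff is chosen on the same data used to report accuracy, its sensitivity and specificity tend to be optimistic, which is the bias the reviewer warns about; evaluating the chosen cutoff in held-out studies is one way to check this.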

14. I would suggest reporting according to the TRIPOD guidelines for the multivariable modelling (Collins et al., 2015 ² ).

Great! Thanks for the recommendation.

15. I would caution against simply developing smart phone apps and web-based calculators unless the model demonstrates good predictive ability. Ideally it should be externally validated first before considering it as a tool for use in practice.

Absolutely agree. Developing an application and/or website calculator is our end goal, but the step to getting there includes further external validation.

References

1. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, Reitsma JB, Kleijnen J, Mallett S, PROBAST Group: PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann Intern Med. 2019; 170 (1): 51-58

2. Collins GS, Reitsma JB, Altman DG, Moons KG: Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015; 350: g7594

Wellcome Open Res. 2019 Feb 5. doi: 10.21956/wellcomeopenres.16426.r34749

Reviewer response for version 1

Ravindra Kumar Garg ¹

I read with interest the protocol that aims to identify and validate a set of clinical predictors that accurately identify patients with definite tuberculous meningitis and absence of tuberculous meningitis. Conventionally, microscopy for acid-fast bacilli, commercial nucleic acid amplification test for Mycobacterium tuberculosis or mycobacterial culture of cerebrospinal fluid, are the tests that are used to bacteriologically confirm the diagnosis of tuberculous meningitis.

In developing countries and countries with a very high tuberculosis burden, tuberculous meningitis is encountered very frequently. Tuberculous meningitis is the commonest CNS infection seen in Neurology and Medicine inpatient wards. Facing resource constraints, we always have to rely on clinical, imaging and cerebrospinal fluid parameters. Despite these constraints, we are able to make a reliable diagnosis of tuberculous meningitis most of the time. Our classical teaching points for diagnosing tuberculous meningitis are often accurate. A clinical diagnosis of meningitis, along with characteristic cerebrospinal fluid findings, helps in making a reasonable and prompt diagnosis, enabling antituberculosis treatment to be started with confidence. A raised cerebrospinal fluid lymphocyte count and markedly raised protein are characteristically seen in tuberculous meningitis.

Certain clinical signs are very specific to tuberculous meningitis. For example, sixth nerve involvement and vision loss point towards basal meningeal involvement and tuberculous meningitis. Other cranial nerve involvements are very infrequent. In patients with multiple cranial nerve palsies, fungal infection and malignancy are more likely possibilities. In our observation, headache and fever are often not dominant features, and they are never presenting features. Similarly, neck rigidity may not be present in many patients. In cryptococcal meningitis, severe and dominant headache may be a presenting feature. The presence of extrapyramidal movements is a rare manifestation of tuberculous meningitis in adults. Extrapyramidal movements are more frequent in children.

Computed tomographic findings, if present, are quite characteristic of tuberculous meningitis. Basal exudates along with hydrocephalus, with or without tuberculoma and periventricular infarcts, indicate tuberculous meningitis, and the differential diagnosis is then limited. Spinal cord involvement, if found on searching, adds, we believe, to the diagnostic accuracy. A combination of optochiasmatic arachnoiditis and spinal lumbo-sacral arachnoiditis, in my opinion, is probably as accurate as bacteriological confirmation. Demonstration of a paradoxical reaction, if present, also helps us in substantiating a reliable diagnosis of tuberculous meningitis.

Tuberculous meningitis is frequently a manifestation of more disseminated tuberculosis. A search for other sites of involvement often helps us establish the clinical diagnosis. For example, an ordinary chest X-ray can show additional pulmonary involvement. Many cases surprisingly show asymptomatic miliary tuberculosis. Lymphadenopathy and spinal vertebral tuberculosis are also seen in many cases.

Diagnostic caution is exercised in elderly patients and HIV-infected patients. In these two groups, there is a high chance of an alternative diagnosis. We routinely perform India ink preparation and testing for malignant cells. Still, the distinctive features of tuberculous meningitis help in its diagnosis in these two populations as well. Aspergillosis has a more aggressive course, and large vessel involvement is more common. In tuberculous meningitis, infarcts are usually small and periventricular.

Another issue that needs to be addressed is the diagnosis of drug-resistant tuberculous meningitis. The Xpert MTB/RIF test, which is now readily available, has started detecting drug-resistant tuberculous meningitis in increasing numbers. This is not surprising because India harbors the major portion of the global drug-resistant tuberculosis problem. This issue also needs to be given due emphasis.

I greatly appreciate the investigators' efforts to develop a predictive logistic model to accurately diagnose definite tuberculous meningitis. The points I have highlighted should be revisited and can be incorporated in this protocol.

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Yes

Are the datasets clearly presented in a useable and accessible format?

Not applicable

Reviewer Expertise:

CNS tuberculosis and other CNS infections

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


Data Availability Statement

Underlying data

No data is associated with this article.

Reporting guidelines

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

1. Wilkinson RJ, Rohlwink U, Misra UK, et al.: Tuberculous meningitis. Nat Rev Neurol. 2017;13(10):581–98. doi: 10.1038/nrneurol.2017.120

2. Thwaites GE, van Toorn R, Schoeman J: Tuberculous meningitis: more questions, still too few answers. Lancet Neurol. 2013;12(10):999–1010. doi: 10.1016/S1474-4422(13)70168-6

3. Boyles TH, Thwaites GE: Appropriate use of the Xpert® MTB/RIF assay in suspected tuberculous meningitis. Int J Tuberc Lung Dis. 2015;19(3):276–7. doi: 10.5588/ijtld.14.0805

4. Marais S, Thwaites G, Schoeman JF, et al.: Tuberculous meningitis: a uniform case definition for use in clinical research. Lancet Infect Dis. 2010;10(11):803–12. doi: 10.1016/S1473-3099(10)70138-9

5. Riley RD, Ensor J, Snell KI, et al.: External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges. BMJ. 2016;353:i3140. doi: 10.1136/bmj.i3140

6. Ahmed I, Debray TP, Moons KG, et al.: Developing and validating risk prediction models in an individual participant data meta-analysis. BMC Med Res Methodol. 2014;14:3. doi: 10.1186/1471-2288-14-3

7. Debray TP, Moons KG, Ahmed I, et al.: A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis. Stat Med. 2013;32(18):3158–80. doi: 10.1002/sim.5732

8. Jolani S, Debray TP, Koffijberg H, et al.: Imputation of systematically missing predictors in an individual participant data meta-analysis: a generalized approach using MICE. Stat Med. 2015;34(11):1841–63. doi: 10.1002/sim.6451

9. Riley RD, Lambert PC, Abo-Zaid G: Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221. doi: 10.1136/bmj.c221

10. Stewart LA, Clarke M, Rovers M, et al.: Preferred Reporting Items for Systematic Review and Meta-Analyses of individual participant data: the PRISMA-IPD Statement. JAMA. 2015;313(16):1657–65. doi: 10.1001/jama.2015.3656

11. Whiting PF, Rutjes AW, Westwood ME, et al.: QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36. doi: 10.7326/0003-4819-155-8-201110180-00009

12. Boyles T: PRISMA-P-checklist.doc. figshare. Figure. 2019. doi: 10.6084/m9.figshare.7628639.v1
