Table 1.
Benefits and limitations of using particular datasets in GI and hepatobiliary research.
Datasets | Common Data Types | Benefits in GI and Hepatobiliary Research | Possible Limitations | Refs |
---|---|---|---|---|
Genetics/Genomics | Whole genome, whole exome sequencing data | Many GI and hepatobiliary disorders have an under-characterized hereditary component | Most diseases are multifaceted; a large amount of data is needed to reveal true signals | [1] |
Epigenetics/Epigenomics | Most commonly methyl-seq, ChIP-seq, and ATAC-seq | These data may offer enhanced diagnostic utility, particularly in the context of GI and hepatobiliary disorders with complex genetic etiologies | Multiple factors such as the gut microbiome and diet greatly affect epigenetic regulation. Detecting changes that can be universally tracked may be difficult without access to large amounts of data and clearly characterized subgroups. Large number of cell types may render data interpretation difficult. | [2] |
Transcriptomics | mRNA-seq, total RNA-seq, targeted RNA-seq, scRNA-seq, and snRNA-seq | Metatranscriptomics offers insights into the transcriptional activity of intestinal microbes, whose presence does not always correlate with bacterial activity. Digestive organs comprise complex cell types with specific biomarkers that make them excellent candidates for scRNA-seq analyses. | Transcript expression does not always correlate with bacterial or human protein output, so downstream validation must be performed to confirm findings. The complex distribution of cell types makes it difficult to use or interpret bulk RNA-seq and other less costly methods for digestive organ research. | [3,4,5] |
Proteomics | NMR, integrated chromatography, and mass spectrometry data | Captures larger compounds than metabolomic analysis; this may be important in biomarker identification and validating transcriptional activity | Remains cost-prohibitive | [6] |
Microbiome | 16S rRNA amplicon sequencing, shotgun metagenomic sequencing | The intestinal microbiome contains the highest concentration of commensal bacteria residing on human tissue. Large amounts of stool are relatively easy to collect with little human DNA present. | Findings derived from stool may differ from those derived from mucosal biopsies (i.e., taken at the source). Association does not imply causation: bacterial profiles may change due to disease; findings must be carefully validated to attribute diseases to dysbiosis. | [7,8,9,10] |
Metabolomics | Liquid chromatography, gas chromatography, capillary electrophoresis, and ion mobility spectrometry coupled to mass spectrometry; NMR data | Useful for gut volatile compounds, the metabolites thought to be most associated with disease. Metabolomics combined with gut microbiome data can reveal mechanistic targets, which is particularly useful in the study of GI and hepatobiliary disorders. | The preservation of volatile compounds requires use of a buffer or immediate sample processing and may still not adequately capture their presence or abundance. Host variation in endogenous compounds, particularly those interacting with bacterial pathways, may complicate development as a diagnostic tool. | [11,12] |
Medical informatics | EHR, questionnaire, patient interview data | Procedures such as colonoscopy and FibroScan for GI and hepatobiliary disorders, respectively, have quantifiable datapoints that can be combined with clinical and demographic data for retrospective and prospective research | Inconsistent techniques and variation in data input require rigorous coordination, quality assessments, and large cohorts to accurately capture differences | [13] |
Imaging data | X-ray, CT, MRI, endoscopy, capsule imaging data | AI-based interpretation of upper, lower, and video capsule endoscopy can capture findings that may otherwise be overlooked | Subjective interpretation by the operator is still required (a relatively minor concern, since interpretation can be standardized) | [14,15] |
Refs, references; ChIP, chromatin immunoprecipitation; NMR, nuclear magnetic resonance; EHR, electronic health record; CT, computed tomography; MRI, magnetic resonance imaging; AI, artificial intelligence.