2019 Jun 28;25(24):2990–3008. doi: 10.3748/wjg.v25.i24.2990

Table 5.

Examples of inflammatory bowel disease studies that used large healthcare datasets

Inflammatory bowel disease
| Country/Region | Database | Area of research | Sample size | Design, statistical methods and 3V | Application | Reference |
|---|---|---|---|---|---|---|
| South Korea | Korean Health Insurance Review and Assessment Service (HIRA) | UC | 11233 | Nationwide retrospective cohort study; comparator: general population; Volume, Velocity and Variety | Incidence and clinical impact of perianal disease in UC | Song et al[97], 2018 |
| Taiwan, China | Taiwan National Health Insurance Database (NHID) | IBD | 38039 | Nationwide retrospective cohort study comparing IBD patients with the general population to derive SIR; hospital-based nested case-control study; Volume, Velocity and Variety | Association between IBD and herpes zoster infection | Chang et al[98], 2018 |
| Sweden | Swedish Patient Registry | UC | 63711 | Nationwide retrospective cohort study; Volume, Velocity and Variety | Association between appendectomy and UC | Myrelid et al[99], 2017 |
| Sweden | Swedish Medical Birth Register (child-mother link); Swedish Multigeneration Register (child-father link); Swedish Prescribed Drug Register; National Patient Register | IBD | 827239 children born between 2006 and 2013 | Nationwide prospective population-based register study; Volume, Velocity and Variety | Association between maternal exposure to antibiotics during pregnancy and very early onset IBD in childhood | Ortqvist et al[72], 2019 |
| United States | NCBI Gene Expression Omnibus (GEO) | IBD | n.a. | Signature inversion study; Volume, Velocity and Variety | Topiramate as a potential therapeutic agent against IBD | Dudley et al[70], 2011 |
| United States | n.a. | IBD | 1585 | Retrospective cohort study; natural language processing; Volume, Velocity and Variety | Association between arthralgia and biologics (anti-TNF vs vedolizumab) | Cai et al[20], 2018 |
| n.a. | International IBD Genetics Consortium's Immunochip project | IBD | 53279 | Machine learning algorithm; Volume, Velocity and Variety | Predictors of IBD | Wei et al[64], 2013 |
| United States | n.a. | IBD | 575 colonoscopy reports | Retrospective cohort study; natural language processing; Volume, Velocity and Variety | Differentiation of surveillance from non-surveillance colonoscopy | Hou et al[100], 2013 |
| United States | n.a. | IBD | 1080 | Retrospective cohort study; Random Forest machine learning algorithm | Prediction of IBD remission in thiopurine users | Waljee et al[66], 2017 |
| United States | n.a. | IBD | 20368 | Retrospective cohort study; Random Forest machine learning algorithm | Prediction of hospitalization and outpatient steroid use | Waljee et al[65], 2017 |
| n.a. | Phase 3 clinical trial data | IBD | 491 | Retrospective cohort study; Random Forest machine learning algorithm; Volume, Velocity and Variety | Prediction of steroid-free endoscopic remission with vedolizumab in UC | Waljee et al[67], 2018 |

This list is not exhaustive, but serves to provide a few distinct examples of how Big Data analysis can generate high-quality research outputs in the field of gastroenterology and hepatology. 3V: Volume/velocity/variety; UC: Ulcerative colitis; IBD: Inflammatory bowel disease; SIR: Standardized incidence ratio; anti-TNF: anti-tumour necrosis factor.
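As a brief illustration of the SIR mentioned in the table, the sketch below computes a standardized incidence ratio by indirect standardization: the cases observed in a cohort divided by the cases expected if the cohort had experienced the reference population's stratum-specific rates. The function name, the age strata and all numbers are hypothetical and chosen only for the arithmetic; none are taken from the cited studies.

```python
from typing import Sequence


def standardized_incidence_ratio(
    observed_cases: int,
    person_years: Sequence[float],
    reference_rates: Sequence[float],
) -> float:
    """Return SIR = observed cases / expected cases (indirect standardization).

    person_years[i]    : follow-up time of the cohort in stratum i
    reference_rates[i] : incidence rate (cases per person-year) of the
                         reference population in the same stratum
    """
    if len(person_years) != len(reference_rates):
        raise ValueError("person_years and reference_rates must align by stratum")
    # Expected cases: apply the reference rates to the cohort's follow-up time.
    expected_cases = sum(py * rate for py, rate in zip(person_years, reference_rates))
    if expected_cases <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed_cases / expected_cases


# Hypothetical cohort with three age strata (illustrative values only).
person_years = [10_000.0, 20_000.0, 5_000.0]
reference_rates = [0.002, 0.005, 0.010]  # cases per person-year
observed = 255

sir = standardized_incidence_ratio(observed, person_years, reference_rates)
print(f"SIR = {sir:.2f}")  # expected cases = 20 + 100 + 50 = 170, so SIR = 1.50
```

An SIR above 1 indicates more cases in the cohort than the reference rates would predict. A real analysis would also report a confidence interval, which this sketch omits.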