2023 Jul 14;10(2):lsad022. doi: 10.1093/jlb/lsad022

Table 5.

Data quality top policy options

Issue statement: Shared data are of variable quality and there is no consensus about how to monitor and assess its quality.

| Top three policy options (label)* | Points to consider | Illustrative quotes (study ID) |
| --- | --- | --- |
| Funders could fund data-sharing infrastructures (eg setting standards, data cleaning, curation) (Q2) | Value of data cleaning and curation | ‘fund curation, clap clap clap clap clap’ (015) |
| | Enforcement of standards is important, but also difficult | ‘Biocuration is one of the biggest needs out there! Some of it goes back to problems with clinical workflows: that data is often not collected in a form that’s amenable to downstream analysis. The first thing that could change the picture [is] enforced standards for data quality, coupled with support for the biocuration activities that are often needed for ensuring data quality’ (014) |
| | Prioritize data quality over data sharing | ‘[W]e’re asking a lot of funders, and yet we don’t have a stable source of funding. Do we prioritize other aspects of making data available and let recipients deal with data quality issues? Not sure I would trade quality improvement at the data access point for some of the other aspects of data sharing for which funding is needed...’ (022) |
| | Not as simple as you think | ‘[D]ata quality and standards are always relative to intended use, and standardization inevitably constrains what can be done with a resource. Moreover, in favoring some uses over others, standardization also favors some actors over others. Before thinking about standard-setting, you therefore need to think about what a resource is for and [who] should benefit from it’ (009) |
| Data resources could include rigorous quality checks in data selection and curation processes (eg gnomAD) and/or attach quality ratings by standard metrics to data (eg ClinVar) (Q8) | Gatekeeping incentivizes generation of higher-quality data | ‘In general, I believe in the effectiveness of policies that enforce the quality of data being ingested over those that mark the quality of data already ingested, because the former pushes the submitters to generate better-quality data’ (014) |
| | Requires resources | ‘[W]here are the resources for that – and to what extent are standards recognized that would enable this kind of check?’ (022) |
| Funders could incentivize data contributors to comply with standards (ie preferred access or funding access to data resources) (Q1) | Incentives for data contribution are the key to success | ‘Incentives are needed to get anyone to do this work. Standards also help the work, once incentivized, to get done well’ (012) |

*See supplementary tables for all policy options labeled by domain (number), eg Q1 for the first listed policy option in the data quality domain.