Modern life sciences, with their highly sensitive omics data, face several challenges regarding data storage and sharing [1, 2]. On the one hand, data must be protected to preserve the privacy of the individuals who contributed them to research. On the other hand, the true value of omics data can only be realized if the data are shared with as many researchers as possible. In an ideal world, patients could flexibly control access to their personal data on a case-by-case basis [3]. However, granting and revoking access to data is a slow and tedious process within the current life sciences research paradigm, where most data are either stored in central controlled-access data repositories or kept locally within the respective research groups [3].
Distributed Ledger Technology (DLT; e.g., Blockchain) recently emerged as a means to enable immutable transactions between parties that do not trust each other. The transaction records are kept in a consistent state through automated, algorithm-based consensus mechanisms, which eliminates the need for third-party trust enforcement and paves the way for patients’ direct control over the flow of their personal data [3, 4]. Furthermore, as a distributed database, DLT provides enhanced data availability and integrity compared with centralized data repositories [4, 5]. However, DLT is a novel technology, and although it has attracted tremendous attention from practitioners and researchers, the genomics community has only recently started to realize DLT’s potential for the field [3, 6]. Overall, we still lack a thorough understanding of the benefits and challenges of applying DLT in genomics. We therefore want to draw fellow researchers’ attention to some of the most important opportunities and most pressing challenges. We think that applying DLT in genomics can bring forth the following opportunities in particular:
- Providing patients with flexible and direct control over the flow of their personal (genome) data, thus stimulating greater willingness to contribute data to research.
- Increasing the security of genome data storage through decentralization and the elimination of single points of failure.
- Strengthening existing data-sharing efforts: although initiatives like ELIXIR already seek to facilitate data sharing in the life sciences, DLT’s inherent characteristics can support further democratization of (genome) data access and the breaking up of extant data silos.
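To make the idea of patient-controlled access on an immutable record more concrete, the following is a minimal sketch of an append-only, hash-chained log of consent events. It is an illustration under simplifying assumptions, not a description of any existing platform: it runs in a single process and models no consensus mechanism, and the names `ConsentLedger`, `append`, `has_access`, and `verify` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass
class ConsentLedger:
    """Hypothetical append-only log of consent events (illustrative only)."""

    entries: list = field(default_factory=list)

    def append(self, patient: str, researcher: str, action: str) -> dict:
        """Record a 'grant' or 'revoke' event, chained to the previous entry."""
        if action not in ("grant", "revoke"):
            raise ValueError("action must be 'grant' or 'revoke'")
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {
            "index": len(self.entries),
            "patient": patient,
            "researcher": researcher,
            "action": action,
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def has_access(self, patient: str, researcher: str) -> bool:
        """The most recent event for this pair decides; no event means no access."""
        allowed = False
        for entry in self.entries:
            if entry["patient"] == patient and entry["researcher"] == researcher:
                allowed = entry["action"] == "grant"
        return allowed

    def verify(self) -> bool:
        """Recompute every hash and check each link to the previous entry."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

In this sketch, revoking access does not delete the earlier grant; it appends a new event, so the full history of consent decisions stays auditable, and any later modification of an old entry causes `verify()` to fail.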
At the same time, we see the following main challenges for DLT’s meaningful application in genomics:
- DLT was not designed to handle omics-sized data sets.
- Diverse and uncertain regulatory environments around DLT. For example, putting genomic data on a distributed ledger is a difficult-to-reverse decision; removing the data later might require very high effort, which may conflict with the right to be forgotten as stipulated by the European Union’s General Data Protection Regulation (GDPR).
- Applications of DLT in genomics currently revolve mainly around young businesses such as Nebula Genomics, EncrypGen, or Genecoin, to name but a few, which aim to build marketplaces where consumers can trade their private genome data for tokens. While such monetization of genome data sharing will likely attract an increasing number of people to share their genome data, many will do so predominantly for monetary reasons without being fully aware of the implications, thus creating new ethical challenges.
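The first two challenges above are often discussed together with one commonly proposed pattern: keep the large data files off the ledger and record only a fixed-size cryptographic digest on it. The sketch below illustrates that pattern and is not a method proposed in this letter. The class `OffChainStore` and the functions `register` and `fetch_verified` are hypothetical, the ledger is modeled as a plain Python list, and whether a digest that remains on the ledger after the data are erased satisfies the GDPR is a legal question this sketch does not settle.

```python
import hashlib


class OffChainStore:
    """Hypothetical mutable storage for large files, kept outside the ledger."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store the data and return its SHA-256 digest as the lookup key."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str):
        """Return the stored bytes, or None if they were never stored or were erased."""
        return self._blobs.get(digest)

    def erase(self, digest: str) -> bool:
        """Delete the data; return True if something was actually removed."""
        return self._blobs.pop(digest, None) is not None


def register(ledger: list, store: OffChainStore, data: bytes) -> str:
    """Store the data off the ledger and append only its digest to the ledger."""
    digest = store.put(data)
    ledger.append(digest)
    return digest


def fetch_verified(ledger: list, store: OffChainStore, digest: str):
    """Return the data only if it still matches the digest recorded on the ledger."""
    if digest not in ledger:
        raise KeyError("digest was never registered on the ledger")
    data = store.get(digest)
    if data is None:
        return None  # the off-ledger copy has been erased
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("off-ledger data does not match the ledger digest")
    return data
```

The design choice here is that the ledger entry has a constant size regardless of how large the genome file is, and the file itself can be erased from the off-ledger store while the ledger remains append-only.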
European researchers have traditionally been leading voices about the dangers and ethical implications of large-scale access to and use of genome data [7]. Genomics is highly regulated within Europe, and genome tests for medical or predictive purposes must be carried out by trained professionals in most European Union member states [8]. Consequently, compared with the US and other regions of the world, the direct-to-consumer genetic testing market in Europe, for example, is much smaller, and there are fewer incentives for building DLT-based genome data markets where European consumers can trade their private genome data. However, while the past has taught us that one can easily fall behind in the fast-paced technology sector, we also firmly believe in the European way of cautiously balancing the benefits of large-scale genome data access with the personal and societal risks that arise from it. Building on the spirit that has spawned initiatives for the open and large-scale sharing of genome data [9], we therefore call for more attention to the surging phenomenon of DLT in genomics from European researchers and institutions.
In particular, we see the following avenues for European researchers and institutions to contribute to the proliferation of DLT in genomics with a European character. First, we should focus on applying DLT as a tool for researchers that allows individuals to contribute to genomic research, while putting less emphasis on establishing yet another genome data market for consumers. Although researchers have already begun to do so (e.g., Lee et al. [10], Ozercan et al. [3], or the iDASH Privacy and Security Workshops 2018 and 2019), more efforts in this direction are necessary. Second, we should investigate how DLT can manage omics-sized data while still providing strong information security and privacy in accordance with contemporary European legislation like the GDPR. Third, we should create a European distributed ledger for genomics (e.g., a European genomics Blockchain) that serves as a lighthouse project, helps break up extant data silos, and encourages collaborations among researchers from different institutes and countries across and beyond Europe. Extant initiatives like ELIXIR could be a fruitful starting ground for setting up such a European distributed ledger for genomics.
Although only time can tell whether DLT can live up to the current hype and meet everyone’s expectations, we believe that in pursuing these avenues, we will be able to realize DLT’s full potential for genomics, beyond mere genome data markets.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
References
1. Thiebes S, Kleiber G, Sunyaev A. Cancer genomics research in the cloud: a taxonomy of genome data sets. In: Proceedings of the 4th International Workshop on Genome Privacy and Security. Orlando, Florida, USA; 2017.
2. Thiebes S, Lyytinen K, Sunyaev A. Sharing is about caring? Motivating and discouraging factors in sharing individual genomic data. In: Kim YJ, Agarwal R, Lee JK, editors. Proceedings of the 38th International Conference on Information Systems. Seoul, South Korea; 2017. p. 1–20.
3. Ozercan HI, Ileri AM, Ayday E, Alkan CJ. Realizing the potential of blockchain technologies in genomics. Genome Res. 2018;28:1255–63. doi: 10.1101/gr.207464.116.
4. Kannengießer N, Lins S, Dehling T, Sunyaev A. Mind the gap: trade-offs between Distributed Ledger Technology characteristics. Preprint at https://arxiv.org/abs/1906.00861 (2019).
5. Nakamoto S. Bitcoin: a peer-to-peer electronic cash system. Bitcoin.org. 2008; https://bitcoin.org/bitcoin.pdf. Accessed 1 August 2019.
6. Shabani M. Blockchain-based platforms for genomic data sharing: a de-centralized approach in response to the governance problems? J Am Med Inform Assoc. 2018;26:76–80. doi: 10.1093/jamia/ocy149.
7. European Society of Human Genetics. Statement of the ESHG on direct-to-consumer genetic testing for health-related purposes. Eur J Hum Genet. 2010;18:1271. doi: 10.1038/ejhg.2010.129.
8. Borry P, Van Hellemondt RE, Sprumont D, Jales CFD, Rial-Sebbag E, Spranger TM, et al. Legislation on direct-to-consumer genetic testing in seven European countries. Eur J Hum Genet. 2012;20:715–21. doi: 10.1038/ejhg.2011.278.
9. Amann RI, Baichoo S, Blencowe BJ, Bork P, Borodovsky M, Brooksbank C, et al. Toward unrestricted use of public genomic data. Science. 2019;363:350–2. doi: 10.1126/science.aaw1280.
10. Lee S-J, Cho G-Y, Ikeno F, Lee T-R. BAQALC: blockchain applied lossless efficient transmission of DNA sequencing data for next generation medical informatics. Appl Sci. 2018;8:1471. doi: 10.3390/app8091471.