Why open-access data?
Life science has traditionally not been known for the open sharing of data and models. Hopfield, a pioneer in computational neuroscience, recalled his first biology conference in 1970 as follows: “One of the senior biochemists took me aside to explain why I had no place in biology. As he said, gentlemen did not interpret other gentlemen's data and preferably worked on different organisms. If you wish to interpret data, you must get your own.” (Hopfield, 2014). The culture has, however, shifted, and the use of open-access data, models, and resources has now become commonplace in neuroscience. There are many good reasons for this, including the high quality of the available data, increased reproducibility and transparency, and opportunities for collaboration among researchers working in different institutions, countries, and continents (Choudhury et al., 2014; Pascu and Burgelman, 2022). The trend toward using publicly available data gained further momentum in the past few years, when the COVID-19 pandemic brought experimental laboratory work to a near halt in some countries. Remote work and remote analysis of open-access databases thus emerged as a way to compensate for the lack of hands-on laboratory work, and were recommended by several neuroscience committees and societies (Vlisides et al., 2021).
The most important reason for the increasing use of open-access data, models, and resources is perhaps simply their increased availability. Over the past decade, there has been a tremendous increase in the number of published neuroscience papers accompanied by raw experimental data, spurring the creation of numerous data-sharing initiatives and databases that provide a large number of datasets at no cost (Spires-Jones et al., 2016). These databases can be used for various forms of neuroscientific research, such as morphometric analysis of nervous system cells (Ascoli et al., 2007), neuroimaging analysis of MRI datasets (Poldrack and Gorgolewski, 2014; Markiewicz et al., 2021), and genomic, transcriptomic, and proteomic expression analysis (Pereira et al., 2014; Keil et al., 2018). Needless to say, each additional user of these databases increases the return on the time and money originally invested in creating them.
The same progress has been seen for open-access models: in the past, when computational neuroscientists wanted to reproduce modeling results, they often had to re-implement models based on equations and tables of parameters from scientific papers. This obstacle to scientific progress and reproducibility was acknowledged early, and the ModelDB repository for sharing neuroscience models was founded in 1996 and has been growing continuously ever since (McDougal et al., 2017). More recently, we have also seen the rise of several new model repositories, both from large initiatives such as the Allen Brain Institute and the Human Brain Project and from community-based efforts such as the Open Source Brain.
Research Topic
The main idea behind this Research Topic was to bring to the attention of a wide audience the use of open-access data, models, and resources from different publicly available repositories to generate new insights. Thus, we welcomed research papers based on open-access data or models, as well as any form of new computational resource, software, or platform that could be used for neuroscience research. As a result, eight articles were published in this topic, covering diverse aspects of neuroscience research that range from basic to clinical neuroscience and include the exploration of genetic and MRI databases and different software tools for handling connectomic, electrophysiological, and multi-modal data. As this Research Topic has shown, the greatest interest in using open-access data is still found in research based on neuroimaging databases, primarily those containing MRI scans (Short et al.; Beauferris et al.; Saat et al.). This is not surprising, as these types of data can be easily obtained from diverse and large populations of human subjects, both healthy and diseased. Researchers are thus offered a large pool of data whose analysis can lead to new advances in diagnosing and managing neuropsychiatric and other disorders (Poline et al., 2012). Another important component of this topic was the set of papers presenting novel software tools and/or algorithms. These include a unified framework for handling in vivo high-density extracellular probes (Garcia et al.), large-scale data analysis in connectomic research (Plaza et al.), and a query tool that can be used to easily search and categorize published multi-modal papers (Li and Liang). Further, the topic includes original research providing novel insights into the possible genes underlying autism spectrum disorders (Li et al.), which again emphasizes the importance and potential usage of open-access data obtained from gene-expression studies.
Although there is no doubt that open-access data will yield discoveries in the field of neuroscience, the question of protection and privacy of such data arises (Jwa and Poldrack, 2022). This remains one of the most important and controversial questions in the world of open-access data sharing, and this topic includes an opinion article on the current problems, obstacles, and policies regarding data sharing (Lathe).
Perspectives
Although open-access databases have been present in neuroscience research for more than a decade, their importance is growing and is likely to continue to grow in the coming years (Wiener et al., 2016). This may represent a game-changing option for early-career researchers, or for researchers in countries with less-than-optimal science funding, as it gives access to resources that would otherwise have been restricted to large and well-funded labs. The increasing availability of online computational resources, like EBRAINS or the Neuroscience Gateway, can also be expected to help bring high-quality scientific work within reach of a much larger population of aspiring scientists. These initiatives not only help researchers gain access to large-scale computational resources but also come with commonly used neuroscientific software pre-installed. This makes it almost trivial for researchers to ensure that their simulation results are reproducible and open to interaction, since the computational environment that produced the results can easily be fully specified and recreated.
Open-access data, models, and tools are valuable only as long as they are Findable, Accessible, Interoperable, and Reusable (FAIR) (Wilkinson et al., 2016). Achieving this requires a joint effort: on the one hand, those responsible for a database or repository must ensure that it is easy to search and access, both for humans and machines, and that interacting with it does not require the user to be an expert programmer. On the other hand, researchers must strive to be sufficiently proficient in handling the variety of software needed to interact with online databases and repositories. This underscores the importance of making bioinformatics training an essential feature of biomedical curricula; otherwise, many of these databases will remain underutilized and will not reach their full potential (Akil et al., 2011).
Neuroscience is increasingly becoming the highly interconnected global endeavor it needs to be to overcome the challenges ahead. As further progress is made, we look forward to the increasing usage of ever-expanding databases, leading to further advances in neuroscience.
Author contributions
IZ wrote the first draft of the article. RN and TN revised the final version of the manuscript. All authors read and approved the final submitted version.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
- Akil H., Martone M. E., Van Essen D. C. (2011). Challenges and opportunities in mining neuroscience data. Science 331, 708–712. doi: 10.1126/science.1199305
- Ascoli G. A., Donohue D. E., Halavi M. (2007). NeuroMorpho.Org: a central resource for neuronal morphologies. J. Neurosci. 27, 9247–9251. doi: 10.1523/JNEUROSCI.2055-07.2007
- Choudhury S., Fishman J. R., McGowan M. L., Juengst E. T. (2014). Big data, open science and the brain: lessons learned from genomics. Front. Hum. Neurosci. 8, 239. doi: 10.3389/fnhum.2014.00239
- Hopfield J. J. (2014). Two cultures? Experiences at the physics-biology interface. Phys. Biol. 11, 053002. doi: 10.1088/1478-3975/11/5/053002
- Jwa A. S., Poldrack R. A. (2022). Addressing privacy risk in neuroscience data: from data protection to harm prevention. J. Law Biosci. 9, lsac025. doi: 10.1093/jlb/lsac025
- Keil J. M., Qalieh A., Kwan K. Y. (2018). Brain transcriptome databases: a user's guide. J. Neurosci. 38, 2399–2412. doi: 10.1523/JNEUROSCI.1930-17.2018
- Markiewicz C. J., Gorgolewski K. J., Feingold F., Blair R., Halchenko Y. O., Miller E., et al. (2021). The OpenNeuro resource for sharing of neuroscience data. eLife 10, e71774. doi: 10.7554/eLife.71774
- McDougal R. A., Morse T. M., Carnevale T., Marenco L., Wang R., Migliore M., et al. (2017). Twenty years of ModelDB and beyond: building essential modeling tools for the future of neuroscience. J. Comput. Neurosci. 42, 1–10. doi: 10.1007/s10827-016-0623-7
- Pascu C., Burgelman J. C. (2022). Open data: the building block of 21st century (open) science. Data Policy 4, e15. doi: 10.1017/dap.2022.7
- Pereira S., Gibbs R. A., McGuire A. L. (2014). Open access data sharing in genomic research. Genes 5, 739–747. doi: 10.3390/genes5030739
- Poldrack R. A., Gorgolewski K. J. (2014). Making big data open: data sharing in neuroimaging. Nat. Neurosci. 17, 1510–1517. doi: 10.1038/nn.3818
- Poline J. B., Breeze J. L., Ghosh S., Gorgolewski K., Halchenko Y. O., Hanke M., et al. (2012). Data sharing in neuroimaging research. Front. Neuroinform. 6, 9. doi: 10.3389/fninf.2012.00009
- Spires-Jones T. L., Poirazi P., Grubb M. S. (2016). Opening up: open access publishing, data sharing, and how they can influence your neuroscience career. Eur. J. Neurosci. 43, 1413–1419. doi: 10.1111/ejn.13234
- Vlisides P. E., Vogt K. M., Pal D., Schnell E., Armstead W. M., Brambrink A. M., et al. (2021). Roadmap for conducting neuroscience research in the COVID-19 era and beyond: recommendations from the SNACC Research Committee. J. Neurosurg. Anesthesiol. 33, 100–106. doi: 10.1097/ANA.0000000000000758
- Wiener M., Sommer F. T., Ives Z. G., Poldrack R. A., Litt B. (2016). Enabling an open data ecosystem for the neurosciences. Neuron 92, 617–621. doi: 10.1016/j.neuron.2016.10.037
- Wilkinson M. D., Dumontier M., Aalbersberg I. J. J., Appleton G., Axton M., Baak A., et al. (2016). The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018. doi: 10.1038/sdata.2016.18