Abstract
As the impacts of climate change continue to intensify, humans face new challenges to long-term survival. Humans will likely be battling these problems long after 2100, when many climate projections currently end. A more forward-thinking view on our science and its direction may help better prepare for the future of our species. Researchers may consider datasets the basic units of knowledge, whose preservation is arguably more important than the articles that are written about them. Storing data and code in long-term repositories offers insurance against our uncertain future. To ensure open data are useful, data must be FAIR (Findable, Accessible, Interoperable and Reusable) and be complete with all appropriate metadata. By embracing open science practices, contemporary scientists give the future of humanity the information to make better decisions, save time and other valuable resources, and increase global equity as access to information is made free. This, in turn, could enable and inspire a diversity of solutions, to the benefit of many. Imagine the collective science conducted, the models built, and the questions answered if all of the data researchers have collectively gathered were organized and immediately accessible and usable by everyone. Investing in open science today may ensure a brighter future for unborn generations.
Keywords: open science, data archiving, transparency, reproducible, reproducibility
1. Introduction
Climate change is undeniably altering Earth, from the highest peaks to the world’s oceans [1]. Humans have been aware of changes to the global climate for over a century [2,3]. The relatively slow response in adjusting emissions [4,5] suggests that we may be tackling the problem of human-induced climate change for an extended period of time, likely long after the year 2100—a date that is commonly the end of future climate change projections [6]. While humans should continue to reduce greenhouse gas emissions, mitigate the effects of those emissions already released, and continue research to understand climate change and climate impact in the immediate term, we also need to begin to think on a broader temporal scale and prepare for our future battles with such persistent challenges.
What climate change questions will future researchers be trying to answer in the year 2100? This question is impossible to answer precisely, but contemporary scientists have some control in how prepared future researchers are for the uncertain effects of climate change. That is, knowledge workers can leave behind masses of organized and detailed information, which can allow the future of humanity to synthesize information in novel ways [7] and make more informed decisions [8–10]. However, there are currently many systemic issues that make synthesis and informed decision-making more difficult, such as improperly analysed data [11,12], non-reproducible research [13], fabricated results [14,15], publication bias [16,17] and a general lack of available data [18]. While open data do not eliminate these systemic issues, they allow for proper reanalysis [19], assessment of reproducibility [20], identification of fabricated results [21,22] and documentation of publication bias [23], along with providing immediately available data for meta-analyses focused on generating new insights [7,24,25].
Why not make our data and/or code the most useful it can be, potentially long into the future? There are, of course, many legitimate hesitations in data sharing [26–29], including the potential unethical or discriminatory misuse of data [30,31]. The historic abuse of people and their information, for example, makes an obvious case for respecting Indigenous data sovereignty [32]. Data collected on Indigenous Peoples can be overly focused on highlighting a narrative of deficit (i.e. portraying Indigenous communities through perceived deficiencies or lack of capabilities), and making these data open can cause more harm than benefit as this narrative is perpetuated [33]. CARE Principles (Collective benefit, Authority to control, Responsibility, Ethics) can help guide best practices for governance and stewardship of Indigenous data [34]. There are many other types of sensitive data (e.g. individual health records, personal financial information and interview transcripts) that generate privacy and security concerns and which could be (mis)used out of context or without further ethical approval if they were made openly available [35]. In a more extreme example, termed dual-use research of concern, open genetic information could be used for viral engineering in a biological warfare or terrorism context [36]. Importantly, in many of these cases data should not be made open. Each case should be handled with care, and available resources for working with sensitive data should be consulted [37–40].
In the majority of cases of data sharing hesitations in the biological sciences, there are other barriers, such as a lack of time, not knowing where to store open data or simply not wanting to share data or code [27,29], that can be more easily overcome. Many of these barriers are being lowered and the benefits of open data and code outweigh the costs to individual researchers [41]. Benefits include increased citation rates, more organized and efficient projects, increased collaboration opportunities, positive reviews and enhanced ability to find errors before, during and after publication [28,29,42–46]. However, work from Berberi & Roche [47] suggests that open data policies, by themselves, do not necessarily increase article retractions or corrections. Yet, the benefits do not stop at the individual researcher. Society could profit from open science for generations to come [48–52].
2. The collective advantages today
Many individuals are already familiar with the power of open data and open science tools, as they lead to more informed decision-making and save time and other valuable resources [48,49,52–54]. It is not difficult to imagine the countless ways that open datasets have sped up knowledge production [55–57], lowered costs [48] and advanced innovation for the benefit of society, not just individual research projects [58]. As common examples, researchers often use open datasets providing global air or sea surface temperatures and satellite imagery in scientific work [59,60]. Freely available satellite imagery has been used in countless contexts to aid in our collective understanding of the natural world. With such tools, we have been able to monitor water use and availability, plant and animal habitat availability, sustainable development and many other land use changes over time [61–63].
Yet, simply sharing data is not enough [55]. For data to be useful they must be FAIR (Findable, Accessible, Interoperable and Reusable; [64]). That is, they must be stored in openly available digital repositories in machine-readable formats (i.e. structured so that computers can read and process them automatically, without human intervention) that are also non-proprietary (i.e. accessible to all with a computer, not just those with access to proprietary software). Repositories should include enough metadata for a naive reader to be able to interpret how, where, when and why the data were collected (including units of measurement and relevant caveats). While complying with FAIR guidelines can seem daunting at first, especially for those who are inexperienced in open science practices, there are many resources available [29,64]. For example, considerable effort has gone into developing metadata templates, standards, protocols and guidelines [65–70] that can be followed until individual workflows become common practice and a part of the normal publishing process.
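As a minimal sketch of what machine-readable, non-proprietary data with accompanying metadata could look like, the Python example below builds a plain-text CSV table and a JSON metadata record describing how, where, when and why the data were collected, together with units and caveats. The observations, column names and metadata fields are all hypothetical and invented for illustration; they do not follow any particular metadata standard, and a real project would likely adopt one of the community templates cited above.

```python
import csv
import io
import json

# Hypothetical observations; values and column names are invented for illustration.
records = [
    {"station_id": "A1", "date": "2021-05-12", "sea_surface_temp_c": 11.4},
    {"station_id": "A2", "date": "2021-05-13", "sea_surface_temp_c": 11.9},
]

# Hypothetical metadata record answering how, where, when and why,
# plus units and caveats. Field names are assumptions, not a formal standard.
metadata = {
    "title": "Example sea surface temperature survey",
    "how": "Handheld thermometer, one reading per station",
    "where": "Fictional coastal transect",
    "when": "2021-05-12 to 2021-05-13",
    "why": "Illustrative baseline monitoring",
    "columns": {
        "station_id": {"description": "Sampling station code", "units": None},
        "date": {"description": "Sampling date (ISO 8601)", "units": None},
        "sea_surface_temp_c": {
            "description": "Sea surface temperature",
            "units": "degrees Celsius",
        },
    },
    "caveats": ["Readings were not calibrated against a reference sensor"],
}


def to_csv_text(rows):
    """Serialize a list of dicts as CSV, a plain-text non-proprietary format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def undocumented_columns(rows, meta):
    """Return data columns that have no entry in the metadata record."""
    return sorted(set(rows[0].keys()) - set(meta["columns"].keys()))


csv_text = to_csv_text(records)
metadata_text = json.dumps(metadata, indent=2, sort_keys=True)
missing = undocumented_columns(records, metadata)

print(csv_text)
print(metadata_text)
print("Undocumented columns:", missing)
```

The small check for undocumented columns is one possible way to catch metadata gaps before archiving; the two text outputs could be saved side by side (for example as a `.csv` and a `.json` file) in a repository deposit.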
Long-term repositories that mint digital object identifiers (DOIs) for data and code (e.g. Zenodo, Dryad, Open Science Framework) ensure longevity, whereas including data or code as journal supplementary materials or linking articles to GitHub accounts via URLs (i.e. not archiving through the GitHub–Zenodo integration) is more likely to be short-lived, as URLs are almost certain to break not far into the future [71,72]. Attaching a DOI to data and code has the additional advantage of allowing for the direct citation of those data and code when they are reused for other purposes.
Despite these useful guidelines, there are still practical decisions that individuals must make. Exactly what form of ‘raw’ data will be useful to others will depend on the subdiscipline and the new research questions. For example, active acoustics in fishery research produces many large sound files that require substantial processing before biological information can be extracted [73,74]. Raw sound files are large and cumbersome, such that they are less likely to be shared directly, relative to the processed data (e.g. estimates of biomass; [75]). While the processed data are both easier to share and re-use for many applications, raw sound files are still useful, as they can be re-processed in other ways to assess completely different questions, providing a valuable resource (e.g. building a timeseries for a different target species; [76]). Thus, we should strive to archive all our data [77] within reason, even if we cannot imagine what future scientists will use them for. With this said, larger datasets can present significant archiving and re-use challenges [78–81]. Some repositories have relatively high storage limits that will suit many research project needs, and other solutions exist for larger datasets, such as bundling into smaller datasets such that each is within the storage limit [82,83]. However, some datasets are so large that they impose high financial and environmental costs to be retained and preserved. In these field-specific cases, decisions must be made about what types of data are important enough to warrant the resource-intensive process of retaining and preserving these data.
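The bundling approach mentioned above can be sketched in a few lines of Python. The example below is one possible strategy only (a greedy first-fit-decreasing grouping); the file names, file sizes and the 50 GB per-deposit limit are hypothetical and do not describe any particular repository.

```python
def bundle_files(sizes_gb, limit_gb):
    """Group files into bundles whose total size stays within limit_gb.

    Uses a greedy first-fit-decreasing heuristic: place the largest files
    first, each into the first bundle that still has room.
    """
    bundles = []  # each bundle is {"files": [names], "total": size}
    by_size = sorted(sizes_gb.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        if size > limit_gb:
            raise ValueError(f"{name} alone exceeds the {limit_gb} GB limit")
        for bundle in bundles:
            if bundle["total"] + size <= limit_gb:
                bundle["files"].append(name)
                bundle["total"] += size
                break
        else:
            # No existing bundle had room, so start a new one.
            bundles.append({"files": [name], "total": size})
    return bundles


# Hypothetical raw acoustic files (sizes in GB) and a hypothetical 50 GB limit.
sizes = {"a.wav": 30, "b.wav": 25, "c.wav": 20, "d.wav": 15, "e.wav": 10}
bundles = bundle_files(sizes, limit_gb=50)

for number, bundle in enumerate(bundles, start=1):
    print(number, bundle["files"], bundle["total"])
```

With these invented sizes, the five files fit into two deposits of 50 GB each. Each bundle would still need its own metadata describing how it relates to the full dataset.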
Not only do open data benefit scientific progress, open code does as well. Like open data, the usefulness of open code depends on how it is shared. That is, it ideally needs to be adequately described, reviewed, quality controlled and operable when shared [84–86]. Many researchers directly benefit from others who build, and maintain, freely available software that we use for data storage and sharing, processing, visualization, analysis and collaboration [54,87–91]. With advances in open software availability and cheap computing power, free modelling tools are becoming increasingly complex, which can increase our ability to build an accurate representation of the world around us [92–94].
Ecosystem models, for example, are powerful simulation-based process models which can help in assessing potential future management and climate scenarios in a holistic ecosystem-based framework. Yet, complex system models are inherently data-hungry, and model predictions are often limited by the available data [95]. There are several open or publicly available resources that can aid in ecosystem model development and parameterization [96–98], yet entire-system model-building efforts are still costly and time-consuming, as many sources of information are missing or difficult to obtain [8,99].
In my own recent experience re-parameterizing an end-to-end ecosystem model [100], obtaining datasets from collaborators within a common institution took an average of nearly 100 days from the first data request and required over 80 e-mails per dataset to obtain the data, clarify metadata and outline dataset caveats (table 1). Two datasets took nearly 1 year to retrieve, despite the fact that the U.S. Government requires that these taxpayer-funded datasets be made publicly available [58,102,103]. While communication during a data transfer process can clarify many caveats and misconceptions about a dataset, and thus is not inherently undesirable, substantial delays in integrating such information cause increased costs in providing new knowledge and can unnecessarily postpone overdue conservation management actions and decision-making [104,105].
Table 1.
Effort required to gather in-house (National Oceanic and Atmospheric Administration [NOAA] Fisheries-produced) data during my postdoctoral research associateship with NOAA-Fisheries. Each row represents a dataset used to update an ecosystem model. I calculated the number of days it took me to get each dataset by subtracting the date I initially sent an e-mail to data managers from the date that I received the final dataset for use in the model. The total number of e-mails sent by all parties and the hours of meetings spent discussing the data and/or the transfer of data were counted. The last row is an openly available dataset [101].
| initial e-mail (dd mm yyyy) | final data (dd mm yyyy) | # days | # e-mails | hours of meetings |
|---|---|---|---|---|
| 12 05 2021 | 25 04 2022 | 348 | 70 | 2 |
| 05 04 2021 | 27 01 2022 | 297 | 152 | 11 |
| 21 04 2021 | 12 07 2021 | 82 | 73 | 2 |
| 24 06 2021 | 02 08 2021 | 39 | 99 | 3 |
| 12 05 2021 | 04 06 2021 | 23 | 60 | 3 |
| 10 05 2021 | 10 05 2021 | 0 | 103 | 3 |
| 01 05 2021 | 03 05 2021 | 2 | 101 | 3 |
| — | — | 0 | 0 | 0 |
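The date arithmetic described in the caption of table 1 can be reproduced with a short Python sketch. The values below are transcribed from the table; reading the dates as day, month, year is an assumption, chosen because it reproduces the published day counts, and treating the openly available dataset as requiring zero days follows the table's last row.

```python
from datetime import date
from statistics import mean

# Rows transcribed from table 1:
# (initial e-mail, final data, number of e-mails, hours of meetings).
# Dates are read as day-month-year (an assumption that matches the "# days"
# column). The last row is the openly available dataset, which has no dates.
ROWS = [
    (date(2021, 5, 12), date(2022, 4, 25), 70, 2),
    (date(2021, 4, 5), date(2022, 1, 27), 152, 11),
    (date(2021, 4, 21), date(2021, 7, 12), 73, 2),
    (date(2021, 6, 24), date(2021, 8, 2), 99, 3),
    (date(2021, 5, 12), date(2021, 6, 4), 60, 3),
    (date(2021, 5, 10), date(2021, 5, 10), 103, 3),
    (date(2021, 5, 1), date(2021, 5, 3), 101, 3),
    (None, None, 0, 0),
]


def days_to_obtain(first_email, final_data):
    """Days from the first request to the final dataset (0 if no request was needed)."""
    if first_email is None or final_data is None:
        return 0
    return (final_data - first_email).days


days = [days_to_obtain(start, end) for start, end, _, _ in ROWS]
emails = [count for _, _, count, _ in ROWS]

print("Days per dataset:", days)
print("Mean days:", mean(days))
print("Mean e-mails:", mean(emails))
```

Averaged over all eight rows, this gives roughly 99 days and 82 e-mails per dataset, which appears consistent with the "nearly 100 days" and "over 80 e-mails" quoted in the text.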
I am not alone in realizing the inefficiencies and difficulties in obtaining data from authors [106–109]. Bledsoe et al. [110] illustrate other examples of difficulty in rescuing data; for example, they highlight that after 5 years of trying to locate data from the monitoring that followed the Exxon Valdez oil spill, researchers were only able to recover 30% of the data, estimating that US$105 million was spent on collecting data that are no longer available [110]. In general, requests for data from previously published research are often unsuccessful, failing 41–73% of the time [106,111–114]. In a particularly stark example, Wicherts et al. [112] sent more than 400 e-mails over the course of 6 months and only received data for 27% of the publications of interest (73% of authors did not share). Many researchers are on short-term contracts and are unable to fulfil project goals with such delays, leading to wasted resources and stifled careers. Many organizations around the world do not have adequate funds to support researchers waiting for access to valuable data. Likewise, if we expect to respond to climate change in a reasonable amount of time, a more efficient system would be useful.
Yet, we cannot blame individual researchers for experiencing real barriers to sharing their data and code [29]. Instead, a culture shift is necessary to adopt open science practices [115]. Institutions can support data and code archiving with resources (e.g. funding, formal guidance [116], workshops and training, library staff assistance). For example, the American Geophysical Union has developed and compiled resources as guidance for authors, which are also applicable elsewhere (https://data.agu.org/resources/). While open data mandates are useful in increasing data availability [113], the shared data are often not reusable [117]. Thus, while it is a useful first step for top–down entities to ‘require’ data sharing, compliance may improve with enforcement and quality control. Some authors have suggested that enforcement could include withholding research funding [58,102], while others include delaying promotion or tenure, or holding back journal article publications as potential options [118]. Several funding agencies have explicit open data-sharing policies and mention monitoring compliance and sanctions in cases of non-compliance (e.g. [119], §3). Journals such as The American Naturalist, the Journal of Evolutionary Biology, Ecology Letters and the Proceedings of the Royal Society B each have designated data editors whose role is to ensure that authors provide organized data and metadata in long-term repositories for future re-use [120].
The sharing of data and code can and should be streamlined to save costs and reduce waste (e.g. time, resources and funding [104]). As an example from my own work, the National Oceanic and Atmospheric Administration’s (NOAA-Fisheries) Northwest Fisheries Science Center West Coast Groundfish Bottom Trawl Survey team (last row in table 1) has stored their data in an open and freely available data warehouse that can be pulled directly into R via a custom package [101]. These efforts, while a significant upfront time commitment, will likely continue to provide researchers with instant access to their data, saving everyone (including the original data providers) time and funds in the long term. Imagine the science that could be conducted, the models that could be built and the questions that could be answered if the data that have been collectively gathered were well-documented, organized and immediately accessible (i.e. following FAIR principles) to everyone.
3. The collective advantages tomorrow
Open science has already greatly benefitted humans [48,50,121], and there is no reason to believe that further progress on this front will not continue doing so long into the future [49,51,122]. Open data and code will likely allow future scientists to predict, with even greater precision and accuracy, future climate states, and the resulting patterns of animal movement and habitat availability for the world’s species [123,124]. As organism distributions shift on global and regional scales [125–127], species will cross geopolitical boundaries [128,129]. Future scientists will likely have many versions of entire Earth biosphere models, to better predict the management outcomes of harvested species and responses of endangered species under continued changes to our planet. If scientists consistently perform the actions required for achieving open science, it may allow future humans to have access to data that we can only wish that we had today [130,131]. Today’s actions can pave the path for how many centuries-long time-series datasets future humans will inherit. It is worth noting, however, that this process will require data curation, including preservation, to ensure these data are usable for the long term [132,133].
Transparent sharing of data with the world can lead to a more inclusive and collaborative future of global natural resource management [134,135]. Many countries and institutions around the world do not have adequate funds to collect data or pay for the use of (or development of) analytical software. While computing power costs can still be prohibitive in some cases, personal computing hardware has become more accessible and cheaper than ever. Combined with freely available data and code, many more students and researchers are empowered to participate when science is open [122,136,137]. This can lead to collective benefits as a diversity of perspectives leads to a diversity of solutions to global issues [138,139].
We are in an era of scientific progress that is highly wasteful [104,140–144]. Individuals tend to cherish the research funds they are granted, but take for granted the monetary value of data and code after they have been collected or created. Collectively, the United States and the European Union spend hundreds of billions of dollars on research [145,146]. Ensuring that these funds are well-spent gathering information or building tools that will have long shelf lives should be of utmost importance [141]. Yet, we continually fail to archive our data and code in long-term repositories [27,77,108,147]. Sutherland [105] suggests that policy and management decisions are often made without regard to the available evidence, such that actions often do not lead to the desired results, and Buxton et al. [104] report that this can lead to the waste of time, funds, and valuable natural resources and ecological services. This raises the question of why scientists collect data in the first place. Creating a culture of open data and code access may lead to more appropriate policies and management practices as decision-making moves towards a more transparent evidence-based process [102,103,134,135]. Instead of pretending that our data are only useful for the research question at hand, adopting the mindset of assuming that our data are valuable to the world might lead to better open science practices.
In 2008, the Svalbard Global Seed Vault was opened. Its purpose is to ‘… secure the foundation of our future food supply’ [148]. As of September 2023, 99 organizations have contributed over 1.2 million seed samples since the Global Seed Vault opened [148]. This highly collaborative effort prioritizes the future of humanity in an uncertain future. Like the hallways of seed storage shelves in the Seed Vault, hallways of servers are ready to be filled with information for future generations. And similarly, data and code will help ‘… future generations to overcome the challenges of climate change and population growth’ [148]. The question that we must ask ourselves is ‘what insurance policy do we want to have with our global knowledge?’
Large databases that store information compiled from many sources parallel global efforts such as the Seed Vault. One example of these databases is the Ocean Biodiversity Information System (OBIS), which supports over 45 million observations of diverse taxa from bacteria to marine mammals, collected by 500 institutions in 56 countries ([96]; https://obis.org/about/). Considerable collaborative effort has gone into the standardization and integration of such a database, including delivery to the Global Biodiversity Information Facility (http://www.gbif.org), for the benefit of those wishing to access such data [149]. Curating such large databases is no simple feat and requires additional effort beyond the initial data sharing by the original researchers. For example, changes to taxonomic nomenclature over time, typos and other errors pose serious challenges to the organization of these databases.
OBIS devotes significant staff time to helping clarify the appropriate metadata, which importantly takes some burden off the researchers who collected the primary data, as they are guided through this process [149]. Yet, in general, databases often lack appropriate metadata, which hinders the findability and reusability of available data [150–152]. Differences in methodology, ontology and definitions of vocabulary can lead to a lack of interoperability between datasets. While there have been many efforts to standardize metadata (e.g. [67–69,152]), these standards are lacking in many disciplines and are not adequately policed in many repositories. Institutions, funding agencies and grant writers can play a role by providing monetary and logistical support for such efforts. The costs of maintaining these databases are substantial but are outweighed by the benefits to scientific progress and society [153–158].
4. A change in culture
What is the unit of knowledge that we would most like to protect for future generations? Is it the scientific publication? Or is it our datasets? Datasets are snapshots in space and time of n-dimensional hypervolumes of information that are resources in and of themselves, each giving numerous insights into the measured world [134,135]. New publishing paradigms, such as Octopus, allow researchers to link multiple ‘Analysis’ and/or ‘Interpretation’ publications to a single ‘Results’ publication as alternative analyses and interpretations of the same data [159]. A more traditional research paper, on the other hand, is one realization of many possible assessments of the data that were originally collected, and a wide diversity of results can be obtained when many individuals analyse one dataset with the same research question in mind [160,161]. That is, each publication is one oversimplified projection through n-dimensional space that communicates a story our human minds can comprehend. Manuscript narratives, by necessity, leave out information to craft such a story.
This is not to say that scientific publications in and of themselves are not useful. On the contrary, they frame our current and historical understanding of the world and put scientific inquiry into the relevant spatial and temporal context. Scientific articles offer analysis and interpretation of data which will allow future generations to understand why certain policies, management actions, or approaches were attempted and/or abandoned. However, if future researchers are not granted access to our (past) data, future humans will have to repeat costly (e.g. time and resources) experiments, laboriously extract information directly from figures, tables and text in the articles themselves (assuming the relevant information is available and detailed enough, although there is evidence that this is not the case in at least some disciplines [55,162]) or will have to trust our analytical procedures and our intuitions and perceptions about the data we collected [160,161].
We do not hesitate to spend our valuable time and resources to publish our manuscripts in journals, have DOIs minted and collectively fund the hosting of these research products in multiple archives. Yet, we argue that a lack of time and funding keeps us from doing the same with our datasets [27]. We need to consider datasets a basic unit of knowledge, the importance of which is arguably at least on par with the importance of storing the articles that are written about them [134]. Fortunately, we do not have to choose between the two. We can archive both if we choose to.
The mind-bendingly fast growth of artificial intelligence is changing the scale and scope of what we can learn with the information around us. Who can possibly know what statistical methodology or machine learning algorithms we will be applying to our data in 2100? It seems nearly certain that scientists in 2100 would rather have access to our data than to any antiquated model output or table of summary statistics that we might generate today. Our ‘data available upon request’ statements, which are still pervasive in our manuscripts [106–109], will give future scientists a bit of frustration with our lack of foresight, as this type of data availability declines rapidly with age [163]. What computer, hard drive, institutional e-mail address, or contemporary researcher will be alive and operational in 2100 to share their data upon request?
As knowledge workers, we should strive to show our work—archive our data and code—to safeguard the uncertain future of our species. Earth contains complex, highly dynamic systems that we often make sense of with access to high-quality data [164] and complex modelling [95,165,166]. Society collectively benefits from openly sharing data and software with one another today and tomorrow as we form a better understanding of the world around us—a world that humans rely on for survival. Open science practices broaden participation in science and create a more inclusive, equitable and sustainable world as useful information (e.g. data and code), and a diversity of solutions, are shared—including with those who cannot afford to collect costly data [75,167]. As the cost of computational resources continues to decline, the usefulness of open data and code to underfunded scientists around the world only continues to increase. By locking information behind paywalls and taking (taxpayer-funded) datasets to the grave, knowledge workers exclude most of the world’s population from the process of science and the state of the resulting knowledge. Alternatively, investing in open science today can help to ensure information is available for unborn generations.
Acknowledgements
I thank Dom Roche, Hunter J. Cole, Howard Browman, Vivian Hutchison and Madison L. Langseth for useful comments on earlier versions of this manuscript. Any use of trade, firm or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Ethics
This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility
This article has no additional data.
Declaration of AI use
I have not used AI-assisted technologies in creating this article.
Authors’ contributions
D.G.E.G.: conceptualization, investigation, project administration, writing—original draft, writing—review and editing.
Conflict of interest declaration
I declare I have no competing interests.
Funding
No funding has been received for this article.
References
- 1. Karl TR, Trenberth KE. 2003. Modern global climate change. Science 302, 1719–1723. ( 10.1126/science.1090228) [DOI] [PubMed] [Google Scholar]
- 2. Arrhenius S. 1896. XXXI. On the influence of carbonic acid in the air upon the temperature of the ground. Lond. Edinb. Dublin Philos. Mag. J. Sci. 41, 237–276. [Google Scholar]
- 3. Callendar GS. 1938. The artificial production of carbon dioxide and its influence on temperature. Q. J. R. Meteorol. Soc. 64, 223–240. ( 10.1002/qj.49706427503) [DOI] [Google Scholar]
- 4. Figueres C, Rivett-Carnac T. 2021. The future we choose: the stubborn optimist’s guide to the climate crisis. New York, NY: Vintage. See https://books.google.com/books?hl=en&lr=&id=JLUlEAAAQBAJ&oi=fnd&pg=PR9&dq=+The+Future+we+Choose:+The+Stubborn+Optimist%E2%80%99s+Guide+to+the+Climate+Crisis&ots=8QbAcVvdbc&sig=w6b4Z7uSxSX6kqSw5He7FT0Q-cw. [Google Scholar]
- 5. Bryndum-Buchholz A. 2022. Keeping up hope as an early career climate-impact scientist. ICES J. Mar. Sci. 79, 2345–2350. ( 10.1093/icesjms/fsac180) [DOI] [Google Scholar]
- 6. Pörtner HO, Roberts DC, Poloczanska ES, Mintenbeck K, Tignor M, Alegría A, Craig M. 2022. IPCC, 2022: summary for policymakers. Cambridge, UK and New York, NY: Cambridge University Press. [Google Scholar]
- 7. Gurevitch J, Koricheva J, Nakagawa S, Stewart G. 2018. Meta-analysis and the science of research synthesis. Nature 555, 175–182. ( 10.1038/nature25753) [DOI] [PubMed] [Google Scholar]
- 8. Levin N, et al. 2014. Biodiversity data requirements for systematic conservation planning in the Mediterranean Sea. Mar. Ecol. Prog. Ser. 508, 261–281. ( 10.3354/meps10857) [DOI] [Google Scholar]
- 9. Dichmont CM, et al. 2017. From data rich to data-limited harvest strategies—does more data mean better management? ICES J. Mar. Sci. 74, 670–686. (doi:10.1093/icesjms/fsw199)
- 10. Machado AMS, Giehl ELH, Fernandes LP, Ingram SN, Daura-Jorge FG. 2021. Alternative data sources can fill the gaps in data-poor fisheries. ICES J. Mar. Sci. 78, 1663–1671. (doi:10.1093/icesjms/fsab074)
- 11. Ioannidis JPA. 2005. Why most published research findings are false. PLoS Med. 2, e124. (doi:10.1371/journal.pmed.0020124)
- 12. Deressa T, Stern D, Vangronsveld J, Minx J, Lizin S, Malina R, Bruns S. 2023. More than half of statistically significant research findings in the environmental sciences are actually not. EcoEvoRxiv. (doi:10.32942/X24G6Z)
- 13. Baker M. 2016. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454. (doi:10.1038/533452a)
- 14. Fanelli D. 2009. How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS One 4, e5738. (doi:10.1371/journal.pone.0005738)
- 15. Clark TD, Binning SA, Raby GD, Speers-Roesch B, Sundin J, Jutfelt F, Roche DG. 2016. Scientific misconduct: the elephant in the lab. A response to Parker et al. Trends Ecol. Evol. 31, 899–900. (doi:10.1016/j.tree.2016.09.006)
- 16. Kimmel K, Avolio ML, Ferraro PJ. 2023. Empirical evidence of widespread exaggeration bias and selective reporting in ecology. Nat. Ecol. Evol. 7, 1525–1536. (doi:10.1038/s41559-023-02144-3)
- 17. Yang Y, Sánchez-Tójar A, O’Dea RE, Noble DWA, Koricheva J, Jennions MD, Parker TH, Lagisz M, Nakagawa S. 2023. Publication bias impacts on effect size, statistical power, and magnitude (Type M) and sign (Type S) errors in ecology and evolutionary biology. BMC Biol. 21, 71. (doi:10.1186/s12915-022-01485-y)
- 18. Wallis JC, Rolando E, Borgman CL. 2013. If we share data, will anyone use them? Data sharing and reuse in the long tail of science and technology. PLoS One 8, e67332. (doi:10.1371/journal.pone.0067332)
- 19. Ebrahim S, Sohani ZN, Montoya L, Agarwal A, Thorlund K, Mills EJ, Ioannidis JPA. 2014. Reanalyses of randomized clinical trial data. JAMA 312, 1024–1032. (doi:10.1001/jama.2014.9646)
- 20. Hardwicke TE, et al. 2018. Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition. R. Soc. Open Sci. 5, 180448. (doi:10.1098/rsos.180448)
- 21. Enserink M. 2017. Paper about how microplastics harm fish should be retracted, report says. Science (doi:10.1126/science.aal1133)
- 22. Pennisi E. 2020. Prominent spider biologist spun a web of questionable data. Science 367, 613–614. (doi:10.1126/science.367.6478.613)
- 23. Nakagawa S, Lagisz M, Jennions MD, Koricheva J, Noble DWA, Parker TH, Sánchez-Tójar A, Yang Y, O’Dea RE. 2022. Methods for testing publication bias in ecological and evolutionary meta-analyses. Methods Ecol. Evol. 13, 4–21. (doi:10.1111/2041-210x.13724)
- 24. Kupershmidt I, et al. 2010. Ontology-based meta-analysis of global collections of high-throughput public data. PLoS One 5, e13066. (doi:10.1371/journal.pone.0013066)
- 25. Culina A, Crowther TW, Ramakers JJC, Gienapp P, Visser ME. 2018. How to do meta-analysis of open datasets. Nat. Ecol. Evol. 2, 1053–1056. (doi:10.1038/s41559-018-0579-2)
- 26. Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, Cain KE, Kokko H, Jennions MD, Kruuk LEB. 2014. Troubleshooting public data archiving: suggestions to increase participation. PLoS Biol. 12, e1001779. (doi:10.1371/journal.pbio.1001779)
- 27. Stuart D, Baynes G, Hrynaszkiewicz I, Allin K, Penny D, Lucraft M, Astell M. 2018. Practical challenges for researchers in data sharing. Berlin, Germany: Springer Nature.
- 28. Soeharjono S, Roche DG. 2021. Reported individual costs and benefits of sharing open data among Canadian academic faculty in ecology and evolution. Bioscience 71, 750–756. (doi:10.1093/biosci/biab024)
- 29. Gomes DGE, Pottier P, Crystal-Ornelas R, Hudgins EJ, Foroughirad V, Sánchez-Reyes LL, Turba R. 2022. Why don’t we share data and code? Perceived barriers and benefits to public archiving practices. Proc. R. Soc. B 289, 20221113. (doi:10.1098/rspb.2022.1113)
- 30. Daly A, Mann M, Devitt SK. 2019. Good data. Amsterdam, The Netherlands: Institute of Network Cultures.
- 31. Ozalp AS. 2019. Unlawful data access and abuse of metadata for mass persecution of dissidents in Turkey: the Bylock Case. In Good data (eds Daly A, Devitt SK, Mann M), pp. 117–134. Amsterdam, The Netherlands: Institute of Network Cultures.
- 32. Walter M, Kukutai T, Carroll SR, Rodriguez-Lonebear D. 2021. Indigenous data sovereignty and policy. London, UK: Routledge. (doi:10.4324/9780429273957)
- 33. Walter M, Lovett R, Maher B, Williamson B, Prehn J, Bodkin-Andrews G, Lee V. 2021. Indigenous data sovereignty in the era of big data and open data. Aust. J. Soc. Issues 56, 143–156. (doi:10.1002/ajs4.141)
- 34. Carroll SR, et al. 2020. The CARE principles for Indigenous data governance. Data Science Journal 19, 43. (doi:10.5334/dsj-2020-043)
- 35. Liu B, Wei L. 2023. Unintended effects of open data policy in online behavioral research: an experimental investigation of participants’ privacy concerns and research validity. Comput. Human Behav. 139, 107537. (doi:10.1016/j.chb.2022.107537)
- 36. Smith JA, Sandbrink JB. 2022. Biosecurity in an age of open science. PLoS Biol. 20, e3001600. (doi:10.1371/journal.pbio.3001600)
- 37. Borgesius FZ, Gray J, Van Eechoud M. 2015. Open data, privacy, and fair information principles: towards a balancing framework. Berkeley Technol. Law J. 30, 2073–2131.
- 38. Van Atteveldt W, Althaus S, Wessler H. 2021. The trouble with sharing your privates: pursuing ethical open science and collaborative research across national jurisdictions using sensitive data. Polit. Commun. 38, 192–198. (doi:10.1080/10584609.2020.1744780)
- 39. Templ M, Sariyar M. 2022. A systematic overview on methods to protect sensitive data provided for various analyses. Int. J. Inf. Secur. 21, 1233–1246. (doi:10.1007/s10207-022-00607-5)
- 40. Karcher S, Secen S, Weber N. 2023. Protecting sensitive data early in the research data lifecycle. J. Priv. Confid. 13, 846. (doi:10.29012/jpc.846)
- 41. McKiernan EC, et al. 2016. How open science helps researchers succeed. eLife 5, e16800. (doi:10.7554/elife.16800)
- 42. Jong S, Slavova K. 2014. When publications lead to products: the open science conundrum in new product development. Res. Policy 43, 645–654. (doi:10.1016/j.respol.2013.12.009)
- 43. Friesike S, Widenmayer B, Gassmann O, Schildhauer T. 2015. Opening science: towards an agenda of open science in academia and industry. J. Technol. Transf. 40, 581–601. (doi:10.1007/s10961-014-9375-6)
- 44. Chataway J, Parks S, Smith E. 2017. How will open science impact on university–industry collaboration. Foresight and STI Governance 11, 44–53. (doi:10.17323/2500-2597.2017.2.44.53)
- 45. Christensen G, Dafoe A, Miguel E, Moore DA, Rose AK. 2019. A study of the impact of data sharing on article citations using journal policies as a natural experiment. PLoS One 14, e0225883. (doi:10.1371/journal.pone.0225883)
- 46. Fernández-Juricic E. 2021. Why sharing data and code during peer review can enhance behavioral ecology research. Behavioral Ecology and Sociobiology 75, 103. (doi:10.1007/s00265-021-03036-x)
- 47. Berberi I, Roche DG. 2022. No evidence that mandatory open data policies increase error correction. Nat. Ecol. Evol. 6, 1630–1633. (doi:10.1038/s41559-022-01879-9)
- 48. Piwowar HA, Vision TJ, Whitlock MC. 2011. Data archiving is a good investment. Nature 473, 285–285. (doi:10.1038/473285a)
- 49. Lowndes JSS, Best BD, Scarborough C, Afflerbach JC, Frazier MR, O’Hara CC, Jiang N, Halpern BS. 2017. Our path to better science in less time using open data science tools. Nat. Ecol. Evol. 1, 160. (doi:10.1038/s41559-017-0160)
- 50. Besançon L, Peiffer-Smadja N, Segalas C, Jiang H, Masuzzo P, Smout C, Billy E, Deforet M, Leyrat C. 2021. Open science saves lives: lessons from the COVID-19 pandemic. BMC Med. Res. Methodol. 21, 117. (doi:10.1186/s12874-021-01304-y)
- 51. Kadakia KT, Beckman AL, Ross JS, Krumholz HM. 2021. Leveraging open science to accelerate research. N. Engl. J. Med. 384, e61. (doi:10.1056/nejmp2034518)
- 52. Roche DG, O’Dea RE, Kerr KA, Rytwinski T, Schuster R, Nguyen VM, Young N, Bennett JR, Cooke SJ. 2022. Closing the knowledge-action gap in conservation with open science. Conserv. Biol. 36, e13835. (doi:10.1111/cobi.13835)
- 53. Murray-Rust P. 2008. Open Data in Science. Nat. Prec. 1. (doi:10.1038/npre.2008.1526.1)
- 54. Braga PHP, et al. 2023. Not just for programmers: how GitHub can accelerate collaborative and reproducible research in ecology and evolution. Methods Ecol. Evol. 14, 1364–1380. (doi:10.1111/2041-210x.14108)
- 55. Roche DG, Kruuk LEB, Lanfear R, Binning SA. 2015. Public data archiving in ecology and evolution: how well are we doing? PLoS Biol. 13, e1002295. (doi:10.1371/journal.pbio.1002295)
- 56. Piwowar HA, Day RS, Fridsma DB. 2007. Sharing detailed research data is associated with increased citation rate. PLoS One 2, e308. (doi:10.1371/journal.pone.0000308)
- 57. Piwowar HA, Vision TJ. 2013. Data reuse and the open data citation advantage. PeerJ 1, e175. (doi:10.7717/peerj.175)
- 58. Obama B. 2013. Executive order—making open and machine readable the new default for government information. Washington, DC: The White House.
- 59. Tucker CJ, Grant DM, Dykstra JD. 2004. NASA’s global orthorectified landsat data set. Photogramm. Eng. Remote Sensing 70, 313–322. (doi:10.14358/PERS.70.3.313)
- 60. Smith TM, Reynolds RW, Peterson TC, Lawrimore J. 2008. Improvements to NOAA’s historical merged land–ocean surface temperature analysis (1880–2006). J. Clim. 21, 2283–2296. (doi:10.1175/2007jcli2100.1)
- 61. Gottschalk TK, Huettmann F, Ehlers M. 2005. Review article: Thirty years of analysing and modelling avian habitat relationships using satellite imagery data: a review. Int. J. Remote Sens. 26, 2631–2656. (doi:10.1080/01431160512331338041)
- 62. Duan Z, Bastiaanssen WGM. 2013. Estimating water volume variations in lakes and reservoirs from four operational satellite altimetry databases and satellite imagery data. Remote Sens. Environ. 134, 403–416. (doi:10.1016/j.rse.2013.03.010)
- 63. Burke M, Driscoll A, Lobell DB, Ermon S. 2021. Using satellite imagery to understand and promote sustainable development. Science 371, e8628. (doi:10.1126/science.abe8628)
- 64. Wilkinson MD, et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018. (doi:10.1038/sdata.2016.18)
- 65. Duval E. 2001. Metadata Standards: What, Who & Why. J. Univers. Comput. Sci. 7, 591–601.
- 66. Millerand F, Bowker GC. 2009. Metadata standards. Trajectories and enactment in the life of an ontology. In Standards and their stories: how quantifying, classifying, and formalizing practices shape everyday life (eds Lampland M, Leigh Star S), pp. 149–165. Ithaca, NY: Cornell University Press.
- 67. Whitlock MC. 2011. Data archiving in ecology and evolution: best practices. Trends Ecol. Evol. 26, 61–65. (doi:10.1016/j.tree.2010.11.006)
- 68. Lisowska B. 2016. Metadata for the open data portals. Development Initiatives. See http://www.devinit.org/wp-content/uploads/2018/01/Metadata-for-open-data-portals.pdf (accessed 16 December 2023).
- 69. Shaw F, et al. 2020. COPO: a metadata platform for brokering FAIR data in the life sciences. F1000Research 9, 495. (doi:10.12688/f1000research.23889.1)
- 70. Sinaci AA, Núñez-Benjumea FJ, Gencturk M, Jauer ML, Deserno T, Chronaki C, Cangioli G, et al. 2020. From raw data to FAIR data: the FAIRification workflow for health research. Methods Inf. Med. 59, e21–e32. (doi:10.1055/s-0040-1713684)
- 71. Wren JD. 2004. 404 not found: the stability and persistence of URLs published in MEDLINE. Bioinformatics 20, 668–672. (doi:10.1093/bioinformatics/btg465)
- 72. Howell S, Burtis A. 2022. The continued problem of URL decay: an updated analysis of health care management journal citations. J. Med. Libr. Assoc. 110, 463–470. (doi:10.5195/jmla.2022.1456)
- 73. Simmonds J, MacLennan DN. 2005. Fisheries acoustics: theory and practice. Padstow, UK: Blackwell Publishing. See https://books.google.com/books?hl=en&lr=&id=ktUOvnfzB-QC&oi=fnd&pg=PR5&dq=+Fisheries+acoustics:+theory+and+practice&ots=Kkxtqx0PJg&sig=ymzY20HhJT9ggxqUZlvvEfe1gRs.
- 74. Perrot Y, Brehmer P, Habasque J, Roudaut G, Behagle N, Sarré A, Lebourges-Dhaussy A. 2018. Matecho: an open-source tool for processing fisheries acoustics data. Acoust. Aust. 46, 241–248. (doi:10.1007/s40857-018-0135-x)
- 75. Fleischer G. 2005. The 2003 integrated acoustic and trawl survey of Pacific hake, Merluccius productus, in US and Canadian waters off the Pacific Coast. See https://repository.library.noaa.gov/view/noaa/3418 (accessed 16 December 2023).
- 76. Phillips EM, Chu D, Gauthier S, Parker-Stetter SL, Shelton AO, Thomas RE. 2022. Spatiotemporal variability of euphausiids in the California Current Ecosystem: insights from a recently developed time series. ICES Journal of Marine Science 79, 1312–1326. (doi:10.1093/icesjms/fsac055)
- 77. Drew BT, et al. 2013. Lost branches on the tree of life. PLoS Biol. 11, e1001636. (doi:10.1371/journal.pbio.1001636)
- 78. Poldrack RA, Gorgolewski KJ. 2014. Making big data open: data sharing in neuroimaging. Nat. Neurosci. 17, 1510–1517. (doi:10.1038/nn.3818)
- 79. Stephens ZD, et al. 2015. Big data: astronomical or genomical? PLoS Biol. 13, e1002195. (doi:10.1371/journal.pbio.1002195)
- 80. Acharjya DP, Ahmed K. 2016. A survey on big data analytics: challenges, open research issues and tools. Int. J. Adv. Comput. Sci. Appl. 7, 511–518. (doi:10.14569/IJACSA.2016.070267)
- 81. Farley SS, Dawson A, Goring SJ, Williams JW. 2018. Situating ecology as a big-data science: current advances, challenges, and solutions. Bioscience 68, 563–576. (doi:10.1093/biosci/biy068)
- 82. Eggleton F, Winfield K. 2020. Open data challenges in climate science. Data Science Journal 19. (doi:10.5334/dsj-2020-052)
- 83. Simmonds MB, et al. 2022. Guidelines for publicly archiving terrestrial model data to enhance usability, intercomparison, and synthesis. Data Science Journal 21, 3. (doi:10.5334/dsj-2022-003)
- 84. Spinellis D, Gousios G, Karakoidas V, Louridas P, Adams PJ, Samoladas I, Stamelos I. 2009. Evaluating the quality of open source software. Electron. Notes Theor. Comput. Sci. 233, 5–28. (doi:10.1016/j.entcs.2009.02.058)
- 85. Bavota G, Russo B. 2015. Four eyes are better than two: on the impact of code reviews on software quality. In 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME), Bremen, Germany, pp. 81–90. (doi:10.1109/ICSM.2015.7332454)
- 86. Trisovic A, Lau MK, Pasquier T, Crosas M. 2022. A large-scale study on research code quality and execution. Sci. Data 9, 60. (doi:10.1038/s41597-022-01143-6)
- 87. Sanner MF. 1999. Python: a programming language for software integration and development. J. Mol. Graph. Model. 17, 57–61.
- 88. Wickham H. 2011. ggplot2. Wiley Interdiscip. Rev. Comput. Stat. 3, 180–185. (doi:10.1002/wics.147)
- 89. Dabbish L, Stuart C, Tsay J, Herbsleb J. 2012. Social coding in GitHub. In CSCW ’12, Seattle, Washington, USA, pp. 1277–1286. New York, NY, USA: ACM. (doi:10.1145/2145204.2145396) See https://dl.acm.org/doi/proceedings/10.1145/2145204.
- 90. RStudio Team. 2015. RStudio: integrated development for R. Boston, MA: RStudio, Inc.
- 91. R Core Team. 2017. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
- 92. Rogers A. 1995. Population forecasting: do simple models outperform complex models? Math. Popul. Stud. 5, 187–202. (doi:10.1080/08898489509525401)
- 93. Adams LA, George J, Bugianesi E, Rossi E, De Boer WB, van der Poorten D, Ching HLI, Bulsara M, Jeffrey GP. 2011. Complex non-invasive fibrosis models are more accurate than simple models in non-alcoholic fatty liver disease. J. Gastroenterol. Hepatol. 26, 1536–1543. (doi:10.1111/j.1440-1746.2011.06774.x)
- 94. Priesmann J, Nolting L, Praktiknjo A. 2019. Are complex energy system models more accurate? An intra-model comparison of power system optimization models. Appl. Energy 255, 113783. (doi:10.1016/j.apenergy.2019.113783)
- 95. Geary WL, Bode M, Doherty TS, Fulton EA, Nimmo DG, Tulloch AIT, Tulloch VJD, Ritchie EG. 2020. A guide to ecosystem models and their environmental applications. Nat. Ecol. Evol. 4, 1459–1471. (doi:10.1038/s41559-020-01298-8)
- 96. Grassle F. 2000. The Ocean Biogeographic Information System (OBIS): an on-line, worldwide atlas for accessing, modeling and mapping marine biological data in a multidimensional geographic context. Oceanography 13, 5–7. (doi:10.5670/oceanog.2000.01)
- 97. Buitenhuis ET, et al. 2013. MAREDAT: towards a world atlas of MARine Ecosystem DATa. Earth Syst. Sci. Data 5, 227–239. (doi:10.5194/essd-5-227-2013)
- 98. Froese R, Pauly D. 2022. FishBase version (02/2022). See https://fishbase.se/search.php.
- 99. Christensen V, et al. 2009. Database-driven models of the world’s large marine ecosystems. Ecol. Modell. 220, 1984–1996. (doi:10.1016/j.ecolmodel.2009.04.041)
- 100. Gomes DGE, et al. 2024. An updated end-to-end ecosystem model of the Northern California Current reflecting ecosystem changes due to recent marine heatwaves. PLoS One 19, e0280366. (doi:10.1371/journal.pone.0280366)
- 101. Wetzel CR, Johnson KF, Hicks AC. 2021. nwfscSurvey: Northwest Fisheries Science Center Survey. R package version 2.0. See https://rdrr.io/github/nwfsc-assess/nwfscSurvey/.
- 102. Ryan PD. 2019. Foundations for Evidence-Based Policymaking Act of 2018. 115–435. See https://www.congress.gov/bill/115th-congress/house-bill/4174.
- 103. OSTP. 2022. Ensuring free, immediate, and equitable access to federally funded research. Office of Science and Technology Policy. See https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-access-Memo.pdf.
- 104. Buxton RT, et al. 2021. Avoiding wasted research resources in conservation science. Conserv. Sci. Pract. 3, e329. (doi:10.1111/csp2.329)
- 105. Sutherland W. 2022. Transforming conservation: a practical guide to evidence and decision making. Cambridge, UK: Open Book Publishers. (doi:10.11647/OBP.0173.0210)
- 106. Krawczyk M, Reuben E. 2012. (Un)Available upon request: field experiment on researchers’ willingness to share supplementary materials. Account. Res. 19, 175–186. (doi:10.1080/08989621.2012.678688)
- 107. Langille MGI, Ravel J, Fricke WF. 2018. “Available upon request”: not good enough for microbiome data! Microbiome 6, 8. (doi:10.1186/s40168-017-0394-z)
- 108. Tedersoo L, et al. 2021. Data sharing practices and data availability upon request differ across scientific disciplines. Sci. Data 8, 192. (doi:10.1038/s41597-021-00981-0)
- 109. Hussey I. 2023. Data is not available upon request. PsyArXiv. See https://psyarxiv.com/jbu9r/download?format=pdf.
- 110. Bledsoe EK, et al. 2022. Data rescue: saving environmental data from extinction. Proc. R. Soc. B 289, 20220938. (doi:10.1098/rspb.2022.0938)
- 111. Hardwicke TE, Ioannidis JPA. 2018. Populating the data ark: an attempt to retrieve, preserve, and liberate data from the most highly-cited psychology and psychiatry articles. PLoS One 13, e0201856. (doi:10.1371/journal.pone.0201856)
- 112. Wicherts JM, Borsboom D, Kats J, Molenaar D. 2006. The poor availability of psychological research data for reanalysis. Am. Psychol. 61, 726–728. (doi:10.1037/0003-066X.61.7.726)
- 113. Vines TH, et al. 2013. Mandated data archiving greatly improves access to research data. FASEB J. 27, 1304–1308. (doi:10.1096/fj.12-218164)
- 114. Vanpaemel W, Vermorgen M, Deriemaecker L, Storms G. 2015. Are we wasting a good crisis? The availability of psychological research data after the storm. Collabra 1, 3. (doi:10.1525/collabra.13)
- 115. O’Dea RE, et al. 2021. Towards open, reliable, and transparent ecology and evolutionary biology. BMC Biol. 19, 68. (doi:10.1186/s12915-021-01006-3)
- 116. USGS. 2023. Fundamental Science Practices (FSP) guide to data releases with or without a companion publication. See https://www.usgs.gov/office-of-science-quality-and-integrity/fundamental-science-practices-fsp-guide-data-releases-or.
- 117. Roche DG, Berberi I, Dhane F, Lauzon F, Soeharjono S, Dakin R, Binning SA. 2022. Slow improvement to the archiving quality of open datasets shared by researchers in ecology and evolution. Proc. R. Soc. B 289, 20212780. (doi:10.1098/rspb.2021.2780)
- 118. Roche D. 2016. Open data: policies need policing. Nature 538, 41–41. (doi:10.1038/538041c)
- 119. National Science Foundation. 2023. NSF Public Access Plan 2.0: ensuring open, immediate and equitable access to National Science Foundation funded research. NSF 23-104. National Science Foundation. See https://www.nsf.gov/pubs/2023/nsf23104/nsf23104.pdf (accessed 26 November 2024).
- 120. Thrall PH, Chase J, Drake J, Espuno N, Hello S, Ezenwa V, Han B, Mori A, Muller-Landau H. 2023. From raw data to publication: introducing data editing at Ecology Letters. Ecol. Lett. 26, 829–830. (doi:10.1111/ele.14210)
- 121. Tse EG, Klug DM, Todd MH. 2020. Open science approaches to COVID-19. F1000Res. 9, 1043. (doi:10.12688/f1000research.26084.1)
- 122. Fredston AL, Lowndes JSS. 2024. Welcoming more participation in open data science for the oceans. Annu. Rev. Mar. Sci. 16, 537–549. (doi:10.1146/annurev-marine-041723-094741)
- 123. Neumann W, Martinuzzi S, Estes AB, Pidgeon AM, Dettki H, Ericsson G, Radeloff VC. 2015. Opportunities for the application of advanced remotely-sensed data in ecological studies of terrestrial animal movement. Mov. Ecol. 3, 8. (doi:10.1186/s40462-015-0036-7)
- 124. Kays R, et al. 2022. The Movebank system for studying global animal movement and demography. Methods Ecol. Evol. 13, 419–431. (doi:10.1111/2041-210x.13767)
- 125. Hickling R, Roy DB, Hill JK, Fox R, Thomas CD. 2006. The distributions of a wide range of taxonomic groups are expanding polewards. Glob. Chang. Biol. 12, 450–455. (doi:10.1111/j.1365-2486.2006.01116.x)
- 126. Chen IC, Hill JK, Ohlemüller R, Roy DB, Thomas CD. 2011. Rapid range shifts of species associated with high levels of climate warming. Science 333, 1024–1026. (doi:10.1126/science.1206432)
- 127. Kortsch S, Primicerio R, Fossheim M, Dolgov AV, Aschan M. 2015. Climate change alters the structure of Arctic marine food webs due to poleward shifts of boreal generalists. Proc. R. Soc. B 282, 20151546. (doi:10.1098/rspb.2015.1546)
- 128. Engelhard GH, Righton DA, Pinnegar JK. 2014. Climate change and fishing: a century of shifting distribution in North Sea cod. Glob. Chang. Biol. 20, 2473–2483. (doi:10.1111/gcb.12513)
- 129. Monirul Islam Md, Sallu S, Hubacek K, Paavola J. 2014. Limits and barriers to adaptation to climate variability and change in Bangladeshi coastal fishing communities. Mar. Policy 43, 208–216. (doi:10.1016/j.marpol.2013.06.007)
- 130. Geromont HF, Butterworth DS. 2015. Generic management procedures for data-poor fisheries: forecasting with few data. ICES J. Mar. Sci. 72, 251–261. (doi:10.1093/icesjms/fst232)
- 131. Orofino S, McDonald G, Mayorga J, Costello C, Bradley D. 2023. Opportunities and challenges for improving fisheries management through greater transparency in vessel tracking. ICES J. Mar. Sci. 80, 675–689. (doi:10.1093/icesjms/fsad008)
- 132. Yakel E. 2007. Digital curation. OCLC Systems & Services 23, 335–340. (doi:10.1108/10650750710831466)
- 133. Voss A, Lvov I, Thomson SD. 2016. Data storage, curation and preservation. In The sage handbook of social media research methods, pp. 161–176. London, UK: SAGE Publications Ltd. (doi:10.4135/9781473983847.n11)
- 134. Wilson A, Downs RR, Lenhardt WC, Meyer C, Michener W, Ramapriyan H, Robinson E. 2014. Realizing the value of a national asset: scientific data. Eos Trans. Am. Geophys. Union 95, 477–478. (doi:10.1002/2014eo500006)
- 135. Clarke P. 2020. Research data – a rising national priority. Ireland’s Educ. Yearb. See https://irelandseducationyearbook.ie/downloads/IEYB2020/YB2020-Research-3.pdf.
- 136. Grahe JE, Cuccolo K, Leighton DC, Cramblet Alvarez LD. 2020. Open science promotes diverse, just, and sustainable research and educational outcomes. Psychology Learning & Teaching 19, 5–20. (doi:10.1177/1475725719869164)
- 137. McKenney EA, Gates TA, Goller CC, Tully D, Leggett Z, Lupek M, Cross W, Krieg C. 2024. Strategies to empower students through open pedagogy and citizen science. Int. J. Open Educ. Resour. 5. (doi:10.18278/ijoer.5.1.5)
- 138. Page S. 2008. The difference: how the power of diversity creates better groups, firms, schools, and societies - new edition. Princeton, NJ: Princeton University Press. See https://www.degruyter.com/document/doi/10.1515/9781400830282/html.
- 139. Yang Y, Tian TY, Woodruff TK, Jones BF, Uzzi B. 2022. Gender-diverse teams produce more novel and higher-impact scientific ideas. Proc. Natl Acad. Sci. USA 119, e2200841119. (doi:10.1073/pnas.2200841119)
- 140. Nasser M, Clarke M, Chalmers I, Brurberg KG, Nykvist H, Lund H, Glasziou P. 2017. What are funders doing to minimise waste in research? Lancet 389, 1006–1007. (doi:10.1016/S0140-6736(17)30657-8)
- 141. Chan AW, Song F, Vickers A, Jefferson T, Dickersin K, Gøtzsche PC, Krumholz HM, Ghersi D, van der Worp HB. 2014. Increasing value and reducing waste: addressing inaccessible research. Lancet 383, 257–266. (doi:10.1016/s0140-6736(13)62296-5)
- 142. Al-Shahi Salman R, Beller E, Kagan J, Hemminki E, Phillips RS, Savulescu J, Macleod M, Wisely J, Chalmers I. 2014. Increasing value and reducing waste in biomedical research regulation and management. Lancet 383, 176–185. (doi:10.1016/S0140-6736(13)62297-7)
- 143. Glasziou PP, Sanders S, Hoffmann T. 2020. Waste in COVID-19 research. BMJ m1847. (doi:10.1136/bmj.m1847)
- 144. Purgar M, Klanjscek T, Culina A. 2022. Quantifying research waste in ecology. Nat. Ecol. Evol. 6, 1390–1397. (doi:10.1038/s41559-022-01820-0)
- 145. Press WH. 2013. Presidential address. What’s so special about science (and how much should we spend on it?). Science 342, 817–822. (doi:10.1126/science.342.6160.817)
- 146. Wallace N. 2020. European research budget gets unexpected €4 billion boost. Science (doi:10.1126/science.abf6726)
- 147. Culina A, van den Berg I, Evans S, Sánchez-Tójar A. 2020. Low availability of code in ecology: a call for urgent action. PLoS Biol. 18, e3000763. (doi:10.1371/journal.pbio.3000763)
- 148. Crop Trust. 2023. Svalbard Global Seed Vault. See https://www.croptrust.org/work/svalbard-global-seed-vault/ (accessed 24 August 2023).
- 149. Sedberry G, Fautin D, Feldman M, Fornwall M, Goldstein P, Guralnick R. 2011. OBIS-USA: a data-sharing legacy of the census of marine life. Oceanography 24, 166–173. (doi:10.5670/oceanog.2011.36)
- 150. Whitmore A. 2012. Extracting knowledge from U.S. Department of Defense freedom of information act requests with social media. Gov. Inf. Q. 29, 151–157. (doi:10.1016/j.giq.2011.08.015)
- 151. Umbrich J, Neumaier S, Polleres A. 2015. Quality assessment and evolution of open data portals. In 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud), Rome, Italy, pp. 404–411. (doi:10.1109/FiCloud.2015.82)
- 152. Řezník T, Raes L, Stott A, De Lathouwer B, Perego A, Charvát K, Kafka Š. 2022. Improving the documentation and findability of data services and repositories: a review of (meta)data management approaches. Comput. Geosci. 169, 105194. (doi:10.1016/j.cageo.2022.105194)
- 153. Lawrenson R, Williams T, Farmer R. 1999. Clinical information for research; the use of general practice databases. J. Public Health 21, 299–304. (doi:10.1093/pubmed/21.3.299)
- 154. Zhang J, Dawes SS, Sarkis J. 2005. Exploring stakeholders’ expectations of the benefits and barriers of e-government knowledge sharing. Journal of Enterprise Information Management 18, 548–567. (doi:10.1108/17410390510624007)
- 155. Williams AJ, Ekins S. 2011. A quality alert and call for improved curation of public chemistry databases. Drug Discov. Today 16, 747–750. (doi:10.1016/j.drudis.2011.07.007)
- 156. Janssen M, Charalabidis Y, Zuiderwijk A. 2012. Benefits, adoption barriers and myths of open data and open government. Inf. Syst. Manag. 29, 258–268. (doi:10.1080/10580530.2012.716740)
- 157. Lafuente B, Downs RT, Yang H, Stone N. 2015. The power of databases: The RRUFF project. In Highlights in mineralogical crystallography, pp. 1–30. Berlin, Germany: De Gruyter. (doi:10.1515/9783110417104-003)
- 158. Koetzle TF. 1989. Benefits of databases. Nature 342, 114–114. (doi:10.1038/342114b0)
- 159. Dhar P. 2023. Octopus and ResearchEquals aim to break the publishing mould. Nature (doi:10.1038/d41586-023-00861-0)
- 160. Silberzahn R, Uhlmann EL, Martin DP, Anselmi P, Aust F, Awtrey E, Bahník Š. 2018. Corrigendum: many analysts, one data set: making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science 1, 580–580. (doi:10.1177/2515245918810511)
- 161. Gould E, et al. 2023. Same data, different analysts: variation in effect sizes due to analytical decisions in ecology and evolutionary biology. EcoEvoRxiv. (doi:10.32942/X2GG62)
- 162. Towse JN, Ellis DA, Towse AS. 2021. Opening Pandora’s Box: peeking inside psychology’s data sharing practices, and seven recommendations for change. Behav. Res. Methods 53, 1455–1468. (doi:10.3758/s13428-020-01486-1)
- 163. Vines TH, et al. 2014. The availability of research data declines rapidly with article age. Curr. Biol. 24, 94–97. (doi:10.1016/j.cub.2013.11.014)
- 164. Katsanevakis S, et al. 2020. Twelve recommendations for advancing marine conservation in European and contiguous seas. Front. Mar. Sci. 7. (doi:10.3389/fmars.2020.565968)
- 165. Fulton EA, et al. 2011. Lessons in modelling and management of marine ecosystems: the Atlantis experience. Fish Fish. 12, 171–188. (doi:10.1111/j.1467-2979.2011.00412.x)
- 166. Karp MA, et al. 2023. Increasing the uptake of multispecies models in fisheries management. ICES J. Mar. Sci. 80, 243–257. (doi:10.1093/icesjms/fsad001)
- 167. Ahmed A. 2023. Nature Human Behaviour 7, 1021–1026. London, UK: Nature Publishing Group. (doi:10.1038/s41562-023-01637-2)
Data Availability Statement
This article has no additional data.
