Table 2. Themes derived and illustrative quotes.
Theme and sub-theme | Illustrative quotes | Reference |
---|---|---|
Theme: Data Integrity | ||
Data quality | There are definitely different comfort levels for people. Some people will forever be confined to studying their own system because they are unable to accept any degree of, you know, sort of taking other people’s word—sort of dealing with data that they didn’t actually see collected themselves. | Zimmerman 2003 [76] |
| | What had been reported, what had been presented and discussed were, kinda, the best view of the data. [I]n reality, the data did have some problems that weren’t apparent until you got deeply inside and started looking. | Yoon 2017 [74] |
Data documentation | …a lot of the contextual data that you need is not provided. | Faniel 2013 [43] |
| | You can tell from the documentation whether or not a research[er] was thorough and careful. | Yoon 2017 [75] |
| | It’s so easy to generate this digital data, but if you’re not careful how you name things and how you document stuff and making sense of it later, particularly for someone else, is going to be a real challenge. | Yatcilla 2017 [73] |
What is worth sharing | Am I worried it won’t be there in 20 years? No. Am I worried it won’t be there in 100? It doesn’t matter. By that point, data become irrelevant except as historical curiosity. | Marcus 2007 [56] |
| | Biospecimens are very valuable because they were collected before the disease, so they’re good for looking at developing disease…I think it could be used for many years. | Read 2015 [64] |
Theme: Responsible Conduct of Research | ||
Misuse of data | …my main concern is I don’t want people to misuse it … and if I don’t have some relationship of trust then I don’t know whether they’re going to, you know, just go off and do something and never check with me to see, well, was this a good interpretation. | Cragin 2010 [38] |
| | …a whole cadre of people whose only job is pilfering other people’s stuff, or parasitically using it. | Hunt 2018 [49] |
Work culture | I completed an NSF grant in December and… you have to have now a section that describes what you are going to do with your data…Data availability and where you’re going to archive it… So you’re being forced to deal with it now whereas in the past you’re like, ‘Well it’s in my file cabinet. | Frank 2015 [45] |
| | I think perhaps it’s just tradition or it’s a thing of the past where people have held their data somewhat closely… | Ochs 2017 [61] |
Protecting one’s own work / Intellectual property | We all collect samples together in the field, but when you come back to process the samples, people want the data without any understanding or agreement about ownership. | Marcus 2007 [56] |
| | But it’s also the notion of intellectual property, isn’t it? … How are we going to know if other people are picking it up and using it elsewhere, unless they’re being absolutely… | Broom 2009 [34] |
Control of data | If someone were to use the data would be good to know, what did they do with it, some form of communication… | Johri 2016 [51] |
| | You would have to describe your intended use of the data. And then the people who originally were the researchers who gathered that data, would all have to agree to consent to each application. And so they still retain the control of the data. | Finn 2014 [44] |
Privacy/Confidentiality/Ethics | If the systems are such that they can get into our data, we might need to think for the first time about being a little bit more circumspect and think about what qualifications we would want to impose … I think there would probably be a lot of regulatory compliance pieces we might want to spell out more than we do now. | Manion 2009 [55] |
| | …we can never actually, never guarantee confidentiality of all data, because it could be hacked into and we can’t anymore say that your data will be anonymous because that is nonsense too, because we are able to bring in so many different kinds of data, … that the potential for people to be re-identified or distinguished in the data are quite high… | Finn 2014 [44] |
Theme: Feasibility of Sharing Data | ||
Infrastructure | I do think that from an institutional level there should be a governing body to provide guidance and to enforce policy, and to make policy for all the systems that will interact and handle activity with other institutions. As far as what functions they would dictate, [they] would be all around the authorization, authentication, and accounting of access to that data. | Manion 2009 [55] |
| | It’s very easy to see how having a central, university wide, storage and dissemination system for data would be much more cost effective, and probably better executed, than anything we could do ourselves. | McLure 2014 [58] |
Time/work required | If there's someone in the institute who can [deposit data], instead of individual researchers, that would save lots of our time and [we could] be more productive… | Williams 2013 [72] |
| | To be quite honest, the biggest hurdle when you’re dealing with genetic data in like depositing … the information and the sequence data onto GenBank is associating that with museum specimens or locality data ….It’s really kind of clunky and it really takes a lot of time to do that. | Frank 2015 [45] |
Skills | We are not thinking too much about data management. We are thinking more about the approach and methodology… | Diekmann 2012 [41] |
| | They are resistant to having to learn how to use new tools that make open data and reproducibility easier. They generally kind of just have their process and they feel like they're tested already in terms of their time and their commitment and they don’t really want to add this to the list of things that they have to worry about. | Noorman 2014 [60] |
Theme: Value of Sharing Data | ||
Promote future discovery | …there is no sense in collecting data if it can't be used [by other researchers]. | Lage 2011 [54] |
| | We truly believe that sharing data is the right thing to do, simply because the original data we used for this study was not ours. Our study was only possible because other astronomers made their data publicly available in the first place! | Pepe 2014 [63] |
Researcher perspective | To incentivize data sharing there should be follow-on grants on data analysis and dissemination grant to bring other researchers on board. If NSF changed their model for a year, there is a lot of data out there. I think there has to be some stipulation about who gets authorship when the data is used but I think funding to bring new people on board is essential. There can also be a solicitation focused on secondary analysis. | Johri 2016 [51] |
| | I think one barrier to data sharing is the merit review process within institutions for tenure and promotion; things such as ‘how many people accessed your dataset’ are not valued. | Johri 2016 [51] |