“Experimental data availability is a cornerstone for reproducibility in experimental fracture mechanics.” This is how the recently published technical note begins: “Long term availability of raw experimental data in experimental fracture mechanics”, by Patrick Diehl, Ilyass Tabiai, Felix W. Baumann, Daniel Therriault and Martin Levesque, Engineering Fracture Mechanics, 197 (2018) 21–26.
These five pages really deserve to be read and discussed. A theory may be interesting but is of little value until it has been proven by experiments. All the proof of a theory is in the experiment. And what is the point if there is no raw data for a quality check?
The authors cite another survey, which found that 70% of around 1500 researchers had failed to reproduce other scientists' experiments. Surprisingly, the same study finds that most scientists are nevertheless confident that peer-reviewed, published experiments are reproducible.
A few years back, many research councils around the world demanded open access to all publications emanating from research financed by them. Open access is fine, but it is much more important to allow examination of the data that is used. Publishers could make a difference by providing space for data from their authors. Those who do not want to disclose their data should be asked for an explanation.
The pragmatic result of the survey is that only 6% of the contacted authors will provide data, and even then you have to ask for it. That is a really disappointing result. Of the remainder, 22% had outdated e-mail addresses, 58% did not reply, and 14% replied but were not willing to share their data. The result would probably still be deeply depressing, though possibly a bit better, if I as a researcher only needed a single experiment and had a few authors to track down. It means more work than an e-mail, but on the other hand I would not have the 187 publications that Diehl et al. had. Through friends, former co-authors and some work, I think the chances would be good. The authors present some clever ideas about what could work better than e-mail addresses, which for many researchers are only temporary.
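As a rough back-of-the-envelope illustration (assuming, purely for the sake of the sketch, that the reported percentages are taken over the 187 publications; the note itself gives the exact basis), the percentages translate into approximate counts as follows:

```python
# Approximate counts implied by the survey percentages quoted above.
# Assumption (mine, for illustration only): the percentages are taken over
# the 187 publications mentioned in the note; see the paper for the exact basis.
total = 187
shares = {
    "provided data": 0.06,
    "outdated address": 0.22,
    "no reply": 0.58,
    "not willing to share": 0.14,
}

for outcome, fraction in shares.items():
    print(f"{outcome}: ~{round(fraction * total)} of {total}")

# The rounded percentages sum to 100%, i.e. only on the order of a dozen
# of the requests actually resulted in shared raw data.
```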
The authors of the technical note do not know what hindered the roughly 60% who did receive the request but did not reply. What could be the reason for not replying to a message in which a colleague asks about your willingness to share the raw experimental data of a published paper with others? If I present myself to a scientist as a colleague who plans to study his data, rather than someone studying his behaviour, then the chances that he answers should increase. I certainly hope so, and at least not the reverse, but who knows; life never ceases to surprise. It would be interesting to know what happens. If anyone would like to have a go, I am sure that the authors of the paper are willing to share the list of papers that they used.
Again, could there be any good reason for not sharing your raw data with your fellow creatures? What is your opinion? Anyone? The authors, perhaps?
Per Ståhle
Comments
Submitted by Ajit R. Jadhav on Thu, 2018-08-23 23:19.
Thanks for highlighting the issue.
The idea that raw data should be available seems quite fine by me, at least on the face of it, though let me hasten to add that, personally, I mostly work only in theory, and for that reason this is more or less a complete non-issue for me. Further, as a programmer, the closest thing to sharing data in my case is sharing the raw output of programs, though I would have strong objections if all parts of the algorithms themselves also had to be disclosed in order to publish a paper.
As to the latter, I was thinking of this hypothetical scenario. Suppose I invent a new algorithm for speeding up certain simulations. I want to sell that algorithm to some company. I want to get the best possible value for my effort (which is not necessarily the same as the most money in the immediate present). But the market is highly fragmented, and so I don’t want to go through the hassle of contacting every potential customer. So, a good avenue for me is to publish a paper about it. Clearly, here, I can share some data but not all, especially if the raw data itself would be enough for someone else to figure out at least the kind of algorithm I was using. Data can be a window into the algorithm, which I don’t want to open just yet. How does the proposal work out in this case?
The parallel of the programmer’s case to that of “hard” experimental research is obvious.
Thus, in some cases, I do anticipate that there could be IPR-related issues connected to the design of the experimental apparatus itself, or to algorithms. Disclosing even just the raw data could, in some cases, be tantamount to disclosing other data or ideas that in themselves have commercial value (present or future), or could have implications for confidentiality clauses with clients and/or for patents.
Overall, private organizations pursuing cutting-edge research may have good reasons to pursue a policy with both of these components: (i) not disclosing the raw data itself, and yet (ii) publishing some of their findings in summary form, so as to keep the interested public informed about the more distinct stages their research has reached. The twin policy results because, qua research, it needs to be published (say, to gain or retain credibility); qua private data, it anyway cannot be a property “owned” by “the public.”
Further, in any case, what is meant by raw data also needs to be discussed and clarified by the research community. No one would want a worthless explosion in the amount of data. … One sure way to hide “real” information is to bury it under tons of worthless data. You can at least buy some time that way! (To wit: media reports about the Right to Information act in India.)
With all that said, in general, however, I do find the idea that “grant providing organizations should ensure that experimental data by public funded projects is available to the public” very appealing. [Emphasis added]. … Poetic justice! 🙂
Best,
–Ajit
https://imechanica.org/node/22590