
There is a large database maintained by a group of scientists. This group has made a considerable effort to get many other scientists to contribute to the database, and has made it available to other researchers on the condition that, for access to some of the data, the data contributors may request authorship. This project has been very successful and many publications have resulted from this database.

Recently, some authors used the entire database in an analysis and granted authorship to the respective data contributors who requested it. However, this analysis is published in a journal with a specific data sharing policy that reads:

"(Journal X) requires authors to ensure that data and materials integral to the paper are available to readers in a form which allows for verification and replication of the results in the paper. Where feasible, data should be included as part of the article or as supporting information, however if this is not possible, we expect authors to make use of public data repositories and include the appropriate links and identifiers within the article. It is the strict requirement of the journal that authors will agree to make their data and materials available to readers upon reasonable request, and corresponding authors will be reminded of this at acceptance stage. Please note that this policy also applies to any custom software described in the paper."

To me, this means that if I request this data, then the original authors can no longer require me to include them as authors in any subsequent analysis I generate and choose to publish. However, it only says they have to share the data with me. It says nothing about how I am allowed to use said shared data.

Question: if I request this data, am I somehow obligated to offer authorship to the original data generators?

user234105
  • Have you asked the editors of Journal X what they think about this situation? They might feel that the conditions imposed by the people who gathered the data are still consistent with their journal's rules. – Brian Borchers Jun 19 '16 at 22:56
  • AFAIK "making data and material available" does not mean that they cannot simply point you to the freely available database. In other words, the authors of those papers could simply give you the URL of the database and be done. – Bakuriu Jun 20 '16 at 09:01
  • @Bakuriu it's not freely available at a URL. A whole data request process is required, and you must agree to grant co-authorship to use certain subsets of the data, based on the preferences of the contributor of each data subset. – user234105 Jun 20 '16 at 11:44
  • "Obligated" is a tricky word. ① Will people think badly of you if you violate their expectations? ② Would you violate legal copyrights? ③ Will a journal publish your paper? – Christian Jun 20 '16 at 11:52
  • @user234105 So? The journal you mention doesn't state that you cannot impose such restrictions. Note that the policy says "reasonable request", which might well be interpreted as: I give you the data for free, but if you use it I want to be co-author. They do not say "any request" for data. Also, if the data is too big (as you mention: the whole database), the simple request for TBs of data may not be considered reasonable, and they might just reply: that's too much data for us to send you. – Bakuriu Jun 20 '16 at 12:34
  • @user234105 Note that there have been legal verdicts about similar situations. E.g. in the EU you have the right to ask for the deletion of your account and related personal information in any service; however, I know that there have been cases where such deletion would have required an unreasonable amount of work from the service due to how it was built, and people have failed in suing companies in such situations. – Bakuriu Jun 20 '16 at 12:37
  • @Bakuriu This seems like a conflict of interest. Say my goal was to replicate their analysis, and I could not, so I publish a critical response to the paper. I am expected to have the people who I am criticizing included as co-authors? – user234105 Jun 20 '16 at 13:44
  • @user234105 in that case they do not have any interest in claiming co-authorship. Keep in mind that they do not get automatic co-authorship, they only get the right to ask to be co-author. So if you are refuting their article it's not in their interest to have their name in it. Otherwise they'd have to retract their previous paper... – Bakuriu Jun 20 '16 at 15:19
  • What have other authors done in this journal? Can you find authors that have "clearly" requested co-authorship for data sharing and evidently gotten it? – Bill Barth Jun 20 '16 at 15:22
  • @BillBarth yes, for sure. It's conditional on them releasing the data. It's a formal part of requesting data from the database. – user234105 Jun 20 '16 at 16:50
  • @Bakuriu or it's in their interest to accept co-authorship and then refuse to agree to publication, in order to tank the reply, unless revisions completely mitigate the criticism. – user234105 Jun 20 '16 at 16:51
  • If I asked for data, I don't think I'd accept a request for co-authorship as reasonable, and then I'd have a conversation with the editor. – Bill Barth Jun 20 '16 at 17:46

2 Answers

19

I'm not a fan of "mandatory authorship" on published information: I feel that once something is published, one should collect one's rewards by means of citation rather than by strong-arming people into giving you an authorship. Mandatory authorship on already-published data feels to me too much like a form of salami-slicing on a dataset.

That said, the journal's policy does not appear to say anything at all regarding authorship. Therefore, the default position would seem to be that the authors must share the data with you (as mandated by the journal), but are also free to require authorship as "payment" for sharing.

The journal, however, may feel that this goes against the spirit of their data sharing agreement, and if so, then you may be able to obtain the data without being coerced into authorship. I would thus recommend, as @BrianBorchers notes in the comments, that you write to the chief editor(s) and ask for a ruling.

jakebeal
  • I think this is the best approach. A related question: say I am publishing a critical response to this paper, based on a re-analysis of the original data. Am I obligated to include the original authors since they are the "data-owners"? This seems like a conflict of interest, since the intent of the paper would be to criticize those very authors. Those authors would therefore have an incentive to tank the paper, or withhold the data. – user234105 Jun 20 '16 at 13:46
13

Your argument here strikes me as incredibly flimsy. As you point out, the data is in fact available, just with the restriction that it cannot be used in further papers without offering coauthorship, and that restriction in no way contradicts the journal's stated policies.

It's not clear whether it violates the intent of the policies. One argument that it doesn't is that the journal specifically says this data must be available "in a form which allows for verification and replication of the results in the paper", which is different from allowing use in other work. Furthermore, the policy says "reasonable request", which suggests that some requests could be considered unreasonable. It's not 100% clear what the policy's authors had in mind (presumably requests for materials could more easily be considered unreasonable), but they certainly didn't say "everyone is entitled to the data and can do whatever they want with it, no questions asked".

So it seems to me that the written policy offers no support for your position.

I do not think it's fruitful to ask the journal editors for permission. Even if they declare that these restrictions are not what they had in mind, they have no authority to impose this interpretation retroactively.

If you try to use the data without offering coauthorship, there's a real risk that the data generators could file a misconduct complaint against you with your university, a relevant professional society, or the journal you end up publishing in. If I had to adjudicate such a complaint, I expect I would decide in their favor.

This is not to say that they are behaving reasonably, and I agree with jakebeal that this is a questionable practice. However, you haven't found a loophole that justifies ignoring their request.

Anonymous Mathematician
  • Also: there is a difference between sharing 1KB of data and 2TB of database. The former is probably a reasonable amount to make available, the latter not, because the simple act of sharing it may be non-trivial or costly for the authors. – Bakuriu Jun 20 '16 at 12:40
  • @Bakuriu the journal statement says, "Where feasible, data should be included as part of the article or as supporting information, however if this is not possible, we expect authors to make use of public data repositories and include the appropriate links and identifiers within the article." Even if it's 2TB, the burden is on the authors to make it public, either in the MS or online. – user234105 Jun 21 '16 at 18:16
  • @user234105 Well, AFAICT the database you described in your question fits that description exactly. "Public data repository" doesn't imply that the data is freely available to anyone without any condition; it simply means that it is not private/copyrighted. – Bakuriu Jun 21 '16 at 19:24