2

Several academic paper hosts, such as PubMed and Cochrane, give some machine-readable access to their data through APIs but do not provide the data sets directly. Is there any reason why they don't provide dumps of the data sets, aside from saving bandwidth and avoiding the need to schedule regular dumps (e.g., with cron)?

Franck Dernoncourt
  • Are saving bandwidth and avoiding scheduled dumps the reasons they gave you when you asked them, or just your own speculation? If the latter, what did they say when you asked them? – 410 gone Dec 14 '15 at 03:04
  • @EnergyNumbers Speculation. I emailed Cochrane (Deborah Pentesco-Gilbert dpentesc@wiley.com) 20 days ago and haven't received any reply so far. I haven't tried to email PubMed yet (custserv@nlm.nih.gov). – Franck Dernoncourt Dec 14 '15 at 03:08

1 Answer

2

One obvious reason is that providing an API instead of a dump lets them see how people are using the data. It's also quite possible that the data contains personally identifiable information (PII) that is necessary for aggregation but shouldn't be exposed, for subject privacy reasons. An API lets them still do the aggregation while masking the sensitive underlying data.
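To illustrate the aggregation-with-masking idea, here is a minimal sketch in Python. It is not how PubMed or Cochrane actually work; the records, the field names (`patient_id`, `condition`), the function `count_by_condition`, and the `MIN_CELL_SIZE` threshold are all hypothetical. It only shows how an API layer could answer aggregate queries without ever returning the underlying rows.

```python
from collections import Counter

# Hypothetical record-level data that the host keeps private.
# In a raw dump, every row (including patient_id) would be exposed.
RECORDS = [
    {"patient_id": "p1", "condition": "asthma"},
    {"patient_id": "p2", "condition": "asthma"},
    {"patient_id": "p3", "condition": "asthma"},
    {"patient_id": "p4", "condition": "diabetes"},
    {"patient_id": "p5", "condition": "diabetes"},
    {"patient_id": "p6", "condition": "diabetes"},
    {"patient_id": "p7", "condition": "diabetes"},
    {"patient_id": "p8", "condition": "rare_disease"},
]

# Assumed threshold: groups smaller than this are suppressed, because a
# very small count could point to an identifiable individual.
MIN_CELL_SIZE = 3


def count_by_condition(records, min_cell_size=MIN_CELL_SIZE):
    """Return per-condition counts, omitting any group below min_cell_size.

    The identifiers are used only to build the counts; they never appear
    in the returned value.
    """
    counts = Counter(record["condition"] for record in records)
    return {cond: n for cond, n in counts.items() if n >= min_cell_size}


# What a hypothetical API endpoint would send back to the caller.
result = count_by_condition(RECORDS)
print(result)
```

The caller gets counts for the larger groups only: the single `rare_disease` record is dropped and no `patient_id` leaves the server. A full dump could not enforce either restriction once it had been downloaded.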

dwoz