User:Chrisftaylor/Project documents/Dryad abroad


 * BMRB list of citations of published articles for entries
 * Citations of GEOROC

Update:

People currently indicate 'published' status in different ways in DataCite's metadata registry; e.g.,

Dryad style:

10.1016/J.YMPEV.2011.06.012

3TU style:

10.1016/j.marmicro.2009.03.003

And an actual 3TU entry picked at random...

http://resolver.tudelft.nl/uuid:d60a9711-aefc-44fa-b91e-1ab47930f2ec

And a Dryad entry with no DOI for a publication (it's from a proceedings)

http://data.datacite.org/10.5061/DRYAD.8C1P6

So I suppose the question is whether to make the most of the mess as it stands (and start some kind of effort in the background to standardize this -- ideally a boolean somewhere up top, with a regularized way to elaborate further down).
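Whatever the eventual standard looks like, comparing records today means coping with DOIs written in different cases (as in the two styles above) and, as in the records quoted elsewhere in these notes, with prefixes such as doi: or info:doi/. A minimal Python sketch of a normalizer follows. It relies only on the fact that DOI names are case-insensitive; the function name and the prefix list are assumptions made for illustration, not anything DataCite prescribes.

```python
# Hypothetical helper: reduce the DOI spellings seen in these notes to one form.
# The prefix list is an assumption; extend it if other spellings turn up.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "info:doi/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip a known prefix and lowercase the DOI (DOI names are case-insensitive)."""
    value = raw.strip()
    lowered = value.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    return value.lower()


dryad_style = normalize_doi("10.1016/J.YMPEV.2011.06.012")
tu_style = normalize_doi("10.1016/j.marmicro.2009.03.003")
```

With this, an upper-case and a lower-case spelling of the same DOI compare equal, so differences in case alone do not show up as differences between records.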

Update:

http://search.datacite.org/ui?&q=relatedIdentifier%3Aissupplementto\%3ADOI\%3A* -> 7511 hits (with DOIs)

http://search.datacite.org/ui?&q=relatedIdentifier%3Aissupplementto\%3A* -> 8136 hits (with or without)

There are typos in DataCite, which may account for some of the discrepancy, but I also suspect that there are datasets in there whose component parts have also been given DOIs. A difference of about 600 (8136 - 7511 = 625) isn't many, but it means the relation is far from a guarantee of a paper link. I've asked them for their opinion on this (is it used exclusively for paper links?).
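For reference, the two queries and the size of the gap between them can be reproduced with a short Python sketch. The base URL and the query strings are taken from the links above; the function name is made up for illustration, and the hit counts are simply the figures observed at the time, hard-coded rather than fetched.

```python
from urllib.parse import quote

# Base URL as used in the links above (it may no longer resolve).
SEARCH_BASE = "http://search.datacite.org/ui?&q="


def search_url(query: str) -> str:
    """Percent-encode a query, leaving the backslash and asterisk literal."""
    return SEARCH_BASE + quote(query, safe="\\*")


with_doi_url = search_url(r"relatedIdentifier:issupplementto\:DOI\:*")
any_target_url = search_url(r"relatedIdentifier:issupplementto\:*")

# Hit counts as recorded in these notes, not retrieved by this code.
hits_with_doi = 7511
hits_any_target = 8136
hits_without_doi = hits_any_target - hits_with_doi  # 625
```

The backslashes escape the colons for the search engine's query syntax, which is why they are kept out of the percent-encoding.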

http://data.datacite.org/10.5061/DRYAD.8C1P6 is a fun one -- there is no DOI link to the publication in the metadata, presumably because Crossref doesn't know about it (it's from a proceedings).

Update:

It seems that 'relatedIdentifier:issupplementto\:*' (http://search.datacite.org/ui?&q=relatedIdentifier%3Aissupplementto\%3A*) is DataCite's preferred way to link to a paper, but as far as I can tell it doesn't exclusively indicate that a journal publication is on the other end of the link.

[From @datacite]

'If the author knows about a data-paper link, it is in the metadata http://t.co/zG3l48hn8q\%3A* Often they do not know'

(Original Tweet: https://twitter.com/datacite/status/468691940960391168)

I'm wary of making any assertions yet, because I'm basically just looking at individual instances to confirm it, but I don't see the 'publication' assertion in the DataCite record for Dryad entries.

For example, the basic search result only has

Resource type Dataset: DataPackage

and further down (but with no related-identifier with publication as the type)...

IsReferencedBy: doi:10.1111/MEC.12531

Assuming their search engine is okay, I ran the following (inter alia):
 * http://search.datacite.org/ui?&q=dryad -> 18,456 records
 * http://search.datacite.org/ui?&q=Dryad+Digital%20Repository -> 18,265 records
 * http://search.datacite.org/ui?&q=dryad+datapackage -> 5,601 records
 * http://search.datacite.org/ui?&q=dryad+publish* -> 552 records
 * http://search.datacite.org/ui?&q=dryad+publication -> 18 records

All the publi* ones are from mentions in titles and things like that.

So, in case the search engine or the result rendering was faulty (I suspect that the OAI-PMH stuff is -- still checking that out), I grabbed the full metadata for that record from them through content negotiation. The DataCite format is as rich as they get, but it has no related identifier with publication as the type (the link is typed only as a generic DOI):


 * <relatedIdentifier relationType="IsReferencedBy" relatedIdentifierType="DOI">10.1111/MEC.12531</relatedIdentifier>

The caveat is that this is one record of thousands. I'm checking others (datacite DL format only, as that seems to return the most data).
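Checking further records by hand gets tedious, so here is a minimal Python sketch of the same check. It makes several assumptions: that the data.datacite.org address used in these notes still answers content negotiation, that application/vnd.datacite.datacite+xml is the right media type for the DataCite format, and that the element of interest is named relatedIdentifier as in the line above. The function names and the wrapper elements in the sample are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed media type for the DataCite XML format.
DATACITE_XML = "application/vnd.datacite.datacite+xml"


def build_request(doi: str, media_type: str = DATACITE_XML) -> urllib.request.Request:
    """Prepare (but do not send) a content-negotiation request for one DOI."""
    return urllib.request.Request(
        "http://data.datacite.org/" + doi,
        headers={"Accept": media_type},
    )


def related_identifiers(xml_text: str) -> list:
    """Return (relationType, relatedIdentifierType, value) for each relatedIdentifier."""
    found = []
    for element in ET.fromstring(xml_text).iter():
        # Ignore any XML namespace so the check works across schema versions.
        if element.tag.rsplit("}", 1)[-1] == "relatedIdentifier":
            found.append((
                element.get("relationType"),
                element.get("relatedIdentifierType"),
                (element.text or "").strip(),
            ))
    return found


# Sample built around the one element quoted above; the wrappers are illustrative.
SAMPLE = (
    "<resource><relatedIdentifiers>"
    '<relatedIdentifier relationType="IsReferencedBy" relatedIdentifierType="DOI">'
    "10.1111/MEC.12531</relatedIdentifier>"
    "</relatedIdentifiers></resource>"
)
links = related_identifiers(SAMPLE)
```

To run it against live records, pass the result of build_request to urllib.request.urlopen and feed the decoded body to related_identifiers. The sketch only lists what is there; deciding whether a given relationType implies a publication is still the open question.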

Asking for RDF gets:

 <j.0:creator>Haile, James</j.0:creator>
 <j.0:title>Data from: Who’s for dinner? High-throughput sequencing reveals bat diet differentiation in a biodiversity hotspot where prey taxonomy is largely undescribed</j.0:title>
 <j.0:creator>Burgar, Joanna M.</j.0:creator>
 <j.0:creator>Stokes, Vicki</j.0:creator>
 <j.0:creator>Murray, Daithi C.</j.0:creator>
 <j.0:date>2013</j.0:date>
 <owl:sameAs>doi:10.5061/DRYAD.KM6PH</owl:sameAs>
 <owl:sameAs>info:doi/10.5061/DRYAD.KM6PH</owl:sameAs>
 <j.0:identifier>10.5061/DRYAD.KM6PH</j.0:identifier>
 <j.0:publisher>Dryad Digital Repository</j.0:publisher>
 <j.0:creator>Houston, Jayne</j.0:creator>
 <j.0:creator>Craig, Michael D.</j.0:creator>
 <j.0:creator>Bunce, Michael</j.0:creator>
 </rdf:Description>
 </rdf:RDF>

Asking for Turtle gets even less:

 <http://dx.doi.org/10.5061/DRYAD.KM6PH>
     <http://purl.org/dc/terms/creator> "Murray, Daithi C." , "Craig, Michael D." , "Bunce, Michael" , "Haile, James" , "Burgar, Joanna M." , "Houston, Jayne" , "Stokes, Vicki" ;
     <http://purl.org/dc/terms/date> "2013" ;
     <http://purl.org/dc/terms/identifier> "10.5061/DRYAD.KM6PH" ;
     <http://purl.org/dc/terms/publisher> "Dryad Digital Repository" ;
     <http://purl.org/dc/terms/title> "Data from: Who’s for dinner? High-throughput sequencing reveals bat diet differentiation in a biodiversity hotspot where prey taxonomy is largely undescribed" ;
     <http://www.w3.org/2002/07/owl#sameAs> "info:doi/10.5061/DRYAD.KM6PH" , "doi:10.5061/DRYAD.KM6PH" .

JSON is also bad:

 {
   "type": "dataset",
   "DOI": "10.5061\/DRYAD.KM6PH",
   "URL": "http:\/\/dx.doi.org\/10.5061\/DRYAD.KM6PH",
   "title": "Data from: Who’s for dinner? High-throughput sequencing reveals bat diet differentiation in a biodiversity hotspot where prey taxonomy is largely undescribed",
   "publisher": "Dryad Digital Repository",
   "issued": {"raw": "2013"},
   "author": [
     {"literal": "Burgar, Joanna M."},
     {"literal": "Murray, Daithi C."},
     {"literal": "Craig, Michael D."},
     {"literal": "Haile, James"},
     {"literal": "Houston, Jayne"},
     {"literal": "Stokes, Vicki"},
     {"literal": "Bunce, Michael"}
   ]
 }

As is BibTeX

 @data{414ac90b-fb09-4b39-a396-dea056a58ffe,
   doi = {10.5061/DRYAD.KM6PH},
   url = {http://dx.doi.org/10.5061/DRYAD.KM6PH},
   author = {Burgar, Joanna M.; Murray, Daithi C.; Craig, Michael D.; Haile, James; Houston, Jayne; Stokes, Vicki; Bunce, Michael; },
   publisher = {Dryad Digital Repository},
   title = {Data from: Who’s for dinner? High-throughput sequencing reveals bat diet differentiation in a biodiversity hotspot where prey taxonomy is largely undescribed},
   year = {2013}
 }
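
To compare the formats without reading each response by eye, one crude approach is to scan the raw text for any DOI other than the record's own. The Python sketch below does that. The regular expression is a rough heuristic rather than a full DOI grammar, and the function name is hypothetical; the sample strings are cut down from the responses quoted in these notes.

```python
import re

# Rough DOI matcher: "10.", a registrant code, "/", then a run of characters
# that excludes whitespace and common delimiters. A heuristic, not a validator.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>{},;]+')


def other_dois(text: str, own_doi: str) -> set:
    """Return lowercased DOIs found in text, excluding the record's own DOI."""
    unescaped = text.replace("\\/", "/")  # undo JSON-style escaped slashes
    found = {match.rstrip(".").lower() for match in DOI_PATTERN.findall(unescaped)}
    found.discard(own_doi.lower())
    return found


OWN = "10.5061/DRYAD.KM6PH"
bibtex_sample = "doi = {10.5061/DRYAD.KM6PH}, url = {http://dx.doi.org/10.5061/DRYAD.KM6PH},"
json_sample = r'{"DOI":"10.5061\/DRYAD.KM6PH","URL":"http:\/\/dx.doi.org\/10.5061\/DRYAD.KM6PH"}'
xml_sample = (
    '<relatedIdentifier relationType="IsReferencedBy" '
    'relatedIdentifierType="DOI">10.1111/MEC.12531</relatedIdentifier>'
)

bibtex_links = other_dois(bibtex_sample, OWN)
json_links = other_dois(json_sample, OWN)
xml_links = other_dois(xml_sample, OWN)
```

On these samples only the DataCite XML fragment yields a second DOI, which matches what the responses above show: the RDF, Turtle, JSON and BibTeX renderings carry the record's own identifier and nothing that points at the paper.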