Difference between revisions of "Data Access"

From Dryad wiki
(SOLR search access)
(Web Browser User Interface)
 
(2 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
== Web Browser User Interface ==
 
== Web Browser User Interface ==
  
Primary access to Dryad is through its web interface, where users most commonly search on authors, titles, subjects and other metadata elements. Data files archived by Dryad may be downloaded one-by-one from their Dryad data package Web pages.
+
Primary access to Dryad is through its web interface at http://datadryad.org, where users most commonly search on authors, titles, subjects and other metadata elements. Data files archived by Dryad may be downloaded one-by-one from their Dryad data package Web pages.
  
Additionally, DSpace, the platform on which Dryad is built, supports several "hidden" ways to hack the system's URLs to get useful metadata from the Web interface.
+
=== Searching the old Dryad (historical content only) ===
 +
 
 +
DSpace, the platform on which the initial version of Dryad was built, supports several "hidden" ways to hack the system's URLs to get useful metadata from the Web interface.
 
<blockquote>Finding a data package page using the article DOI or PMID:
 
<blockquote>Finding a data package page using the article DOI or PMID:
*[http://datadryad.org/discover?query= http://datadryad.org/discover?query=]"doi:10.1111/j.1558-5646.2007.00022.x"
+
*[http://v1.datadryad.org/discover?query= http://v1.datadryad.org/discover?query=]"doi:10.1111/j.1558-5646.2007.00022.x"
*[http://datadryad.org/discover?query= http://datadryad.org/discover?query=]"PMID:17348941"
+
*[http://v1.datadryad.org/discover?query= http://v1.datadryad.org/discover?query=]"PMID:17348941"
 
Viewing full metadata: add "?show=full" to the end of the URL
 
Viewing full metadata: add "?show=full" to the end of the URL
*[http://datadryad.org/resource/doi:10.5061/dryad.20?show=full http://datadryad.org/resource/doi:10.5061/dryad.20?show=full]
+
*[http://v1.datadryad.org/resource/doi:10.5061/dryad.20?show=full http://v1.datadryad.org/resource/doi:10.5061/dryad.20?show=full]
 
Viewing the raw DSpace representation of a page add "DRI" to the URL
 
Viewing the raw DSpace representation of a page add "DRI" to the URL
*[http://datadryad.org/resource/doi:10.5061/dryad.12/DRI http://datadryad.org/resource/doi:10.5061/dryad.12/DRI]
+
*[http://v1.datadryad.org/resource/doi:10.5061/dryad.12/DRI http://v1.datadryad.org/resource/doi:10.5061/dryad.12/DRI]
 
Another way to view the raw DSpace markup is to add "?XML" to the end of the URL. This is less useful than the above method, though, because the page's content won't contain the externalized i18n strings.
 
Another way to view the raw DSpace markup is to add "?XML" to the end of the URL. This is less useful than the above method, though, because the page's content won't contain the externalized i18n strings.
*[http://datadryad.org/resource/doi:10.5061/dryad.12?XML http://datadryad.org/resource/doi:10.5061/dryad.12?XML]
+
*[http://v1.datadryad.org/resource/doi:10.5061/dryad.12?XML http://v1.datadryad.org/resource/doi:10.5061/dryad.12?XML]
 
Viewing metadata in machine-readable (METS) format. Can be performed using a DOI or a (legacy) handle:
 
Viewing metadata in machine-readable (METS) format. Can be performed using a DOI or a (legacy) handle:
*[http://datadryad.org/resource/doi:10.5061/dryad.12/mets.xml http://datadryad.org/resource/doi:10.5061/dryad.12/mets.xml]
+
*[http://v1.datadryad.org/resource/doi:10.5061/dryad.12/mets.xml http://v1.datadryad.org/resource/doi:10.5061/dryad.12/mets.xml]
*[http://datadryad.org/metadata/handle/10255/dryad.1080/mets.xml http://datadryad.org/metadata/handle/10255/dryad.1080/mets.xml]
+
*[http://v1.datadryad.org/metadata/handle/10255/dryad.1080/mets.xml http://v1.datadryad.org/metadata/handle/10255/dryad.1080/mets.xml]
 
</blockquote>
 
</blockquote>
 +
 
== Programmatic Data Access ==
 
== Programmatic Data Access ==
  

Latest revision as of 06:20, 10 September 2019

Web Browser User Interface

Primary access to Dryad is through its web interface at http://datadryad.org, where users most commonly search on authors, titles, subjects and other metadata elements. Data files archived by Dryad may be downloaded one-by-one from their Dryad data package Web pages.

Searching the old Dryad (historical content only)

DSpace, the platform on which the initial version of Dryad was built, supports several "hidden" ways to hack the system's URLs to get useful metadata from the Web interface.

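Finding a data package page using the article DOI or PMID:

  • http://v1.datadryad.org/discover?query="doi:10.1111/j.1558-5646.2007.00022.x"
  • http://v1.datadryad.org/discover?query="PMID:17348941"

Viewing full metadata: add "?show=full" to the end of the URL.

  • http://v1.datadryad.org/resource/doi:10.5061/dryad.20?show=full

Viewing the raw DSpace representation of a page: add "/DRI" to the end of the URL.

  • http://v1.datadryad.org/resource/doi:10.5061/dryad.12/DRI

Another way to view the raw DSpace markup is to add "?XML" to the end of the URL. This is less useful than the DRI method, though, because the page's content won't contain the externalized i18n strings.

  • http://v1.datadryad.org/resource/doi:10.5061/dryad.12?XML

Viewing metadata in machine-readable (METS) format. This can be done using a DOI or a (legacy) handle:

  • http://v1.datadryad.org/resource/doi:10.5061/dryad.12/mets.xml
  • http://v1.datadryad.org/metadata/handle/10255/dryad.1080/mets.xml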
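The URL patterns in this section can be generated from a DOI. The following is a minimal sketch, not an official Dryad client: the function names and dictionary keys are our own, and it assumes the v1.datadryad.org host used in this page's examples is still serving the historical content.

```python
from urllib.parse import quote

# Host of the historical (DSpace-based) Dryad, as used in this page's examples.
V1_BASE = "http://v1.datadryad.org"


def dspace_view_urls(doi):
    """Return the alternate views of a data package, keyed by a descriptive name.

    `doi` is a bare Dryad DOI such as "10.5061/dryad.12" (no "doi:" prefix).
    """
    resource = f"{V1_BASE}/resource/doi:{doi}"
    return {
        "full_metadata": resource + "?show=full",
        "dri": resource + "/DRI",
        "raw_xml": resource + "?XML",
        "mets": resource + "/mets.xml",
    }


def discover_url(identifier):
    """Build a search URL for an article identifier.

    `identifier` looks like "doi:10.1111/..." or "PMID:17348941"; it is wrapped
    in double quotes, as in the examples, and percent-encoded.
    """
    return f"{V1_BASE}/discover?query=" + quote(f'"{identifier}"')


if __name__ == "__main__":
    for name, url in dspace_view_urls("10.5061/dryad.12").items():
        print(name, url)
    print(discover_url("PMID:17348941"))
```

Percent-encoding the quoted identifier yields a URL equivalent to the hand-typed form shown in the examples.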

Programmatic Data Access

In addition to the web interface, Dryad can be accessed programmatically through several APIs.

DataCite

Metadata for all Dryad objects is published through the DataCite system, and available in the DataCite search. This metadata is also available via the Scholix system.

OAI-PMH

The OAI-PMH API is no longer being updated, but the historical contents are still available from http://v1.datadryad.org/oai

OAI-PMH is a harvesting protocol that may be used to access Dryad's metadata. The specification is available, as are online tutorials, but we include a couple of examples of its use here for illustrative purposes.

Identify

Used to learn about the service.

ListSets

Used to learn what sets of metadata are supported. Dryad offers a data package set and a data file set.

ListMetadataFormats

Used to learn what metadata formats can be returned by the service. Dryad currently offers METS/MODS, OAI-DC (Dublin Core), OAI-ORE/Atom, and RDF/DC. The amount of information mapped into each format varies. For now, we recommend using the OAI-DC metadata format.

ListIdentifiers

Used to list Dryad's OAI identifiers. It requires the from and metadataPrefix parameters, which specify the range of identifiers to return and the format the metadata should be in (one of the options returned by the ListMetadataFormats verb). We may modify this to return DOIs in the future.
NOTE: It is highly recommended that you use this call in conjunction with the "set" parameter, so that you retrieve only the records of interest. Otherwise, you may retrieve records that Dryad has harvested from other providers.

ListRecords

Used to list Dryad records. It requires the from and metadataPrefix parameters, which specify the range of records to return and the format in which they are returned. Available formats can be discovered by using the ListMetadataFormats verb.
NOTE: It is highly recommended that you use this call in conjunction with the "set" parameter, so that you retrieve only the records of interest. Otherwise, you may retrieve records that Dryad has harvested from other providers.

GetRecord

Used to return a single record. It requires the OAI identifier of the record (the identifier parameter) and the format in which the record should be returned (the metadataPrefix parameter).
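As an illustration of how the verbs above map onto request URLs, here is a small sketch of a URL builder. It assumes the request endpoint is http://v1.datadryad.org/oai/request (the one used in the resumptionToken example on this page); the helper name `oai_url` is our own and is not part of any Dryad or OAI library.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from the ListRecords example on this page.
OAI_ENDPOINT = "http://v1.datadryad.org/oai/request"


def oai_url(verb, params=None):
    """Build an OAI-PMH request URL for `verb` with optional extra parameters.

    Parameters are passed as a dict because `from` is a reserved word in
    Python and cannot be used as a keyword argument name.
    """
    query = {"verb": verb}
    query.update(params or {})
    return OAI_ENDPOINT + "?" + urlencode(query)


if __name__ == "__main__":
    print(oai_url("Identify"))
    print(oai_url("ListSets"))
    print(oai_url("ListMetadataFormats"))
    # Restricting to a set, as the NOTE above recommends.
    print(oai_url("ListRecords", {
        "from": "2010-01-01",
        "metadataPrefix": "oai_dc",
        "set": "hdl_10255_3",
    }))
```

GetRecord works the same way, with the identifier and metadataPrefix parameters supplied in the dict.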

Using resumptionTokens with OAI-PMH

OAI-PMH requests may result in partial results lists being returned. In these cases, the results list will contain a resumptionToken that can be used to retrieve the next page of results.

For example, for a call like:

http://v1.datadryad.org/oai/request?verb=ListRecords&from=2010-01-01&metadataPrefix=oai_dc&set=hdl_10255_3

You will receive the first 100 records, ending with a resumptionToken of 2010-01-01T00:00:00Z/9999-12-31T23:59:59Z/hdl_10255_3/oai_dc/100

You can then retrieve the next 100 records with:

http://v1.datadryad.org/oai/request?verb=ListRecords&resumptionToken=2010-01-01T00:00:00Z/9999-12-31T23:59:59Z/hdl_10255_3/oai_dc/100

Note that when using a resumptionToken, OAI expects you to only repeat the verb, not any of the other parameters that were part of the original request.
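The paging behaviour described above can be sketched as a loop that keeps requesting pages until no resumptionToken is returned. This is an illustrative sketch only: the function names are ours, the endpoint is the one from the example URLs, and the response layout is assumed to follow the standard OAI-PMH 2.0 XML namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_ENDPOINT = "http://v1.datadryad.org/oai/request"
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # standard OAI-PMH namespace


def http_get(url):
    """Fetch a URL and return the response body as bytes."""
    with urlopen(url) as response:
        return response.read()


def harvest_records(params, fetch=http_get):
    """Yield every <record> element from a ListRecords request, page by page.

    `params` holds the initial parameters (from, metadataPrefix, set).
    `fetch` can be replaced, for example to add retries or for testing.
    """
    query = dict(params, verb="ListRecords")
    while True:
        root = ET.fromstring(fetch(OAI_ENDPOINT + "?" + urlencode(query)))
        yield from root.iter(OAI_NS + "record")
        token = root.find(f".//{OAI_NS}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty one: this was the last page
        # Follow-up requests repeat only the verb, plus the token.
        query = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


if __name__ == "__main__":
    initial = {"from": "2010-01-01", "metadataPrefix": "oai_dc", "set": "hdl_10255_3"}
    for count, record in enumerate(harvest_records(initial), start=1):
        pass
    print("records harvested:", count)
```

Passing `fetch` as a parameter keeps the paging logic separate from the network call, so the loop can be exercised against canned responses.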

DataONE API

The DataONE API is currently not being updated, but we expect to update it again in the future. The historical contents are still available from http://v1.datadryad.org/mn

As part of Dryad's participation in the DataONE project, Dryad makes content available through a specialized API.

Programmatic access to data files using the DataONE API

  1. Obtain the DataONE ID of a Dryad object using the DataONE listObjects call: http://v1.datadryad.org/mn/object (e.g., dryad.1850/1)
  2. Retrieve the file: http://v1.datadryad.org/mn/object/doi:10.5061/dryad.1850/1/bitstream
  3. Retrieve system metadata about a file, including size and MIME type: http://v1.datadryad.org/mn/meta/doi:10.5061/dryad.1850/1/bitstream
  4. Retrieve descriptive metadata about a file: http://v1.datadryad.org/mn/object/doi:10.5061/dryad.1850/1

If you need the full filename before downloading, obtain the METS document as described above. The filename is in the xlink:href attribute of the <mets:FLocat/> element.
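The steps above might be scripted along the following lines. This is a sketch under stated assumptions: the helper names are ours, the URL patterns are copied from the numbered steps, and the METS and XLink namespaces are the standard ones (http://www.loc.gov/METS/ and http://www.w3.org/1999/xlink), which we assume Dryad's METS documents use.

```python
import xml.etree.ElementTree as ET

MN_BASE = "http://v1.datadryad.org/mn"
METS_NS = "{http://www.loc.gov/METS/}"              # standard METS namespace
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"   # standard XLink attribute


def dataone_urls(dataone_id):
    """Return the DataONE URLs for a file ID such as "doi:10.5061/dryad.1850/1"."""
    return {
        "file": f"{MN_BASE}/object/{dataone_id}/bitstream",
        "system_metadata": f"{MN_BASE}/meta/{dataone_id}/bitstream",
        "descriptive_metadata": f"{MN_BASE}/object/{dataone_id}",
    }


def flocat_hrefs(mets_xml):
    """Return the xlink:href value of every <mets:FLocat> in a METS document."""
    root = ET.fromstring(mets_xml)
    return [
        element.get(XLINK_HREF)
        for element in root.iter(METS_NS + "FLocat")
        if element.get(XLINK_HREF)
    ]


if __name__ == "__main__":
    for name, url in dataone_urls("doi:10.5061/dryad.1850/1").items():
        print(name, url)
```

`flocat_hrefs` returns the raw attribute values; if a value is a path or URL, the filename would be its last segment.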

Accessing Data Packages via Journal ISSN

Journals and their ISSNs can be accessed through a GET command:

http://datadryad.org/api/v1/journals

The corresponding ISSN can be used to get a list of packages in Dryad for that journal using the following GET command:

http://datadryad.org/api/v1/journals/{issn}/packages

If multiple pages of results are returned, the next and previous page links can be accessed from the link headers with `rel=next` and `rel=prev`.

There are additional query parameters that can be used to modify the results returned.

  • `count` specifies the number of results per page.
  • `date_from` and `date_to` can filter results to packages released in a date range.
  • `cursor` can be used to specify the key used to start the results page.
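A client for these calls needs two pieces: a URL builder for the query parameters and a parser for the Link header. The sketch below shows one possible shape for both. The function names are ours, the ISSN in the usage example is a placeholder rather than a real journal, and the date format accepted by `date_from` and `date_to` is not stated on this page, so none is assumed here.

```python
import re
from urllib.parse import urlencode

API_BASE = "http://datadryad.org/api/v1"


def packages_url(issn, count=None, date_from=None, date_to=None, cursor=None):
    """Build the packages URL for a journal ISSN, adding only the parameters given."""
    optional = {
        "count": count,
        "date_from": date_from,
        "date_to": date_to,
        "cursor": cursor,
    }
    params = {key: value for key, value in optional.items() if value is not None}
    url = f"{API_BASE}/journals/{issn}/packages"
    return url + ("?" + urlencode(params) if params else "")


def parse_link_header(value):
    """Map each rel value to its URL for a header like '<url>; rel="next", ...'."""
    links = {}
    for url, rel in re.findall(r'<([^>]*)>\s*;\s*rel="?([^",;]+)"?', value):
        links[rel.strip()] = url
    return links


if __name__ == "__main__":
    # "1234-5678" is a placeholder ISSN used only for illustration.
    print(packages_url("1234-5678", count=20))
    header = '<http://example.org/p?cursor=40>; rel="next", <http://example.org/p?cursor=1>; rel="prev"'
    print(parse_link_header(header).get("next"))
```

To page through all results, request the URL under `next` until the header no longer contains one.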

Links to Data Packages/Files

Dryad uses DOIs (Digital Object Identifiers) to identify Dryad data packages and files. A few simple examples follow.

Data packages

Data files

RSS Feeds

The RSS feed is no longer being updated, but the historical contents are still available from http://v1.datadryad.org/feed

There are a couple of feed options. Feeds are used by some browsers and all feed and news readers. They may also be used for programmatic access.

Everything -- data packages, data files, and metadata harvested from partner repositories

Data packages only

Data files only

Twitter Feed

Primary tweets from Dryad (typically not data)

SOLR search access

The SOLR index is no longer being updated, but historical Dryad content can still be searched through the SOLR interface.

http://v1.datadryad.org/solr/search/select/?q=location:l2&facet=true&facet.field=dc.subject_filter&facet.minCount=1&facet.limit=5000&fl=nothing

  • Article DOIs associated with all data published in Dryad over the past 90 days:

http://v1.datadryad.org/solr/search/select/?q=dc.date.available_dt:%5BNOW-90DAY/DAY%20TO%20NOW%5D&fl=dc.relation.isreferencedby&rows=1000000

  • Data DOIs published in Dryad during January 2011, with results returned in JSON format:

http://v1.datadryad.org/solr/search/select/?q=location:l2+dc.date.available_dt:%5B2011-01-01T00:00:00Z%20TO%202011-01-31T23:59:59Z%5D&fl=dc.identifier&rows=1000000&wt=json
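Query URLs like the ones above are easier to get right when the percent-encoding is left to a library. The following sketch rebuilds the January 2011 query. The helper name `solr_url` is ours; the field names and the `location:l2` filter are copied from the examples above without further knowledge of the index schema.

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://v1.datadryad.org/solr/search/select/"


def solr_url(query, fields, rows=1000000, wt=None):
    """Build a SOLR select URL.

    `query` is the raw SOLR query string, `fields` the value for `fl`,
    `rows` the maximum number of results, and `wt` an optional response
    format such as "json".
    """
    params = {"q": query, "fl": fields, "rows": rows}
    if wt:
        params["wt"] = wt
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    january_2011 = solr_url(
        "location:l2 dc.date.available_dt:"
        "[2011-01-01T00:00:00Z TO 2011-01-31T23:59:59Z]",
        "dc.identifier",
        wt="json",
    )
    print(january_2011)
```

The generated URL encodes colons and brackets more aggressively than the hand-written examples, but it decodes to the same query.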

For more about using SOLR, see the Apache SOLR documentation.

Widget API

The Widget API will become part of the "New" Dryad API (see below), but components are coming online. The Widget API provides simple images or dynamic iframes that link to content in Dryad and can be embedded into third-party sites.

Other access mechanisms

If you know of other community-developed services for searching or retrieving Dryad content that are not listed here, please alert us at help@datadryad.org.

Suggest Alternatives

We're interested in hearing what other forms of access people would like. If you have a suggestion for making Dryad's content more accessible, please let us know at help@datadryad.org.