Architecture Overview

From Dryad wiki
Revision as of 22:28, 26 January 2017 by Ryan Scherle (talk | contribs) (Dryad servers/services)


The Dryad repository stores data associated with scientific publications. The vast majority of these publications are journal articles. All major aspects of the system are organized around journal articles, including the data model, the process for submitting new data, the curation workflows, and the mechanisms for retrieving data.

Dryad servers/services

Note: The diagram below shows Secundus and the Wiki at Duke, but these machines have since been moved to AWS.

DryadServers2015.png

A repository

DryadRepositoryArchitecture2015.png

Dryad's data model

DryadDataModel2015.png

A deposit in Dryad consists of:

  • Metadata: Objects in Dryad are described with an extended version of Dublin Core metadata, which is documented on the Metadata Profile page.
  • Data Package: A Dryad Data Package represents all of the data associated with a single scientific publication. This is a DSpace item that contains only metadata. The metadata contains a mix of fields that describe the data and fields that describe the associated publication. There are dc.relation fields that connect the Data Package to both its associated Data File items and to the external publication. The landing page for a Dryad Data Package displays a small summary of each associated Dryad Data File. (sample landing page, sample metadata)
  • Data Files: A Dryad Data File represents a single file of downloadable data. Often, this is a standalone file such as a spreadsheet, but it may be a compressed archive containing many other files. Each file is a DSpace item that contains file-specific metadata as well as the bitstreams. There are dc.relation fields that connect each Data File item to its parent Data Package item. (sample landing page, sample metadata)
  • Bitstreams: The bitstreams are downloadable files that contain data or documentation. In DSpace, these bitstreams are always part of a Dryad Data File item, but some access mechanisms (such as the DataONE API) treat the bitstreams as separate objects. A single DSpace item may contain multiple bitstreams that hold the same scientific data in different file formats. Each Dryad Data File may also contain a special bitstream named README, which documents the other bitstreams in the same Dryad Data File.
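
The relationships described in the list above can be sketched in a few lines of Python. This is only an illustrative model, not Dryad or DSpace code: the class names, the link() helper, the identifiers, and the specific dc.relation qualifiers (haspart, ispartof, isreferencedby) are assumptions chosen for the example. The text above says only that dc.relation fields connect the items. Matching the README bitstream by its exact name is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical qualifier names; the source only says "dc.relation fields".
HAS_PART = "dc.relation.haspart"
IS_PART_OF = "dc.relation.ispartof"
IS_REFERENCED_BY = "dc.relation.isreferencedby"


@dataclass
class Bitstream:
    """A downloadable file that holds data or documentation."""
    name: str
    file_format: str


@dataclass
class DataFile:
    """An item carrying file-specific metadata plus its bitstreams."""
    identifier: str
    metadata: Dict[str, List[str]] = field(default_factory=dict)
    bitstreams: List[Bitstream] = field(default_factory=list)

    def readme(self) -> Optional[Bitstream]:
        """Return the bitstream named README, or None if there is none."""
        for bitstream in self.bitstreams:
            if bitstream.name == "README":
                return bitstream
        return None


@dataclass
class DataPackage:
    """A metadata-only item for all data tied to one publication."""
    identifier: str
    metadata: Dict[str, List[str]] = field(default_factory=dict)


def link(package: DataPackage, data_file: DataFile) -> None:
    """Record the package/file relationship on both items."""
    package.metadata.setdefault(HAS_PART, []).append(data_file.identifier)
    data_file.metadata.setdefault(IS_PART_OF, []).append(package.identifier)


# Example with made-up identifiers: one package, one data file that
# offers the same data in two formats plus a README.
package = DataPackage("example-package")
package.metadata[IS_REFERENCED_BY] = ["example-article"]

data_file = DataFile(
    "example-package/1",
    bitstreams=[
        Bitstream("counts.csv", "text/csv"),
        Bitstream("counts.xlsx", "application/vnd.ms-excel"),
        Bitstream("README", "text/plain"),
    ],
)
link(package, data_file)
```

Note that the package object has no bitstreams attribute at all, mirroring the statement that a Data Package contains only metadata. Files are reachable from it only through the relation fields.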

For examples of atypical/extreme objects in Dryad, see Sample Dryad Content.

DSpace Database Schema

This diagram reflects the "stock" version of DSpace. It does not include Dryad customizations.

DSpace-1.8-Database-Schema.png

Workspace/Workflow

User-visible components

  • homepage
  • static pages
  • search
  • item view
  • submission
  • submission journal integration
  • payment
  • curation
  • statistics

Internal support components

  • DataONE API
  • Solr API
  • DSpace API
  • identifiers
  • versioning
  • embargo
  • reporting
  • curation tools
  • assetstore storage
  • database
  • authN (epeople)
  • authZ (access control)
  • journal settings configuration and management