Difference between revisions of "Old: DOI Services Technology"

(DSpace 4 DOI Module)

Revision as of 23:08, 2 April 2015

Overview

Dryad mints, manages, and registers Digital Object Identifiers (DOIs) for data packages and data files deposited into the Dryad Data Repository. This page documents the technical details of these DOI services.

General information about Dryad's DOI services can be found on the DOI Services page.

The structure of DOIs is described on the DOI Usage page.

Storage information (and Warning)

WARNING: DOIs are currently stored in three places:

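  1. Postgres doi table (SQL schema: https://github.com/datadryad/dryad-repo/blob/dryad-master/dspace/etc/postgres/database_create_doi_table.sql). This is the authoritative location where DOIs are minted.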
  2. Postgres metadatavalue table. Item DOIs are recorded in the dc.identifier metadata field. Not authoritative but used for relationships.
  3. "dryad" solr index -- DEPRECATED. The index is no longer being updated, so it does not include all Dryad records. Do not write any new code that uses this index. When you encounter code that uses this index, change it.

Prior to 2014-04-11, the authoritative location of DOIs was the doi.db file in /opt/dryad/doi-minter. This file is no longer used. It was written by the perst library, and access to it was wrapped by DOIDatabase.java. Concurrent access to this file caused problems under heavy load, so the DOIs were migrated to a Postgres table (see the GitHub pull request to migrate doi.db to postgres: https://github.com/datadryad/dryad-repo/pull/567).

Command-line features

Local DOI database

Dryad maintains a local database of DOIs. These are used for fast lookups within the search system.

The local DOI service can be managed using a command line call:

dryad/bin/dspace doi-util
Usage:
-h              Help... prints this usage information
-s              Search for a known DOI and return it
-p <FILE>       Prints the DOI database to an output stream
-c              Outputs the number of DOIs in the database

Database synchronization tool. This synchronizes the local DOI database with the objects in the main Dryad store.

./dspace dsrun org.dspace.identifier.DOIDbSync
-s: to synchronize + report
-r: to produce the report
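
The contents of the report are not described on this page. As a rough sketch, assuming the report simply lists DOIs that appear on one side but not the other, the comparison could look like the following. The function and key names are hypothetical and are not taken from DOIDbSync.

```python
def sync_report(doi_table, item_dois):
    """Compare DOIs in the local DOI table with DOIs found on stored items.

    Both arguments are sets of DOI strings. The result lists the DOIs
    that exist on only one side, sorted for stable output.
    """
    return {
        # Recorded on an item but absent from the local DOI table.
        "missing_from_table": sorted(item_dois - doi_table),
        # Present in the local DOI table but not recorded on any item.
        "not_on_any_item": sorted(doi_table - item_dois),
    }


report = sync_report(
    {"doi:10.5061/dryad.1111", "doi:10.5061/dryad.2222"},
    {"doi:10.5061/dryad.2222", "doi:10.5061/dryad.3333"},
)
print(report)
```

Under this assumption, the -r option would only print such a report, while -s would also add the missing entries to the table.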

DOI Migration (historical)

The DOIMigrator class was used to move DOIs from the perst doi.db file to the postgres DOI table. This class is now deprecated since the migration has been completed and doi.db is no longer used.

Run it like other DSpace command-line tools:

/opt/dryad/bin/dspace dsrun org.dspace.doi.DOIMigrator

EZID DOI database

EZID manages Dryad's DOIs and their registration with the DOI Federation.

To check the status of a DOI registered with EZID:

/opt/dryad/bin/dsrun org.dspace.doi.CDLDataCiteService 10.5061/DRYAD.2222

To update metadata for a DOI (pushing Dryad metadata to EZID):

/opt/dryad/bin/dsrun org.dspace.doi.CDLDataCiteService username password doi-to-update target-url update

Notes about the above command:

  • to register a new DOI, replace "update" with "register"
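
For orientation only: the EZID API exchanges metadata as ANVL text, one "name: value" pair per line, with the target URL carried in the _target element. The sketch below builds such a request body without making any network call. It is not taken from CDLDataCiteService; the function names and the example target URL are assumptions.

```python
import re


def _escape(text, pattern):
    """Percent-encode every character matched by the given regex class."""
    return re.sub(pattern, lambda m: "%%%02X" % ord(m.group(0)), text)


def build_anvl(metadata):
    """Serialize a dict of name/value pairs as ANVL, one pair per line."""
    lines = []
    for name, value in metadata.items():
        # A raw colon would end the name early, so it is escaped in names only.
        safe_name = _escape(name, r"[%:\r\n]")
        safe_value = _escape(value, r"[%\r\n]")
        lines.append(safe_name + ": " + safe_value)
    return "\n".join(lines)


# Hypothetical target URL, for illustration only.
body = build_anvl({"_target": "http://datadryad.org/resource/doi:10.5061/DRYAD.2222"})
print(body)
```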

To update DataCite with metadata from all Dryad objects:

/opt/dryad/bin/dsrun org.dspace.doi.CDLDataCiteService username password syncall

The metadata transformation crosswalk for DataCite is stored in DIM2DATACITE.xsl. Items that are in publication blackout are transformed with DIM2DATACITE-BLACKOUT.xsl.

The determination of which crosswalk to use is made by checking the metadata for the item. When an item enters blackout, a provenance record is added that includes the phrase "Entered publication blackout". If this is the last provenance record at the time of registration, the blackout crosswalk is used.

When the publication blackout ends, the item is approved and the approval is added as provenance. When the registration is updated, "blackout" is no longer the last provenance record, so the item is registered with the standard metadata.
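
The selection rule above can be sketched as follows. This is an illustration rather than the code in DryadDOIRegistrationHelper; the function name is hypothetical, and the provenance records are assumed to be plain strings ordered from oldest to newest.

```python
from typing import List

BLACKOUT_PHRASE = "Entered publication blackout"
STANDARD_CROSSWALK = "DIM2DATACITE.xsl"
BLACKOUT_CROSSWALK = "DIM2DATACITE-BLACKOUT.xsl"


def choose_crosswalk(provenance_records: List[str]) -> str:
    """Pick the crosswalk by inspecting only the most recent provenance record."""
    if provenance_records and BLACKOUT_PHRASE in provenance_records[-1]:
        return BLACKOUT_CROSSWALK
    return STANDARD_CROSSWALK


in_blackout = ["Submitted by author", "Entered publication blackout by curator"]
after_approval = in_blackout + ["Approved for entry into archive"]
print(choose_crosswalk(in_blackout))
print(choose_crosswalk(after_approval))
```

Because only the last record is checked, an earlier blackout record has no effect once a later record, such as an approval, has been added.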

Workflow

For Submissions

  1. DOIs are minted at the point of submission to Dryad. When a data package is submitted, a call to mint a DOI (without registering it) is made to the DOI Service.
  2. The data package should contain data files -- for a DOI to be registered for the data file, there must be a link in the metadata from the data file to the data package.
  3. The data package then goes on to be curated by the Dryad Librarian.
  4. If the data package is approved, the DOI is registered with DataCite through the EZID DOI registration service.
  5. If the package isn't approved, the DOI remains unregistered.
  6. Lastly, the registered DOI is emailed to the submitter so that it can be included in the article and used to reference the published data package.
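
The steps above can be summarized in a small model. The class and function names are hypothetical and only mirror the order of events described here; the real workflow runs inside DSpace.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataPackage:
    doi: str
    registered: bool = False
    events: List[str] = field(default_factory=list)


def submit(doi: str) -> DataPackage:
    """Step 1: the DOI is minted at submission but not yet registered."""
    package = DataPackage(doi=doi)
    package.events.append("minted")
    return package


def curate(package: DataPackage, approved: bool) -> DataPackage:
    """Steps 3 to 6: register the DOI and notify the submitter only on approval."""
    if approved:
        package.registered = True
        package.events.append("registered")
        package.events.append("emailed")
    else:
        # The minted DOI stays unregistered.
        package.events.append("not approved")
    return package


approved = curate(submit("10.5061/DRYAD.2222"), approved=True)
rejected = curate(submit("10.5061/DRYAD.3333"), approved=False)
print(approved.events, rejected.events)
```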

For Citation Downloads

  1. DOIs are passed to the CitationServlet when a user requests a citation download or uses one of the sharing services that Dryad supports (Delicious, Digg, etc.).
  2. DOI Services resolve the DOI and extract the metadata from the record, making it available for download in RIS or BibTeX format.
  3. The CitationServlet currently uses Dryad's DOI Services, but might in the future use DSpace's Identifier Services if it becomes an official module and Dryad's DOI resolution is built into it natively.

For Identifier Services

Dryad's DOI Services are also used by the Identifier Services DSpace module. Dryad's DOI Services serve as the local DOI resolver for these DSpace services. In the future, we may better integrate our DOI Services into this module.

When an identifier is used (created, modified, or resolved), the IdentifierServiceImpl looks through all of the available IdentifierProviders to see which one is capable of handling the associated identifier. Handling is then passed to the appropriate provider.

NOTE: The DOIIdentifierProvider is stored in Dryad's api module, while other IdentifierProviders are stored in the identifier-services module.
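
The provider lookup described above can be sketched as follows. The classes are simplified stand-ins, not the DSpace interfaces, and the "doi:" and "hdl:" prefix checks are assumptions made for the example.

```python
from typing import List, Optional


class IdentifierProvider:
    """Minimal stand-in for a provider; not the DSpace interface."""

    def supports(self, identifier: str) -> bool:
        raise NotImplementedError

    def resolve(self, identifier: str) -> str:
        raise NotImplementedError


class DoiProvider(IdentifierProvider):
    def supports(self, identifier: str) -> bool:
        return identifier.startswith("doi:")

    def resolve(self, identifier: str) -> str:
        return "DOI provider handled " + identifier


class HandleProvider(IdentifierProvider):
    def supports(self, identifier: str) -> bool:
        return identifier.startswith("hdl:")

    def resolve(self, identifier: str) -> str:
        return "Handle provider handled " + identifier


def dispatch(providers: List[IdentifierProvider], identifier: str) -> Optional[str]:
    """Hand the identifier to the first provider that claims to support it."""
    for provider in providers:
        if provider.supports(identifier):
            return provider.resolve(identifier)
    return None  # No provider can handle this identifier.


providers = [HandleProvider(), DoiProvider()]
print(dispatch(providers, "doi:10.5061/DRYAD.2222"))
```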

Configuration

Configuring the DOI Services module requires additional parameters to be set in the dspace.cfg configuration file. The Dryad project places these parameters in a Maven profile; they are then pulled into the dspace.cfg file when Dryad is built.

In the dspace.cfg file, the following parameters are used to configure the DOI services:

# URL that resolves DOIs
doi.hostname = http://dx.doi.org
# Base URL of Dryad used in registering DOIs
dryad.url = http://datadryad.org
# DOI prefix associated with Dryad
doi.prefix = ${default.doi.prefix}
# Directory where DOI minter files should be stored
doi.dir = ${dspace.dir}/doi-minter
# Username and password of the CDL DataCite Web service
doi.username = ${default.doi.username}
doi.password = ${default.doi.password}
# How long (# of chars) the DOI suffixes should be
doi.suffix.length = 5
# Local, static part of the suffix of the generated ID
doi.localpart.suffix = dryad.
# Whether the registration service should be used
doi.datacite.connected = ${default.doi.datacite.connected}
# URL for the DOI Services Web endpoint
doi.service.url=${default.doi.service}
# Indicates test mode for the Identifier Services connection to DOI Services
doi.service.testmode=false
# The prefix to use instead of doi.prefix when doi.service.testmode is true
doi.testprefix = ${default.doi.testprefix}

These settings, put into your Maven profile (in most cases, the settings.xml file), are pulled into the dspace.cfg file when Dryad is built:

<!-- The real username and password of DOI registration service -->
<default.doi.username>USERNAME</default.doi.username>
<default.doi.password>PASSWORD</default.doi.password>
<!-- The DOI prefix for DOIs minted; for Dryad this is the value below -->
<default.doi.prefix>10.5061</default.doi.prefix>
<!-- Whether to rewrite URLs to use the local DOI resolver or the dx.doi.org one -->
<default.dryad.localize>true</default.dryad.localize>
<!-- Whether to register the DOIs minted or just pretend like you did -->
<default.doi.datacite.connected>false</default.doi.datacite.connected>
<!-- The actual endpoint of the DOI service -->
<default.doi.service>http://localhost:9999/doi</default.doi.service>
<!-- Test mode configuration.  Used instead of default.doi.prefix if test mode is true in dspace.cfg -->
<default.doi.testprefix>10.5072/FK2/10.5061</default.doi.testprefix>
<!-- An index used for DataONE that works with the DOI registration process -->
<default.solr.dryad.server>http://localhost:9999/solr/dryad</default.solr.dryad.server>
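
To illustrate how the ${...} placeholders in dspace.cfg end up holding the profile values, here is a small sketch of the substitution step. It imitates the effect of the build-time filtering and is not the Maven implementation; the function name is hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def filter_properties(template, values):
    """Replace ${name} tokens with profile values; leave unknown tokens untouched."""
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


profile = {
    "default.doi.prefix": "10.5061",
    "default.doi.datacite.connected": "false",
}
template = (
    "doi.prefix = ${default.doi.prefix}\n"
    "doi.datacite.connected = ${default.doi.datacite.connected}\n"
    "doi.dir = ${dspace.dir}/doi-minter"
)
filtered = filter_properties(template, profile)
print(filtered)
```

Tokens with no matching profile entry, such as ${dspace.dir} in this example, are left as they are.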

Implementation

The core code is in dspace/modules/doi

CDLDataCiteService

  • Communicates with the EZID API (http://n2t.net/ezid/doc/apidoc.html) to register, update, and look up DOIs.
  • Provides a method to get the DataCite metadata (extractDataciteMetadata) for a registered DOI, so registration status can be relayed to a curator.

DryadDOIRegistrationHelper

  • Provides a utility method to check whether an item is currently in publication blackout for purposes of DataCite metadata.

Identifier services code is in dspace/modules/api/src/main/java/org/dspace/identifier/

DOIIdentifierProvider.mint() is the main method for modifying what a DOI means (to the local system).

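For more details on EZID, see:

  • EZID overview: http://www.cdlib.org/services/uc3/ezid
  • EZID user interface: http://n2t.net/ezid
  • EZID API documentation: http://www.cdlib.org/uc3/docs/ezidapi.html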

Configuration

Communication with the registration service is enabled if the doi.datacite.connected property is true in the dspace.cfg file. Test DOIs are minted and registered if doi.service.testmode is true in the dspace.cfg file.

  • The test mode prefix should begin with 10.5072/FK2 (e.g. 10.5072/FK2dryad for Dryad)
  • EZID allows API consumers to use this prefix to test the API. Entries are created with EZID but the DOIs are not pushed out to dx.doi.org and are deleted
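
A minimal sketch of how the prefix choice could work, assuming the configuration has been loaded into a dictionary of strings. The function name and the validation step are illustrative and are not taken from the Dryad code.

```python
EZID_TEST_NAMESPACE = "10.5072/FK2"


def active_prefix(config):
    """Return doi.testprefix when test mode is on, otherwise doi.prefix."""
    test_mode = config.get("doi.service.testmode", "false").strip().lower() == "true"
    if not test_mode:
        return config["doi.prefix"]
    prefix = config["doi.testprefix"]
    # Guard against minting real DOIs while in test mode.
    if not prefix.startswith(EZID_TEST_NAMESPACE):
        raise ValueError("test prefix must begin with " + EZID_TEST_NAMESPACE)
    return prefix


production = {"doi.prefix": "10.5061", "doi.service.testmode": "false"}
testing = {
    "doi.prefix": "10.5061",
    "doi.service.testmode": "true",
    "doi.testprefix": "10.5072/FK2dryad",
}
print(active_prefix(production), active_prefix(testing))
```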

Relation to DSpace

The Dryad DOI Services module relates to the Identifier Services module developed for the DSpace community by Atmire.

Some notes:

  • IdentifierService provides an abstraction over many different IdentifierProviders. Multiple Providers may be used within a single DSpace instance.

Currently, the Dryad DOI services modules exist as a separate DSpace module, but in the future some of these services might be integrated into the Identifier Services module.

A related package in DSpace is the Handle server. Dryad's DOI Services replace our use of the DSpace Handle server, though the Handle server continues to serve links published before we moved to DOIs.

Berlin Implementation

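The Technical University of Berlin has implemented a DOI service that works with the API of the DataCite MDS system. It may be included in an upcoming release of DSpace:

  • https://wiki.duraspace.org/display/~pbecker/DOI+support+using+DataCite
  • https://github.com/tuub/DSpace/tree/DOI
  • Created a new DOIIdentifierProvider.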

DSpace 4 DOI Module

DSpace 4 includes support for DOIs as identifiers, as well as DOI registration with EZID/DataCite.

In April 2014, dleehr reviewed the DSpace 4 implementation and compared it to Dryad's implementation:

  • DOIs are stored in item metadata as dc.identifier.uri in DSpace 4, and dc.identifier in Dryad.
  • There is a DOI class and database table "DOI". These names collide with Dryad classes/tables, and are not interface-compatible out of the box.
  • In the DSpace 4 implementation, DOI synchronization/registration is coupled with the identifier storage. The DOI table includes a registration status column.
  • DSpace 4 has a DOIConnector interface for reserving/registering DOIs, implemented by DataCiteConnector.
  • DSpace 4 also implements DOI registration through EZID with EZIDIdentifierProvider.
  • DSpace 4 synchronizes DOIs with the external provider using DOIOrganiser; Dryad uses DOIDbSync.

See Also

  • Manually updating DOI metadata with EZID