Dryad REST API Technology
Ryan Scherle (→Manuscripts)
Revision as of 09:21, 4 September 2015
Overview and Technologies
The Dryad REST API (API) is implemented as a DSpace module. It provides a RESTful interface to Dryad, leveraging JSON and OAuth standards. It uses the following libraries:
- Jersey 1.18 - RESTful web services in Java
- Apache Oltu - OAuth Protocol implementation in Java
The initial version of the API is intended for partner journals to exchange manuscript information with Dryad. More features will be added in the future.
Installation
Code
The API webapp code exists within dspace/modules/dryad-rest-webapp, and will be compiled if the module is enabled.
Database Schema
The API uses database objects for storing resources, OAuth tokens, and access control lists. These objects can be installed via dspace/etc/postgres/dryad-rest-webapp.sql (and removed with clean-dryad-rest-webapp.sql).
Metadata Schema
To record the publication date, a field has been added to the dspace/config/registries/dublin-core-types.xml registry. The metadata fields should be loaded to ensure this field is present in the database:
    /opt/dryad/bin/dsrun org.dspace.administer.MetadataImporter -u /opt/dryad/config/registries/dublin-core-types.xml
Server deployment
The API is implemented as a Java Webapp, and should be added to the Tomcat server's server.xml. See dspace/etc/server/server.xml-fragment for an example.
API Endpoints
Versioning
The API uses versioning within resource paths. This document describes functionality in Version 1, so all of the endpoints begin with /api/v1/.
When installed on a Dryad server, the API webapp is designed to service /api, with /api/v1/ handled by a single servlet.
Transport Encryption
The OAuth2 spec requires HTTPS for all communication. Without transport encryption, tokens in the URL or header could easily be read by third parties. This should be enforced at deployment time, possibly through Apache proxy rules.
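Until HTTPS is enforced on the server side, a client script can at least refuse to send a token over plain HTTP. The following is a minimal sketch; the function name require_https is hypothetical and is not part of the Dryad code base.

```shell
# Hypothetical client-side guard (not part of the Dryad code base):
# succeed only when the given URL uses the https scheme.
require_https() {
  case "$1" in
    https://*) return 0 ;;
    *)
      printf 'refusing to send a token over a non-HTTPS URL: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

A script would call it before each request, for example: require_https "$url" && curl "$url". This does not replace server-side enforcement; it only keeps a misconfigured client from leaking a token.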
Resources
The initial version of the API is intended for partner journals to exchange manuscript information, so the resources are organizations and manuscripts.
See Dryad Manuscript API on Apiary.
Organizations
Organizations represent partner journal organizations - companies, publishers, or entities that will be supplying metadata relating to Dryad submissions. Organizations are identified by a short organizationCode, assigned by Dryad staff. Organizations are stored in the organization table, and include a code and a descriptive name.
Organization codes are the same concept as journal codes currently defined in the DryadJournalSubmission.properties file, and must adopt any existing assignments.
Manuscripts
Manuscripts represent documents of metadata, corresponding to a journal article or other work, which may have data submitted to Dryad. Individual manuscripts can be accessed through their manuscriptId, which is an identifier assigned by the organization and unique within the organization.
A GET request will list all available manuscripts in an organization. These requests can be modified with query parameters:
- count limits the number of results returned (default is 1000).
- search filters the results to manuscripts whose metadata contains the given single word.
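As an illustration of how the two parameters combine, the sketch below builds a listing URL. The function name manuscripts_url and its argument order are hypothetical. It assumes the search word contains only URL-safe characters (no encoding is performed) and that the access token is supplied separately in an Authorization header.

```shell
# Hypothetical helper: print the manuscript-listing URL for an organization.
# Arguments: base URL, organization code, optional count, optional search word.
# Assumption: the search word needs no URL encoding.
manuscripts_url() {
  local base="$1" org="$2" count="${3:-}" search="${4:-}"
  local url="${base%/}/api/v1/organizations/${org}/manuscripts/"
  local sep='?'
  if [ -n "$count" ]; then
    url="${url}${sep}count=${count}"
    sep='&'
  fi
  if [ -n "$search" ]; then
    url="${url}${sep}search=${search}"
  fi
  printf '%s\n' "$url"
}
```

For example, manuscripts_url https://staging.datadryad.org test 10 Smith yields a URL ending in manuscripts/?count=10&search=Smith. When passing such a URL to curl, quote it so the shell does not interpret the & character.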
Authentication and Authorization
OAuth2 tokens are used for authentication - there is currently no support for anonymous or unauthenticated access. Tokens are tied to an eperson (user account) record, and access control lists specify which resources (API Paths) can be accessed and how. Since the application is RESTful, these are specified in terms of HTTP verbs that can be performed on a certain path.
OAuth 2 references
- http://aaronparecki.com/articles/2012/07/29/1/oauth2-simplified
- http://blogs.steeplesoft.com/posts/2013/07/11/a-simple-oauth2-client-and-server-example-part-i/
- https://github.com/hasanozgan/apache-oltu-oauth2-provider-demo/blob/master/src/main/java/com/bilyoner/api/endpoints/TokenEndpoint.java
Tokens
OAuth 2 Tokens are stored in the oauth_token table. Tokens are 32-character strings generated by MD5. The table row includes the token string, the eperson id, and an expiration date.
Token Generation
Currently, tokens must be created and assigned manually. This can be done by generating a random MD5 hash and inserting it into the oauth_token table, together with an expiration date and eperson id:
    $ head /dev/random | md5
    f36def40ade795ede0401c1f74144852
    $ psql dryad_repo
    dryad_repo=# select eperson_id from eperson where email = 'dan.leehr@nescent.org';
     eperson_id
    ------------
           6307
    (1 row)
    dryad_repo=# INSERT INTO oauth_token (eperson_id, token, expires) VALUES (6307,'f36def40ade795ede0401c1f74144852','2014-12-31');
    INSERT 0 1
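The md5 command used above is not available on every system (many Linux distributions ship md5sum instead), and reading /dev/random can block on some kernels. A possible alternative, sketched below, skips MD5 altogether and prints 16 random bytes as 32 hexadecimal characters. This is a swapped-in technique rather than the method shown above, and it assumes the webapp treats the token as an opaque 32-character string. The function name new_token is hypothetical.

```shell
# Hypothetical alternative to `head /dev/random | md5`:
# read 16 bytes from /dev/urandom and print them as 32 lowercase hex characters.
# Assumption: the API only compares the token string and does not care
# how it was generated.
new_token() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  printf '\n'
}
```

The output can be pasted into the INSERT statement in place of the MD5 value.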
Token Usage
Tokens can be supplied by the client in a few ways, according to the OAuth specification. The simplest is to include an access_token query parameter in the URL. This works for both GET and POST requests.
For example:
    https://datadryad.org/api/v1/organizations/?access_token=f36def40ade795ede0401c1f74144852
Tokens can also be included as an HTTP request header:
    Authorization: Bearer f36def40ade795ede0401c1f74144852
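Sending the token in a header keeps it out of URLs, which tend to end up in server logs and shell history. A minimal sketch with cURL follows; the function names bearer_header and dryad_get are hypothetical, and only standard curl options are used.

```shell
# Hypothetical helpers for header-based token usage.

# Print the Authorization header line for a given token.
bearer_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Issue a GET request with the token in the header rather than in the URL.
# Usage: dryad_get TOKEN URL
dryad_get() {
  curl --silent --show-error --header "$(bearer_header "$1")" "$2"
}
```

For example: dryad_get f36def40ade795ede0401c1f74144852 "https://datadryad.org/api/v1/organizations/"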
Authorization & Access Control
A valid OAuth token that corresponds to an eperson account will identify the bearer (authentication), but a token alone does not identify what the bearer is authorized to do.
Resource Access control lists are stored in the rest_resource_authz table. Each table row consists of an eperson ID, an HTTP method (GET/POST/PUT/DELETE), and a resource path (e.g. organizations/org1/manuscripts). These rows determine what a user can access.
Note that granting access to a resource path is recursive - if a user has GET access to organizations, they have read access to every organization and manuscript in the system.
As an example, suppose a partner journal user has eperson id 2860 and we've assigned organization code test. Their resource authorizations should probably be:
    eperson_id | http_method | resource_path
    2860       | POST        | organizations/test/manuscripts
    2860       | PUT         | organizations/test/manuscripts
    2860       | GET         | organizations/test
POST allows creation of new manuscripts within the test organization. PUT allows replacement of manuscripts (though the PUT would actually happen at organizations/test/manuscripts/:id), and GET allows them to read the organization and all of its sub-resources.
Note the user is not granted DELETE, and has no access anywhere outside of organizations/test.
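The recursive rule can be pictured as a prefix match on path segments. The sketch below is only an illustration of the behavior described above, not the webapp's implementation. The function name is_authorized and the pipe-separated row format are hypothetical, and matching on whole path segments (so that organizations/test does not cover organizations/testing) is an assumption.

```shell
# Sketch of the recursive path rule; not the webapp's actual code.
# ACL rows are read from stdin as "eperson_id|http_method|resource_path".
# Usage: is_authorized EPERSON_ID METHOD PATH < acl_rows
is_authorized() {
  local want_id="$1" want_method="$2" want_path="$3"
  local id method path
  while IFS='|' read -r id method path; do
    [ "$id" = "$want_id" ] || continue
    [ "$method" = "$want_method" ] || continue
    # A grant covers the path itself and everything beneath it.
    case "$want_path" in
      "$path"|"$path"/*) return 0 ;;
    esac
  done
  return 1
}
```

With the three rows from the example table as input, a GET or PUT on organizations/test/manuscripts/MS12345 is allowed, while a DELETE on the same path, or a GET on another organization, is not.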
Interaction with other components
These initial APIs are designed to handle exchange of manuscript metadata with partner journals and processing of changes. This job is currently done by the journal-submit webapp, which parses metadata emails from journals and stores them for the submission system.
Journal-Submit webapp
The journal-submit webapp parses emails and generates XML files within a path on the filesystem (typically /opt/dryad/submission/journalMetadata). When authors enter a manuscript in the submission system, Dryad reads their manuscript metadata out of these XML files, based on the entered Manuscript number.
The API webapp has been built to be compatible with this behavior. As manuscripts are created (POST) or updated (PUT), their JSON structure is stored in the manuscript table. Additionally, they are converted to an XML format that is compatible with the existing submission system.
This integration requires the organization code (in the organizations table) and journal code (in the properties file) to match, so that the webapp can determine where to place the XML files.
See
- dspace/modules/dryad-rest-webapp/src/main/java/org/datadryad/rest/handler/ManuscriptXMLConverterHandler.java
- dspace/modules/dryad-rest-webapp/src/main/java/org/datadryad/rest/converters/ManuscriptToLegacyXMLConverter.java
Data Package Metadata
Upon creating (POST) or updating (PUT) a manuscript, the API webapp will attempt to locate a Dryad Data Package (provided the manuscript includes a dryadDataDOI) and synchronize the following metadata:
- Publication DOI
- Manuscript Number
- Manuscript Keywords
- Manuscript Title
- Manuscript Abstract
- Publication Date
The values from the manuscript are copied into the data package metadata if the manuscript is submitted or accepted. If the manuscript is rejected, the values are removed from the metadata.
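The rule amounts to a small decision on the manuscript status. A sketch follows; the status names come from this page, while the function name metadata_sync_action, its output words, and the "none" fallback for any other status are hypothetical.

```shell
# Sketch of the synchronization rule; the real logic lives in
# ManuscriptMetadataSynchronizationHandler.java.
# Prints the action taken on the data package metadata for a given status.
metadata_sync_action() {
  case "$1" in
    submitted|accepted) echo copy ;;
    rejected)           echo remove ;;
    *)                  echo none ;;   # assumption: other statuses change nothing
  esac
}
```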
See
- dspace/modules/dryad-rest-webapp/src/main/java/org/datadryad/rest/handler/ManuscriptMetadataSynchronizationHandler.java
Review workflow
Upon creating (POST) or updating (PUT) a manuscript, the API webapp will check the status (submitted/accepted/rejected) and transform the status into a review action if a corresponding data package (by dryadDataDOI) is currently in review.
The ApproveRejectReviewItem class has been refactored to support this action.
See
- dspace/modules/dryad-rest-webapp/src/main/java/org/datadryad/rest/handler/ManuscriptReviewStatusChangeHandler.java
API usage examples
Below are some examples using cURL to interact with the API. Remember, these require creating tokens and access control in the database:
Listing organizations - GET /api/v1/organizations
    curl --insecure \
      --request GET \
      "https://staging.datadryad.org/api/v1/organizations/?access_token=f36def40ade795ede0401c1f74144852"
Creating an organization - POST a JSON object to /api/v1/organizations.
    curl --insecure \
      --header "Content-Type:application/json" \
      --request POST \
      -d '{"organizationCode":"test","organizationName":"Dryad Test Organization"}' \
      "https://staging.datadryad.org/api/v1/organizations/?access_token=f36def40ade795ede0401c1f74144852"
Creating a manuscript - POST a JSON object to /api/v1/organizations/{organizationCode}/manuscripts. The object must have a manuscriptId, which will be the identifier for future operations.
    curl --insecure \
      --header "Content-Type:application/json" \
      --request POST \
      --data-binary @manuscript.json \
      "https://staging.datadryad.org/api/v1/organizations/test/manuscripts?access_token=f36def40ade795ede0401c1f74144852"
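Because the manuscriptId is required, it can be worth checking the JSON file before sending it. The following is a rough pre-flight sketch; the function name has_manuscript_id is hypothetical, and the check is a pattern match on the text rather than real JSON parsing, so it only catches the obvious omission.

```shell
# Hypothetical pre-flight check: succeed if the file contains a non-empty
# string value for "manuscriptId". This is a text pattern match,
# not a JSON parser.
has_manuscript_id() {
  grep -Eq '"manuscriptId"[[:space:]]*:[[:space:]]*"[^"]+"' "$1"
}
```

For example: has_manuscript_id manuscript.json || echo "manuscript.json has no manuscriptId" >&2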
Updating a manuscript - change the data within the JSON and PUT to /api/v1/organizations/{organizationCode}/manuscripts/{manuscriptId}
    curl --insecure \
      --header "Content-Type:application/json" \
      --request PUT \
      --data-binary @manuscript-revised.json \
      "https://staging.datadryad.org/api/v1/organizations/test/manuscripts/MS12345?access_token=f36def40ade795ede0401c1f74144852"
Example JSON documents are in dspace/modules/dryad-rest-webapp/src/main/resources.
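Listing manuscripts in an organization - GET /api/v1/organizations/{organizationCode}/manuscripts, optionally with the count and search query parameters. The URL is quoted so that the shell does not treat & as a command separator.
    curl --insecure \
      --request GET \
      "https://staging.datadryad.org/api/v1/organizations/test/manuscripts/?access_token=f36def40ade795ede0401c1f74144852&count=10&search=Smith"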