Data Display Widget Technology

Technical documentation for support and extension of the Dryad Display Widget interface

= Overview =

The Dryad Data Display Widget API is an interface that enables content from Dryad to be embedded and viewed on third-party sites.

Example


The Data Display Widget displaying a Python script.

= Usage =

Embedding in a webpage
See Data Display Widget Embedding for a how-to on embedding a widget in a webpage.

File Types
The data display widget currently supports display of the following file types (mime-type):


 * Plain text file (text/plain)
 * Python script (application/x-python)
 * CSV spreadsheet (text/csv)
 * JPEG image (image/jpeg)
 * Portable Network Graphics, PNG (image/png)

As of May 2014, these types represented approximately 44.1% (9927/22492) of the data files deposited at Dryad. The following file formats are intended to be handled by the data display widget in a subsequent release:


 * Microsoft Excel OpenXML spreadsheet (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
 * Microsoft Excel 97-2007 spreadsheet (application/vnd.ms-excel)
 * Zip archive (application/zip)
 * Gzip archive (application/x-gzip)
 * Tape Archive File, TAR (application/x-tar)
 * Unix Tar File Gzipped, TGZ (application/x-gtar)
 * Adobe Portable Document Format, PDF (application/pdf)

Once these file types are handled, the data display widget will be capable of handling approximately 73.3% (16477/22492) of data files deposited at Dryad (as of May 2014).

File Size
Content from Dryad data files is truncated at  for plain-text files and 100 rows for CSV files. Size limits for other content types are forthcoming. When a widget displays a truncated resource, it prompts the user to download the complete data file. Truncation is meant to prevent poor browser performance or malfunction when large or numerous data files are displayed in a single page.
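The row limit for CSV content can be sketched as follows. This is a minimal illustration rather than the widget's server-side implementation: the function name truncateRows and the truncated flag are hypothetical, and only the 100-row limit comes from the text above.

```javascript
// Hypothetical helper: keep at most `maxRows` rows and report whether
// anything was cut, so a UI could prompt the user to download the full file.
function truncateRows(rows, maxRows = 100) {
  const truncated = rows.length > maxRows;
  return { rows: truncated ? rows.slice(0, maxRows) : rows, truncated };
}

// Usage: 250 already-parsed CSV rows are cut to the first 100.
const sampleRows = Array.from({ length: 250 }, (_, i) => [String(i), "value"]);
const truncationResult = truncateRows(sampleRows);
console.log(truncationResult.rows.length, truncationResult.truncated); // prints: 100 true
```

Returning the flag alongside the rows keeps the decision to show a download prompt separate from the truncation itself.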

Unhandled formats
Files deposited in formats unsupported for rendering in the display widget are displayed using a placeholder that directs the user to the Dryad site to download the file. In this case, the "full" widget is displayed, including the controls to view file metadata, cite the data, or download the data.

Plain text
Plain text files are displayed "as-is" in the display widget, up to the maximum file size indicated above. If possible, files are converted to UTF-8 encoding; if not, the file is considered unhandlable, and the user is directed to download the file for viewing locally.
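The fallback described above (display if the content can be treated as UTF-8, otherwise direct the user to download) can be sketched with the standard TextDecoder API. This sketch only covers the validity check, not conversion from other encodings, and the function name decodeUtf8OrNull is hypothetical; the widget's actual conversion happens server-side.

```javascript
// Hypothetical check: return the decoded text if `bytes` is valid UTF-8,
// or null to signal that the file should be treated as unhandlable.
function decodeUtf8OrNull(bytes) {
  try {
    // `fatal: true` makes the decoder throw on malformed input
    // instead of silently substituting replacement characters.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (err) {
    return null;
  }
}

console.log(decodeUtf8OrNull(new Uint8Array([0x68, 0x69]))); // prints: hi
console.log(decodeUtf8OrNull(new Uint8Array([0xff, 0xfe]))); // prints: null
```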

Spreadsheets
Tabular data contained in CSV or binary encoded spreadsheet files (e.g., XLS, XLSX) are displayed using the DataTables Javascript plug-in, which provides a spreadsheet-like interface.

Table column headings are not currently inferred for data displayed as a spreadsheet. In addition, multiple worksheets are not viewable from the widget interface.

Source Code
Source code files deposited at Dryad that are recognized as such, currently Python scripts with mime-type "application/x-python", are displayed in the widget with syntax highlighting provided client-side by the Javascript library highlight.js. Dryad does not confirm that deposited code conforms to the syntax of its language, so syntax highlighting cannot be guaranteed to be applied completely or without misinterpretation in all cases.

Source code files that are not recognized as source code, for example, R scripts and Perl scripts, are stored as "text/plain" in the repository, and are displayed in the widget without syntax highlighting.

Images
Image formats supported natively by browsers, e.g., PNG and JPEG, will be displayed at full resolution, provided that the original file is under the maximum file size threshold indicated above. Otherwise, these files will be downsampled to the extent required for display in the widget. Image formats not supported natively by browsers, e.g., TIFF, will be converted to a format that browsers can handle, with the same file size constraints applied. Original, unmodified versions of images will be available for download through the widget or at the Dryad site.

Versioning
The Widgets API uses versioning within resource paths. This document describes functionality in Version 1, so all of the endpoints begin with

API Endpoints
The display widget API has a single user-exposed endpoint: /widgets/v1/display/{doi}/loader.js. A call to this endpoint returns a single Javascript file, which updates the page with,  , and   elements, as detailed below, to enable viewing of a single Dryad data resource. When a request is made for a non-existent resource, the service returns a Javascript file with no effects.

This call has a single required request parameter,, which must be provided with the   attribute value of the element that is to contain the display widget. See the API Usage Example below for sample code.

This call has an optional request parameter,, to be used for tracking resource requests. This parameter should be passed with an identifier for the journal.
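As a rough sketch, a loader URL for the endpoint above could be assembled as shown here. The helper name buildLoaderUrl, the base URL, the DOI placeholder, and the query parameter name placeholderParam are all hypothetical: the real parameter names are not spelled out in this text, so the sketch takes them from the caller rather than guessing them.

```javascript
// Hypothetical helper. The base URL and the query parameter names are
// supplied by the caller because this document does not state them.
function buildLoaderUrl(baseUrl, doi, queryParams = {}) {
  // The DOI is inserted into the path as given; whether it needs extra
  // encoding is not stated in this document.
  const url = new URL(`/widgets/v1/display/${doi}/loader.js`, baseUrl);
  for (const [name, value] of Object.entries(queryParams)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// "EXAMPLE-DOI" and "placeholderParam" are placeholders, not real values.
console.log(buildLoaderUrl("http://datadryad.org", "EXAMPLE-DOI", { placeholderParam: "ddw-1" }));
```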

Authentication & Access Control
The Dryad widgets provide read-only access to world-readable resources made available through the Dryad site. They do not require end-user authentication.

Resources made available through Dryad are subject to a blackout period set by the depositor.

Requests for Dryad display widgets may be made over HTTP or HTTPS.

API Call Effects
A call to the display widget API imports Javascript code into the host page that has the following effects:


 * modification of the DOM
 * importing of CSS stylesheets

The extent of these actions is documented here.

DOM Manipulation
Once the Dryad data-display widget is fetched, the requested "loader" script inserts a number of elements in the host page's DOM, including:


 * a  element requesting the Display Widget stylesheet
 * a  element requesting jQuery version 1.9 or greater, if not present
 * a  element for the Magnific Popup lightbox plugin
 * an  element requesting the widget UI and data content

Calls between the widget's iframe content (i.e., when the buttons' onclick events are triggered) and the host page are handled by means of Javascript calls to the iframe's  method. As direct communication between content derived from different domains is generally prohibited by the browser's same-origin policy, the use of  is currently considered a safe Javascript practice.
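A host-page handler for such cross-frame calls might look like the sketch below. It assumes the messaging method is the standard window.postMessage API, which this text does not name explicitly; the message shape ({ action: ... }), the action names, and the function name handleWidgetMessage are invented for illustration.

```javascript
// Hypothetical set of actions a widget button might request.
const KNOWN_ACTIONS = new Set(["metadata", "cite", "download"]);

// Hypothetical handler for messages sent from the widget iframe.
// Returns the requested action, or null if the message is not trusted.
function handleWidgetMessage(event, trustedOrigin) {
  // Ignore messages from any origin other than the widget's own.
  if (event.origin !== trustedOrigin) return null;
  const data = event.data;
  if (!data || typeof data.action !== "string") return null;
  return KNOWN_ACTIONS.has(data.action) ? data.action : null;
}

// In a browser this would be wired up roughly as:
//   window.addEventListener("message", (e) => handleWidgetMessage(e, origin));
const handled = handleWidgetMessage(
  { origin: "http://datadryad.org", data: { action: "cite" } },
  "http://datadryad.org"
);
console.log(handled); // prints: cite
```

Checking the origin before reading the payload is what keeps this pattern safe: any page can post a message to a window, so the receiver has to decide which senders it trusts.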

CSS Stylesheets
The content of the Dryad display widget is namespaced under a top-level  class to avoid collisions between CSS selectors in the host page and the widget. The exception is CSS stylesheets related to the Magnific Popup lightbox plugin, which is applied to content on the host page under the  class names.

= Workflow =

Data file handling
Data files deposited at Dryad are heterogeneous in format, and many cannot be displayed effectively in a browser in their deposited file formats. Consequently, the display widget implementation takes advantage of the pipelining capabilities of Cocoon to enable a subset of Dryad content to be viewed as HTML/CSS or as a binary object in the widget.

Source files and classes mentioned here may be found under.

Bitstream XML serialization
The text or binary data contained in a single deposited data resource is referred to in Dryad/DSpace as a bitstream. These resources may be plain-text files viewable in a browser, binary formats viewable natively in a browser (e.g., PNG images), or binary formats not viewable natively in a browser (e.g., XLSX spreadsheets or GZip archives). Additionally, some plain-text formats, notably CSV and other tabular formats, are viewable in a browser but are not viewable optimally as plain-text files.

Bitstream serialization, where possible, begins when a bitstream is requested from the DataONE MN service by, which is called as a Cocoon generator from the widget sitemap file. Based on the mime-type of the requested bitstream (which is returned by the DataONE MN service as the "DataONE-formatId" HTTP header), the bitstream is either retrieved and further processed, if serializable, or not retrieved.

The result of this generation stage is an  document with metadata about the bitstream and possibly data from the bitstream. Once handled in the generation phase of the pipeline, a bitstream is passed to the transformation stage of the pipeline, where a browser-targeted HTML/CSS/JS document is produced and returned in response to the HTTP request.

Serializable formats
If serializable, the bitstream object is passed to a subclass of. The methods of the superclass (which extends ) initialize the   document returned by the generator, add metadata to the   document to be used later in the pipeline, and begin the body of the   to capture content generated by the subclass.

For example, a Python source file will be initialized by  with a document as follows:

 application/x-python http://datadryad.org/mn/object/{...}/bistream 1686

The subclass generator is then invoked to generate the content of the  of the   document. Since a Python source file is plain-text in format, the content handler here is the same as for files with "text/plain".

As an additional example, the generator to handle bitstreams with mime-type, which is  , generates a document that begins:

 Kingdom Higher taxonomic group Order

This XML serialization is then converted to HTML in the later XSLT stage of the pipeline.

Once the subclass generator has finished, the superclass generator closes the  and   elements and control is passed back to the Cocoon pipeline controller.

Unserializable formats
If not serializable, the class  is dispatched to return an   document to be used later in the pipeline that contains metadata about the bitstream, but no content.

HTML serialization
Once an  document has been generated successfully for a bitstream by , the pipeline then dispatches an XSLT transformation stage, the result of which is serialized to   by the Cocoon pipeline controller. The stylesheets are located at, and the stylesheet that handles the generated   to   transformation is. This stylesheet dispatches templates based on the mime-type of the bitstream object: either a default/unhandled template for bitstreams that are not currently supported for display in the widget, or a named template responsible for handling a given content type.

HTML
The markup underlying a widget rendered in a host page has this overall structure:

 ...  ... ...



... <script type="text/javascript"> ... ...

<script id="ddw-js-1" async="true" type="text/javascript" src="...">

where the elements  and   were added to a host page.

The use of  elements is meant to:


 * insulate the widget from the host page to the extent possible, and vice versa
 * allow the non-data regions of the widget to load in advance of the embedded data frame, which may be intensive to generate

CSS
The CSS stylesheets, delivered to the rendered widget as, are developed using the Less CSS preprocessor. The LESS-CSS source files are located at  in the   module, and may be compiled using the shell script.

The "main" LESS-CSS source file is, which is used to namespace the widget CSS selectors under a   class while importing the included widgets sub-stylesheets. Namespacing is meant to insulate the page hosting the display widget from unintended collisions during stylesheet application. The sub-stylesheets are organized as follows:


 * brand.less: color, border, and margin configuration for embedded widget
 * layout.less: placement and dimension of embedded widget components
 * lightbox.less: configuration of overlay widget (lightbox popup)
 * magnific-popup.less: stylesheet for the Magnific-Popup plugin
 * reset.less: browser normalization stylesheet, adapted from Eric Meyer's reset.css
 * type.less: widget typography
 * vars.less: global CSS settings to be used across stylesheets
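The effect of namespacing the widget's selectors under a single top-level class can be illustrated with a small sketch. In the widget itself this is done at compile time by the LESS preprocessor, so the function below is only an illustration of the result; its name and the class name example-widget-ns are hypothetical, and it handles only simple comma-separated selector lists.

```javascript
// Hypothetical illustration of selector namespacing: prefix every selector in
// a comma-separated list with a wrapper class so the rules match only
// elements inside that wrapper.
function namespaceSelectors(selectorList, nsClass) {
  return selectorList
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((s) => `.${nsClass} ${s}`)
    .join(", ");
}

console.log(namespaceSelectors("h1, .title a", "example-widget-ns"));
// prints: .example-widget-ns h1, .example-widget-ns .title a
```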

Javascript
The  file responsible for injecting a display widget into a host page is produced by the xsl stylesheet. The Javascript produced by this stylesheet is responsible for:


 * confirming the availability of a sufficiently recent jQuery library
 * importing the Magnific-Popup jQuery plugin
 * importing the display widget CSS stylesheet into the host page, for styling the widget's outer frame
 * handling the  events for the widget buttons
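The first responsibility in the list above, checking for a sufficiently recent jQuery, comes down to a numeric comparison of version strings. The sketch below is one possible way to do that, not the loader's actual code; the function name isVersionAtLeast is hypothetical, and it assumes purely numeric dotted versions. In a browser, the installed version string is available as jQuery.fn.jquery.

```javascript
// Hypothetical helper: compare dotted version strings component by component.
// Missing components are treated as 0, so "1.9" equals "1.9.0".
function isVersionAtLeast(actual, required) {
  const a = actual.split(".").map(Number);
  const r = required.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, r.length); i++) {
    const x = a[i] || 0;
    const y = r[i] || 0;
    if (x !== y) return x > y;
  }
  return true;
}

console.log(isVersionAtLeast("1.11.0", "1.9")); // prints: true
console.log(isVersionAtLeast("1.8.3", "1.9"));  // prints: false
```

Comparing components as numbers matters here: a plain string comparison would rank "1.11.0" below "1.9".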

In the current Javascript implementation, which extends little JS-driven functionality to the user, there is no use of a Javascript application framework or global widget controller object in the host page. Not relying on a JS framework in the current implementation is meant to provide the following advantages:


 * minimize risk of collision with frameworks used in host page
 * avoid import of large JS framework libraries
 * decrease complexity of the widget's JS implementation

Not using a global singleton controller object in the host page is meant to reduce the complexity of the widget Javascript implementation around synchronization and coordination between multiple widgets in a page. Each widget is a standalone unit as far as the Javascript is concerned. The cost of omitting a global singleton controller is a small amount of repeated Javascript code when more than one widget is embedded in a page.

The data-display frame of a widget may itself import additional Javascript libraries, depending on the content displayed. For example:


 * text/csv: DataTables jQuery plugin
 * application/x-python: highlight.js source code syntax highlighter

These JS resources are fetched asynchronously once the  loads.

Installation and deployment
The  module is a dependency of the   module, and is installed during a Maven build according to configuration in the associated pom.

Adding newly supported mime-types
Adding a new mime-type handler for bitstreams requires updates in both the Dryad Widgets Java code and the XSL stylesheets.

See Widget next steps for a listing of the next mime-types to be implemented.

Cocoon Generator
The class org.dspace.app.xmlui.aspect.dryadwidgets.display.bitstreamHandler.DefaultBitstreamHandler, located at dryad-repo/dspace/modules/dryad-widgets/dryad-widgets-webapp/src/main/java/org/dspace/app/xmlui/aspect/dryadwidgets/display/bitstreamHandler in the  module, may be referred to for the minimal implementation required for developing a new bitstream handler. If the bitstream's content is to be serialized to XML, the  method must be implemented to generate the SAX events that will do so.

Additionally, the mapping of mime-type to handler class name must be registered in the Java class. The map  must have an entry added that maps the mime-type string to the name of the class created to generate content for that mime-type.

Note that certain mime-types, for example, "image/png", are handled by the default bitstream handler, which passes metadata to later stages of the platform without a call to a  method. This approach is appropriate for plain text mime-types and other formats handlable by browsers.

Using existing Cocoon generator classes
The bitstream generators for mime-types "text/plain" and "text/csv" take advantage of existing Cocoon-project generators for serializing this content to XML. Note that these generator implementations make use of a  object to stream the output of the existing generators to the same document as is used for the dryad-widgets pipeline (defined in the widgets.xmap sitemap).

Cocoon Transformer
Once a generator has been added to handle a new bitstream mime-type, a corresponding XSL stylesheet must be added to serialize the generated XML to HTML. The base stylesheet for this transformation is located in the  module at

 dryad-repo/dspace/modules/dryad-widgets/dryad-widgets-webapp/src/main/webapp/static/xsl/widgets/display/dataFileBitstream.xsl

This stylesheet dispatches templates to handle the generated content by mime-type, according to the following table:

 <ddw:templates>
     <ddw:template mime-type="application/x-python">code</ddw:template>
     <ddw:template mime-type="image/png"           >image-native</ddw:template>
     <ddw:template mime-type="image/jpeg"          >image-native</ddw:template>
     <ddw:template mime-type="text/csv"            >text-csv</ddw:template>
     <ddw:template mime-type="text/plain"          >text-plain</ddw:template>
 </ddw:templates>

The stylesheets containing the dispatched templates are located in the  subdirectory.
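The dispatch described by the ddw:templates table amounts to a lookup from mime-type to template name with a fallback for unsupported types. The sketch below mirrors that logic outside of XSLT purely for illustration; the mapping is copied from the table, while the function name templateFor and the fallback name "unhandled" are hypothetical.

```javascript
// Mapping copied from the <ddw:templates> table in the base stylesheet.
const TEMPLATE_BY_MIME_TYPE = new Map([
  ["application/x-python", "code"],
  ["image/png", "image-native"],
  ["image/jpeg", "image-native"],
  ["text/csv", "text-csv"],
  ["text/plain", "text-plain"],
]);

// "unhandled" is a hypothetical name standing in for the default template
// used when a mime-type has no entry in the table.
function templateFor(mimeType) {
  return TEMPLATE_BY_MIME_TYPE.get(mimeType) || "unhandled";
}

console.log(templateFor("text/csv"));        // prints: text-csv
console.log(templateFor("application/zip")); // prints: unhandled
```

Supporting a new mime-type in this model means adding one table entry plus the template it names, which matches the two-step procedure described above.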

Debugging
The dryad-widgets sitemap file has a pipeline for debugging bitstream serialization, which is defined by the pattern:

 <map:match pattern="v1/display/debug/**/bitstream">

This API call requests the XML produced by the  generator class, and may be used as input to an XSL stylesheet under development.

= Relation to DSpace =

The Dryad Display Widget API is implemented as a DSpace module, located under dspace/modules/dryad-widgets in the Dryad source repository. The widget module is declared a dependency of the Dryad/DSpace XMLUI module's project. The widget module has its Cocoon sitemap, which defines the endpoints of the API described here, mounted by the XMLUI sitemap.

While the Data Display Widget runs within the Dryad Tomcat application, it is not coupled with the submission or content management subsystems of the application. Rather than interacting with the DSpace Java API directly, the widget aggregates data primarily by means of server-side HTTP requests to the Dryad DataOne-MN service. Resources are transformed and serialized as defined in the Cocoon sitemap, returning Javascript and HTML/CSS for display by a browser in a webpage. Other widget resources, including Javascript libraries, CSS stylesheets, and images, are served as static assets by the XMLUI module.