Data Display Widget Technology

From Dryad wiki
Status: In final testing

Technical documentation for support and extension of Dryad Display Widget interface

Overview

The Dryad Data Display Widget API is an interface that enables content from Dryad to be embedded and viewed on third-party sites.

Example

Figure: The Data Display Widget displaying a Python script (image file: Display widget preview x-python 2014-12-15.png).

Usage

Embedding in a webpage

See Data Display Widget Embedding for a how-to on embedding a widget in a webpage.

Supported File Types and Size

File Types

The data display widget currently supports display of the following file types (mime-type):

  • Plain text file (text/plain)
  • Python script (application/x-python)
  • CSV spreadsheet (text/csv)
  • JPEG Image (image/jpeg)
  • Portable Network Graphics, PNG (image/png)

As of May 2014, these types represented approximately 44.1% (9927/22492) of the data files deposited at Dryad. A subsequent release is intended to add support for the following file formats:

  • Microsoft Excel OpenXML spreadsheet (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
  • Microsoft Excel 97-2007 spreadsheet (application/vnd.ms-excel)
  • Zip archive (application/zip)
  • Gzip archive (application/x-gzip)
  • Tape Archive File, TAR (application/x-tar)
  • Unix Tar File Gzipped, TGZ (application/x-gtar)
  • Adobe Portable Document Format, PDF (application/pdf)

Once these file types are handled, the data display widget will be capable of displaying approximately 73.3% (16477/22492) of the data files deposited at Dryad (as of May 2014).

File Size

Content from Dryad data files is truncated at XXX MB for plain-text files and at 100 rows for CSV files. Size limits for other content types are forthcoming. When a widget displays a truncated resource, it prompts the user to download the full data file. Truncation is meant to prevent poor browser performance or malfunction when large or numerous data files are displayed in a single page.
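The CSV row limit can be illustrated with a minimal Javascript sketch. The function name, its return shape, and the line-based splitting are assumptions made for illustration; the widget's actual truncation is performed server-side and is not shown on this page.

```javascript
// Hypothetical helper: name, signature, and return shape are illustrative only.
const CSV_ROW_LIMIT = 100; // row limit stated in the text above

function truncateCsvRows(csvText, maxRows = CSV_ROW_LIMIT) {
  // Assumes one record per line, i.e. no quoted fields containing newlines.
  const rows = csvText.split(/\r?\n/);
  // Ignore a single trailing empty line left by a final newline.
  if (rows.length > 0 && rows[rows.length - 1] === "") {
    rows.pop();
  }
  const truncated = rows.length > maxRows;
  return {
    rows: truncated ? rows.slice(0, maxRows) : rows,
    truncated, // when true, the user would be prompted to download the full file
  };
}
```

A caller would render the returned rows and show the download prompt only when the truncated flag is set.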

File Display

Unhandled formats

Files deposited in formats unsupported for rendering in the display widget are displayed using a placeholder that directs the user to the Dryad site to download the file. In this case, the "full" widget is displayed, including the controls to view file metadata, cite the data, or download the data.

Plain text

Plain text files are displayed "as-is" in the display widget, up to the maximum file size indicated above. If possible, files are converted to UTF-8 encoding; if not, the file is considered unhandlable, and the user is directed to download the file for viewing locally.
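As a rough illustration of the "convertible to UTF-8 or unhandlable" decision, the following Javascript sketch checks whether a byte sequence is valid UTF-8 using the standard TextDecoder API. It covers only the validity check and does not attempt transcoding from other encodings; the function name is hypothetical, and the widget's own conversion happens server-side.

```javascript
// Hypothetical helper: returns the decoded text, or null when the bytes are
// not valid UTF-8 (in which case the file would be treated as unhandlable).
function decodeUtf8OrNull(bytes) {
  try {
    // With fatal: true, decode() throws a TypeError on malformed input
    // instead of silently substituting replacement characters.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (err) {
    return null;
  }
}
```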

Spreadsheets

Tabular data contained in CSV or binary encoded spreadsheet files (e.g., XLS, XLSX) are displayed using the DataTables Javascript plug-in, which provides a spreadsheet-like interface.

Table column headings are not currently inferred for data displayed as a spreadsheet. In addition, multiple worksheets are not viewable from the widget interface.

Source Code

Source code files deposited at Dryad that are recognized as such, currently Python scripts with mime-type "application/x-python", are displayed in the widget with syntax highlighting provided client-side by the Javascript library reveal.js. Dryad does not confirm that deposited code conforms to the syntax of its language, so syntax highlighting cannot be assured to be applied completely or without misinterpretation in all cases.

Source code files that are not recognized as source code, for example, R scripts and Perl scripts, are stored as "text/plain" in the repository, and are displayed in the widget without syntax highlighting.

Images

Image formats supported natively by browsers, e.g., PNG and JPEG, will be displayed at full resolution, provided the original file is under the maximum filesize threshold indicated above. Otherwise, these files will be downsampled to the extent required for display in the widget. Image formats not supported natively by browsers, e.g., TIFF, will be converted to a format handlable by a browser, with the same filesize constraints applied. Original, unmodified versions of images will be available for download through the widget or at the Dryad site.

API usage

Versioning

The Widgets API uses versioning within resource paths. This document describes functionality in Version 1, so all of the endpoints begin with /widgets/v1/

API Endpoints

The display widget API has a single user-exposed endpoint:

/widgets/v1/display/{doi}/loader.js

A call to this endpoint returns a single Javascript file, which updates the page with iframe, link, and script elements, as detailed below, to enable viewing of a single Dryad data resource. When a request is made for a nonexistent resource, the service returns a Javascript file with no effects.

This call has a single required request parameter, wrapper, which must be provided with the @id attribute value of the element that is to contain the display widget. See the API Usage Example below for sample code.

This call has an optional request parameter, referrer, to be used for tracking resource requests. This parameter should be passed with an identifier for the journal.
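The request described above can be assembled as in the following Javascript sketch. The function name, the base URL argument, and the example DOI and identifiers are hypothetical. The DOI is inserted into the path verbatim, which is an assumption; this page does not state whether it should be percent-encoded.

```javascript
// Hypothetical helper for building the loader URL with its request parameters.
function buildLoaderUrl(baseUrl, doi, wrapperId, referrer) {
  if (!wrapperId) {
    throw new Error("The 'wrapper' parameter is required");
  }
  const params = new URLSearchParams({ wrapper: wrapperId });
  if (referrer) {
    params.set("referrer", referrer); // optional, used for tracking requests
  }
  const base = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  // Assumption: the DOI is placed in the path as-is, without encoding.
  return `${base}/widgets/v1/display/${doi}/loader.js?${params.toString()}`;
}

// Example with placeholder values (not a real DOI or journal identifier):
const exampleUrl = buildLoaderUrl(
  "http://datadryad.org",
  "doi:10.5061/dryad.example",
  "dryad-widget-wrapper",
  "example-journal"
);
```

The resulting URL would be used as the src attribute of a script element in the host page, with the wrapper value matching the id of the element that is to contain the widget.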

Authentication & Access Control

The Dryad widgets provide read-only access to world-readable resources made available through the Dryad site, and do not require end-user authentication.

Resources made available through Dryad are subject to a blackout period set by the depositor.

Requests for Dryad display widgets may be made over HTTP or HTTPS.

API Call Effects

A call to the display widget API imports Javascript code into the host page that has the following effects:

  • modification of the DOM
  • importing of CSS stylesheets

The extent of these actions is documented below.

DOM Manipulation

Once the Dryad data-display widget is fetched, the requested "loader" script inserts a number of elements in the host page's DOM, including:

  • a link[@rel=stylesheet] element requesting the Display Widget stylesheet
  • a script[type=text/javascript] element requesting jQuery version 1.9 or greater, if not present
  • a script[type=text/javascript] element for the Magnific Popup lightbox plugin
  • an iframe element requesting the widget UI and data content
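The "jQuery version 1.9 or greater, if not present" condition in the list above could be checked along the lines of the following Javascript sketch. The function names are hypothetical and this is not the loader's actual code; the only library detail relied on is that jQuery exposes its version string as jQuery.fn.jquery.

```javascript
// Compare dotted version strings numerically, so that "1.10" > "1.9".
function isVersionAtLeast(version, minimum) {
  const parse = (v) => v.split(".").map((part) => parseInt(part, 10) || 0);
  const a = parse(version);
  const b = parse(minimum);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}

// Hypothetical check: should the loader request its own copy of jQuery?
function needsJQuery(globalScope, minimum = "1.9") {
  const jq = globalScope.jQuery;
  if (!jq || !jq.fn || typeof jq.fn.jquery !== "string") {
    return true; // jQuery is absent from the host page
  }
  return !isVersionAtLeast(jq.fn.jquery, minimum);
}
```

In a browser, the host page's window object would be passed as globalScope.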

Communication between the widget's iframe content and the host page (e.g., when the buttons' onclick events are triggered) is handled by means of Javascript calls to the iframe's window.parent.postMessage method. Because direct script access between content served from different domains is generally prohibited by the browser's same-origin policy, postMessage is currently considered the safe Javascript practice for this kind of cross-document messaging.
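A host-side handler for such messages might look like the following Javascript sketch. The message shape, the "dryad-ddw" source tag, the action names, and the function name are all hypothetical; the allowed origin is assumed from the datadryad.org host that appears elsewhere on this page. The sketch shows the general practice of checking event.origin before acting on a message.

```javascript
// Assumption: widget content is served from this origin.
const WIDGET_ORIGIN = "http://datadryad.org";

// Hypothetical message shape: { source: "dryad-ddw", action: "cite" }
function parseWidgetMessage(event, allowedOrigin = WIDGET_ORIGIN) {
  // Ignore messages posted by any other origin.
  if (event.origin !== allowedOrigin) return null;
  const data = event.data;
  if (!data || typeof data !== "object" || data.source !== "dryad-ddw") {
    return null;
  }
  return typeof data.action === "string" ? data.action : null;
}

// Browser-only wiring, shown as comments because it needs a window object:
//   Host page:     window.addEventListener("message", (event) => {
//                    const action = parseWidgetMessage(event);
//                    if (action) { /* e.g. open the lightbox */ }
//                  });
//   Widget iframe: window.parent.postMessage(
//                    { source: "dryad-ddw", action: "cite" }, hostOrigin);
```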

CSS Stylesheets

The content of the Dryad display widget is namespaced under a top-level dryad-ddw class to avoid collisions between CSS selectors in the host page and the widget. The exception is the CSS rules for the Magnific Popup lightbox plugin, which are applied to content on the host page under the .mfp-* class names.

Workflow

Data file handling

Data files deposited at Dryad are heterogeneous in format, and many cannot be displayed effectively in a browser in their deposited file formats. Consequently, the display widget implementation takes advantage of the pipelining capabilities of Cocoon to enable a subset of Dryad data files to be viewed as HTML/CSS or as a binary object in the widget.

Source files and classes mentioned here may be found under dspace/modules/dryad-widgets.

Bitstream XML serialization

The text or binary data contained in a single deposited data resource is referred to in Dryad/DSpace as a bitstream. These resources may be plain-text files viewable in a browser, binary formats viewable natively in a browser (e.g., PNG images), or binary formats not viewable natively in a browser (e.g., XLSX spreadsheets or GZip archives). Additionally, some plain-text formats, notably CSV and other tabular formats, are viewable in a browser but are not viewable optimally as plain-text files.

Bitstream serialization, where possible, begins when a bitstream is requested from the DataONE MN service by WidgetDisplayBitstreamGenerator, which is called as a Cocoon generator from the widget sitemap file widgets.xmap. Based on the mime-type of the requested bitstream (which is returned by the DataONE-MN service as the "DataONE-formatId" HTTP header), the bitstream is either retrieved and further processed, if serializable, or not retrieved.

The result of this generation stage is an xhtml document with metadata about the bitstream and possibly data from the bitstream. Once handled in the generation phase of the pipeline, a bitstream is passed to the transformation stage of the pipeline, where a browser-targeted HTML/CSS/JS document is produced and returned in response to the HTTP request.

Serializable formats

If serializable, the bitstream object is passed to a subclass of BaseBitstreamHandler. The methods of the superclass (which extends org.apache.cocoon.generation.AbstractGenerator) initialize the xhtml document returned by the generator, add metadata to the xhtml document to be used later in the pipeline, and begin the body of the xhtml to capture content generated by the subclass.

For example, BaseBitstreamHandler initializes the document for a Python source file as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xhtml xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta property="dc.format">application/x-python</meta>
<meta property="dc.source">http://datadryad.org/mn/object/{...}/bitstream</meta>
<meta property="dc.extent">1686</meta>
</head>
<body>

The subclass generator is then invoked to generate the content of the body of the xhtml document. Since a Python source file is plain-text in format, the content handler here is the same as for files with "text/plain".

As an additional example, the generator to handle bitstreams with mime-type text/csv, which is org.dspace.app.xmlui.aspect.dryadwidgets.display.bitstreamHandler.Text.CSV, generates a document that begins:

<csv:document xmlns:csv="http://apache.org/cocoon/csv/1.0">
<csv:record number="1">
<csv:field number="1">Kingdom</csv:field>
<csv:field number="2">Higher taxonomic group</csv:field>
<csv:field number="3">Order</csv:field>

This XML serialization is then converted to HTML in the later XSLT stage of the pipeline.
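To make the shape of this intermediate XML concrete, the following Javascript sketch produces the same element structure from CSV text. It is an illustration only: in the widget this serialization is performed in Java by the Cocoon generator, the function names here are hypothetical, and the sketch assumes simple comma-separated values with no quoted fields.

```javascript
// Escape characters that are significant in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper mirroring the csv:document structure shown above.
function csvToXml(csvText) {
  // Assumes one record per line and no quoted or escaped commas.
  const lines = csvText.split(/\r?\n/).filter((line) => line.length > 0);
  const records = lines.map((line, r) => {
    const fields = line
      .split(",")
      .map((value, f) => `<csv:field number="${f + 1}">${escapeXml(value)}</csv:field>`)
      .join("\n");
    return `<csv:record number="${r + 1}">\n${fields}\n</csv:record>`;
  });
  return [
    '<csv:document xmlns:csv="http://apache.org/cocoon/csv/1.0">',
    ...records,
    "</csv:document>",
  ].join("\n");
}
```

Record and field numbers are 1-based, matching the excerpt above.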

Once the subclass generator has finished, the superclass generator closes the body and xhtml elements and control is passed back to the Cocoon pipeline controller.

Unserializable formats

If not serializable, the class DefaultBitstreamHandler is dispatched to return an xhtml document to be used later in the pipeline that contains metadata about the bitstream, but no content.

HTML serialization

Once an xhtml document has been generated successfully for a bitstream by WidgetDisplayBitstreamGenerator, the pipeline then dispatches an XSLT transformation stage, the result of which is serialized to html by the Cocoon pipeline controller. The stylesheets are located at dryad-widgets/dryad-widgets-webapp/src/main/webapp/static/xsl/widgets/display, and the stylesheet that handles the generated xhtml to html transformation is dataFileBitstream.xsl. This stylesheet dispatches templates based on the mime-type of the bitstream object: either a default/unhandled template for bitstreams that are not currently supported for display in the widget, or a named template responsible for handling a given content type.

Front-end implementation

HTML

The markup underlying a widget rendered in a host page has this overall structure:

<div class="dryad-ddw-outer" id="dryad-ddw-frame1">
<!-- @src: /widgets/v1/display/frame -->
<iframe class="dryad-ddw" src="..." height="100%" width="100%">
<html>
<head> ... </head>
<body id="dryad-ddw-body">
<div class="dryad-ddw">
<div id="ddw-header" class="dryad-ddw-header">
<div class="dryad-ddw-banner"> ... </div>
<div class="dryad-ddw-title"> ... </div>
</div>
<div id="ddw-body" class="dryad-ddw-body" style="height: 425px;">
<div id="ddw-body-frame" class="dryad-ddw-frame" style="height: 100%;">

<!-- @src: /widgets/v1/display/bitstream -->
<iframe width="100%" height="100%" src="..." class="dryad-ddw-data"></iframe>

</div>
</div>
<div id="ddw-footer" class="dryad-ddw-footer"> ... </div>
<script type="text/javascript"> ... </script>
</div>
</body>
</html>
</iframe>
</div>
...
<!-- @src: /widgets/v1/display/{doi}/load.js -->
<script id="ddw-js-1" async="true" type="text/javascript" src="..."></script>

where the elements div#dryad-ddw-frame1 and script#ddw-js-1 were added to a host page.

The use of iframe elements is meant to:

  • insulate the widget from the host page to the extent possible, and vice versa
  • allow the non-data regions of the widget to load in advance of the embedded data frame, which may be intensive to generate

CSS

The CSS stylesheets, delivered to the rendered widget as dryad-ddw.min.css, are developed using the Less CSS preprocessor. The LESS-CSS source files are located at dryad-widgets/dryad-widgets-webapp/src/main/webapp/static/css/widgets/display/less in the dryad-widgets module, and may be compiled using the shell script dryad-ddw.less.sh.

The "main" LESS-CSS source file is dryad-ddw.less, which is used to namespace the widget CSS selectors under a .dryad-ddw class while importing the included widgets sub-stylesheets. Namespacing is meant to insulate the page hosting the display widget from unintended collisions during stylesheet application. The sub-stylesheets are organized as follows:

  • brand.less: color, border, and margin configuration for embedded widget
  • layout.less: placement and dimension of embedded widget components
  • lightbox.less: configuration of overlay widget (lightbox popup)
  • magnific-popup.less: stylesheet for the Magnific-Popup plugin
  • reset.less: browser normalization stylesheet, adapted from Eric Meyer's reset.css
  • type.less: widget typography
  • vars.less: global CSS settings to be used across stylesheets

Javascript

The widgets/v1/display/load.js file responsible for injecting a display widget into a host page is produced by the xsl stylesheet dataFileLoader.xsl. The Javascript produced by this stylesheet is responsible for:

  • confirming the availability of a sufficiently recent jQuery library
  • importing the Magnific-Popup jQuery plugin
  • importing the display widget CSS stylesheet into the host page, for styling the widget's outer frame
  • handling the onclick events for the widget buttons

In the current Javascript implementation, which extends little JS-driven functionality to the user, there is no use of a Javascript application framework or global widget controller object in the host page. Not relying on a JS framework in the current implementation is meant to provide the following advantages:

  • minimize risk of collision with frameworks used in host page
  • avoid import of large JS framework libraries
  • decrease complexity of the widget's JS implementation

Not using a global singleton controller object in the host page is meant to reduce the complexity of the widget Javascript implementation around synchronization and coordination between multiple widgets in a page. Each widget is a standalone unit as far as the Javascript is concerned. The cost of omitting a global singleton controller is a small amount of Javascript code repetition when more than one widget is embedded in a page.

The data-display frame of a widget may itself import additional Javascript libraries, depending on the content displayed.

These JS resources are fetched asynchronously once the iframe loads.

Installation and deployment

The dryad-widgets module is a dependency of the xmlui module, and is installed during a Maven build according to configuration in the associated pom.

Adding newly supported mime-types

Adding a new mime-type handler for bitstreams requires updates in both the Dryad Widgets Java code and the XSL stylesheets.

See Widget next steps for a listing of the next mime-types to be implemented.

Cocoon Generator

The class

org.dspace.app.xmlui.aspect.dryadwidgets.display.bitstreamHandler.DefaultBitstreamHandler

located at

dryad-repo/dspace/modules/dryad-widgets/dryad-widgets-webapp/src/main/java/org/dspace/app/xmlui/aspect/dryadwidgets/display/bitstreamHandler

in the dryad-widgets module may be referred to for the minimal implementation required for developing a new bitstream handler. If the bitstream's content is to be serialized to XML, the generate() method must be implemented to generate the SAX events that will do so.

Additionally, the mapping of mime-type to handler classname must be registered in the Java class org.dspace.app.xmlui.aspect.dryadwidgets.display.WidgetDisplayBitstreamGenerator. The map private final static HashMap<String,String> bitstreamHandlerClasses must have an entry added that maps the mime-type string to the name of the handler class created for that mime-type.

Note that certain mime-types, for example, "image/png", are handled by the default bitstream handler, which passes metadata to later stages of the platform without a call to a generate() method. This approach is appropriate for plain text mime-types and other formats handlable by browsers.

Using existing Cocoon generator classes

The bitstream generators for mime-types "text/plain" and "text/csv" take advantage of existing Cocoon-project generators for serializing this content to XML. Note that these generator implementations make use of an XMLConsumer object to stream the output of the existing generators to the same document as is used for the dryad-widgets pipeline (defined in the widgets.xmap sitemap).

Cocoon Transformer

Once a generator has been added to handle a new bitstream mime-type, a corresponding XSL stylesheet must be added to serialize the generated XML to HTML. The base stylesheet for this transformation is located in the dryad-widgets module at

dryad-repo/dspace/modules/dryad-widgets/dryad-widgets-webapp/src/main/webapp/static/xsl/widgets/display/dataFileBitstream.xsl

This stylesheet dispatches templates to handle the generated content by mime-type, according to the following table:

<ddw:templates>
<ddw:template mime-type="application/x-python">code</ddw:template>
<ddw:template mime-type="image/png"           >image-native</ddw:template>
<ddw:template mime-type="image/jpeg"          >image-native</ddw:template>
<ddw:template mime-type="text/csv"            >text-csv</ddw:template>
<ddw:template mime-type="text/plain"          >text-plain</ddw:template>
</ddw:templates>

The stylesheets containing the dispatched templates are located in the handlers subdirectory.
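The dispatch performed by this table amounts to a lookup with a fallback, sketched below in Javascript for illustration. The real dispatch is done in XSLT by dataFileBitstream.xsl; the function name, the normalization of the mime-type string, and the "unhandled" fallback name are assumptions, since this page does not name the default template.

```javascript
// Template names copied from the ddw:templates table above.
const TEMPLATES = new Map([
  ["application/x-python", "code"],
  ["image/png", "image-native"],
  ["image/jpeg", "image-native"],
  ["text/csv", "text-csv"],
  ["text/plain", "text-plain"],
]);

// Hypothetical name for the default/unhandled template.
const UNHANDLED_TEMPLATE = "unhandled";

function templateFor(mimeType) {
  // Assumption: parameters such as "; charset=utf-8" are ignored for dispatch.
  const key = String(mimeType).split(";")[0].trim().toLowerCase();
  return TEMPLATES.get(key) || UNHANDLED_TEMPLATE;
}
```

Supporting a new mime-type would then correspond to adding one entry to the table and providing the template it names.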

Debugging

The dryad-widgets sitemap file has a pipeline for debugging bitstream serialization, which is defined by the pattern:

<map:match pattern="v1/display/debug/**/bitstream">

This API call requests the XML produced by the WidgetDisplayBitstreamGenerator generator class, and may be used as input to an XSL stylesheet under development.

Relation to DSpace

The Dryad Display Widget API is implemented as a DSpace module, located under dspace/modules/dryad-widgets in the Dryad source repository. The widget module is declared a dependency of the Dryad/DSpace XMLUI module's project. The widget module has its Cocoon sitemap, which defines the endpoints of the API described here, mounted by the XMLUI sitemap.

While the Data Display Widget runs within the Dryad Tomcat application, it is not coupled with the submission or content management subsystems of the application. Rather than interacting with the DSpace Java API directly, the widget aggregates data primarily by means of server-side HTTP requests to the Dryad DataOne-MN service. Resources are transformed and serialized as defined in the Cocoon sitemap, returning Javascript and HTML/CSS for display by a browser in a webpage. Other widget resources, including Javascript libraries, CSS stylesheets, and images, are served as static assets by the XMLUI module.