Tabbed Searching Technology
Dryad harvests records from other scientific data repositories (like KNB and TreeBASE) via OAI-PMH. (See description of the Harvesting process.) Harvested data is put into its own DSpace collection (one collection per harvested resource). We want to provide a search across all these collections, but display them differently than we would just another search facet.
We want to present each data collection in a separate search tab. The Dryad tab should be the default, but we want to give the user the option to view the results from these other collections. We want to display a hit count so the user will know, without clicking on the tab, how many resources are available from the other collection. This requires some customization of the DSpace Discovery module.
The additional functionality that we want to provide is, at least at this point, unique to Dryad. We therefore wanted to make as few changes to the underlying Discovery module as possible, preferring instead to provide a thin layer over the Discovery module that treats our harvested collections differently from other collections (and from other search facets in general).
When a user searches Dryad, the Discovery module is used to query a Solr index and return results. This takes place through a mix of Solr and Discovery-specific search and display syntax. The tabbed searching that sits on top of Discovery works directly with Solr, bypassing the Discovery module, with one exception: it must take the Discovery query that is embedded in the page's metadata and convert any Discovery-specific syntax into the corresponding Solr syntax.
This translation is done so that Solr can be queried directly to get the number of hits for the same search performed against a harvested collection. Parsing the Discovery query syntax is also important because the tabbed search layer must create a URL in the Discovery syntax, so that a user who clicks on a search tab performs that search in Dryad (using the Discovery module). The process works like this:
- User queries Dryad
- Discovery module handles query, returns results, and puts query URL into the page's metadata
- Dryad XSLT processes page metadata and uses the functions in the DryadSearch.xsl file to parse out the elements of the query (deduping when necessary, changing syntax to pure Solr syntax, etc.)
- Dryad XSLT puts the cleaned query URL into a class attribute for each tab that will be displayed in HTML
- Dryad XSLT creates a link back into Dryad with the parameters required to query that harvested collection using the Discovery module (each collection has a collection ID within DSpace that can be passed in as a location parameter to narrow a search to a collection)
- The user's browser loads the HTML that is created by the Dryad XSLT
- Solr is queried with the cleaned query URL; instead of returning lots of XML results, it returns just the wrapper information for the search, which includes the total number of results found (this is controlled by setting the Solr search parameter rows to 0)
- The process ends with each tab containing the number of hits a search would retrieve and a URL for performing that search in Dryad; the user can now click on a tab to display its results in the main Dryad interface
- The toggling of which tab is active takes place in the Dryad XSLT and is based on the collection IDs that are passed as location parameters in the URL query; these are currently hard-coded values in the XSLT, so any additional Dryad instances need to ensure that they use the same collection IDs for their collections (which should happen automatically when a database is copied from one instance to another)
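The hit-count step in the list above can be illustrated with a minimal Python sketch. It is not the Dryad implementation (which lives in XSLT and the page itself); it only shows how a Solr query with rows set to 0 yields a count without any result documents. The endpoint URL, the filter field name location.coll, and the sample response are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical Solr endpoint; the real host and core are not given on this page.
SOLR_SELECT = "http://localhost:8983/solr/search/select"


def hit_count_url(query, collection_filter):
    """Build a Solr URL that asks only for a hit count (rows=0)."""
    params = [("q", query), ("fq", collection_filter), ("rows", "0")]
    return SOLR_SELECT + "?" + urlencode(params)


def parse_hit_count(response_xml):
    """Read the numFound attribute from Solr's XML response wrapper."""
    root = ET.fromstring(response_xml)
    result = root.find("result")
    return int(result.get("numFound"))


# Illustrative response: with rows=0 the <result> element has no documents,
# but still carries the total in numFound.
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="42" start="0"/>
</response>"""

if __name__ == "__main__":
    # "location.coll:2" is an assumed filter for one harvested collection.
    print(hit_count_url("lizard", "location.coll:2"))
    print(parse_hit_count(SAMPLE_RESPONSE))
```

Because only the wrapper is returned, the request stays cheap even when the harvested collection matches many records.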
Configuration is hard-coded in the XSLT for now. It consists of associating an externally harvested collection with a particular DSpace collection ID, which is then used in DryadSearch.xsl.
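As a rough illustration of this configuration, the Python sketch below maps each tab to a collection ID, builds a tab link that carries the ID as a location parameter, and picks the active tab from that parameter. The collection IDs, the base URL, and the query parameter name are hypothetical placeholders; the real values are hard-coded in the Dryad XSLT and are not listed here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collection IDs, standing in for the values hard-coded in the XSLT.
TABS = {
    "Dryad": "123456789/1",
    "KNB": "123456789/2",
    "TreeBASE": "123456789/3",
}
DEFAULT_TAB = "Dryad"


def tab_link(base_url, query, tab):
    """Build a link that reruns the query in Dryad, narrowed to one collection."""
    # "query" is an assumed parameter name; "location" follows the text above.
    return base_url + "?" + urlencode({"query": query, "location": TABS[tab]})


def active_tab(url):
    """Choose the active tab from the location parameter, defaulting to Dryad."""
    location = parse_qs(urlsplit(url).query).get("location", [None])[0]
    for name, collection_id in TABS.items():
        if collection_id == location:
            return name
    return DEFAULT_TAB


if __name__ == "__main__":
    link = tab_link("http://example.org/discover", "lizard", "KNB")
    print(link)
    print(active_tab(link))
```

Keeping the mapping in one place makes it clear why every Dryad instance must share the same collection IDs: a mismatch would leave the wrong tab active, or none at all.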
Relation to DSpace
The Dryad tabbed search functionality is related to, and uses, the DSpace Discovery module. The version of DSpace currently used by Dryad is 1.6.2. The version of Discovery that Dryad uses is tagged version 0.9.4.
Discovery is going to be more completely integrated into the DSpace core going forward (though it will remain a distinct module as DSpace moves towards structuring the entire codebase as smaller, interrelated modules).