Search System Technology

From Dryad wiki

Latest revision as of 13:12, 1 March 2017

The Dryad search system is based on the DSpace Discovery system.

SOLR indexes

  • authority -- terms for autocompletion using controlled vocabularies, including HIVE
  • dataoneMNlog -- log of accesses through the DataONE API
  • dryad -- local storage of DOIs (this index needs to be renamed)
  • search -- primary search index
  • statistics -- log of accesses to item pages and bitstream downloads

Configuration

  • dspace.cfg -- specifies the URLs for the various Solr indexes and the fields that are used within the search system
  • dspace-solr-search.cfg -- specifies the parameters that are added to queries, both automatic queries and user-generated queries
  • solr/(index name)/conf -- specifies the fields stored in an index, and how those fields are processed

Maintenance

Primary maintenance of the index is performed with:

/opt/dryad/bin/dspace update-discovery-index

Options available:

  • -i <internal_item_id>
    • re-index a single item
  • -f
    • force every item to be re-indexed, even if it is up to date
  • -b
    • rebuild by dropping the index and reindexing everything
  • -r <handle>
    • remove an item from the index
  • -o
    • optimize the Solr indexes on disk
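
The options above can be assembled programmatically, for example from a maintenance script. The following is a minimal sketch, not part of Dryad or DSpace: the function name build_update_command and the item id 12345 are hypothetical, and whether several options may be combined in one run is an assumption that should be checked against the installed command. The sketch only builds and prints the command line; it does not run anything.

```python
import shlex

# Path taken from the maintenance command shown above.
DSPACE_BIN = "/opt/dryad/bin/dspace"


def build_update_command(item_id=None, force=False, rebuild=False,
                         remove_handle=None, optimize=False):
    """Return the argument list for one update-discovery-index run.

    Hypothetical helper: it maps keyword arguments onto the documented
    flags (-i, -f, -b, -r, -o) and does not execute the command.
    """
    argv = [DSPACE_BIN, "update-discovery-index"]
    if item_id is not None:
        argv += ["-i", str(item_id)]   # re-index a single item
    if force:
        argv.append("-f")              # re-index even up-to-date items
    if rebuild:
        argv.append("-b")              # drop the index and rebuild it
    if remove_handle is not None:
        argv += ["-r", remove_handle]  # remove one item from the index
    if optimize:
        argv.append("-o")              # optimize the indexes on disk
    return argv


if __name__ == "__main__":
    # 12345 is a made-up internal item id, used only for illustration.
    print(shlex.join(build_update_command(item_id=12345)))
    print(shlex.join(build_update_command(force=True)))
```

Printing the command first makes it easy to review a destructive run (such as -b) before passing the list to subprocess.run.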

For convenience, the dryad-utils package has a script that will update the index for items that were archived during a particular date range.

dryad-utils/reindex-discovery.py --date_from 2017-01-01 --date_to 2017-01-31
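
To reindex a longer period in smaller batches, the date range can be split by month. The following is a minimal sketch under stated assumptions: the function monthly_commands is hypothetical, and it assumes that --date_to is inclusive, as the January example above suggests. It only generates the command lines; running them is left to the operator.

```python
import calendar
import shlex

# Script path as shown in the example above.
SCRIPT = "dryad-utils/reindex-discovery.py"


def monthly_commands(year):
    """Return one reindex command (as an argument list) per month of `year`.

    Hypothetical helper; assumes --date_to is an inclusive end date.
    """
    commands = []
    for month in range(1, 13):
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        date_from = f"{year:04d}-{month:02d}-01"
        date_to = f"{year:04d}-{month:02d}-{last_day:02d}"
        commands.append(
            [SCRIPT, "--date_from", date_from, "--date_to", date_to]
        )
    return commands


if __name__ == "__main__":
    for argv in monthly_commands(2017):
        print(shlex.join(argv))
```

The first generated line reproduces the January 2017 command shown above.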