Difference between revisions of "Search System Technology"
From Dryad wiki
Ryan Scherle (talk | contribs) (→SOLR indexes) |
Ryan Scherle (talk | contribs) (→Maintenance) |
Latest revision as of 13:12, 1 March 2017
The Dryad search system is based on the DSpace Discovery system.
SOLR indexes
- authority -- terms for autocompletion using controlled vocabularies, including HIVE
- dataoneMNlog -- log of accesses through the DataONE API
- dryad -- local storage of DOIs (this index needs to be renamed)
- search -- primary search index
- statistics -- log of accesses to item pages and bitstream downloads
Configuration
- dspace.cfg -- specifies the URLs for the various solr indexes and the fields that are used within the search system
- dspace-solr-search.cfg -- specifies the parameters that are added to queries, both automatic queries and user-generated queries
- solr/(index name)/conf -- specifies the fields stored in an index, and how those fields are processed
Maintenance
Primary maintenance of the index is performed with:
/opt/dryad/bin/dspace update-discovery-index
Options available:
- -i <internal_item_id>
  - re-index a single item
- -f
  - force every item to be re-indexed, even if it is up to date
- -b
  - rebuild by dropping the index and reindexing everything
- -r <handle>
  - remove an item from the index
- -o
  - optimize the solr indexes on disk
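When the command is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of DSpace or dryad-utils: the function name `build_index_command` and its parameters are hypothetical, the item id in the usage line is a placeholder, and whether the options may be combined in a single call is an assumption that should be checked against the installed version.

```python
import shlex
from typing import List, Optional

# Path taken from the command shown above.
DSPACE_BIN = "/opt/dryad/bin/dspace"


def build_index_command(
    item_id: Optional[int] = None,
    force: bool = False,
    rebuild: bool = False,
    remove_handle: Optional[str] = None,
    optimize: bool = False,
) -> List[str]:
    """Return the argv list for one update-discovery-index call.

    Hypothetical helper: it only maps keyword arguments onto the
    options listed above and does not run anything.
    """
    cmd = [DSPACE_BIN, "update-discovery-index"]
    if item_id is not None:
        cmd += ["-i", str(item_id)]      # re-index a single item
    if force:
        cmd.append("-f")                 # re-index even up-to-date items
    if rebuild:
        cmd.append("-b")                 # drop the index and rebuild it
    if remove_handle is not None:
        cmd += ["-r", remove_handle]     # remove one item by handle
    if optimize:
        cmd.append("-o")                 # optimize the indexes on disk
    return cmd


if __name__ == "__main__":
    # 12345 is a placeholder internal item id, not a real record.
    print(shlex.join(build_index_command(item_id=12345)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on the server; keeping the builder separate from execution makes it easy to log or review a command before running a destructive option such as -b.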
For convenience, the dryad-utils package has a script that will update the index for items that were archived during a particular date range.
dryad-utils/reindex-discovery.py --date_from 2017-01-01 --date_to 2017-01-31
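To re-index a longer period in smaller batches, the date-range script can be called once per month. The sketch below only generates the command lines; the helper names `month_ranges` and `reindex_command` are hypothetical, and it assumes that --date_to is inclusive and that the script accepts ISO dates as in the example above.

```python
import calendar
from datetime import date
from typing import Iterator, List, Tuple

# Script path as written in the example above.
REINDEX_SCRIPT = "dryad-utils/reindex-discovery.py"


def month_ranges(year: int) -> Iterator[Tuple[date, date]]:
    """Yield (first_day, last_day) for each month of the given year."""
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]
        yield date(year, month, 1), date(year, month, last_day)


def reindex_command(start: date, end: date) -> List[str]:
    """Return the argv list for one date-range re-index call."""
    return [
        REINDEX_SCRIPT,
        "--date_from", start.isoformat(),
        "--date_to", end.isoformat(),
    ]


if __name__ == "__main__":
    # Print one command per month of 2017 without running anything.
    for start, end in month_ranges(2017):
        print(" ".join(reindex_command(start, end)))
```

Printing the commands first gives a chance to review the ranges before executing them, for example by piping the output into a shell once it looks right.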