Batch Metadata Editing

From Dryad wiki
  • Log in to the Dryad server (via SSH)
  • Export the metadata (you can also export just a single item or collection)
 bin/dspace metadata-export -f myExport.csv
  • Copy the metadata to a local machine
  • Edit the exported metadata in a spreadsheet (MS Excel 2011 does not work! See Notes and Caveats below)
    1. In the Data tab, select "From Text"
    2. Ensure the "File origin" is set to UTF-8
    3. Choose comma as the delimiter
    4. Set all columns to import as text so that Excel does not change the date format
    5. Note: You can move an item to a new collection just by changing the collection field.
    6. Note: You can map an item into multiple collections by adding "||collection2ID" to the collection value.
    7. Note: If you are deleting all the values in a column, leave the column header in place. Otherwise, the batch import will assume you do not want to change that column.
    8. Save the file as CSV
  • Copy the edited file back to the Dryad server
  • Log in to the appropriate Dryad server (via SSH)
  • Import the metadata
 bin/dspace metadata-import -f myExport.csv
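
The collection mapping described in the notes above can also be scripted instead of done in a spreadsheet, which avoids the spreadsheet encoding problems altogether. The following is only a minimal Python sketch, not part of the Dryad procedure. It assumes the exported CSV has "id" and "collection" columns; the function names map_into_collection and edit_csv, and the IDs in the usage note, are hypothetical.

```python
import csv


def map_into_collection(rows, item_ids, extra_collection):
    """Append '||<extra_collection>' to the collection value of selected rows.

    rows: list of dicts as produced by csv.DictReader.
    item_ids: set of values from the 'id' column to modify (assumed column name).
    Returns a new list; the input rows are copied, not modified in place.
    """
    updated = []
    for row in rows:
        row = dict(row)
        if row["id"] in item_ids:
            current = row["collection"].split("||") if row["collection"] else []
            if extra_collection not in current:  # avoid mapping twice
                current.append(extra_collection)
            row["collection"] = "||".join(current)
        updated.append(row)
    return updated


def edit_csv(in_path, out_path, item_ids, extra_collection):
    """Read in_path, map the chosen items, and write the result to out_path."""
    # Read and write explicitly as UTF-8 so accented characters survive.
    with open(in_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames  # keep every column header as exported
        rows = list(reader)
    rows = map_into_collection(rows, item_ids, extra_collection)
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

For example, edit_csv("myExport.csv", "myExport_edited.csv", {"101"}, "20") would add collection 20 to the item whose id is 101 and leave every other row and column untouched. Check the output file before importing it.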

Notes and Caveats

  1. As a general rule of thumb, break the items up into groups of 1000 for importing; larger batches can cause errors.
    1. If you are using the GUI, there is a limit (set in dspace.cfg) on the number of items processed at a time.
  2. MS Excel 2011 does not work! It will mangle accented characters, regardless of the encoding used when the CSV is imported. Possible alternatives:
    1. Other versions of Excel. The process seemed to work using Excel 2008.
    2. OpenOffice 3.3 for Mac definitely works.
  3. If you are deleting all the values in a column, leave the column header in place. Otherwise, the batch import will assume you do not want to change that column.
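
The rule of thumb about groups of 1000 can be applied by splitting the edited CSV into smaller files before importing. The following is a minimal Python sketch under stated assumptions, not a Dryad or DSpace tool: the function names split_rows and split_csv and the output naming scheme are hypothetical. It uses Python's csv module rather than a line-based split so that quoted values containing line breaks stay intact, and it repeats the full header row in every output file so that blanked columns are still treated as changes.

```python
import csv


def split_rows(rows, batch_size=1000):
    """Yield consecutive lists of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def split_csv(in_path, out_prefix, batch_size=1000):
    """Write in_path as <out_prefix>_001.csv, <out_prefix>_002.csv, ...

    Every output file starts with the original header row.
    Returns the list of paths written.
    """
    # Read explicitly as UTF-8 so accented characters are preserved.
    with open(in_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)  # first row holds the column names
        rows = list(reader)
    paths = []
    for number, batch in enumerate(split_rows(rows, batch_size), start=1):
        path = f"{out_prefix}_{number:03d}.csv"
        with open(path, "w", newline="", encoding="utf-8") as dst:
            writer = csv.writer(dst)
            writer.writerow(header)
            writer.writerows(batch)
        paths.append(path)
    return paths
```

For example, split_csv("myExport.csv", "myExport") on a file with 2500 items would write myExport_001.csv, myExport_002.csv and myExport_003.csv, each of which can then be passed to bin/dspace metadata-import -f in turn.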