Old:Monthly Reports

From Dryad wiki
Revision as of 22:40, 30 April 2017 by Debra (talk | contribs) (Monthly Shopping Cart Report)

Monthly Update of Sitemaps

To update the sitemaps used for CLOCKSS harvesting, log in to the production server and run:

sudo /opt/dryad/bin/generate-sitemaps

This process should be automated, but it produced incorrect output when run from a cron job, so for now we run it manually.

Monthly Updates of Dryad-Data

First, get all of the deposit dates. On the production server:

postgres-client.sh -c "select text_value from metadatavalue where metadata_field_id=12 and item_id in (select item_id from item where owning_collection=2 and in_archive='t');" > dryadSubmitDates.txt

(collection 2 is "Data Packages" and metadata field 12 is "Date Available")

Strip the timestamps from the file, leaving only the date portion. With GNU sed, the -i flag applies the edit to the file in place (without it, sed only prints the result to standard output):

sed -i 's/T.*//' dryadSubmitDates.txt

Sort the dates.
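
The strip-and-sort steps can also be combined into one pass. The following is a minimal sketch, not the script used in production: the function name extract_dates is hypothetical, and it assumes each value begins with a YYYY-MM-DD date, possibly preceded by spaces. Lines that do not start with a date (for example a column header or row-count footer, if the query output contains them) are dropped.

```shell
# Hypothetical helper: print only the YYYY-MM-DD part of each line, sorted.
# Assumes values look like 2017-04-30T22:40:00Z, optionally indented.
extract_dates() {
    # $1: path to the raw query output (e.g. dryadSubmitDates.txt)
    # -n with the p flag prints only lines where the substitution matched,
    # so lines that do not begin with a date are silently skipped.
    sed -n 's/^ *\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\).*/\1/p' "$1" | sort
}

# Usage (writes to a temporary file first, then replaces the original):
#   extract_dates dryadSubmitDates.txt > dryadSubmitDates.tmp
#   mv dryadSubmitDates.tmp dryadSubmitDates.txt
```

Writing to a temporary file first avoids truncating the input while it is still being read.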

In the dryad-data GitHub repository:

  • Add the updated dryadSubmitDates.txt
  • Count the number of new packages for the given month (grep for the year-month combination and pipe to wc)
  • Manually edit the dataPackagesInDryad.csv file to include the new month.
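
The counting step above can be sketched as a small function. The name count_month is hypothetical, and the sketch assumes the dates file has one date per line with the timestamps already removed. It stops at printing the count, because the layout of dataPackagesInDryad.csv is not described here.

```shell
# Hypothetical helper: count the dates that fall in a given month.
count_month() {
    # $1: year-month combination, e.g. 2017-04
    # $2: path to the dates file, e.g. dryadSubmitDates.txt
    # grep -F treats the year-month as a fixed string; tr removes the
    # padding that some wc implementations add around the number.
    grep -F -- "$1" "$2" | wc -l | tr -d ' '
}

# Usage:
#   count_month 2017-04 dryadSubmitDates.txt
```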

Monthly Shopping Cart Report

This report is automated. The Python script is located at https://github.com/datadryad/dryad-data/tree/master/monthlyReporting/monthlyReport.py.

On the production server, export the CSVs using these commands:

cd dryad-data/monthlyReporting
git pull


  1. Create a new blank spreadsheet in Google Drive (https://drive.google.com/open?id=0BzWuiqo-UGbqTUFlck1PamMxcDg).
  2. Import the report CSV file (named monthlyReport followed by the date in YYYYMMDD format, for example monthlyReport20170101.csv) and verify that the column headers and content line up as expected.
  3. Select the entire spreadsheet. Sort by journal, then by archive_date.
  4. Share the spreadsheet with the appropriate people.
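
Before importing, it can help to confirm which report file is newest and what its column headers are. The following is a rough sketch under stated assumptions: the function names latest_report and report_columns are hypothetical, the reports are assumed to follow the monthlyReportYYYYMMDD.csv naming described above, and the header row is assumed to contain no quoted commas.

```shell
# Hypothetical helper: print the path of the most recent report in a directory.
# YYYYMMDD dates sort chronologically when sorted as plain text.
latest_report() {
    # $1: directory containing the exported CSV files
    ls "$1"/monthlyReport[0-9]*.csv 2>/dev/null | sort | tail -n 1
}

# Hypothetical helper: list the column headers of a CSV, one per line.
# Assumes a simple header row without quoted or embedded commas.
report_columns() {
    # $1: path to a report CSV
    head -n 1 "$1" | tr ',' '\n'
}

# Usage:
#   report_columns "$(latest_report .)"
```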

See Also