Old:Monthly Reports

From Dryad wiki
Revision as of 09:36, 18 April 2016 by Ryan Scherle (talk | contribs) (Monthly Updates of Dryad-Data)

Monthly Update of Sitemaps

To update the sitemaps used for CLOCKSS harvesting, log in to the production server and run:

sudo /opt/dryad/bin/generate-sitemaps

This process should be automated, but we had a problem with it producing incorrect output when running from a cron job, so we currently run it manually.

Monthly Updates of Dryad-Data

First, get all of the deposit dates. On the production server:

postgres-client.sh -c "select text_value from metadatavalue where metadata_field_id=12 and item_id in (select item_id from item where owning_collection=2 and in_archive='t');" > dryadSubmitDates.txt

(collection 2 is "Data Packages" and metadata field 12 is "Date Available")

Edit the file to remove the timestamps, leaving only the date portion. The sed command below does this in place (the -i option, as in GNU sed); without -i, sed only prints the result and leaves the file unchanged:

sed -i 's/T.*//' dryadSubmitDates.txt

Sort the dates.
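
The strip-and-sort steps above can also be combined into one helper. This is a minimal sketch, not part of the existing tooling: the function name clean_dates is hypothetical, and it assumes each line of the file holds a single ISO 8601 timestamp such as 2016-03-01T12:00:00Z.

```shell
# Hypothetical helper (not part of the Dryad scripts): strip timestamps
# and sort the dates, rewriting the file given as the first argument.
clean_dates() {
    file="$1"
    # Drop everything from the "T" separator onward, then sort.
    # ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain text.
    sed 's/T.*//' "$file" | sort > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Usage: clean_dates dryadSubmitDates.txt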

In the dryad-data GitHub repository:

  • Add the updated dryadSubmitDates.txt
  • Count the number of new packages for the given month (grep for the year-month combination and pipe to wc)
  • Manually edit the dataPackagesInDryad.csv file to include the new month.
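
The counting step in the list above could be sketched as follows. The function name count_month is hypothetical, and the sketch assumes dryadSubmitDates.txt already contains one date-only value (YYYY-MM-DD) per line.

```shell
# Hypothetical helper: count the packages deposited in a given month.
# Usage: count_month YYYY-MM [file]
count_month() {
    month="$1"
    file="${2:-dryadSubmitDates.txt}"
    # Select the lines for the requested year-month and count them.
    grep -- "$month" "$file" | wc -l
}
```

For example, count_month 2016-03 prints the number of packages whose deposit date falls in March 2016; that number is the value to add to dataPackagesInDryad.csv for the new month.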

Monthly Shopping Cart Report

This report is partially automated. The current script is in https://github.com/datadryad/dryad-data/tree/master/monthlyReporting/monthly-report.sh, but it is incomplete.

On production, export the CSVs using the following commands:

cd dryad-data/monthlyReporting
git pull


  1. Import the filtered file (shoppingChargedFiltered.csv) into a Google spreadsheet.
  2. Copy/paste headings from previous report.
  3. Remove three columns before the journal name -- journal names should line up with the headings.
  4. Move the last_mod_date and archive_date to the right, above the columns of dates.
  5. Remove columns that do not have headings.
  6. Filter spreadsheet to remove items that have a payment date beyond the correct month. (In the future, this should be accomplished by restricting the Postgres queries, as long as all carts now have dates.)
  7. After filtering is complete, copy the first column (cart ID) into https://github.com/datadryad/dryad-data/tree/master/monthlyReporting/shoppingCartIdsSeenInReports.txt. Sort this file numerically (sort -n). Push the changes to GitHub.
  8. Select the entire spreadsheet. Sort by journal, then subsort by archive_date.
  9. Share the spreadsheet with the appropriate people.
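
The cart ID bookkeeping in step 7 could be scripted along these lines. This is a sketch under assumptions: the function name merge_cart_ids and the input file of new IDs (one cart ID per line, saved from the spreadsheet's first column) are hypothetical, and dropping duplicate IDs with -u is an added convenience, not something the procedure above requires.

```shell
# Hypothetical helper: merge newly reported cart IDs into the seen-IDs file.
# Usage: merge_cart_ids NEW_IDS_FILE [SEEN_FILE]
merge_cart_ids() {
    new_ids="$1"
    seen="${2:-shoppingCartIdsSeenInReports.txt}"
    # Append the new IDs to the existing list.
    cat "$new_ids" >> "$seen"
    # -n sorts numerically; -u drops duplicate IDs (an assumption);
    # -o writes the result back to the same file.
    sort -n -u -o "$seen" "$seen"
}
```

After running it inside dryad-data/monthlyReporting, the updated file still needs to be committed and pushed with the usual git add, git commit, and git push.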

See Also