Monthly Reports

Monthly Update of Sitemaps

To update the sitemaps used for CLOCKSS harvesting, log in to the production server and run:

sudo /opt/dryad/bin/generate-sitemaps

This process should be automated, but it produced incorrect output when run from a cron job, so we currently run it manually.

Monthly Updates of Dryad-Data

First, get all of the deposit dates. On the production server:

postgres-client.sh -c "select text_value from metadatavalue where metadata_field_id=12 and item_id in (select item_id from item where owning_collection=2 and in_archive='t');" > dryadSubmitDates.txt

(collection 2 is "Data Packages" and metadata field 12 is "Date Available")

Edit the file to remove the timestamps, leaving only the date portion. With GNU sed, the -i option applies the change to the file in place (without -i, the result is only printed to standard output):

sed -i 's/T.*//' dryadSubmitDates.txt

Sort the dates.

In the dryad-data GitHub repository:

  • Add the updated dryadSubmitDates.txt.
  • Count the number of new packages for the given month (grep for the year-month combination, for example 2017-01, and pipe to wc -l).
  • Manually edit the dataPackagesInDryad.csv file to include the new month.
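The cleanup and counting steps above can also be sketched in Python. This is only an illustration, not part of the existing tooling: the function names clean_dates and count_by_month are hypothetical, and the sketch assumes every date line begins with a full YYYY-MM-DD date. Lines that do not look like dates (such as any header or row-count lines in the query output) are skipped.

```python
import re
from collections import Counter

# Assumption: each date line starts with YYYY-MM-DD, optionally followed by a timestamp.
DATE_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2})")


def clean_dates(lines):
    """Return sorted YYYY-MM-DD strings, dropping timestamps and non-date lines."""
    dates = []
    for line in lines:
        match = DATE_RE.match(line)
        if match:
            dates.append(match.group(1))
    return sorted(dates)


def count_by_month(dates):
    """Count dates per YYYY-MM prefix."""
    return Counter(d[:7] for d in dates)


# Made-up sample lines, for illustration only.
sample = [
    " 2017-01-05T12:00:00Z",
    " 2016-12-31T23:59:59Z",
    " 2017-01-02T08:30:00Z",
    "(3 rows)",
]
dates = clean_dates(sample)
for month, total in sorted(count_by_month(dates).items()):
    print(month, total)
```

To use it on the real file, pass the open file object for dryadSubmitDates.txt to clean_dates in place of the sample list.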

Monthly Shopping Cart Report

This report is automated. The Python script is located in https://github.com/datadryad/dryad-data/tree/master/monthlyReporting/monthlyReport.py.

Run the report from the command line on production. The following example runs a report covering 2017-01-01 to 2017-01-31 (modify the dates to match the desired time frame of the report):

cd /home/monthlyreports
python monthlyReportNew.py -s 2017-01-01 -e 2017-01-31
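Because the report normally covers one full calendar month, the start and end dates can be derived from the year and month rather than typed by hand. The sketch below shows one way to do that. The helpers month_bounds and report_command are hypothetical names introduced here; only the script name and the -s and -e options come from the example above.

```python
import calendar
from datetime import date


def month_bounds(year, month):
    """Return the first and last day of the given month as date objects."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def report_command(year, month, script="monthlyReportNew.py"):
    """Build the argument list for a report covering one calendar month."""
    start, end = month_bounds(year, month)
    return ["python", script, "-s", start.isoformat(), "-e", end.isoformat()]


print(" ".join(report_command(2017, 1)))
# prints: python monthlyReportNew.py -s 2017-01-01 -e 2017-01-31
```

Using calendar.monthrange avoids hard-coding month lengths and handles leap years.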

The inputs and outputs of the report are as follows:

Args:
    -s: start date for the report
    -e: end date for the report
Output: Running the report results in a CSV file called monthlyReport.csv in the current directory
Raises: ValueError for invalid date

Then:

  1. Create a new blank spreadsheet in the "financial/shopping cart reports" folder in Google Drive (https://drive.google.com/open?id=0BzWuiqo-UGbqTUFlck1PamMxcDg).
  2. Import the report CSV file (its name is monthlyReport followed by the date in YYYYMMDD format, for example monthlyReport20170101.csv) and verify that the column headers and content match up as expected.
  3. Select the entire spreadsheet. Sort by journal, then subsort by archive_date.
  4. Share the spreadsheet with the appropriate people.
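The header check and the sort in the steps above can be tried locally before importing the file. The following is a minimal sketch, not part of the report script. It assumes the CSV has columns named journal and archive_date (the names used in the sort step) and that archive_date values are ISO-style dates, so that sorting them as text gives chronological order. The function names and the sample rows are made up for illustration.

```python
import csv
import io

# Assumption: these are the column names the sort step refers to.
REQUIRED_COLUMNS = ("journal", "archive_date")


def load_report(fileobj):
    """Read report rows, checking that the expected column headers are present."""
    reader = csv.DictReader(fileobj)
    headers = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in headers]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return list(reader)


def sort_report(rows):
    """Sort by journal, then by archive_date within each journal."""
    return sorted(rows, key=lambda row: (row["journal"], row["archive_date"]))


# Made-up sample data standing in for the real report file.
sample = io.StringIO(
    "journal,archive_date\n"
    "Evolution,2017-01-20\n"
    "Ecology,2017-01-15\n"
    "Evolution,2017-01-03\n"
)
for row in sort_report(load_report(sample)):
    print(row["journal"], row["archive_date"])
```

For the real report, open the CSV file with newline="" as the csv module documentation recommends and pass the file object to load_report.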

See Also