Annual Journal Statistics Requirements

Requirements for Annual Journal Statistics report templates

Requirements
The requirements of this project are to:

 * 1) Create a system that generates statistics from Dryad.
 * 2) Generate a set of reports containing statistics (see below) annually for each journal.
 * 3) Store the reports in a repository.
 * 4) Distribute each report to its journal.

Generating Statistics

 * Statistics used for the annual State-of-Dryad report are listed on Annual Statistics Reports.
 * Dryad includes curation tasks that already generate some of these statistics (see Statistics Reporting Technology). However, their date ranges are hard-coded, and their reports combine data from all journals rather than reporting per journal.
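
As a minimal sketch of how the hard-coded date ranges and all-journal scope could become parameters, the following Python filters records by journal and reporting year. The record layout, the function names, and the choice of a calendar-year period are all assumptions for illustration; none of them are taken from the existing curation tasks.

```python
from datetime import date

def reporting_period(year: int) -> tuple[date, date]:
    """Return the half-open interval [start, end) for one calendar year (assumed period)."""
    return date(year, 1, 1), date(year + 1, 1, 1)

def in_period(published: date, period: tuple[date, date]) -> bool:
    """True if the date falls inside the half-open reporting period."""
    start, end = period
    return start <= published < end

# Hypothetical records: (journal name, publication date).
packages = [
    ("Journal A", date(2012, 3, 14)),
    ("Journal A", date(2013, 1, 2)),
    ("Journal B", date(2012, 12, 31)),
]

def published_in_year(records, journal, year):
    """Select one journal's records published within the given year."""
    period = reporting_period(year)
    return [r for r in records if r[0] == journal and in_period(r[1], period)]
```

Using a half-open interval avoids counting a package published on 1 January in two consecutive reports.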

Current and cumulative stats for each journal

 * Cumulative total number of data packages (from all years)
 * Cumulative total number of files (from all years)
 * Number of files currently under embargo
 * Cumulative storage size (for published data from all years)
 * For integrated journals, all the nontrivial questionnaire options that the journal has selected, including its payment plan
 * Number of submissions
 * Number of data packages published
 * Number of waivers issued
 * Number of submissions deposited but not yet published, by month of submission
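
The last item in this list, unpublished submissions grouped by month of submission, could be computed along these lines. This is only a sketch: the `submitted` and `published` fields are hypothetical stand-ins for whatever the Dryad data model actually records.

```python
from collections import Counter
from datetime import date

# Hypothetical submission records for a single journal.
submissions = [
    {"submitted": date(2013, 1, 15), "published": True},
    {"submitted": date(2013, 1, 20), "published": False},
    {"submitted": date(2013, 2, 3), "published": False},
    {"submitted": date(2013, 2, 28), "published": False},
]

def unpublished_by_month(subs):
    """Count not-yet-published submissions, keyed by 'YYYY-MM' of submission."""
    return Counter(
        s["submitted"].strftime("%Y-%m") for s in subs if not s["published"]
    )
```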

Yearly stats (also for each journal)

 * Number of data packages
 * The 3 most popular data packages (by maximum downloads in the past year)
 * Number of page views of all data packages (in the past year)
 * Number of downloads of data packages (in the past year)
 * Total number of clicks on article DOIs, if possible
 * The distribution of embargo lengths
 * Number of helpdesk tickets associated with submissions to that journal
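
Two of the yearly statistics, the most popular packages and the distribution of embargo lengths, can be sketched as follows. The sketch assumes that a package's popularity is the highest download count among its files and that embargo lengths are recorded in days; both are assumptions, and the record fields are hypothetical.

```python
from collections import Counter

# Hypothetical per-package records for one journal and one year.
packages = [
    {"id": "pkg-1", "file_downloads": [10, 42, 7], "embargo_days": 0},
    {"id": "pkg-2", "file_downloads": [5], "embargo_days": 365},
    {"id": "pkg-3", "file_downloads": [100, 3], "embargo_days": 0},
    {"id": "pkg-4", "file_downloads": [], "embargo_days": 365},
    {"id": "pkg-5", "file_downloads": [60], "embargo_days": 180},
]

def popularity(pkg):
    """Highest download count among the package's files (0 if it has none)."""
    return max(pkg["file_downloads"], default=0)

def top_packages(pkgs, n=3):
    """Return the n packages with the highest popularity, most popular first."""
    return sorted(pkgs, key=popularity, reverse=True)[:n]

def embargo_distribution(pkgs):
    """Count how many packages use each embargo length."""
    return Counter(p["embargo_days"] for p in pkgs)
```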

For each individual data package published in the past year (again, for each journal)

 * The complete citation
 * The date published
 * The number and total size of files submitted
 * The min and max embargo length of files in the package
 * The number of pageviews since publication
 * Max downloads among all data files since publication
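
The file-level figures in this list (file count, total size, embargo range, and maximum downloads) could be derived from a package's files roughly as shown below. The `DataFile` type and its fields are hypothetical, and the units (bytes, days) are assumptions.

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    size_bytes: int    # assumed unit: bytes
    embargo_days: int  # assumed unit: days; 0 means no embargo
    downloads: int     # downloads since publication

def summarize_package(files: list[DataFile]) -> dict:
    """Summarize the file-level statistics for one data package."""
    if not files:
        # An empty package has no embargo range to report.
        return {"file_count": 0, "total_size_bytes": 0,
                "min_embargo_days": None, "max_embargo_days": None,
                "max_downloads": 0}
    embargoes = [f.embargo_days for f in files]
    return {
        "file_count": len(files),
        "total_size_bytes": sum(f.size_bytes for f in files),
        "min_embargo_days": min(embargoes),
        "max_embargo_days": max(embargoes),
        "max_downloads": max(f.downloads for f in files),
    }
```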

Additional statistics wish list

 * Min, max and median downloads per data package by journal
   * Can be aggregated by the above stats
 * Amount of traffic from the pages to the journal
   * Unless we can get this out of Google Analytics, we don't have this data historically.
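
The download summary on this wish list could be produced with the Python standard library, as in this sketch. The input rows are hypothetical (journal, package id, download count) tuples, not an existing Dryad export format.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (journal, package id, download count) rows.
rows = [
    ("Journal A", "pkg-1", 10),
    ("Journal A", "pkg-2", 30),
    ("Journal A", "pkg-3", 20),
    ("Journal B", "pkg-4", 5),
    ("Journal B", "pkg-5", 15),
]

def download_summary(rows):
    """Return min, max and median downloads per data package for each journal."""
    by_journal = defaultdict(list)
    for journal, _package_id, downloads in rows:
        by_journal[journal].append(downloads)
    return {
        journal: {"min": min(counts), "max": max(counts), "median": median(counts)}
        for journal, counts in by_journal.items()
    }
```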