Curation

From Dryad wiki
Revision as of 15:20, 26 October 2009 by Jane (talk | contribs) (Curation metadata)


Introduction

The Dryad repository project team is developing a plan to hire a full-time curator who will curate the datasets hosted by the Dryad repository and supervise undergraduate assistants assigned to curatorial tasks. The team will develop a job description and a hiring schedule for the curator during the summer of 2009.

From the Digital Curation Centre:

  • Digital curation can be defined as follows: 'The activity of managing the use of data from its point of creation to ensure it is available for discovery and re-use in the future.' Data curation can also include managing vast data sets for daily use, updating them to keep them readable, and so on. The term data curator therefore applies to a wide range of professional backgrounds, from minimal management of digital materials, to the addition of metadata, to managing institutional repositories.

Professional Data Curation Tasks

  1. Name authority/authority control for authors
  2. Quality control
    • Clean up citation fields.
    • View the contents of metadata fields across the repository, and enforce consistency.
  3. Maintain documentation of cataloging/curation policies.
  4. Spot check entries to make sure they have high-quality metadata.
  5. Spot check entries to make sure the files have the data they claim to have.
  6. Determine when files need to be migrated to new formats and supervise the migration process.
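The consistency checks in tasks 1 and 2 can be partly automated. The following is a minimal sketch, not Dryad's actual tooling: it assumes metadata records are available as plain Python dictionaries with a hypothetical `author` field, and it groups author names that differ only in case, accents, or punctuation so a curator can review them.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Reduce a name to a comparison key (an illustrative rule set, not a standard)."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, treat periods as spaces, and collapse runs of whitespace.
    return " ".join(unaccented.lower().replace(".", " ").split())


def find_variants(records, field="author"):
    """Return {key: [distinct spellings]} for keys with more than one spelling."""
    groups = defaultdict(set)
    for record in records:
        value = record.get(field)
        if value:
            groups[normalize_name(value)].add(value)
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}


# Hypothetical sample records for illustration only.
sample_records = [
    {"author": "Smith, J."},
    {"author": "smith, j"},
    {"author": "Müller, A."},
    {"author": "Muller, A."},
    {"author": "Doe, R."},
]
variants = find_variants(sample_records)
```

A report like `variants` only flags candidates. Deciding which spelling becomes the authoritative form remains a curatorial judgment.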

Skills Required

  1. Working knowledge of metadata standards, application profiles, and Semantic Web concepts
  2. Working knowledge of (applicable) vocabularies, ontologies, and mapping strategies
  3. Supervisory/managerial experience
  4. Basic computing skills
    • to verify file contents and perform file migrations
  5. Basic knowledge of relational databases
    • to perform simple batch updates and describe more complex updates to the developers
  6. A minimum of a BA/BS in a biological field
    • to identify taxon names, gene names, etc.
    • to recognize high-value data sets and give them more curatorial attention
  7. Excellent communication skills
    • to communicate with authors
    • to write documentation
    • to create Dryad tutorials for conferences
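As an illustration of the "simple batch updates" mentioned under skill 5, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `metadata_value` and its columns are assumptions made for this example and do not describe Dryad's actual database schema.

```python
import sqlite3


def batch_replace(conn, field, old_value, new_value):
    """Replace one exact value with another in a single metadata field.

    Returns the number of rows changed. The table layout is hypothetical.
    """
    # Using the connection as a context manager commits on success
    # and rolls back if the statement raises an error.
    with conn:
        cursor = conn.execute(
            "UPDATE metadata_value SET value = ? WHERE field = ? AND value = ?",
            (new_value, field, old_value),
        )
    return cursor.rowcount


# Build a small in-memory database with made-up rows for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata_value (item_id INTEGER, field TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO metadata_value VALUES (?, ?, ?)",
    [
        (1, "subject", "phylogentics"),
        (2, "subject", "phylogentics"),
        (3, "subject", "phylogenetics"),
        (4, "title", "phylogentics"),
    ],
)

changed = batch_replace(conn, "subject", "phylogentics", "phylogenetics")
subjects = [row[0] for row in conn.execute(
    "SELECT value FROM metadata_value WHERE field = 'subject' ORDER BY item_id"
)]
```

Passing the values as parameters rather than building the SQL string by hand avoids quoting errors, and restricting the update to one field keeps the same text in other fields untouched.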

Example Postings

  1. "Microarray Data Curator Position Available," PLEXdb Curator
  2. "Scientific Data Curator," Phenoscape project @ NESCent
  3. "Scientific Data Curator," Harvard School of Public Health Bioinformatics Core (HBC)
  4. "Biological Data Curator," South African National Bioinformatics Institute, eVOC system, a controlled vocabulary to describe gene expression states (http://evocontology.org)
  5. "Scientific Database Curator (2 Positions)," for the curation of the UniProt Knowledgebase (UniProtKB).
  6. "Scientific Curator," The Jackson Laboratory

Summer 2009 Curation Project

Sarah Carrier produced two documents during summer 2009 that detail the curatorial management of data and metadata in Dryad and offer some ideas for overall policy and requirements. The first document covers the redesign of the Dryad interface to better accommodate curation tasks. The second is a manual that details the curation workflow as of summer 2009. This manual will be used by the curator hired in fall 2009.

These two documents represent the latest information regarding curation. Other pages on the wiki that include curation information used during the summer 2009 curation project:

Other Resources