Difference between revisions of "Curation"

From Dryad wiki

Revision as of 12:32, 28 September 2009

Introduction

The Dryad project team is planning to hire a full-time curator to curate the datasets hosted by the Dryad repository and to supervise undergraduate assistants assigned to curatorial tasks. The team will develop a job description and a hiring schedule during the summer of 2009.

From the Digital Curation Centre:

  • Digital curation can be defined as follows: 'The activity of managing the use of data from its point of creation to ensure it is available for discovery and re-use in the future.' Data curation can also include managing vast data sets for daily use and updating them to keep them readable. The term data curator therefore applies to a wide range of professional roles, from minimal management of digital materials, to the addition of metadata, to managing institutional repositories.

Professional Data Curation Tasks

  1. Name authority/authority control for authors
  2. Quality control
    • Clean up citation fields.
    • View the contents of metadata fields across the repository, and enforce consistency.
  3. Maintain documentation of cataloging/curation policies.
  4. Spot check entries to make sure they have high-quality metadata.
  5. Spot check entries to make sure the files have the data they claim to have.
  6. Determine when files need to be migrated to new formats and supervise the migration process.
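As an illustration of the quality-control task of viewing one metadata field across the repository and enforcing consistency, here is a minimal sketch. It assumes records are available as plain Python dictionaries; the function names, the `author` field, and the normalization rule are hypothetical and not part of any Dryad tool.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Collapse case, periods, and extra whitespace so near-duplicates compare equal."""
    return " ".join(value.replace(".", " ").lower().split())


def find_inconsistent_values(records, field):
    """Return the distinct spellings of `field` that normalize to the same key.

    `records` is an iterable of dicts (an assumed, simplified record shape).
    Only keys with more than one distinct spelling are reported.
    """
    variants = defaultdict(set)
    for record in records:
        value = record.get(field)
        if value:
            variants[normalize(value)].add(value)
    return {key: sorted(vals) for key, vals in variants.items() if len(vals) > 1}


# Hypothetical sample data for illustration only.
sample_records = [
    {"author": "Smith, J."},
    {"author": "smith, j"},
    {"author": "Jones, A."},
]
inconsistent = find_inconsistent_values(sample_records, "author")
```

A report like this only flags candidates; deciding which spelling is authoritative remains a curatorial judgment.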

Skills Required

  1. Working knowledge of metadata standards, application profiles, Semantic Web concepts
  2. Working knowledge of (applicable) vocabularies, ontologies, and mapping strategies
  3. Supervisory/managerial experience
  4. Basic computing skills
    • to verify file contents and perform file migrations
  5. Basic knowledge of relational databases
    • to perform simple batch updates and describe more complex updates to the developers
  6. A minimum of a BA/BS in a biological field
    • to identify taxon names, gene names, etc.
    • to recognize high-value data sets and give them more curatorial attention
  7. Excellent communication skills
    • to communicate with authors
    • to write documentation
    • to create Dryad tutorials for conferences

Example Postings

  1. "Microarray Data Curator Position Available," PLEXdb Curator
  2. "Scientific Data Curator," Phenoscape project @ NESCent
  3. "Scientific Data Curator," Harvard School of Public Health Bioinformatics Core (HBC)
  4. "Biological Data Curator," South African National Bioinformatics Institute, eVOC system, a controlled vocabulary to describe gene expression states (http://evocontology.org)
  5. "Scientific Database Curator (2 Positions)," for the curation of the UniProt Knowledgebase (UniProtKB).
  6. "Scientific Curator," The Jackson Laboratory

Summer 2009 Curation Project

Sarah Carrier produced two documents during summer 2009 that detail the curatorial management of data and metadata in Dryad and offer some ideas for overall policy and requirements. The first document covers the redesign of the Dryad interface to better accommodate curation tasks. The second is a manual that details the curation workflow as of summer 2009. This manual will be used by a curator hired in fall 2009.

These two documents represent the latest information regarding curation. Other pages on the wiki that include curation information used during the summer 2009 curation project:

Other Resources

Curation metadata

  • A study was initiated to capture the "status" of metadata records, that is, to determine when they are ready to be published. Sarah Carrier pursued this work in the metadata class, established the "Dryad" namespace, and then registered the "status" element for the Dryad application profile, following the Singapore Framework recommendation.
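  • A selected review of standards follows here:
  • A-Core: Metadata about Content Metadata. Retrieved December 9, 2008, from http://metadata.net/admin/draft-iannella-admin-01.txt. Covers who, what, where, when.
  • AC (Administrative Components), Dublin Core DCMI Administrative Metadata: Final Specification: http://www.bs.dk/standards/AdministrativeComponents.htm. Includes a validity date (start and/or end date of the validity of the metadata content).
  • IEEE LOM (Learning Object Metadata) - http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf - "completion status or condition of this learning object" (this refers to the learning object, not the metadata record); also has version.
  • National Library of Australia’s Preserving Access to Digital Information (PADI) metadata schema - http://www.nla.gov.au/padi/metadata.html - includes elements that refer to status: 1. record status - status assigned to the resource for administrative purposes (accepted; on hold; rejected; under review); 2. date status - date the status was assigned; 3. reason for status; 4. review category - a category assigned to a record with "on hold" or "review" status that defines the type of review required.
  • OCLC catalog record - http://www.oclc.org/holdingsformat/en/leader.htm - corrected or revised, new; see also MARC leader: http://www.loc.gov/marc/bibliographic/bdleader.html.
  • TEI, EAD, and METS all have administrative metadata, but the focus is on the resource.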
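To make the PADI status elements above concrete, the following is a minimal sketch of how a record's status could be modeled. It is not the Dryad "status" element or the PADI schema itself: the class and field names are hypothetical, and treating "accepted" as "ready to be published" is an assumption made only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class RecordStatus(Enum):
    # The four values listed for PADI "record status".
    ACCEPTED = "accepted"
    ON_HOLD = "on hold"
    REJECTED = "rejected"
    UNDER_REVIEW = "under review"


@dataclass
class StatusInfo:
    """Hypothetical container mirroring the four PADI status elements."""
    record_status: RecordStatus
    date_status: date                        # date the status was assigned
    reason_for_status: Optional[str] = None
    review_category: Optional[str] = None    # type of review required

    def __post_init__(self):
        # A review category only applies to on-hold or under-review records.
        needs_review = self.record_status in (
            RecordStatus.ON_HOLD,
            RecordStatus.UNDER_REVIEW,
        )
        if self.review_category is not None and not needs_review:
            raise ValueError(
                "review_category only applies to on-hold or under-review records"
            )

    def ready_to_publish(self) -> bool:
        # Assumption: only accepted records are considered publishable.
        return self.record_status is RecordStatus.ACCEPTED
```

Keeping the status values in an enumeration means a misspelled status fails when the record is created rather than surfacing later during a repository-wide consistency check.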