Difference between revisions of "Old:December 2006 Workshop Plans"

From Dryad wiki

Revision as of 00:12, 8 December 2006

December "Stakeholder" Meeting


  • To inform the society and journal reps of our plans to date and get feedback on those plans
  • To discuss how to gather requirements.


  • Date: Dec 5, 2006
  • Time: core meeting 9am-11am + extended discussions through lunch
  • Place: NESCent



  • NESCent-MRC DRIADE team, wg-digitaldata@nescent.org
  • Ahrash Bissell (Duke/OpenContext), ahrash.bissell@duke.edu
  • Harold Heatwole (editor, Integrative & Comparative Biology), harold_heatwole@ncsu.edu
  • Mohammed Noor (NESCent working group leader on meta-analysis), noor@duke.edu
  • Bob Peet (former editor, Ecology), peet@unc.edu
  • Mark Rausher (editor, Evolution), mrausher@duke.edu
  • Michael Whitlock (editor, American Naturalist)
  • Kathleen Smith (NESCent), kksmith@duke.edu
  • Marcy Uyenoyama (incoming editor, Molecular Biology & Evolution), marcy@duke.edu
  • Don Waller (SSE), dmwaller@wisc.edu


  • 9:00 Goals of this meeting (Todd Vision)
  • 9:15 Roundtable introductions
  • 9:30 Requirements and open questions (Hilmar Lapp)
  • 9:45 Issues regarding metadata (Jane Greenberg)
  • 10:00 Roundtable discussion
    • Expectations and desires of the journals, publishers and scientific societies
    • What are the priorities?
    • Ideas for the requirements gathering phase
    • Suggestions for attendees at the spring stakeholders meeting
  • 11:00-on (for those remaining)
    • Further discussion of project plans and the two major upcoming meetings

March "Consultant" Meeting


The workshop will take place in the week of March 5-10, 2007.

Invitees, including roles, and potential alternates

  • Ahrash Bissell, OpenContext, how raw should the data be? (alternate: Eric/Sandy Kansa)
  • Margret Branschofsky, DSpace, data federation (alternate: MacKenzie Smith)
  • Joe Busch, Taxonomy Strategies, digital lifecycle management
  • Adam Goldstein, Darwin Digital Library, what metadata is required?
  • John Graybeal, Marine Metadata Initiative, metadata generation by scientists
  • Jane Greenberg, Dublin Core, metadata requirements
  • Chris Greer, NSF, sustainable funding
  • Kevin Gamiel, RENCI, data federation & grid storage
  • Margaret Hedstrom, U Michigan, trust level & digital lifecycle management
  • Bryan Heidorn, UIUC, use of the grid for storage
  • Diane Hillmann, Cornell, metadata generation by scientists
  • Matt Jones, SEEK, how raw should the data be, user interface requirements & metadata generation
  • Paul Jones, iBiblio, sustainable funding
  • Liz Liddy, Center for Natural Language Processing, School of Information Studies, Syracuse University, metadata generation
  • Josh Madin, SEEK, what metadata is required
  • Michael Nelson, OAI-PMH, enabling 3rd party harvesting (alternate: Carl Lagoze)
  • Mohammed Noor, NESCent WG, integration of submission with journals, how raw should the data be? (alternates: Maria Servedio, Emilia Martins)
  • Sandy Payette, Fedora, interface w/ journals (alternate: Carl Lagoze?)
  • Bob Peet, Ecology Society, integration of submission with journals
  • Dav Robertson, NIEHS, repository trust level
  • Val Tannen, Penn, data integration (not sure how important his presence would be)
  • Herbert Van de Sompel, LANL, enabling 3rd party harvesting
  • Mary Vardigan, DDI/ICPSR, administration & sustainable funding, what metadata is required
  • John Wilbanks, GBIF & Science Commons, intellectual property, and data federation (alternates: Stan Blum, Don Hobern)
  • someone from NASA, incentivizing data sharing

Not currently on invitee list, but probably should be

  • someone from CIESIN?
  • Bruce Bauer (World Data Center for Paleoclimatology)
  • Tom Hammond (Conservation Commons)
  • Emilia Martins (EthBase)
  • David Schloen (OCHRE)

Potential floaters

  • C. Lynch

Metadata Class, Mock Workshop

  • class handout (NESCent720.doc)

The Four Virtues that we strive toward are the Sharing, Reuse, Preservation, and Synthesis of published evolutionary data. Decisions have to be made on how to promote these Virtues, and to what degree.

Questions for the participants:

  • Breakout 1
    • Raw data in repositories or processed data only? Spreadsheet data? (Bissell, M. Jones, others needed)
    • How can depositors be incentivized? (someone from NASA, others needed)
  • Breakout 2
    • How would the system be administered and sustainably funded? (Greer, P. Jones, Vardigan)
    • What intellectual property policies need to be put into place? (Wilbanks)
  • Breakout 3
    • What is the role for data federation technology (central vs distributed repository)? (Branschofsky, Gamiel, Wilbanks)
    • What is the role for bona fide data integration technology? (Heidorn, Gamiel, Tannen)
    • What is the role of distributed/grid storage? (Gamiel, Heidorn)
  • Breakout 4
    • What level of trust is necessary for the repository, e.g., persistence of data, protection of data from tampering, quality of metadata? (Hedstrom, Robertson)
    • What metadata is required and how to generate it (Dublin Core, DDI-lite, EML, standards imposed by specialized repositories)? (Bissell, Goldstein, Greenberg, Madin, Vardigan)
  • Breakout 5
    • Do we need to plan for metadata lifecycle management, and to what extent? (Busch, Hedstrom)
    • Should the system be capable of metadata generation, and if so to what extent, with how much human review? (Greenberg, Hillmann, Liddy)
  • Breakout 6
    • How to synchronize ingestion with journal publication and 3rd-party database deposition? (Noor, Peet, others needed)
    • How to enable harvesting of data by 3rd parties (e.g. OAI-PMH)? (Nelson, Van de Sompel)
    • What should be the functionality of the interface to the centralized registry? (Bissell, M. Jones)

Provisional agenda

Day 1:

  • Introductions and presentation of objectives
  • Refine, as a group, tasks for the breakout sessions.
  • Three concurrent breakout sessions over lunch and into early afternoon, with short chalktalks relevant to each topic followed by focused discussion on the breakout tasks.
  • Break
  • Late afternoon breakout group summaries

Day 2:

  • Three concurrent morning breakout sessions, again with short chalktalks relevant to each topic followed by focused discussion on the breakout tasks.
  • Lunch
  • 1-2 hour large group discussion
  • Writing of recommendations

May "Planning" Meeting


  • ?, NCBI, interface w/ specialized dbs
  • Bill Piel, TreeBASE, interface w/ specialized dbs (possibly 2nd meeting instead)
  • Greg Riccardi, Morphbank, interface w/ specialized dbs (possibly 2nd meeting instead)