Difference between revisions of "Main Page"

From Dryad wiki
 
(86 intermediate revisions by 7 users not shown)
Line 1: Line 1:
<big>'''Digital Repository of Information and Data for Evolution (DRIADE)'''</big>
+
{{Box1|This site contains documentation for the Dryad repository and associated projects. To access the Dryad repository, go to [http://www.datadryad.org/ DataDryad.org]. For the latest news on Dryad and data archiving, check out [http://blog.datadryad.org Dryad News and Views]. Please contact [mailto:help@datadryad.org Dryad staff] with any inquiries.}}
  
{| width="100%"
+
'''What is Dryad?'''
| width="10" |
 
| http://www.nescent.org/img/stock/NESCentLogo.png
 
| width="50" |
 
| [[Image:MRC_logo.png]]
 
|
 
|}
 
  
 +
[http://datadryad.org Dryad] is both an international repository of data underlying peer-reviewed scientific and medical literature, and a membership organization, governed by journals, publishers, scientific societies, and other stakeholders.
  
<big>'''A joint project of <span class="plainlinks">[http://www.nescent.org NESCent]</span>  and the <span class="plainlinks">[http://ils.unc.edu/mrc/  UNC Metadata Research Center]</span>'''</big>
+
Dryad is distinguished by the close association of data deposition with the process and business of scholarly publishing, and by using article publication as a model for how researchers can benefit from data sharing infrastructure. Dryad has the potential to transform the way research data are communicated and preserved. The credibility and effectiveness of the research enterprise are due in large part to the social contract behind scholarly publishing. Researchers disclose their work to their peers in return for professional credit. In so doing, they also expose their findings to be confirmed or refuted, and enable other researchers to build upon their results. Dryad seeks to extend this social contract to research data by providing a model for how a disciplinary repository can motivate researchers to disclose the data of greatest value for scientific reuse, namely the data associated with publications, and thereby realize the manifold benefits of free access to scientific data in perpetuity.
  
This wiki is used by the NESCent-MRC working group charged with establishing a repository for heterogeneous digital datasets in the field of evolutionary biology. 
+
'''What is in the repository?'''
  
The focus will be on published datasets, with tight linkages to major evolutionary biology journals and domain-specific community databases.  
+
Dryad serves as a repository for tables, spreadsheets, flatfiles, and all other kinds of published data that do not have another discipline-specific repository. The Dryad repository allows investigators to validate published findings, explore new analysis methodologies, repurpose the data for research questions unanticipated by the original authors, and perform synthetic studies such as formal meta-analyses. All data files in Dryad are available for download and reuse, except those that are under a temporary embargo period, as permitted by editors of the relevant journals. A special section of the repository named [[DryadLab]] hosts datasets of particular educational value for use in undergraduate and graduate training.
  
Some guiding principles:
+
'''How does data get submitted?'''
* Minimize the technical expertise and time required for data deposition and metadata generation
 
* Provide tools and incentives to researchers for quality metadata generation and dataset reuse
 
* Be sensitive to the intellectual property rights of researchers
 
* Ensure a self-sustaining economic model with a plan for long-term data stewardship
 
* Engage related efforts in allied fields (e.g. ecology, paleontology, genetics) and in the information science community.
 
  
This wiki contains planning documents of various sorts as well as the products of our own research into related efforts in other fields. Apart from a few specialized databases, there has historically been little cyberinfrastructure for data preservation, discovery, sharing, or synthesis for most published data in the field of evolutionary biology. Existing systems for the storage and retrieval of heterogeneous scientific data either put a high burden of metadata generation on the individual researcher or do not capture sufficient metadata to enable resource discovery and reuse. The provider may also be burdened by a requirement to submit different subsets of the data package to one or more specialized databases. Finally, there is no infrastructure in the field for fine-grained, communally shared data access privileges that would allow different rights for individuals, collaborative groups, and the general public across multiple repositories and over the digital resource lifecycle.
+
Dryad welcomes data submissions related to any published or accepted scholarly publication. Any society, journal or publisher that wishes to encourage data archiving may refer authors to Dryad.
  
 +
Journals and publishers may greatly facilitate their authors' data archiving by implementing "submission integration," by which the journal manuscript submission system interfaces with Dryad. In a nutshell: the journal sends automated notifications to Dryad of new manuscripts, which enables Dryad to create a provisional record for the article's data, thereby streamlining the author's data upload process. The published article includes a link to the data in Dryad, and Dryad links to the published article. See complete documentation of this process [[Submission Integration|here.]]
  
We propose a Digital Repository for Information and Data on Evolution (DRIADE) that will be the primary home for published data in the field of evolutionary biology. Building on existing technologies and following the OAIS functional model, we will develop a number of software modules supporting digital resource lifecycle management from data ingestion to curation to discovery and reuse. Computer-aided metadata generation and augmentation will assist the data provider in capturing metadata of sufficient richness and quality to enable advanced data discovery, reusability and data integration. Specialized modules will allow data submission to be coordinated with the manuscript review and publication process of participating journals, as well as with the submission process to external specialized databases, including NCBI (for biomolecular sequences), TreeBASE (for character matrices and phylogenetic trees) and Morphbank (for images). This will provide one-stop data submission for the user. A data curator will oversee data and metadata quality control, supported by a separate data curation software module that employs automatic techniques to evaluate metadata quality. An identity, authority and data security module will be developed to implement fine-grained data access privileges based on global user identities. Resource discovery, sharing, and interoperability with external repositories will be enabled by implementing the OAI-PMH metadata harvesting standard supplemented by custom web services. These services will be exposed to collaborating journals, specialized data repositories, third-party content aggregators, and the DRIADE web portal itself.
Extensive evaluations and user testing will be employed throughout the design and implementation process by conducting metadata generation studies and analyzing the resulting quality of metadata content; developing data use cases; and conducting information retrieval experiments and usability studies to evaluate the effectiveness and performance of the system.  A working group of stakeholders will develop a management structure to ensure the long-term maintenance and financial sustainability of the repository.
+
'''How is Dryad governed and supported?'''
  
The proposed work pioneers the application of digital data sharing to a 'small science' discipline. It is anticipated that DRIADE will have a broad impact by making the data underlying hundreds of studies published annually in evolutionary biology available for discovery and repurposing, and that it will staunch the ongoing loss of this body of data, which could be used to drive future evolutionary discoveries, with concomitant benefits to medicine, agriculture, conservation and basic science.
+
The Dryad organization provides a forum for all stakeholders to set priorities for the repository, participate in planning, and share knowledge and coordinate action around data policies. Members include journals, publishers, scientific societies, funding agencies, and other data centers. Dryad is incorporated as a non-profit organization registered in the US; for more information see [[Governance]].
 +
 
 +
Dryad is able to provide free access to data due to financial support from members and data submitters. Dryad’s [http://datadryad.org/pages/faq#depositing-cost Data Publishing Charges] are designed to sustain its core functions by recovering the basic costs of curating and preserving data. New innovations are enabled by research and development grants and by support from donors. For more detail about these charges, see the [http://datadryad.org/pages/pricing Pricing plan information] on the Dryad website.
 +
 
 +
As part of its commitment to sustainability, Dryad participates in the DataONE network (the Data Observation Network for Earth, [http://dataone.org http://dataone.org]), and is actively developing partnerships with other international data networks and scholarly publishing organizations.
 +
 
 +
Take a look at our [[Business Plan and Sustainability|business and sustainability plan]], our [[Grants|grant funding]], and the [[Participants|people]] who have helped develop the repository.

Latest revision as of 06:43, 29 June 2016

This site contains documentation for the Dryad repository and associated projects. To access the Dryad repository, go to DataDryad.org. For the latest news on Dryad and data archiving, check out Dryad News and Views. Please contact Dryad staff with any inquiries.

What is Dryad?

Dryad is both an international repository of data underlying peer-reviewed scientific and medical literature, and a membership organization, governed by journals, publishers, scientific societies, and other stakeholders.

Dryad is distinguished by the close association of data deposition with the process and business of scholarly publishing, and by using article publication as a model for how researchers can benefit from data sharing infrastructure. Dryad has the potential to transform the way research data are communicated and preserved. The credibility and effectiveness of the research enterprise are due in large part to the social contract behind scholarly publishing. Researchers disclose their work to their peers in return for professional credit. In so doing, they also expose their findings to be confirmed or refuted, and enable other researchers to build upon their results. Dryad seeks to extend this social contract to research data by providing a model for how a disciplinary repository can motivate researchers to disclose the data of greatest value for scientific reuse, namely the data associated with publications, and thereby realize the manifold benefits of free access to scientific data in perpetuity.

What is in the repository?

Dryad serves as a repository for tables, spreadsheets, flatfiles, and all other kinds of published data that do not have another discipline-specific repository. The Dryad repository allows investigators to validate published findings, explore new analysis methodologies, repurpose the data for research questions unanticipated by the original authors, and perform synthetic studies such as formal meta-analyses. All data files in Dryad are available for download and reuse, except those that are under a temporary embargo period, as permitted by editors of the relevant journals. A special section of the repository named DryadLab hosts datasets of particular educational value for use in undergraduate and graduate training.

How does data get submitted?

Dryad welcomes data submissions related to any published or accepted scholarly publication. Any society, journal or publisher that wishes to encourage data archiving may refer authors to Dryad.

Journals and publishers may greatly facilitate their authors' data archiving by implementing "submission integration," by which the journal manuscript submission system interfaces with Dryad. In a nutshell: the journal sends automated notifications to Dryad of new manuscripts, which enables Dryad to create a provisional record for the article's data, thereby streamlining the author's data upload process. The published article includes a link to the data in Dryad, and Dryad links to the published article. See complete documentation of this process here.
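
The submission integration flow described above can be sketched roughly as follows. This is only an illustrative in-memory model, not Dryad's actual interface: the class names, method names, identifiers, and the example DOI are all hypothetical, and the real notification format is covered in the Submission Integration documentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProvisionalRecord:
    """Hypothetical placeholder created when a journal announces a manuscript."""
    manuscript_id: str
    journal: str
    title: str
    data_files: List[str] = field(default_factory=list)
    article_link: Optional[str] = None  # set once the article is published


class ToyRepository:
    """Toy stand-in for the repository side of the integration (an assumption)."""

    def __init__(self) -> None:
        self.records: Dict[str, ProvisionalRecord] = {}

    def receive_notification(self, manuscript_id: str, journal: str, title: str) -> ProvisionalRecord:
        # Step 1: the journal's automated notification creates a provisional record.
        record = ProvisionalRecord(manuscript_id, journal, title)
        self.records[manuscript_id] = record
        return record

    def upload_data(self, manuscript_id: str, filename: str) -> None:
        # Step 2: the author uploads data against the record that already exists.
        self.records[manuscript_id].data_files.append(filename)

    def link_article(self, manuscript_id: str, article_link: str) -> None:
        # Step 3: after publication, the record points back to the article.
        self.records[manuscript_id].article_link = article_link


# Example run with made-up values.
repo = ToyRepository()
repo.receive_notification("MS-0001", "Example Journal", "An example manuscript")
repo.upload_data("MS-0001", "measurements.csv")
repo.link_article("MS-0001", "doi:10.0000/example")
```

The point of the sketch is the ordering: because the provisional record exists before the author arrives, the upload step only needs the manuscript identifier rather than a full re-entry of the article metadata.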

How is Dryad governed and supported?

The Dryad organization provides a forum for all stakeholders to set priorities for the repository, participate in planning, and share knowledge and coordinate action around data policies. Members include journals, publishers, scientific societies, funding agencies, and other data centers. Dryad is incorporated as a non-profit organization registered in the US; for more information see Governance.

Dryad is able to provide free access to data due to financial support from members and data submitters. Dryad’s Data Publishing Charges are designed to sustain its core functions by recovering the basic costs of curating and preserving data. New innovations are enabled by research and development grants and by support from donors. For more detail about these charges, see the Pricing plan information on the Dryad website.

As part of its commitment to sustainability, Dryad participates in the DataONE network (the Data Observation Network for Earth, http://dataone.org), and is actively developing partnerships with other international data networks and scholarly publishing organizations.

Take a look at our business and sustainability plan, our grant funding, and the people who have helped develop the repository.