Digital Repository of Information and Data for Evolution (DRIADE)
A joint project of NESCent and the MRC
This wiki is used by the NESCent-MRC working group charged with establishing a repository for published data in the field of evolutionary biology.
Apart from a few specialized databases, there has historically been little cyberinfrastructure for preserving, discovering, sharing, or synthesizing most published data in the field of evolutionary biology. Existing systems for the storage and retrieval of heterogeneous scientific data either place a high burden of metadata generation on the individual researcher or do not capture sufficient metadata to enable resource discovery and reuse. The data provider may also be burdened by a requirement to submit different subsets of the data package to one or more specialized databases. Finally, there is no infrastructure in the field for fine-grained, communally shared data access privileges that would allow different rights for individuals, collaborative groups, and the general public across multiple repositories and over the digital resource lifecycle.
The goal of this project is to develop a Digital Repository of Information and Data for Evolution (DRIADE) that will be the primary home for published data in the field of evolutionary biology. Building on existing technologies and following the OAIS functional model, we are developing software to support digital resource lifecycle management, from data ingestion through curation to discovery and reuse. Computer-aided metadata generation and augmentation will assist the data provider in capturing metadata of sufficient richness and quality to enable advanced data discovery, reuse, and integration. Specialized modules will allow data submission to be coordinated with the manuscript review and publication process of participating journals, as well as with the submission process to external specialized databases (e.g. for sequence data, phylogenies, anatomical images). This will provide one-stop data submission for the user. Data and metadata quality control will be overseen by curatorial staff, supported by a separate data curation software module that employs automatic techniques to evaluate metadata quality. An identity, authority and data security module will implement fine-grained data access privileges based on global user identities. Resource discovery, sharing, and interoperability with external repositories will be enabled by implementing the OAI-PMH metadata harvesting standard, supplemented by custom web services. These services will be exposed to collaborating journals, specialized data repositories, third-party content aggregators, and the DRIADE web portal itself.
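To make the OAI-PMH harvesting mentioned above more concrete, here is a minimal Python sketch of how an external harvester might build a ListRecords request and read Dublin Core titles from the response. It is an illustration only, not DRIADE's implementation: the base URL, the sample record, and the helper names build_list_records_url and parse_records are hypothetical. The verb, the metadataPrefix parameter, and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (helper name is hypothetical)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract the identifier and Dublin Core titles from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter("{%s}record" % OAI_NS):
        ident = rec.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        titles = [t.text for t in rec.iter("{%s}title" % DC_NS)]
        records.append({"identifier": ident, "titles": titles})
    return records


# A made-up response fragment, standing in for what a repository might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example data package</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical endpoint; no network request is made in this sketch.
    print(build_list_records_url("https://repository.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also need to follow resumptionToken elements to page through large result sets and handle OAI-PMH error responses; both are omitted here for brevity.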
Evaluation and user testing are carried out throughout the design and implementation process. We are conducting metadata generation studies and analyzing the quality of the resulting metadata; developing data use cases; and running information retrieval experiments and usability studies to evaluate the effectiveness and performance of the system. A separate working group of stakeholders is charged with developing a management structure to ensure the long-term maintenance and financial sustainability of the repository.