Proposed Student Projects

From Dryad wiki

Latest revision as of 09:55, 17 July 2016

Dryad occasionally has the opportunity to work with student interns through projects like the Google Summer of Code and the DataONE internship program. This page collects ideas for projects that are suitable for student work. All of these projects provide valuable progress for Dryad. They are relatively self-contained projects, requiring a minimal amount of background knowledge before the student is able to make a meaningful contribution.

Small tasks from the current development queue

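Below is a list of student-accessible projects culled from the current development queue. It may be outdated, as items are continually moved into and out of the queue:

  * FEAT: Improve robustness of build/deploy process https://trello.com/c/5wXNaSi0
  * FEAT: detect items with inappropriate embargo settings https://trello.com/c/YfHWSQfM
  * FEAT: automate weekly summary reports https://trello.com/c/vsCMaBqY
  * FEAT: Display of non-DOI identifiers https://trello.com/c/M1qdfd59
  * FEAT: Improve curation report for profileformats https://trello.com/c/31zurH4g
  * FEAT: Altmetrics for data packages https://trello.com/c/7TVeLJNK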

Improving the Dryad API

Implement the proposal for a new Dryad API.

Cleaning Temporal Metadata

The temporal coverage metadata is not standardized. There is a mix of actual dates, date ranges, geologic periods, and more free-form statements (e.g., "~100 MYA", "the last 50 years"). Determine a formal method for representing the various types of statements, and convert all entries to conform to this format.
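
As a starting point for discussion, the kinds of statements listed above could be sorted into a small set of typed records. The sketch below is only one possible representation, not an agreed Dryad format: the `TemporalCoverage` record, the `normalize` function, the category names, and the short list of geologic periods are all hypothetical and chosen for illustration. Anything that cannot be classified is kept verbatim and marked for curator review.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Illustrative subset only (assumption); a real list would come from an
# authoritative geologic time scale vocabulary.
GEOLOGIC_PERIODS = {"cambrian", "jurassic", "cretaceous", "pleistocene", "holocene"}

# Year, year-month, or full ISO-style date.
DATE = r"\d{4}(?:-\d{2}(?:-\d{2})?)?"


@dataclass
class TemporalCoverage:
    """Hypothetical normalized form of one temporal coverage statement."""
    kind: str                      # "date", "range", "mya", "period", or "unparsed"
    start: Optional[str] = None
    end: Optional[str] = None
    original: str = ""             # the untouched source string


def normalize(text: str) -> TemporalCoverage:
    value = text.strip()

    # A single date such as "2011" or "2011-03-04".
    if re.fullmatch(DATE, value):
        return TemporalCoverage("date", value, value, text)

    # A range such as "1990-2000", "1990/2000", or "1990 to 2000".
    m = re.fullmatch(rf"({DATE})\s*(?:/|to|-)\s*({DATE})", value)
    if m:
        return TemporalCoverage("range", m.group(1), m.group(2), text)

    # "Million years ago" statements such as "~100 MYA".
    m = re.fullmatch(r"~?\s*(\d+(?:\.\d+)?)\s*mya", value, flags=re.IGNORECASE)
    if m:
        return TemporalCoverage("mya", m.group(1), m.group(1), text)

    # Named geologic periods.
    if value.lower() in GEOLOGIC_PERIODS:
        return TemporalCoverage("period", value.lower(), value.lower(), text)

    # Free-form statements ("the last 50 years") are left for a curator.
    return TemporalCoverage("unparsed", original=text)
```

Running `normalize` over every existing temporal coverage value and counting the records that come back as `unparsed` would give a rough measure of how much manual curation remains.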

Generating Reports

Develop the additional reports that are still needed from the list of Curator Reports.

Improve the process for generating statistics associated with Dryad. There are two types of statistics: those associated with periodic reports to stakeholders (e.g., board meetings, annual reports to funders), and those associated with the Global Statistics Display.

Publish Dryad Metadata as LOD

This goal has two parts:

  1. Register Dryad-specific and other relevant properties in an appropriate Dryad namespace (e.g., datadryad.org), so that Dryad metadata published as linked data can resolve.
  2. Generate current Dryad metadata (where appropriate) as linked data, following on the LOD4DataONE work completed for DataONE by Aida Gandara in summer 2011 (https://notebooks.dataone.org/lod4dataone/author/aida-gandara/).
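
To make the second part of this goal more concrete, the following minimal sketch serializes one metadata record as N-Triples using only the Python standard library. It is an illustration under stated assumptions, not Dryad's actual export: the `to_ntriples` and `escape_literal` helpers, the sample record, and the subject URI are hypothetical, and mapping fields directly onto Dublin Core terms (`http://purl.org/dc/terms/`) is a simplification. Dryad-specific properties would instead use whatever namespace is registered in the first part.

```python
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(subject_uri: str, record: dict) -> str:
    """Emit one triple per (field, value) pair, with fields in sorted order.

    `record` maps a Dublin Core term name (e.g. "title") to a list of
    string values. This flat shape is an assumption for illustration.
    """
    lines = []
    for field, values in sorted(record.items()):
        for value in values:
            lines.append(
                f'<{subject_uri}> <{DCTERMS}{field}> "{escape_literal(value)}" .'
            )
    return "\n".join(lines)


# Hypothetical record and identifier, for illustration only.
example_record = {
    "title": ["Data from: An example study"],
    "creator": ["Doe, Jane"],
}
example_triples = to_ntriples("https://doi.org/10.5061/dryad.example", example_record)
```

A production version would more likely use an RDF library and typed or language-tagged literals; the point here is only that each metadata field becomes a resolvable property URI attached to the data package's identifier.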

Dryad Metadata DCAP (Dublin Core Application Profile)

Extend the initial DCAP work done for Dryad AP 2.0 to the current 3.0 version, and publish the result as a DCAP compliant with the DCMI Singapore Framework. This could be integrated into the LOD project noted above.

HIVE and DBpedia comparison for indexing Dryad holdings

Compare HIVE vocabularies and DBpedia's underlying terminology for indexing Dryad content. A mapping experiment with Dryad's current search logs might also be considered.

ORCID work

Students could work on some of the projects proposed for ORCID Integration.

Lingering Issues

The Dryad issue-tracking system contains many issues that have not been resolved. Many of these issues could be addressed by a student. A student could approach these issues in two ways:

  1. Identify individual issues that are tractable and solve them. These issues include problems like minor usability tweaks and documentation needs.
  2. Identify classes of issues that occur frequently. Develop tools/processes to either prevent these issues or to solve these issues as they occur.