Difference between revisions of "Citing Data"

From Dryad wiki
m (Proposed format for Dryad)
m
(42 intermediate revisions by 4 users not shown)
Line 1: Line 1:
- ==Proposed format for Dryad==
+ This page presents Dryad data citation guidelines, as well as some from other repositories and organizations.
+
+ ==Dryad Data Citation Guidelines==

- Suggested wording for answering the question, How should data in Dryad be cited?
-
- When using a data package archived in Dryad, always cite the original paper associated with the data, using the normal format of the journal you are writing for.
-
- In addition, it may sometimes be useful to cite a specific data file directly, particularly if a data file is used that needs to be distinguished from the rest of the data package. When a specific data file is cited, please use a format such as the following, using the handle from the specific data file:
-
- Brian Sidlauskas (2007). Data file from: Testing for Unequal Rates of Morphological Diversification in the Absence of a Detailed Phylogeny: A Case Study From Characiform Fishes. Evolution 61:299–316.  Dryad Digital Repository. http://hdl.handle.net/10255/dryad.20
-
- Mike Whitlock's Notes: Include the return after the first sentence to set it aside. It is the most important point of the whole text. Don’t include the DOI for the paper. Normal paper citations do not include the DOI, and it is confusing here whether it might allude to the paper or the data.
-
- ==Scholarly articles==

+ Detailed data citation guidelines for Dryad are described on the Depositing page [http://www.datadryad.org/using#howCite here]. We suggest: '''When citing data found in Dryad, please cite both the original article, as well as the Dryad data package.'''  Both of these citations are found on the Dryad page for each data package.
+
+ Recommendations for formatting the Dryad DOI are to present it with the web prefix http://dx.doi.org. This is described on the Depositing page [http://datadryad.org/depositing#howFormat here].
+
+ Data citation practices are actively evolving and vary considerably among journals. Dryad does not have a recommendation for the location of data citations in the original, data-sharing, article at this time. Some publishing organizations, such as [http://www.crossref.org/10quarterly/quarterly.html#dois_in_use CrossRef], Springer and [http://gigasciencejournal.com/authors/instructions/research#formatting-references BioMed Central], recommend reporting the data both in the text (e.g., within a Methods or a dedicated Data Resources section) and in the References.
+
+ An example of how to reference the data in the text would be:
+
+ * Data deposited in the Dryad repository: http://dx.doi.org/10.5061/dryad.585t4
+
+ An example of a full citation to the Heneghan et al. (2011) data package would be:
+
+ * Heneghan C, Thompson M, Billingsley M, Cohen, D (2011) Data from: Medical-device recalls in the UK and the device-regulation process: retrospective review of safety notices and alerts. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.585t4
+
+ Regardless of where the data citations are placed, it is important that the DOI be included. Also note that while this may be written as 'doi:10.5061/dryad.585t4', the full URL address 'http://dx.doi.org/10.5061/dryad.585t4' is generally preferred.
+
+ == Other Standards and Proposals for Data Citation ==
  
 
* Peter Buneman's [http://homepages.inf.ed.ac.uk/opb/papers/ssdbm2006.pdf thoughts on making identifiers citable].
 
Line 33: Line 26:
 
** Sample citation based on minimum recommended components:
 
 
*** Micah Altman; Karin MacDonald; Michael P. McDonald, 2005, "Computer Use in Redistricting", hdl:1902.1/AMXGCNKCLU UNF:3:J0PkMygLPfIyT1E/8xO/EA== http://id.thedata.org/hdl%3A1902.1%2FAMXGCNKCLU
 
** Useful comments on '''versioning''': "We recommend versions of the same data set be given new identifiers and treated as separate data sets, with links back to the prior version kept in the metadata describing that data set. Forward links to new versions from the original are easily accomplished via a metadata search on the unique global identifier. New versions of very large data sets (relative to available storage capacity) can be kept by creating a new object that contains only differences from the original, and describing how to combine the differences with the original on the object's metadata description page. Version changes should be reflected by a change in the date, and may also be noted in the title, or by using the extended citation elements."
+
** Useful comments on '''versioning''': "''We recommend versions of the same data set be given new identifiers and treated as separate data sets, with links back to the prior version kept in the metadata describing that data set.'' Forward links to new versions from the original are easily accomplished via a metadata search on the unique global identifier. New versions of very large data sets (relative to available storage capacity) can be kept by creating a new object that contains only differences from the original, and describing how to combine the differences with the original on the object's metadata description page. Version changes should be reflected by a change in the date, and may also be noted in the title, or by using the extended citation elements."
 
* [http://dx.doi.org/10.1787/603233448430 We Need Publishing Standards for Datasets and Data Tables] - White paper from OECD publishing.
 
 
** Advocates a slightly more verbose citation standard than Altman & King. (includes a comparison table for the two standards)
 
 
** In the new system being built by OECD, "All the DOIs for the datasets and tables will be deposited with CrossRef, ready for other publishers to use."
 
 
+
* [http://libraries.mit.edu/guides/subjects/data/access/citing.html MIT Libraries - Social Science Data Services]
==Guides from other initiatives and institutions==
+
* [http://www.icdp-online.org/contenido/std-doi/ STD-DOI project] from the German Science Foundation
# [http://www.icpsr.umich.edu/org/citation.html Interuniversity Consortium for Political and Social Research (ICPSR) - Citing Electronic Data Files]
+
** [http://www.icdp-online.org/contenido/std-doi/front_content.php?client=8&lang=7&idcat=1085&idart=182&m=&s= Citation of Data]
# [http://www.cdc.gov/nchs/howto/citelec.htm National Center for Health Statistics (NCHS) - How to Cite Electronic Media]
+
** SOME EXAMPLES LISTED:
# [http://libraries.mit.edu/guides/subjects/data/access/citing.html MIT Libraries - Social Science Data Services]
+
*** Nozawa, Toru (2004): IPCC-DDC_CCSRNIES_SRES_B2: 211 YEARS MONTHLY MEANS, National Institute for Environmental Studies and Center for Climate System Research Japan, WDCC. doi:10.1594/WDCC/CCSRNIES_SRES_B2
# [http://gcmd.nasa.gov/records/CIESIN_SEDAC_CITATIONS.html Socioeconomic Data and Applications Center (SEDAC) Guide for Citing Data, Applications and Web Resources]
+
*** Kamm,H; Machon, L; Donner, S (2004): Gas Chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p
#* [http://sedac.ciesin.columbia.edu/citations/ SEDAC Citation Guideline and Index]
+
*** Stein, R.; Fahl, K. (2003): Distribution of grain size and clay minerals in surface sediments of the Kara Sea, PANGAEA, doi:10.1594/PANGAEA.119754.
# [http://www.icdp-online.org/contenido/std-doi/ STD-DOI project] from the German Science Foundation
 
#* [http://www.icdp-online.org/contenido/std-doi/front_content.php?client=8&lang=7&idcat=1085&idart=182&m=&s= Citation of Data]
 
#** SOME EXAMPLES LISTED:
 
#*** Nozawa, Toru (2004): IPCC-DDC_CCSRNIES_SRES_B2: 211 YEARS MONTHLY MEANS, National Institute for Environmental Studies and Center for Climate System Research Japan, WDCC. doi:10.1594/WDCC/CCSRNIES_SRES_B2
 
#*** Kamm,H; Machon, L; Donner, S (2004): Gas Chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p
 
#*** Stein, R.; Fahl, K. (2003): Distribution of grain size and clay minerals in surface sediments of the Kara Sea, PANGAEA, doi:10.1594/PANGAEA.119754.
 
 
 
== Other Repository Standards ==
 
 
 
 
* [http://daac.ornl.gov/ ORNL DAAC] users cite both the paper and the dataset.
 
* What does Pangaea say about citing its content?
+
* [http://thedata.org/citation Dataverse]  has a model citation format
** There is a statement at the top of most pages: Always quote citation when using data! Otherwise there does not seem to be an explicit statement or anything about citations, but when looking at a record, the following is listed:
+
* [http://wiki.pangaea.de/wiki/Citation Pangaea]:
 +
** From the About page: Each dataset can be identified, shared, published and cited by using a Digital Object Identifier (DOI). Data are archived as supplements to publications or as citable data collections. Citations are available through the catalog of the German National Library of Science and Technology (TIBORDER).
 +
** There is a statement at the top of most pages: Always quote citation when using data! When looking at a record, the following is listed:
 
*** Citation: Barker, Peter F; Kennett, James P; Shipboard Scientific Party (2005): Core section summary of Hole 113-690C, doi:10.1594/PANGAEA.253771
 
 
*** Reference(s): Barker, Peter F; Kennett, James P; et al (1988): Proceedings of the Ocean Drilling Program, Initial Reports, College Station, Texas (Ocean Drilling Program), 113, 785 pp<br />ODP/TAMU (2005): Janus Database (data copied from JANUS to PANGAEA February to June 2005), Ocean Drilling Program, Texas A&M University, College Station TX 77845-9547, USA, http://odp.pangaea.de/database/
 
Line 61: Line 47:
 
* [http://www.eol.org/content/page/citing Encyclopedia of Life] citation recommendations
 
 
** Listed example: Hancock, John. 2009. "Xysticus posti: Diagnostic description." Edited by David Shorthouse. In The Nearctic Spider Database. Accessed 15 January 2009, available from Encyclopedia of Life, http://eol.org/pages/1210360.
 
* Treebase?
 
* Fishbase?
 
 
* [http://www.ncdc.noaa.gov/paleo/citation.html NOAA Paleoclimatology Program - Data Citation]
 
 
** Example:
 
Line 74: Line 58:
 
* U.S. Geological Survey's Earth Resources Observation and Science (EROS) Center/NASA's Land Processes Distributed Active Archive Center (LP DAAC)
 
 
** Only request acknowledgement, i.e., "Data available from the U.S. Geological Survey" or "These data are distributed by the Land Processes Distributed Active Archive Center (LP DAAC), located at USGS/EROS, Sioux Falls, SD. http://lpdaac.usgs.gov."
 
** SIMILARLY: the [http://daac.gsfc.nasa.gov/data_citation.shtml Goddard Earth Science Data and Information Services Center] requires a statement like: "The data used in this study were acquired as part of the activities of the NASA Earth-Sun System Division, and are archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC)"
+
** SIMILARLY: the Goddard Earth Science Data and Information Services Center requires a statement like: "The data used in this study were acquired as part of the activities of the NASA Earth-Sun System Division, and are archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC)"
** SIMILARLY: [http://podaac.jpl.nasa.gov/WEB_INFO/citations.html PO.DAAC - Crediting PO.DAAC Data Products, Images and Services]
+
** SIMILARLY: PO.DAAC - Crediting PO.DAAC Data Products, Images and Services
 
*** "Please provide acknowledgement of the use of PO.DAAC data products, images, and services in publications or presentations."
 
  
 
[[Category:Project Management]]
 
 
[[Category:Metadata]]
 

Revision as of 09:56, 30 May 2012

This page presents Dryad data citation guidelines, as well as some from other repositories and organizations.

Dryad Data Citation Guidelines

Detailed data citation guidelines for Dryad are described on the Depositing page (http://www.datadryad.org/using#howCite). We suggest: When citing data found in Dryad, please cite both the original article and the Dryad data package. Both of these citations are found on the Dryad page for each data package.

Dryad recommends presenting its DOIs with the web prefix http://dx.doi.org. This is described on the Depositing page (http://datadryad.org/depositing#howFormat).

Data citation practices are actively evolving and vary considerably among journals. At this time, Dryad does not recommend a particular location for data citations in the original data-sharing article. Some publishing organizations, such as CrossRef, Springer and BioMed Central, recommend reporting the data both in the text (e.g., within a Methods or a dedicated Data Resources section) and in the References.

An example of how to reference the data in the text would be:

  • Data deposited in the Dryad repository: http://dx.doi.org/10.5061/dryad.585t4

An example of a full citation to the Heneghan et al. (2011) data package would be:

  • Heneghan C, Thompson M, Billingsley M, Cohen D (2011) Data from: Medical-device recalls in the UK and the device-regulation process: retrospective review of safety notices and alerts. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.585t4

Regardless of where the data citations are placed, it is important that the DOI be included. Also note that while this may be written as 'doi:10.5061/dryad.585t4', the full URL address 'http://dx.doi.org/10.5061/dryad.585t4' is generally preferred.
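
As an illustration of the formatting advice above, here is a minimal Python sketch that rewrites a DOI given in the 'doi:' form into the preferred full URL form and assembles a data package citation in the pattern of the Heneghan et al. example. The function names `doi_to_url` and `data_package_citation` are hypothetical helpers invented for this sketch; they are not part of any Dryad tool or API.

```python
# Hypothetical helpers (not a Dryad API): normalise a DOI to the full URL
# form recommended above and build a "Data from:" style citation string.

DOI_RESOLVER = "http://dx.doi.org/"  # web prefix named in the guidelines


def doi_to_url(doi: str) -> str:
    """Accept 'doi:10.x/y', a bare '10.x/y', or an existing URL; return a URL."""
    text = doi.strip()
    if text.lower().startswith(("http://", "https://")):
        return text  # already in URL form, leave unchanged
    if text.lower().startswith("doi:"):
        text = text[4:].strip()  # drop the 'doi:' label
    if not text.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + text


def data_package_citation(authors: str, year: int, article_title: str, doi: str) -> str:
    """Follow the pattern: Authors (Year) Data from: Title. Dryad Digital Repository. URL"""
    return (
        f"{authors} ({year}) Data from: {article_title}. "
        f"Dryad Digital Repository. {doi_to_url(doi)}"
    )


if __name__ == "__main__":
    print(doi_to_url("doi:10.5061/dryad.585t4"))
```

Called with 'doi:10.5061/dryad.585t4', `doi_to_url` returns 'http://dx.doi.org/10.5061/dryad.585t4', the form described above as generally preferred.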

Other Standards and Proposals for Data Citation

  • Peter Buneman's thoughts on making identifiers citable.
  • A Proposed Standard for the Scholarly Citation of Quantitative Data by Micah Altman and Gary King.
    • Summary of article: Citations to numerical data should include, at a minimum, six required components. The first three components are traditional, directly paralleling print documents. They include the author(s) of the data set, the date the data set was published or otherwise made public, and the data set title. The other three are: a unique global identifier, a universal numeric fingerprint, and a bridge service. They are also designed to take advantage of the digital form of quantitative data.
    • Sample citation based on minimum recommended components:
      • Micah Altman; Karin MacDonald; Michael P. McDonald, 2005, "Computer Use in Redistricting", hdl:1902.1/AMXGCNKCLU UNF:3:J0PkMygLPfIyT1E/8xO/EA== http://id.thedata.org/hdl%3A1902.1%2FAMXGCNKCLU
    • Useful comments on versioning: "We recommend versions of the same data set be given new identifiers and treated as separate data sets, with links back to the prior version kept in the metadata describing that data set. Forward links to new versions from the original are easily accomplished via a metadata search on the unique global identifier. New versions of very large data sets (relative to available storage capacity) can be kept by creating a new object that contains only differences from the original, and describing how to combine the differences with the original on the object's metadata description page. Version changes should be reflected by a change in the date, and may also be noted in the title, or by using the extended citation elements."
  • We Need Publishing Standards for Datasets and Data Tables - White paper from OECD publishing.
    • Advocates a slightly more verbose citation standard than Altman & King. (includes a comparison table for the two standards)
    • In the new system being built by OECD, "All the DOIs for the datasets and tables will be deposited with CrossRef, ready for other publishers to use."
  • MIT Libraries - Social Science Data Services
  • STD-DOI project from the German Science Foundation
    • Citation of Data
    • SOME EXAMPLES LISTED:
      • Nozawa, Toru (2004): IPCC-DDC_CCSRNIES_SRES_B2: 211 YEARS MONTHLY MEANS, National Institute for Environmental Studies and Center for Climate System Research Japan, WDCC. doi:10.1594/WDCC/CCSRNIES_SRES_B2
      • Kamm,H; Machon, L; Donner, S (2004): Gas Chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p
      • Stein, R.; Fahl, K. (2003): Distribution of grain size and clay minerals in surface sediments of the Kara Sea, PANGAEA, doi:10.1594/PANGAEA.119754.
  • ORNL DAAC users cite both the paper and the dataset.
  • Dataverse has a model citation format
  • Pangaea:
    • From the About page: Each dataset can be identified, shared, published and cited by using a Digital Object Identifier (DOI). Data are archived as supplements to publications or as citable data collections. Citations are available through the catalog of the German National Library of Science and Technology (TIBORDER).
    • There is a statement at the top of most pages: Always quote citation when using data! When looking at a record, the following is listed:
      • Citation: Barker, Peter F; Kennett, James P; Shipboard Scientific Party (2005): Core section summary of Hole 113-690C, doi:10.1594/PANGAEA.253771
      • Reference(s): Barker, Peter F; Kennett, James P; et al (1988): Proceedings of the Ocean Drilling Program, Initial Reports, College Station, Texas (Ocean Drilling Program), 113, 785 pp
        ODP/TAMU (2005): Janus Database (data copied from JANUS to PANGAEA February to June 2005), Ocean Drilling Program, Texas A&M University, College Station TX 77845-9547, USA, http://odp.pangaea.de/database/
    • Example: Stein, R.; Fahl, K. (2003): Distribution of grain size and clay minerals in surface sediments of the Kara Sea, PANGAEA, doi:10.1594/PANGAEA.119754.
  • Encyclopedia of Life citation recommendations
    • Listed example: Hancock, John. 2009. "Xysticus posti: Diagnostic description." Edited by David Shorthouse. In The Nearctic Spider Database. Accessed 15 January 2009, available from Encyclopedia of Life, http://eol.org/pages/1210360.
  • NOAA Paleoclimatology Program - Data Citation
    • Example:
      • General form for citing published World Data Center for Paleoclimatology Data: Anderson, D.W., W.L. Prell, and N.J. Barratt. 1989. Estimates of sea surface temperature in the Coral Sea at the last glacial maximum. Paleoceanography 4(6):615-627. Data archived at the World Data Center for Paleoclimatology, Boulder, Colorado, USA.
    • Related: Notices when using National Climate Data Center data:
      • "Please acknowledge contributors and where appropriate, data cooperatives (e.g. International Tree-Ring Data Bank), when using these data."
      • NOTE: PLEASE CITE ORIGINAL REFERENCE WHEN USING THIS DATA!!!!!
      • Also offered by NCDC:
        • SUGGESTED DATA CITATION: Shen, G.T. and E.A. Boyle. 2004.
          Lead in Corals Data.
          IGBP PAGES/World Data Center for Paleoclimatology
          Data Contribution Series #2004-096.
          NOAA/NGDC Paleoclimatology Program, Boulder CO, USA.
        • ORIGINAL REFERENCE: Shen, G.T. and E.A. Boyle. 1987.
          Lead in corals: reconstruction of historical industrial fluxes to the surface ocean.
          Earth and Planetary Science Letters 82: 289-304.
  • U.S. Geological Survey's Earth Resources Observation and Science (EROS) Center/NASA's Land Processes Distributed Active Archive Center (LP DAAC)
    • Only request acknowledgement, e.g., "Data available from the U.S. Geological Survey" or "These data are distributed by the Land Processes Distributed Active Archive Center (LP DAAC), located at USGS/EROS, Sioux Falls, SD. http://lpdaac.usgs.gov."
    • SIMILARLY: the Goddard Earth Science Data and Information Services Center requires a statement like: "The data used in this study were acquired as part of the activities of the NASA Earth-Sun System Division, and are archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC)"
    • SIMILARLY: PO.DAAC - Crediting PO.DAAC Data Products, Images and Services
      • "Please provide acknowledgement of the use of PO.DAAC data products, images, and services in publications or presentations."
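
Returning to the Altman and King proposal listed above, the six minimum components (authors, date, title, unique global identifier, universal numeric fingerprint, bridge service) can be sketched as a small Python record. This is only an illustration: the class name `DataCitation`, its field names, and the assumption that the bridge URL is the resolver address followed by the percent-encoded identifier are inferred from the sample citation quoted on this page, not taken from the paper.

```python
# Illustrative sketch only: field names and the bridge-URL rule are
# assumptions read off the sample citation, not the authors' specification.
from dataclasses import dataclass
from typing import Tuple
from urllib.parse import quote


@dataclass(frozen=True)
class DataCitation:
    authors: Tuple[str, ...]   # component 1: author(s) of the data set
    year: int                  # component 2: date published or made public
    title: str                 # component 3: data set title
    identifier: str            # component 4: unique global identifier
    unf: str                   # component 5: universal numeric fingerprint
    bridge: str = "http://id.thedata.org/"  # component 6: bridge service

    def bridge_url(self) -> str:
        # Percent-encode every reserved character, including ':' and '/'.
        return self.bridge + quote(self.identifier, safe="")

    def format(self) -> str:
        names = "; ".join(self.authors)
        return (
            f'{names}, {self.year}, "{self.title}", '
            f"{self.identifier} {self.unf} {self.bridge_url()}"
        )


sample = DataCitation(
    authors=("Micah Altman", "Karin MacDonald", "Michael P. McDonald"),
    year=2005,
    title="Computer Use in Redistricting",
    identifier="hdl:1902.1/AMXGCNKCLU",
    unf="UNF:3:J0PkMygLPfIyT1E/8xO/EA==",
)

if __name__ == "__main__":
    print(sample.format())
```

Under these assumptions, `sample.format()` reproduces the Altman, MacDonald and McDonald sample citation given in the page source.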