Data Management Plans

A Data Management Plan (DMP)...

  • ...describes the data you expect to acquire or generate during the course of a research project
  • ...tells how you will manage, describe, analyze, and store those data
  • ...documents what mechanisms you will use at the end of your project to share and preserve your data
  • ...is an integral part of the research process

Why Data Management Plans?

  • Funding agencies (e.g., NIH, NSF) require them
  • Well-managed data is easier to
    • find, understand, analyze, validate & share
  • Promote new discoveries
  • Support Open Access

A DMP must address the questions listed below. The answers need not be long or complicated, but the information must be there.

  • Who on your team is handling data?
  • Who is responsible for the data?
    • Is the PI responsible for everything?
    • Responsibilities of other investigators?
    • Responsibilities of student assistants?
  • Who ensures the DMP is being followed?
  • How would you transfer responsibility for the data?
  • Who will bear the cost of the preparation, management, and preservation of the data?
  • What types of data will you be generating?
    • Experimental measurements
    • Qualitative data
    • Simulations
    • Other
  • How will the data be created or captured?
  • Are you using existing data?
  • Which file formats will you use for your data and why?
  • Which standards will you use?
    • If your format is non-standard, how will you convert the data to an accessible format?
    • If there are no applicable standards, how will you format your data so others can make use of it?
  • How will you store your data during the project?
    • Three copies
      • Original
      • Local copy
      • Remote copy
  • File characteristics best for long-term access
    • Non-proprietary
    • Unencrypted / uncompressed
    • File names without special characters or spaces
  • Examples of preferred formats
    • Containers: TAR, GZIP, ZIP
    • Databases: XML, CSV
    • Geospatial: SHP, DBF, GeoTIFF, NetCDF
    • Moving Images: MOV, MPEG, AVI, MXF
    • Audio: WAVE, AIFF, MP3, MXF
    • Numbers/statistics: ASCII, DTA, POR, SAS, SAV
    • Images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
    • Text: PDF/A, HTML, ASCII, XML, UTF-8
    • Web Archive: WARC
  • What contextual details do you need to make the data meaningful to others?
  • Metadata
    • Descriptive records for each of the data files
      • Embedded into the data files
      • External to the data files
  • What form will the metadata describing your data take?
    • Which metadata standards will you use?
    • If no applicable standard, how will you describe your data to make it accessible to others?
  • How will metadata files be generated for each data set?
    • Who will do the work of data description?
  • Who would be interested in your data?
    • For how long after the project?
  • How and when will you make the data available?
    • Does this match the data sharing norms practiced by those interested in your data?
  • How long will you embargo your data?
  • Will any restrictions be placed on the data?
    • Personal restrictions
    • IRB policies
    • Publisher policies
  • Are there any issues that might require limitations on the data being shared?
  • Are there group members who would need to claim intellectual property rights to the data?
    • Patents from the research
    • Academic partners
    • Corporate partners
  • What conditions will you attach to potential uses for your data?
    • Will you permit the re-use of your data?
    • Will you permit the re-distribution of your data?
  • What data sets that you plan to generate will have long-term value to others?
    • For how long?
  • How long will the data be kept beyond the life of the project?
  • What is your long-term strategy for archiving and providing access to your data?
  • Which archive, repository, or database is the best place to deposit your data?
    • What are the procedures for preservation and backup?
      • Are there any security measures needed when storing and distributing the data?
  • What are the procedures for forward migration of storage technologies?
  • What data preparation, description, or cleaning procedures will be necessary for archiving?
    • Quality or consistency checks
    • De-identification
    • Compliance with IRB requirements
    • Consent from project members or other stakeholders
    • What metadata and other documentation will be submitted with the data?
    • Will any other related information be deposited?
  • How much will it cost to preserve and disseminate the data and how will these costs be covered?
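
Two of the storage items in the checklist above, keeping three copies of each file and avoiding spaces or special characters in file names, can be checked mechanically. The following is a minimal Python sketch, not a prescribed procedure: the function names (`is_portable_name`, `make_copies`), the choice of SHA-256 for verification, the definition of "special characters", and the idea that the remote copy lives in a mounted directory are all assumptions made for illustration.

```python
import hashlib
import re
import shutil
from pathlib import Path

# Assumption: "special characters" means anything other than letters,
# digits, dot, underscore, and hyphen. The checklist does not define it.
PORTABLE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_portable_name(name: str) -> bool:
    """Return True if a file name has no spaces or special characters."""
    return PORTABLE_NAME.fullmatch(name) is not None


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_copies(original: Path, local_dir: Path, remote_dir: Path) -> dict:
    """Copy `original` into a local and a remote directory and verify both.

    Returns a mapping from each file path (original plus two copies) to its
    checksum. Raises ValueError if a copy does not match the original.
    `remote_dir` is assumed to be a mounted network or cloud location.
    """
    expected = sha256_of(original)
    checksums = {str(original): expected}
    for target_dir in (local_dir, remote_dir):
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 preserves timestamps and returns the path of the new file.
        copy_path = Path(shutil.copy2(original, target_dir))
        actual = sha256_of(copy_path)
        if actual != expected:
            raise ValueError(f"checksum mismatch for {copy_path}")
        checksums[str(copy_path)] = actual
    return checksums
```

Calling `make_copies(Path("raw/readings.csv"), Path("backup_local"), Path("/mnt/remote/backup"))` (hypothetical paths) would leave three verified copies of the file. Recording the returned checksums alongside the data makes it possible to confirm later that none of the copies has changed.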

DMP Resources

Free to Reuse with Credit