Data Management Plan Guidance
Effective January 18, 2011, the National Science Foundation’s (NSF)
Proposal and Award Policies and Procedures Guide (PAPPG) was amended to
require submission of a Data Management Plan with all proposals. The plan is limited
to two pages and must describe how the proposal will conform to NSF policy on the
dissemination and sharing of research results.
The following components may be included in the plan:
- the types of data, samples, physical collections, software, curriculum materials,
and other materials to be produced in the course of the project;
- the standards to be used for data and metadata format and content (where existing
standards are absent or deemed inadequate, this should be documented along with
any proposed solutions or remedies);
- policies for access and sharing, including provisions for appropriate protection
of privacy, confidentiality, security, intellectual property, or other rights or
requirements;
- policies and provisions for re-use, re-distribution, and the production of derivatives;
and
- plans for archiving data, samples, and other research products, and for preservation
of access to them.
(Optional topics to consider are provided at the bottom of this page.)
The Data Management Plan will be reviewed as an integral part of the proposal, coming
under Intellectual Merit or Broader Impacts or both, as appropriate for the scientific
community of relevance.
Individual NSF Directorates, Offices,
Divisions, Programs, or other units may issue specific guidance for the
data management requirements of their areas. If guidance specific to a particular
program is not available, then the requirements listed above will apply.
NSF DMP Template Guidance Tool – use this step-by-step tool to build your DMP
For more information, visit the NSF Proposal and Award Policies and Procedures Guide
or contact your ORA Pre-Award Administrator.
Data Management Plan Optional Topics
- Types of data
Samples, physical collections, software, curriculum materials, and other materials
to be produced in the course of the project.
- What data will be generated in the research? (Give a short description, including
the amount, if known, and the content of the data.)
- What data types will you be creating or capturing? (e.g. experimental measures,
observational or qualitative data, model simulations, processed data)
- How will you capture or create the data?
- If you will be using existing data, state that fact and identify its source. What is
the relationship between the data you are collecting and the existing data?
- Data and Metadata Standards
Standards to be used for data and metadata format and content (where existing standards
are absent or deemed inadequate, this should be documented along with any proposed
solutions or remedies).
- Which file formats will you use for your data, and why?
- What contextual details (metadata) are needed to make the data you capture or collect
meaningful?
- How will you create or capture these details?
- What form will the metadata take?
- Which metadata standards will you use?
- Why have you chosen particular standards and approaches for metadata and contextual
documentation? (e.g. recourse to staff expertise, Open Source, accepted domain-local
standards, widespread usage)
- Policies for access and sharing and provisions for appropriate protection/privacy
- How will you make the data available? (Resources needed: equipment, systems, expertise,
etc.)
- When will you make the data available? (Give details of any embargo periods
for political/commercial/patent reasons.)
- What is the process for gaining access to the data?
- Will access be chargeable?
- Does the original data collector/creator/principal investigator retain the right
to use the data before opening it up to wider use?
Provisions for appropriate protection of privacy, confidentiality, security, intellectual
property, or other rights or requirements.
- Are there ethical and privacy issues?
- If so, how will these be resolved? (e.g. anonymization of data, institutional ethical
committees, formal consent agreements.)
- What have you done to comply with your obligations in your IRB Protocol?
- Is the dataset covered by copyright? If so, who owns the copyright and other intellectual
property?
- How will the dataset be licensed if rights exist? (e.g. any restrictions or delays
on data sharing needed to protect intellectual property, copyright or patentable
data.)
- Policies and provisions for re-use, re-distribution
- Will any permission restrictions need to be placed on the data?
- Which bodies/groups are likely to be interested in the data?
- What are the intended or foreseeable uses of the data, and who are the likely users?
- Are there any reasons not to share or re-use data? (Suggestions: ethical, non-disclosure,
etc.)
- Plans for archiving and preservation of access
Plans for archiving data, samples, and other research products, and for preservation
of access to them.
- What is the long-term strategy for maintaining, curating and archiving the data?
- Which archive/repository/central database/data centre have you identified
as a place to deposit data?
- What transformations will be necessary to prepare data for preservation/data
sharing? (e.g. data cleaning/anonymization where appropriate.)
- What metadata/documentation will be submitted alongside the data or created on
deposit/transformation in order to make the data reusable?
- What related information will be deposited? (e.g. references, reports, research
papers, fonts, the original bid proposal, etc.)
- How long will/should data be kept beyond the life of the project?
- What procedures does your intended long-term data storage facility have in place
for preservation and backup?