Research Data Management: Data Storage and Preservation
Why Data Storage and Preservation Matters
Proper data storage and preservation is essential to making data accessible now and in the future. Preservation specifically ensures data are accessible while retaining their integrity and authenticity over time. Selecting the proper storage location(s) and preservation methods for your data should be done in the planning stage of the Data Lifecycle and be included in Data Management Plans.
Key Components of Data Preservation
1. Select which data—and at which stage in the lifecycle—will be preserved.
2. Ensure you have adequate, descriptive metadata so that future users can correctly interpret the data.
3. Use an open, non-proprietary, and commonly used file format.
4. Follow the 3-2-1 rule for backup copies: have 3 copies of the data stored on 2 different media with at least 1 stored off-site or in a cloud environment.
5. Document a file retention plan including who will be responsible for the data over time.
6. Select an appropriate location for storage and archiving. See Data Storage Considerations below.
Source: Smithsonian Library, "Best Practices for Storing, Archiving and Preserving Data."
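As a rough illustration of the 3-2-1 rule in item 4, the sketch below checks whether a set of backup copies meets it, and computes a checksum that can be compared across copies to confirm they are still identical. The names `BackupCopy`, `satisfies_3_2_1`, and `sha256_of` are hypothetical and used only for this example; they are not part of any library tool or service.

```python
from dataclasses import dataclass
import hashlib
from pathlib import Path


@dataclass(frozen=True)
class BackupCopy:
    """One copy of a dataset (all field names are illustrative assumptions)."""
    label: str      # e.g. "lab workstation"
    medium: str     # e.g. "internal disk", "external drive", "cloud"
    offsite: bool   # True if stored off-site or in a cloud environment


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule:
    at least 3 copies, on at least 2 different media, at least 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


def sha256_of(path, chunk_size=1 << 20):
    """Compute a SHA-256 checksum of a file, reading it in chunks.
    Matching checksums across copies indicate the contents are identical."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


copies = [
    BackupCopy("lab workstation", "internal disk", offsite=False),
    BackupCopy("portable backup", "external drive", offsite=False),
    BackupCopy("cloud backup", "cloud", offsite=True),
]
print(satisfies_3_2_1(copies))
```

Recording the checksum of each file at deposit time and recomputing it periodically is one simple way to notice if a copy has been altered or corrupted.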
Data Storage Considerations
There are several key considerations when selecting a storage location for your data. To determine the best place to store your research data, ask yourself the following questions:
- Do you have funding to support data storage in a repository?
- Is your research funded by an agency that has required or recommended repositories? See our summary of national agency requirements in the Policies and Guidelines tab of this guide.
- What is the subject of your research? Is there a subject- or discipline-specific repository that is regularly used in your field? See our Data Storage for Research Activities guide for support in choosing the right repository for your project.
Once you have selected a repository or a list of possible repositories, consider the following:
- Are there fees associated with storing your data?
- Does the repository preserve and/or back up your data? If so, is there a limited retention period and what are your storage options once the retention period has ended?
- What are the requirements for ingesting your data into the repository (README files and/or metadata, additional documentation, etc.)?
- Will a persistent identifier (PID) be minted for your dataset in this location?
- Which file formats does the repository accept? Are there file size limitations?
- Can datasets be restricted to specific users, or does the repository offer other protections for sensitive data? If your data include protected health information, is the repository HIPAA compliant?
- Is the repository open access or is there a paywall in place for users who want to access the data?
- Does the repository provide information on how users should cite your data?
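Questions about accepted file formats and size limits can be checked before deposit. The following is a minimal sketch, assuming you already know a repository's accepted extensions and per-file size limit; the function names and the example limits are hypothetical and do not describe any particular repository.

```python
from pathlib import Path


def build_manifest(dataset_dir):
    """List every file under dataset_dir with its extension and size in bytes."""
    root = Path(dataset_dir)
    return [
        {
            "path": str(p.relative_to(root)),
            "extension": p.suffix.lower(),
            "bytes": p.stat().st_size,
        }
        for p in sorted(root.rglob("*"))
        if p.is_file()
    ]


def files_outside_limits(manifest, accepted_extensions, max_bytes):
    """Return the manifest entries that a repository with these
    (assumed) limits would not accept."""
    return [
        entry for entry in manifest
        if entry["extension"] not in accepted_extensions
        or entry["bytes"] > max_bytes
    ]
```

For example, `files_outside_limits(build_manifest("my_dataset"), {".csv", ".txt"}, 2 * 1024**3)` would list any files in a hypothetical `my_dataset` folder that are not CSV or plain text, or that exceed 2 GiB. The same manifest can also serve as a starting point for the file inventory in a README.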
Data Preservation Resources @ The U
The Hive is the University of Utah's research data repository, provided jointly by the J. Willard Marriott Library and the Eccles Health Sciences Library. It is designed to broadly disseminate the intellectual contributions in research and creativity produced by the University's faculty, staff, and students, and to ensure their longevity. Here you will find information on preparing, uploading, and depositing your datasets and the corresponding documentation. For more information about depositing your data into The Hive, email us at email@example.com.
LabArchives is a general-purpose electronic lab notebook (ELN) for research groups on campus. It is cloud-based and can run on Windows, Mac, and Linux. Apps are also available for tablets and phones (both Android and iOS). The campus license covers LabArchives ELN Education, ELN for Research, and laboratory inventory management systems. These are available free of charge to assist researchers in managing their research. For more information, see our LabArchives guide.
Training and Consultations
Your Research Data Librarians at the J. Willard Marriott Library are pleased to offer various training sessions and consultation appointments to assist you with all of your research data management needs. If you would like to request a consultation or schedule data management training for your class or department, contact Kaylee Alexander (firstname.lastname@example.org) or Madison Golden (email@example.com).
Center for High Performance Computing (CHPC)
In addition to deploying and operating high performance computational resources and providing advanced user support and training, CHPC serves as an expert team to broadly support the increasingly diverse research computing and data needs on campus. These needs include support for big data, big data movement, data analytics, security, virtual machines, Windows science application servers, protected environments for data mining and analysis of protected health information, and advanced networking.
DOI and ARK Identifier Minting Service
The J. Willard Marriott Library offers DOI and ARK minting services to university researchers. Faculty, graduate students, postdocs, and research associates may create up to 10 DOIs or ARKs per year by logging in to the service with their UNID.
Before minting a DOI or ARK for your dataset, be sure to review our Persistent Identifiers guide and/or set up a consultation with Kaylee Alexander (firstname.lastname@example.org) or Madison Golden (email@example.com).