All University of Utah libraries course and research guides, in one place.
Data repositories are being developed as a result of federal agencies, journals, and scholarly societies mandating that researchers share the data resulting from their research. Some data repositories have been around for over 50 years, but most have been developed since the early '90s. Repositories have been developed by governments, scholarly societies, institutions and universities. They can be open (all are invited to deposit and use the data) or closed (only members may deposit or use the data).
When selecting a repository, keep in mind that "publishing" datasets is becoming like publishing a research paper: you want it to be found and cited. Eventually, datasets may be a part of the tenure and promotion procedure (at the U, the conversation has yet to begin). The Web of Science team has already developed the Data Citation Index (like the Science Citation Index, but for research data). Researchers are starting to cite datasets along with their other references. See the CITING DATA tab for additional information.
Before settling on a repository, consider:
The subject(s) the repository accepts.
Your funding agencies may have a specific repository for your datasets.
The journals in which you will be publishing (e.g. PLoS) may have a specific repository for your datasets or require that they be deposited in an open access repository.
Your scholarly society and colleagues may already be depositing datasets in a repository. Talk to them.
If you were required to write a data management plan to include with your grant proposal, what did you say about sharing your research data?
Check the cost of using the repository. Do you have the funding to cover it? The cost to deposit and any maintenance fees depend on the repository. Not all repositories charge to deposit research data. If a repository requires membership, then either the researcher or the researcher's institution must belong. The Marriott Library pays for membership in ICPSR (Inter-university Consortium for Political and Social Research), so everyone on campus conducting research in social science can use it and deposit datasets.
Check whether the repository is able to preserve (not just back up) your datasets. Does it have the technology and policy in place for preservation to ensure your datasets will be maintained for use in the future?
Check the metadata and vocabulary requirements of the repository. The metadata should include enough information about how the project was conducted so that it can be replicated. Your discipline may have already developed a standard vocabulary.
Check which file formats are acceptable. File names should usually include only letters, numbers, dashes ("-"), and underscores ("_"), and should be all lower-case. The repository may have additional restrictions.
Check to make sure the datasets receive persistent identifiers (PIDs). A DOI is the most commonly used PID for datasets and publications. ARKs can be deleted so are not useful PIDs for datasets. PIDs are used to link the datasets with the publications.
Check whether your datasets can be restricted to specific users if they contain sensitive data. Can the datasets be restricted for a specific time period?
Does the repository provide information on how to cite data reused by others? If you are going to do all the work of depositing your data you may as well receive credit for it.
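The file-naming guideline in the list above can be checked before you deposit. The following is a minimal sketch in Python, not any repository's actual rule: it assumes a name made of lower-case letters, numbers, dashes, and underscores, followed by a single dot and a lower-case extension. The function names is_safe_file_name and suggest_file_name are hypothetical, and the repository you choose may impose different or stricter rules.

```python
import re

# Assumed rule (based on the guideline above, not on any specific repository):
# lower-case letters, digits, dashes, and underscores, then ".extension".
SAFE_NAME = re.compile(r"[a-z0-9_-]+\.[a-z0-9]+")


def is_safe_file_name(name: str) -> bool:
    """Return True if the whole name matches the assumed naming rule."""
    return SAFE_NAME.fullmatch(name) is not None


def suggest_file_name(name: str) -> str:
    """Propose a cleaned-up name: lower-case, with other characters replaced."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: treat the whole name as the stem.
        stem, ext = name, ""
    # Replace each run of disallowed characters in the stem with one underscore.
    stem = re.sub(r"[^a-z0-9_-]+", "_", stem.lower()).strip("_")
    # Keep only lower-case letters and digits in the extension.
    ext = re.sub(r"[^a-z0-9]+", "", ext.lower())
    return f"{stem}.{ext}" if ext else stem


if __name__ == "__main__":
    for candidate in ["soil_samples_2019-06.csv", "Soil Samples (final).CSV"]:
        print(candidate, is_safe_file_name(candidate), suggest_file_name(candidate))
```

Running a check like this over a folder of files before upload may catch spaces, capital letters, and punctuation early, but the repository's own documentation remains the authority on what it accepts.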
There are other ways of sharing your research data besides depositing in a repository. See the tab NOT ONLY REPOSITORIES for additional information. You can also publish a paper describing your datasets. See the tab DATA JOURNALS for additional information.