STORAGE OPTIONS FOR PRESERVATION AND SHARING OF THE RESEARCH DATA
1. The Hive: The University Data Repository
The Hive is a University of Utah campus data repository for storing research datasets generated by faculty, student, and staff researchers at the University of Utah. The Hive accepts publicly available datasets up to 500 GB in a wide variety of file formats. The Hive is free of charge and provides a variety of services including a structured metadata record for searching and finding data, DOI minting services, data storage for a minimum of ten years in a trusted and secure environment, and consultation with experienced data librarians. Currently, The Hive is in a pilot phase, but will be broadly available to all campus researchers in Spring 2018.
2. Center for High Performance Computing (CHPC)
CHPC has an archive storage solution based on object storage. This solution uses the open-source Ceph software (http://ceph.com/), a distributed object store suite originally developed at UC Santa Cruz. We have an initial raw capacity of 1.15 PB, at a cost of $80/TB of raw space. To calculate the cost per TB of usable space, you must take the redundancy configuration into account. Initially, we are offering a 6+3 erasure coding configuration, in which six of every nine stored chunks hold data and three hold parity, so two-thirds of the raw space is usable. This results in a price of $120/TB of usable capacity for the 5-year lifetime of the hardware (research subsidized price; the total cost of operation price for non-research usage is to be determined). As we currently do with our group space, we will operate this space in a condominium model, reselling it in TB chunks.
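The relationship between the raw and usable prices quoted above can be checked with a short calculation. The following is a minimal sketch in Python; the function names are illustrative only, and it assumes that erasure coding overhead is the only factor separating raw from usable space (any other system overhead is ignored).

```python
from fractions import Fraction


def usable_cost_per_tb(raw_cost_per_tb, data_chunks, parity_chunks):
    """Cost per usable TB under a k+m erasure coding layout.

    Only `data_chunks` out of every `data_chunks + parity_chunks`
    stored chunks hold user data, so the raw price is scaled up by
    (k + m) / k.
    """
    total_chunks = data_chunks + parity_chunks
    return Fraction(raw_cost_per_tb) * total_chunks / data_chunks


def usable_capacity(raw_capacity, data_chunks, parity_chunks):
    """Usable capacity under the same layout: raw * k / (k + m)."""
    total_chunks = data_chunks + parity_chunks
    return Fraction(raw_capacity) * data_chunks / total_chunks


# Figures from the text: $80 per raw TB, 6+3 erasure coding, 1.15 PB raw.
cost = usable_cost_per_tb(80, 6, 3)
capacity_pb = usable_capacity("1.15", 6, 3)

print(f"Cost per usable TB: ${float(cost):.2f}")
print(f"Approximate usable capacity: {float(capacity_pb):.3f} PB")
```

With the 6+3 layout the multiplier is 9/6 = 1.5, which reproduces the $120/TB figure from the $80/TB raw price.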
One of the key features of the archive system is that users manage the archive directly, unlike the tape archive option. Users can move data in and out of the archive storage as needed: they can archive milestone moments in their research, store an additional copy of crucial instrument data, or retrieve data when required. This archive storage solution is accessible via applications that use Amazon's S3 API. GUI tools such as Transmit (for Mac, https://www.panic.com/transmit/) as well as command-line tools such as s3cmd and rclone (https://rclone.org/, https://www.chpc.utah.edu/documentation/software/rclone.php) can be used to move the data. In addition, Globus (https://www.globus.org, https://www.chpc.utah.edu/documentation/software/globus.php) can be used to access this space; however, note that the Globus Ceph plugin is a new tool that is still in development and should be treated as such.
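As an illustration of what S3-compatible access looks like in practice, the sketch below uses Python to generate an rclone remote definition for a Ceph-backed S3 endpoint. The remote name, endpoint URL, and credentials are hypothetical placeholders, not actual CHPC values; consult the CHPC rclone documentation linked above for the real settings. The option names (`type`, `provider`, `access_key_id`, `secret_access_key`, `endpoint`) are those used by rclone's S3 backend.

```python
import configparser
import io


def rclone_s3_remote(name, endpoint, access_key, secret_key):
    """Return the text of an rclone config section for a Ceph S3 remote."""
    # Disable interpolation so '%' characters in keys are written verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config[name] = {
        "type": "s3",
        "provider": "Ceph",
        "access_key_id": access_key,
        "secret_access_key": secret_key,
        "endpoint": endpoint,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# All values below are hypothetical placeholders, not real CHPC settings.
remote_text = rclone_s3_remote(
    name="archive-example",
    endpoint="https://s3.example.edu",
    access_key="ACCESS_KEY_PLACEHOLDER",
    secret_key="SECRET_KEY_PLACEHOLDER",
)
print(remote_text)
```

Once a section like this is placed in the rclone configuration file, a command along the lines of `rclone copy ./results archive-example:mybucket/results` would copy a local directory into a bucket on the remote (the bucket and path names here are likewise hypothetical).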