Research Data Management: Data Sharing: Find the repository for your data

This guide helps researchers decide whether to share their data, prepare the data for sharing, and select an appropriate repository for it.

Search for repositories

Thinking about depositing your data in a repository? Have you already checked for one recommended by your scientific community (e.g. Royal Society of Chemistry, IEEE DataPort™), your intended publisher or your funder, and still not found one for your data? There is a list of general purpose repositories (see below) that may be suitable. Does your research project include the development of code? Check this list of code repositories we've compiled for your reference. Still haven't found a repository for your data? Try searching for one in:

1. re3data, a global registry of research data repositories that covers repositories from different academic disciplines. A simple search tool returns a list of potentially suitable repositories to consider, and filters help you refine your search. Sara Jones from the DCC briefly explains how to use re3data efficiently in this short video.

2. FAIRsharing (formerly BIOsharing), a web-based, searchable portal of three interlinked registries (standards, databases and data policies), combined with an integrated view across all three types of resource (collections). The simple search box searches through all types of content. To narrow the search to repositories, tick the “Databases” checkbox underneath it; tick “Collections” to access information that combines several types of resource (standard, database or policy) by domain, project or organization.

General purpose repositories

Where no data-type-specific repository is available for your data, the following generalist repositories could be the right destination, since they can handle a wide variety of data. With the exception of Dryad, these repositories can also be used to provide institutional solutions. For more information, see figshare for institutions, Dataverse for institutions, Zenodo communities and OSF for institutions.

| Repository name | Information on fees/costs | Size limits |
|---|---|---|
| KAUST Repository * | Free of charge for KAUST affiliated researchers | Recommended not to exceed 10 GB per individual file and 50 GB per data package. For larger sizes, please contact the repository. |
| Dryad Digital Repository | $120 USD for the first 20 GB, and $50 USD for each additional 10 GB | None stated as total. Individual file or package size limit 10 GB. |
| figshare ** | All personal accounts are free | For personal accounts: private total storage 20 GB; file size limit 5 GB; unlimited public space |
| Harvard Dataverse | Contact repository | 2.5 GB per file; 10 GB per dataset |
| Zenodo ** | Donations towards sustainability encouraged | 50 GB per dataset |
| Open Science Framework (OSF) *** | Free | 5 GB per file; multiple files can be uploaded |

* The KAUST Repository integrates with DataCite to provide DOIs, more details here.

** Zenodo and figshare integrate with GitHub to make code hosted on GitHub citable by issuing DOIs.

*** OSF integrates with the following storage services: Amazon S3, Bitbucket, Box, Dataverse, Dropbox, figshare, GitHub, GitLab, Google Drive, OneDrive and ownCloud. For code in particular, it integrates with GitHub.

Information last updated 2018-05-10
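
To shortlist the general purpose repositories above by size, the limits in the table can be written down and filtered. The following is a minimal Python sketch, not an official tool: the names `Repo`, `REPOS` and `candidates` are hypothetical, the figures are transcribed from the table as of 2018-05-10, and `None` is used here (as an assumption) wherever the table states no limit of that kind. Fees and the other selection criteria are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Repo:
    name: str
    max_file_gb: Optional[float]     # None: no per-file limit stated in the table
    max_dataset_gb: Optional[float]  # None: no per-dataset limit stated in the table


# Limits transcribed from the table above (information last updated 2018-05-10).
REPOS = [
    Repo("KAUST Repository", 10, 50),          # recommended limits; larger on request
    Repo("Dryad Digital Repository", 10, 10),  # read as: 10 GB per file or package
    Repo("figshare", 5, None),                 # personal account, public data
    Repo("Harvard Dataverse", 2.5, 10),
    Repo("Zenodo", None, 50),
    Repo("Open Science Framework (OSF)", 5, None),
]


def candidates(largest_file_gb: float, total_gb: float,
               repos: List[Repo] = REPOS) -> List[str]:
    """Return names of repositories whose stated limits fit the data."""
    def fits(limit: Optional[float], size: float) -> bool:
        # A missing limit is treated as "no restriction stated".
        return limit is None or size <= limit

    return [
        r.name for r in repos
        if fits(r.max_file_gb, largest_file_gb)
        and fits(r.max_dataset_gb, total_gb)
    ]


if __name__ == "__main__":
    # Example: a 12 GB dataset whose largest single file is 3 GB.
    print(candidates(largest_file_gb=3, total_gb=12))
```

Under these assumptions, the example keeps the KAUST Repository, figshare, Zenodo and OSF, and drops Dryad and Harvard Dataverse. Size is only a first filter; check the current limits with each repository before depositing.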

What can the KAUST Repository do for your data?

The KAUST Repository intends to capture, store and preserve all research output of KAUST faculty, researchers and students (conference papers, technical reports, peer-reviewed articles, preprints, theses, images, data sets, and other research-related works), and to make it available to the research community. Data uploaded to the repository are registered with DataCite and issued permanent identifiers (DOIs), which makes it easier for other researchers to find and cite your data and helps keep the data linked to the publications whose findings they underpin. If you release your data through your own website, the repository can also work with you to support the preservation of your data beyond the life of your website.

The repository also serves as a registry of research data that you release through specialized repositories in your discipline or other sites.  

Finally, integration with ORCID can keep your data linked to your public research profiles, while the PlumX dashboard service helps measure and give insight into how your data is used.

Which one is the appropriate repository for your data?

The Digital Curation Centre (DCC) in the UK has published a checklist for evaluating data repositories, to help you assess whether a repository suits your circumstances.

Its key considerations can be summarized in the following five questions:

1.  Is a reputable repository available in your research discipline?
2.  Will the repository take the data (types, formats, etc.) you want to deposit?
3.  Will it be safe in legal terms?
4.  Will it make your data Findable, Accessible, Interoperable and Re-usable (FAIR)?
5.  Will it support analysis and track data usage?

The Data Management Services at Johns Hopkins Libraries has compiled a list of questions to ask in determining whether a particular repository meets your data needs. 

The Data Management team at MIT Libraries has developed a data repository comparison table to help you choose the most appropriate repository from a list of several candidates.