The United States spends billions of dollars every year to publicly support research that has resulted in critical innovations and new technologies. Unfortunately, the outcome of this work, published articles, only provides the story of the research and not the actual research itself. This often results in the publication of irreproducible studies or even falsified findings, and it requires significant resources to discern the good research from the bad. There is a way to improve this process, however, and that is to publish both the article and the data supporting the research. Shared data helps researchers identify irreproducible results. Additionally, shared data can be reused in new ways to generate new innovations and technologies. We need researchers to “React Differently” with respect to their data to make the research process more efficient, transparent, and accountable to the public that funds them.
Kristin Briney is a Data Services Librarian at the University of Wisconsin-Milwaukee. She has a PhD in physical chemistry and a master's degree in library and information studies, and currently works to help researchers manage their data better. She is the author of “Data Management for Researchers” and regularly blogs about data best practices at dataabinitio.com.
Cite Your Data
As with any content used in the publication process, it is important to cite your sources, and this applies to research data as well. Research data citation standards are still being developed, so the following elements are important when citing or providing citations for your own research data:
Data-PASS - The Data Preservation Alliance for the Social Sciences (Data-PASS) is a voluntary partnership of organizations created to archive, catalog and preserve data used for social science research.
DataCite - is a leading global non-profit organisation that provides persistent identifiers (DOIs) for research data.
Typically, researchers share data via email, by posting it to personal websites, or via Google Drive or Amazon. However, these methods make it challenging for others to discover research data. Depositing data in repositories helps others discover, manage, cite, and preserve data for the long term. The options below are by no means a comprehensive list of repositories.
If you would like to suggest repositories to include in this list or need assistance with depositing your data to one of these repositories, please email email@example.com. You can also share data with multiple repositories, which will help increase the visibility and preservation of your research, so carefully consider which repositories will help you achieve this.
Data produced at Rowan University (any discipline)
Rowan Digital Works- This institutional repository was created to capture, distribute, and preserve the scholarly and creative works of Rowan University faculty, researchers and students. Authors can archive their digital works in a variety of formats, including datasets. For more information on how to deposit data into Rowan Digital Works, contact firstname.lastname@example.org.
Search data repositories
re3data.org: Registry of Research Data Repositories- re3data.org is a global registry of research data repositories that covers research data repositories from different academic disciplines. It presents repositories for the permanent storage and access of data sets to researchers, funding bodies, publishers and scholarly institutions. re3data.org promotes a culture of sharing, increased access and better visibility of research data.
BioModels.Net - is a repository of computational models of biological processes. Models described from literature are manually curated and enriched with cross-references. All models are provided in the Public Domain.
DigiMorph- Digital Morphology library is a dynamic archive of information on 3D scans, animations, and high-resolution X-ray computed tomography of biological specimens.
Dryad - Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences.
PLEXdb - Gene expression data for plants and plant pathogens.
Protein DataBank- Experimentally determined structures for macromolecules (proteins and nucleic acids). The site includes search and visualization tools.
The Cell: An Image Library- Images of all cell types from all organisms, including intracellular structures and movies or animations demonstrating functions. This project relies upon the cell biology community to populate the library.
UniProt - Access protein sequences and functional information.
XNAT Central - is a database for sharing neuroimaging and related data with select collaborators or the general community.
NIH repositories list - lists NIH-supported data repositories that make data accessible for reuse. Most accept submissions of appropriate data from NIH-funded investigators (and others), but some restrict data submission to only those researchers involved in a specific research network. Also included are resources that aggregate information about biomedical data and information sharing systems.
National Climatic Data Center - NOAA's National Centers for Environmental Information (NCEI) is responsible for preserving, monitoring, assessing, and providing public access to the Nation's treasure of climate and historical weather data and information.
Climate Change Knowledge Portal (Beta) - is a central hub of information, data and reports about climate change around the world. Here you can query, map, compare, chart and summarize key climate and climate-related information.
National Snow and Ice Data Center (NSIDC) - NSIDC offers hundreds of scientific data sets for research, focusing on the cryosphere and its interactions. Data are from satellites and field observations. All data are free of charge.
All research requires the sharing of information and data. The general philosophy is that data are freely and openly shared. However, funding organizations and institutions may require that their investigators cite the impact of their work, including shared data. By creating a usage rights statement and including it in your data documentation, you make clear to users of your data what the conditions of use are and how to acknowledge the data source.
Include a statement describing the "usage rights" management, or reference a service that provides the information. Rights information encompasses Intellectual Property Rights (IPR), copyright, cost, or various Property Rights. For data, rights might include requirements for use, requirements for attribution, or other requirements the owner would like to impose. If there are no requirements for re-use, this should be stated.
Usage rights statements should state which data uses are appropriate, how to contact the data creators, and how to acknowledge the data source. Researchers should be aware of legal and policy considerations that affect the use and reuse of their data. It is important to provide the most comprehensive access possible with the fewest barriers or restrictions.
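As an illustration only, the pieces of a usage rights statement described above can be assembled with a short script and saved alongside your data documentation. This is a minimal sketch, not a required format: the function name, field labels, and example values below are hypothetical and are not taken from any standard or repository.

```python
def usage_rights_statement(appropriate_uses, contact, acknowledgment, requirements=None):
    """Assemble a plain-text usage rights statement (hypothetical layout)."""
    lines = [
        "USAGE RIGHTS",
        f"Appropriate uses: {appropriate_uses}",
        f"Contact the data creators: {contact}",
        f"How to acknowledge this data: {acknowledgment}",
    ]
    # If there are no requirements for reuse, say so explicitly.
    if requirements:
        lines.append(f"Requirements for reuse: {requirements}")
    else:
        lines.append("Requirements for reuse: none")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values are placeholders for illustration.
    print(usage_rights_statement(
        appropriate_uses="Research and teaching",
        contact="email@example.com",
        acknowledgment="Cite the dataset and its creators in any publication",
    ))
```

The resulting text can be pasted into a README file or the description field of the repository record, so that it travels with the data.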
There are three primary areas that need to be addressed when producing sharable data:
Privacy and confidentiality: Adhere to your institution's policy.
Copyright and intellectual property (IP): Data is not copyrightable. Ensure that you have the appropriate permissions when using data that has multiple owners or copyright layers. Keep in mind that information documenting the context of data collection may be under copyright.
Licensing: Data can be licensed. The way you license your data can determine whether and how other scholars are able to use it. For example, the Creative Commons Zero License provides for very broad access.
If your data falls under any of the categories below, there are additional considerations regarding sharing:
Rare, threatened or endangered species
Cultural items returned to their country of origin
Native American and Native Hawaiian human remains and objects
Any research involving human subjects
If you use data from other sources, you should review your rights to use the data and be sure you have the appropriate licenses and permissions.
When sharing data, or using data shared by others, researchers should be aware of any policies that might affect the use of the data. Including a usage rights statement makes clear to data repository users what the conditions of use are, and how to acknowledge the data source.