Tools Data cards
Description of tools, year of launching, primary workflow measure, use in research phase
MANTRA
MANTRA is a free online course for those who manage digital data as part of their research project.
Bulk Rename Utility
Renamer
PSRenamer
Free renaming tools to bulk rename files
Examples of Research Data include:
File formats used to capture, store and deliver research data are an important consideration as they influence future file/program accessibility. It is important to plan for software obsolescence.
Formats more likely to be accessible in the future are:
Examples of preferred file format choices include:
Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format. Note that not all repositories are able to migrate data files to newer file formats for preservation.
For more, see the UK Data Service Recommended Formats or the Recommended Formats Statement of the Library of Congress.
File names should reflect the contents of the file and include enough information to uniquely identify the data file. File names may contain information such as project acronym, study title, location, investigator, year(s) of study, data type, version number, and file type.
When choosing a file name, check for any database management limitations on file name length and on the use of special characters. In general, lower-case names are less software and platform dependent. Avoid using spaces and special characters in file names, directory paths and field names, because automated processing, URLs and other systems often use spaces and special characters to parse text strings. Instead, consider using underscores ( _ ) or dashes ( - ) to separate meaningful parts of file names. Avoid $ % ^ & # | : and similar characters.
If versioning is desired, a date string within the file name is recommended to indicate the version.
Avoid using file names such as mydata.dat or 1998.dat.
Rationale
Clear, descriptive, and unique file names may be important when your data file is combined in a directory or FTP site with your own data files or with the data files of other investigators. File names that reflect the contents of the file and uniquely identify the data file enable precise search and discovery of particular files.
Examples
An example of a good data file name:
Sevilleta_LTER_NM_2001_NPP.csv
Source: DataONE
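The naming advice above can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function names (clean_part, build_file_name, is_safe_file_name) and the YYYYMMDD date format for versions are assumptions made for this example. It keeps the case of each part as given, so convert to lower case yourself if your systems call for it.

```python
import re
from datetime import date

# Anything other than letters, digits, underscores and dashes is treated as
# unsafe (spaces, $ % ^ & # | : and similar characters).
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]+")


def clean_part(text):
    """Replace each run of spaces or special characters with one underscore."""
    return _UNSAFE.sub("_", text.strip()).strip("_")


def build_file_name(parts, extension, version_date=None):
    """Join descriptive parts with underscores; optionally append a date version.

    The YYYYMMDD version format is an assumption chosen for this sketch.
    """
    cleaned = [clean_part(p) for p in parts]
    cleaned = [p for p in cleaned if p]  # drop parts that became empty
    if version_date is not None:
        cleaned.append(version_date.strftime("%Y%m%d"))
    return "_".join(cleaned) + "." + extension.lstrip(".").lower()


def is_safe_file_name(name):
    """True if the name uses only letters, digits, _ and -, plus one extension."""
    return re.fullmatch(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+", name) is not None


if __name__ == "__main__":
    print(build_file_name(["Sevilleta", "LTER", "NM", "2001", "NPP"], "csv"))
    print(is_safe_file_name("my data.dat"))
```

Building names from explicit parts (project, site, year, data type) makes it harder to end up with an uninformative name such as mydata.dat, and the check function can be run over an existing folder before files are shared.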
Metadata (data about data) standards help to describe data in a consistent manner. Metadata can include descriptive information, provenance, quality and access/use of data. Here are a few standards that may be useful in describing your data for access and preservation.
USGS defines a Data Dictionary as a repository of structured data names that define and describe a resource.
See Best Practices for Data Dictionary Definitions and Usage by Northwest Environmental Data Network
Documenting your data includes capturing sufficient metadata (descriptive information) about your data in order to make it discoverable, identifiable and usable in the future. Information you capture should include some, if not all, of the following elements:
Title of the dataset or research project
Creator names of individuals or institutions responsible for creating the data
Unique Identifier used to identify the data and distinguish it from other datasets
Dates: Project start and end dates, release date, any other date of importance during the length of the research study
Subject: Keywords or phrases describing the subject or content of the data
Funding Agency responsible for funding the research
Intellectual Property Rights associated with the data
Language(s) in which the data is generated
Sources for data derived from other sources
Geographical location or coverage where data was collected
Methodology for data collection
Version of the dataset if updated
Using sustainable metadata standards is highly recommended to ensure that data are accessible in the future. Such standards are open (not proprietary), widely used, uncompressed, use standard encoding and contain enough information to analyze the context, content and structure of a record.
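As a rough illustration of the elements listed above, the following Python sketch stores them in a simple record and writes the record as JSON, an open, plain-text format. The class name DatasetRecord, its field names and the helper functions are assumptions made for this example and do not follow any particular metadata standard; all values in the usage section are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    """Hypothetical record holding the documentation elements listed above."""
    title: str
    creators: list
    identifier: str
    dates: dict
    subject: list
    funding_agency: str = ""
    rights: str = ""
    languages: list = field(default_factory=list)
    sources: list = field(default_factory=list)
    location: str = ""
    methodology: str = ""
    version: str = ""


def missing_fields(record):
    """Return the names of elements that are still empty."""
    return [name for name, value in asdict(record).items() if not value]


def to_json(record):
    """Serialize the record as UTF-8 friendly, human-readable JSON."""
    return json.dumps(asdict(record), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    record = DatasetRecord(
        title="Example dataset title",
        creators=["Example Researcher"],
        identifier="example-identifier-0001",
        dates={"start": "2001-01-01", "end": "2001-12-31"},
        subject=["example keyword"],
        version="1.0",
    )
    print(to_json(record))
    print("Still to document:", missing_fields(record))
```

Listing the empty elements is a simple way to see what still needs documenting before the data is deposited. For actual deposit, map these elements to the schema your repository requires.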
Metadata schema sources
CalTech Library's File Naming Convention Worksheet
This worksheet helps researchers build their own file naming convention.
When searching for data, whether locally on one's machine or in external repositories, one may use a variety of search terms. In addition, data are often housed in databases or clearinghouses where a query is required in order to access data. To reproduce the search and obtain similar, if not identical, results, it is necessary to document which terms and queries were used.
In order to reproduce a data set or result set, it is necessary to document which terms were originally used to capture that data. By documenting this information while the search is being conducted, one greatly enhances the chance of being able to reproduce the results at a later date.
Source: DataONE
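One lightweight way to document searches while they are being conducted is to append each query to a plain-text log. The sketch below writes one JSON object per line. The function names, the log layout and the recorded fields (source, query, filters, result count) are assumptions for illustration, not a prescribed format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_search(log_path, source, query, filters=None, result_count=None):
    """Append one search (where, what, when) to a JSON Lines log file."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,            # repository or database that was searched
        "query": query,              # exact terms or query string used
        "filters": filters or {},    # date ranges, facets and similar limits
        "result_count": result_count,
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def read_search_log(log_path):
    """Load every logged search so that it can be repeated later."""
    with Path(log_path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Placeholder repository name and query, for illustration only.
    log_search("search_log.jsonl", "Example Repository",
               "net primary production", {"year": "2001"}, 12)
    for entry in read_search_log("search_log.jsonl"):
        print(entry["timestamp"], entry["source"], entry["query"])
```

Recording the result count alongside the query gives a quick check later on: if the same query returns a different number of records, the underlying collection has probably changed.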
Storage
Storing data reliably is an important function of data management. There are several options for storing your data files.
Security
To make sure your backup system is working properly, test your system periodically. Try to retrieve data files and make sure you can read them.
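A periodic backup test can be partly automated by reading every backed-up file and comparing its checksum with the original. The following is a minimal sketch that assumes the backup is reachable as an ordinary directory with the same layout as the original; the function names and the choice of SHA-256 are assumptions made for this example. A matching checksum shows that the copy is readable and identical byte for byte. It does not show that the file still opens in its software, so opening a few files by hand remains worthwhile.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return its SHA-256 checksum."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original_dir, backup_dir):
    """Compare each original file with its backup copy.

    Returns a list of (relative_path, problem) tuples, where problem is
    "missing" or "mismatch". An empty list means every file was found,
    could be read, and matched the original.
    """
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(copy) != sha256_of(source):
            problems.append((relative.as_posix(), "mismatch"))
    return problems


if __name__ == "__main__":
    # Hypothetical directory names; replace with your own locations.
    for path, problem in verify_backup("project_data", "backup/project_data"):
        print(problem, path)
```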
The UK Data Archive provides additional guidelines on data storage, back-up, and security.
Purdue University Libraries has a very useful guide for addressing issues with sharing research data involving human subjects or other sensitive data sets.