Selection of data to preserve

Although a great deal of research data should be preserved for the long term, either for (re)use or to validate research results, this does not apply to all data. Like generating data, preserving data costs time and money. Before preservation, researchers should carefully select which data to retain and which to dispose of.

Some of the questions to consider include:  

  • Is there an obligation (for example, from funders) to preserve the research data for (re)use purposes? If yes, for how long? If not, are there other valid reasons to preserve them (for example, the uniqueness of the data, their importance for the history of science, their reuse value, originality, quality, size, scale, innovative nature, the cost of producing them, or their usefulness for education)?
  • Is there an obligation to destroy the research data (for example, for ethical reasons)?
  • Is there an obligation to preserve the data for verification purposes? If yes, for how long?
  • Is there an obligation to preserve the data for general (non-academic) purposes (for example, for cultural heritage or historical reasons)?
  • How expensive is it to reproduce/preserve the data?

Before making the final selection, check whether the data meet the specific requirements for preservation. Some of the questions to consider include:

  • Which data formats, software, and programs are used?
  • Is metadata available and sufficient?
  • Are the research data raw, analyzed, or published?
  • What about intellectual property rights, ethical aspects, and third-party agreements?
  • Is there infrastructure available for preservation?

If you decide to delete or destroy data, keep in mind that this must be done in accordance with the relevant laws and regulations, the requirements set by the research funder(s), the current data management plan(s), and/or the applicable codes of conduct for these data.

For a detailed explanation, see the Digital Curation Centre (DCC) checklist for selecting data.

The time limit for preservation

As a researcher, you are responsible for carefully preserving the research data, together with the necessary metadata and data documentation, both throughout the project and after the end of the project or the publication date. Usually, the time limit is 5 or 10 years, depending on the type of project. In addition, collected samples/data that are subject to the Nagoya Protocol and samples/data from clinical studies should be stored for up to 20 years and up to 25 years, respectively. When research data are still of potential value after the preservation deadline has passed, they must be kept permanently and transferred to the University Archive.
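The time limits above can be turned into a simple date calculation. The following is a minimal sketch, not an official tool: the function name, the category labels, and the choice to count from the end of the project or the publication date are assumptions made for illustration. Because the text does not say which project types fall under 5 or 10 years, the base period is passed in by the caller, and the "up to" periods are treated as fixed maxima.

```python
from datetime import date
from typing import Optional

# Periods in years for special categories, taken from the paragraph above.
# The category labels themselves are hypothetical names for this sketch.
SPECIAL_RETENTION_YEARS = {
    "nagoya": 20,    # samples/data subject to the Nagoya Protocol (up to 20 years)
    "clinical": 25,  # samples/data from clinical studies (up to 25 years)
}


def retention_deadline(start: date, base_years: int,
                       category: Optional[str] = None) -> date:
    """Return the date until which the data should be kept.

    start      -- assumed reference date (end of project or publication date)
    base_years -- the usual limit for the project type, 5 or 10
    category   -- optional special category that overrides base_years
    """
    if base_years not in (5, 10):
        raise ValueError("base_years must be 5 or 10")
    years = SPECIAL_RETENTION_YEARS.get(category, base_years)
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is 29 February and the target year is not a leap year
        return start.replace(year=start.year + years, day=28)


if __name__ == "__main__":
    print(retention_deadline(date(2020, 3, 1), 10))
    print(retention_deadline(date(2020, 3, 1), 10, "clinical"))
```

Data that still have potential value at the computed date are not deleted but transferred to the University Archive, so the result marks a review point rather than a deletion date.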

Preservation systems

Repositories

For long-term preservation, the best option is a reputable data repository, which provides continued access and maximizes the reuse value of the data for future research projects. This way your data will be easy to find and, because they are assigned a DOI, easier to cite.

There are three types of repositories: institutional, general, and domain/discipline-specific. UAntwerpen currently does not have its own institutional repository specifically for research data, but it does produce and maintain an Institutional Repository for publications. This database gives an overview of the scientific publications of UAntwerpen researchers, makes (a survey of) the publications by researchers visible online, supports evaluations and policy in general, and feeds external databases (e.g. VABB-SHW). This underlines the importance of correct and complete input of all publications by every researcher. Publications indexed by the Web of Science are included automatically; other publications that qualify for recording have to be reported.

As a researcher, you can become a member of the UAntwerpen community on Zenodo, which is a general (non-discipline-specific) repository. Figshare and Dryad are also good general repositories. Zenodo and Figshare are free of charge, while Dryad charges a fee. Domain/discipline-specific repositories include arXiv (physics, mathematics, non-linear science, computer science, quantitative biology, quantitative finance, and statistics), GenBank (genetic sequences), Cogprints (psychology, neuroscience, and linguistics), PANGAEA (earth and environmental science), and RePEc (Research Papers in Economics; economics and related sciences), among others. If you need to store data that contain sensitive or confidential information, some repositories offer restricted access. You can search for repositories by subject, or filter for restricted access, using re3data.org.

Some journals also encourage researchers to submit research data to the journal's own repository or to a discipline-specific (open access) repository. In addition, some journals and publishers, such as Nature, Springer Nature, and PLOS, have compiled useful lists of recommended discipline-specific repositories.

Some funders may also prefer specific repositories.

After uploading your data to a repository, you can add a reference to these data (alone or in combination with a journal article/book) to the Institutional Repository of UA (IRUA), in which all journal articles, books, and book chapters published by UA researchers are collected.

For more detailed information, please contact the University Archive.

Other systems for preservation

Other systems can also be used for preservation, such as an institutional or central server (N:\ drive), cloud storage (Microsoft Teams or SharePoint), a project website, or a laboratory server.

For more detailed information, please contact the University Archive.