63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

02. - 06.09.2018, Osnabrück

Usage of persistent identifiers to implement the FAIR guiding principles in medical research data management systems

Meeting Abstract

  • Cornelius Knopp - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Christian R. Bauer - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Harald Kusch - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany; Department of Molecular Biology, University Medical Center Göttingen, Göttingen, Germany
  • Ulrich Sax - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Osnabrück, 02.-06.09.2018. Düsseldorf: German Medical Science GMS Publishing House; 2018. DocAbstr. 232

doi: 10.3205/18gmds172, urn:nbn:de:0183-18gmds1721

Published: August 27, 2018

© 2018 Knopp et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: In medical research projects, awareness of effective, long-term data management has grown steadily in recent years [1]. As multi-center research projects expand, the requirements for data management with respect to collaboration, interoperability and secure long-term archiving are rising as well. The resolutions of the German Research Foundation (DFG) [2] as well as the G7 and G20 communiqués from the 2016 summits [3], [4] bring the FAIR principles (Findable, Accessible, Interoperable, Reusable) [5] into focus. One precondition for compliance with FAIR is the unambiguous identification of resources. This can be achieved by using persistent identifiers (PIDs), which give each resource a long-lived identifier similar to a Uniform Resource Identifier (URI) or the International Standard Book Number (ISBN). We aimed to explore the practical use of PIDs in the implementation of the FAIR guidelines within an active research environment.

Methods: Two open-source research data storage platforms, SEEK [6] and openBIS [7], were examined as general research support solutions with the specific goal of implementing PIDs. Subsequently, requirements for applicable PID services were defined on the basis of prior approaches [8]. In addition to administrative and strategic needs, these requirements include operational, technical, legal and regulatory demands. The requirements for the identifiers themselves were also compiled so that they comply with the policies of the European Persistent Identifier Consortium (ePIC) [9]. A resource in the sense of this approach is any research object represented by an item within the research data platform.

Results: Based on the defined and discovered requirements, a prototypical module for the registration of PIDs within the platform was developed. Because of its better modular extensibility, the SEEK platform proved to be the more appropriate choice for this approach and was therefore chosen as the base software. The module was developed in line with SEEK's software design using Ruby on Rails. The prototype assigns ePIC-compliant PIDs to 100 research objects stored in SEEK and, via an Application Programming Interface (API), archives selected metadata of these resources in the identifier dataset. This identifier dataset contains information such as author, publication year, version number or tags. Based on the preparation and development of this module and of the advanced concept, the contribution of PIDs to supporting the FAIR principles, especially in connection with a data storage platform, was validated.
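
The registration step described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Ruby on Rails module itself. The class ResearchObject, the functions mint_pid, build_pid_record and resolver_url, the metadata type names, the prefix "12345" and the example URL are all hypothetical. The sketch only assembles an identifier and its metadata entries locally; it does not call the ePIC API, whose request format is not described in this abstract. The resolver address assumes that the PID is a Handle resolvable through the public Handle proxy.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResearchObject:
    """Hypothetical stand-in for an item stored in the research data platform."""
    author: str
    publication_year: int
    version: int
    landing_url: str
    tags: List[str] = field(default_factory=list)


def mint_pid(prefix: str) -> str:
    """Build a handle-style identifier '<prefix>/<suffix>' with a random UUID suffix."""
    return f"{prefix}/{uuid.uuid4()}"


def build_pid_record(obj: ResearchObject) -> List[Dict[str, str]]:
    """Collect selected metadata as type/value entries for the identifier dataset.

    The type names are illustrative assumptions, not an ePIC-defined schema.
    """
    return [
        {"type": "URL", "value": obj.landing_url},
        {"type": "AUTHOR", "value": obj.author},
        {"type": "PUBLICATION_YEAR", "value": str(obj.publication_year)},
        {"type": "VERSION", "value": str(obj.version)},
        {"type": "TAGS", "value": ",".join(obj.tags)},
    ]


def resolver_url(pid: str, resolver: str = "https://hdl.handle.net") -> str:
    """Return the external URI under which the PID would be globally resolvable."""
    return f"{resolver}/{pid}"


# Example usage with made-up values.
example = ResearchObject(
    author="Doe J",
    publication_year=2018,
    version=1,
    landing_url="https://seek.example.org/data_files/1",
    tags=["proteomics", "pilot"],
)
pid = mint_pid("12345")            # "12345" is a placeholder prefix
record = build_pid_record(example)
print(resolver_url(pid))
```

Keeping the metadata as a flat list of type/value entries mirrors the idea that the identifier dataset carries a small, selected subset of the platform's metadata, while the full record remains in the platform.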

Discussion: In conclusion, persistent identification is crucial for FAIR data management: PIDs facilitate data accessibility through their identifier properties. In addition, PIDs provide not just local but, when desired, global findability by providing an external URI. PID features, together with the inherent metadata, have the potential to increase the reusability of medical research data. A precise definition of the operational concept is required to achieve long-term identification. The next steps therefore encompass the definition of a use-case-specific operational concept and the corresponding configuration of the prototype. Although ePIC offers good basic functionality for immediate use in any project, the lack of a specified concept can lead to usability problems.

Acknowledgements: This work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the research and funding concepts e:Med (01ZX1306C/sysINFLAME) and i:DSem (031L0024A/MyPathSem).

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1.
Bauer C, Umbach N, Baum B, et al. Architecture of a Biomedical Informatics Research Data Management Pipeline. In: Hoerbst A, Hackl WO, de Keizer N, Prokosch HU, Hercigonja-Szekeres M, de Lusignan S, editors. Exploring complexity in health. An interdisciplinary systems approach: proceedings of MIE2016 at HEC2016. Amsterdam, Berlin, Tokyo: IOS Press; 2016.
2.
Deutsche Forschungsgemeinschaft. Informationsverarbeitung an Hochschulen. Organisation, Dienste und Systeme. Stellungnahme der Kommission für IT-Infrastruktur für 2016-2020. 2016.
3.
G7 and G8 Research Group. G7 Science and Technology Ministers' Meeting in Tsukuba, Ibaraki, Japan, May 17, 2016. Available from: http://www.g8.utoronto.ca/science/2016-tsukuba.html
4.
G20 Leaders. G20 Leaders’ Communique: Hangzhou Summit, Hangzhou, China, 4-5 September 2016. Available from: https://www.bundesregierung.de/Content/DE/_Anlagen/G7_G20/2016-09-04-g20-kommunique-en.pdf?__blob=publicationFile&v=6
5.
Wilkinson MD, Dumontier M, Aalbersberg IJJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific data. 2016;3:160018. DOI: 10.1038/sdata.2016.18
6.
Wolstencroft K, Owen S, Krebs O, et al. SEEK. A systems biology data and model management platform. BMC systems biology. 2015;9(33). DOI: 10.1186/s12918-015-0174-y
7.
Bauch A, Adamczyk I, Buczek P, et al. openBIS. A flexible framework for managing and analyzing complex data in biology research. BMC bioinformatics. 2011;12:468. DOI: 10.1186/1471-2105-12-468
8.
Hilse HW, Kothe J. Implementing persistent identifiers. London: Consortium of European Research Libraries; 2006.
9.
Kálmán T, Kurzawe D, Schwardmann U. European Persistent Identifier Consortium - PIDs für die Wissenschaft. In: Altenhöner R, Oellers C, editors. Langzeitarchivierung von Forschungsdaten. Standards und disziplinspezifische Lösungen. Berlin: Scivero Verlag; 2012.