gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Sustainable Deployment of Research Infrastructure Components

Meeting Abstract

  • Theresa Bender - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Marcel Parciak - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Markus Suhr - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Ulrich Sax - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Christian R. Bauer - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 241

doi: 10.3205/20gmds162, urn:nbn:de:0183-20gmds1628

Published: February 26, 2021

© 2021 Bender et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Background: The analysis and visualization of integrated data originating from disparate sources is an important part of modern medical research. In many projects at the University Medical Center Göttingen, we rely on tranSMART [1] as a valuable research data platform [2], [3]. Due to the modular nature of tranSMART, each project-tailored tranSMART instance requires a complex configuration. Many tweaks and helpful extensions improve performance and make tranSMART easier to handle. In addition to a central local tranSMART installation, we needed a reliable and user-friendly way to roll out tranSMART instances that serve as an analytics tool for individual projects in different infrastructures, for internal testing, and for teaching.

Methods: We use Docker version 3 as a robust automation and virtualization environment. The tranSMART-Docker implementations available from the community did not meet our demand for a feature-rich, adaptable, yet simple implementation. As no option was both up to date and compliant with our requirements, we selected https://github.com/dennyverbeeck/transmart-docker as the starting point for our implementation. Six containers are orchestrated with docker-compose: four central containers (tranSMART application, database, Solr, Rserve), a data loading container, and a web proxy. We automated all configuration and moved all user runtime options to entrypoints, which allows quick deployment with Docker images.
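The entrypoint pattern described above can be illustrated with a minimal Python sketch. It is not the authors' entrypoint script: the option names (`TM_DOMAIN`, `TM_DB_HOST`, `TM_SOLR_HOST`, `TM_RSERVE_HOST`), the default values, and the key=value output format are assumptions chosen for illustration. The idea is that a container resolves its runtime options from the environment at start-up, falls back to defaults, and only refuses to start when a required option is missing.

```python
# Hypothetical option names and defaults; the abstract does not list the real ones.
DEFAULTS = {
    "TM_DB_HOST": "tmdb",
    "TM_SOLR_HOST": "tmsolr",
    "TM_RSERVE_HOST": "tmrserve",
}
REQUIRED = ("TM_DOMAIN",)  # assumed name for the mandatory domain/IP setting


def resolve_options(env):
    """Merge environment values over the defaults; reject missing required options."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))
    options = dict(DEFAULTS)
    for name in list(DEFAULTS) + list(REQUIRED):
        if env.get(name):
            options[name] = env[name]
    return options


def render_config(options):
    """Render the options as sorted key=value lines (an illustrative format only)."""
    return "\n".join(f"{key}={options[key]}" for key in sorted(options)) + "\n"


# A real entrypoint would pass os.environ, write the result to the service's
# configuration file, and then start the service.
example = resolve_options({"TM_DOMAIN": "transmart.example.org"})
print(render_config(example), end="")
```

Keeping the defaults in one place means that a user only has to supply the values that differ from them, which is what makes a small configuration file sufficient.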

Results: We automated the installation of tranSMART 16.2 with docker and docker-compose, condensing all necessary configuration into a single .env file that currently holds seven variables, of which only the domain/IP configuration is mandatory. We added database improvements such as indexes. To simplify data upload, we added a transmart-batch instance that polls for new research data in external volumes. This system automatically imports an example data set on start-up. The core tranSMART application is built continuously by our local GitLab instance, which provides updated builds. Furthermore, additional R-based plug-ins are included in our distribution. Two different NGINX configurations are available for the web proxy: one uses an existing SSL certificate for production systems, the other obtains Let's Encrypt certificates automatically for testing and teaching. We published the code publicly (tinyurl.com/tmdocker), together with the corresponding Docker images and manuals.
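The polling behaviour of the data loading container can be sketched as follows. This is a hedged illustration and not the transmart-batch code: the directory layout (one subdirectory per study), the `.loaded` marker file, the polling interval, and the `load_study` callback are all assumptions. In the real container, the role of the callback would be played by a transmart-batch invocation.

```python
import time
from pathlib import Path

MARKER = ".loaded"  # hypothetical marker file recording that a study was imported


def find_new_studies(volume):
    """Return the study directories in the volume that carry no marker file yet."""
    return sorted(
        entry for entry in Path(volume).iterdir()
        if entry.is_dir() and not (entry / MARKER).exists()
    )


def poll_once(volume, load_study):
    """Import every new study once; mark it only if the loader did not raise."""
    loaded = []
    for study in find_new_studies(volume):
        load_study(study)  # assumed to raise an exception on failure
        (study / MARKER).touch()
        loaded.append(study.name)
    return loaded


def poll_forever(volume, load_study, interval_seconds=30.0):
    """Check the volume at a fixed interval (the interval is an assumed value)."""
    while True:
        poll_once(volume, load_study)
        time.sleep(interval_seconds)
```

Writing the marker only after a successful load means that a failed import is retried on the next polling cycle instead of being skipped silently.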

Conclusion: We simplified the tranSMART set-up process considerably by automating as much as possible and providing sane default configurations. This was especially important for teaching environments, where we ran introductory hands-on courses for student cohorts ranging from 10 to over 50 participants, and for giving students an easy starting point for local tranSMART plug-in development. Moreover, we deploy our implementation in local projects and distribute it in the HiGHmed consortium as a simple data analytics solution [4]. Because the development of tranSMART has branched into different implementations since the release of 16.2, we are currently evaluating which current tranSMART versions are suitable for updating our approach. To conclude, this approach makes it possible to start a feature-rich tranSMART instance filled with demo data in seconds.

Funding: This work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the research and funding concepts of the Medical Informatics Initiative (01ZZ1802B/HiGHmed) and MyPathSem (BMBF 031L0024A), by the German Research Foundation (DFG) within the INF project of the Collaborative Research Center (CRC) 1002 and the project NMDR (DFG 315072261), and by the Volkswagen Foundation within the project MTB-Report.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Scheufele E, Aronzon D, Coopersmith R, McDuffie MT, Kapoor M, Uhrich CA, et al. tranSMART: An Open Source Knowledge Management and High Content Data Analytics Platform. AMIA Jt Summits Transl Sci Proc. 2014; 2014:96–101.
2. Bauer CR, Umbach N, Baum B, Buckow K, Franke T, Grütz R, et al. Architecture of a Biomedical Informatics Research Data Management Pipeline. Stud Health Technol Inform. 2016; 228:262–6.
3. Bauer CR, Knecht C, Fretter C, Baum B, Jendrossek S, Rühlemann M, et al. Interdisciplinary approach towards a systems medicine toolbox using the example of inflammatory diseases. Brief Bioinform. 2017; 18(3):479–487.
4. Haarbrandt B, Schreiweis B, Rey S, Sax U, Scheithauer S, Rienhoff O, et al. HiGHmed – An Open Platform Approach to Enhance Care and Research across Institutional Boundaries. Methods Inf Med. 2018; 57(S 01):e66–e81.