gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Big Data or structured datasets – hurdles on the path towards the realization of hopes

Meeting Abstract


  • Iris Pigeot - Leibniz Institute for Prevention Research and Epidemiology – BIPS, Bremen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 242

doi: 10.3205/20gmds013, urn:nbn:de:0183-20gmds0131

Published: February 26, 2021

© 2021 Pigeot.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.



Text

Background: The term “Big Data” is ubiquitous these days: in the context of modern media, cutting-edge technologies, and data protection, but also in the context of potentials and challenges in science and research. However, it is obvious that no clear distinction is made between the term “Big Data” and huge datasets, even though big data describes only a small portion of the possibilities that huge datasets may offer for research.

Methods: Hopes related to big data include the use of various real-time data sources to provide answers to pressing health-related issues. However, such data are typically not quality-tested, and non-hypothesis-driven analysis methods prevail. In contrast, big datasets are found far more often, namely as huge research databases. Unlike the above, these are characterized by quality-tested data generated through highly structured data collection as a basis for hypothesis-driven research. Between these two extremes lie big databases that are used to answer research questions although they were not primarily set up for this purpose, for example routine data of statutory health insurance funds. All three types of databases have their own potential to address complex research questions in health science.

Results: In this talk, we first clarify the term “Big Data” in contrast to structured datasets. We then use practical examples to highlight the potential of huge databases in health research and the related challenges regarding data quality and statistical methods.

Conclusion: “Make big data as small as possible as quick as possible” (Robert Gentleman)

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.