Article
Big Data or structured datasets – hurdles on the path towards the realization of hopes
Authors
Published: February 26, 2021
Text
Background: The term “Big Data” is ubiquitous these days: in the context of modern media, cutting-edge technologies, and data protection, but also in the context of potentials and challenges in science and research. However, there is evidently no clear distinction between the term “Big Data” and huge datasets, although big data describes only a small part of the possibilities that huge datasets may offer for research.
Methods: Hopes related to big data include the use of various real-time data sources to provide answers to pressing health-related issues. However, such data are typically not quality-tested, and non-hypothesis-driven analysis methods prevail. In contrast, big datasets are found far more often, namely as huge research databases. Unlike big data, these are characterized by quality-tested data generated by highly structured data collection as a basis for hypothesis-driven research. Between these two extremes lie big databases that are used to answer research questions although they were not primarily set up for this purpose, for example routine data of statutory health insurances. All three types of databases have their own potential to address complex research questions in health science.
Results: In this talk, we first clarify the term “Big Data” in contrast to structured datasets. We then use practical examples to highlight the potential of huge databases in health research and the related challenges for data quality and statistical methods.
Conclusion: “Make big data as small as possible as quick as possible” (Robert Gentleman)
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.