gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

A device agnostic analysis framework for wearables' data – learnings from a PhIIb trial

Meeting Abstract

  • Maike Ahrens - Chrestos Concept GmbH & Co. KG, Essen, Germany
  • Sebastian Voss - Chrestos Concept GmbH & Co. KG, Essen, Germany
  • Frank Kramer - Bayer AG, Wuppertal, Germany
  • Michael Kunz - Bayer AG, Berlin, Germany
  • Karl Koechert - Bayer AG, Berlin, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 72

doi: 10.3205/20gmds011, urn:nbn:de:0183-20gmds0112

Published: 26 February 2021

© 2021 Ahrens et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.



Background: Wearables data, also known as direct patient sensor data, are a promising new type of data in interventional clinical trials. Such data, heart rate being one example, often come with unprecedented resolution in the time domain and are therefore promising candidates for measuring the complex dynamics of either human beings or diseases. By nature, these data are time series with thousands or even millions of data points per patient. In contrast, data in classical clinical trials are collected at rather long intervals, ranging from weeks to months in a typical phase-III setting. Wearables data are high dimensional, so the classical statistical theory usually applied in clinical studies reaches its limits. To overcome these limitations, dedicated analysis frameworks using time-series and machine learning (ML) techniques are warranted. Here, we present such a framework, which is standardized and easily scalable to data sets of almost any size.

Methods: We present analyses from a phase-IIb clinical drug development trial in which extensive wearables data were collected, e.g. physical activity and electrocardiogram (ECG). In a first step, the patient-wise time series were subjected to time series feature extraction. Subsequently, all features were used as predictors of the 6-minute walking distance (6MWD) test, an accepted clinical efficacy parameter in cardiology, using Random Forest. Finally, selected features were used in linear models to estimate the effect size of their association with 6MWD. All analyses were conducted in R using established packages for time series analysis and Random Forests. In addition, a simulation framework was implemented to gauge the performance of the presented techniques.
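The three steps described above (feature extraction, feature screening, effect size estimation) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract states that the analyses were run in R with Random Forests, whereas this sketch uses plain Python and swaps in a simple ranking by absolute Pearson correlation for the Random Forest screening step. The feature set, the AR(1) heart rate simulator, the outcome model and all function names are hypothetical.

```python
import math
import random
from statistics import fmean, stdev


def lag1_autocorr(x):
    """Sample autocorrelation of a series at lag 1."""
    m = fmean(x)
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
    return num / denom if denom else 0.0


def extract_features(series):
    """Step 1: reduce one patient's series to a few summary features (illustrative set)."""
    return {"mean": fmean(series), "sd": stdev(series), "acf_lag1": lag1_autocorr(series)}


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def ols(x, y):
    """Least-squares slope and intercept of y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def simulate_patient(rng, phi, n=864):
    """Hypothetical AR(1) heart rate: 3 days at 5-minute resolution (864 points)."""
    scale = 5 * math.sqrt(1 - phi ** 2)  # keeps the marginal SD near 5 for every phi
    dev, series = 0.0, []
    for _ in range(n):
        dev = phi * dev + rng.gauss(0, scale)
        series.append(70 + dev)
    return series


rng = random.Random(1)
features, walk = [], []
for _ in range(40):  # 40 simulated patients
    phi = rng.uniform(0.1, 0.9)
    features.append(extract_features(simulate_patient(rng, phi)))
    # Invented outcome model: more persistent heart rate -> shorter distance, plus noise.
    walk.append(500 - 150 * phi + rng.gauss(0, 10))

# Step 2 (stand-in for Random Forest importance): rank features by |Pearson r| with the outcome.
names = list(features[0])
ranking = sorted(names, key=lambda k: -abs(pearson([f[k] for f in features], walk)))

# Step 3: linear model of the outcome on the top-ranked feature to estimate the effect size.
top = ranking[0]
slope, intercept = ols([f[top] for f in features], walk)
print("ranking:", ranking)
print(f"6MWD ~ {intercept:.1f} + {slope:.1f} * {top}")
```

In a real analysis the screening step would be replaced by a proper Random Forest fit and the feature set by a much richer time series feature library; the sketch only shows how the three stages hand data to each other.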

Results: The analysis framework readily identified various types of hidden signals (including pink noise, GARCH and Markov chain based processes) in the simulated heart rate data. Out-of-bag prediction AUCs were ≥ 0.9 in the given settings. Following this technical proof of concept, we analyzed heart rate measured in 5-minute (min) intervals over 3 days in our phase-IIb trial. We identified a set of time series features that are significantly (p<0.05) associated with 6MWD. One such feature was the autocorrelation function of the residuals of a seasonal decomposition model. At lag 1 (i.e. looking only at the 5 min interval just before the current measurement), this feature was associated with 6MWD. The analysis suggested that subjects with high autocorrelation at lag 1 walk shorter distances than subjects with lower autocorrelation.
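To make the lag-1 feature concrete, the sketch below computes the autocorrelation of the residuals after removing a daily profile from a 5-minute heart rate series. It is an assumption-laden illustration: the abstract does not specify the seasonal decomposition model, so a very simple one is used here (the mean per time-of-day slot, with no separate trend component), the period of 288 points per day is inferred from the 5-minute sampling, and the simulated series and all function names are hypothetical.

```python
import math
import random
from statistics import fmean

PERIOD = 288  # assumption: one day of 5-minute intervals


def seasonal_residuals(x, period=PERIOD):
    """Remove the average daily profile (mean per time-of-day slot) from the series."""
    sums, counts = [0.0] * period, [0] * period
    for i, v in enumerate(x):
        sums[i % period] += v
        counts[i % period] += 1
    profile = [s / c for s, c in zip(sums, counts)]  # requires len(x) >= period
    return [v - profile[i % period] for i, v in enumerate(x)]


def lag1_autocorr(x):
    """Sample autocorrelation at lag 1: each value versus the one 5 minutes earlier."""
    m = fmean(x)
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
    return num / denom if denom else 0.0


def simulate_heart_rate(rng, phi, days=3):
    """Hypothetical series: daily sinusoid plus AR(1) deviations with persistence phi."""
    dev, series = 0.0, []
    for t in range(days * PERIOD):
        dev = phi * dev + rng.gauss(0, 3)
        series.append(70 + 10 * math.sin(2 * math.pi * t / PERIOD) + dev)
    return series


rng = random.Random(7)
persistent = lag1_autocorr(seasonal_residuals(simulate_heart_rate(rng, phi=0.8)))
erratic = lag1_autocorr(seasonal_residuals(simulate_heart_rate(rng, phi=0.0)))
print(f"residual ACF at lag 1, persistent series: {persistent:.2f}")
print(f"residual ACF at lag 1, erratic series:    {erratic:.2f}")
```

A high value means that, once the daily rhythm is removed, each 5-minute reading still closely follows the previous one; a value near zero means successive deviations are largely unrelated.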

Conclusion: The analyses of both the simulated and the real clinical data show that the suggested analysis framework is suitable for identifying and quantifying clinically relevant signals in time series data measured by wearables. Since the derived features are mathematical characteristics, the proposed analysis flow may be used for time series data from any device. Because these features are exactly defined, they also allow for biological interpretation, a very useful property in clinical development.

Competing interests:

  • Maike Ahrens: paid consultancy for Bayer AG
  • Sebastian Voss: paid consultancy for Bayer AG
  • Frank Kramer: salaried employee of Bayer AG
  • Michael Kunz: owns stock of Bayer AG, is salaried employee of Bayer AG
  • Karl Köchert: owns stock of Bayer AG, is salaried employee of Bayer AG

The authors declare that a positive ethics committee vote has been obtained.