Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH)

08.09. - 13.09.2024, Dresden

Rhythmic patterns in longitudinal open clinical datasets

Meeting Abstract

  • Niklas Giesa - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany
  • Anne Rike Flint - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany
  • Louis Agha-Mir-Salim - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany
  • Felix Balzer - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany
  • Sebastian Daniel Boie - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 198

doi: 10.3205/24gmds156, urn:nbn:de:0183-24gmds1568

Published: 6 September 2024

© 2024 Giesa et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.


Introduction: In recent years, secondary analysis of electronic health records (EHRs) has gained popularity [1]. Most retrospective studies discarded the temporal information in longitudinal data by aggregating over time with the mean or median [2]. Such aggregations exclude rhythmic activities, such as circadian rhythms, from consideration in biomedical analyses. To investigate the extent of rhythmic activities, we aimed to identify sinusoidal patterns in vital signs within open clinical datasets.

Methods: We preprocessed two public datasets comprising longitudinal EHRs represented as time series. The MIMIC-III dataset holds hourly sampled data for over 53 thousand hospital stays in a large US-based tertiary care center. HIRID stores high-frequency data sampled every three minutes for over 34 thousand hospitalizations in Switzerland [3].

We selected four vital signs common to both datasets, namely blood oxygen saturation (spo2), heart rate (hr), and systolic and diastolic blood pressure (sysBp, diaBp). For each time series, we determined descriptive statistics of its values and its fraction of missing values (MISS).
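As an illustration of the per-series summary described above, the following is a minimal sketch using pandas. The function name `describe_series` and the example heart-rate values are hypothetical and not taken from the study; missing samples are assumed to be encoded as NaN.

```python
import numpy as np
import pandas as pd


def describe_series(values: pd.Series) -> dict:
    """Descriptive statistics and missing fraction (MISS) for one time series."""
    return {
        "mean": values.mean(),         # NaN entries are skipped by default
        "sd": values.std(),            # sample standard deviation (ddof=1)
        "median": values.median(),
        "miss": values.isna().mean(),  # fraction of samples that are missing
    }


# Hypothetical hourly heart-rate series with two missing readings
hr = pd.Series([72.0, np.nan, 75.0, 80.0, np.nan, 78.0, 74.0, 71.0])
stats = describe_series(hr)
print(stats)
```

Computing MISS on the regular sampling grid (hourly or every three minutes) presumes that the series has already been resampled to that grid, which is an assumption of this sketch.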

We applied the package CosinorPy to fit additive cosine curves within a regression model [4]. The regression described cyclic rhythmic patterns with a 24h period. The goodness of fit was determined by the adjusted R2 ranging from 0 (no fit at all) to 1 (perfect fit). R2 scores are reported as mean and standard deviation (SD) across time series. We additionally quantified correlations with the Pearson correlation coefficient (PCC), which ranges from -1 to 1, where 0 indicates no correlation [5].
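To make the regression idea concrete, the sketch below fits an additive cosine model by ordinary least squares with plain NumPy. It does not use CosinorPy and is not the authors' pipeline. The function name `fit_cosinor`, the choice of harmonics with periods 24h/k, and the synthetic series are assumptions made for illustration.

```python
import numpy as np


def fit_cosinor(t_hours, y, n_components=1, period=24.0):
    """Least-squares fit of y(t) = M + sum_k [a_k cos(2*pi*k*t/P) + b_k sin(2*pi*k*t/P)].

    Returns the coefficient vector (M, a_1, b_1, ...) and the adjusted R2.
    Assumption: component k is the k-th harmonic of the base period P.
    """
    t = np.asarray(t_hours, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~np.isnan(y)                      # drop missing samples
    t, y = t[keep], y[keep]

    cols = [np.ones_like(t)]                 # intercept (mesor M)
    for k in range(1, n_components + 1):
        w = 2.0 * np.pi * k * t / period
        cols += [np.cos(w), np.sin(w)]
    X = np.column_stack(cols)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n, p = X.shape[0], X.shape[1] - 1        # samples, predictors without intercept
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2


# Synthetic, noise-free 72h series: mesor 80, amplitude 10, peak at hour 15
t = np.arange(0.0, 72.0, 1.0)
y = 80.0 + 10.0 * np.cos(2.0 * np.pi * (t - 15.0) / 24.0)
beta, adj_r2 = fit_cosinor(t, y, n_components=1)
amplitude = float(np.hypot(beta[1], beta[2]))
print(beta[0], amplitude, adj_r2)
```

Writing each cosine as a cosine plus sine pair keeps the model linear in its coefficients, so amplitude and phase can be recovered afterwards from `a_k` and `b_k`. Note that the adjusted R2 computed this way can fall below 0 for very poor fits.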

Results: Vital signs were available with low mean MISS rates, ranging from 1.98% (hr HIRID) to a maximum of 15.56% (sysBp MIMIC). MISS rates were highly inter-correlated between sysBp and diaBp (HIRID: 1.00, MIMIC: 0.96 PCC), diverged between datasets for hr and diaBp (HIRID: 0.28, MIMIC: 0.90 PCC), and were overall about twice as strongly correlated in MIMIC (summed PCC: 22.22) as in HIRID (summed PCC: 10.11). SysBp values varied the most (HIRID: 23.19, MIMIC: 20.39 SD), spo2 values the least (HIRID: 2.68, MIMIC: 2.59 SD).

We found six additive functions to be appropriate for constructing the cosine regression model, as they led to the overall best fits per feature. SysBp showed the highest R2 scores of 0.55±0.33 (mean±SD) in HIRID and 0.66±0.28 in MIMIC. Hr was fitted with similar mean R2 scores ranging from 0.54 to 0.65. Spo2 led to overall poor cosine fits, especially for the more coarsely sampled data (MIMIC: 0.24±0.67, HIRID: 0.42±0.31 R2).

The relationship between missingness (MISS) and cosine fits (R2) was strongest for diaBp in HIRID (0.51 PCC) and for sysBp in MIMIC (0.21 PCC). The weakest such relationships were found for spo2 in MIMIC (0.03 PCC) and for hr in HIRID (-0.01 PCC).
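A relationship of this kind can be quantified by pairing, for each time series, its MISS value with its R2 score and computing the Pearson correlation across series. The following is a minimal sketch; the helper `pearson` and the five example value pairs are hypothetical and do not reproduce the reported results.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()                        # center both variables
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


# Hypothetical per-series values: missing fraction and adjusted R2 of the fit
miss = [0.02, 0.05, 0.10, 0.15, 0.20]
r2 = [0.40, 0.45, 0.55, 0.50, 0.70]
pcc = pearson(miss, r2)
print(round(pcc, 2))
```

The same value is returned by `np.corrcoef(miss, r2)[0, 1]`; the explicit form is shown only to make the definition visible.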

Discussion: As expected, spo2 time series, which showed low temporal variability (SD), were poorly fitted by cosine functions. High-frequency data (as in HIRID) seem to improve cosine model fits due to narrow sampling intervals, as seen for spo2. Blood pressure values showed positive correlations between cosine fits and time series missingness. Systolic blood pressure and heart rate best followed a 24h cyclic pattern, a finding that contributes to the exploration of circadian rhythms using open clinical datasets.

The authors declare that they have no competing interests.

The authors declare that a positive ethics committee vote has been obtained.


References

1. Sauer CM, Chen LC, Hyland SL, Girbes A, Elbers P, Celi LA. Leveraging electronic health records for data science: common pitfalls and how to avoid them. The Lancet Digital Health. 2022 Dec 1;4(12):e893-8. DOI: 10.1016/S2589-7500(22)00154-6
2. MIT Critical Data, editor. Secondary Analysis of Electronic Health Records. Cham (CH): Springer; 2016. DOI: 10.1007/978-3-319-43742-2
3. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng CK, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000 Jun 13;101(23):E215-20. DOI: 10.1161/01.cir.101.23.e215
4. Moškon M. CosinorPy: a python package for cosinor-based rhythmometry. BMC Bioinformatics. 2020 Oct 29;21(1):485. DOI: 10.1186/s12859-020-03830-w
5. Benesty J, Chen J, Huang Y, Cohen I. Pearson Correlation Coefficient. In: Noise Reduction in Speech Processing. Berlin, Heidelberg: Springer; 2009. (Springer Topics in Signal Processing; 2). DOI: 10.1007/978-3-642-00296-0_5