Article
Calibrating expert knowledge derived models with patient data using low dimensional representations for synthetic patient trajectories
Published: 6 September 2024
Text
Introduction: Longitudinal clinical studies, such as those on the progression of rare diseases like Epidermolysis Bullosa (EB), must consider several interrelated measures, including blisters, wounds, inflammation, anaemia, and physical growth [1]. However, modelling patient trajectories is often hampered by limited observations and complex missing data patterns in individual patient data (IPD), because the data are usually gathered during routine clinical care. Representing expert knowledge from the scientific literature in quantitative models can therefore help augment IPD with synthetic data. So far, however, such models have lacked calibration with real data. Here, we construct ordinary differential equations (ODEs) that represent knowledge of the natural course of EB by simulating the evolution of biomarkers over time. To address the complexity and noise in real-world clinical data, we use deep neural networks for ODE calibration, an approach that has shown promising results in other fields [2].
Methods: Drawing on expert knowledge from a comprehensive literature search, we developed a system of five ODEs to capture the dynamics of key biomarkers, such as C-reactive protein (a marker of inflammation), haemoglobin (indicative of anaemia), and other variables relevant to EB progression. Initial conditions are sampled from Gaussian distributions whose means are also calibrated. Because calibrating the ODE system directly with real data is complex, we introduce an autoencoder, i.e., a neural network-based approach, for dimension reduction. A dataset of 100 patients observed at 5 time points, giving up to 500 observations [1], is used for training. To calibrate the distributions of the initial conditions, we employ a centering approach in the latent space. We also simulate realistic data scenarios by introducing noise, missing data patterns (including completely missing variables), and noise variables. Missing variables are handled by a separately trained imputation layer outside the autoencoder. As a performance metric, we use the mean squared error (MSE) to evaluate accuracy on an imputation task. In addition, we investigate the Euclidean distance between true and calibrated ODE parameters in a simulation study.
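The data-generating steps described in the Methods can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the two-variable toy dynamics (labelled `crp` and `hb`), the parameter values, the noise level, the missingness rate, and the use of simple mean imputation as a stand-in for the separately trained imputation layer. Only the cohort shape (100 patients, 5 time points), the Gaussian initial conditions, and the MSE metric follow the text. The autoencoder and the five-equation system are not shown.

```python
import math
import random

TIMES = [0, 1, 2, 3, 4]  # five time points, as in the abstract


def integrate(theta, y0, times=TIMES, dt=0.01):
    """Explicit Euler for a toy two-variable ODE (hypothetical dynamics).

    d(crp)/dt = -theta[0] * crp
    d(hb)/dt  =  theta[1] * (12.0 - hb) - 0.05 * crp
    """
    crp, hb = y0
    t, out = 0.0, []
    for target in times:
        for _ in range(round((target - t) / dt)):
            d_crp = -theta[0] * crp
            d_hb = theta[1] * (12.0 - hb) - 0.05 * crp
            crp, hb = crp + dt * d_crp, hb + dt * d_hb
        t = float(target)
        out.append([crp, hb])
    return out


def degrade(rng, traj, noise_sd=0.5, p_missing=0.3):
    """Add Gaussian measurement noise, then blank entries at random (None = missing)."""
    noisy = [[v + rng.gauss(0.0, noise_sd) for v in row] for row in traj]
    observed = [[v if rng.random() >= p_missing else None for v in row]
                for row in noisy]
    return noisy, observed


def mean_impute_mse(noisy_all, observed_all, n_vars=2):
    """Impute missing entries with the per-variable observed mean.

    Returns the MSE over the imputed entries and their count.
    Mean imputation is an assumed stand-in, not the authors' imputation layer.
    """
    sums, counts = [0.0] * n_vars, [0] * n_vars
    for patient in observed_all:
        for row in patient:
            for k, v in enumerate(row):
                if v is not None:
                    sums[k] += v
                    counts[k] += 1
    means = [s / c for s, c in zip(sums, counts)]
    sq_err, n_missing = 0.0, 0
    for noisy, observed in zip(noisy_all, observed_all):
        for row_n, row_o in zip(noisy, observed):
            for k, v in enumerate(row_o):
                if v is None:
                    sq_err += (means[k] - row_n[k]) ** 2
                    n_missing += 1
    return sq_err / n_missing, n_missing


rng = random.Random(0)
theta = (0.4, 0.3)  # assumed rate parameters
noisy_all, observed_all = [], []
for _ in range(100):  # 100 patients, as in the abstract
    # Initial conditions drawn from Gaussians (assumed means and SDs).
    y0 = (rng.gauss(20.0, 3.0), rng.gauss(9.0, 1.0))
    noisy, observed = degrade(rng, integrate(theta, y0))
    noisy_all.append(noisy)
    observed_all.append(observed)

mse, n_missing = mean_impute_mse(noisy_all, observed_all)
print(f"imputed entries: {n_missing}, MSE: {mse:.3f}")
```

The sketch blanks individual entries at random only. Completely missing variables and added noise variables, which the Methods also mention, would need extra masking and extra columns.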
Results: Our approach enables calibration of the full ODE system in a low-dimensional latent space. Even with a large amount of noise and complex missing data patterns, the results remain reliable, which is supported by the stability of the gradients. Furthermore, our centering approach maximizes the use of the available information.
Discussion and conclusion: We demonstrate how to tackle the calibration challenge of a complex expert-informed synthetic data model by investigating the similarity of real and generated observations in a low-dimensional latent space learned by a neural network. Combining ODEs with an autoencoder offers benefits such as effective dimension reduction and flexibility in handling various data complexities. This enables the generation of naturalistic synthetic IPD while retaining control over the generative process, e.g., for potentially synthesizing realistic control groups. The approach is applicable to observational clinical data with limited sample size, for which rare diseases like EB are a typical application scenario.
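The idea of calibrating ODE parameters by comparing real and generated observations in a latent space can be sketched as follows. The sketch rests on loudly stated assumptions: the two-variable toy ODE, the fixed linear `encode` function (a time average per variable, standing in for a trained autoencoder's encoder), the grid search (used here in place of gradient-based training), and all numerical values are hypothetical. The initial-condition means are held fixed, so the centering approach is not illustrated. Only the comparison in a low-dimensional space and the Euclidean distance between true and calibrated parameters follow the text.

```python
import math
import random

TIMES = [0, 1, 2, 3, 4]  # five time points, as in the abstract


def integrate(theta, y0, times=TIMES, dt=0.05):
    """Explicit Euler for a toy two-variable ODE (hypothetical dynamics)."""
    crp, hb = y0
    t, out = 0.0, []
    for target in times:
        for _ in range(round((target - t) / dt)):
            d_crp = -theta[0] * crp
            d_hb = theta[1] * (12.0 - hb) - 0.05 * crp
            crp, hb = crp + dt * d_crp, hb + dt * d_hb
        t = float(target)
        out.append((crp, hb))
    return out


def cohort(theta, n, seed, noise_sd=0.0):
    """Simulate n patients with Gaussian initial conditions and optional noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y0 = (rng.gauss(20.0, 3.0), rng.gauss(9.0, 1.0))
        data.append([(c + rng.gauss(0.0, noise_sd), h + rng.gauss(0.0, noise_sd))
                     for c, h in integrate(theta, y0)])
    return data


def encode(traj):
    """Assumed stand-in encoder: time average of each variable (2-D latent code)."""
    n = len(traj)
    return (sum(c for c, _ in traj) / n, sum(h for _, h in traj) / n)


def latent_mean(data):
    """Average latent code over a cohort."""
    codes = [encode(tr) for tr in data]
    return tuple(sum(z[k] for z in codes) / len(codes) for k in range(2))


def latent_loss(theta, z_real, n_sim=30, seed=1):
    """Squared distance between simulated and real mean latent codes."""
    z_sim = latent_mean(cohort(theta, n_sim, seed))
    return sum((a - b) ** 2 for a, b in zip(z_sim, z_real))


# "Real" data: generated with known parameters plus measurement noise.
theta_true = (0.4, 0.3)
z_real = latent_mean(cohort(theta_true, 100, seed=0, noise_sd=0.5))

# Calibrate by exhaustive search over a small parameter grid.
grid = [0.2, 0.3, 0.4, 0.5, 0.6]
candidates = [(a, b) for a in grid for b in grid]
theta_hat = min(candidates, key=lambda th: latent_loss(th, z_real))

# Euclidean distance between true and calibrated parameters.
param_error = math.dist(theta_hat, theta_true)
print("calibrated:", theta_hat, "parameter error:", param_error)
```

A fixed seed for the simulated cohort keeps the loss a deterministic function of the parameters, which makes candidate values directly comparable.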
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
1. Reimer A, Hess M, Schwieger-Briel A, Kiritsi D, Schauer F, Schumann H, Bruckner-Tuderman L, Has C. Natural history of growth and anaemia in children with epidermolysis bullosa: a retrospective cohort study. Br J Dermatol. 2020;182(6):1437-1448. DOI: 10.1111/bjd.18475
2. Grassi T, Nauman F, Ramsey JP, Bovino S, Picogna G, Ercolano B. Reducing the complexity of chemical networks via interpretable autoencoders. Astron Astrophys. 2022;668. DOI: 10.1051/0004-6361/202039956