gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Influence of single observations on the choice of the penalty parameter in ridge regression

Meeting Abstract

  • Kristoffer Hellton - Norwegian Computing Center, Oslo, Norway
  • Camilla Lingjærde - Norwegian Computing Center, Oslo, Norway
  • Riccardo De Bin - University of Oslo, Oslo, Norway

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 43

doi: 10.3205/20gmds112, urn:nbn:de:0183-20gmds1121

Published: February 26, 2021

© 2021 Hellton et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Penalized regression methods, such as ridge regression, rely heavily on the choice of a tuning (or penalty) parameter, which is often chosen by cross-validation. Differences in the value of the penalty parameter may lead to substantial differences in the regression coefficient estimates and predictions. In this paper, we investigate the effect of single observations on the optimal choice of the tuning parameter, and show that the presence of influential points can change it dramatically. We classify such points as “expanders” or “shrinkers”, according to their effect on the model complexity. Our approach provides a visual exploratory tool for identifying influential points and extends naturally to high-dimensional data, where traditional approaches usually fail. We present applications to real data, both low- and high-dimensional, and a simulation study.
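The dependence of the cross-validated penalty on single observations can be illustrated with a small numerical sketch. It is not the authors' procedure, only a leave-one-out deletion check under stated assumptions: ridge regression without an intercept, the penalty chosen on a fixed grid by leave-one-out cross-validation, and simulated data with one artificially contaminated response. The function names (`loocv_error`, `best_lambda`, `lambda_without_each_point`), the grid and the data are all hypothetical choices made for this illustration.

```python
import numpy as np


def loocv_error(X, y, lam):
    """Leave-one-out CV mean squared error for ridge regression (no intercept).

    Uses the exact shortcut e_i / (1 - H_ii), where
    H = X (X'X + lam I)^{-1} X' is the ridge hat matrix.
    """
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)


def best_lambda(X, y, grid):
    """Return the grid value with the smallest leave-one-out CV error."""
    errors = [loocv_error(X, y, lam) for lam in grid]
    return grid[int(np.argmin(errors))]


def lambda_without_each_point(X, y, grid):
    """Re-select the penalty n times, each time with one observation deleted."""
    n = X.shape[0]
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        out[i] = best_lambda(X[keep], y[keep], grid)
    return out


# Simulated example (assumption: not data from the abstract).
rng = np.random.default_rng(0)
n, p = 30, 5
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -0.5]) + rng.normal(size=n)
y[0] += 8.0  # contaminate one response to create a potentially influential point

grid = np.logspace(-3, 3, 25)  # candidate penalty values, all positive
lam_full = best_lambda(X, y, grid)
lam_loo = lambda_without_each_point(X, y, grid)

# Shift of the selected penalty (log10 scale) when observation i is deleted.
shift = np.log10(lam_loo) - np.log10(lam_full)
for i in np.argsort(-np.abs(shift))[:3]:
    print(f"obs {i}: lambda without it = {lam_loo[i]:.4g}, "
          f"full data = {lam_full:.4g}, log10 shift = {shift[i]:+.2f}")
```

Plotting `shift` against the observation index gives a simple visual check of which points move the selected penalty most. The sign of the shift indicates whether deleting a point pushes the fit towards more or less shrinkage; how this maps onto the “expander” and “shrinker” labels depends on the authors' definition, which the abstract does not spell out.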

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.