Investigating selection strategies for identifying biometrical techniques: a case study on group variable selection methods in R
Published: September 15, 2023
Introduction: The current body of biometrical literature highlights the importance of neutral comparisons to objectively evaluate biostatistical methods [1]. Guidelines have been established to assist with the design and reporting of simulation studies [2]. Nonetheless, there is a lack of agreement on the appropriate decision-making process for selecting statistical methods to compare. If one wishes to make not only neutral and objective, but also complete and practically relevant comparisons, the choice of which methods to include is itself of considerable consequence. In some cases, comparisons may be inadequate and incomplete if pertinent techniques are not considered, and selective exclusion of methods could even be exploited to achieve a desired result. To avoid this, systematic methodologies can be used to identify appropriate methods. This study investigates several such approaches.
Methods: This study examined four distinct method selection strategies: a systematic review of the literature, a systematic review of software, a selective review of the literature, and a selective review of software. These strategies were used to identify techniques implemented in R for selecting groups of variables associated with a given outcome. A single reviewer conducted both systematic reviews, using the R package packagefinder for the software review and following the PRISMA guideline for the literature review [3]. In contrast, the selective reviews drew upon pre-existing sources [4].
Results: The four selection methodologies collectively revealed 18 distinct techniques implemented in R for selecting variable groups. Most of the approaches use a penalty term to select features and feature groups. The methods were implemented in twelve different packages, all available on the Comprehensive R Archive Network (CRAN). The grpreg package provides by far the most methods with six approaches. The systematic review of software uncovered 17 approaches, the systematic literature review identified 14, the selective literature review yielded eight, and the selective software review yielded six. Among all the techniques identified, only five were common to all review strategies. The systematic software review identified three techniques that were not found by any other strategy, while the systematic literature-based review revealed one such method. The most recent of the identified methods was presented in 2021 and identified only by the systematic software review. The oldest identified method was first introduced in 1999 and identified by all strategies.
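The overlap figures reported above (techniques common to all strategies, techniques found by only one strategy) come down to set operations on the per-strategy result lists. The following is a minimal sketch of such a tally, not the procedure used in the study. The strategy keys and the method labels "A" to "F" are hypothetical placeholders and do not correspond to the techniques or counts reported here.

```python
# Hypothetical toy data: strategy name -> set of method labels it identified.
# The labels "A".."F" are placeholders, not the methods found in the study.
found = {
    "systematic_software": {"A", "B", "C", "D", "E"},
    "systematic_literature": {"A", "B", "C", "F"},
    "selective_literature": {"A", "B", "C"},
    "selective_software": {"A", "B"},
}


def summarize(found):
    """Return (all methods, methods common to every strategy,
    methods found exclusively by each strategy)."""
    all_methods = set().union(*found.values())
    common = set.intersection(*found.values())
    unique = {}
    for name, methods in found.items():
        # Everything identified by any of the other strategies.
        others = set().union(*(m for n, m in found.items() if n != name))
        unique[name] = methods - others
    return all_methods, common, unique


all_methods, common, unique = summarize(found)
print("distinct methods:", len(all_methods))
print("found by all strategies:", sorted(common))
for name, methods in unique.items():
    print(name, "only:", sorted(methods))
```

With real data, each set would hold the techniques returned by one review strategy, so the same three quantities can be recomputed whenever a review is updated.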
Conclusions: Systematic review methodologies are preferable to selective strategies, as they identified a greater number of appropriate approaches. The selective reviews, by contrast, largely yielded the techniques that were identified by all strategies, implying that they uncover only the most well-established techniques. As a systematic software review can be automated to a large extent, it is generally more efficient and reproducible than a systematic literature review. However, relying solely on a software-based review has its limitations: for example, it may miss approaches that are too simple to have a dedicated implementation. Combining a literature review with a software-based review may result in the most comprehensive identification of relevant approaches.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
1. Boulesteix AL, Binder H, Abrahamowicz M, Sauerbrei W. On the necessity and design of studies comparing statistical methods. Biometrical Journal. 2017;60(1):216-8.
2. Morris TP, White IR, Crowther MJ. Using simulation studies to evaluate statistical methods. Statistics in Medicine. 2019;38(11):2074-102.
3. Buch G, Schulz A, Schmidtmann I, Strauch K, Wild PS. A systematic review and evaluation of statistical methods for group variable selection. Statistics in Medicine. 2023;42(3):331-52.
4. Huang J, Breheny P, Ma S. A selective review of group selection in high-dimensional models. Statistical Science. 2012;27(4).
