gms | German Medical Science

15. Deutscher Kongress für Versorgungsforschung

Deutsches Netzwerk Versorgungsforschung e. V.

5. - 7. Oktober 2016, Berlin

Pay for Performance (P4P) in hospitals: an analysis of the effectiveness under the consideration of context and program design factors

Meeting Abstract

  • Tim Mathes - Institut für Forschung in der Operativen Medizin, Abteilung für Evidenzbasierte Versorgungsforschung, Köln, Deutschland
  • Dawid Pieper - Institut für Forschung in der Operativen Medizin, Abteilung für Evidenzbasierte Versorgungsforschung, Köln, Deutschland
  • Johannes Morche - Gemeinsamer Bundesausschuss (GBA), Berlin, Deutschland
  • Michaela Eikermann - Medizinischer Dienst des Spitzenverbandes Bund der Krankenkassen (MDS), Essen, Deutschland

15. Deutscher Kongress für Versorgungsforschung. Berlin, 05.-07.10.2016. Düsseldorf: German Medical Science GMS Publishing House; 2016. DocFV03

doi: 10.3205/16dkvf055, urn:nbn:de:0183-16dkvf0553

Published: September 28, 2016

© 2016 Mathes et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at



Background: “Pay-for-performance (P4P) programs are designed to offer financial incentives to meet defined quality, efficiency, or other targets”. Some health care systems have introduced programs that link a hospital's reimbursement to its quality of care. The main aim was to counteract unintended effects of the basic payment scheme (e.g. volume increases in case-based reimbursement systems). Implementing P4P for inpatient care is also under discussion in Germany. However, the effectiveness of P4P and the factors that influence it have not yet been sufficiently evaluated.

Objective: The objective was to evaluate the effectiveness of P4P and to identify barriers to and facilitators of its effectiveness based on empirical evidence.

Methods: A systematic literature search was performed in several economic and medical databases. Manual searches were performed in addition. Cluster randomized controlled trials (cRCTs), controlled before-after studies (CBAs), and interrupted time series (ITS) comparing P4P to a payment scheme without a component that incentivises quality were eligible. Only payments made directly to hospitals were considered. The primary outcomes were measures of process quality (e.g. prescription according to guidelines) and outcome quality (e.g. hospital mortality). Data on patient and hospital characteristics, P4P design, context/setting, results and subgroup analyses (effect interactions) were extracted into standardized tables that had been piloted a priori. Risk of bias (RoB) was assessed with the EPOC-Cochrane tool. Where necessary, the data were reanalysed using autoregressive integrated moving average (ARIMA) models.
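The reanalysis of interrupted time series mentioned above can be illustrated with a minimal sketch. It is not the authors' ARIMA procedure: it swaps in a plain segmented regression (level and slope change at the program start) fitted by ordinary least squares, plus a Durbin-Watson statistic as a rough check for the residual autocorrelation that would motivate an ARIMA-type model. The function names, the example series and the start index are hypothetical and chosen only for illustration; none of the numbers come from the included studies.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def segmented_regression(y, start):
    """OLS fit of y_t = b0 + b1*t + b2*post_t + b3*(t - start)*post_t.

    b0, b1: baseline level and trend before the program
    b2:     immediate level change at the program start
    b3:     change in trend after the program start
    Returns (coefficients, residuals).
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, (t - start) * post])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)  # normal equations (X'X) beta = X'y
    residuals = [yt - sum(b * v for b, v in zip(beta, r))
                 for r, yt in zip(rows, y)]
    return beta, residuals


def durbin_watson(residuals):
    """Durbin-Watson statistic; values far from 2 hint at autocorrelation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


if __name__ == "__main__":
    # Invented quarterly quality scores (percent), program starts at index 6.
    scores = [80.1, 80.4, 80.2, 80.9, 80.7, 81.2,
              84.0, 84.1, 84.9, 84.8, 85.6, 85.5]
    beta, residuals = segmented_regression(scores, start=6)
    print("level change:", round(beta[2], 2), "trend change:", round(beta[3], 2))
    print("Durbin-Watson:", round(durbin_watson(residuals), 2))
```

A Durbin-Watson value well below 2 would suggest positively autocorrelated residuals, in which case ordinary least squares standard errors are unreliable and a model with an explicit error structure, such as ARIMA, is the more defensible choice.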

Study selection, data extraction and RoB assessment were performed by two reviewers independently. Discrepancies were discussed until consensus was reached.

A structured data synthesis was performed. Data from subgroup analyses and from the evaluation of differences between P4P programs were used to analyse the influence of the following factors:

  • Type of P4P (additional payments vs penalties, payments for quality targets vs quality improvement)
  • Level of incentive
  • Linkage to process or result indicators
  • Other pre-existing quality interventions (e.g. public reporting)
  • Other accompanying quality interventions
  • Baseline quality of hospitals
  • Hospital characteristics (e.g. ownership, financial situation)
  • Health care system factors (e.g. hospital competition)

Results: Twelve studies (three ITS, nine CBAs) on five different P4P programs (three in the USA [Medicare and Medicaid], two in England [NHS]), each implemented as an add-on to DRG-based payments, were identified. RoB was high. In the CBAs, the main reasons were differences in baseline measures between intervention and control hospitals and the risk of contamination. In the ITS, the main reason was the unclear effect of time trends.

All studies showed a slight effect in favour of P4P. Effects on result indicators (e.g. mortality) were mostly small and became marginal in the long term. Although sample sizes were large, the results were not clearly in favour of the intervention in the majority of studies. Strong short-term effects were clearest for penalties. Effects were larger with higher incentives and in hospitals in a good financial situation. The influence of hospitals' baseline quality was heterogeneous and depended on the program. An effect was observed in programs that pay for reaching quality targets but not in programs that pay for quality improvement.

From the differences between programs it can be deduced that the (additional) effect of P4P is lower if other quality interventions have already been implemented. There was a tendency towards lower effects of P4P in more competitive markets where public quality reporting is in effect. Mandatory programs appear more promising.

For other P4P design or context factors, no clear difference in effect could be found.

Discussion: Considering the empirical evidence, the effect of P4P seems modest. The level of incentive and the financial situation of hospitals seem to influence the effect. The interactions between P4P design (e.g. low incentives) and context factors (e.g. competition) appear particularly important. The results are limited by the low level of evidence.

Implications for practice: Future P4P programs should offer incentives of sufficient size and a sufficient chance of obtaining them. P4P programs should be mandatory in order to reach hospitals that are less interested in quality improvement.

During implementation, attention should be paid to the financial situation of the hospitals, because hospitals with little financial scope to invest in quality activities usually have less potential to increase quality.

Furthermore, a detailed evaluation of all possible influencing context factors is necessary to adjust the program to the pre-existing conditions. The points raised above should also be considered before P4P is implemented in the German hospital system.