Download Report

Developing new approaches to measuring NHS
outputs and productivity
FINAL REPORT
8 September 2005
(revised 2 December)
Diane Dawson∗, Hugh Gravelle**,
Mary O’Mahony***, Andrew Street*, Martin Weale***
Adriana Castelli*, Rowena Jacobs*, Paul Kind*, Pete Loveridge***,
Stephen Martin****, Philip Stevens***, Lucy Stokes***
∗
Centre for Health Economics, University of York. **National Primary Care Research and
Development Centre, Centre for Health Economics, University of York. ***National Institute for
Economic and Social Research. ****Department of Economics and Related Studies, University of York.
Table of contents
1
Introduction ......................................................................... 7
1.1
The research remit...........................................................................................7
1.2
Research delivered ...........................................................................................8
1.3
Quality.............................................................................................................10
1.4
Value for money and technical change ........................................................11
1.5
Structure of the report...................................................................................11
2
Productivity and output measurement............................ 14
2.1
Total factor productivity growth in private markets .................................14
2.2
Total factor productivity growth and quality change ................................16
2.3
Application of TFPG methods in the NHS ..................................................17
2.4
Value weighted NHS output index ...............................................................21
2.5
Changes in marginal social values over time...............................................22
2.6
Outcomes and attribution .............................................................................24
2.7
Cost and value weights ..................................................................................26
3
Current practice................................................................. 27
3.1
The cost weighted activity index (CWAI)....................................................27
3.2
The ‘experimental’ NHS cost efficiency and service effectiveness indices28
3.2.1
The experimental cost efficiency index ...................................................29
3.2.2
The service effectiveness growth measure ..............................................31
3.3
Pharmaceuticals and prescribing .................................................................32
2
4
Quality adjustment with available data ........................... 33
4.1
Activities or outputs as the unit of analysis .................................................34
4.1.1
Activities: institutional approach .............................................................34
4.1.2
Outputs: patient-centred or disease-based approach................................35
4.2
Unit of hospital output...................................................................................36
4.3
Alternative sources for hospital activity ......................................................38
4.4
General practitioner consultations...............................................................41
4.5
Measures of marginal social value of outputs .............................................41
4.5.1
Unit costs .................................................................................................41
4.5.2
Private sector prices .................................................................................43
4.5.3
International prices...................................................................................44
4.5.4
Value of health.........................................................................................45
4.5.5
Value of waiting time...............................................................................45
4.5.6
Expert groups ...........................................................................................46
4.6
Quality adjustment for health effects of treatment ....................................47
4.7
Quality adjustment using long term survival..............................................52
4.8
Quality adjustment with short term survival..............................................54
4.8.1
Simple survival adjustment......................................................................54
4.8.2
Incorporating estimates of health effects .................................................60
4.8.3
Life expectancy and health effects...........................................................64
4.8.4
Cost of death adjustment..........................................................................65
4.8.5
In-hospital versus 30 day mortality..........................................................68
4.8.6
Conclusions: survival based quality adjustment ......................................71
4.9
Readmissions ..................................................................................................74
4.9.1
Readmissions as health effects.................................................................74
4.9.2
Readmissions (and clinical errors and MRSA) as a deadweight loss......80
4.10
Waiting times..................................................................................................81
4.10.1
Waiting time as a characteristic ...............................................................82
3
4.10.2
Waiting time as a scaling factor...............................................................83
4.10.2.1
Discounting to start of wait..........................................................85
4.10.2.2
Discounting to date of treatment with charge for waiting ...........89
4.10.3
Optimal waiting times..............................................................................92
4.10.4
Distribution of waiting times ...................................................................92
4.10.5
Outpatient waits .......................................................................................96
4.10.6
Waiting time adjustment: conclusion.......................................................97
4.11
Patient satisfaction .......................................................................................100
4.12
Discount rate on health................................................................................103
4.13
Quality adjustment for general practice....................................................104
4.14
Atkinson principles and quality adjustment .............................................107
5
Experimental indices of NHS output ............................. 110
5.1
General trends, index form and data sources ...........................................110
5.2
Index form ....................................................................................................113
5.3
Spells versus episodes ..................................................................................114
5.4
Survival adjustments: hospital output .......................................................115
5.4.1
Simple survival adjustment....................................................................115
5.4.2
Survival and estimated health effects adjustment ..................................121
5.4.3
Survival adjustments with health effects and life expectancy ...............122
5.5
Waiting time and survival adjustments: hospital output.........................123
5.5.1
Effect of waiting time adjustments ........................................................125
5.5.2
Outpatient waits .....................................................................................130
5.6
Additional quality adjustments ..................................................................131
5.6.1
Adjusting for the costs of poor treatment: readmissions and MRSA ....131
5.6.2
Patient satisfaction .................................................................................134
5.7
Conclusions...................................................................................................137
4
6
Specimen output index................................................... 138
6.1
Introduction..................................................................................................138
6.2
Data ...............................................................................................................140
6.3
Cost weighted output indices ......................................................................150
6.4
Health outcome weighted output indices ...................................................155
6.5
Value weighted output index.......................................................................159
6.6
Conclusion ....................................................................................................160
7
Effects of quality adjustments on hospital and NHS
output indices: summary......................................................... 161
8
Labour input .................................................................... 165
8.1
Introduction..................................................................................................165
8.2
Labour input in the NHS.............................................................................166
8.2.1
Volume of labour input..........................................................................166
8.2.2
Quality of labour input...........................................................................166
8.2.3
Data sources and volume trends ............................................................167
8.2.4
Quality adjustments based on qualifications..........................................170
8.2.5
Quality adjustments: refinements ..........................................................177
8.3
Conclusion ....................................................................................................180
9
Experimental productivity estimates............................. 180
9.1
Labour input and labour productivity growth .........................................181
9.2
Intermediate and capital inputs..................................................................183
9.3
Total factor productivity growth................................................................186
5
10
Improving the data .......................................................... 188
10.1
Outcomes data..............................................................................................188
10.1.1
Health outcomes.....................................................................................188
10.1.2
Feasibility...............................................................................................191
10.1.3
Cost ........................................................................................................192
10.2
Other outcome measures: patient satisfaction ..........................................192
10.3
General practice data ..................................................................................193
10.3.1
GP activity .............................................................................................193
10.3.2
GP cost weights......................................................................................194
10.3.3
General practice staff .............................................................................194
10.3.4
Prescribing .............................................................................................194
10.4
Other primary care data .............................................................................195
10.5
Inputs ............................................................................................................196
11
Conclusions and recommendations ............................. 198
11.1
Methods.........................................................................................................198
11.1.1
The preferred approach ..........................................................................198
11.1.2
Methods using existing data...................................................................199
11.2
Results ...........................................................................................................200
11.2.1
Results for the hospital sector ................................................................200
11.2.2
Other quality indicators..........................................................................201
11.3
Total factor productivity growth................................................................202
11.4
Recommendations ........................................................................................202
11.5
Acknowledgements ......................................................................................205
References ................................................................................ 206
Annex: How should NHS output be measured? ....................................................212
Table of Notation ...................................................................................................215
6
1 Introduction
1.1
The research remit
In March 2004 the Department of Health commissioned a research team from the
Centre for Health Economics at the University of York and the National Institute for
Economic and Social Research to develop new approaches to measuring NHS outputs
and productivity. The research objectives were development of:
•
A comprehensive measure of NHS outputs and productivity
•
Methods to facilitate regular in-year analysis of NHS productivity
•
Output measures capable of measuring efficiency and productivity at subnational levels.
The research team was also asked to co-operate with The Atkinson Review on
measurement of government output and productivity for the national accounts.
Three interim reports on this research were produced (July 2004, November 2004 and
June 2005) as well as memoranda on data requirements (September 2004) and
methodology (January 2005, August 2005). The work was presented for scrutiny at
two workshops (7 July 2004 and 17 June 2005). The research team presented work in
progress to four meetings of the NHS Outputs Steering Group (7 July 2004, 2
February 2005, 10 May 2005, 20 July 2005). This is the Final Report on the research
project.
The background to the research remit referred to the Public Service Agreement (PSA)
following the 2002 Spending Review that “set a ‘value for money’ (productivity)
target of 2%”. The target required information on quality improvement that had not
previously been measured for the NHS as a whole. While PSA targets have changed
over time, it is likely that some measure of quality improvement will continue to be
required in reporting performance. Quality adjusted measures of NHS output were
also required for other Department of Health purposes such as monitoring the
performance of Trusts and identifying the scope for efficiency gains.
It is important to appreciate that there are significant differences between the concepts
7
of efficiency, value for money, productivity and productivity growth that have
implications for both methods of measurement and policy relevance of the resulting
indices.
•
Efficiency is measured as the ratio of output produced with given inputs
relative to the maximum feasible output.
•
‘Value for money’ reflects the value individuals/society place on output
relative to the costs of production. This often corresponds to a cost-benefit
analysis.
•
Productivity is the ratio of a measure of total output to a measure of total
inputs.
•
Productivity growth is the change in output relative to the change in inputs. It
is often interpreted as reflecting the effect of technical change on production.
Robust measurement requires precise definition of the concept to be measured.
Effective employment of these measures in pursuit of policy objectives requires
selection of the appropriate measure for the issue at hand.
1.2
Research delivered
The research team has responded to the research remit by delivering the following
outputs.
1. A methodology for producing a comprehensive quality adjusted index of NHS
output.
This is referred to as the “value weighted output index”.
Data
necessary to estimate this index are not currently available for all NHS
activities but are feasible to collect. The DH has already planned or is
considering collection of the relevant data.
2. Methodologies for calculating quality adjusted NHS output indices with
existing data.
These are cost weighted indices that incorporate varying
combinations of changes in survival, health effects, waiting times, patient
satisfaction, readmissions and MRSA. We present estimates of experimental
indices which examine their sensitivity to different ways of measuring waiting
8
times, survival, and to different assumptions about the health effect, discount
rates and other parameters.
3. For the small set of hospital based treatments where there are some data on
health outcomes before and after treatment, we have produced a “specimen”
index that illustrates how the recommended value weighted index can be
populated with data on health outcomes when they become more generally
available.
4. We have suggested additional data that are feasible to collect that would not
only improve future measurement of NHS output but would also be of value in
managing the NHS.
5. We have constructed a new index of labour input in the NHS. It combines data
from a range of sources to calculate a volume measure of total hours worked
and includes an adjustment to take account of increases in the skills of the
workforce.
6. Using the cost weighted quality adjusted index of outputs and inputs, we have
produced provisional estimates of Labour Productivity Growth and Total
Factor Productivity Growth for the period 1998/99-2003/04.
7. The methodology and data used in these indices can be applied to sub-national
groups of institutions (e.g. NHS Trusts).
8. For many purposes, quality adjusted measures of output and productivity
growth for particular diseases and across institutional settings will be of more
value to the NHS than a comprehensive index. We indicate how, with planned
changes to NHS data collection, it will be feasible to produce disease specific
output and productivity indices with the methodology presented in this report.
Although key data used in our output indices, predominantly from the Hospital
Episode Statistics (HES), are available on a quarterly basis, we would not recommend
publication of within year estimates of output growth. The quarterly HES data are
9
subject to significant revision and use of quarterly index numbers could be
misleading.
1.3
Quality
Central to all the work reported is a method of defining and measuring “quality”.
We define the quality of treatment as the level of the characteristics valued by patients
and changes in quality are measured as the rate of change of these characteristics.
Given that improving the health of patients is a primary objective of the NHS,
improved health outcomes are likely to be the most important characteristic of
treatment. In addition, the literature suggests the main impact of technical change in
health care has been to improve expected health outcomes—e.g. the expected health
outcomes from heart surgery or management of diabetes are better today than ten
years ago. There is little data on health outcomes in the NHS and hence it has not
been possible to measure quality improvement, productivity growth and technical
change.
For the present the main available health outcomes data are for mortality or survival
rates. This is a severe limitation on any attempt to measure the quality of output or
productivity since only 3% of NHS patients die soon after treatment. There is no
routine data with which to measure the improvement in health following treatment for
the 97% of patients who survive. It appears that this situation may change and the
NHS may start collecting data on health outcomes. In Section 4 of this report, we
present the structure of output indices that should be used if and when data on health
outcomes in the NHS become available. The equations could be used for a subset of
patients if initially outcomes data are collected for only a limited set of procedures.
It follows from our definition of quality that the unit for measuring NHS output
should be the patient treated. This makes it necessary to link the activities directed at
treatment of a patient. For example, a patient undergoing treatment for heart disease
would receive prescriptions for various drugs, attend outpatient clinics, undergo
10
diagnostic tests, perhaps surgery and follow-up care from a GP. At present it is not
possible to identify the set of activities delivered to an NHS patient with a particular
condition. The Department of Health plans to introduce a patient identifier that in
future will permit analysis of the care delivered to a patient across activities,
institutions and over time. For the present it is necessary to continue to use counts of
activities as proxies for output. However, the indices recommended could readily be
adapted to a patient-based definition of output when linked data become available.
1.4
Value for money and technical change
Recent work in the US illustrates how, with data on outcomes and an ability to link
activities/inputs to patients with particular conditions, it is possible to obtain
approximate disease specific measures of value for money and technical change.
Cutler et al. (2001), for example, examine improved survival rates for patients
admitted with acute myocardial infarction (AMI). By placing a monetary value on
quality adjusted additional years of life expectancy and dividing by the cost of inputs
used for treating this group of patients, estimates can be produced of the growth in
value for money. Similar work has been done for depression, schizophrenia and
cataract surgery.
The DH requested the research team to produce formulae and estimates for a
comprehensive index of NHS output and productivity growth. When data become
available that identify the set of inputs used to treat particular conditions and the
monetary value of output, the approach we outline in Section 6 can be applied to
studies of individual conditions as in the US work.
1.5
Structure of the report
For any new method of measuring NHS output and productivity to be generally
accepted, it is important that the methodology be well grounded in economic theory.
In Section 2 we set out the theory behind measurement of Total Factor Productivity,
the issues relevant to attribution of NHS activity to improvement in health outcomes
and the choice of weights necessary to sum the many NHS outputs into a single index
11
number.
Section 3 outlines recent and current DH practice for estimating output and
productivity.
This provides a baseline for comparison with the quality adjusted
indices provided by the research team.
In Section 4 we set out our preferred approach to measuring quality adjusted output.
This is a value weighted output index that attaches monetary values to the
characteristics that measure quality. We discuss the appropriate units of output and
available data. While it is feasible to collect the data necessary for a value weighted
output index, the data are not currently available. In the remainder of the section we
explore the possibilities for estimating a quality adjusted cost weighted output index
with existing data. Quality adjusting a cost weighted index is not straightforward and
we set out the assumptions required. In the absence of data on health outcomes for
most NHS activity, we focus on the possibility of quality adjusting for changes in
long and short term survival and for changes in waiting times. In order to illustrate
the impact of including some information on health outcomes, we examine the
structure of an index that includes an indicative health gain for survivors. Ordinarily
improvements in survival are considered an improvement in the quality of NHS care.
However, for a number of conditions, the NHS provides terminal care. We examine
adjustments to the quality indicator required to deal with this issue. The appropriate
method for quality adjusting a cost weighted index for changes in waiting times is not
obvious. We explore several alternative methods for doing this. We conclude section
4 by comparing our approach to the recommendations of the Atkinson Review.
In Section 5 we present results for an experimental quality adjusted cost weighted
output index. We show the sensitivity of the index to different ways of treating
mortality rates, waiting times and choice of discount rate. We also examine the
feasibility of augmenting the index with available data on patient satisfaction,
readmission rates and incidence of MRSA.
The results reported in Section 5 reflect what is feasible at present when estimating a
comprehensive index. However, there are a few conditions for which outcomes data
are available. In Section 6 we use these data to estimate a “specimen” index. We
12
present results for a value weighted output index and for variants of a cost weighted
index that incorporate observed health gains in the survival adjustment and allow for
changes in waiting times. We also use the specimen index to illustrate the effect of
substituting health outcome weights for cost weights.
In Section 7 we draw on the results from Sections 5 and 6 and present two variants of
the quality adjustments, one of which is our preferred variant. We show the effects
for hospital sector and for overall NHS output indices.
Section 8 is devoted to measuring labour input in the NHS. It outlines methods for
calculating labour volumes and quality adjusted labour input where the latter takes
account of different productivities of workers, dividing the workforce by skill group.
It combines data from the NHS employment census with the rich data on worker
characteristics available in the Labour Force Survey.
Section 9 brings together the output measures reported in Section 5 with the labour
input measures in Section 8 to derive labour productivity estimates. Using estimates
for growth in intermediate inputs and capital from a range of sources, productivity
estimates are shown that account additionally for these two inputs.
In Section 10 we summarise the lessons learned in the course of this research for the
availability of relevant data. We stress the importance of making better use of existing
data (e.g. by record linkage and diagnosis added to prescription forms), the scope for
improving output measurement with data beginning to be collected (GP consultations)
and the need for the NHS to routinely collect health outcomes data.
We conclude in Section 11 with recommendations on how the Department of Health
can advance work on output and productivity measurement.
13
2 Productivity and output measurement
2.1
Total factor productivity growth in private markets
If private markets are complete and competitive, prices reflect marginal utilities of the
services to consumers and the marginal costs of providers. With some additional
assumptions, the measurement and interpretation of productivity growth is then
straightforward. Denote the vector of outputs from a firm at a time t as x(t). We index
the goods by j. Let z(t) be the vector of n inputs (types of capital, labour and
materials). ν(t) is a parameter which captures the state of technology at time t. The
technology of the firm is described by the implicit production function
g (x (t ) , z (t ) , v (t )) = 0
(1)
Assume that the technology exhibits constant returns to scale.
Differentiating (1) with respect to time gives
∂g
∑ ∂x
j
x& j + ∑
n
j
∂g
∂g
z&n + i v& = 0
∂zn
∂v
(2)
A profit maximising firm in a competitive market will choose x(t), z(t) to satisfy
p j = −θ∂g / ∂x j , wn = θ∂g ∂zn , where θ = − p1 /(∂g / ∂x j ) is the Lagrange multiplier
on the production constraint. We can rearrange (2) as
∑ω
j
where
y
j
x& j
xj
− ∑ ωnz
n
⎛ ∂x v ⎞ v&
z&n θ ∂g
v& = ω1y ⎜ 1 ⎟
=
zn px ∂v
⎝ ∂v x1 ⎠ v
ω jy = p j x j
∑p x
j
j
j
, ωinz = wn zin
(3)
∑p x
j ij
j
and y is the value of output from the firm y = ∑ j p j x j .
The left-hand side of (3) is the rate of change of a Divisia quantity index of outputs,
minus the rate of change of a Divisia quantity index of inputs. Since total factor
productivity (TFP) is the ratio of an index of outputs to an index of inputs, the left
hand side is also a measure of total factor productivity growth (TFPG). If production
14
takes place with constant returns to scale, then the total value of the product is
expended on the costs of the inputs and we can replace the second term on the left
hand side with the rate of change of an input index based on the cost shares
wn zn
∑w z
n n
n
The middle and last terms in (3) are equivalent expressions for the rate of technical
progress. In the last term the rate of technical progress is given as the increase in one
output (x1), holding all other outputs and inputs constant, made possible by the change
in technology. Thus TFPG also measures the rate of technological progress.
Technical progress increases welfare by relaxing the production constraint on the
economy. Under certain assumptions total factor productivity growth can be given a
direct welfare interpretation. Thus suppose that the economy is characterised by the
implicit production function g(x,z,v) = 0 and resources are allocated to maximise
current period welfare U(x,z) where x and z are vectors of outputs and inputs. The
Lagrangean for the welfare problem is
L = U ( x , z ) + λ g ( x , z, v )
(4)
and from the envelope theorem
dU / dv = dL / dv = ∂L / ∂v = λ g v
(5)
Hence, if U is derivable from an individualistic, non-paternal welfare function, the
fact that the allocation in an economy with a complete set of competitive markets
maximises some such welfare function, means that TFPG is an increasing monotonic
function of the change in welfare resulting from technological change.
The simple story above takes no account of changes in the stock of capital goods used
to produce consumption goods. Since what is consumed no longer equals what is
produced it is more complicated to give a welfare interpretation to changes in the
output index, though it is possible to do so in some cases (Sefton and Weale,
forthcoming).
15
2.2
Total factor productivity growth and quality change
A measure of TFPG in a market sector when there is quality change can be
constructed in the following manner. Let the production function for a firm or sector
which produces only one type (j) of output be
g j ( x j , q1 j ,..., qKj , z j , v j ) = 0
(6)
Here xj is the volume or quantity of output j (the number of units produced) and qkj is
the amount of outcome or characteristic k produced by consumption of one unit of
output j. The vector qj determines the quality of the product. At the equilibrium of a
market economy the price paid for a unit of output j depends on the outcomes it
produces: pj(qj), and is also a measure of quality.
If the market for good j is
competitive a profit maximising firm’s choice of output, inputs, and outcomes will
satisfy p j = −θ∂g j / ∂x j , x j ∂p j / ∂qkj = −θ∂g j / ∂qkj , and wn = θ∂g j ∂z jn . Totally
differentiating the production function with respect to time gives
∂g j
∂x j
x& j + ∑
k
∂g j
∂qkj
q&kj + ∑
n
∂g j
∂z jn
z& jn +
∂g j
∂v j
v& j = 0
(7)
and after using the profit maximising conditions, assuming constant returns to scale to
substitute total cost for the value of output in the weights on the inputs, and
rearranging we get
⎛ ∂p q ⎞ q&
⎛ ∂y v ⎞ v&
z&
θ ∂g j
v& j = ⎜ j j ⎟ j
+ ∑ ⎜ j kj ⎟ kj − ∑ ω nz jn =
⎜ ∂v j x j ⎟ v j
x j m ⎜⎝ ∂qkj p j ⎟⎠ qkj n
z jn p j x j ∂v j
⎝
⎠
x& j
where ω nz = wn z jn
∑w z
n
jn
(8)
.
n
Thus if we do not take account of the change in quality (the middle term in the left
hand side of (8)) and merely calculate the difference between the rate of growth of the
output and input indices we will not be measuring the rate of technical progress (the
second and last terms). Equivalently, if we define TFPG as the difference between
the rates of growth of the value of output and the cost of inputs, we will typically
underestimate TFPG if we do not allow for the changing value of outputs because of
improvements in quality. Consequently we need to take account of the change in the
mix of outcomes (characteristics) embodied in each unit of output.
16
Denoting the marginal effect of outcome or characteristic k on the price of output j as
π kj ≡ ∂p j / ∂qkj we can write the rate of growth of the total value of output summed
across all sectors ( y = ∑ j p j x j = ∑ j y j ) as
⎡
p x ⎡ π q q&
x& ⎤
q&
x& ⎤
y&
= ∑ j j ⎢ ∑ jk kj kj + j ⎥ = ∑ ω jy ⎢ ∑ ω kj jk + j ⎥
y
y ⎢⎣ k p j qkj x j ⎥⎦
q jk x j ⎥⎦
j
j
⎢⎣ k
where ω kj = π kj qkj
∑π
(9)
q
kj kj
k
In competitive equilibrium these prices represent social values as well as costs of
production. Thus, in principle the prices obtained in the competitive equilibrium
enable us to calculate the rate of growth of the value of output and so derive the rate
of technical progress via TFPG. We need to estimate the hedonic price functions
pj(qj) which relate prices to the quality of goods (Rosen, 2002). In practice there are
considerable difficulties even in market sectors in allowing for quality changes.
The discussion shows that a measure of TFPG which relates only to the volume of
outputs and ignores their outcome or quality characteristics is incomplete. Note also
that it is also important to capture any quality change in inputs as well as outputs.
Thus if the NHS is employing more skilled labour, a measure that merely counts
number of workers without taking account of differences in marginal productivities
across skill types, will overestimate TFPG. The contribution from using better quality
labour is incorrectly attributed to technical progress. Section 8 deals with quality
adjusting labour input.
2.3
Application of TFPG methods in the NHS
The construction of a NHS productivity measure should capture the valuable things
that the NHS produces. However, operationalising this simple idea is not
straightforward because of the difficulties of defining NHS outputs, attaching values
to the outputs, and obtaining the relevant data.
We distinguish activities (operative procedures, diagnostic tests, outpatient visits,
consultations…), outputs (courses of treatment which may require a bundle of
17
activities), and outcomes (the characteristics of output which affect utility). The focus
in health economics has been on the change in health produced by a course of
treatment, typically measured in quality adjusted life years (QALYs). But other
characteristics of treatment also affect utility: the length of time waited for treatment,
the degree of uncertainty attached to the waiting time, distance and travel time to
services, the interpersonal skills of GPs, the range of choice and quality of hospital
food, the politeness of the practice receptionist, the degree to which patients feel
involved in decisions about their treatment, etc. The aim is to measure the change in
the volume of NHS outputs taking account of quality changes (changes in the volume
of characteristics produced) but not of changes in the marginal social value of those
characteristics. The distinction between outputs and outcomes is identical to that
between goods and characteristics in consumption technology models (Deaton and
Muellbauer, 1980, Ch. 10; Lancaster, 1971) where consumers value goods because of
the bundle of characteristics that yield utility. The quality of the output is a function
of the vector of outcomes it produces.
In the measurement of private sector productivity growth the focus is on outputs
rather than the characteristics they produce because of the assumption that the market
price of the output measures the consumers’ marginal valuation of the bundle of
characteristics from consuming the output. In measuring private sector productivity
we also do not need to concern ourselves with counting activities because they are
embodied in the outputs which are produced and sold.
The direct application of the methods used to measure TFPG in the private sector is
problematic in the NHS for two main reasons. First, there is no final market for NHS
outputs which makes calculation of TFPG more difficult. Second, NHS production
may not be optimal, which undermines the welfare interpretation of TFPG.
One of the justifications for having the NHS in the first place is to eliminate a market
in which patients buy outputs from producers. Even in the few cases where the NHS
does sell its output to the final consumer, as for pharmaceuticals prescribed by general
practitioners (GPs) and dispensed to patients who are not exempt from payment, the
price does not equal marginal cost.
18
The absence of final markets has two major consequences for attempts to measure
NHS productivity. The first is that some outputs are not counted at all or are poorly
measured. Instead there may be data only on the activities and even these may be
lacking in many areas of activity. We discuss the implications in section 4.
The second consequence is that, because there are no prices to reveal patients’
marginal valuations of NHS outputs, we have to find other means of estimating their
value. We can do so in two equivalent ways: we can measure the outputs and attempt
to estimate the marginal valuations attached to them or we can measure the outcomes
produced by each unit of output and attempt to estimate marginal valuations of the
outcomes. The bundle of outcomes produced by a unit of output is likely to change
over time in the NHS because of, among other things, changes in technology or
treatment thresholds. In a private market the price of output would change to reflect
this. But in the absence of market prices for NHS outputs it is likely to be easier to
calculate the change in the marginal value of output by focusing on the change in the
vector of outcomes. We show below how the changing mix of outcomes (quality
change) may be allowed for in principle. We discuss how quality adjustments based
on the currently available data can be incorporated into an output index in section 4
and show the results of applying these methods to calculate experimental quality
adjusted indices in Section 5.
We have made suggestions as to how the quality
adjustment can be improved by the collection of additional data in our Second Interim
Report and discuss this further in Section 10.
The major problem in interpreting TFPG in the NHS is that it is by no means obvious
that the NHS is producing optimally.
It may be technically inefficient in the sense
that it is possible to increase some type of output without increasing inputs or
reducing some other output. It may also be producing the wrong mix of outputs.
Figure 2.1 illustrates.
19
Figure 2.1 Productivity, efficiency and welfare
Output
Production frontier
B
P2
P1
Social welfare indifference curve
A
Input
A at year 1 has higher productivity than B at year 2 but lower welfare and is
less efficient (further away from its period production frontier)
Consider the simple single input, single output case in Figure 2.1. Point A in year 1
has higher productivity than point B in year 2 but welfare is lower at point A and, on
any reasonable measure of technical efficiency, A has lower technical efficiency since
it is further from its period production frontier. Technical progress has shifted the
frontier upward from P1 to P2 but the productivity change does not even have the
same sign as technical progress. The increase in welfare between period 1 and 2 is in
part due to technical progress (B was not even feasible with the old technology) and to
improvements in efficiency, perhaps because of changes in institutional structures and
incentive mechanisms.
Note also that both technologies in this example have diminishing returns to scale so
that increases in inputs along the frontier reduce productivity but that such a
movement along the frontier can be welfare increasing.
These considerations suggest that there are problems in interpreting productivity
growth as a welfare or efficiency measure. Nevertheless it can be a useful summary
statistic to be used in conjunction with other data on the NHS. A further justification
for attempting to measure productivity is that it will stimulate improvements in NHS
information collection and processing which may lead to improved decision making
within the NHS.
20
2.4
Value weighted NHS output index
To measure NHS TFPG we need a measure of output growth which reflects the
changes in quality. Let yjt be the social value of the volume of NHS output j (xjt)
measured at the marginal social value of output j at date t (pjt). The marginal social
value of output j depends on the mix of characteristics produced by a unit of output j:
y jt = x jt p jt = x jt
(∑ π
k
q
kt kjt
)
(10)
where qkjt is the amount of characteristic k produced by a unit of j. Notice that we
assume that the marginal social value function is linear
p jt = ∑ k π kt qkjt
(11)
The assumptions that the marginal social value of a unit of output j is a linear function
of its characteristics and that the πkt is independent of j are strong. The latter for
example requires that an improvement in the quality of hospital food (say) per day in
hospital has the same effect on the value of treatment for throat cancer as on the value
of a hip replacement.
The total value of NHS output is yt = ∑ j y jt . We want to measure the discrete time
version of the growth rate of y.
We could use the Tornqvist discrete time
approximation to the continuous Divisia index (9) but for simplicity present the
analysis in terms of a base weighted index.1 In practice there is little difference
between a chained base weighted index and the Tornqvist index. The base value
weighted output index that we seek to measure is
I ytxq =
∑x ∑π
∑x ∑π
j
j
q
jt +1
k
kt kjt +1
jt
k
kt kjt
q
(12)
which allows for changes in volume of outputs (xjt) and of their characteristics (qkjt)
but holds the marginal value of the characteristics (πkt) constant.
We can also express the value weighted index as
1
We compare the results obtained from calculating base weighted indices with those from current
weighted indices and Fisher indices (the square root of the product of the current and base weighted
indices).
21
⎛ x ⎞ ⎛ ∑ π kt qkjt +1 ⎞ ⎛ p jt x jt ⎞
I ytxq = ∑ ⎜ jt +1 ⎟ ⎜ k
⎜ x ⎟ ⎜ ∑ π q ⎟⎟ ⎜ y ⎟
t
⎠
⎝ jt ⎠ ⎝
k kt kjt ⎠ ⎝
kt
⎤ ω ytjt
= ∑ j (1 + g xjt ) ⎡⎣ ∑ k (1 + g qkjt ) ω pjt
⎦
(13)
where gxjt is the growth rate of output xj, gqkjt is the growth rate of characteristic k
produced by output j, ω ptkt is the proportion of the marginal value of output j
accounted for by the k’th characteristic, and ω ytjt is share of the total value of period t
output accounted for by output j.
Note that year to year changes in the marginal value of characteristics (πkt) produced
by an output j do not affect the year to year rate of growth of output j (gxjt). They will
however affect the weights for any index form with chained weights. Thus the
overall, weighted average, rates of growth over a period of years will depend on the
changes in the marginal values. This is precisely analogous to the effect of changing
product prices in output indices for private sector goods and services.
2.5
Changes in marginal social values over time
y = ∑ j pjxj
In section 2.4 we specified the value of NHS output as
= ∑ j ∑ k π k q jk x j . In the rate of growth of the value of NHS output we assumed that
the marginal social values of output (pj) or of outcomes (πk ) were constant over time.
If instead we had allowed changes in marginal values over time then the rate of
growth of the value of NHS output would have been
∑ (1 + g ) ⎡⎣∑ (1 + gπ ) (1 + g ) ω
j
xjt
k
kt
qkjt
kt
pjt
⎤ ω ytjt − 1
⎦
(14)
where gπ kt is the growth rate in the marginal value πkt of characteristic k.
(14) depends both on changes in production conditions (the rates of growth of
outcomes per unit of output and the rates of growth of outputs) but also on preferences
(the rates of growth of the marginal social values of outcomes).
We argue that
changes in the marginal social values of outcomes between periods should not affect
the growth rate between periods of the value of NHS output. The measure of output
22
growth is intended to measure changes in real value of output: the volume of outputs
and the volumes of valuable characteristics they produce. Thus gπ kt should not be
included in the value weighted index of NHS output.
For example, under plausible assumptions the growth in the value of a QALY is
determined by the rate of growth of income and the elasticity of marginal utility of
income (Gravelle and Smith, 2001). But it is not affected by decisions within the NHS
(except perhaps to a negligible extent because NHS decisions affect population health
and thus the growth rate in income by improving worker productivity across the
economy). We should not count changes in the marginal value of the QALY when
calculating real NHS output growth.
This does not mean that changes in the value of QALYs and other outcomes have no
relevance for decision making. Most decisions in the NHS have effects on outputs
and outcomes over several periods – the health gain to a treated patient will typically
accrue over several years. In evaluating these decisions the changing value of health
should be taken into account: health changes accruing in different periods have
different values. Changes in the value of health, and other characteristics, should
affect decisions about the allocation of resources within and to the NHS. But they
should not affect the calculation of changes in productivity between one period and
the next, especially if the measure of productivity is intended to be used in part for
monitoring the performance of the NHS.
Whilst we may want to exclude the growth in the marginal value of outcomes as
contributing to TFPG we have to know whether and how the marginal values change
over time in order to use the correct weights in calculating productivity growth.
Note that
p& j
pj
=∑
k
π k q jk ⎛ π& k q& jk ⎞
⎜ +
⎟
p j ⎜⎝ π k q jk ⎟⎠
(15)
which again brings out the importance of the distinction between outcomes and
outputs. Even though we argue that the rate of growth of marginal social values
should not be counted as part of productivity growth this does not mean that we
23
should remove all the rate of growth of marginal social value of outputs since part of
p& j / p j is due to changes in quality rather than to changing preferences.
2.6
Outcomes and attribution
The characteristics qkjt are the marginal effects of the NHS. Thus for example we
wish to measure the marginal effect of output xjt on the health of individuals receiving
this treatment, holding constant all other factors which affect health. Similarly the rate
of growth of the effect of output j on characteristic k is the change in the marginal
product from one period to the next due to changes in the technology (defined widely
to include the way the NHS is organised). Since the other factors which affect the
marginal product will also affect its rate of growth we should hold them constant in
calculating the growth in the marginal product of NHS outputs from one year to the
next. Although the effects of other factors (e.g. improvements in diet) on the marginal
product should be excluded from the calculation of the growth rate for a particular
year, they are not ignored because they affect the weights applied to the growth rates.
Parts of the national income accounting literature note that health depends on factors
in addition to health service outputs (OECD, 2000; para 7.26 – 7.28). For example
health depends on income, education, age and other factors exogenous to NHS
activity. Hence it is argued one cannot use health outcomes to adjust outputs to take
account of “quality” changes because changes in health outcome may not be
attributable to health service outputs. But what we want is the marginal effect of
output j on health. If the health production function is additively separable in health
service outputs and other factors, then the marginal effect of a health service output is
well defined irrespective of the level of other variables affecting health.
It is more plausible that the health production function is not additively separable so
that the marginal effect of xj on health q depends on the confounding factors. This
does not present a fundamental argument against the use of outcomes. The
longstanding practice of standardising mortality rates to produce a measure of
population health suggests a way round the difficulty. Standardisation produces a
measure of population health from which the effects of population structure (age and
24
gender strata) have been removed so that one can make comparisons of mortality
across periods or areas without the confounding effects of demographic structure.
Under certain circumstances direct standardization can identify the true differences in
mortality. The assumptions required are non trivial (age and gender specific mortality
can be affected only proportionately by area or period (e.g. Yule, 1934)) but direct
standardisation is still useful. (The more common method of indirect standardisation
which produces SMRs requires even stronger assumptions.)
Consider a simple example where health depends on a single NHS output x1 and
another variable not controlled by the NHS, for example education or income, x2. The
health production function is
Qt = a0 t + a1t x1t + a2 t x2 t + a3t x1t x2 t
and the marginal product of health service output is
q1t = ∂Qt / ∂x1t = a1t + a3t x2 t
(16)
Generally we expect the effect of health service activity on health to depend on other
factors ( a3t ≠ 0 ). Hence the growth in the marginal health effect of NHS output,
which is crucial for quality adjusting NHS output indices, is affected by changes in
the confounding factor:
qt +1 a1t +1 + a3t +1 x2 t +1
=
qt
a1t + a3t x2 t
(17)
To remove the effect of the confounding factor we can choose an arbitrary fixed level
of the confounding factor in (17). If we think that the changes in the coefficient a3 t
are not due to health service decisions then we should also standardize with respect to
it as well:
qt +1 a1t +1 + a3 x2
=
qt
a1t + a3 x2
(18)
Obvious choices for a3 and x2 are their base period values or an average of the base
period and current period values.
The health gains from treatment may increase simply because patients live longer.
25
Consider the example of an increase in life expectancy that is not due to developments
in the NHS but reflects rising living standards, changes in diet etc. As a result an NHS
treatment, such as a hip replacement, may produce a greater outcome (QALY gain)
because the recipient of a hip replacement is on average alive for longer to enjoy the
reduced pain and increased mobility resulting from the procedure. Thus the marginal
product (the QALY gain) of the treatment is greater for reasons arising outside the
health service. As far as possible, effects of changes to life expectancy which are quite
independent of the procedures carried out should be kept out of calculation of the year
on year growth rate in quality adjusted output of hip replacements. They should, of
course, be allowed for in decisions about efficient resource allocation in the NHS but
this is not the purpose of constructing an index of NHS output.
There will be some cases where the QALY gain may be partly due to improvements
in the procedure and partly due to patients being “better behaved”- e.g. circulatory
treatments produce more QALYs if patients do not smoke. In terms of (16) the
production function is not separable and judgement will be needed about how to
unravel the impacts of factors exogenous to the NHS.
2.7
Cost and value weights
By using costs to value outputs, a cost weighted output index can be calculated: a cost
weighted sum of the growth rates of output
I ctx =
∑x
∑x
c
jt +1 jt
j
j
jt
c jt
⎛ x ⎞ x jt c jt
= ∑ j ⎜ jt +1 ⎟
= ∑ j (1 + g xjt )ωctjt
⎜ x ⎟∑ x c
⎝ jt ⎠ k kt kt
(19)
where cjt is the unit (average cost) of output j (see section 3). The cost weighted index
I ctx is equivalent to the value weighted quality adjusted index I ytxq only if
(a) quality change is zero for all characteristics of all outputs
(b) cjt is proportional to the marginal social value of output (Dawson et al.,
2004a, section 2.11):
c jt = λt p jt = λt ∑ k π kt q kjt
(20)
There is limited information on both characteristics and their marginal social value so
26
that attempts to estimate I ytxq are bound to involve compromises. Suppose that we
could observe the changes in characteristics but not their marginal values (πkt). How
far does the assumption (20) that the output mix in the NHS maximises social value
subject to budget constraint take us in estimating I ytxq ? Using (20) in (19) gives
I ytxq
⎡
π q ⎤
= ∑ j (1 + g xjt ) ⎢ ∑ m (1 + g qkjt ) kt jkt ⎥ λtω ctjt
c jt ⎥⎦
⎢⎣
kt
⎤ ω ctjt
= ∑ j (1 + g xjt ) ⎡⎣ ∑ m (1 + g qkjt )ω pjt
⎦
(21)
Knowledge of cjt and λt is not sufficient to calculate the value weighted output index.
kt
in
We require knowledge of the relative importance of each characteristic ω pjt
determining the marginal social value of output: we need to know the marginal social
values of each characteristic (πkt), and the amount of each characteristic produced by
each output (qkjt). However if only one characteristic k is socially valuable then
assumption (20) and knowledge of unit costs and the growth rate of the single
valuable characteristic (k) is sufficient:
I ytxq = ∑ j (1 + g xjt )(1 + g qkjt ) ω ctjt
(22)
In practice there is also imperfect information about the amount of the characteristics
(qkjt). Section 4 discusses the possibility of using the currently available data to quality
adjust the cost weighted output index.
3 Current practice
The terminology employed by the Department of Health differs in some respects from
that used in the economics literature. In our outline of current practice we use the
Department of Health terminology but attempt to relate it to the economic concepts
set out in Section 1.1 and used in this report.
3.1
The cost weighted activity index (CWAI)
Prior to 2004 the measure of annual NHS productivity change published by the
Department of Health was based on estimating the change in a cost weighted activity
27
index (CWAI) less the change in NHS expenditure deflated by the index of NHS costs
and prices, to generate a cost weighted efficiency index (CWEI).
CWAI was estimated using data on activity for twelve categories of Hospital and
Community Health Service (HCHS) expenditure:
•
Inpatient and day case episodes
•
Outpatient, A&E and ward attenders
•
Regular day patients
•
Chiropody
•
Family planning
•
Screening
•
District nursing
•
Community psychiatric nursing
•
Community learning disability nursing
•
Dental episodes of care
•
Ambulances
Each category of activity was weighted by its share in HCHS expenditure. There was
no adjustment for improved health outcomes so that the only source of productivity
improvement was an increase in the number of patients treated in hospital, ambulance
trips, etc. per pound of real expenditure.
3.2
The ‘experimental’ NHS cost efficiency and service effectiveness
indices
In 2004 the DH replaced the CWEI and developed two new ‘interim’ indices: an NHS
cost efficiency index and a service effectiveness index. The approach was dictated by
the need to respond to the Treasury’s view that ‘Value-for-money’ should be
measured in ways that permitted assessment of performance against a target of 1%
p.a. improvement in cost efficiency and 1% p.a. improvement in service effectiveness.
The latter was generally understood to refer to return on expenditure to improve
quality.
28
3.2.1 The experimental cost efficiency index
The experimental cost efficiency index incorporates a change to the measurement of
outputs and a change to the measurement of inputs. The change to the measurement
of outputs involved replacing CWAI with an Output Index, which includes
significantly more activities than CWAI and uses Reference Costs to weight different
activities. The Output Index now counts over 1,700 categories of NHS activity and
includes activity in primary care. The services covered are:
•
Elective inpatients (over 500 activity categories)
•
Non-elective inpatients (over 500 activity categories)
•
Outpatients (around 300 activity categories)
•
A&E (9 activity categories)
•
Mental health services (30 activity categories)
•
Primary care prescribing (almost 200 activity categories)
•
Primary care consultations (5 activity categories)
•
NHS Direct calls answered (1 activity category)
•
NHS Direct online internet hits (1 activity category)
•
Walk in centre visits (1 activity category)
•
Ambulance journeys (1 activity category)
•
General Ophthalmic Services (1 activity category)
•
General Dental Services (1 activity category)
•
Others including Critical care, Audiological Services, Pathology, Radiology,
Chemotherapy, Renal dialysis, Community services, Bone marrow transplants
& Rehabilitation (over 100 activity categories)
The coverage is not complete (Lee, 2004) and some of the omitted activities, such as
the Prison Health Service, are not small; though others (Parentcraft Classes) seem
unlikely to have a large impact on the index. But the extension of coverage is a very
significant improvement.
Use of Reference Costs to weight Hospital and Community Health Services (HCHS)
activity means that increases in more expensive treatments will have greater weight in
the Output Index than increases in relatively low cost treatments. This is also true of
29
primary care prescribing which is measured as prescriptions issued and weighted by
the cost of drugs prescribed.
An increase in prescribing more expensive
pharmaceuticals will have a greater effect on the Output Index than increased
prescribing of less expensive drugs. The Output Index is currently used by ONS to
measure NHS output in the National accounts.
Table 3.1 shows the relative weights for each main type of activity in the Output
Index.
Table 3.1 Components of the NHS output index
Cost share
Electives+ day cases
Non-electives
Outpatients
Mental Health
GP & practice nurse consultations
Dentists
Prescriptions
Accidents & Emergency
CCS
Other
Total
DH Output Index
(Laspeyres)
2001/02
Growth in 2002/03
relative to 2001/02
12.84
20.64
10.99
9.56
12.44
4.69
16.48
2.17
3.97
6.20
100
5.10
4.92
4.19
3.62
10.27
-0.61
7.85
4.52
-0.28
5.94
5.36
In comparison with the previous CWEI, the experimental cost efficiency index
includes a revised index of inputs in addition to the revised measurement of outputs.
Since there are no measures of quality associated with the activities included in the
Output Index, the DH has attempted to estimate expenditure on inputs net of
expenditure intended to improve quality. Total expenditure on inputs is reduced by
estimated expenditure on:
•
Increases in capital charges
•
Increases in Private Finance Initiative revenue expenditure
30
•
Increases in HCHS drugs expenditure
•
Increases in Information Technology expenditure
•
Increases in clinical supplies expenditure
•
Increases in Family Health Services drugs expenditure
•
Cost of occupational enrichment
•
Cost of grade enrichment
•
Cost of reduced waiting times
The remaining expenditure on NHS services is deflated by the public sector price
deflator to obtain an index of changes in real NHS inputs. The resulting productivity
measure has been published as an index of NHS unit costs (Department of Health,
2004a, 2004b).
3.2.2 The service effectiveness growth measure
In the absence of data on quality improvement for all the activities included in the
DH’s new Output Index, and the need to quantify quality change for the Treasury, the
DH has identified some areas where it believes that aspects of quality change can be
measured and valued in monetary terms. Under consideration are:
•
Reduced waiting times (outpatient, A&E, inpatient treatment)
•
Reduced mortality rates for specific conditions (CHD and cancer)
•
Improved patient experience
Discussion of how to value these quality improvements is still under way but
possibilities include:
•
Incorporating changes in mortality rates and estimates of the number of ‘lives
saved’. Given the age and gender of lives saved, an estimate could be made of
the Quality Adjusted Life Years (QALYs) produced and valued at £30,000 per
QALY.
An alternative is the £1m per road death avoided used by the
Department of Transport.
•
Placing a value on reduced waiting times and patient experience using data
from discrete choice experiments.
(Source: personal communication.)
31
3.3
Pharmaceuticals and prescribing
Prescriptions issued in primary care are counted as activities and therefore as outputs
in the DH’s new Output Index. The cost weight on this activity is total expenditure on
the drugs prescribed. By contrast, in the hospital sector drugs are treated as inputs,
not outputs. For hospital based activity, drugs prescribed only enter the output index
as an element in the cost weight (Reference Cost) attached to an activity such as a
bypass operation or dialysis: they are not counted as an activity.
The impact of the current treatment of prescribing in primary care as an output
weighted by the cost of the drugs prescribed can be seen in Table 3.2. It is the
movement toward prescribing more expensive drugs that contributes most to the
growth in output.
Table 3.2 GP prescribing in the NHS output index (annual growth rates)
2002/03
2001/02
2000/01
Number of prescriptions
Cost weighted prescriptions
5.44
7.85
5.41
7.52
5.01
6.28
Impact on overall index
DH output index
DH output index excluding prescriptions
5.24
4.74
4.22
3.53
1.82
0.66
When the new NHS output index is used in estimates of productivity growth,
prescription drugs are also counted as an input. ONS in their measure of productivity
change present two variants for family health services drugs (net of receipts from
prescription charges) employing deflators based on the average unit cost of all items
and a Paasche price index for existing items. The latter is an attempt to adjust the
deflator for the changing quality of drugs. These two variants lead to quite big
differences, amounting to about half a percentage point per annum from 1995 to 2003
in real input growth (Lee, 2004, Hemingway, 2004). ONS rely on the Prescription
Pricing Authority (PPA) and plan to consider the division by item in more detail in
future revisions.
32
These problems would disappear in our “preferred” value weighted index of NHS
output (12). Patients treated would be the unit of output weighted by health gain.
Pharmaceuticals would be counted as an input. If prescribing more expensive drugs
turned out to be cost effective in improving health outcomes, this would appear as a
productivity increase.
There is no doubt that GPs add value through the activity of prescribing—otherwise
all licensed drugs would be available over the counter. If this value added is not
reflected in the assumption that the wage rate approximates the marginal product of
GPs, a measure of this value added would be the correct weight for the activity of
prescribing in the short-term cost weighted activity index.
4 Quality adjustment with available data
In this section we discuss how we can use available data to calculate a quality
adjusted index of NHS output which corresponds as closely as possible to the ideal
value weighted output index
I ytxq =
∑x ∑π
∑x ∑π
j
j
q
jt +1
k
kt kjt +1
jt
k
kt kjt
q
(12)
Calculation of (12) requires information on the outputs (xjt), the outcomes qkjt, and the
marginal social values of the outcomes πkt. Previous NHS output indices have been
derived from information on outputs and have implicitly assumed that unit costs
measure marginal social values. As we noted in section 2.7, even if this assumption is
correct the resulting cost weighted index is not equivalent to the value weighted index
unless there is no change in quality. With current information any outcome index will
have to rely heavily on the assumption that unit costs measure marginal social value.
Thus the main focus of the section is the extent to which it is possible to use
additional existing data to calculate a quality adjusted cost weighted index. Sections
4.1 to 4.4 examine issues in the measurement of outputs (x), section 4.5 discusses
sources of information on marginal social values (π), and sections 4.6 to 4.13 consider
how existing data on long and short term survival, readmissions, MRSA, waiting
33
times and patient satisfaction can be used to proxy changes in outcomes as quality
adjustments (q). The annex provides a flow chart showing the relationship of the
various indices estimated in the report.
4.1
Activities or outputs as the unit of analysis
International guidance on the measurement of government output for national
accounting purposes recommends distinguishing activities, outputs and outcomes. In
the health service, activities would include operative procedures, diagnostic tests,
outpatient visits, and consultations; outputs might comprise courses of treatment
which may require a bundle of activities; and outcomes would be defined as the
characteristics of output which affect utility.
4.1.1 Activities: institutional approach
NHS productivity measures have been based upon estimates of the number of
particular types of activities (procedures, consultations etc) or the number of patients
treated in various institutional settings (see section 3).
There are advantages to continuing within this framework. In instances where care for
a patient with a particular condition is provided entirely within one setting,
aggregation within the setting is equivalent to aggregation by patient pathway or
disease group. It ensures compatibility with current NHS reporting systems and is
likely to prove amenable to analysis at a disaggregated level. It can be a useful means
for monitoring and managing lower level units within the NHS. Further, the approach
would ensure consistency with other policy initiatives, most notably the Payment by
Results reforms (Department of Health, 2002a).
The major disadvantage is that most patient cases pass through more than one
institutional setting and their care requires several activities. For example, a patient
who has a hip replacement will typically have been seen in general practice, in an
outpatient department, treated as an inpatient in hospital and received after care
treatment from her general practitioner and from personal social services. Such care
patterns can lead to double counting and make problematic the valuation of output of
34
separate sectors contributing to joint production across sectors.
Current routine administrative data systems cannot track patients and their resource
use as they move along care pathways across settings. Even within institutional
settings data may not be appropriately linked. For example, whilst there are very
detailed data on types and quantities of different drugs dispensed to the patients of
individual general practitioners, they are not linked to the individual patient or even to
diagnostic group, so it is not possible to say who got what prescriptions or for what
condition.
4.1.2 Outputs: patient-centred or disease-based approach
The bulk of NHS activities or services are delivered to individual patients with the
aim of improving their health.
But a disease or patient pathway approach has
demanding data requirements. The approach is being investigated by US researchers
(Berndt et al., 2002; Berndt, Busch and Frank, 2001; Cutler and Huckman, 2003;
Shapiro, Shapiro and Wilcox, 2001) and, in the UK, by the Office for National
Statistics. It is probably the best way forward in the long run but is not fully
implementable with the types of data available in the NHS in the short to medium
term. One key element required is linkage of patient records across activities and this
improvement in the data is planned by the DH. Another requirement is the use of
clinical teams to identify procedures and tests relevant to specific conditions and
provision to update coding for procedures along clinical pathways as technology
changes.
The relative advantages of the patient/disease group and institutional setting
approaches depend on the degree of coverage, ease and timeliness of data collection;
the dangers of double counting (for instance, where patients suffer multiple health
problems); the ability to link to data on outcomes or prices; and the usefulness of the
disaggregated measures (for instance, in changing behaviour).
For the short to medium term the lack of linked routine data means that the
measurement of NHS output will be based predominantly on the measurement of
activities rather than patients.
35
4.2
Unit of hospital output
The main source of data on hospital output (excluding outpatient activity) is the
Hospital Episode Statistics (HES) which is derived from the cleaned returns submitted
by hospital trusts.2 There are four possible measures of hospital activity.
•
Consultant episodes. The basic unit in HES is the consultant episode. Each
observation records the treatments provided to a patient whilst they are under
the care of a particular consultant.
HES contains episodes which are
unfinished at the start and end of each HES year.
•
Finished consultant episodes (FCEs). A count of episodes means that an
episode which spans two HES years would be counted in each year. FCEs are
episodes which have finished by the end of the HES year, though they may
have begun before the start of the HES year. The DH’s new Output Index use
finished consultant episodes (FCEs) since unit costs are derived from the
Reference Costs data and these are defined for FCEs.
•
Provider spells (PS). Around 8% of patients have more than one FCE during a
spell in a hospital. It is possible to link episodes in the same spell to count
provider spells.
•
Continuous inpatient spells (CIPS). Some patients (around 1%) are transferred
to another provider at the end of an episode and it is possible to link episodes
across providers to yield continuous inpatient spells.
The amount of HES activity by year for FCEs, and CIPS is shown in Table 4.1.
where, as they should, total FCEs exceed total CIPS. Both of these HES volume data
are also always larger than those reported in the Reference Cost returns. The growth
in activity (measured as the total numbers of Reference Cost hospital activities, and
by HES based FCEs and CIPS) varies according to the measure employed, with all
showing a larger increase in activity between 2002/03 and 2003/04. Appendix B
describes our use of HES in more detail, including the construction of unit costs for
spells.
2
From 2003/4 HES data includes outpatient attendances and Accident and Emergency department
activity but this had not been included in released databases at the time of producing this report.
36
CIPS more nearly correspond to the patient journey. CIPS capture most
comprehensively the full package of inpatient care and they are less vulnerable to
being miscounted if transfers among providers vary over time or if there are changes
in how “being under the care of a consultant” is defined. We have therefore calculated
most of our indices using CIPS, though we also report comparisons of CIPS and FCE
based indices (section 5).
We recommend that future measures of hospital sector output use CIPS as the unit of
outcome.
37
Table 4.1 Number of episodes, CIP spells from HES and number of episodes
from Reference Costs
HES data
Episodes
Electives
Non-electives
Total
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
5491046
6486405
11977451
5577523
6613290
12190813
5573942
6692208
12266150
5485256
6802085
12287341
5664968
7025377
12690345
5815929
7516611
13332540
98/99-99/00
1.57%
1.96%
1.78%
99/00-00/0
-0.06%
1.19%
0.62%
5487579
5618166
11105745
5479633
5595606
11075239
98/99-99/00
1.12%
-2.86%
-0.94%
99/00-00/0
-0.15%
-0.40%
-0.27%
4805812
5220380
10026192
5166244
5350960
10517204
98/99-99/00
1.59%
3.34%
2.50%
99/00-00/0
7.50%
2.50%
4.90%
Growth
Electives
Non-electives
Total
CIP spells
Electives
Non-electives
Total
5427066
5783750
11210816
Growth
Electives
Non-electives
Total
00/01-01/02 01/02-02/03 02/03-03/04
-1.59%
3.28%
2.66%
1.64%
3.28%
6.99%
0.17%
3.28%
5.06%
5386575
5607484
10994059
5578093
5963742
11541835
5736331
6411777
12148108
00/01-01/02 01/02-02/03 02/03-03/04
-1.69%
3.55%
2.84%
0.21%
6.35%
7.51%
-0.73%
4.98%
5.25%
Reference Cost data
Electives
Non-electives
Total
Growth
Electives
Non-electives
Total
4.3
4730410
5051451
9781861
5171867
5604390
10776257
5360406
5684987
11045393
5467913
6021765
11489678
00/01-01/02 01/02-02/03 02/03-03/04
0.11%
3.65%
2.01%
4.74%
1.44%
5.92%
2.46%
2.50%
4.02%
Alternative sources for hospital activity
There are two alternative sources of information about hospital activity:
•
the Reference Cost returns and
•
the Hospital Episode Statistics (HES).
Table 4.1 compares Reference Cost activity volumes with those for HES FCEs and
CIPS.
38
The Reference Cost returns have been compiled annually since 1998 and have
become steadily more comprehensive. Hospital activity is summarised as aggregated
counts separately for elective inpatients, elective daycases and non-electives by each
Healthcare Resource Group. Based on version 3.1 HRGs, the Reference Cost returns
include up to 3×565 HRG categories for hospital activity (excluding “unclassified”
HRGs).
Hospital activity is also available from the Hospital Episode Statistics. HES returns
have been submitted by NHS providers since the late 1980s. HES contains data on
every admitted patient, and comprises individual patient records, with information
extracted directly from each patient’s medical record.
HES provides different counts of activity to that recorded in the Reference Costs
returns, the main reasons being the following:
•
First, the HES data undergo a more thorough process of validation than the
Reference Cost returns. Among other things, this validation strips out
duplicate records and ensures assignment to the correct Healthcare Resource
Group. The estimates of activity submitted in the Reference Cost returns are
not subject to the same validation process.
•
Second, HES counts all FCEs, whereas there is variable practice in what is
recorded in the Reference Cost returns: sometimes all FCEs are recorded,
sometimes only first FCEs are recorded. The main discrepancies between
activity counts in HES and Reference Cost returns relate to activities with very
long lengths of stay, including rehabilitation, mental health, bone marrow
transplants, cystic fibrosis, etc. These are in HES but stripped out of Reference
Cost activity.
•
Third, there may be differences in how activity is apportioned to each year.
HES includes all FCEs that are completed within the financial year. It is not
clear how patients are counted in the Reference Cost returns when their
hospital stay crosses the end of the financial year.
As well as being more thoroughly validated, HES is to be preferred to the Reference
Cost return for the following reasons:
39
•
Ideally, as explained in the previous section, we should be capturing each
individual’s journey through the health system. The best available measure of
this the Continuous Inpatient Spell (CIPS). CIPS cannot be derived from
Reference Cost returns.
•
Being individual patient records, it is possible to aggregate the HES data in
various ways. We aggregated the HES data into Healthcare Resource Groups,
so that there is an equivalent set of activity categories as for the Reference
Costs. But it is perfectly feasible to aggregate the data to other groupings, such
as specialty or OPCS procedure. Moreover, HES data can be allocated easily
to different HRG classifications, as the classification system is periodically
revised. This flexibility in deciding activity categories is lacking in the
Reference Cost returns, because these data have already been aggregated.
•
We argue that NHS activities should be quality adjusted. For hospital activity,
the HES data include items by which it is possible to make these adjustments,
notably the waiting time prior to admission and the discharge status of the
patient (from which mortality rates are derived). This information is not
available in the Reference Cost returns.
For there reasons, we use HES based activity estimates rather than the Reference Cost
returns for all elective inpatient, daycase and non-elective hospital activity. We use
the Reference Costs database for the other sources of activity.
There are two broad HRG groups in HES not included in the reference costs – these
have code T (mental health) and U (unclassified). In a spells calculation we need to
include all HES activity. If we did not do so then patients whose spell included one of
the omitted categories would be excluded. In addition it is also important that
unclassified groups are included in a count of activities since it is likely that over time
less and less activities get put into an unclassified category. It is necessary to impute
unit costs to these activities. In the case of group T we used average unit cost for other
Mental Health activities and for group U we used the median cost across FCEs.
40
4.4
General practitioner consultations
Estimates of consultation activity are derived from the consultations reported by
respondents in the General Household Survey and are available by location (surgery,
home, phone) and provider (GP, practice nurse (but only after 2000)). The estimate of
the number of consultations per year is made by multiplying the number of reported
consultations in the 14 days prior to interview by 26.
No allowance is made for seasonal factors - the date of the consultation varies across
respondents and has also varied between rounds of the GHS. There have been
implausibly large changes in the numbers of consultations reported in the GHS for
some age-gender groups from one year to next. The GHS was also not undertaken in
1997/8 and 1999/2000 so that estimates for these years have to be interpolated. Data
on consultations with practice nurses was not collected before 2000.
These deficiencies of the GHS as source of GP consultations information are widely
recognised (Atkinson, 2005, pp 108-111). The DH has been investigating the use of
GP record systems as a source of more accurate and detailed data.
We have
previously made detailed suggestions on how such data should be collected (Dawson,
et al., 2004b, 2004c).
New data from the QRESEARCH database derived from downloads from around 500
general practices has recently become available but too late for inclusion in this
report. We have agreed to undertake an analysis of general practice consultation rates
data from QRESEARCH for the DH which will examine what if any adjustments
need to be made to QR consultation counts to produce an estimate of consultation
activity. This report will be delivered separately in the Spring of 2006.
4.5
Measures of marginal social value of outputs
4.5.1 Unit costs
Current NHS practice, which follows the recommendation of Eurostat (2001), is to
use production costs (such as the average costs as reported in the annually produced
41
Schedule of Reference Costs) as weights in the calculation of output indices. This
implies that costs reflect the value that society places upon these activities. So
cochlear implant (with a unit cost of £23,747 in the 2002/03 Reference Costs) is
assumed to be 25 times more valuable than a normal delivery without complications
(unit cost £921). The use of unit costs as weights reflecting the marginal social value
of outputs has the support, albeit reluctant, of Hicks (1940) but as we have noted
(section 2.7) it rests on the strong assumption that resources are allocated efficiently
in the NHS so that unit costs are proportional to the marginal value of output
produced. Even with this assumption the use of unit costs will not allow for quality
changes (section 2.7) in the calculation of growth between one year and the next.
Reference Costs estimates of unit costs are based on allocations of fixed costs to
HRGs with FCEs as the unit of measurement. We have investigated whether it would
be possible to improve on this method of estimating activity costs by using regression
analysis. The Second Interim Report (Dawson et al., 2004c; section 3.7.1) describes
how we attempted to estimate cost functions using a provider Trust level panel of data
on activities and costs and the problems we encountered. Our subsequent estimations
were no more successful. We describe these attempts in Appendix D. Apart from
difficulties in trying to back compute total provider costs from the Reference Cost
data on unit costs and activities, the main problem is one of degrees of freedom.
There are more HRG activity types (approximately 550) than Trusts (approximately
180) so that even with observations over 6 years it is necessary to use quite high
levels of aggregation of activities.
We feel that the unit costs in the Reference Costs are very unlikely to measure
marginal costs, even long run marginal costs, because of the accounting procedures
used to generate them. We understand that, as a result of the introduction of Practice
Based Commissioning and Payment by Results, the DH is considering the production
of a new set of unit costs for spells, rather than for FCEs. There is a danger that
Payment by Results will encourage misreporting behaviour, with providers reporting
their Reference Costs close to the tariff and being reluctant to divulge information
about where their costs deviate from the tariff. If such behaviour is widespread, in
future the Reference Cost database may not even approximate average, let alone,
marginal costs.
42
There is Reference Cost data on unit costs from 1997/98. The data had patchy
coverage in the early years: only 76% of activity currently recorded had unit costs
assigned to them in 1997/98. Moreover there were some considerable fluctuations in
unit costs for specific HRGs in the early years (Street and AbdulHussain, 2004).
We therefore decided to use the unit estimates for 1999/00 for all previous years.
When there were missing unit cost data we used the estimates from the previous or
following year if these were available. Some activities measured in HES have no
corresponding unit costs in the Reference Costs databases and for these activities we
felt it was better to retain them in the index by applying the weighted average
reference cost for all other activities for that year. Having dealt with the question of
duplicate entries we took great care to ensure that no other entries were dropped as a
result of missing data. The general principle was that it was important not to lose any
patients merely on the grounds that the records were less than complete. To this end
we replaced missing data for HRG unit costs by averages for the whole. Other missing
variables were again replaced by suitable population averages. For example, in order
to determine the individuals’ age we use the variable STARTAGE (age at start of
episode) from the first episode in the spell. However, this is missing for some
individuals, (e.g. in 2002/3 35,554 episodes did not have age recorded). These were
replaced with the mean age for individuals of the same gender in the particular HRG
and year. For those in sparsely-populated HRGs, missing values were replaced with
the mean for the whole population.
4.5.2 Private sector prices
Under certain conditions the market prices for goods and services measure their
marginal social value and hence can be aggregated for the construction of measures of
the growth rate of output. One possible method of valuing NHS output might be to
use prices from the private sector. Some NHS activities have close matches in the
private sector. The main example is that some types of elective care are provided
both in the private and public sectors. There are also a few private sector general
practitioners. Non-emergency ambulance transport is similar to a taxi service.
In principle it might be possible either to use the private sector prices of outputs to
43
value NHS outputs or to estimate hedonic price functions to value characteristics or
outcomes of NHS output such as waiting times and hotel services. However there are
problems with attempting to use private health sector prices as measures of marginal
social value of NHS outputs or outcomes:
(a) the private sector produces very little emergency care and relatively little nonelective care, roughly half of NHS activity.
(b) private sector outputs have a different mix of characteristics compared to the NHS.
The health effects of treatment are probably broadly similar, but waiting times are
much shorter and the quality of hotel services higher. Thus it would be necessary to
attempt to estimate hedonic price functions to derive the marginal value of
characteristics (πkjt) rather than use the market price of the output to weight NHS
outputs. Time and resource constraints meant that we did not consider this to be a
feasible option for this project, though it may be worthwhile for the DH to
commission scoping review to investigate the possibility.
(c) private patients are not a random sample of the population – they tend to be richer
and better educated. Thus any estimated hedonic price function from the private sector
may not predict the marginal valuations of characteristics for the general population.
(d) much private health care is purchased by insured individuals so that the market
price of care will overstates its marginal value to the private patient.
(e) because the NHS is now encouraging commissioners (PCTs and general practices)
to buy care from the private sector, prices for care to private patients will be
increasingly influenced by the prices set by the NHS, which are based on Reference
Costs.
Private sector prices are therefore unlikely to be useful as sources of marginal social
values for most NHS outputs.
4.5.3 International prices
There is a precedent in cost benefit analysis for using world prices to value domestic
output when domestic prices are absent or distorted. The rationale is that because
trade could take place at world prices, they are legitimate measures of opportunity
cost to the domestic economy. This option is not particularly useful in the valuation of
UK health care outputs. There is not a significant world market in health care. In the
44
countries that do have published prices for health treatment, these tend to be
administered prices subject to stringent domestic regulation or negotiation. It is highly
unlikely that the relative prices observed in other countries will correspond to the
relative value of NHS outputs.
These caveats notwithstanding, we did report in our Second Interim Report (Dawson,
et al., 2004c; section 3.7.2) whether the valuations of activity would be sensitive to
the use of price information from other countries. We concluded that international
prices were not likely to be useful as sources of relative marginal social valuations of
NHS outputs. There were major differences in definitions of outputs so that it was not
clear that similar outputs could be compared. Even when we were reasonably
confident that the outputs were similar there were marked differences in the relative
costs of treatments between different countries. For example the 2001/2 ratio of the
costs of bilateral primary and primary hip replacement to the cost of a percutaneous
transluminal coronary angioplasty (PTCA) was 2.54 in Australia and 0.94 in Italy.
(The ratio in the Reference Costs database was 1.89). Other studies have also found
marked differences in the input usage and hence costs for particular conditions (Baily
and Garber, 1997).
4.5.4 Value of health
A value per QALY of £30,000 is believed to be compatible with the decisions made
by the National Institute of Clinical Excellence, although they do not mention a value
of life explicitly (Devlin and Parkin, 2004). The DH has commissioned research into
the value of a QALY but its results are not yet available. We have taken £30000 as
our reference value, assuming it applies for the year 2002/3 and adjusted it by the rate
of growth of money GDP for other years.
4.5.5 Value of waiting time
A value weighted output index requires that we weight changes in the various
characteristics by an estimate of the monetary value of each characteristic. One
source of data on willingness to pay to reduce waiting times is evidence from discrete
choice experiments. A recent review of the literature (Ryan, Odejar and Napper,
45
2004) reported that few studies addressed the issue of the monetary value of reducing
waiting times for health care and contrasted this with the significantly greater volume
of work on the value of time saving in transport. Of the six UK papers, only one
sampled the English population. The other five were location or procedure specific.
Ryan summarises the available evidence converting to 2002/03 prices. Propper’s
analysis of English data suggests estimated values between £36.25 and £94.19 for a
one month reduction in waiting time. Hurst’s study of waiting time for non-urgent
rheumatology estimated values between £11.95 and £23.68 per week. Ryan points
out that the Propper and Hurst studies give similar values assuming a linear additive
model. A major limitation of the data available is its age. Propper’s survey was
undertaken in 1987. While it is possible to adjust prices for inflation, it is also likely
that willingness to pay to reduce waiting time has changed over the last eighteen
years.
We illustrate the effect of adopting a value weighted output index in place of a cost
weighted index for a small subset of outputs where we have some health effects data
(section 6). We have used the upper limit of the Propper evidence, £94.19 per month
which corresponds to £3.13 per day in 2002/03 prices. This was the willingness to pay
of retired individuals with above average incomes in the original survey. To explore
the sensitivity of the index to price, we also use £50 per day which implies that a one
month reduction in waiting is worth £1400. This is an arbitrary number which
introspection suggests is likely to be at the high end of any willingness to pay for a
reduction in waiting time for most elective care.
If the DH wishes to make a value weighted output index a regular part of reporting
NHS performance, we recommend that new research is undertaken on social
willingness to pay to reducing waiting times.
4.5.6 Expert groups
Clinical experts could provide estimates of the health effects of treatment without the
need to deny cost-effective treatment to some patients for some treatments. They
have been used in the UK for CABG (Williams, 1985), in the Netherlands to estimate
burden of disease for 52 diagnostic groups accounting for 70% of health care costs,
46
and in the US for producing quality adjusted price indices for depression treatment
(Berndt et al., 2002).
We discussed the use of expert groups in our Second Interim Report (Dawson et al.,
2000c, section 3.1).
We do not believe that they should be used to provide
comprehensive annual updates of the estimated health effects. Such groups are costly
to convene, organise and train. They would be useful for a limited set of major
conditions, supplementing the regular annual snapshot before and after health data
collected from patients that we recommend (section 10.1).
4.6
Quality adjustment for health effects of treatment
In the next four sections we consider how far it is possible to use existing data to
quality adjust the output index for changes in the health effects of treatment, waiting
times, and patient satisfaction with the process of care. We consider first what we
would like to measure in principle.
We assume for the moment that the only valuable characteristic of NHS care is its
effect on health status and examine how we might use data on post treatment
mortality to produce a quality adjusted index of NHS output. In this and following
subsections we consider various methods of using mortality information and
combining it with other very limited data on the health effects of treatment, stressing
the assumptions required.
As we are assuming in this section that health is the only relevant characteristic of
health care we drop the subscript identifying the characteristic. Thus we use qjt, πt
instead of qkjt, πkt. Denote the discounted sum of QALYs produced by the treatment if
the patient survives treatment by
q*jt = ∑ s δ sσ *jt ( s )∑θ ρ *jt (θ , s )h jt ( s ) = ∑ s δ sσ *jt ( s )h*jt ( s )
(23)
δ is the discount factor on QALYs. h*jt ( s ) is the expected level of health s periods
after treatment, conditional on being alive at time t:
h*jt ( s ) = ∑θ ρ *jt (θ , s )h(θ )
(24)
47
σ *jt ( s ) is the probability of surviving s periods given that the patient survived
treatment j at date t, h(θ) is the health level from having health state θ, where θ is a
vector of mental and physical health characteristics, and ρ *jt (θ , s ) is probability of
being in health state θ conditional on surviving s periods after treatment j at date t.
To reduce notational complexity in examining the properties of the various indices we
ignore the effect of age and gender on mortality, survival and the probability
distribution of health states.
Some HRGs are already age specific.
A more
disaggregated analysis is analytically straightforward by defining the output type by
finer age categories and gender as well as HRG.
If the patient had not been treated their discounted sum of expected quality adjusted
life years would have been
qojt = ∑ s δ sσ ojt ( s )∑θ ρ ojt (θ , s )h jt ( s ) = ∑ s δ sσ ojt ( s )h ojt ( s )
(25)
h oj ( s ) is expected health if the patient would have survived s periods hence without
receiving treatment j. σ ojt ( s ) is the probability of surviving s periods if not treated. It
depends on the probabilities of health status θ at s conditional on surviving without
receiving treatment ( ρ oj (θ , s ) ) .
Setting health status when dead to zero, the expected increase in discounted QALYs
from treatment j at time t is
q jt = (1 − m jt ) q*jt − qojt
(26)
where mjt is the probability of death within a short period of treatment j. This
expression for the health effect of treatment is useful because it distinguishes three
components of qjt which are controllable by the NHS to different degrees and hence
should be treated differently in calculating output growth rates attributable to the
NHS.
The amount of health outcome produced per unit of output can change over time
because of changes in
• short term post treatment mortality rate mjt;
48
• survival probabilities and health status probabilities conditional on survival
σ *jt ( s ) , ρ *jt (θ , s )
• survival and the health status probabilities conditional on not having
treatment σ oj ( s ) , ρ oj (θ , s )
The first is arguably the component most clearly attributable to the NHS for many
treatments given current data and the third is unaffected by the NHS for all treatments.
The effect of the NHS on the second will vary across treatments from relatively little
effect on say varicose vein stripping and a large effect for cancer treatments.
Figure 4.1 illustrates the effect of treatment j at date t. The lower dashed line shows
the expected time stream of health given treatment after allowing for the treatment
mortality probability.
Figure 4.1 Expected time streams of health without treatment ( h oj ( s ) ), with
treatment conditional on surviving treatment ( h*jt ( s ) ), and with treatment
( (1 − m jt )h*jt ( s ) )
h
h*jt ( s )
(1 − m jt )h*jt ( s )
h oj ( s )
t
Let the marginal social value at time t of a QALY be πt (£s per QALY) so that the
49
marginal social value of unit of output j at time t is
p jt = π t q jt = π t ⎡⎣(1 − m jt ) q*jt − qojt ⎤⎦
(27)
We wish to calculate the value weighted output index (12) which in the special case in
which health is the only socially valuable characteristic is
I ytxq =
∑ x πq
∑ x πq
jt +1 t
j
j
jt
t
jt +1
jt
⎛ x q ⎞ π t q jt x jt
= ∑ j ⎜ jt +1 jt +1 ⎟
= ∑ j (1 + g xjt )(1 + g qjt )ω jty (28)
⎜ x q ⎟∑ π q x
jt ⎠
⎝ jt
j t jt jt
where g xjt = ( x jt +1 − x jt ) / x jt and g qjt = ( q jt +1 − q jt ) / q jt are the discrete period rates of
growth of the output j and the health outcome per unit of output j.
Consider the health adjusted cost weighted output index
⎛ x jt +1 q jt +1
I ctxq = ∑ j ⎜
⎜ x jt q jt
⎝
⎞
⎟=
∑ j c jt x jt ⎟⎠
c jt x jt
∑
⎛ q jt +1 ⎞
x
⎜⎜
⎟⎟ c jt
jt
+
1
j
⎝ q jt ⎠
∑ x jt c jt
(29)
The assumption of efficient allocation of NHS resources with only one quality
characteristic outcome takes the form
c jt = λtπ t q jt = λtπ j ⎡⎣(1 − m jt ) q*jt − qojt ⎤⎦
(30)
We can interpret λtπ t as the cost per QALY used by the NHS in making its treatment
decisions and is the optimality condition that at the margin all treatments have the
same cost-effectiveness ratio. If we make the assumption (30) the health adjusted cost
weighted output index is also the value weighted output index I ctxq = I ytxq . (This is
just (22) with simpler notation.)
If we do not assume efficient allocation then we can justify calculating I ctxq by arguing
that we are interested in a weighted average of the growth rates of the “real” parts
(outputs, quality as measured by health gain per unit of output) of the value of NHS
activity and that costs are convenient weights.
We can write the term qjt+1/qjt in (28) as
q jt +1
q jt
= 1 + g qjt =
(1 − m jt +1 ) q*jt +1 − qojt +1
(1 − m jt ) q*jt − qojt
=
a jt +1 q*jt +1 a jt q*jt
a jt q*jt
q jt
−
qojt +1 qojt
qojt q jt
50
= (1 + g ajt )(1 + g q* )
jt
a jt q*jt
a jt q*jt − qojt
− (1 + g qo )
jt
qojt
(31)
a jt q*jt − qojt
where ajt = (1−mjt ) is the survival rate (proportion of patients getting treatment j who
are alive for at least a short period after treatment). g ajt is the growth rate of survival
and g q* , g qo the growth rates in q*jt , qojt .
jt
jt
The NHS output growth rate in any year should not include changes in qjt which arise
because of changes in health if not treated ( qojt ). Hence in calculating the growth rate
we should set g qo = 0 and the health effect adjustment should be not (31) but
jt
q jt +1
q jt
⎛ a jt +1 ⎞ ⎛ q*jt +1 ⎞ a jt q*jt qojt
a jt q*jt qojt
=⎜
−
= (1 + g ajt )(1 + g q* )
−
⎜ a ⎟⎟ ⎜⎜ q* ⎟⎟ q
jt
q jt
q jt
q jt
⎝ jt ⎠ ⎝ jt ⎠ jt
(32)
Notice that although we hold qojt constant in calculating the annual growth in qjt
attributable to the NHS, changes in health without treatment will lead to changes in
the weights.
The larger is the growth in survival (gajt) and the larger the growth in health after
treatment ( g q* ), both of which may be attributable to the NHS, the greater is the
jt
health effect adjustment factor (qjt+1/qjt) and the greater the index of NHS output.
In general we do not have data on health conditional on surviving treatment q*jt or
conditional on no treatment qojt . We do have data on the probability of surviving
treatment ajt for all hospital spells. It is also possible that in the near future we may
have information on longer term survival σ *jt ( s ) for a large number of NHS patients.3
We therefore consider in the next section how it will be possible to use information on
treatment survival and longer term survival.
3
Note that we make a distinction between short term survival (a) and long term survival conditional on
short term survival (σ*) whereas the little data currently available is couched in terms of unconditional
survival probabilities which is the product of a and σ*.
51
4.7
Quality adjustment using long term survival
We want to estimate the quality adjusted cost weighted index (29) where the quality
adjustment factor is given by (32). We do not know health status conditional on
treatment and surviving s periods h*jt ( s ) . One possibility is to assume that health
status s periods after treatment is proportional to the current health status ht ( s ) of an
average person who is s years older: h*jt ( s ) = f jth ht ( s ) . Such data are available for
example from the 1996 Health Survey for England.4 Then we can estimate the growth
in discounted expected QALYs conditional on surviving treatment as
q*jt +1
q*jt
∑δσ
=
∑δσ
s
⎛ σ *jt +1 ( s ) ⎞ ⎛ f jth+1 ⎞ ⎛ δ sσ *jt ( s ) f jth ht ( s ) ⎞
= ∑s ⎜ *
⎟⎟
*
h
⎜ σ ( s ) ⎟⎟ ⎜⎜ f h ⎟⎟ ⎜⎜
q*jt
⎝ jt
⎠ ⎝ jt ⎠ ⎝
⎠
jt ( s ) f jt ht ( s )
*
jt +1
s
s
s
( s ) f jth+1ht ( s )
(33)
Notice that we have use ht ( s ) rather than ht +1 ( s ) in the numerator since changes in
the general health of the population should not affect the rate of growth of health
conditional on surviving treatment. We do allow for changes in the proportionality
factor f jth to affect the health conditional on short term survival since this will reflect
improvements in medical technology or in patient selection, both of which should be
attributed to the NHS. In the current state of knowledge we cannot generally estimate
the change in the proportionality factor from one period to the next and so assume that
it is constant. Hence the proportionality factors cancel from the numerator and
denominator in the middle ratio in the final part of (33) and we have
q*jt +1
q*jt
∑δσ
=
∑δσ
s
s
s
⎛ σ *jt +1 ( s ) ⎞ ⎛ δ sσ *jt ( s ) f jth ht ( s ) ⎞
= ∑s ⎜ *
⎟⎟
*
h
⎜ σ ( s ) ⎟⎟ ⎜⎜
q*jt
⎝ jt
⎠⎝
⎠
jt ( s ) f jt ht ( s )
*
jt +1
s
( s ) f jth+1ht ( s )
(34)
Thus with information on ht ( s ), σ *jt ( s ) and assumptions about f jth we can compute
q*jt +1 / q*jt . If it is also the case that the growth rate in survival ( gσ *t = σ *jt +1 / σ *jt - 1) is
constant over s then we do not even need to make assumptions about the magnitude of
the proportionality factor: (34) simplifies to
4
To keep the presentation simple we have assumed implicitly that all patients in an HRG have the
same age and gender so that all have the same survival probabilities and expected health status. In
practice when long term survival data become available it would be necessary to consider whether it
was necessary to calculate age and gender specific survival rates and expected health status.
52
q*jt +1
q*jt
= (1 + g q*t ) = (1 + gσ *t )
(35)
We still need information on h ojt ( s ), σ ojt ( s ) to calculate qojt for (32) and estimates of
ht ( s ) are of little help since we would have to specify proportionality factors which
vary across the activity types and we have no information on survival without
treatment for most types. One way to proceed, which is more plausible for treatment
of cancer and CHD than for cataracts, is to assume that the alternative to treatment is
death so that qojt = 0 and (32) becomes
q jt +1
q jt
=
a jt +1 q*jt +1
a jt q*jt
=
a jt +1
a jt
⎛ σ *jt +1 ( s ) ⎞ ⎛ δ sσ *jt ( s ) f t h h j* ( s ) ⎞
⎟⎟
∑ s ⎜⎜ σ * ( s) ⎟⎟ ⎜⎜
q*jt
⎝ jt
⎠⎝
⎠
S
(36)
Again if we assume a constant growth in long term survival for all ages the health
effect adjustment (36) simplifies further to
q jt +1
q jt
=
a jt +1 q*jt +1
a jt q*jt
= (1 + g at )(1 + gσ *t )
(37)
There are two practical issues to consider. First to which HRGs should the adjustment
in respect of say cancer survival be applied? Cancer diagnoses appear in a number of
HRGs. One possibility is to apply the adjustment to the HRGs which treat the highest
proportions of patients with cancer diagnoses. Alternatively one could attach group
cancer patients by the HRG of their activity.
Second, we must choose a time horizon S for the adjustment. Lakhani et al. (2005)
have recently presented estimates of 5 year cancer survival rates. The longer the
horizon over which the summation in (36) takes place the more accurate the estimate
of health effects. But a long time horizon has disadvantages. If the adjustment to a
year is based on the survival experience of patients actually treated in that year then
the output indices for S previous years will have to be revised every year. The
alternative is to adjust the index for a particular year using the survival experience of
patients treated S years previously. Thus the longer is S the greater the extent of
revisions or the more out of date the adjustment.
53
Rather than use general population estimates of ht ( s ) it may in some cases be
reasonable to make an even cruder assumption, that survival after S years is very low
or that h*jt ( s ) after S years is very low. Setting h*jt ( s ) = 0 for s > S and assuming
that each year to S has the same QALY score ( h*jt ( s ) = h*j ) we get
q jt +1
q jt
=
a jt +1 q*jt +1
a jt q*jt
=
a jt +1
a jt
⎞
⎛ σ *jt +1 ( s ) ⎞ ⎛
σ *jtδ s
⎜
⎜
⎟
∑ s=1 ⎜ σ * ( s) ⎟ ⎜ 5 σ * ( s)δ s ⎟⎟
⎝ jt
⎠ ⎝ ∑ s =1 jt
⎠
S
(38)
If the trend improvement in survival was reasonably stable over long periods then the
use of the lagged survival change data would be a reasonably accurate estimate of the
adjustment based on actual survival experience since one is interested in the growth
rate in survival, not in actual levels.
We believe that the use of longer term survival data is a promising way forward which
will become feasible in the medium term (Lakhani et al., 2005). It will be especially
promising if it is coupled with a programme to measure the health status of samples of
NHS patients before and after treatment (see section 6; Appendix C).
4.8
Quality adjustment with short term survival
4.8.1 Simple survival adjustment
In the absence of longer term survival data we now consider what can be done to
quality adjust the output index using the data which is currently available.
In the absence of information on longer term survival σ *jt ( s ) we cannot estimate q*jt ,
the change in the discounted QALYs associated with treatment conditional on
survival ( q*jt ). If we assume that q*jt = q*jt +1 does not change over time ( g q* = 0), so
jt
that the only reason why qjt changes over time is that the post operative survival rate
changes, the health effect adjustment becomes
q jt +1
q jt
=
a jt +1q*jt − qojt
a jt q − q
*
jt
o
jt
=
a jt +1 − k jt
a jt − k jt
(39)
where k jt = qojt / q*jt . Clearly increases in ajt+1 other things equal lead to a higher
54
quality adjustment factor.
For activities with the same ajt and kjt, the larger the
survival in period t+1 the greater the health adjustment factor.
Comparisons of quality adjustment factors across activities with different k jt = qojt / q*jt
require a little more thought. Remember that we are interested in the effect on the
quality adjustment factor qjt+1/qjt which is the ratio of health effects, not in the effect
of kjt on the level of health effects.
Suppose for definiteness that survival has
increased, so that the quality adjustment factor has increased. Differentiating (39) with
respect to kjt gives
∂ ( q jt +1 / q jt )
∂k jt
=
a jt +1 − a jt
(
a jt − k jt
)
(40)
2
Thus outputs with a larger kjt have a larger health adjustment. Higher k jt = qojt / q*jt can
arise from a smaller q*jt for the same qojt or a larger qojt
for the same q*jt . The
marginal effects of q*jt and qojt are
∂ ( q jt +1 / q jt )
∂q*jt
∂ ( q jt +1 / q jt )
∂qojt
=
=
−( a jt +1 − a jt ) qojt
( a jt − k jt )
2
( a jt +1 − a jt ) q*jt
(
a jt − k jt
)
2
(41)
(42)
Thus, other things equal, activities with larger q*jt have smaller health quality
adjustment factors. This might appear paradoxical for two reasons. First, the larger is
q*jt the greater the health effect in each period so that the health adjustment factor is
smaller for more beneficial activities. Second, the larger is q*jt the greater is the
absolute increase ( a jt +1 − a jt ) q*jt in the health effect between the two periods. But both
“paradoxes” disappear when we remember that what we are interested in is the health
adjustment factor which is the ratio of the health effect in the two periods. An
increase in q*jt increases both numerator health effect qjt+1 and denominator health
effect qjt but has a smaller proportionate effect on qjt+1 than on qjt and so the ratio
55
qjt+1/qjt gets smaller.
Even with the assumptions that q*jt = q*jt +1 is constant over short time intervals we
cannot measure (39) directly unless we know the magnitudes of q*jt , qojt in order to
calculate kjt = qojt / q*jt . In future it may be possible to estimate q*jt , qojt using new data
on longer term survival and on health status from surveys of patients before and after
treatment and from the results of evaluations of different types of treatment. But for
the moment, for the vast majority of activities we have no data on q*jt , qojt , though we
do have information on survival ajt. We can therefore calculate the survival adjusted
cost weighted output index
∑
I ctxa =
j
⎛a ⎞
c jt x jt +1 ⎜ jt +1 ⎟
⎜ a jt ⎟
⎝
⎠=
∑ j (1 + g xjt )(1 + gajt )ωctjt
c
x
∑ j jt jt
(43)
The effect of this simple survival adjustment5 on the rate of growth of NHS
productivity will not be great since the vast majority of NHS patients survive their
treatment so that survival rate does not change rapidly. Thus, for example, the 30 day
CIPS mortality rate for Phakoemulsion Cataract Extraction with Lens Implant (HRG
B02) fell from 0.0017 in 1999/2000 to 0.0013 in 2002/3, an annual rate of decline in
the mortality probability of 0.56%. The survival rate rose from 99.83 to 98.87, an
annual rate of increase in the survival rate of 0.013%. In terms of Figure 4.1 the
effect of increased survival is shown in the shift upward in the dashed line plotting
health post treatment (1 − m jt +1 )h*j − h oj . The effect is small relative to initial health.
5
An alternative apparently simpler adjustment is to apply the survival rates in each period to scale the
output of that period:
weighted
index
∑
j
∑
j
c jt x jt +1a jt +1 / ∑ j c jt x jt a jt . Unfortunately this index equals the value
x jt +1q jt +1π t / ∑ j x jt q jtπ t
qojt +1 = qojt = 0 , and c jt = θπ t q*jt .
under
the
assumptions
that
q*jt +1 = q*jt ,
The last assumption is perverse. It is not an efficiency
assumption: it requires that decision makers ignore the possibility of that a patient may not survive
treatment when allocating resources across treatments. By contrast, the first two requirements and the
(
efficiency assumption c jt = λtπ t a jt q jt − q jt
*
o
) imply that the survival adjusted cost weighted index
(43) does equal the value weighted index.
56
The difference between what we would like to measure (true health adjustment) and
what we can measure using survival data only is
a jt + 1q *jt + 1 - qojt a jt + 1
a jta jt + 1 (q *jt + 1 - q *jt ) + qojt (a jt + 1 - a jt )
=
a jt q *jt - qojt
a jt
a jt q jt
(44)
In general we cannot say anything about the direction of the bias in using a jt + 1 / a jt
instead of (32). But if survival increases and health conditional on survival increases
then (44) will be positive and the simple survival adjustment will underestimate the
true adjustment. If there is little change in health conditional on surviving treatment,
(44) becomes
a jt + 1q *jt - qojt a jt + 1
qojt (a jt + 1 - a jt )
=
a jt q *jt - qojt
a jt
a jt q jt
(45)
and a jt + 1 / a jt will always have the same sign as (32) and will always be less than it
in absolute value. a jt + 1 / a jt is a conservative estimate of the true health adjustment
(32) if health conditional on surviving treatment is constant or increasing.
Table 4.2 gives some indication of the underestimation of the growth rate of the true
health effect ( ( a jt +1 − k jt ) / ( a jt − k jt ) − 1 ) when it is calculated as the growth rate of
survival ((ajt+1/ajt) – 1) . The example has a rate of survival of 0.97 in the base year
which is approximately the average survival rate of patients. The greater the reduction
in mortality the greater the increase in the survival rate and the greater the growth rate
in the true health effect. Notice that because survival is initially high even quite large
proportionate reductions in the mortality risk have small effects on the survival rate
and on the true growth in the health effect. The true growth in the health effect is
larger the larger is k = qto / qt* . Thus, as we discussed above, the smaller the
proportionate effect of treatment on the discounted sum of QALYs, the larger is the
true growth in the health effect.
In the absence of information on the effect of NHS care on health and hence on the
true growth rate it is impossible to say how large the underestimation of the overall
growth rate of the effect of hospital care on health is. If our central guesstimate of the
average value of k = qto / qt* = 0.8 is correct, then the kind of increases in short term
survival which are perhaps towards the upper end of what is plausible (from 0.970 to
57
0.971 or 0.972 -- corresponding to proportionate reductions in mortality of 3.3% or
6.75%) underestimate the true growth in the health effect by 0.5% to 1%. Of course,
the calculation takes no account of changes in health effects arising from increases in
health conditional on surviving treatment.
If q*jt grows then the simple survival
adjustment would be more of an underestimate. Once again we see the importance of
having improved estimates of health conditional on surviving treatment.
Table 4.2 Error in using survival growth rate as estimate of growth rate in
health effect of treatment
Year t
Survival
Mortality
Mortality % decrease
0.97
0.03
Survival % growth
True health growth
if true k = 0.5
0.8
0.9
Error using survival
growth
if true k = 0.5
0.8
0.9
Year t +1
0.971
0.029
-3.33%
0.972
0.028
-6.67%
0.975
0.025
-16.67%
0.980
0.020
-33.33%
0.985
0.015
-50.00%
0.10%
0.21%
0.52%
1.03%
1.55%
0.21%
0.59%
1.43%
0.43%
1.18%
2.86%
1.06%
2.94%
7.14%
2.13%
5.88%
14.29%
3.19%
8.82%
21.43%
0.11%
0.49%
1.33%
0.22%
0.97%
2.65%
0.55%
2.43%
6.63%
1.10%
4.85%
13.25%
1.65%
7.28%
19.88%
If q *jt is constant ((45) holds) so that holds the difference between the quality adjusted
cost weighted and survival adjusted cost weighted indices is
⎛ q 0jt
I ctxq − I ctxa = ∑ j (1 + g xjt ) g ajt ⎜
⎜ q jt
⎝
⎞ x jt c jt
⎟⎟
⎠ ∑ j ' x j 't c j 't
(46)
Since we do not observe q *jt , qojt we cannot determine the magnitude of the absolute
downward bias in using a jt + 1 / a jt
instead of q jt + 1 / q jt . When there is efficient
allocation (30) for conditions where the alternative to NHS treatment is very poor
health ( qoj is small) or treatment has a large effect so (that q*jt is large relative to qoj ),
the bias is small. But if qjt is very small (the treatment has a small effect on health) the
58
bias is very large. Fortunately, the smaller the health gain from the treatments the
smaller the weight of the treatment in the cost adjusted value weighted index (since
costs are proportional to health gains by assumption (30) and so the downward bias is
bounded. But if (30) does not hold so that unit cost is not proportional to marginal
value then the downward bias when q jt is small may not be offset by having a low
cost weight attached to such outputs.
We also see from (43) that an increase in survival will have a smaller effect on the
index the smaller is the cost weight cjt. When (30) holds this is reasonable. The
increase in the health effect from an increase in survival is proportional to q *jt and
the smaller is cjt the smaller is q jt = a jt q *jt - qojt and the more likely is q *jt to be
small. But if (30) does not hold, the fact that survival gains in low cost activities will
have smaller effects on the index than survival gains in high cost activities is less
appealing.
A further difficulty with the pure survival adjustment is that it takes no account of the
age of the patients treated. A given post operative survival gain has the same effect
on the output index if the treatments have the same cost and volume, even though one
treats a much younger group of patients. Again this is reasonable if unit costs are
proportional to health effects since the cost weight adjusts for the effects of
differences in average age at treatment on health effects. But if we do not believe that
unit costs are proportional to health effects we may want to find another means of
allowing for differences in age across HRGs.
We conclude that adjusting for survival is better than ignoring survival changes. We
consider in the next two sections how it is possible to improve on a pure survival
adjustment by combining a little more information (from a small sample of treatments
where there some information on health effects and from estimates of life expectancy)
with further assumptions.
59
4.8.2 Incorporating estimates of health effects
To proceed further we must, in the current state of information about the health effects
of treatment, replace knowledge with additional assumptions.
We consider the
implications of assuming
(a) health conditional on treatment is constant from one period to the next for all
treatments
q *jt = q *jt + 1
(47)
(b) the ratio of health conditional on surviving treatment to health conditional on no
treatment is constant over time and the same for all treatments
qojt / q *jt = k
(48)
The assumption of efficient allocation (30) by itself does not enable us to claim that a
simple cost weighted index is what we want:
I ytxq = ∑ j
x jt +1q jt +1
≠∑j
x jt q jt
x jt +1c jt
(49)
x jt c jt
unless the quality of care is constant. But assumption (30) is useful since we can
combine it with (47) and (48) to get
∑
I ytxq = I ctxq =
⎛ q jt +1 ⎞
x
⎜⎜
⎟⎟ c jt
jt
+
1
j
⎝ q jt ⎠ =
∑ j x jt c jt
∑
⎛ a jt +1 − k ⎞
x
⎜⎜
⎟⎟ c jt
jt
+
1
j
⎝ a jt − k ⎠
∑ j x jt c jt
(50)
Notice that we cannot make assumptions (30), (47) and (48) and then construct an
index by applying the quality adjustments factors ajt+1 – k, ajt – k separately to the
outputs in each year since
∑x
∑x
jt +1
j
j
jt
[a jt +1 − k ]c jt
[a jt − k ]c jt
∑x
=
∑x
jt +1
j
j
≠
jt
=
∑x
∑x
jt +1
j
j
( q jt +1 / q*j ) q jt
( q jt / q*j ) q jt
∑x
∑x
jt +1
j
j
jt
q jt +1
q jt
jt
[a jt +1 − k ]λtπ t q jt
[a jt − k ]λtπ t q jt
∑x
=
∑x
jt +1
j
j
= I ytxq
jt
q jt +1 ( q jt / q*j )
q jt ( q jt / q*j )
(51)
(52)
Only if q jt / q*j is the same across all treatments would separate application of the
60
quality adjustments factors ajt+1 – k, ajt – k to xjt+1, xjt produce the correct result. Our
assumptions imply that qoj / q*j is constant across treatments, not that q jt / q*j is
constant across treatments. The difference is that qoj / q*j does not involve the survival
rate ajt, whereas q jt / q*j = ( a jt q*j − qoj ) / q*j = a jt − qoj / q*j = a jt − k does. To ensure that
q jt / q*j is equal across all treatments we would have to make the additional (and
patently false) assumption that all ajt are equal which then means that ajt+1 must be the
same across all j, though possibly different from ajt. Hence the separate application
of the quality adjustments factors ajt+1 – k, ajt – k to xjt+1, xjt as in (51) is valid only if
we can apply the same adjustment factors to all treatments which is equivalent to
scaling a simple cost weighted index by (at+1 – k)/ (at – k).
From our review of the EQ5D literature and analysis of data from BUPA and York
NHS Trust (summarised in Appendix C), we have snapshot estimates of health status
before ( hlb ) and after ( hl* ) for a limited set of treatments. (See section 6 where we use
these estimates in a specimen index for the set of treatments to illustrate, inter alia,
the implications calculating a value weighted index rather than a cost weighted index.)
We also use these estimates in section 5 to get very rough estimates of the cost
weighted quality adjusted output index for all activities I ctxq by making assumptions
(a) and (b) above and using h o / h* - the average value of hlb / hl* for our limited
sample of treatments - as an estimate of qoj / q*j = k in (50). We use k = 0.8 as our base
case but consider variants 0.7 and 0.9. Notice that we are estimating a ratio of sums of
discounted QALYs by a ratio of health status snapshots. We discuss the implications
further in section 4.8.3.
Clearly the assumptions that the set of treatments for which we happen to have data
on hlb / hl* are representative of the effects of all NHS treatment is very strong but we
make it to illustrate the importance of having information on the health effects of
treatment.
Calculating the index (50) with k set equal to the mean of the values in the sample of
61
procedures for which there are health outcome data creates a problem with some
activities which appear to have a negative health effect given the assumed value of k
and the observed value of ajt. Some activities have high mortality rates so that the
terms ajt+1 – k, ajt – k in the quality adjustment factor in (50) are close to zero or
negative. Small changes in ajt can then to large changes in the index and if both are
negative the adjustment will indicate negative growth when there has been an
improvement in output in the sense that 0 > ajt+1 – k > ajt – k.
A negative value of ajt – k implies that the activity has a negative social value. This
may be true for some treatments but is clearly incorrect for others, such as terminal
care. In the case of terminal care the problem arises from the factorisation of the
health effects as ajt q*jt − qojt where q*jt is the post treatment health stream conditional
on survival. This is not appropriate for terminal care since all treatment ends in death.
The solution is to reformulate the health gain as total discounted QALYs from start of
treatment minus qojt . Terminal care can then have a positive health outcome: patients
are better off with terminal care than without it. In other cases the problem may be
that our estimate of kj as the mean of our sample of procedures for which there are
outcome data is too large: if we had health data specific to the treatment ajt – kj would
be positive. Finally, it is possible that there are treatments which have a negative
health effect and no other valuable characteristics: they have a negative social value.
These create problems for a cost weighted index because they clearly violate the
underlying assumption that their unit cost measures their social value: unit costs
cannot be negative.
If we had information on health effects and could use health effect weights (as in (28))
then activities with negative or very small social value would not lead to small
changes in ajt having disproportionate effects on the index because their weight in the
index would be negative or very small.
In the absence of such information we have to make ad hoc adjustments to calculate
an index which is not disproportionately sensitive to changes in ajt for activities with
small or negative ajt – k. We adopt a cut off rule: if either ajt+1 – k or ajt – k is less
than a threshold value, say 0.15, we use the pure survival adjustment ajt+1/ajt and
62
otherwise we use (ajt+1 – k)/(ajt – k). We calculate indices with various values of the
cut off in section 5.4.
Table 4.3 shows the magnitude of the error in the calculated growth rate in the health
effect for different size errors in the estimated value of k = qto / qt* . The illustration
assumes that the survival rate is 0.97 which is not far from the average post-operative
survival rate. The assumed proportionate increase in survival and reduction in the
mortality rate are on the large size, as is the true growth rate in the health effect when
the true k exceeds 0.70. Notice again that because survival is high the survival growth
rate (1.03%) is low, and is considerably less than the true growth in the health effect
in the second column.
The third column shows the error in adjusting purely by
survival i.e. by setting k = 0. A pure survival adjustment does worse than assuming a
positive value of k when the true value of k is 0.71 or above: it does worse when the
proportionate effect of treatment is smaller.
Our central estimate of k based on the
small sample of HRGs where there is health effects data is 0.8 and for a true value of
k = 0.81 we see that the pure survival adjustment is worse than setting k at 0.7, or 0.8.
Table 4.3 Effect of error in estimated k = qto / qt* on error in calculated growth
rate in health effect
Year t survival 0.97, mortality 0.03; year t+1 survival 0.98, mortality 0.02; % growth in
survival 1.03%, % growth in mortality -34%.
Estimated k
Estimated growth in health effect
True k
0.0
1.03%
0.7
3.70%
0.8
5.88%
0.9
14.29%
0.91
True growth in health
effect
16.67%
Estimated minus true growth rate in health
effect
-15.64%
-12.96% -10.78%
-2.38%
0.85
8.33%
-7.30%
-4.63%
-2.45%
5.95%
0.81
6.25%
-5.22%
-2.55%
-0.37%
8.04%
0.71
3.85%
-2.82%
-0.14%
2.04%
10.44%
0.51
2.17%
-1.14%
1.53%
3.71%
12.11%
0.31
1.52%
-0.48%
2.19%
4.37%
12.77%
The table suggests that even using the same fairly rough and ready estimate of k for
63
all treatments where survival exceeds k by a reasonably margin (say 0.05 or more)
will be better than just using a pure short term survival adjustment.
4.8.3 Life expectancy and health effects
We now consider how it is possible to use information on the age of treated patients to
modify the survival and health effects adjustments.
Consider a simple example. Let Ljt be the certain remaining length of life of patients
who survive treatment j in year t and of those who are not treated. h *j and h oj are the
levels of health status conditional on treatment and without treatment in all periods
and these are constant over the remaining life of patients and are not affected by the
period of treatment (there is no technological progress). The expected discounted
health gain from treatment j in period t is
L jt
q jt = a jt ∫ h e ds − ∫
0
* − rs
j
L jt
0
⎛ 1 − e − rL jt ⎞
h e ds = ( a h − h ) ⎜
⎟
r
⎝
⎠
o − rs
j
*
jt j
o
j
(53)
The quality adjustment factor to be applied to the between period t and t + 1 is
therefore
q jt +1
q jt
(a
=
(a
h − h oj ) ⎛ 1 − e − rL jt +1 ⎞
⎜
− rL jt ⎟
*
o
jt h j − h j ) ⎝ 1 − e
⎠
*
jt +1 j
(54)
Replacing h oj / h*j with the constant k which we estimate from the mean of our sample
of health effect studies as in the previous section, the quality adjusted cost weighted
output index analogous to (50) but allowing for changing life expectancy due to
changes in the mix of patient types is
∑
I ctxa =
x c
j jt +1 jt
(a − k ) ⎛ 1 − e
( a − k ) ⎜⎝ 1 − e
∑xc
− rL jt +1
jt +1
jt
j
jt
− rL jt
⎞
⎟
⎠
(55)
jt
Notice that in section 4.8.2 we assumed that the mean ratio of snapshot health status
for our small set of specimen HRGs for which such data exists was equal to the ratio
of sums of discounted QALYs over the lifetime of patients ( qoj / q*j ). Here we make
64
the possibly more plausible assumption that the ratio of snapshot health status values
is equal to the ratio of snapshot health status without and with treatment ( hoj / h*j ).
The adjustment rests on the implicit assumption that all patients in t have the same life
expectancy Ljt. Since patients generally differ by age and often by gender it will
matter whether we calculate the life expectancy adjustment using estimates based on
the mean age and gender of patients e
− rL jt
=e
− rEi Lijt
for each age and gender group and then use Ee
and e
− rL jt
− rLijt
or whether we calculate use e
. The difference between Ee
− rLijt
− rLijt
is typically very small, less than 0.5% on average for electives. In section 5
we report results using both approaches to determine how sensitive the indices are to
the use of grouped or individual calculations of the life expectancy adjustment.
We use data on age specific health status from the 1996 Health Survey for England
plus life tables to calculate healthy life expectancy, rather than actual life expectancy
for use in the output indices. See Appendix A.
Note that if life expectancy does not change between periods the life expectancy terms
in (55) cancel out. Thus the index does not reflect cross treatment differences in age
at treatment. The rationale is that we have assumed that costs are proportional to the
marginal value of treatment so that any differences in average age at treatment which
affect life expectancy and health gains are already allowed for. Since we have
suggested that this assumption is not appealing we have investigated using life
expectancy with the survival adjustment and estimated health effects in our specimen
index where we have HRG specific information on the health effects. We report in
section 6.4 the results from estimating
∑ j x jt +1 ( a jt +1h*j − hoj ) (1 − e
− rL
∑ j x jt ( a jt h*j − hoj ) (1 − e
− rL jt +1
)
(56)
jt
)
4.8.4 Cost of death adjustment
The conclusion from sections 4.8.1 to 4.8.3 is that in the absence of much of the
65
required data on the effects of NHS activity on health we have either to use a simple
survival adjustment which will have a small effect or to make strong assumptions,
bolstered by further ad hoc adjustments, in order to incorporate health effects into a
cost weighted index. There is a third possibility to which we now turn which replaces
the assumption of efficient allocation (30) with another assumption about the
relationship between unit costs and the health effects of treatment.
Instead of making the assumption of efficient allocation (30) to use information on
costs cjt to make inferences about the effect of treatment on health suppose we assume
that unit costs are equal to the value of output before making any allowance for death:
c jt = pt (q *jt - qojt ) = p t (q jt + m jt q *jt )
(57)
This implies that health care providers take no account of mortality risk when
determining treatment and only consider the gain in health from successful treatment.
Then we can use the assumption to estimate the value of the true health effects, which
allow for mortality risk,
pt q jt = pt [(1 - m jt )q *jt - qojt ] = c jt - m jt p t q *jt
(58)
and the value of the q jt + 1 at period t value of health is
p t q jt + 1 =
q *jt + 1 - qojt + 1
c jt - p t m jt + 1q *jt + 1
q *jt - qojt
(59)
This gives an estimate of the value weighted index as
⎧⎪ q*jt +1 − q 0jt +1
⎫⎪
∑ j ⎨ q* − q 0 c jt − m jt +1π t q*jt +1 ⎬ x jt +1
jt
⎩⎪ jt
⎭⎪
I xγ =
*
∑ j c jt − m jtπ t q jt x jt
{
(60)
}
q*jt + 1 - q jt0 + 1
q*jt - q jt0
If nothing is known about
we can set
q*jt + 1 - q jt0 + 1
q*jt - q jt0
= 1 . This is equivalent to
assuming that qojt = k jq *jt for all j and t and q *jt = q *jt + 1 . The index then simplifies
to
γ
Ix
∑ {c − m π q } x
=
∑ {c − m π q } x
jt
j
j
jt
jt +1 t
*
jt +1
jt
*
jt
t
jt +1
(61)
jt
We can interpret pt q *jt as the cost of death for a patient who would have survived
treatment. Thus the index in (61) is a cost weighted output index in which we deduct a
66
cost of death from the unit cost of activity. We can use pt = £30000, a value which
is generally believed consistent with the approach used by NICE in order to calculate
I Xg . However, we still need an estimate of the discounted sum of QALYs obtained
by the average individual who receives treatment j in periods t
and t+1. One
possibility is to assume that q *jt is proportional to the average expected discounted
sum of QALYs for people with the same age and gender distribution as those treated
( qˆ jt ). We can use general population estimates of QALYs in from the Health Survey
for England, combined with appropriate life tables. However, we are still left with
problem of estimating the proportionality factors fjt = q *jt / qˆ jt . We would expect the
factor for patients receiving cataracts to differ from the factor for those undergoing
heart surgery.
The only source of information from which we can infer patient health post treatment
is the death rate. One apparently simple and appealing possibility is to employ a
proportionality factor f jt (m jt ) = 1 − m jt with the aim of ensuring that we have an
index which attaches a low cost of each death to death in those treatments which have
a high mortality rate. With this adjustment we obtain the index
δ
Ix
∑ {c − m π f ( m ) q$ } x
=
∑ {c − m π f ( m ) q$ } x
jt +1 t
jt
j
jt
j
jt
jt +1
t
jt +1
jt
jt
∑ {c − m π (1 − m ) q$
=
∑ {c − m π (1 − m ) q$
j
jt +1 t
jt
j
jt
jt
jt +1
t
jt +1
jt
jt
jt
jt +1
jt
jt +1
}x
}x
jt +1
(62)
jt
where qˆ jt is estimated from life tables, HSE QALY estimates and the age-gender
distribution of patients getting treatment j. Since very few mortality rates exceed 0.5
we can avoid the difficulty that the cost of death is decreasing with the mortality rate
when it exceeds 0.5 by setting fjt = 0 when mjt > 0.5.
We experimented with calculations of the index based on a value of £30,000 per
QALY. We found that the simple proportionality factor fjt = 1- mjt point to the
hospital service as a whole subtracting output. There is no practical resolution to this
bizarre result except to use a scaling factor which several orders of magnitude smaller
than 1-m. This would be entirely arbitrary. In addition, the underlying assumption
67
(57) about the relationship between unit costs and health effects on which the cost of
death adjustment rests is less acceptable than the efficient allocation assumption made
elsewhere in our this report. We do not recommend this approach.
4.8.5 In-hospital versus 30 day mortality
Hospital Episode Statistics have a field indicating whether the patient was dead or
alive on discharge from hospital. The data are available from 1988/89 onwards. It is
also possible, though it requires considerable processing, to match HES records with
ONS mortality records to count mortality within any required period after admission
to hospital. HES has recently introduced a field recording the date of death if the date
was between the start of the HES year (1 April) and 30 April of the following year (30
days after the end of the HES year). Thus is it possible to count deaths in hospital
plus those within 30 days of discharge. We assign deaths to HRGs using the HRG of
the first episode of the spell where a spell consists of more than one episode.
There are three obvious counts of deaths: in hospital deaths, deaths within 30 days of
admission, in hospital deaths plus deaths within 30 days of discharge. Measuring
deaths within 30 days of discharge will represent a longer follow up than deaths
within 30 days of admission.
Some deaths (e.g. road accidents) outside hospital will have nothing to do with the
quality of NHS care. Moreover, the matching of HES to ONS is not perfect: around
10% of spells with a discharged dead code according to HES do not have a matching
ONS death record (see Appendix B).
This suggests that some patients discharged alive but dying within 30 days may not
have a matching ONS death record. Hence using in hospital deaths from HES plus
deaths within 30 days after discharge from ONS will understate the 30 day post
discharge mortality rate. On the other hand counting only deaths in hospital runs the
risk of missing deaths which occur outside hospital which are capable of being
affected by the quality of care.
68
The correlations between in-hospital and 30 day post discharge survival rates for all
HRGs in 2002/03 are 0.985 for electives and 0.991 for non-electives. (The survival
rates are based on CIPS since this is our preferred unit of output for the NHS hospital
sector.) The correlations between the growth rates of the two measures of survival
rates (in-hospital and 30 day), taken between 2001/02 and 2002/03 are 0.944 for
electives and 0.994 for non-electives.
Figures 4.2 and 4.3 plot the growth rates of in-hospital and 30-day survival rates
between 2001/02 and 2002/03 for elective HRGs and for non-elective HRGs. These
growth rates have been Winsorised at the 5th and 95th percentile to remove outlier
observations.
-.02
Winsorised growth in in-hospital survival rates
-.01
0
.01
.02
Figure 4.2 Growth rates for in-hospital and 30 day CIPS based survival rates,
2001/02-2002/03, electives
-.02
-.01
0
.01
Winsorised growth in 30 day survival rates
.02
69
-.02
Winsorised growth in in-hospital survival rates
-.01
0
.01
.02
Figure 4.3 Growth rates for in-hospital and 30 day survival rates, 2001/022002/03, non-electives
-.02
-.01
0
.01
Winsorised growth in 30 day survival rates
.02
Some HRGs have small numbers of cases (see Appendix B) so that their death rates
are subject to large random fluctuations. It would be possible to allow for small
number randomness with various shrinkage estimators but we decided not to do so.
Precisely because such HRGs have small amounts of activity they will have little
influence on the survival adjusted indices as they will account for a tiny proportion of
activity.
We have estimated indices with both types of death rate in section 5. We feel that at
the moment the choice between possibly more accurately recorded in hospital deaths
and the possibly more useful but less well measured 30 day deaths is finely balanced
but have a mild preference for 30 day mortality. The data on 30 day deaths should
continue to improve. Moreover, there is some evidence that publication of in hospital
mortality rates in US led to reductions in reported in-hospital mortality for some
conditions but an increases in reported 30 day deaths (Baker et al., 2002). Use of
mortality data to quality adjust an output series does not necessitate its use as a
performance indicator but it seems more prudent to use a mortality measure which is
70
less susceptible to manipulation.
We recommend that the DH continue to encourage the refinement of the record
linkage and that the short term survival adjustment be based on 30 day deaths.
4.8.6 Conclusions: survival based quality adjustment
In this sub section we have
•
constructed a set of quality adjustments to the cost weighted output index
which attempt to allow for the changing health effects of treatment.
•
shown how to use estimates on long term survival (say up to five years) when
these become available (section 4.7).
•
shown how to use existing measures of short term survival (section 4.8.1).
•
demonstrated that the pure survival adjustment will almost certainly
underestimate the true growth in the effect of treatment on health,
•
shown how guesstimates of the proportional effect of treatment compared to
no treatment (section 4.8.2) and data on life expectancy (section 4.8.3) can be
incorporated into survival adjustment.
•
shown that assuming that the unit costs are proportional to health effects when
mortality risk is ignored, rather than making the standard assumption that
allocation is efficient and unit costs proportional to health leads to an
adjustment which takes the form of a deduction of a cost of death from the
output valued using the unit costs (section 4.8.4).
We defer judgement on recommending one of these adjustments until we have
considered the results from calculating them on actual data, either on the whole of the
hospital sector (section 5) or for our specimen set of HRGs for which we have some
health effect data (section 6). Section 7 contains calculations based on our preferred
variant. We wish to see if the resulting estimates of output growth are either
implausible or overly sensitive to unverifiable assumptions about the parameters.
However in light of the dubious underlying assumptions about unit costs required to
derive the cost of death adjustment (section 4.8.4) and some preliminary calculations
we do not recommend it and so do not report results using it.
71
Table 4.4 Summary of output indices with survival based adjustments
Quality
adjustments
Long term
survival
Weights
Costs
Form
Rationale
in section
⎞
a jt +1 S ⎛ σ *jt +1 ( s ) ⎞ ⎛
σ *jtδ s
⎜
x
c
∑ j jt +1 jt a ∑ s=1 ⎜⎜ σ * ( s) ⎟⎟ ⎜ 5 σ * ( s)δ s ⎟⎟
jt
⎝ jt
⎠ ⎝ ∑ s =1 jt
⎠
∑ j x jt c jt
4.7
Results
in
section
Not yet
feasible
Assumptions
Comments
*
Efficient allocation. h jt ( s )
constant s = 1,…,5; zero s >5.
h ojt ( s ) = 0 all s.
Survival.
Costs
⎛a ⎞
∑ j c jt x jt +1 ⎜⎜ ajt +1 ⎟⎟
⎝ jt ⎠
∑ j c jt x jt
4.8.1
5.4.1,
6.3
Efficient allocation.
Survival.
Health
effect.
Costs
⎛a −k ⎞
∑ j x jt +1 ⎜⎜ ajt +1− k j ⎟⎟ c jt
j ⎠
⎝ jt
∑ j x jt c jt
4.8.2
5.4.2,
6.4
Efficient allocation. q jt , q jt
*
o
constant. Same proportionate
effect of treatment on QALYs
with and without treatment,
all HRGs ( q j / q j = k ) in
o
*
sec 5, 6. kj varies across j in
section, 7.
Survival.
Health
effect. Life
expectancy.
Costs
∑ j x jt +1c jt
(a − k ) ⎛ 1 − e
( a − k ) ⎜⎝ 1 − e
∑xc
− rL jt +1
jt +1
jt
j
jt
jt
− rL jt
⎞
⎟
⎠
Survival increases in more
costly HRGs have bigger
impact.
4.8.3
5.4.3
*
o
Efficient allocation. h jt , h jt
constant; same proportionate
effect of treatment on health
status all HRGs in sec 5, 6. kj
varies across j in section, 7.
Certain life; same for treated
and untreated; same for all
patients in HRG
Survival growth has larger
effect if HRG more costly.
Underestimates true growth.
Impact of survival unaffected
by age of those treated.
Quality growth has larger
effect if HRG more costly.
Quality adjustment unaffected
by age of those treated.
Requires adjustment set to
ajt+1/ajt survival rate low
(close to k) to avoid instability
in index
Quality growth has larger
effect if HRG more costly.
Requires adjustment set to
ajt+1/ajt survival rate low
(close to k) to avoid instability
in index
72
Survival.
Life
expectancy.
Life
expectancy
∑ x (a
∑ x (a
jt +1
j
j
Cost of
death
Costs
jt
(
h − h oj ) 1 − e
*
jt +1 j
jt
(
h*j − h oj ) 1 − e
− rL jt +1
− rL jt
)
)
∑ {c − m π f ( m ) q$ } x
∑ {c − m π f ( m ) q$ } x
j
j
cjt unit cost, volume xjt volume; ajt, mlt
jt +1 t
jt
jt
jt
jt +1
t
jt
jt +1
jt
jt +1
jt
4.8.3
4.8.5
jt +1
6.4
Not
reported
h*jt , h ojt constant. Certain
life; same for treated and
untreated; same for all
patients in HRG
Cost proportional to health
effect ignoring mortality risk
Requires proportionality
factor fjt to be very small to
avoid negative output
jt
proportion patients alive, dead on discharge (or after 30 days); h*j constant health status conditional on surviving treatment. h oj
constant health status if not treated. σ t* ( s ) probability of surviving s years; k estimate of proportionate effect of treatment on quality adjusted life years (QALYs); Ljt life
expectancy at mean age of patients treated; r discount rate on QALYs; πt value of QALY (£s); qˆ jt QALYs lost by death of average patient; fjt(mjt) proportionality factor
measuring seriousness of HRG.
73
4.9
Readmissions
4.9.1 Readmissions as health effects
A non-trivial proportion of patients are readmitted to hospital within 28 days of
discharge as emergencies. Figure 4.4 shows a rise in these readmissions over time,
with the data being described in more detail in the Appendix. Some of these
readmissions will reflect poor quality care, in that with proper care the readmission
would not have occurred (Hofer and Hayward, 1995; Ludke et al, 1993; Thomas,
1996). However, it is not possible from these data to distinguish whether some failure
of treatment occurred during the first hospital stay because “readmissions” are defined
as any emergency admission at all within 28 days of discharge, whether or not they
were related to the earlier admission.
Since 2001 readmission rates have been used by the Department of Health, CHI and
the Healthcare Commission, as performance indicators for NHS Trusts.
In this
section we discuss whether and how readmissions should be used to quality adjust a
cost weighted output index to reflect poor quality care on first admission. Since
readmissions may also be a distressing experience for patients per se, irrespective of
the consequences of poor care at first admission for their health, we also consider in
section 4.9.2 how to adjust for readmissions as an aspect of the patient experience.
74
Figure 4.4 Emergency readmission to hospital within 28 days of discharge, by
age band
14
12
10
8
Age 75+
Age 16-74
Age 0-15
Readmission rate percent
6
4
2
0
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
Source: Department of Health
For some procedures a proportion of patients will be readmitted because of
complications arising from the treatment. Let x1jt be the number of first admissions
and x 2jt be the number of readmission, so that the readmission rate is Rjt = x 2jt / x1jt .
The readmissions may be in a different HRG but this is allowed for in the notation
since x 2jt is the number of readmissions to whatever is the appropriate HRG for
readmissions for unsuccessful treatments from the initial HRG j.
If the first admission is successful the health gain to the patient is q1*jt − q1ojt (assume
for the moment that there is no mortality risk for admissions or readmissions). If the
operation is unsuccessful (probability Fjt) and they are not readmitted the effect of the
first admission for this unsuccessful group q 2jto − q1jto . But if they are readmitted they
2o
get a health gain from readmission, compared to having no readmission, of q 2*
jt − q jt .
We assume that the interval between admission and readmission is short enough to
ignore any effect of an unsuccessful first admission on health for the time between
admissions or that health change after an unsuccessful first admission is zero.
75
The value of total health gain is
∑π
j
t
2o
⎡⎣ x1jt ( q1*jt − q1jto )(1 − F jt ) + x1jt F jt ( q 2jto − q1jto ) + x 2jt ( q 2*
jt − q jt ) ⎤
⎦
(63)
If the NHS maximises the value of health subject to the constraints that total cost does
not exceed the NHS budget and that F jt x1jt − x 2jt ≥ 0 , then
c1jt = λtπ t ⎡⎣(1 − F jt ) ( q1*jt − q1jto ) + F jt ( q 2jto − q1jto ) + F jt µ jt ⎤⎦
(64)
2o
⎤
c 2jt = λtπ t ⎡⎣( q 2*
jt − q jt ) − µ jt ⎦
1o
2
⎤
= λtπ t ⎡⎣(1 − Rt ) ( q1*jt − q1jto ) + Rt ( q 2*
jt − q jt ) ⎦ − Rt c jt
(65)
where µ jt is the Lagrange multiplier on F jt x1jt − x 2jt ≥ 0 and λt is the reciprocal of the
multiplier on the budget constraint. Consider solutions in which all those for whom
the first treatment is not successful are readmitted Fjt = Rjt so that the constraint binds
and
µ jt > 0. (This makes no essential difference to the conclusions.) We can
substitute for µ jt in (64) to get
1o
2
⎤
c1jt = λtπ t ⎡⎣(1 − Rt ) ( q1*jt − q1jto ) + Rt ( q 2*
jt − q jt ) ⎦ − Rt c jt
(66)
In allocating resources to first admissions allowance must be made for the fact that
with probability Fjt = Rjt the patient will have an unsuccessful first admission and then
1o
2
be readmitted, getting a health gain q 2*
jt − q jt but generating additional costs c jt . The
health gain when Fjt = Rjt is
∑π
j
t
1o
⎡⎣ x1jt ( q1*jt − q1jto )(1 − R jt ) + x 2jt ( q 2*
jt − q jt ) ⎤
⎦
(67)
Current practice is to calculate a cost weighted outcome index (CWOI) which
includes both first and subsequent admissions
∑ ⎡⎣ x
∑ ⎣⎡ x
c + x 2jt +1c 2jt ⎤⎦
1
1
jt +1 jt
j
J
c + x 2jt c 2jt ⎦⎤
1 1
jt jt
(68)
If there is efficient allocation ((65) and (66) hold) and if there is no change in health
effects or in the readmission rate then the CWOI (68) equals the value weighted
output index
76
∑ ⎡⎣ x
∑ ⎡⎣ x
c + x 2jt +1c 2jt ⎤⎦
1
1
jt +1 jt
j
J
c + x 2jt c 2jt ⎤⎦
1 1
jt jt
=
∑ π ⎡⎣ x
∑π
j
1
jt +1
t
j
t
1o
( q1*jt +1 − q1jto+1 )(1 − R jt +1 ) + x 2jt +1 ( q 2*
jt +1 − q jt +1 ) ⎤
⎦
1o
⎡⎣ x1jt ( q1*jt − q1jto )(1 − R jt ) + x 2jt ( q 2*
jt − q jt ) ⎤
⎦
(69)
To show this use (65) and (66) and write the denominator in (68) is
∑
j
{
}
λtπ t x1jt ⎡⎣ (1 − R jt ) ( q1*jt − q1jto ) + R jt µ jt ⎤⎦ + x 2jt ⎡⎣( q 2*jt − q1jto ) − µ jt ⎤⎦
{
}
1o
= ∑ j λtπ t x1jt ⎡⎣ (1 − R jt ) ( q1*jt − q1jto )⎤⎦ + x 2jt ( q 2*
jt − q jt )
(70)
which is proportional to the value of period t health output at the period t price of
health.
Each term in the denominator of (68) can be written as
{
}
1o
⎤
x1jt +1 ( c1jt + R jt +1c 2jt ) = x1jt +1λtπ t ⎡⎣(1 − R jt ) ( q1*jt − q1jto ) + R jt µ jt ⎤⎦ + R jt +1 ⎡⎣( q 2*
jt − q jt ) − µ jt ⎦
which, with the assumption of a constant readmission rate and constant health effects,
is
λtπ t ⎡⎣ x1jt +1 (1 − R jt +1 ) ( q1*jt +1 − q1jto+1 ) + x 2jt +1 ( q 2*jt +1 − q1jto+1 )⎤⎦
(71)
Hence the numerator in (68) is λt time the value of health output in period t +1 at
period t price of health.
Thus no adjustment need be made to the cost weighted output index for readmissions
if there is efficient allocation and if there are no change in health effects or
readmission rates. This is just a particular example of the result we noted in section
2.7. The conclusion holds if we relax the assumption that the time period between
first and second admission is very short so that the change in health over this time
interval can be ignored. It also holds if we allow for a non-zero mortality risk. What
matters is the assumption, underlying the use of the cost weighted output index, that
allocation of resources is efficient.
Notice that the conclusion does not mean that readmissions do not affect the quality of
care, nor does it mean that performance indicators for Trusts based on readmission
rates do measure anything of interest. Increases in readmission rates, ceteris paribus,
reduce welfare. But on the assumptions made above readmissions are already allowed
for in the simple cost weighted index.
If health effects change but readmission rates do not, then the quality adjustment of
the index raises exactly the same issues as discussed in earlier sections. Suppose that
77
1o
readmission rates change but that the health effects q1*jt − q1ojt , q 2*
jt − q jt do not, and that
the assumption of efficient allocation holds. Is there an adjustment to the CWOI
which ensures that it also measures the change in the value of health effects?
We require adjustment factors φ which ensure that
φ1 (⋅)c1jt + φ2 (⋅) R jt +1c 2jt = λtπ t ⎡⎣( q1*jt +1 − q1jto+1 ) (1 − R jt +1 ) + ( q 2*jt +1 − q1jto+1 ) R jt +1 ⎤⎦ (72)
where the adjustment factors can depend only on variables we know (or assume). In
general the adjustment factors will depend on the health effects and, if we are
unwilling to make any assumptions about them, there is no set of adjustment factors
depending on cost and readmission rates which will satisfy (72). But if we are willing
1o
to make assumptions about q1*jt − q1ojt , q 2*
jt − q jt then we can make a little progress.
Suppose that:
(q
2*
jt
− q1jto ) = kˆ j ( q1*jt − q1jto )
(q
2*
jt +1
− q1jto+1 ) = kˆ j ( q1*jt +1 − q1jto+1 )
(73)
then by substituting (65) and (66) in the left hand side of (72) we can show that the
required adjustment factors are
φ1 =
R
(1 − R jt +1 ) + kˆ j R jt +1
, φ2 = φ1 jt
R jt +1
(1 − R ) + kˆ R
jt
j
(74)
jt
Thus for example if kˆ j = 1 so that readmission treatment has the same health effect as
the successful first treatment
∑
⎡ 1 1
R jt 2 ⎤
2
x
c
x
c jt ⎥
+
⎢
jt +1 jt
jt +1
j
R jt +1 ⎦
⎣
∑ J ⎡⎣ x1jt c1jt + x 2jt c 2jt ⎤⎦
∑
=
j
1o
⎡ x1jt +1 (1 − R jt +1 ) ( q1*jt +1 − q1jto+1 ) + x 2jt +1 ( q 2*
⎤
jt +1 − q jt +1 ) ⎦
⎣
∑ J ⎡⎣ x1jt (1 − R jt ) ( q1*jt − q1jto ) + x 2jt ( q2*jt − q1jto )⎤⎦
(75)
and the CWOI with readmission costs scaled by Rjt/Rjt+1 is identical to the value
weighted health effect index. The intuition is that if the health gain from readmission
is the same as the health gain from a successful first admission then the only effect of
a change in the readmission rate is via the increase in the total cost of readmissions. If
Rjt+1 < Rjt then output, other things equal, must have increased: the cost saving on
readmissions can be used to produce more health from a given budget.
78
If we were to take account of readmission rates in our comprehensive index, we
would require readmission rates Rjt for HRG j in period t. This data is not routinely
generated and at present is not readily usable in a productivity index in the way
required for the adjustment in (75)
First, since HRGs are assigned at episode level and having linked episodes into spells
(CIPS), a spell may have more than one HRG within it. A readmission may or may
not be related to any of the HRGs which occur in the index admission. There is as yet
no agreed method for assigning an HRG to a CIPS where there are possibly multiple
episodes and multiple diagnoses. In the construction of CIPS in this report, we have
assigned the HRG to the first episode of care in the admission, but as mentioned, a
readmission may not be related to the first HRG, or any others in the CIP spell. A
spell-based HRG assignment methodology has been developed for Payment by
Results which looks for the 'dominant' procedure / diagnosis / HRG across episodes,
but this is based on a provider spell rather than CIPS and again, the readmission may
not be linked to the 'dominant' HRG.
Second, it is impossible, with available data, to distinguish between readmissions due
to poor treatment (the element we wish to capture in the index) and readmissions due
to patients being generally sick and prone to getting ill. One way of trying to make a
judgement on this may be to examine whether a readmission is linked to a previous
discharge, but this will be extremely complex and would need to be considered for
each HRG. Indeed, the readmission could be related to any of the conditions in the
spell, not necessarily the HRG coded in the CIP spell. It is possible for a readmission
to be due to neglect or poor care and yet apparently entirely unrelated to the set of
HRGs in the index admission.
We conclude that the data do not yet support the kind of adjustment to reflect the
health effects of treatment considered in this section. A more fundamental objection
may be that the method rests on the standard assumption that unit costs are
proportional to the value of treatments. In the case of readmissions this assumption is
more than usually questionable. In the next section we sketch a more ad hoc, less
theoretically grounded method which might command more support, though it also
subject to the same data problems.
79
4.9.2 Readmissions (and clinical errors and MRSA) as a deadweight loss
An alternative view of readmissions, which can also be applied to clinical errors and
MRSA, can in principle be constructed within the confines of the cost weighted index,
if the extra costs associated with these can be identified. The general principle of the
cost weighted index is that costs are proportional to benefits. If follows from this that
if there are some components of cost which are unrelated to benefits, then these
should be omitted from the index. In other words if the costs of treating MRSA,
directly associated with readmission as a consequence of plainly premature discharge
or clinical errors could be identified these should be omitted from the qualityaugmented cost weighted index.
The HRG costs include those elements of cost which are unrelated to benefit but
which instead arise from poor provision of medical services. This means that in order
to calculate the change in the value of output at constant prices we should deduct from
both the numerator and denominator the costs which represent money wasted. If one
wanted to represent not only the money wasted but also the disutility arising from
such activity, then these deductions should be augmented.
In this exercise there is a risk of double-counting. If either MRSA or premature
discharge affects mortality rates, then this is already taken into account. In the quality
adjusted index. Thus the index set out here can be defended only if one is sure that
this is not the case.
If we take an output, we then deduct from the value of output in constant prices the
expenditures on bads arising from poor provision. We assume that bad j has a cost c bjt
associated with it, and that there are x bjt examples of bad j. The index then becomes,
in the example of the survival and life expectancy adjusted index
∑
x c
j jt +1 jt
(a − k ) ⎛ 1 − e
( a − k ) ⎜⎝ 1 − e
∑ x c −∑x
− rL jt +1
jt +1
− rL jt
jt
j
jt
jt
b b
jt jt
c
⎞
b
b
⎟ − ∑ x jt +1c jt
⎠ j
(76)
j
There is an obvious similarity with the cost of death index of section 4.8.4, but with
80
the important difference that the cost of death has to be inferred as best one can from
information about the value put on life, while the costs associated with poor service
provision can in principle be identified through cost accounting.
We do not recommend this adjustment given the poor state of the data on
readmissions, and the costs of associated with readmissions and MRSA. We do
illustrate the effect of this type of adjustment in section 5 using data on readmissions
and MRSA. We could find no usable data on clinical errors but in principle they
could be incorporated in the same way.
4.10 Waiting times
Waits for diagnostic tests and treatment may affect individuals in two ways. First,
they may dislike waiting per se irrespective of the effect of treatment on discounted
sum of their quality adjusted life year (qjt). Thus waiting time is regarded as a separate
characteristic of health care, distinct from its effect on health. Second, longer waits
can reduce the health gain from treatment and the waiting adjustment is akin to a
scaling factor multiplying the health effect.
Although currently the data do not exist to provide satisfactory estimates of the first
type of effect of waiting, section 4.10.1 sets out how such data can be used when it
becomes available. We then turn in subsequent sections to consider how estimates of
the second type of effect can be made with current data.
The total wait is the sum of the wait for a first outpatient appointment after referral
from a GP, plus possible further waits for subsequent outpatient appointments for
results of tests, plus the wait from the date the patient is placed on the waiting list for
inpatient treatment. We ignore these distinct components of the total waiting time
because currently data do not permit tracking of individual patients in order to
calculate their total time waited. This limitation is likely to be rectified in future with
data being collected in order to monitor achievement of the 18 week target for total
waiting times. But given current limitations, in our empirical analysis we assess
waiting time after a patient has been placed on the list for an inpatient admission. In
81
section 4.10.5 how to include an outpatient waiting time adjustment.
4.10.1 Waiting time as a characteristic
The first way to quality adjust the output index to reflect changing waiting times is to
use direct monetary valuation of reductions in waiting times. Let πwjt be the value of a
reduction of one day in waiting time for treatment j in year t. The value of the output
of j in year t is the sum of the values of its characteristics: health gain (qhjt) and
waiting time (qwjt = wjt) :
y jt = x jt p jt = x jt [π ht qhjt + πˆ wjt qwjt ] = x jt [π ht qhjt − π wjt w jt ]
(77)
Notice that we assume that the value of a day’s wait depends on the treatment waited
for so that (77) differs from (10). The value weighted index (12) becomes
∑ x ⎡⎣π q
∑ x ⎡⎣π q
j
jt +1
j
jt
ht hjt +1
ht hjt
− π wjt w jt +1 ⎤⎦
− π wjt w jt ⎤⎦
(78)
We report in section 6 calculations of (78) based on estimates of the health effect for a
small subset of HRGs in a specimen index.
If we make the assumption of efficient allocation we can derive the quality adjusted
cost weighted index where we take account of both health and waiting time as
characteristics is a special case of (21)
I ctxq = ∑ j
x jt +1 ⎡⎣π ht q jt +1 − π wjt w jt +1 ⎤⎦ c jt x jt
x jt ⎡⎣π ht q jt − π wjt w jt ⎤⎦ ∑ j c jt x jt
⎧⎪ x q
⎫⎪ c jt x jt
π ht q jt
x w
π wjt w jt
= ∑ j ⎨ jt +1 jt +1
− jt +1 jt +1
⎬
⎪⎩ x jt q jt ⎡⎣π ht q jt − π wjt w jt ⎤⎦ x jt w jt ⎡⎣π ht q jt − π wjt w jt ⎤⎦ ⎪⎭ ∑ j c jt x jt
⎡
π q
π w ⎤
= ∑ j (1 + g xjt ) ⎢ (1 + g qhjt ) ht hjt − (1 + g qwjt ) wjt jt ⎥ ω ctjt
p jt
p jt ⎥⎦
⎢⎣
(79)
This index requires data on the amount of health gain qjt for each treatment in order to
calculate the value share π ht qhjt / p jt due to health. We do not have such data for all
HRGs and the survival adjustments considered in section 4.8 are inadequate since
they used to estimate growth rates not levels of health. Thus with available data we
82
cannot calculate (79). The kind of data on health before and after treatment that we
recommend in section 10 would enable such an index to be calculated.
We have calculated a specimen index for (79) for the small subset of treatments where
we have snapshot estimates of the health effects (see section 6). We require the
monetary value of health ( π ht ) and of waiting times (πwjt) to calculate this index. For
πht we use the monetary value of the QALY implied by the decisions of public bodies,
such as the National Institute of Clinical Excellence and the Department for Transport
(see Devlin and Parkin, 2004; Carthy et al., 1999). We estimate the value of waiting
time from the studies reported in Ryan et al. (2004). Note that, even with πwt of the
order of £10 per day, since waiting times are short compared with the horizon over
which health gains are enjoyed, the effect of the waiting time adjustment may not be
large.
It is possible to calculate (79) without data on the magnitude of the health effects if
we are willing to make further assumptions about the relative importance of health
and waiting time characteristics. Thus if we believe that the health effect of treatment
is say 10 times as important as the waiting time then we can set π ht qhjt / p jt = 10/9
which implies π wt w jt / p jt = 1/9 since the sum of the value shares must be 1. We
have not done so in this report because we can think of no sensible method of
estimating the relative importance of health and waiting times characteristics in the
absence of data on health effects and the relative marginal social values of health and
waiting times.
4.10.2 Waiting time as a scaling factor
Delay may also lead to a reduced health effect qjt from treatment. This can arise
because (a) the condition of the patient deteriorates whilst they wait; (b) the post
treatment level of health status may be reduced; and (c) if treatment is delayed the
time over which the benefits from treatment accrue may be reduced.
Figure 4.5 illustrates the effect of reduction in waiting times on the health effects of
83
treatment. Initially we drop the subscripts for characteristics (since only health gain
matters) and for type of treatment. ho(s) is the without treatment health status at time
s (time is measured from the date at which the patient begins to wait). The waiting
time in year t is wt and the time path of health with the treatment is h*(s;wt). In the
following year the waiting time falls to wt+1 and the health time path is h*(s;wt+1). We
assume that the change in the time path is due solely to the reduced wait for treatment,
not to changes in technology. The dashed lines show the expected time streams given
the treatment mortality probability and allow for a possible effect of reduced wait on
the mortality probability.
In Figure 4.5 the fact that ho(s) is declining means that
earlier treatment prevents a deterioration in health. The fact that h*(s;wt+1) > h*(s;wt)
implies that earlier treatment leads to better post treatment health. The fact that the
individual lives for longer post treatment means that they have longer to enjoy their
improved health.
A recent survey of the literature (Hurst and Siciliani, 2003) found some evidence on
deterioration and premature death associated with waiting for cardiology treatment
but little for other procedures. Clinical reassessment of patients on a waiting list was
thought to contribute to reduction in adverse outcomes of waiting but there is little
data on the frequency or efficiency of re-classification of patients on waiting lists. If
the NHS begins the routine collection of data on health related quality of life, the
QALY improvement due to reduced waiting time should be captured by trend changes
in QALYs.
84
Figure 4.5 Effect of reduced waiting times on time streams of health
h
h*(s;wt)
h*(s;wt+1)
(1-mt+1)h*(s;wt+1)
(1-mt)h*(s;wt)
ho(s)
wt+1
wt
Time since placed on waiting list
There are two possible ways to measure the effect of treatment and hence of the effect
of changes in waiting times. The first is to value treatment at the time the patient is
placed on the waiting list: health effects are therefore discounted to this date. The
second is to value treatment at the date it is actually received: health effects are
discounted to the date of treatment. If we do not wish to quality adjust with respect to
waiting times or if we are dealing with non-electives we do not need to take account
of the difference between the two approaches.
4.10.2.1
Discounting to start of wait
If the patient is placed on the waiting list at time 0 the expected health gain discounted
to that date is
L* ( wt ) + wt
qt = ∫0 t hto ( s )e − rs ds + (1 − mt ) ∫wtt
w
Lo + wt
ht* ( s; wt )e − rs ds − ∫0 t
hto ( s )e − rs ds
(80)
where we subscript time paths to allow for the possibility of technological change
across periods. Lot , L*t are life expectancies without and with treatment measured from
the date of treatment. Thus Lot + wt , L*t + wt are life expectancies measured from the
date placed on the list. r is the discount rate applied to health. Collecting terms
85
L* ( w ) + wt
qt = (1 − mt ) ∫wtt
Lo + wt
ht* ( s; wt )e − rs ds − ∫wtt
hto ( s )e − rs ds = (1 − mt ) qt* ( wt ) − qto ( wt ) (81)
which is similar in form to the expression (26) for the health effect in the discussion
of quality adjustment by mortality.
In general we will be unsuccessful in seeking a simple method of quality adjusting for
changes in survival and in waiting times which is equivalent to the quality adjustment
for survival derived in section 3.1, even with the assumption about unchanging
technologies employed to construct the survival quality adjustment. The reason is that
the survival probability enters the expression for health gain linearly but this is not
true for the waiting time, as inspection of (80) reveals.
Only if further assumptions are made about the shape of the time paths of health status
with and without treatment is it possible to derive a simple quality adjustment
reflecting both survival and waiting times. The non-linearity of the waiting time
adjustment also raises questions about whether the adjustment should be made on the
basis of average waiting times applied to all cases of a particular type or whether we
should use make the adjustment for each case separately and sum over the cases
within each treatment type. We discuss this point further in section 4.10.4.
To get a reasonably simple expression for the waiting time adjustment, suppose the
time paths hto , ht* are constant with respect to elapsed time after the patient is placed
on the waiting list, are unaffected by waiting time, do not change from one year to the
next, and that total life expectancy is unaffected by treatment and waiting time:
Lot + wt = L*t ( wt ) + w . (Remember that Lot , L*t are life expectancies without and with
treatment measured from the date of treatment.)
To simplify notation in this case
where life expectancy from date of treatment is unaffected by treatment we use Lot
= L*t = Lt . Figure 4.6 illustrates.
86
Figure 4.6 Health gain and waiting time – simple case
h
h*(s;wt)
*
*
h (s;wt+1) (1-m)h (s;wt+1)
(1-m)h*(s;wt)
ho(s)
wt+1
wt
Time since placed on list
The health gain from treatment in Figure 4.6
q1jt ( wt )
(
=
(
a jt h*jt
− hojt
= a jt h*jt − hojt
)
e
)∫
w jt + L jt − rs
e ds
w jt
− rw jt
=
(
a jt h*jt
− hojt
() e
− rw jt
−e
− r ( L jt + w jt )
)
r
(1 − e )
− rL jt
(82)
r
Waiting has a cost, in units of health, which increases with the length of the wait but
at a decreasing rate: an extra day after a long wait costs less than an extra delay after a
short wait. The cost of a positive wait measured in terms of QALYs can be defined as
this formula as
⎛ at ht* − hto ⎞
− rw
− rL
⎟ 1− e t 1− e t
r
⎝
⎠
κ t1 ( wt ) = ⎡⎣ q1t (0) − q1t ( wt ) ⎤⎦ = ⎜
(
)(
)
(83)
The discounted health effect (82) is convex in the waiting time: increases in the wait
reduce the quality of care but do so at a decreasing rate. Thus the gain from a
reduction in the waiting time from 40 weeks to 39 weeks is less than the gain from a
reduction from 10 to 9 weeks. It has been suggested (Atkinson, 2005; 119) that the
value of waiting time is more plausibly concave in the waiting time: the value of
reductions is greater the longer the wait.
87
The convexity of the health effect in the waiting time implies that an increase in the
dispersion of waiting times, holding the mean wait constant, increases the average
value of treatment. This seems counter-intuitive.6
The quality adjustment factor is
o
*
qt1+1 ⎡⎣(1 − mt +1 ) − h / h ⎤⎦
=
o
*
qt1
⎣⎡(1 − mt ) − h / h ⎦⎤
⎡ − rwt +1 (1 − e − rLt +1 )⎤
⎣e
⎦
⎡ e − rwt (1 − e − rLt )⎤
⎣
⎦
(84)
and the cost weighted index with a waiting time adjustment is
I
xaw
ct
⎛ a jt +1 − h oj / h*j ⎞ ⎛ e − rwt +1 (1 − e − rLt +1 ) ⎞ jt
⎜ − rw
⎟ ωct
= ∑ j (1 + g xjt ) ⎜
o
* ⎟
− rLt
t
⎜
⎟
⎝ a jt − h j / h j ⎠ ⎝⎜ e (1 − e ) ⎠⎟
(85)
The use of the adjustment factor (84) requires estimates of life expectancy for
patients. If we do not wish to use such estimates because we feel they are unreliable
we can instead use
o
*
qt +1 ⎡⎣(1 − mt +1 ) − h / h ⎤⎦ − r ( wt +1 − wt )
=
e
qt
⎡⎣ (1 − mt ) − ho / h* ⎤⎦
(86)
e − rwt +1 / e − rwt will have the same sign as ⎡⎣ e − rwt +1 (1 − e − rLt +1 )⎤⎦ / ⎡⎣ e − rwt (1 − e − rLt )⎤⎦ . If life
expectancy increases from t to t+1 then the adjustment using only e − rwt +1 / e − rwt will
understate growth and will overstate growth if life expectancy falls. Using only
e − rwt +1 / e − rwt and ignoring life expectancy is equivalent to assuming that life
expectancy does not change.
Given the general lack of information on health with and without treatment we can
use an estimate of the mean value of h oj / h*j (k) from a small sample of treatments (see
6
Although the conclusion that a mean preserving increase in dispersion is welfare increasing is
counter-intuitive, similar conclusions hold in other contexts. For example, a mean preserving spread in
the distribution of prices of a good faced by a consumer who is risk neutral towards income will, if the
consumer observes the price before buying the good, reduce expected utility. It is essential to specify
the decision context carefully and in particular whether uncertainty is resolved before or after decisions
are taken. In the case of consumer price uncertainty, it can be shown that if the consumer decides on
consumption before the price is revealed and if she is averse to income risk then she is made worse off
by a mean preserving increase in the dispersion of prices.
88
section 4.8.2). If we feel that this estimate is too unreliable we can set h oj / h*j = 0 in
the index (85) and apply the adjustment for waiting with a pure survival adjustment.
Notice that we have applied the same discount rate to all types of treatment. The
formulation of the health gain as having the same value whatever the type of
treatment which underpins I ctxaw implies that the same discount rate should be applied
to waits for heart bypass operations [HRG E04] as for varicose vein stripping [HRG
Q11]. The differences in the health gains from the two treatments is already reflected
in the index because it is in the cost weights which we have assumed are identical to
the value weights. If we think that the assumption of proportionality of unit costs and
marginal social values (30) is incorrect we could argue for using a higher discount
rate for heart bypass waits than for varicose vein waits. However, this would be an ad
hoc adjustment to a problem ( c jt ≠ λt p jt ) which is better tackled more directly by
adjusting the weights rather than the discount rates.
Because we treat waiting times as delaying health improvement we have also applied
the same discount rate to waiting times and life expectancy. It is possible that people
may feel differently about future health whilst waiting for treatment and when they
have been treated and apply a discount rate whilst waiting for treatment of rw but a
rate of rL < rw after treatment . The health effect would be
q1'jt = ∫
w jt
0
hojt e − rw s ds + ∫
w jt + L jt
w jt
a jt h*jt e − rL s ds − ∫
w jt + L jt
0
ho e − rL s ds
− rw w jt
−r w
e
⎡
) (1 − e L jt ) ⎤
o (1 − e
o
*
= h jt ⎢
−
⎥ + a jt h jt − h jt
rw
rL
⎢⎣
⎥⎦
(
4.10.2.2
)
− rL w jt
(1 − e )
− rL L jt
rL
(87)
Discounting to date of treatment with charge for waiting
We measure activity when it takes place which suggests that we should measure the
benefit from treatment at the time it takes place. This is what we did in section 4.8.3
when discussing life expectancy and health effects. But with such assumption the
discounted sum of QALYs, is, continuing with the simple assumptions about the time
streams of health used in the previous section and in section 4.8.3, just
89
L jt
L jt
⎛ 1 − e − rL jt ⎞
q jt = a jt ∫ h*j e − rs ds − ∫ h oj e − rs ds = ( a jt h*j − h oj ) ⎜
⎟
0
0
r
⎝
⎠
(53)
To capture the welfare lost as a result of having to wait for treatment we use a charge
for waiting which is offset against (53) which captures only the benefit for the life
span from date of treatment. As with any cumulating debt, interest is charged on the
cost of waiting:
κ 2jt = ( a jt h*jt − h ojt ) ∫ e rs ds =
w jt
0
(a
jt
h*jt − h ojt )
r
(e
rw jt
)
−1
(88)
The effect of treatment after adjustment for waiting time is now
(
)
0
− rw ⎤
⎡ L jt
q 2jt = a jt h*jt − hojt ⎢ ∫ e − rs ds − ∫ e jt ⎥
− w jt
⎣ 0
⎦
( a jt h*jt − h ojt )
=
r
⎡ 2 − e − rL jt − e rw jt ⎤
⎣
⎦
(89)
q 2jt is decreasing and convex in the waiting time so that an increase in an already long
wait has a greater effect than the same increase in a short wait. Moreover an increase
in the dispersion of waits reducing the expected value of treatment and there is a nontrivial cost of waiting even for the very young.
We can also allow for the possibility that the interest charge on waiting time differs
from the interest rate on future QALYs:
(
)
(
)
0
−r w ⎤
⎡ L jt
q 2jt′ = a jt h*jt − hojt ⎢ ∫ e − rL s ds − ∫ e w jt ⎥
−
0
w
jt
⎣
⎦
r w
⎡ 1 − e − rL L jt
e w jt − 1 ⎤
*
o ⎢
⎥
= a jt h jt − h jt
−
⎢
⎥
rL
rw
⎢⎣
⎥⎦
(
) (
)
(90)
A difficulty with (89) is that it is theoretically possible for q 2jt to be very small or even
negative (if e
rw jt
>2−e
− rL jt
) even if a jt h*jt − a ojt is positive. The same possibility arises
with (90). However, this will only be a problem when the waiting time is similar to
life expectancy. Table 4.5 has illustrative calculations for an example in which the
health effects and life expectancy do not change from period t to period t+1illustrates
for examples. Only in the three italicised cells where life expectancy and the waiting
90
times are similar do we observe nonsensical results where a reduction in waiting time
leads to a very large or a negative adjustment. Otherwise the results are in line with
intuition: the effect of a reduction in waiting time is greater at higher waits and for
shorter life expectancy.
For all except the very shortest life expectancies a given percentage reduction in the
waiting time implies a smaller percentage increase in quality adjusted output. The
reason is that the waiting time is in general small relative to the length of time over
which any health improvement will be enjoyed. The reductions are however in
general not trivial. For example, reductions in waiting time from 120 to 90 days
increases the growth rate by between 1.8% for life expectancy of 5 years and 0.35%
for life expectancy of 30 years.
Table 4.5 (a) Waiting time adjustment qjt+1/qjt: discounting to date treated with a
charge for waiting (rw = rL = 1.5%, no change in health effects and life expectancy
between period t and t+1)
Reduction in
wait (days)
10 to 0
30 to 10
60 to 30
90 to 60
120 to 90
150 to 120
180 to 150
365 to 150
0.5
1
1.0582
1.1319
1.2469
1.3283
1.4897
1.9621
27.2659
-0.1686
1.0284
1.0602
1.0995
1.1106
1.1245
1.1424
1.1663
-38.6866
Life expectancy (years)
2
5
10
1.0141
1.0290
1.0456
1.0478
1.0503
1.0530
1.0561
1.6183
1.0057
1.0116
1.0177
1.0180
1.0184
1.0188
1.0191
1.1563
1.0030
1.0060
1.0090
1.0091
1.0092
1.0093
1.0094
1.0719
20
30
1.0016
1.0032
1.0048
1.0048
1.0049
1.0049
1.0049
1.0366
1.0011
1.0023
1.0034
1.0034
1.0035
1.0035
1.0035
1.0257
Table 4.5 (b) Waiting time adjustment qjt+1/qjt: discounting to date treated with a
charge for waiting (rw = 10%, rL = 1.5%, no change in health effects and life
expectancy between period t and t+1)
Reduction in
wait (days)
10 to 0
30 to 10
60 to 30
90 to 60
120 to 90
150 to 120
180 to 150
365 to 150
0.5
1
1.0583
1.1326
1.2503
1.3376
1.5161
2.0850
-10.6470
-0.1420
1.0284
1.0605
1.1006
1.1129
1.1285
1.1488
1.1766
-9.6840
Life expectancy (years)
2
5
10
1.0141
1.0292
1.0461
1.0488
1.0517
1.0550
1.0587
1.6882
1.0057
1.0116
1.0179
1.0184
1.0189
1.0194
1.0199
1.1679
1.0030
1.0060
1.0091
1.0093
1.0094
1.0096
1.0098
1.0768
20
30
1.0016
1.0032
1.0049
1.0049
1.0050
1.0051
1.0051
1.0390
1.0011
1.0023
1.0035
1.0035
1.0036
1.0036
1.0036
1.0274
91
4.10.3 Optimal waiting times
All the waiting time adjustments considered above assume that a reduction in waiting
time is always of value – shorter waiting times are better than longer. It has been
suggested that to us that the assumption may not be valid for very short waiting times
because some patients find treatment at very short notice to be inconvenient. It would
be possible to conduct patient surveys or stated preference experiments to determine
optimal waits and the costs associated with departures from them. But in the absence
of such data we have to fall back on some crude assumptions which nevertheless
enable us to determine how much impact such consideration might have. We have
therefore investigated the implications of replacing the measure of waiting time wjt in
the waiting time adjustments considered in previous sections with wˆ jt = min(wjt –
w*,0). This has the effect of increasing the proportionate effect of a given reduction in
wjt if wjt exceeds w* and reducing it to zero if wjt < w*. Another possibility would have
been to define wˆ jt = (wjt – w*)2 but we felt that the implication that a reduction in wjt
below w* reduced quality adjusted output was likely to prove difficult to justify.
4.10.4 Distribution of waiting times
We need to consider what waiting time measure should be used in the waiting time
quality adjustments. Patients within an HRG do not have the same waiting times.
This raises the question of how we take account of the variability of waiting times.
Variations in waiting times across patients for a given treatment are due to a
combination of prioritization of patients so that different types have different expected
waits and random factors so that patients of given types have uncertain waiting times.
We will not be able to observe all the factors which are used in prioritisation so that
the distributions of realised waiting times conditional on variables, such as age and
gender, which are observable for individual patients in routine data, will overstate the
amount of uncertainty for individual patients.
Suppose that treatment has the effect on health shown in the simple case in Figure 4.6.
Then, assuming that all patients have the same life expectancy, the average health
92
gain for a patient, is discounting to the date of treatment as in section 4.10.2.2,
Eqt2 ( w) = ⎡⎣ at ht* − hto ⎤⎦ ⎡⎣ 2 − e − rLt − Ee rwt ⎤⎦ r −1
(91)
where Ee rw is the average of erw over realised w. Since Eerw < e rEw the waiting time
adjustment factor ⎡⎣ 2 − e − rLt − Ee rwt ⎤⎦ r −1 , using the mean wait to quality adjust the
output for a particular HRG will lead to an overestimate of the mean quality adjusted
output in a particular year. But what matters for the calculation of the output index is
the ratio of the quality adjustment factors for consecutive years and in general
2 − e − rLt +1 − e rEwt +1
2 − e − rLt − e rEwt
(92)
may be less than or greater than
2 − e − rLt +1 − Ee rwt +1
2 − e − rLt − Ee rwt
(93)
There are two possible ways of allowing for the distribution of waiting times in the
waiting time adjustment. We can write Eerw as
∞
Ee rw = ∫0 e rw f ( w)dw = 1 + rEw +
r 2 E ( w ) 2 r 3 E ( w) 3
+
+L
2!
3!
(94)
or as the moment generating function for the distribution of waiting times. If the
distribution is exponential (which it would be a random selection of patients was
drawn from the waiting list each day) then Ewi e rw = µi/(1-r) where 1/µi is the mean
i
waiting time and the variance of waits is 1/ µi2 . Similarly if the distribution is uniform
on [0,bi] then Ewi e rw = ( ebi r − 1 )/bir. Thus if the distribution of waiting times has a
i
tractable distribution and we know its parameters we can allow for the uncertainty in
waiting times without requiring individual level data.
After inspection of the
empirical distributions so we decided not to pursue this approach as there is
considerable disparity in the empirical distributions and few seem to be approximated
by tractable theoretical distributions.
If individuals have uncertain waiting times and have a utility function which is
decreasing and convex in waiting time then taking the mean of the actual waiting
times will understate the costs of waiting. One way to allow for a cost of the risk of
having very long waits is to not to use the mean or median of the distribution of waits
93
but the “certainty equivalent” wait wc : the mean wait plus a waiting time “risk
premium”. The certainty equivalent waiting time is
EU ( w) = u( wc ) = u( E ( w) + RP )
(95)
where RP is risk premium and u(.) is a utility function. 7 We take the specification of
the benefits from treatment when there is a positive waiting time in section 4.10.2.2 as
the utility function. The standard approximation to the risk premium (Pratt 1964) is
RP = − Aσ w2 / 2
(96)
where A = −u′′ / u′ is the coefficient of absolute risk aversion and σ w2 the variance of
waiting times. With the specification in section 4.10.2.2, the coefficient of absolute
risk aversion is − r so that
RP = rσ w2 / 2
(97)
Table 4.6 shows, for different rates of discount, the proportion of elective cases
admitted in 2002/3 in HRGs where the distribution of waiting times implies a
certainty equivalent wait greater than various percentiles of the distribution.
7
Since u is decreasing in the waiting time we have defined the risk premium as an addition to the mean
by contrast to the usual case where u is increasing in a risky variable such as income. Hence the minus
sign in the definition used here.
94
Table 4.6 Number of elective HRGs and proportion of total
in HRGs where the certainty equivalent waiting time wc
percentiles of the distribution of waiting times in 2002/3
Certainty equivalent wait
70th
75th
80th
greater than
r = 0.01
Number HRGs
214
152
152
Prop cases
0.109
0.057
0.057
r = 0.015
Number HRGs
222
156
156
Prop cases
0.138
0.069
0.069
r = 0.03
Number HRGs
241
170
170
Prop cases
0.235
0.153
0.153
r= 0.05
Number HRGs
258
185
185
Prop cases
0.292
0.205
0.205
r = 0.1
Number HRGs
292
211
211
Prop cases
0.314
0.224
0.224
r = 0.15
Number HRGs
320
233
233
Prop cases
0.375
0.241
0.241
r = 0.2
Number HRGs
340
250
250
Prop cases
0.392
0.258
0.258
elective admissions
exceeds particular
85th
90th
95
0.032
32
0.005
97
0.038
32
0.005
107
0.116
34
0.005
112
0.123
38
0.006
127
0.130
42
0.008
144
0.141
52
0.013
154
0.142
56
0.014
The table suggests that using the extreme top end of the distributions as certainty
equivalent waiting times is likely to overstate the cost of risk arising from the
distribution of waiting times. In our calculation of the indices with waiting time
adjustments in section 5 and 6 we compare the effects of using the 80th percentiles as
the certainty equivalent wait compared with the mean wait at rates of interest between
1.5% and 10%
The alternative approach is to use individual level data from HES on the waits
experienced by each elective patient. Given the number of electives this is fairly
cumbersome since each calculation of an index takes around four hours with
individual data. We do however report in section 5 the results of some calculations of
indices to show the sensitivity of indices to the use of individual rather than a
certainty equivalent wait.
95
4.10.5 Outpatient waits
Table 4.7 highlights the mean waiting times in weeks for a select number of key
specialties. In most specialties waiting times have fallen.
Table 4.7 Outpatient waiting times in weeks, selected specialties, 1999/002003/04
Code Specialty
1999/00 2000/01 2001/02 2002/03 2003/04
100
120
140
150
160
170
300
310
340
350
360
370
400
420
430
7.448
11.798
9.789
10.546
12.370
3.865
8.799
13.815
6.580
5.978
2.342
3.535
13.072
6.680
4.975
General Surgery
ENT
Oral Surgery
Neurosurgery
Plastic Surgery
Cardiothoracic Surgery
General Medicine
Audiological Medicine
Respiratory Medicine
Infectious Diseases
Genito-Urinary Medicine
Medical Oncology
Neurology
Paediatrics
Geriatric Medicine
7.398
11.261
9.451
10.781
12.352
4.162
8.877
13.175
6.986
6.491
2.316
3.597
12.814
6.758
5.170
7.495
10.764
9.707
10.812
13.023
4.150
9.066
13.813
7.415
6.583
2.483
3.208
13.378
6.756
5.344
6.787
9.871
9.029
10.019
10.300
4.194
8.099
10.986
6.953
5.232
2.214
3.559
11.324
6.639
5.340
6.377
9.258
8.653
9.251
9.233
4.503
7.392
9.351
6.369
4.949
2.048
3.851
10.038
6.582
5.232
Ideally, because of the non-linearity of the effect of waiting time on the value of
activity, we require information on the total time that each individual waits for
treatment. However, individual HES records of inpatient waiting times are not linked
to individual outpatient data and indeed outpatient data is available only aggregated to
specialty level which cannot be linked satisfactorily to inpatient data aggregated to
HRG level.
If we wanted to discount health effects to date of treatment (as in section 4.10.2) we
require an estimate of e rwlt where wlt is individual l ’s total wait for treatment which
is the sum of her outpatient and inpatient wait: wlOt + wlIt . We do not have such
linked individual level outpatient and inpatient data. If we could group outpatient data
to the same HRGs or specialties as inpatient data we could use the average wait for
O
outpatient treatment for each group to calculate e r ( wt
+ wtI )
for a specialty or HRG. But
this is not feasible. In the future it will be possible to use HES to get total waiting
96
times for individuals across outpatient and inpatient care. But for the moment we
require a means of incorporating a quality adjustment for outpatient waits.
There is data on first outpatient visits, follow up visits, HRG outpatients, and
maternity outpatients but only data on the wait for the first visit.
The data on the
HRG outpatient and maternity outpatients do not have speciality codes that we can
match to the speciality codes for the first and follow up visits. We propose to use the
waiting time adjustments from section 4.10.2.2 but to apply the waiting time for first
visits to both first and follow up visits. It can be argued that a reduced first outpatient
visit wait reduces the delay until the value of the whole course of visits is realised.
Second, less plausibly, it could be argued that, regarding first and follow up visits as
having different values, reductions in waits for first visits also indicate that waits for
follow up visits have fallen and that this increases the value of follow up visits as
well.
4.10.6 Waiting time adjustment: conclusion
In this sub section we have
•
shown how data on waiting times can be used to quality adjust the cost
weighted output index in addition to the survival based adjustments considered
in sections 4.7 and 4.8.
•
regarding waiting time as a characteristic separate from the health
effect of treatment, yielding an adjustment which is additive the health effects
•
regarding waiting time as delaying the health effect of treatment,
yielding adjustments which act as scaling factors on the health effects
•
concluded that calculation of an index based on the value of the health and
waiting time characteristics, as in section 2.4 and section 4.10.1, is not
currently possible because of the lack of health effects data. We have
constructed waiting time quality adjustments which can be applied to all
elective HRGs with existing data and which rest on different assumptions
about whether one should discount to the date the patient is placed on the
waiting list (section 4.10.2.1) or to the date of treatment (4.10.2.2,).
•
suggested that an adjustment based on discounting to the date of treatment are
preferable since they imply that increased dispersion of waiting times would
97
reduce quality adjusted output, whereas discounting to date placed on the list
implies that patients would prefer increased dispersion.
•
suggested that the dispersion of waiting times should be reflected in the
waiting times adjustment by calculating the adjustment using, not the mean or
median wait, but the 80th percentile waiting time which can be interpreted as
an estimate of the “certainty equivalent” wait.
As with the health effects adjustments a final view on the preferred adjustment
depends on the plausibility and lack of volatility of the adjustments when confronted
with the data in sections 5 and 6.
98
Table 4.8 Summary of waiting time quality adjustments
Quality adjustment
Wght
Survival. Uniform
health effect. Life
expectancy.
Discounting to date on
list.
Cost
Survival. Uniform
health effect. Life
expectancy.
Discounting to date
treated, with charge
for wait.
Health and waiting
time characteristics
Form
Rationale
Results
Assumptions
Comments
) ⎞⎟
4.10.2.1
Efficient allocation.
Constant health status. No
effect of treatment on life
expectancy. kj equal for all
j in Sec 5.5. kj varying
across j in secs 6, 7
Cost
⎛ a jt +1 − k j ⎞ ⎛ 2 − e rw jt +1 − e − rL jt+1 ⎞
c
x
∑ j jt jt +1 ⎜⎜ a − k ⎟⎟ ⎜ 2 − erw jt − e− rL jt ⎟
j ⎠⎝
⎠
⎝ jt
c
x
∑ j jt jt
4.10.2.2
Sec 5.5,
Sec 6.3.
With
“optimal”
wait
adjustment
Sec 5.5
Sec 5.5,
Sec 6.3
Adjustments
have
more effect with
more
costly
treatments.
Also
calculated
with
different
discount
rates for w and L
Adjustments
have
more effect with
more
costly
treatments.
Also
calculated
with
different
discount
rates for w and L
Value
∑
(
(
jt +1
1 − e jt +1
⎞⎛ e
⎜
⎟⎟
− rw jt
− rL
1 − e jt
⎠ ⎜⎝ e
∑ j c jt x jt
⎛a −k
∑ j c jt x jt +1 ⎜⎜ ajt +1− k j
j
⎝ jt
− rw
(
(
− rL
)
)
⎟
⎠
− rL
x jt +1 ⎡π ht ( a jt +1h*j − h oj ) 1 − e jt +1 r −1 − π wt w jt +1 ⎤
⎣
⎦
− rL jt
o
−1
*
⎡
⎤
∑ j x jt ⎣π ht ( a jt h j − h j ) 1 − e r − π wt w jt ⎦
Constant health status. No
effect of treatment on life
expectancy. Value
additive in health and
waiting time.
cjt unit cost, volume xjt volume; ajt. mjt proportion patients alive, dead on discharge (or after 30 days); ; k estimate of proportionate effect of treatment on quality adjusted life
years (QALYs); Ljt life expectancy at mean age of patients treated; r discount rate on QALYs/waits; πht value of QALY (£s), πwt value of day’s wait, h*, ho estimates of health
with and without treatment.
j
)
2.7, 4.10.1
Efficient allocation.
Constant health status. No
effect of treatment on life
expectancy. kj equal for all
j in Sec 5.5. kj varying
across j in secs 6, 7
Sec 6.5
99
4.11 Patient satisfaction
The indices we have set out above are all variants on cost weighted aggregates built
up from FCE data. There are however some attributes which the Department may
wish to see reflected in an output index which are not covered by the data discussed
so far. Prominent among these are patient views on cleanliness, food quality and
satisfaction with non-curative aspects of treatment (such as the behaviour of nurses
and doctors).
Our analysis in this section is based on the assumption that these variables cause
patient disutility because they are seen as indicative of poor treatment quality. For
this part of our work then, we introduce these to our index is by identifying indicators
(such as aggregates constructed from responses to surveys of patient opinion). We
then calculate the growth rate of our comprehensive index, ItComp as the weighted sum
of the growth rate of one of the quality adjusted output indices we have described
earlier (call it It0) and the growth rates of the other indicators. We denote the growth
rate of indicator k by Itk. If there are n such indicators, and the relevant weights are
denoted by ωk then the overall index is given by
n
I tComp = ∑ ωk I tk
(98)
k =0
In some circumstances we may be able to deduce appropriate values for the ωk. For
example we know the total value of expenditure on cleaning and this therefore gives
us the basis for a weight on a cleanliness measure. In other cases the choice has to be
purely arbitrary. We do not know what costs, if any, are involved in persuading
doctors and nurses to be polite to patients, so there is little we can do except to invent
a weight.
The NHS Patient Survey Programme covers Acute Trusts, Primary Care Trusts,
Mental Health, Ambulance Trusts and others. In addition, there are plans for surveys
focusing on the National Service Frameworks for coronary heart disease, mental
health, older people, diabetes, etc. There are a number of issues that make the
100
incorporation of patient satisfaction into our measure of NHS output difficult, some
theoretical and some practical.
From a theoretical standpoint, there are at least three reasons why measures of patient
satisfaction should be treated with caution. The first is that patient satisfaction is not
independent of the level of output, xj, or other characteristics qk. For example, if we
include a measure of waiting times in our set of characteristics, it is likely that patient
satisfaction, not only with the time they had to wait, but also other aspects of the
service they have received, will be correlated with this. Because of this, there is the
likelihood of ‘double-counting’ output.
Second, patients’ reports of satisfaction will depend on the levels of service they
expect. Certain sections of the population may have lower expectations than others, as
noted in the First Interim Report (Dawson et al., 2004a). These expectations may vary
systematically across the population, introducing bias to our measure of output.
Third, satisfaction is a multidimensional concept. There may be a number of
orthogonal aspects to satisfaction, beyond total satisfaction, that should be considered
as separate characteristics in themselves and aggregated using their social valuations.
For example, patients’ satisfaction with the hotel services aspect of an inpatient stay
may be largely independent of their satisfaction with the medical aspects. Eliciting
these different dimensions of satisfaction with single instruments is extremely
difficult, if not practically impossible, which leads us to the empirical obstacles to the
inclusion of a measure of patient satisfaction into the index.
The practical question arising from the use of patient satisfaction data is the
construction of appropriate growth indicators, Itk In our analysis we make use of
qualitative surveys carried out for the Health-care Commission. The results of these
surveys show what proportion of patients gave answers in particular categories to
each question.
We constructed aggregates using the same approach as the Health-care Commission.
For each relevant question (i.e. those questions which were asked in more than one
survey round without changes likely to influence results), we gave a weight of 100 to
the most favourable answer, a rate of 0 to the least favourable answer and weights to
101
intermediate categories based on the number of possible choices. Thus if there was
one intermediate category it was given a weight of 50, if there were two the more
favourable was given a weight of 67 and the less favourable a weight of 33 and so on.
These weights were applied to the proportion of respondents in each category in order
to give us an overall score. For a group of overall questions (cleanliness, food quality
or other aspects of satisfaction) we took the arithmetic average of the individual
scores in order to produce an overall indicator.
This quantification appears arbitrary and one might be concerned that it does not
really identify the latent variables which underlie the reported categorical responses.
However the indices produced in this simple manner were very strongly correlated
with indices derived from assumptions about the density function of the underlying
latent variables. Thus we are reasonably happy with their use.
There are a number of questions we have not used in our analysis. Some of these
collected information in both years but did not invite observations on service quality
(e.g. What age group do you belong to?). Others we have rejected because there were
small but possibly important changes to the questions between the years (e.g. in 2002
“Were you in a mixed-sex room or ward?” and in 2004 “Were you in a mixed-sex bay
in a room or ward?”, or because they do not actually convey what is needed to assess
patient quality. Thus the question about the time people have waited to see GPs does
not identify patients who have been unable to see their GPs because the GPs only
make appointments for the day that they will see the patients.
The outpatient questionnaire includes one question about the time the patient has had
to wait for an appointment once they knew they needed it. We have counted waiting
time elsewhere and therefore leave that question out of our assessment. A second
question asks how long the patient has waited relative to the time of their
appointment, inviting categorical answers. By making the assumption that the midpoint of each category is relevant to all patients in that category and a plausible
assumption for the final category (two and a half hours for a wait of more than two
hours) we can estimate a mean waiting time and measure the percentage change in
this. This can be combined with other score variables for outpatient treatment.
102
These indicators are measured on a per-patient basis. To give an overall change in the
“volume of quality”, it is necessary to add to each indicator the growth in the volume
of treatment. Our own view is that this is not measured exactly by the cost weighted
index, since non-medical quality matters whether a patient is given a cheap or an
expensive treatment. The measure we use is number of inpatient spells. For
outpatients the appropriate volume indicator is the number of outpatient consultations
and the same is true for primary care. We discuss this further in section 5. The
percentage change in the volume component has to be added to the percentage change
in quality to give the change in the overall volume of quality, i.e. the relevant Itk.
The quantified data, for the relevant questions are shown in tables A1 to A4 in
Appendix A.
4.12 Discount rate on health
There is little agreement on the appropriate discount rate to use when health gains
accrue over time.
Cairns (1994, 1997) examines evidence on individuals’ time
preferences for social health gains, such as saving lives of other people at different
points in the future. Discount rates for saving lives fall from 41% where the delay is
only two years to 16% where the delay is nineteen years. Gravelle and Smith (2001),
using a social welfare framework, point out that if the value of health is increasing
over time, estimates of the volume of health benefits must take this into account. The
volume of health effects can be adjusted directly by the rate of growth in the value of
health. An indirect method is to reduce the discount rate on health effects relative to
the discount rate applied to costs. The size of the adjustment depends on estimates of
the rate of growth of the value of health which itself is a function of the rate of growth
of the direct utility effect of health and the rate of growth of income. If the indirect
method of allowing for growth in the value of health by means of a lower discount
rate on health effects is used, the discount rate on health effects will be 1-3% less than
the discount rate on costs depending on the assumptions made about key variables.
The Department of Health currently discounts health benefits at 1.5% p.a. and costs at
3.5% p.a. (Department of Health, 1996;, HM Treasury, 2003). We take 1.5% as our
base case in discounting life expectancy but also consider higher rates to investigate
103
the sensitivity of the index to the discount rate.
The formulations for the waiting time adjustment allow for the possibility that
different discount rates could be applied to the period whilst waiting and the post
treatment period. It could be argued that a higher rate of discount should be applied to
the waiting period since there is a cost to waiting over and above the delay in getting
treatment. Unlike the discount rate applied to health effects post treatment there is no
obvious higher rate to use for waits. We suggest that the direct disutility from
waiting, over and above the delay in getting treatment, is best dealt with by treating it
as a separate characteristic. Thus when the data become available to support a value
weighted index waiting time could be included both as a determinant of the health
effect and as a separate characteristic if patient preference studies suggest that this is
appropriate.
We have included in Sections 5 and 6 illustrative calculations of the scaling effect of
waiting time using different discount rates for health and waiting time. Since the rates
applied to waiting time are arbitrary and in the absence of any contrary indication we
prefer on balance to use the same rate for waits and health effects.
4.13 Quality adjustment for general practice
Until recently routine data on the quality of general practice has been virtually nonexistent because information on general practice has been collected with the aim of
paying GPs and until April 2004 the General Medical Services contract paid little
explicit attention to quality. GPs received bonus payments linked to the proportion of
eligible women receiving cervical screens, and of children vaccinated.
Cervical
screening activities were included in the old Cost Weighted Activity Index as volume
measures of activity so that to this extent the CWAI had a small quality component.
From 1998/9 onwards GPs have been able to opt for Primary Medical Service
contracts negotiated with their Primary Care Trusts under which they are meant to
deliver an enhanced package of services. By April 2004 around 35% of practices had
switched to PMS. Since PMS contracts are locally negotiated, the central reporting of
the activities targeted under the GMS contract is patchy.
104
There are a number of validated measures of general practice quality - such as the
proportion of patients with coronary heart disease whose blood pressure is controlled.
The Department of Health is investigating the use of the QRESEARCH database
extracted from a large (around 500) sample of GP electronic record systems to
measure the quality of care in general practice using these types of indicators
(Simkins, 2005).
The new General Medical Services contract of April 2004 contained financial
incentives linked to achievement against a large basket of quality indicators in the
Quality and Outcome Framework (QOF) (Department of Health, 2003a; Roland,
2004). PMS practices were also required to join the QOF and returns for all practices
are centrally collected and available via the Quality, Prevalence and Indicator
Database QPID run by the DH’s Health and Social Care Information Centre.
It is possible to calculate a subset of the quality indicators in the QOF for the practices
in the QRESEARCH database over a number of years prior to the introduction of the
QOF for all practices in April 2004. This series can then used to quality adjust the
past volume series of general practice consultations.
In principle the quality indicator series can be calculated in future years for the
practices in QRESEARCH and, for all practices. There are potential difficulties in
constructing a quality adjusted series which covers the periods before and after
2004/5. GPs are partially altruistic: their actions are guided by a professional concern
for their patient’s well being and by a concern for their income and effort. The
introduction of the QOF in April 2004 changed the relative importance of the
professional and personal incentives. For aspects of quality covered by the QOF GPs
now have both financial and professional incentives. For aspects of quality which are
outside the QOF they now only have professional incentives. The QOF may therefore
lead to increased activity in the areas covered by the QOF and reduced activity in
those outside it. Hence the changes in QOF activity may provide a misleading guide
to changes in overall quality of care since the unremunerated and therefore not
centrally notified activities may have declined, or not increased to the same extent.
105
The data produced by the QOF is likely to be a fruitful source of quality adjustment of
general practice activity in future years. Some of the indicators can be translated into
health impacts, for example from studies of CHD event risk factors including blood
pressure. But the interpretation of trends in quality indicators will require empirical
modelling of the responses of practices to changing financial incentives and their
impact on unremunerated quality indicators.
Such modelling should be possible
using QRESEARCH or a similar database.
The QOF also has an incentive for GPs to undertake surveys of their patients, though
the reward is for carrying it out and acting on the results, and is not linked to the
actual patient responses. If the QOF is adjusted to link payment to responses, for
example on satisfaction with particular aspects of the practice then this will be a
potentially fruitful source of data on the patient experience in general practice. We
discuss in section 4.11 how such data could be used as a quality adjustment to an
output index.
It has been suggested that admission rates for ambulatory care sensitive conditions
(ACSCs) can be used as a measure of the quality of general practice. ACSCs are
conditions which can be controlled in a good quality general practice and which
should not result in a hospital admission.
The usual examples include asthma,
diabetes and epilepsy and ACSCs for these conditions have been used by the DH and
the Commission for Health Improvement as primary care performance indicators.
They have also been used extensively in the US, New Zealand and Australia
(Giuffrida et al., 1999; Jackson and Tobias, 2001; Victorian Government, 2004).
Even if ACSC admission rates affected by the quality of general they are also strongly
influenced by factors outside the control of GPs, including the availability and
admission criteria of hospitals (Giuffrida et al., 1999). ACSCs admissions are an
imperfect proxy for the health of the relevant population. We do not recommend their
use as a means of quality adjusting measures of national general practice output.
106
4.14 Atkinson principles and quality adjustment
The Atkinson Review published its final report in January 2005 setting out
recommendations for improving the measurement of government output and
productivity (Atkinson, 2005).
Atkinson paid particular attention to the Eurostat (2001) Handbook on price and
volume measures in National Accounts.
The Handbook made important
recommendations on the methods to be used to measure output and the implications
for the measurement of non-market output.
Eurostat distinguished between activities, outputs, and outcomes. For purposes of
national accounting it is preferable to measure outputs (treatment received by a
patient) rather than activities (number of operations or prescriptions). Outcomes
should be used to quality adjust outputs.
Eurostat “graded” the methods that
governments use for measuring non-market outputs into A, B or C.
A. Preferred method:
i)
use output indicators (rather than activities)
ii)
all services should be covered, as detailed as possible
iii)
outputs should be quality adjusted
iv)
outputs should be cost weighted
B. Less satisfactory but acceptable method:
i)
use output indicators but the detail needs to be improved
ii)
no account is taken of quality change
C. Unacceptable method:
i)
use of inputs instead of activities or outputs
ii)
coverage of output not representative
In discussing Group A methods, counting inpatient activity by DRG is accepted as a
way of measuring output but this appears inconsistent as it is only a more
disaggregated way of counting activities. Atkinson acknowledges the difficulty of
107
quality adjusting by DRG activity.
Atkinson says that ideally output should be
measured as the whole course of treatment for an illness rather than just counting
activities (by DRG or HRG). This would make it possible to take better account of
the quality of care. The report notes that it “would be very helpful to be able to base
quality adjustments for NHS output on a data set which measures the health outcome
achieved as a result of treatment, collected annually by all or part of the NHS for most
aspects of health care.” In the short-term attention could be focused on a few disease
groups for which data on outcomes is available.
In the general discussion of methodology, Atkinson points to the distortion in output
indices that are weighted by average costs. Different outputs should be weighted by
the marginal value of the outputs to individuals and there is no reason to believe this
corresponds to marginal or average cost.
In major respects Atkinson (2005) recommended a methodology for measuring NHS
output growth advocated in our earlier reports:
•
Units of output should relate to patient journeys (courses of treatment) rather
than activities (tests, procedures, consultations, drugs prescribed). Until a
patient identifier permits linking the various services delivered to a single
patient, output will still have to be measured as the sum of activities.
•
The quality of outputs should be incorporated into output indices.
•
Quality measures should reflect the attributes of output valued by individuals.
Atkinson initially identifies relevant attributes as those recognised by current
objectives of government health policy. We suggest, since individuals may
value other attributes than those targeted by the policies of any particular
government, and may value them differently, that further research is required
on the attributes valued by individuals.
•
In order to generate a single index of output, weights must be attached to the
multitude of NHS activities. Weights should reflect the marginal social value
of the activities. At present relevant data does not exist and ONS will continue
to use cost weights. Atkinson acknowledges the distortions introduced by the
use of cost weights—a relative increase in expensive treatments appears to
increase NHS output while a relative increase in cost reducing treatments with
108
the same or better health outcomes will appear to reduce NHS output.
If we had routinely collected data on health outcomes and these data were used to
weight the activities of the NHS, it would not be necessary to make a separate quality
adjustment to an index of NHS output. In the short term Atkinson suggests an
alternative approach. Changes in quality which are currently measurable should be
used to augment cost weighted output indices. We suggest that for health care,
currently available data permit quality adjustment in respect of two attributes: waiting
times and survival rates. Following the Atkinson (2005) approach requires that we
find a basis for determining the relative value of changes in these two attributes of
health care and use the resulting estimate of the rate of change in quality to quality
adjust the cost weighted output index.
Atkinson (2005) stresses the need to measure the value added by public services. This
is particularly important in health care where individuals may be healthier or living
longer for reasons unrelated to the availability and quality of health care. What is
required are measures of the marginal effect of the NHS in producing outputs of value
to patients. Given the difficulties in estimating production functions, rough and ready
judgements will be required when attributing improvement in health outcomes to
health care.
Atkinson (2005) discusses the “complementarity” between public and private output.
As economic growth leads to higher standards of living, the relative valuation of
different goods and services will change. This will affect the weights to be attached
to the various activities in an output index. If people are living longer the value of a
hip replacement may increase as the present value of the benefits are estimated over a
longer period of time. We argue that such changes in value arising from factors
outside the control of the NHS should not be counted in measuring the quality
adjusted growth rates of different types of NHS output. Instead they should be
included in the weights to be applied in calculating the weighted average rate of
growth of NHS output (the output index). Changes in the value of a hip replacement
arising from factors outside the control of the NHS should influence decisions on how
to allocate NHS resources across different activities but not in assessing the rate of
growth of the output of hip replacement activities.
109
5 Experimental indices of NHS output
5.1
General trends, index form and data sources
This section demonstrates the impact on a cost weighted output index of adjustments
for a number of quality dimensions. Prior to presenting results, however, it is useful to
highlight some salient features of the data.
Our starting point is the dataset used by the DH in its cost weighted output index
(CWOI). For inpatient hospital care, the Hospital Episode Statistics (HES) are used
(data on non-elective, elective and day case activity). The main quality adjustments
discussed in this report are for survival and waiting times. Mortality rates are only
available for HES activity (non-elective, elective and day case hospital activity).
Waiting times are available for electives and day cases and some outpatient activities.
In 2002/03 the data consisted of 1913 groups of activities. Table 5.1 gives a summary
of the main divisions of this dataset. In this chapter we focus on the quality
adjustments to the HES data, which apply to 47% of the activity currently included in
the CWOI. In chapter 6 we examine the effects of different quality adjustments to
output indices for our small specimen set of HRGs for which we have health effects
data. In the light of the results in this chapter for the HES set of hospital activities
and the specimen set in chapter 6, we set out our preferred quality adjustment variant
in chapter 7, for the set of HES hospital activities, for a broader set of hospital
activities including outpatient treatments, accident and emergency, and for all NHS
activity. In chapter 9, the output indices from chapter 7 are combined with the input
indices from chapter 8 to yields estimates of various types of productivity growth.
110
Table 5.1 Activities and cost shares 2002/3
Electives+ day cases
Non-electives
Outpatients
Other activities2
Total
Number of activities
(millions)
Cost shares1
5.58
5.96
53.43
13.38
22.1
11.15
53.37
100
Notes: 1. Derived by multiplying activities by unit costs; 2.These activities are measured in noncomparable units so total numbers of activities are meaningless. A division of cost shares with this
category is shown in Table 3.1.
In order to highlight the impact of quality adjustment for the activities where the data
permit adjustment, in sections 5.4 and 5.5 we compare our quality adjusted indices to
an unadjusted index restricted to the same set of activities. In the tables this truncated
version of the CWOI is labelled “unadjusted”. In Section 7 we examine how quality
adjusting for this subset of activities affects the value of the complete CWOI.
5.1 General trends and data
The HES data are grouped according to procedures comprising 574 Healthcare
Resource Groups (HRGs), with an additional separation into electives and day cases
and non-electives (Appendix B). Figure 5.1 graphs the number of episodes for each
year from 1998/99 to 2003/04. It shows little change in electives up to 2001/02 with
some growth thereafter. Non-electives show more significant growth, with very high
growth in the final year.
111
Figure 5.1 Number of FCEs, electives+day cases (elip) and non-electives (nelip),
1998/99 – 2003/04
8500000
8000000
7500000
7000000
6500000
elip
nelip
6000000
5500000
5000000
4500000
4000000
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
Within these broad categories there is considerable variation in number of procedures
and in growth by HRG. For example, comparing 2002/03 with 2001/02, the arithmetic
mean growth in episodes for elective HRGs was 3.8% but with a standard deviation of
15.9% – growth across HRGs was even more variable for non-electives. Partly this
reflects substitution across treatments but nevertheless the variation is large.
Unit costs from the Reference Costs database are employed to aggregate these diverse
activities. The unit costs also show considerable variation across procedures, from
over £20,000 for transplant procedures to under £500 for ophthalmic and ear
procedures. In 2002/03 the mean unit cost across HRGs for electives was about
£1,700 with a standard deviation of £2,220, with figures of £2,200 and £2,600,
respectively, for non-electives.
The cost weighted output index combines activity growth by weighting by unit costs,
equivalent to multiplying the ratio of activities by cost shares. Cost shares are
concentrated in a few HRGs. Treating electives and non-electives as separate sets of
112
activities, in 2002/03, 25% of expenditure was accounted for by only 20 HRGs with
50% accounted for by 74 HRGs, as illustrated by the cumulative expenditure share
chart below. The top expenditure categories include HRGs where activity rates are
very high such as maternity care for normal deliveries or hip replacements and which
are not typically life threatening. But it also includes HRGs where mortality rates are
very high such as heart procedures and complex procedures involving the elderly. A
high cost share on this latter group turns out to be important in the adjustments for
survival discussed below.
Figure 5.2 Cumulative expenditure shares, FCEs (537 elective, 537 non-elective
HRGs)
1.2
1
0.8
0.6
0.4
0.2
0
0
5.2
200
400
600
800
1000
1200
Index form
Table 5.2 compares the Laspeyres (base period weighted) index with the Paasche
(current period weights) index and the Fisher index which is the geometric mean of
the Laspeyres and Paasche indices. The index number formula used does have a small
113
but significant impact on the indices, as shown for episodes for selected years. In this
report we follow ONS in reporting Laspeyres indices.
Table 5.2 Impact of index number formula on CWOI index, FCEs based
1999/00-2000/01
2000/01-2001/02
2001/02-2002/03
Laspeyres
0.90%
0.93%
4.41%
Paasche
0.71%
0.82%
4.47%
Fisher
0.81%
0.87%
4.44%
Note: the reference costs for 1998/99 were considered unreliable so the index for the comparison
between 1998/99 and 1999/00 use 1999/00 unit costs in the numbers reported below. 2003/04 unit
reference costs on a comparable basis were not available.
5.3
Spells versus episodes
We discussed the choice between measuring hospital output in finished consultant
episodes (FCEs) and continuous inpatient spells (CIPS) which consist of sets of
consecutive FCEs in section 4.2 (see also Appendix B). We argued that CIPS were a
better approximation to the patient journey and therefore a more appropriate measure
of output. We use CIPS for our calculation of the effects of quality adjustments.
Although there are around 8% fewer CIPS than FCEs this should have essentially no
effect on the calculation of a cost weighted output index since we constructed our unit
costs for CIPS from the underlying FCE unit costs. Table 5.3 compares FCE based
and CIPS based CWOIs as a check on our calculations of unit costs of spells. The
only reason for a divergence between the two indices is that some of the FCEs
assigned to a particular year in the FCE index may be assigned to a different year in a
CIPS index since a CIPS is assigned to a year only if its last FCE finished in that year.
We would however expect to see differences in FCE and CIPS based indices once the
outputs are adjusted for survival and mortality since these adjustments are applied to
the different distributions of HRG types generated by the FCE and CIPS volume
measures.
114
Table 5.3 Comparison of cost weighted output indices for hospitals based on
finished consultant episodes and continuous inpatient spells
CWOI index
5.4
Episodes
CIPS
pp diff
1998/99-1999/00
1.84
1.87
-0.03
1999/00-2000/01
0.90
0.91
-0.01
2000/01-2001/02
0.93
0.95
-0.02
2001/02-2002/03
4.41
4.44
-0.03
2002/03-2003/04
5.75
5.81
-0.06
Average
2.75
2.78
-0.03
Survival adjustments: hospital output
5.4.1 Simple survival adjustment
This section considers the results of applying the survival adjustment formula in
section 4.8.1 to HES data. There are two choices of death rates that can be used in the
calculations, those that occur during the hospital stay or in-hospital deaths together
with those occurring within some period following discharge from hospital. Inhospital deaths are those most directly attributable to the NHS but are likely to
underestimate survival changes due to medical treatment since many patients die
within a short time after discharge. However using mortality rates after discharge runs
the risk of attributing deaths from extraneous influences to the NHS. On average in
the period under consideration 30 day mortality rates were about 25% higher than inhospital deaths (Table 5.4). Both indicators show a downward trend with similar rates
of decline.
115
Table 5.4 Mortality rates, (deaths/CIPS), 1998/99-2003/04
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
In-hospital
0.0239
0.0238
0.0229
0.0236
0.0228
0.0222
30 day
0.0308
0.0306
0.0293
0.0299
0.0286
0.0276
Death rates vary enormously across procedures. Death rates are considerably higher
for non-elective procedures than for electives. Although the rate of decline is greater
in the latter - on average elective mortality rates declined by 5.9% from 1998/992003/04 against decreases of 2.1% for non-electives - aggregate trends are dominated
by those for non-electives given their greater weight (Figure 5.3).
Figure 5.3 30 day Mortality rates, electives and non-electives
0.06
0.05
0.04
electives
non-electives
0.03
0.02
0.01
0
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
116
Figure 5.4 Plot of mortality rates 2002/03
1.2
1
0.8
0.6
0.4
0.2
0
0
200
{
400
electives
600
}{
800
1000
non-electives
1200
}
Table 5.5 reports calculations of the pure short term survival adjusted cost weighted
output index (section 4.8.1)
∑
⎛ a jt +1 ⎞
x
⎜⎜
⎟⎟ c jt
jt
+
1
j
⎝ a jt ⎠
∑ j x jt c jt
(99)
where a is the survival rate.
The first column of results is the unadjusted CWOI. The second and third columns
are the survival adjusted indices calculated with 30 day and in-hospital death rates.
The adjustments are non-trivial, though generally quite small in percentage point
terms. These adjustments are generally larger in the final two years than in the
beginning of the period. The use of 30 day mortality rates yields a higher adjustment
than in-hospital deaths in all but the first year.
117
Table 5.5 Laspeyres CWOI index, CIPS, adjusted for survival
Laspeyres CWOI
Unadjusted
Adjusted for survival (1-m)
30 day
In-hospital
1998/99-1999/00
1.87
1.27
1.37
1999/00-2000/01
0.91
1.16
1.08
2000/01-2001/02
0.95
0.89
0.86
2001/02-2002/03
4.44
5.37
5.14
2002/03-2003/04
5.81
6.37
6.22
Average all years
2.78
2.99
2.91
The impact of the survival adjustment depends on both the rate of change of survival
across HRGs and their cost shares. The latter turn out to have a large impact since, as
stated earlier, the majority of procedures show little change in survival but these tend
to be concentrated in low cost procedures. To illustrate this point Figure 5.5 shows the
change in average (unweighted) mortality rates (from Table 5.4) and the change in the
CWOI adjusted for survival minus the unadjusted CWOI (the second column in Table
5.5 minus the first column in Table 5.5), both indexed at 1998 =1. The mortality rate
shows a relatively smooth pattern, generally declining but with a small upward shift
comparing 2000/01 and 2001/02. In contrast the impact on the CWOI is much more
variable, and not always in the inverse direction to the change in the mortality rate.
Figure 5.6 plots changes in survival rates against unit cost for one of the growth
periods, 2001/02-2002/03. Most changes in survival are small, ranging around the
value 1 on the y-axis and the majority of these are in the lowest unit cost range. Year
on year changes in the CWOI are driven largely by variations in survival rates in the
relatively few procedures with very high unit costs, plus a few cases where changes in
survival rates are very high in the low unit cost range.
118
Figure 5.5 Mortality rates and the impact of the survival adjustment*, Index
1998/99=1
1.1
1.05
1
0.95
0.9
mortality rate
CWOI impact of survival
0.85
0.8
0.75
0.7
0.65
0.6
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
* calculated as the difference between the 30 day survival adjusted CWOI and the unadjusted CWOI
Figure 5.6 Growth in survival (ratio) and unit costs, 2001/02-2002/03,
(electives and non-electives)
2.5
Ratio survival (2002/03)/(2001/02)
2
1.5
1
0.5
0
0
5000
10000
15000
20000
25000
30000
35000
unit costs
119
To understand the sensitivity of the results to cost shares we estimated the change in
the index when survival was assumed unchanged for the top 25 high cost share HRGs,
which represented just over 30% of total expenditure. The impact of this was to
reduce the 30 day survival adjustments by about 60%. Thus the calculations depend
heavily on the survival rates of a small number of HRGs. Within this high cost share
group, comparing 1999/00 with 1998/99, 17 of the 25 HRGs showed reductions in
survival rates and these are responsible to a large extent for the big negative impact of
the survival adjustment on the CWOI in that growth period. In contrast in the final
two growth periods the majority of high cost share HRGs witnessed increases in
survival rates – 19 HRGs in 2001/02-2002/03 and 20 HRGs in 2002/03-2003/04.
Over time, both the number of HRGs with positive growth in survival rates and the
share of expenditure accounted by these procedures have increased as shown in Table
5.6. If the percent of HRGs with increases in survival rates is lower than the
cumulative expenditure share (in percent) of these procedures, then increased survival
is concentrated in relatively high cost procedures. Table 5.6 shows that this is the case
in each growth period except the first and that the discrepancy has increased through
time.
Table 5.6 Changes in 30 day survival rates and expenditures shares
1998/99-1999/00
1999/00-2000/01
2000/01-2001/02
2001/02-2002/03
2002/03-2003/04
Percent of
procedures* with
change in survival
rates >1
Expenditure shares
of procedures* with
change in survival
rates >1
42.8
55.5
49.5
62.9
63.0
37.8
62.0
50.7
75.4
77.7
* Total number of procedures = 1148, with electives and non-elective HRGs treated as separate
procedures.
120
We suggested in section 4.8.1 that a simple survival estimate would be likely to be a
conservative estimate of the effect of the growth in the health effect of treatment and
we now consider how making strong assumptions about the health effect alters the
results.
5.4.2 Survival and estimated health effects adjustment
We consider the survival and health effects adjusted index of section 4.8.2
∑
j
⎛ a − kj
x jt +1 ⎜ jt +1
⎜ a jt − k j
⎝
∑ j x jt c jt
⎞
⎟⎟ c jt
⎠
(100)
where kj = q ojt / q*jt is an estimate of the proportionate effect of treatment conditional
on survival to no treatment which, in the absence data on actual health effects, we
assume is constant over time. ( q*jt is the sum of discounted quality adjusted life years
accruing to patients who survive treatment. qojt is the sum of quality adjusted life
years for untreated patients). With k = 0 which implies that the patient would have
zero quality adjusted life years if not treated we have the pure survival adjusted index.
We examine the impact of assuming that k is positive. As noted in section 4.8.2, the
rather sketchy available evidence suggests a value of around k = 0.8 for non life
threatening procedures. When the treatment has a high mortality we set k = 0. If we
used m = 0.2 for the cut-off mortality rate for setting k = 0 this would ensure that term
a – k is never negative which would correspond to treatment having a negative effect
on health. But as we noted in the simulations in section 4.8.2 this would make the
index very sensitive to change in mortality when the rate is close to 0.2. We therefore
set the cut off value for mortality which leads to k = 0 so that a – k is never smaller
than 0.05. Thus for HRGs with high mortality we adjust only by the survival rates, not
by the survival rates and the assumed health effect.
Table 5.7 shows that including the crude health effects adjustment via k = qo/q*
generally increases the growth rate compared with no adjustment (first column) and
with a simple survival adjustment (Table 5.5). The greatest impact is the third
column, in particular for the final two periods. The results in general suggest that
121
adjusting for survival adds about two percentage points to the growth rate in 2002/03
and 2003/04. Averaged across the five yearly growth rates, the impact ranges from
adding about 1.0 to 0.4 percentage points to the growth rate. Contrast this with an
average impact of 0.22 for the simple survival adjustment using 30 day survival rates.
Thus a survival adjustment which incorporates crude but not implausible adjustments
for health effects is capable of significantly adding to the growth rate of hospital
output. Note, as with the simple survival adjustment, much of the impact is due to the
behaviour of survival rates in the high cost share HRGs. For example in the case
where k=0.8 with cut off = 0.10, nearly 70% of the adjustment can be attributed to the
25 HRGs with the highest cost shares.
Table 5.7 CWOI index, CIPS, adjusted for survival, 30 day mortality rates
Unadjusted
q0/q*=0.8
if m<0.10,
q0/q =0
otherwise
q0/q*=0.8 if
m<0.15,
q0/q =0
otherwise
q0/q*=0.7
if m<0.15,
q0/q =0
otherwise
q0/q*=0.7
if m<0.10,
q0/q =0
otherwise
q0/q*=0.9
if m<0.05,
q0/q =0
otherwise
1998/99-1999/00
1.87
0.78
0.09
0.73
1.02
1.26
1999/00-2000/01
0.91
1.58
1.97
1.51
1.36
1.54
2000/01-2001/02
0.95
0.91
1.01
0.93
0.90
1.01
2001/02-2002/03
4.44
6.59
7.72
6.34
5.97
6.27
2002/03-2003/04
5.81
7.15
8.04
7.10
6.76
7.09
Average all years
2.78
3.36
3.77
3.28
3.20
3.43
5.4.3 Survival adjustments with health effects and life expectancy
In section 4.8.3 we suggested that including a term reflecting life expectancy of
patients treated would be a way of improving the crude adjustment for the health
effect and proposed the index
122
∑
x c
j jt +1 jt
(a − k ) ⎛ 1− e
( a − k ) ⎜⎝ 1 − e
∑xc
jt +1
− rL jt +1
j
jt
− rL jt
j
j
jt
⎞
⎟
⎠
(101)
jt
where Ljt is the life expectancy at the average age of patients getting treatment j and r
is the discount rate on quality adjusted life years (the units in which health effects are
measured). Table 5.8 reports the results of calculation of this index with a discount
rate on remaining life equal to 1.5% for the simple survival adjustment and with our
central case value of k = 0.8 with a mortality cut off of either 0.15 or 0.10.
Table 5.8 CWOI index, CIPS, adjusted for survival, life expectancy, 30 day
mortality rates, r=1.5
Unadjusted
q0/q*=0.8 if m<0.10,
q0/q =0 otherwise
q0/q*=0.8 if m<0.15,
q0/q =0 otherwise
1998/99-1999/00
1.87
1.12
0.74
1999/00-2000/01
0.91
1.37
1.76
2000/01-2001/02
0.95
0.76
0.89
2001/02-2002/03
4.44
6.31
7.44
2002/03-2003/04
5.81
7.13
8.03
Average all years
2.78
3.30
3.72
The growth is higher with either of the adjustments than without them, more markedly
for the variant with the more generous cut off which leaves more HRGs being
adjusted by the ratio of health effects than by the simple survival ratio. The effect of
the life expectancy adjustment is to reduce the growth compared with the
corresponding case in Table 5.7 in all years except the first and reflects the increasing
age of patients treated by the NHS.
5.5
Waiting time and survival adjustments: hospital output
This section considers additional impacts on the indices from taking account of
changes in waiting times. It shows results for a number of variants based on the
123
formulae in section 4.10. We calculated the mean wait after truncating very long waits
to four years and the “certainty equivalent” wait which was the mean plus a “risk
premium” to reflect the disutility from the risk of long wait relative to the mean. We
used the rule of thumb that the certainty equivalent wait for a treatment was at the 80th
percentile wait for that treatment (see section 4.10.4).
Table 5.9 shows mean waits across all patients and the mean 80th percentile wait
across HRGs for electives in the period under study. This shows a decline in average
waiting times using both measures since 1998/99. However this is mainly due to a
large drop between 1998/99 and 1999/2000. Starting in the latter year mean waiting
times increased up to 2002/03 but declined marginally in the final year.
Table 5.9 Trends in waiting time, days, averages across HRGs
Mean
Per cent HRGs with decline in
waiting times
Truncated mean 80th percentile
Truncated mean
80th percentile
1998/99
88.7
132.2
1999/00
80.8
117.7
62.3
55.1
2000/01
82.3
119.0
32.4
34.2
2001/02
85.2
124.4
31.9
33.9
2002/03
88.5
128.9
32.3
29.9
2003/04
85.9
126.8
63.4
51.5
The third and fourth column of Table 5.9 summarise the variation across HRGs in
terms of waiting time experience by showing the percent of HRGs that show declines
in average waiting times. Only in the first and last growth period do the majority of
HRGs show decreasing waits with increases in the majority in the intervening years.
The overall mean measures of waiting times are affected by the extent to which
activity moves between procedures. Although substantial proportions of HRGs record
reductions in waits in any one year this does not imply a substantial reduction in
waiting times. Indeed if yearly waiting times were symmetrically randomly
distributed around an unchanging mean for each HRG then 50% would have
124
reductions in any given year but there would be no overall downward trend in waiting
times. The first two columns of Table 5.9 suggest that adjustments for reductions in
waiting times are likely to have a small effect because the measures of waiting time
that we use did not change very much.
5.5.1 Effect of waiting time adjustments
The last point above is confirmed when we consider variations in quality adjustments
to take account of changes in waiting times within the framework of a cost weighted
output index. Tables below show the results of using various scaling factor waiting
time formulae from sections 4.10.2, with the two measures of waiting time and
different discount rates. The first column of figures in each panel shows as the base
case the survival adjustment variant with k = q0/q*=0.8 and the mortality cut off set to
m = 0.10 as a point of comparison. We found that other survival adjustments made
little difference to the effects of the waiting time adjustments.
Table 5.10 reports results from the adjustment with discounting to date of treatment
with charge for wait (section 4.10.2.2)
(
) (
)
r w
⎡ 1 − e − rL L jt +1
e w jt +1 − 1 ⎤
⎢
⎥
−
r
r
⎢
⎥
L
w
⎛ a jt +1 − k j ⎞ ⎣
x
c
∑ j jt +1 jt ⎜⎜ a − k ⎟⎟ ⎡ 1 − e− rLL jt erww jt − 1 ⎤ ⎦
j ⎠
⎝ jt
⎢
⎥
−
rL
rw
⎢
⎥
⎣
⎦
∑ j x jt c jt
(
) (
)
(102)
where wjt is the waiting time measure for HRG j, rw is the discount rate on waiting
times and rL is the discount rate on QALYs. Note this formula differs from that in
Table 4.8 due to different discount rates on waits and QALYs – if the two discount
rates are equal the formula reduces to that in Table 4.8. The panels differ in the
measure of waiting time adopted (mean wait or 80th percentile).
125
Table 5.10 Laspeyres CWOI index, CIPS, adjustments for changes in waiting
times
Discount to date of treatment with charge for wait (based on mean wait variable)
Survival
adjustment
only
rw = rL =
1.5%
rw = rL =
5%
rw =10%,
rL =1.5%
rw =50%,
rL =1.5%
1998/99-1999/00
1.12
1.16
1.08
1.16
1.16
1999/00-2000/01
1.37
1.35
1.46
1.35
1.34
2000/01-2001/02
0.76
0.75
0.79
0.75
0.75
2001/02-2002/03
6.31
6.31
6.40
6.32
6.32
2002/03-2003/04
7.13
7.20
7.26
7.20
7.21
Average
3.30
3.32
3.36
3.32
3.32
Discount to date of treatment with charge for wait (based on 80th percentile wait
variable)
Survival
adjustment
rw = rL =
rw =50%,
rw = rL =
rw =10%,
only
1.5%
5%
rL =1.5%
rL =1.5%
1998/99-1999/00
1.12
1.18
1.13
1.19
1.21
1999/00-2000/01
1.37
1.34
1.44
1.34
1.33
2000/01-2001/02
0.76
0.75
0.78
0.75
0.75
2001/02-2002/03
6.31
6.34
6.44
6.35
6.38
2002/03-2003/04
7.13
7.24
7.31
7.25
7.30
Average
3.30
3.33
3.38
3.34
3.36
Note: All columns have the same survival adjustment: k = 0.8 if m < 0.10, 0 otherwise
We also considered a number of variations in quality adjustments to take account of
changes in waiting times based on use of additional formula or different ways of
measuring waiting times. The first reports the results for waiting time adjustment with
discounting to the date placed on the list (section 4.10.2.1):
(
(
w jt +1
1 − e L jt+1
⎞⎛ e
⎜
⎟⎟
− rw w jt
−r L
1 − e L jt
⎠ ⎝⎜ e
∑ j c jt x jt
⎛a −k
∑ j c jt x jt +1 ⎜⎜ ajt +1− k
⎝ jt
−r w
−r L
)
) ⎟⎞
⎟
⎠
(103)
where the waiting time adopted is the 80th percentile.
126
The second is based on the use of individual data to measure mean waiting times.
Since the waiting times and life expectancy factors are non-linear and there is a
variation in waiting times and in ages within an HRG in a given year it is possible that
our use of a single waiting time and life expectancy estimate for each HRG may lead
to misleading results. We therefore computed the equivalent of the waiting time
adjustment with discounting to date of treatment with a charge for waiting with
individual level data.
Thirdly we were asked to consider how an adjustment for waiting times could allow
for optimal waiting times – it was suggested that some patients might find too short a
wait inconvenient. In the absence of any information on what an optimal wait might
be we investigated the implications of assuming that the effect of an optimal waiting
time w* was to replace the actual wait in our waiting time adjustments with the ŵ =
w – w* if w > w* and 0 otherwise. Thus reductions in waiting time below w* would
have no effect whereas the proportionate effect of reductions above w* would be
increased. We experimented first with w* = 30 days but found that this resulted in a
large number of HRGs where ŵ = 0, therefore we opted to use a value of w = 15
days.
Table 5.11 shows the impact on the CWOI of these three variants, where the first
column shows the calculations in Table 5.10, discount to date of treatment with
charge for wait (based on 80th percentile wait variable) for comparable discount rates.
Discounting to date on list lowers the average growth rates, mainly through reductions
in the first and last years. The use of individual data has a greater effect in raising the
growth rate, although this is concentrated in the first few years. The use of optimal
waits has little impact on the average growth rates, with only a discernible impact in
the final year.
127
Table 5.11 Laspeyres CWOI index, CIPS, adjustments for changes in waiting
times, rL =rW=1.5%
Discount to date
of treatment with
charge for wait,
80th percentile
wait variable
Discount to
date on list,
80th
percentile
wait
Using
Optimal
individual
waits*
data,
discounting to
discounting to
date of
date of
treatment with
treatment with
charge for
charge for
wait, mean
wait
wait
1998/99-1999/00
1.18
0.69
1.30
1.18
1999/00-2000/01
1.34
1.36
1.59
1.34
2000/01-2001/02
0.75
0.74
1.05
0.75
2001/02-2002/03
6.34
6.53
6.41
6.34
2002/03-2003/04
7.24
7.07
7.24
7.25
Average
3.33
3.24
3.48
3.33
*Based on 15 day optimal waiting time.
Note: All columns have the same survival adjustment: k = 0.8 if m < 0.10, 0 otherwise
The results which show small effects of waiting time adjustments are largely driven
by the lack of change in waiting times rather than the methods used. To see this
suppose waiting times for the 80th percentile were reduced by 10% for all HRGs
comparing 2003/04 with 2002/03. Then the discount to date of treatment with low
discount rates equal to 1.5% would add 0.16 percentage points. With the same
discount rates, reducing waits at the 80th percentile by 50% would add 1.12
percentage points.
The results from the specimen index calculated with a much
smaller set of HRGs are sensitive to the method of waiting time adjustment and can
make a difference to estimated growth rates.
In addition the impact of changes in waiting times is dependent on the cost share
weights. In this case however, large increases in waiting times tend to be concentrated
in low unit cost procedures. This is illustrated the final two growth periods in Figure
5.7 and Figure 5.8 but a similar pattern is also apparent for earlier years.
128
Figure 5.7 Percentage changes in waiting times (days) and unit costs, 2001/022002/03
150
100
ratio waiting times
50
0
0
5000
10000
15000
20000
25000
30000
35000
mean wait
80th percentile wait
-50
-100
-150
unit costs
Figure 5.8 Percentage changes in waiting times (days) and unit costs, 2002/032003/04
100
ratio waiting times
50
0
0
5000
10000
15000
20000
25000
30000
35000
mean wait
80th percentile wait
-50
-100
-150
unit costs
129
Again it is useful to summarise the relationship between cost and changes in waiting
times by the number of HRGs that show reductions and their expenditure shares.
Table 5.12 shows that for three of the five growth periods the majority of HRGs show
increases in waiting times with higher proportions in the first and final period. In
general the percent of HRGs with reductions in waiting times are about equal to their
expenditure shares so that reductions tend to be concentrated at the low unit cost end.
Table 5.12 Changes in waiting times and expenditures shares
Mean wait
Expenditure
Per cent of
shares of
electives*
electives*
with reduction
with
in waiting
reduction in
time
waiting time
1998/99-1999/00
1999/00-2000/01
2000/01-2001/02
2001/02-2002/03
2002/03-2003/04
62.3
32.4
31.9
32.3
63.4
68.7
37.3
30.4
33.3
63.9
80th percentile wait
Expenditure
Per cent of
shares of
electives*
electives*
with
with
reduction in
reduction in
waiting time waiting time
55.1
34.2
33.9
29.9
51.5
65.8
37.7
35.2
29.8
51.7
* Total number of electives = 563
We did not estimate the alternative characteristic adjustment set out in section 4.10.1
for all HRGs because of lack of data on health effects. However, we report in section
6 results from using this approach to waiting times with a small specimen set of
HRGs for which we have better health data.
5.5.2 Outpatient waits
Finally we consider outpatient waits. Data on waiting times for first outpatient
attendances are only available for four of the years considered in this report. Average
days wait for outpatients were 64 days in 1999/00 and 2000/01 but then declined by
about 10% in 2002/03 to 58 days and a further 7% to 54 days in 2003/04. We used
the discount to date of treatment formula as for electives above, assuming all
outpatients had remaining life expectancy of 26 years, the average across electives.
The cost weights for changes in waiting times for outpatients was assumed to be the
130
sum of the cost share of first attenders and follow up appointments to be consistent
with the spells approach employed in previous calculations. The effect of this
adjustment was to increase the cost weighted output index for outpatient first
attenders from 4.47% to 4.59% in 2001/02 and from 6.48% to 6.56% in 2002/03.
These adjustments become very small when all outpatients including follow-ups are
included in the index.
5.6
Additional quality adjustments
We also considered the use of data in addition to survival and waiting time in order to
quality adjust the output index.
These additional adjustments are necessarily
speculative because of the absence of crucial data so we present them mainly to
illustrate the application of the methods described in section 4 and to give a very
rough indication of what are the crucial parameters on which information is required
Given current data availability we do not recommend they be used to quality adjust
the NHS output index.
The adjustments are of two types (a) we treat measures of
readmissions and MRSA as indicators of unnecessary additional expenditure (section
4.9.1); and (b) we use measures of patient experience as summary indicators of
characteristics that patients value (section 4.11) .
5.6.1 Adjusting for the costs of poor treatment: readmissions and MRSA
We suggested in section 4.9.1 that one way of accounting for readmissions and
MRSA was to argue that these led to lost output whose value, in accordance with the
assumptions underlying the cost weighted index, was their additional cost to the NHS.
Thus, ignoring other quality adjustments for illustrative purposes, we calculate
∑x
∑x
c − ∑ x bjt +1c bjt
jt +1 jt
j
j
j
jt
c jt − ∑ x bjt c bjt
(104)
j
where xb denotes the number of readmissions or cases of MRSA and cb their costs. As
we noted in our discussion in section 4.9.1 (see also Appendix A) the current data on
the xb are not sufficiently detailed, to enable us to distinguish say between
readmissions which are the result of poor initial treatment and those which result from
pre-existing poor health of the patient. In our calculation we therefore use the total
131
number of readmissions and the total number of MRSA cases. Since we have no data
on MRSA cases or readmissions prior to 2001/02 we show only the effect from
2001/02 to 2003/04.
There are also problems in estimating the costs cb if we do not know, for example,
which readmissions are indicators of poor treatment. We therefore use notional costs
of a readmission of £500 and of £1000 for an MRSA case at 2002/03 prices, with
prices in other years estimated using money GDP per capita.
With these data and working from the cost weighted index with no adjustment for
mortality or waiting time, we can see what effect these have on the estimated growth
rate. We show three cases in addition to the basic cost weighted index.
Table 5.13 Effects of quality adjustment for readmissions and MRSA on cost
weighted hospital output
No. of
MRSA Hospital
Readmissions Cases CWOI*
Adjusted Indices
MRSA Charge (£ per case)
0 £1,000
Readmission Charge
(£ per case)
2001/2
2002/3
2003/4
0
Average growth
476,556
492,247
536,005
17,933
18,519
19,311
£0
£0 £1,000
£500
£500
4.44% 4.44% 4.46% 4.46%
5.81% 5.81% 5.75% 5.75%
5.12% 5.12% 5.10% 5.11%
*This is the unadjusted CWOI, for the inpatient hospital sector (see Table 5.3)
The impact of MRSA cases is negligible. The number of cases is very small,
compared to the number of patients treated in hospitals; a very much higher cost
would be required for it to have an impact. The effect of readmissions is slightly
larger (see section 4.9). However since the growth rate, at 6% p.a. over the two years
is not very different from the growth rate of the unadjusted impact, the costs
associated with readmissions (which are regarded as money wasted rather than
contributing to output) would have to be very large for there to be a substantial effect
on the overall index.
132
Table 5.14 shows the effect of adding the readmission and MRSA adjustments to the
survival and waiting time adjusted hospital cost weighted output index. It also further
clarifies the calculations, in the case where both adjustments are included, by showing
the shares of total expenditure attributed to each. With such small shares, especially
for MRSA, it is unlikely that the impact on the overall index would ever be of great
significance.
Table 5.14 Effects of quality adjustment for readmissions and MRSA on cost
weighted inpatient hospital output in addition to survival and waiting time
adjustment
2001/02
Index
Assumed
Assumed
Additionally
Costs of
Index
Costs of
adjusted for Adjusted for
Growth
MRSA as Readmissions
Proportion of as Proportion
Waiting and Readmissions Growth
Rate
Mortality and MRSA Rate MRSA Readmissions Total Costs of Total Costs
0.14%
1.89%
2002/03
6.35%
6.41%
3.27%
3.29%
2003/04
7.25%
7.22%
4.28%
8.89%
Av 03/4
over 01/02
6.80%
6.82%
3.77%
6.05%
0.14%
1.81%
Combined Growth Rate
MRSA/Readmission
5.90%
Note: The index adjusted for waiting and mortality is the 80th percentile waiting variable with rw=10%
p.a. and rl=1.5% p.a. in table 5.10. The adjustments for readmissions and MRSA are incorporated in
the index by adding to the growth in the index adjusted only for waiting and mortality the growth rates
of MRSA and readmission weighted negatively by the cost shares for the previous year.
These speculative guesstimates suggest that adjusting for readmissions and MRSA in
this way can have non-trivial effect on the survival and waiting time adjusted hospital
cost weighted output index. While we stress the illustrative nature of these figures
one important policy point does follow from them. Cases of MRSA are rare that costs
associated with its treatment would have to be a substantial multiple of the £1000 we
assumed before it could have an important impact on the index. Readmission, on the
other hand, is of material importance. If the DH wishes to use our approach to adjust
the CWOI we recommend that it should focus initially on quantifying the costs of
MRSA cases and the proportion of readmissions which are avoidable or harmful.
133
5.6.2 Patient satisfaction
As we note in our discussion in section 4.11, there are theoretical arguments against
using such data, not least that it may simply be double counting aspects of quality,
such as health effects and waiting times, which can be captured by other more direct
means. On the other hand if one believes that satisfaction survey response measure
characteristics of NHS care which are of value to patients and are not already
reflected in other quality measures then we have suggested in section 4.11 a method
of incorporating such data. The data are described in Appendix A.
Since patient satisfaction data only permit comparison of 2004 or 2004/05 with 2003
we have illustrated our method by examining its impact on the average annual growth
rate in hospital output between 2001/02 to 2003/04.8
Our method has three main steps. First we quantify the ordered qualitative responses
to various satisfaction questions by assigning them equally spaced numerical values
between 0 and 100 for the least to the most satisfied categories. We construct such
numerical scores for three aspects of the patient experience: food, cleanliness and
non-clinical experience (for example whether patients felt they were treated with
respect and dignity). We have scores from A&E, outpatient, and inpatients surveys.
There is a residual category of “other” hospital activity, taking up about 25% of costs.
We have assumed that the surveys do not describe patient satisfaction with the
services provided by this.
Second, since these scores are in effect estimates on a per patient basis we need to
scale them by a suitable measure of the volume of such experiences. We use the
number of patients for outpatients and A&E and the number of patient spells for
inpatients.
The reason for choosing patient spells for inpatients rather than the
alternative of bed-days is that the analogy with hotel services can be taken only so far.
One can argue that most patients would prefer short stays rather than long stays in
hospital- that staying in hospital is a necessary evil rather than a hotel service
consumed with the readiness of a stay in a hotel. For practical purposes, because we
8
We also calculated patient satisfaction adjustments to the volume of patient consultations for the same
period which had little effect because the patient experience scores changed little over the period
134
feel that, because a stay in hospital is a route to better health rather than a
consumption good per se, we make the calculation with the volume of treatment
measured by numbers of consultant inpatient spells.
Third, the scores are incorporated into the output index in two ways.
We use
expenditure on food and cleaning to weight the food and cleanliness scores, taking
account of the shares of A&E, outpatients, and inpatients in total hospital costs in
2001/02.9
We cannot, however find from the expenditure data weights appropriate
to non-clinical experience. We therefore present results making the assumption that
either 5% (case A) or 10% (case B) of total expenditure by hospital trusts and that
these same proportions apply to the three activities covered. It can be doubted whether
any form of accounting would identify the proportions since the score includes
measures of politeness and courtesy which cost nothing.
Table 5.15 reports an illustrative calculation of growth rates in the satisfaction scores,
and the growth in the total volume of quality taking account of the numbers
experiencing these different aspects of care in the different sectors.
9
These weights overstate the importance of the cost of providing the “hotel service” components of
cleanliness and food quality since they also have medical consequences. But since we do not have data
on the consequences of medical treatment for quality of life, we are not in fact double-counting. In any
case, since the cost weights are small, double-counting is unlikely to be a major source of error.
135
Table 5.15 Illustrative indicators of patient satisfaction with hospital services,
average annual growth rates, 2001/02 to 2003/04
A&E
Overall
Outpatients Inpatients Change
Food
Cleanliness
1.02%
Superficial
Attention
0.70%
Quality Change (A)
Quality Change (B)
Volume Change 4.59%
Total Change (A)
Total Change (B)
Weight
4.47%
0.64%
0.64%
-2.27%
-0.63%
-1.05%
-0.10%
0.56%
5.05%
5.12%
0.40%
0.14%
0.25%
5.08%
5.23%
5.49%
24.21%
Weight in
overall Trust
Budget
A
B
0.85% 0.85%
1.44%
1.44%
5.00% 10.00%
7.29%
12.29%
71.32%
Notes:
1.
2.
3.
4.
The changes in the indicators of food, cleanliness and non-clinical care are the changes in the
logarithms of the relevant variables. This is also true of the changes shown in tables A1 to A4.
The changes in each indicator for each category of treatment are weighted together using the
weights in the last row so as to give the overall change in each quality attribute. The food
indicator is used as it stands because all expenditure on food is associated with inpatients.
The overall quality indicators for each attributed are then weighted using weights (A) or (B)
shown in the last two columns. To give the overall quality changes (A) and (B). These growth
rates are then combined with the overall volume change (calculated as the weighted sum of the
changes in the individual volumes) to give the total change using weights (A) and (B).
The quality changes for food, cleanliness and non-clinical care are calculated for whatever
period the data happen to be available. The changes in volume relate to 2003/04 over 2001/02.
The overall impact on the overall hospital output index of these adjustments is shown
in Table 5.16. The effect of the quality terms is further damped because total hospital
output includes the “other” activities in addition to inpatients, outpatients and accident
and emergency. We have no quality data on these. But in any case, with the
MRSA/Readmissions and quality indices both growing at rates not very different from
the overall quality-adjusted CWOI, it is not surprising that these effects have little
influence on the overall total.
136
Table 5.16 Illustrative calculations of hospital CWOI with adjustments for
survival, waiting times, patient satisfaction as measured in patient surveys,
readmissions and MRSA. Average annual growth rates 2001/02 to 2003/04
Average Growth Rates 2001/2 to 2003/4
Unadjusted CWOI
Quality Variant 1
With Adjustment for Patient Satisfaction (5% weight on non-clinical
care satisfaction)
With adjustment for MRSA, Readmissions and Patient Satisfaction
(5% weight on non-clinical care satisfaction)
With Adjustment for Patient Satisfaction (10% weight on non-clinical
care satisfaction)
With adjustment for MRSA, Readmissions and Patient Quality and
Satisfaction (10% weight on non-clinical care satisfaction)
% p.a.
4.34%
5.74%
5.71%
5.71%
5.69%
5.69%
Note: these figures refer to the broader definition of the hospital sector; see Table 9.2 and preceding
discussion.
5.7
Conclusions
This section has implemented a number of the quality adjustments to hospital sector
output based on the methods developed in section 4 to make use of currently available
data. We found that
•
the pure survival adjustment raises the average annual growth rate of the
hospital sector between 1999/00 and 2003/04 from 2.78% to 2.99% when
survival was measured at 30 days.
•
the survival effect has a smaller effect when calculated using in-hospital
survival (average growth 2.91%).
•
combining the survival adjustment with an assumed uniform proportional
health effect further increase the average growth rate by around 0.4% to 1.0%
depending on the assumed value of health effect and the mortality rate cut off
criteria used to reduce the volatility of the index.
•
adding a life expectancy adjustment to the survival and health effect
adjustment had little additional effect on the growth rate.
•
combining survival, assumed health effect, waiting time and life expectancy
adjustments produced estimated growth rates which were very similar to
estimates with only survival and assumed health effect adjustments.
•
the effect of the waiting time adjustment was insensitive to very large
137
variations in discount rates on waiting times, to the use of individual rather
than HRG level data, to the form of the adjustment, and to the measure of
waiting time (mean wait or 80th percentile wait).
•
the small waiting time effects are due to the small changes in waiting times
over the period rather than to the form of the waiting time adjustment and the
particular parameter values used.
•
a crude illustrative adjustment readmissions and MRSA (all that is possible
with current data) in addition to the survival, assumed health effect, life
expectancy and waiting times adjustments, had no perceptible effect on the
average annual growth rate (2001/2 to 2003/4).
•
a similarly crude illustrative adjustment for patient satisfaction with food,
cleanliness and non-clinical care reduced the growth rate very slightly, (2001/2
to 2003/4) by less than 0.1 percentage point.
6 Specimen output index
6.1
Introduction
In Section 2.4 we set out our preferred index of NHS output, a value weighted output
index. It is not possible to estimate a comprehensive value weighted index because of
a lack of data on the most important characteristic: improvement in health for patients
who survive treatment. In Sections 2.7 and 4.6 we examined the stringent assumptions
necessary to justify using cost weights in the output index. In Section 4.8.2 we looked
at the sensitivity of the output index to an estimate of health gain based on the
assumption that health gain was constant across all activities and did not vary over
time.
In this section we examine the implications of having better health data which permit
the calculation of our preferred value weighted output index. We have identified a
few HRGs where data exist on health outcomes (Appendix C). The data are similar to
what would be produced by sampling patients before and after treatment and are used
138
to drop the restrictive assumption that health gain is constant across activities. While
the conditions for which we have outcomes data are not representative of all NHS
activities, we are able to compare a number of “specimen” indices with the equivalent
cost weighted output index for the same sub-set of conditions.
We use the health data to examine:
•
A value weighted output index assigning monetary weights to improvements
in health and reductions in waiting times
•
The impact on an index of substituting cost weights with value weights
•
The effect on health effect adjusted cost weighted indices of allowing health
gain to vary by treatment
In the next subsection we describe the data used in the construction of the specimen
index. We then estimate the following indices:
•
A Cost Weighted Output Index (CWOI)
•
CWOI with a short-term survival adjustment
•
CWOI incorporating health adjustment
•
CWOI incorporating health and waiting times adjustment
•
A health outcome weighted output index (HOWOI)
•
A HOWOI incorporating waiting times adjustment
•
A Value Weighted Output Index (VWOI) where health and waiting times are
treated as characteristics
We also explore the sensitivity of results to
•
In-hospital versus 30-day survival rates
•
The measurement of waiting times
•
Discount rate
•
Monetary value of a QALY
•
Monetary value applied to a day spent waiting
139
6.2
Data
The specimen index comprises the HRGs listed in table 6.1 below. For all of these
HRGs, data on the health outcomes before and after treatment were available, either
from clinical trials that employed the EQ5D or our analysis of SF36 from BUPA and
York District Trust. The data derived from these two instruments are converted to a
common scale. These data are described in greater detail in Appendix C.
Table 6.1 Before and after health outcomes
Health
outcome
h 0j
h*j
A07
B02
B03
C14
C22
C24
E04
E12
E15
E35
F73
F74
G01
G13
G14
H02
H04
H23
H24
H26
0.41
0.73
0.70
0.87
0.83
0.77
0.50
0.68
0.54
0.63
0.64
0.74
0.53
0.63
0.68
0.37
0.35
0.77
0.72
0.41
0.57
0.76
0.72
0.95
0.91
0.93
0.73
0.72
0.79
0.69
0.69
0.81
0.59
0.66
0.81
0.62
0.54
0.84
0.74
0.53
J01
L32
M07
M09
P18
Q11
R02
R03
R09
0.93
0.81
0.70
0.72
0.36
0.77
0.37
0.36
0.32
0.96
0.85
0.80
0.83
0.41
1
0.67
0.62
0.60
HRG description
Source
HRG
Intermediate Pain Procedures
Phakoemulsification Cataract Extraction with Lens Implant
Other Cataract Extraction with Lens Implant
Mouth or Throat Procedures - Category 2
Nose Procedures - Category 3
Mouth or Throat Procedures - Category 3
Coronary Bypass
Acute Myocardial Infarction w/o cc
Percutaneous Transluminal Coronary Angioplasty (PTCA)
Chest Pain >69 or w cc
Inguinal Umbilical or Femoral Hernia Repairs >69 or w cc
Inguinal Umbilical or Femoral Hernia Repairs <70 w/o cc
Liver Transplant
Biliary Tract - Major Procedures >69 or w cc
Biliary Tract - Major Procedures <70 w/o cc
Primary Hip Replacement
Primary Knee Replacement
Soft Tissue Disorders >69 or w cc
Soft Tissue Disorders <70 w/o cc
Inflammatory Spine, Joint or Connective Tissue Disorders <70 w/o
cc
Complex Breast Reconstruction using Flaps
Non-Malignant Prostate Disorders
Upper Genital Tract Major Procedures
Threatened or Spontaneous Abortion
Psychiatric Disorders
Varicose Vein Procedures
Surgery for Degenerative Spinal Disorders
Spinal Fusion or Decompression Excluding Trauma
Revisional Spinal Procedures
BUPA
BUPA
BUPA
BUPA
BUPA
BUPA
BUPA
EQ5D
BUPA
EQ5D
BUPA
BUPA
EQ5D
BUPA
BUPA
BUPA
York
BUPA
BUPA
EQ5D
BUPA
EQ5D
BUPA
BUPA
EQ5D
EQ5D
BUPA
BUPA
BUPA
140
Annual data from 1998/99 to 2003/04 on the following are used in the construction of
the indices:
•
Activity, measured as continuous inpatient provider spells (CIPS), derived
from HES.
•
In-hospital and 30-day survival rates, derived from HES.
•
Waiting times, measured as the mean waiting time and the wait at the 80th
percentile, derived from HES.
•
Life expectancy, derived from life tables and estimated according to the
average age of those in each HRG.
Raw data for each of the variables used in the indices are provided for each year from
1998/99 to 2003/04. To give an intuitive sense of the change in these data over time,
and hence what the various indices will be capturing, some of the data are presented
in the following tables and figures.
Table 6.2 provides elective and non-elective activity, measured as CIPS, for each
year. Figure 6.1 shows the amount of activity in each HRG in 1998/99, when the
series begins, and 2003/04, when the series ends. Figures (a) and (b) show the amount
of elective CIPS for, respectively, low and high volume HRGs. Figures (c) and (d)
provide similar information for non-elective CIPS. For the majority of HRGs, the
number of CIPS in 2003/04 (the darker bars) is greater than the number in 1998/99.
All else equal, this would be expected to translate into a positive change in the index
over the full period.
141
Table 6.2 Activity by year
Elective Activity
HRG
A07
B02
B03
C14
C22
C24
E04
E12
E15
E35
F73
F74
G01
G13
G14
H02
H04
H23
H24
H26
J01
L32
M07
M09
P18
Q11
R02
R03
R09
Non-elective Activity
1998/99
85,645
155,929
32,040
157,225
39,451
128,004
15,215
330
10,555
446
22,300
57,034
75
6,935
24,491
34,122
27,741
896
2,547
14,060
1,833
2,820
67,938
4,467
3,919
51,872
8,921
6,249
951
1999/00
86,770
177,551
20,644
152,462
37,603
114,666
14,618
212
11,364
412
21,432
54,838
99
6,825
24,707
34,355
28,730
1,153
3,121
14,281
2,020
2,603
62,433
4,864
3,378
45,659
8,368
6,029
924
2000/01
88,495
212,176
13,542
143,192
37,511
101,809
14,860
228
13,178
506
21,886
55,927
57
7,439
26,169
36,100
31,685
1,364
3,523
13,495
2,481
2,497
58,150
4,913
2,862
43,145
8,394
6,032
940
2001/02
93,227
224,247
10,648
140,591
31,858
100,901
15,046
187
15,439
482
21,348
54,517
88
7,600
27,636
37,530
34,392
1,189
3,014
15,591
2,655
2,184
55,138
5,039
3,308
40,306
8,150
6,329
931
2002/03
99,493
250,377
7,755
144,338
33,653
106,716
16,280
328
17,625
534
22,918
58,334
112
8,318
31,026
41,630
41,037
1,487
3,330
21,146
3,095
2,382
53,764
5,629
3,816
43,846
8,808
7,015
1,142
2003/04
100,640
282,486
5,602
145,078
33,026
101,381
15,132
332
21,490
530
24,405
59,413
141
8,910
33,125
46,126
48,916
1,637
3,807
23,992
3,337
2,219
52,315
5,998
3,439
41,156
9,161
7,987
1,135
1998/99
1,444
386
172
6,648
7,639
6,652
2,267
67,422
4,672
32,875
3,322
2,995
324
1,406
1,879
1,262
166
8,537
11,156
7,871
21
2,881
11,069
53,023
219
158
1,867
1,022
163
1999/00
1,048
392
110
6,665
7,680
7,006
1,141
58,969
4,918
34,483
3,189
2,884
386
1,417
1,962
1,168
159
8,876
11,877
7,735
13
2,975
10,239
55,123
206
150
1,589
870
168
2000/01
780
458
101
6,523
6,899
7,003
1,133
57,130
5,066
40,355
3,105
2,847
302
1,497
2,196
1,154
182
10,066
12,887
7,435
30
2,652
10,225
56,158
231
161
1,648
874
158
2001/02
679
574
66
6,134
6,773
6,936
950
55,455
4,995
42,724
2,957
2,782
327
1,534
2,264
1,149
218
10,562
13,136
7,214
22
2,692
9,736
59,124
192
150
1,457
792
133
2002/03
679
608
34
6,395
6,919
7,121
1,935
63,691
8,327
47,025
3,097
2,864
343
1,711
2,822
1,297
205
11,793
13,740
7,980
31
3,032
9,742
63,393
136
104
1,608
901
182
2003/04
836
583
42
6,557
6,542
7,622
2,074
63,900
10,488
51,843
3,217
3,070
298
1,824
3,157
1,157
255
12,911
14,527
8,076
14
3,355
9,889
64,311
192
128
1,663
1,044
195
33,242
32,487
32,847
33,089
35,722
37,342
8,259
8,048
8,250
8,335
9,232
9,647
Average
Figure 6.1 Number of elective and non-elective CIPS by HRG, 1998/99 and
2003/04
(a) Low volume elective HRGs
(b) High volume elective HRGs
Activity - high volume elective HRGs
Activity - low volume elective HRGs
10000
300000
9000
1998/99
8000
250000
1998/99
2003/04
7000
2003/04
200000
6000
5000
150000
4000
100000
3000
2000
50000
1000
0
0
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
R02
(c) Low volume non-elective HRGs
E15 H26 E04 F73 G14 H04 B03 H02 C22 Q11 F74 M07 A07 C24 B02 C14
(d) High volume non-elective HRGs
Activity - high volume non-elective HRGs
Activity - low volume non-elective HRGs
80000
2000
1800
70000
1998/99
1600
2003/04
60000
1998/99
2003/04
1400
50000
1200
1000
40000
800
30000
600
20000
400
10000
200
0
0
J01
Q11
R09
H04
B03
P18
G01
B02
R03
H02
G13
A07
R02
G14 E04 L32 F74
F73 E15 C14 C24 C22 H26 H23 M07 H24 E35 M09 E12
142
There are two available measures of the rates of survival for each HRG – in-hospital
survival and 30-day survival. Data on survival are provided in Table 6.3. The average
survival rate among electives was upwards of 99% and around 96% for non-electives.
Table 6.3 30-day and in-hospital survival rates, by year
Elective survival rate
30 days
inhospital
HRG
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
A07
B02
B03
C14
C22
C24
E04
E12
E15
E35
F73
F74
G01
G13
G14
H02
H04
H23
H24
H26
J01
L32
M07
M09
P18
Q11
R02
R03
R09
99.83%
99.51%
99.52%
99.94%
99.93%
99.88%
98.07%
72.49%
99.52%
97.32%
99.47%
99.92%
94.52%
98.77%
99.94%
99.22%
99.30%
97.99%
99.96%
99.87%
100.00%
99.65%
99.82%
100.00%
99.77%
99.95%
99.88%
99.40%
99.68%
99.87%
99.57%
99.56%
99.95%
99.92%
99.87%
98.18%
74.59%
99.69%
98.04%
99.53%
99.95%
93.00%
99.13%
99.93%
99.21%
99.34%
98.78%
99.94%
99.88%
99.85%
99.50%
99.83%
99.98%
99.82%
99.95%
99.81%
99.50%
100.00%
99.85%
99.62%
99.59%
99.94%
99.94%
99.88%
98.27%
75.90%
99.59%
97.46%
99.55%
99.95%
84.75%
98.73%
99.92%
99.27%
99.26%
99.34%
99.91%
99.90%
99.96%
99.48%
99.81%
99.96%
99.79%
99.94%
99.81%
99.58%
99.89%
99.84%
99.67%
99.63%
99.93%
99.95%
99.90%
98.21%
72.68%
99.63%
98.96%
99.56%
99.93%
90.59%
99.13%
99.94%
99.31%
99.41%
99.58%
99.83%
99.94%
99.85%
99.50%
99.79%
100.00%
99.79%
99.95%
99.81%
99.60%
99.46%
99.87%
99.66%
99.68%
99.94%
99.92%
99.90%
98.19%
79.67%
99.67%
97.60%
99.61%
99.94%
92.79%
99.15%
99.94%
99.43%
99.36%
99.20%
99.88%
99.95%
99.94%
99.58%
99.81%
99.98%
99.84%
99.95%
99.79%
99.46%
99.65%
99.89%
99.69%
99.77%
99.95%
99.94%
99.91%
98.66%
87.42%
99.65%
98.91%
99.63%
99.95%
94.24%
99.20%
99.95%
99.33%
99.50%
99.33%
99.82%
99.93%
99.94%
99.73%
99.82%
99.98%
99.83%
99.94%
99.82%
99.64%
99.64%
100.00%
100.00%
99.99%
99.99%
99.99%
99.97%
98.28%
73.96%
99.77%
99.33%
99.87%
100.00%
94.52%
99.09%
99.98%
99.55%
99.61%
99.33%
99.96%
99.94%
100.00%
99.89%
99.90%
100.00%
99.82%
100.00%
99.93%
99.60%
99.68%
100.00%
99.99%
100.00%
99.99%
99.99%
99.96%
98.41%
77.70%
99.83%
99.26%
99.87%
99.99%
93.00%
99.37%
99.98%
99.54%
99.64%
99.39%
100.00%
99.96%
99.90%
99.81%
99.89%
100.00%
99.85%
100.00%
99.88%
99.62%
100.00%
100.00%
100.00%
99.97%
100.00%
99.99%
99.96%
98.52%
77.19%
99.79%
98.44%
99.88%
100.00%
84.75%
99.01%
99.98%
99.53%
99.56%
99.49%
100.00%
99.94%
100.00%
99.84%
99.87%
100.00%
99.82%
100.00%
99.89%
99.75%
100.00%
100.00%
100.00%
100.00%
99.99%
99.99%
99.97%
98.53%
75.26%
99.82%
99.38%
99.86%
99.99%
90.59%
99.26%
99.98%
99.54%
99.62%
99.92%
99.93%
99.99%
99.92%
99.91%
99.86%
100.00%
99.82%
100.00%
99.89%
99.73%
99.67%
100.00%
100.00%
100.00%
100.00%
99.99%
99.97%
98.44%
82.17%
99.82%
99.08%
99.88%
100.00%
92.79%
99.41%
99.98%
99.66%
99.65%
99.66%
99.97%
99.98%
99.94%
99.83%
99.89%
99.98%
99.87%
99.99%
99.90%
99.80%
99.74%
100.00%
100.00%
100.00%
100.00%
100.00%
99.97%
98.86%
88.23%
99.84%
99.27%
99.90%
100.00%
94.24%
99.37%
99.99%
99.59%
99.67%
99.70%
99.95%
99.98%
99.97%
99.91%
99.88%
100.00%
99.94%
100.00%
99.90%
99.75%
99.91%
99.71%
99.74%
99.73%
99.75%
99.75%
99.77%
99.90%
99.91%
99.90%
99.91%
99.91%
99.92%
Activity weighted
average
Non-elective survival rate
30 days
inhospital
HRG
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
A07
B02
B03
C14
C22
C24
E04
E12
E15
E35
F73
F74
G01
G13
G14
H02
H04
H23
H24
H26
J01
L32
M07
M09
P18
Q11
R02
R03
R09
98.66%
98.04%
98.88%
99.46%
98.86%
97.59%
93.52%
84.34%
96.52%
97.44%
95.65%
99.77%
84.85%
92.21%
99.31%
80.81%
96.25%
97.31%
99.73%
99.33%
100.00%
97.76%
99.48%
99.99%
99.54%
98.09%
99.11%
93.14%
99.42%
98.53%
98.79%
98.21%
99.26%
98.82%
97.41%
92.79%
84.18%
96.36%
97.29%
95.00%
99.76%
90.82%
90.20%
99.69%
78.26%
94.94%
97.06%
99.72%
99.39%
100.00%
98.22%
99.43%
99.99%
99.52%
100.00%
98.56%
93.52%
98.84%
98.65%
99.16%
98.15%
99.57%
98.83%
97.81%
94.41%
85.16%
97.21%
97.56%
95.82%
99.93%
91.91%
92.07%
99.59%
76.15%
96.65%
97.49%
99.63%
99.33%
96.55%
97.63%
99.44%
99.99%
100.00%
99.38%
99.24%
91.88%
99.39%
99.02%
98.68%
97.10%
99.47%
98.68%
97.41%
93.74%
85.43%
97.48%
97.43%
95.44%
99.86%
90.35%
92.23%
99.47%
78.98%
94.05%
97.44%
99.62%
99.38%
100.00%
98.04%
99.65%
99.99%
100.00%
100.00%
98.66%
91.89%
100.00%
99.01%
98.10%
100.00%
99.50%
98.89%
97.92%
95.51%
87.74%
97.86%
97.97%
95.55%
99.76%
91.46%
93.52%
99.68%
78.28%
96.14%
97.71%
99.77%
99.57%
100.00%
98.35%
99.63%
100.00%
100.00%
99.04%
98.80%
95.02%
100.00%
99.08%
98.49%
100.00%
99.42%
99.19%
97.97%
96.50%
88.76%
98.09%
98.14%
95.69%
99.77%
92.52%
92.61%
99.78%
82.52%
95.54%
97.68%
99.82%
99.54%
100.00%
98.43%
99.59%
99.99%
97.93%
100.00%
99.32%
94.80%
98.52%
99.40%
98.53%
100.00%
99.71%
99.63%
98.35%
94.11%
85.69%
97.09%
98.62%
96.62%
99.93%
85.15%
93.42%
99.47%
83.70%
98.13%
98.51%
99.87%
99.50%
100.00%
98.51%
99.62%
100.00%
99.54%
99.36%
99.37%
95.86%
99.42%
99.35%
99.51%
100.00%
99.54%
99.51%
98.06%
93.64%
85.71%
96.96%
98.36%
96.32%
99.86%
91.07%
91.26%
99.80%
80.55%
96.84%
98.26%
99.83%
99.52%
100.00%
99.36%
99.56%
100.00%
99.52%
100.00%
99.16%
96.06%
99.42%
99.63%
99.37%
98.15%
99.71%
99.41%
98.42%
94.96%
86.55%
97.69%
98.61%
96.76%
99.93%
92.56%
92.75%
99.73%
80.38%
96.65%
98.40%
99.81%
99.47%
96.55%
98.49%
99.63%
100.00%
100.00%
100.00%
99.48%
95.58%
100.00%
99.30%
99.83%
97.10%
99.66%
99.35%
97.98%
94.22%
86.76%
97.95%
98.42%
96.05%
99.93%
90.35%
93.07%
99.51%
82.16%
94.93%
98.51%
99.82%
99.53%
100.00%
98.81%
99.75%
99.99%
100.00%
100.00%
98.91%
95.33%
100.00%
99.29%
98.89%
100.00%
99.75%
99.51%
98.42%
95.90%
88.77%
98.29%
98.77%
96.46%
99.90%
91.46%
93.99%
99.79%
80.47%
96.62%
98.61%
99.88%
99.67%
100.00%
98.75%
99.71%
100.00%
100.00%
99.04%
99.10%
95.83%
100.00%
99.66%
99.50%
100.00%
99.53%
99.57%
98.50%
96.62%
89.64%
98.47%
98.87%
96.47%
99.94%
92.52%
92.72%
99.81%
85.56%
96.75%
98.63%
99.90%
99.69%
100.00%
99.02%
99.66%
100.00%
97.93%
100.00%
99.54%
96.05%
99.01%
94.52%
94.85%
95.37%
95.54%
96.12%
96.52%
95.26%
95.60%
96.06%
96.19%
96.65%
96.99%
Activity weighted
average
143
Figure 6.2 shows the change in survival rates for elective CIPS between 1998/99 and
2003/04 for these alternative measures for low and high volume HRGs. Similar
information is provided for non-elective CIPS in Figure 6.3.
In general, survival rates have improved for both elective and non-elective patients.
The most dramatic improvement in survival has been for non-elective acute
myocardial infarction (E12), where the probability of 30-day survival increased from
85.69% in 1998/99 to 89.64% in 2003/04. As AMI is also a high volume HRG, this
improvement would be expected to exert a high degree of leverage on the value of an
index that included survival.
In-hospital and 30-day survival rates map each other fairly closely, as comparison of
figures (a) and (c) and of figures (b) and (d) show. Consequently, it would not be
expected that the index would be particularly sensitive to which measure is adopted.
Figure 6.2 Change in elective survival rates, 1998/99 – 2003/04
(a) In-hospital, low vol elec HRGs
(b) In-hospital, high vol elec HRGs
Proportionate change in in-hospital survival - low volume elective HRGs
Proportionate change in in-hospital survival - high volume elective HRGs
1.2500
1.0060
1.0050
1.2000
1.0040
1.1500
1.0030
1.0020
1.1000
1.0010
1.0500
1.0000
0.9990
1.0000
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
R02
E15 H26 E04 F73 G14 H04 B03 H02 C22 Q11 F74 M07 A07 C24 B02 C14
0.9980
0.9500
0.9970
0.9960
0.9000
(c) 30 day, low vol elec HRGs
(d) 30 day, high vol elec HRGs
Proportionate change in 30-day survival - low volume elective HRGs
Proportionate change in 30-day survival - high volume elective HRGs
1.007
1.3
1.25
1.005
1.2
1.003
1.15
1.001
1.1
1.05
0.999
E15 H26 E04 F73 G14 H04 B03 H02 C22 Q11 F74 M07 A07 C24 B02 C14
1
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
R02
0.997
0.95
0.9
0.995
144
Figure 6.3 Change in non-elective survival rates, 1998/99 – 2003/04
(a) In-hospital, low vol non-elec HRGs
(b) In-hospital, high vol non-elec HRGs
Proportionate change in in-hospital survival - low volume non-elective HRGs
Proportionate change in in-hospital survival - high volume non-elective
HRGs
1.1
1.06
1.08
1.05
1.06
1.04
1.04
1.03
1.02
1.02
1.01
1
J01
Q11
R09 H04
B03
P18
G01
B02
R03
H02
G13
A07
R02
0.98
1
0.96
0.99
0.94
0.98
0.92
0.97
G14 E04 L32 F74 F73 E15 C14 C24 C22 H26 H23 M07 H24 E35 M09 E12
(c) 30 day, low vol non-elec HRGs
(d) 30 day, high vol non-elec HRGs
Proportionate change in 30 days survival - low volume non-elective HRGs
Proportionate change in 30 days survival - high volume non-elective HRGs
1.12
1.08
1.1
1.06
1.08
1.06
1.04
1.04
1.02
1.02
1
0.98
0.96
J01
Q11
R09
H04
B03
P18
G01
B02
R03
H02
G13
A07
R02
1
G14 E04 L32 F74 F73 E15 C14 C24 C22 H26 H23 M07 H24 E35 M09 E12
0.98
0.94
0.92
0.96
As discussed in Appendix B, there are a variety of ways of summarising how long
people have to wait for admission to hospital. In the specimen index we compare two
summary measures: the mean waiting time and waiting time at the 80th percentile of
the distribution. Raw data for the waiting time at the mean and 80% percentile are
provided in Table 6.4. On average, the mean wait fell from 163 days to 134 days over
the period, while the wait at the 80% percentile fell from 262 days to 213 days.
145
Table 6.4 Mean and 80% percentile waiting time in days, by year
Waiting time
HRG
A07
B02
B03
C14
C22
C24
E04
E12
E15
E35
F73
F74
G01
G13
G14
H02
H04
H23
H24
H26
J01
L32
M07
M09
P18
Q11
R02
R03
R09
Activity weighted average
1998/99
73
228
236
118
205
144
195
68
75
37
178
174
14
147
162
236
285
58
76
28
110
72
109
13
31
251
110
165
136
1999/00
66
204
209
97
184
126
199
40
72
37
159
151
13
147
157
238
285
59
76
26
109
65
106
12
19
225
107
162
130
163
148
mean wait
2000/01 2001/02
68
68
193
184
197
185
82
75
177
167
109
151
215
189
66
60
83
84
40
51
161
163
146
147
11
13
151
161
163
169
250
253
294
294
56
48
64
66
27
28
101
119
68
69
99
97
11
13
21
40
214
211
120
122
180
184
140
143
144
145
2002/03
73
180
165
81
183
128
154
95
89
77
161
148
19
161
169
247
282
67
68
30
125
79
100
11
22
216
127
179
140
2003/04
77
153
161
90
173
115
106
88
92
84
144
139
23
145
154
225
252
93
77
35
122
70
96
11
18
196
119
171
130
1998/99
100
357
371
185
356
245
350
68
112
37
316
303
14
252
277
374
437
65
96
28
168
86
173
13
31
408
173
304
227
1999/00
92
323
332
143
316
204
350
40
104
37
273
250
24
241
262
383
444
59
91
26
170
86
164
12
19
371
162
286
216
144
134
262
235
80% percentile wait
2000/01 2001/02
96
101
310
301
329
305
117
106
309
288
167
262
373
342
66
60
128
126
40
60
271
280
235
237
17
13
265
286
279
299
402
404
458
445
56
48
72
83
27
28
152
209
91
90
150
147
11
13
26
40
354
352
173
196
318
331
209
234
229
2002/03
106
295
293
118
319
211
256
102
140
83
280
246
19
290
302
378
406
71
70
33
268
122
155
11
23
348
228
330
258
2003/04
112
248
261
142
294
194
175
91
152
94
245
232
23
257
263
340
354
102
79
43
254
95
153
11
27
304
219
301
247
231
213
235
Figure 6.4 shows the change between 1998/99 and 2003/04 in the mean waiting time
(Figure 6.4 (a) and (b)) and waiting time at the 80th percentile of the distribution
(Figure 6.4 (c) and (d)). Although waiting times increased between 1998/99 and
2003/04 for some HRGs, these tend to be low volume activities. For the majority of
high volume HRGs, waiting times fell. It would be expected that the net effect,
therefore, of including waiting times in the specimen index would be an increase in
the index over the period. Figures (a) and (c) and figures (b) and (d) are very similar,
suggesting that the choice between mean and 80th percentile as a summary of waiting
time is unlikely to have a dramatic effect on the index.
146
Figure 6.4 Change in waiting time, 1998/99 – 2003/04
(a) Mean wait, low volume HRGs
(b) Mean wait, high volume HRGs
Proportionate change in mean waiting time - high volume HRGs
Proportionate change in mean waiting time - low volume HRGs
3
1.6
2.5
1.4
2
1.2
1.5
1
1
0.8
E15
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
H26
E04 F73 G14 H04 B03
H02 C22 Q11
F74 M07 A07 C24 B02 C14
R02
0.5
0.6
0
0.4
(c) 80th perc wait, low volume HRGs
(d) 80th perc wait, high volume HRGs
Proportionate change in 80th percentile wait - high volume HRGs
Proportionate change 80th percentile wait - low volume HRGs
3
1.6
2.5
1.4
2
1.2
1.5
1
1
0.8
E15 H26 E04 F73 G14 H04 B03 H02 C22 Q11 F74 M07 A07 C24 B02 C14
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
R02
0.5
0.6
0
0.4
As noted in section 2.7, the use of cost weights presumes an efficient allocation of
NHS resources. If efficient allocation cannot be assumed, an alternative basis for
establishing the relative value of activity would be according to the health outcomes
each produces. The before, h 0j , and after, h*j , measures of the health effect for each of
the HRGs included in the specimen index are provided in Table 6.1 while unit costs,
calculated on the basis of CIPS, are presented in Table 6.5. Where relative costs are
not proportionate to relative health outcomes, the assumption of efficient allocation is
questionable.
147
Table 6.5 Unit costs based on CIPS, by year
Elective unit cost
HRG
A07
B02
B03
C14
C22
C24
E04
E12
E15
E35
F73
F74
G01
G13
G14
H02
H04
H23
H24
H26
J01
L32
M07
M09
P18
Q11
R02
R03
R09
Activity weighted
average
Non-elective unit cost
1998/99
£384
£628
£661
£465
£717
£639
£5,024
£995
£2,385
£812
£898
£667
£11,595
£1,756
£1,304
£3,965
£4,454
£713
£663
£997
£3,027
£481
£1,839
£282
£641
£676
£2,311
£3,407
£2,497
1999/00
£384
£628
£661
£465
£718
£641
£5,121
£1,224
£2,388
£838
£901
£667
£11,839
£1,780
£1,306
£3,993
£4,471
£709
£660
£1,003
£3,032
£491
£1,845
£282
£644
£677
£2,332
£3,403
£2,498
2000/01
£395
£624
£637
£491
£741
£679
£5,654
£1,588
£2,421
£803
£991
£731
£14,149
£1,857
£1,358
£4,284
£4,661
£734
£625
£1,051
£3,346
£526
£1,912
£305
£732
£727
£2,569
£3,727
£3,061
2001/02
£422
£673
£737
£550
£849
£769
£6,480
£1,580
£2,457
£1,086
£1,093
£810
£18,505
£2,055
£1,500
£4,442
£4,859
£861
£614
£978
£3,735
£628
£2,109
£349
£784
£837
£2,706
£3,893
£3,178
2002/03
£466
£682
£734
£574
£914
£807
£6,507
£1,745
£2,815
£992
£1,148
£873
£18,961
£2,117
£1,564
£4,763
£5,294
£851
£639
£907
£3,965
£653
£2,298
£405
£783
£894
£2,899
£4,184
£3,641
2003/04
£466
£682
£736
£574
£913
£806
£6,388
£1,687
£2,820
£942
£1,149
£872
£19,020
£2,100
£1,560
£4,758
£5,278
£853
£635
£902
£3,971
£646
£2,294
£405
£783
£894
£2,896
£4,189
£3,649
1998/99
£1,705
£1,094
£1,299
£778
£979
£948
£5,264
£1,182
£2,587
£855
£1,496
£1,042
£14,569
£3,073
£1,947
£4,039
£4,655
£910
£654
£1,299
£2,947
£1,136
£1,858
£325
£778
£1,373
£2,854
£4,813
£3,007
1999/00
£1,746
£1,096
£1,285
£785
£988
£984
£5,426
£1,352
£2,622
£912
£1,521
£1,047
£14,612
£3,202
£1,961
£4,172
£4,772
£966
£667
£1,352
£3,057
£1,217
£1,865
£325
£774
£1,397
£2,909
£4,995
£3,075
2000/01
£747
£1,116
£1,294
£850
£1,171
£1,146
£5,794
£1,484
£2,824
£915
£1,795
£1,126
£18,696
£3,310
£2,081
£4,741
£4,408
£981
£643
£1,448
£3,480
£1,303
£1,818
£351
£589
£1,319
£3,269
£5,200
£3,122
2001/02
£993
£1,087
£1,580
£898
£1,206
£1,256
£6,401
£1,688
£3,040
£970
£1,925
£1,327
£20,229
£3,894
£2,385
£5,191
£4,084
£961
£618
£1,521
£3,218
£1,465
£2,064
£400
£2,664
£1,487
£3,649
£5,764
£3,816
2002/03
£1,050
£994
£1,112
£943
£1,303
£1,251
£6,973
£1,546
£3,241
£893
£1,919
£1,368
£23,179
£3,802
£2,465
£5,658
£5,477
£947
£614
£1,488
£3,340
£1,303
£2,234
£399
£1,562
£1,234
£3,774
£5,961
£4,476
2003/04
£966
£1,000
£1,138
£930
£1,307
£1,233
£6,914
£1,625
£3,227
£868
£1,922
£1,360
£23,034
£3,750
£2,439
£5,669
£5,394
£936
£609
£1,473
£3,340
£1,285
£2,228
£399
£1,451
£1,237
£3,767
£5,954
£4,406
£1,069
£1,078
£1,150
£1,261
£1,355
£1,379
£1,077
£1,103
£1,159
£1,251
£1,274
£1,290
Figure 6.5 and Figure 6.6 show the relative weight given to each HRG if relative
values are based on costs or health outcomes, for elective and non-elective HRGs
respectively. The figures are sub-divided to show low and high volume HRGs
separately and with cost weights calculated on 1998/99 Reference Costs and 2003/04
Reference Costs. The health outcome weights are time invariant.
Many of the high volume HRGs appear relatively more “valuable” if value is based
on health outcome rather than cost (e.g. C24, C22, H26, M09), with the stark
exception of G01 (liver transplantation). All else equal, if a greater proportion of these
activities were undertaken in 2003/04 compared to 1998/99, an index in which
activity is valued according to health outcome would suggest greater output growth
than an index where relative values are based on costs.
148
Figure 6.5 Cost and health outcome weights, elective HRGs
(a) 1998/99 costs, low volume HRGs
(b) 2003/04 costs, low volume HRGs
2003/04 cost and health outcome share - low volume elective HRGs
1998/99 cost and health outcome share - low volume elective HRGs
0.3
0.25
03/04 cost share
0.25
health outcome share
0.2
98/99 cost share
0.2
health outcome share
0.15
0.15
0.1
0.1
0.05
0.05
0
0
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
R02
(c) 1998/99 costs, high volume HRGs
G01
E12
E35
H23
R09
J01
H24
L32
P18
M09
R03
G13
R02
(d) 2003/04 costs, high volume HRGs
1998/99 cost and health outcome share - high volume elective HRGs
2003/04 cost and health outcome share - high volume elective HRGs
0.1
0.1
98/99 cost share
0.09
03/04 cost share
0.09
health outcome share
health outcome share
0.08
0.08
0.07
0.07
0.06
0.06
0.05
0.05
0.04
0.04
0.03
0.03
0.02
0.02
0.01
0.01
0
0
E15 H26 E04 F73 G14 H04 B03 H02 C22 Q11 F74 M07 A07 C24 B02 C14
E15 H26
E04 F73
G14 H04 B03 H02
C22 Q11 F74 M07 A07 C24
B02 C14
Figure 6.6 Cost and health outcome weights, non-elective HRGs
(a) 1998/99 costs, low volume HRGs
(b) 2003/04 costs, low volume HRGs
2003/04 cost and health outcome share - low volume non-elective HRGs
1998/99 cost and health outcome share - low volume non-elective HRGs
0.3000
0.2500
03/04 cost share
98/99 cost share
health outcome share
health outcome share
0.2500
0.2000
0.2000
0.1500
0.1500
0.1000
0.1000
0.0500
0.0500
0.0000
0.0000
J01
Q11
R09
H04
B03
P18
G01
B02
R03
H02
G13
A07
J01
R02
(c) 1998/99 costs, high volume HRGs
R09
H04
B03
P18
G01
B02
R03
H02
G13
A07
R02
(d) 2003/04 costs, high volume HRGs
2003/04 cost and health outcome share - high volume non-elective HRGs
1998/99 cost and health outcome share - high volume non-elective HRGs
0.0900
0.0800
98/99 cost share
0.0700
Q11
0.0800
health outcome share
03/04 cost share
health outcome share
0.0700
0.0600
0.0600
0.0500
0.0500
0.0400
0.0400
0.0300
0.0300
0.0200
0.0200
0.0100
0.0100
0.0000
0.0000
G14 E04 L32 F74 F73 E15 C14 C24 C22 H26 H23 M07 H24 E35 M09 E12
G14 E04 L32 F74 F73 E15 C14 C24 C22 H26 H23 M07 H24 E35 M09 E12
149
In the following sections we compare estimates of output growth under various
specifications of the specimen index with CIPS measuring volume. All estimates are
based on a Laspeyres index.
6.3
Cost weighted output indices
Column (i) of Table 6.6 contains estimates of output change for a cost weighted
output index (CWOI) corresponding to equation (19) of the form:
I ctx =
∑ j x jt +1c jt
∑ j x jt c jt
Table 6.6 Cost weighted output index, with adjustments for survival and health
effects
CWOI
(i)
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
Average
CWOI survival adjustmCWOI health adjustment
In-hospital
30-day
30-day
30-day
30-day
no threshold
threshold<0.90
k=0.8
(ii)
(iii)
(iv)
(v)
(vi)
-1.19%
2.79%
2.18%
9.14%
6.30%
-1.18%
2.88%
2.20%
9.35%
6.42%
-1.17%
2.90%
2.23%
9.39%
6.46%
-1.33%
2.66%
1.78%
4.76%
6.37%
-1.20%
2.98%
2.29%
9.55%
6.67%
-1.19%
3.16%
2.57%
12.37%
7.49%
3.84%
3.93%
3.96%
2.85%
4.06%
4.88%
This unadjusted CWOI suggests an average annual growth in output of 3.84%. There
is annual variation in the estimated amount of growth. In particular, there is a large
increase in the index of 9.14% between 2001/02 and 2002/03. This is driven by an
increase in activity rather than a change in the costs. The average (unweighted for
volume or cost) increase in activity between 2001/02 and 2002/03 was 12%. Figure
6.7 below shows the number of elective and non-elective CIPS in each year for high
volume HRGs.
150
Figure 6.7 Activity change 2001/02-2002/03, high volume HRGs
Activity change 2001/02 - 2002/03, high vol HRGs
300000
2001/02
250000
2002/03
200000
150000
100000
50000
0
C22 H04 H02 Q11 E35 F74 M07 E12 M09 A07 C24 C14 B02
Columns (ii) and (iii) in Table 6.6 present results for a survival adjusted cost weighted
output index, of the form presented in equation (40):
I ctxa =
∑ j x jt +1 (a jt +1 / a jt )c jt
∑ j x jt c jt
Compared to the unadjusted cost weighted output index, there is slight increase in the
estimated output growth when survival is included in index, reflecting the general
improvement in survival over the period. The increase is slight, however, because
survival rates are high for these HRGs (around 97%).
Inclusion of the survival effect increases estimated annual output growth by 0.09% if
in-hospital survival is considered and 0.12% based on 30-day survival rates. Given the
high correlation between these measures, their equivalent influence on the index is
unsurprising. Subsequent estimations employ 30-day survival as a measure of a j .
The figures in columns (iv) and (v) of Table 6.6 show the estimates including
adjustment for before and after health status, corresponding to equation (48):
151
I
xq
ct
*
0
*
0
∑ j x jt +1 (a jt +1h j − h j ) /(a jt h j − h j )c jt
=
∑ j x jt c jt
As discussed in section 4.8.2, in estimating this equation it is necessary to introduce
an arbitrary threshold for HRGs with poor survival rates. Failure to make this
adjustment makes the index disproportionately sensitive to changes in aj for activities
with small or negative (a jt h*j − h 0j ) or (a jt +1h*j − h 0j ) . If survival rates are below a
particular threshold, only the change in survival is taken into account. To illustrate we
estimate the equation without a threshold and with a threshold at the 90% survival
rate.
As can be seen, omitting the threshold has a dramatic effect on the estimates,
particularly in 2001/02-2002/03, where output growth was estimated as greater than
9% but now appears much lower (4.76%). The divergence stems predominantly (but
not exclusively) from non-elective E12, which has a poor survival rate (of 85.69% in
1998/99). The influence of this HRG is felt particularly in the change between
2001/02-2002/03, when activity increased from 55,455 to 63,691. A formulation of
the form (a jt +1h*j − h 0j ) , takes a negative value for E12, with the adjustment being the
ratio of two negative numbers and showing the increase in (ah* − h 0 ) as a reduction.
This pulls the index down dramatically particularly in years where there was a growth
in this activity (2001/02-2002/03) and up in years where this activity declined (e.g.
1999/00-2000/01).
Hence, for HRGs with a survival rate below 90%, the before-and-after health
adjustment is not taken into account. This threshold applies in most years to E12
(AMI) and H02 (non-elective primary hip replacement) and in occasional years to
G01 (liver transplantation).
The set of estimates in column (v) in Table 6.6 show estimates when the threshold is
included. Inclusion of the health effects leads to an average annual increase in the
estimates of output growth of 0.1% compared to the CWOI survival adjusted index.
The final set of figures (column (vi)) in table 6.6 assume that health effects are
constant across treatments, ie where k=0.8. As can be seen this makes a dramatic
152
difference to the estimates of output growth, changing from an average of 4.06%
when k varies by HRG to 4.88% when k is held constant. This sensitivity reflects both
the difference in average values of k for the HRGs included in the specimen index
( k j =0.825) and, more particularly, the substantial variation in kj (standard
deviation=0.14). Of course, it is not possible to speculate about the direction of
estimated output change from relaxing the assumption of a constant value of k when
applied across the full range of NHS activities. However, this analysis does suggest
the impact might be of substantial magnitude.
Table 6.7 presents estimates for a cost weighted output index where waiting times and
life expectancy are taken into account with waiting time is discounted to date placed
on the list, as described in section 4.10.2.
(
−r w
−r L
⎧ ⎡ (1 − e − rw w jt +1 ) (1 − e − rL w jt +1 ) ⎤
e L jt +1 1 − e L jt +1
*
o
o
⎪hj ⎢
−
⎥ + ( a jt +1h j − h j )
rw
rL
rL
⎪ ⎣
⎦
∑ j x jt +1c jt ⎨
− rL w jt
−r L
− rw w jt
− rL w jt
e
1 − e L jt
⎪
) (1 − e
)⎤
*
o ⎡ (1 − e
o
−
⎥ + ( a jt h j − h j )
⎪ hj ⎢
rw
rL
rL
⎣
⎦
⎩
(
I ctxaw =
∑ j x jt c jt
)
) ⎫⎪
⎪
⎬
⎪
⎪
⎭ (105)
Table 6.7 reports the sensitivity of results to:
•
•
•
Mean and 80th percentile waiting times
Discounting life expectancy at 1.5% or 5%
Discounting waiting time at 1.5%, 5% or 10%
The choice between mean wait (top half of Table 6.7) and the wait experienced at the
80% percentile (bottom half of Table 6.7) has little difference on the estimates,
unsurprisingly given their close correlation. Use of the 80th percentile generates
slightly higher estimates of output growth, reflecting the policy concentration on
reducing the waiting times for long waits during this period. In subsequent estimations
the 80% percentile wait is chosen to measure waiting time.
As can be seen, the choice of a discount rate of 10% applied to waiting time has a
significant effect on the estimates.
153
Table 6.7 Cost weighted output index, with adjustment for waiting times
discounted to date placed on list
Mean waiting time
discount rate life expectancy
discount rate waiting time
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
Average
80% percentile waiting time
discount rate life expectancy
discount rate waiting time
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
Average
CWOI health, waiting time and life expectancy adjustment
Waiting discounted to date placed on list
1.50%
5%
1.50%
5%
10%
1.50%
5%
10%
-1.36%
2.40%
1.96%
9.88%
6.58%
-1.35%
2.40%
1.97%
9.89%
6.59%
-1.35%
2.39%
1.97%
9.90%
6.61%
-1.20%
2.65%
2.03%
9.99%
6.75%
-1.19%
2.65%
2.03%
9.99%
6.77%
-1.18%
2.64%
2.04%
10.00%
6.79%
3.89%
3.90%
3.90%
4.04%
4.05%
4.06%
5%
10%
CWOI health, waiting time and life expectancy adjustment
Waiting discounted to date placed on list
1.50%
5%
1.50%
5%
10%
1.50%
-1.34%
2.40%
1.96%
9.92%
6.62%
-1.32%
2.39%
1.96%
9.94%
6.65%
-1.30%
2.39%
1.97%
9.97%
6.70%
-1.15%
2.65%
2.01%
10.08%
6.87%
-1.12%
2.65%
2.01%
10.11%
6.92%
-1.08%
2.65%
2.02%
10.15%
6.98%
3.91%
3.93%
3.95%
4.09%
4.11%
4.14%
An alternative approach to considering waiting times is to discount to date of
treatment and include a charge for waiting, as discussed in section 4.10.2.2, so that the
index becomes:
(
) (
(
I ctxaw
)
r w
⎡ 1 − e − rL L jt +1
e w jt +1 − 1 ⎤
⎢
⎥
−
o
* ⎢
r
r
⎥
L
w
⎛ a jt +1 − h j / h j ⎞ ⎣
x
c
⎜
⎟
∑ j jt +1 jt ⎜ a − ho / h* ⎟ ⎡ 1 − e− rLL jt erww jt − 1 ⎤ ⎦
j
j ⎠
⎝ jt
⎢
⎥
−
rL
rw
⎢
⎥
⎣
⎦
=
∑ j x jt c jt
) (
)
(106)
We estimate this formulation at different discount rates, with results presented in
Table 6.8. This generates higher estimates of output growth than the formulation in
which waits were discounted to date placed on list.
154
Table 6.8 Cost weighted output index, with adjustment for waiting times
discounted to date of treatment
80% percentile waiting time CWOI health, waiting time and life expectancy adjustment
Waiting discounted to date of treatment, with charge for waiting
discount rate life expectancy
1.50%
1.50%
5%
discount rate waiting time
1.50%
1.50%
5%
10%
1.50%
5%
k=0.8
(i)
(ii)
(iii)
(iv)
(v)
(vi)
1998/99 - 1999/00
-1.17%
-1.18%
-1.17%
-1.16%
-0.99%
-0.98%
1999/00 - 2000/01
2.50%
2.32%
2.32%
2.31%
2.58%
2.57%
2000/01 - 2001/02
2.33%
2.05%
2.05%
2.06%
2.11%
2.11%
2001/02 - 2002/03
12.46%
9.65%
9.67%
9.69%
9.81%
9.83%
2002/03 - 2003/04
8.10%
7.29%
7.32%
7.35%
7.52%
7.55%
Average
4.85%
4.03%
4.04%
4.05%
4.20%
4.22%
10%
1.50%
0%
(vii)
-0.97%
2.57%
2.12%
9.86%
7.59%
(viii)
-1.39%
2.40%
2.00%
9.27%
6.54%
4.23%
3.76%
Column (viii) has the results of setting rw = 0 so that there is no adjustment for
waiting, only for life expectancy as in section 4.8.3. Compared to the CWOI with
only a health effects adjustment Table 6.6, column (v) the average growth rate is
reduced by about 0.3%.
6.4
Health outcome weighted output indices
This section presents the results from calculation of health outcomes weighted output
indices (HOWOI). For comparative purposes with the CWOI, we first estimate a
version of HOWOI in which the impact of treatment on life expectancy is ignored. In
effect, this amounts to comparing the use of cost and survival-adjusted before and
after health outcomes (not QALYs) as weights:
*
0
∑ j x jt +1 (a jt +1h j − h j )
*
0
∑ j x jt (a jt h j − h j )
(107)
This equation is estimated both with k varying by HRG (column (ii) Table 6.9) and
for a value of k=0 (column (iii) Table 6.9). As can be seen, output growth appears
lower in this formulation of a HOWOI than the corresponding cost weighted output
index (figures from Table 6.6 reproduced in column (i) of Table 6.9).
As can be seen, substitution of cost for health outcome weights leads to a reduction in
estimated output growth for this group of HRGs. The extent to which an index is
sensitive to the choice of cost and health outcome weights depends on three factors:
•
Whether cost weights are disproportionate to health outcome weights;
155
•
The volume of activity in those HRGs where the relative weights are most
disproportionate;
•
The change in activity over time in those HRGs where the relative weights are
most disproportionate.
For a handful of HRGs, cost weights are greater than health outcome weights. This is
particular evident for G01 liver transplants, which are costly (the non-elective cost
was £23,000 in 2003/04) but their estimated contribution to health outcome is about
average for the sample of HRGs considered here. However, because this is a low
volume HRG and there is little change in the amount of activity over time, the impact
of changing the valuation basis for G01 exerts little influence on the overall index.
In contrast, elective activity categorised to A07 (intermediate pain procedures)
contributes 7.4% of total 2003/04 activity, and elective activity in this HRG grew by
17.5% over the period captured by the index. Its cost share in 2003/04 is only 0.6%
whereas its health outcome share is 4.43%. All else equal, the growth in activity in
this high volume HRG would lead to an index based on health outcome shares having
a higher value than one based on cost shares.
There is little difference between the cost and health outcome weights for B02
(cataract extractions), but they are accorded slightly less weight (-0.06%) when
relative values are based on health outcomes. However, despite this minimal
difference, B02 exerts considerable influence on the overall index, contributing 20.7%
of total volume in 2003/04. There has also been a volume increase of 81% in this
activity over the period captured by the index. Thus, this HRG exerts downward
influences on the index.
156
Table 6.9 Health outcome weighted output index
CWOI
HOWOI
No LE
No LE
k=0.8
discount rate
life expectancy
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
HOWOI
With LE
(i)
-1.19%
2.79%
2.18%
9.14%
6.30%
(ii)
-2.96%
0.16%
1.41%
10.17%
4.62%
(iii)
-1.88%
1.87%
1.04%
9.11%
5.07%
1.50%
(iv)
-4.75%
-3.78%
0.27%
8.43%
1.95%
5%
(v)
-4.06%
-2.20%
0.48%
8.85%
2.73%
3.84%
2.68%
3.04%
0.42%
1.16%
Average
The previous adjustment makes the assumption that the health status snapshots h*j , hoj
measure the discounted sum of QALYs q*j , qoj .
More properly health outcome
weights should incorporate the effect of treatment on life expectancy, so that they
more nearly measure the discounted sum of QALYs:
*
0
∑ j x jt +1 (a jt +1h j − h j )(1 − e
*
0
∑ j x jt (a jt h j − h j )(1 − e
− rL L jt +1
− rL L jt
)
(108)
)
Estimates are presented from this index in columns (iv) and (v) of table 6.9, with life
expectancy discounted at 1.5% and 5%. The impact of including life expectancy is a
substantial reduction in estimated output growth. The reason for this is that life
expectancy declined gradually over the period, the main reason for this probably
being that increasingly older people were receiving treatment, as demonstrated in
column. Table 6.10 provides evidence.
Table 6.10 Average age and life expectancy
Age
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
45.71
45.76
46.17
46.3
46.93
47.9
Life expectancy
25.83
25.34
24.12
23.91
23.59
22.98
The health outcomes weighted output index is also estimated after incorporating
157
waiting times to date placed on list
(
⎧ ⎡
−r w
−r w
⎡ e − rL w jt +1 1 − e − rL L jt +1
⎪ o (1 − e w jt +1 ) (1 − e L jt +1 ) ⎤
o
*
−
∑ j x jt +1 ⎨ h j ⎢
⎥ + ( a jt +1h j − h ) ⎢
r
r
rL
⎢
w
L
⎦
⎪⎩ ⎣
⎣
⎧ ⎡
−r w
−r w
⎡ e − rL w jt 1 − e − rL L jt ⎤ ⎫
⎪ o (1 − e w jt ) (1 − e L jt ) ⎤
*
o
⎥ ⎬⎪
−
∑ j x jt ⎨ h j ⎢
⎥ + ( a jt h j − h ) ⎢
rw
rL
rL
⎢
⎥
⎦
⎪⎩ ⎣
⎣
⎦ ⎪⎭
(
)
) ⎤⎥ ⎫⎪
⎬
⎥⎪
⎦⎭
(109)
and to date of treatment with a charge for waiting:
(
) (
) ⎤⎥
r w
⎡ 1 − e − rL L jt +1
e w jt +1 − 1
−
∑ j x jt +1 ( a h − h ) ⎢⎢ r
rw
L
⎣
r w
⎡ 1 − e − rL L jt
e w jt − 1 ⎤
o
*
⎥
−
∑ j x jt ( a jt h j − h j ) ⎢⎢ r
r
⎥
L
w
⎣
⎦
*
jt +1 j
o
j
(
) (
)
⎥
⎦
(110)
The sensitivity of these two variants of the HOWOI are assessed with respect to:
•
Discounting life expectancy at 1.5% or 5%
•
Discounting waiting time at 1.5%, 5% or 10%
Results are provided in Table 6.11. When waiting time is discounted to date placed
on list, estimates of output growth are slightly higher than those when no waiting
time adjustment is made, with the difference increasing at higher discount rates.
Compared to discounting to date placed on list, discounting to date of treatment
results in lower estimates of output growth, decreasing at higher discount rates.
158
Table 6.11 HOWOI, adjusted for waiting time
80% percentile waiting time
HOWOI health, waiting time and life expectancy adjustment
Waiting discounted to date placed on list
discount rate life expectancy
discount rate waiting time
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
Average
80% percentile waiting time
1.50%
1.50%
5%
10%
5%
1.50%
5%
10%
-4.64%
-3.74%
0.20%
8.47%
2.00%
-4.60%
-3.74%
0.19%
8.48%
2.03%
-4.54%
-3.73%
0.18%
8.50%
2.08%
-3.79%
-2.08%
0.29%
8.98%
2.86%
-3.71%
-2.07%
0.28%
9.00%
2.91%
-3.61%
-2.06%
0.26%
9.03%
2.99%
0.46%
0.47%
0.50%
1.25%
1.28%
1.32%
HOWOI health, waiting time and life expectancy adjustment
Waiting discounted to date of treatment, charge for waiting
discount rate life expectancy
discount rate waiting time
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
Average
6.5
1.50%
1.50%
5%
10%
5%
1.50%
5%
10%
-4.52%
-3.81%
0.09%
8.55%
2.11%
-4.52%
-3.81%
0.09%
8.55%
2.11%
-4.51%
-3.82%
0.08%
8.56%
2.12%
-3.66%
-2.18%
0.21%
9.06%
3.01%
-3.65%
-2.18%
0.20%
9.06%
3.02%
-3.63%
-2.18%
0.19%
9.07%
3.04%
0.48%
0.48%
0.49%
1.29%
1.29%
1.30%
Value weighted output index
Our “ideal” index takes the form specified in equation (12), in which activities are
valued according to their associated health outcomes and waiting times are considered
a characteristic of health care. The index takes the form:
I
xq
yt
−r L
*
0
∑ j x jt +1[(a jt +1h j − h j )[(1 − e L t +1 )π h ] / rL − w jt +1π W ]
=
−r L
*
0
∑ j x jt [(a jt h j − h j )(1 − e L t )π h ] / rL − w jπ W ]
(111)
We estimate this index assuming that the monetary value of a QALY ( π h ) in 2002/3
is £30,000 and applying growth rates in money GDP to calculate values for earlier
years. We explore the sensitivity of results to:
•
the cost of a day spent waiting ( π W ) - either £3.13 or £50 in 2002/3 (adjusted
by money GDP growth in earlier years)
•
discounting life expectancy at 1.5% and 5%
Estimates of output growth are presented in Table 6.12. These imply lower rates of
output growth than a CWOI for these HRGs, the main reason being because of the
159
influence of treating an increasing older population (leading to decreasing life
expectancy). The effect of applying a higher value to the cost of a day spent waiting is
to increase estimated output growth, but not substantially.
Table 6.12 Value weighted output index
cost of day spent waiting
discount rate life expectancy
1998/99 - 1999/00
1999/00 - 2000/01
2000/01 - 2001/02
2001/02 - 2002/03
2002/03 - 2003/04
Average
6.6
Value weighted output index
£3.13
1.50%
5%
£50
1.50%
5%
(i)
-4.71%
-3.86%
0.23%
8.41%
2.00%
(ii)
-4.00%
-2.32%
0.41%
8.82%
2.82%
(iii)
-3.39%
-4.22%
-0.29%
8.27%
2.90%
(iv)
-1.54%
-2.53%
-0.42%
8.70%
4.59%
0.41%
1.15%
0.65%
1.76%
Conclusion
In this section we have applied various formulations of an output index to a limited set
of HRGs for which data were available on health status before and after treatment.
The main conclusions are that:
•
Estimates of output growth are sensitive to whether k is assumed constant
across treatments. In view of this, it would be advisable to ascertain before and
after health status for a larger sample of NHS treatments.
•
There is a high correlation between indices using the two mortality measures in-hospital and 30-day survival.
•
Although relative cost and health outcome weights differ to some extent for
our specimen set of HRGs, the difference does not lead to dramatic changes in
the estimates produced by the specimen index. It cannot be assumed, however,
that there will not be greater divergence between indices using costs and
health outcome weights for other NHS activities.
•
Unable to estimate QALYs directly, we have had to rely upon life tables from
the general population to generate estimates of life expectancy. With an
increasingly older population being treated over time, this leads to decreasing
life expectancy, which in turn implies declining output growth in indices
160
where life expectancy is included. This is because our index formulations
make the value judgement that an additional quality adjusted life year should
have the same value whatever the age of the person it accrues to.
•
Cost weighted indices with waiting time adjustments are sensitive to whether
waiting time is discounted to the date placed on the list or to the date of
treatment, and to the choice of discount rate.
•
The health and waiting time outcomes index, for the HRGs considered here, is
not particularly sensitive to which point in the distribution is chosen to
measure waiting time (mean or 80% percentile) or to the cost applied to a day
spent waiting (£3.13 or £50).
7 Effects of quality adjustments on hospital and NHS output
indices: summary
In Section 4 we argued that it was important to include estimates of health effects in a
quality adjusted output index. This should be done by regular collection of health
outcomes data for a representative range of NHS activity. In the absence of this data,
in Section 5 we used available information on outcomes for 29 HRGs and made the
assumption that the average health gain observed could be applied uniformly to all
hospital activity.
For the specimen quality adjusted output index discussed in Section 6, it was possible
to test the sensitivity of results to the assumption of a uniform effect. For the
specimen index we were able to estimate quality adjusted output using data for actual
health effects and compare the result with estimates using a uniform health effect. As
expected, the move from uniform to actual values does affect the result.
We recommend that wherever possible actual health effects data be used to estimate
quality adjusted output indices. Over the next few years the number of HRGs for
which actual data will be available should increase. This will gradually reduce the
proportion of activity where it is necessary to make assumptions about health effects.
161
A consequence of this recommendation is that for the next few years a quality
adjusted output index would have to be based on a mix of actual and assumed values.
In this section we examine the impact of departing from the assumption of uniform
fixed health effect (k = 0.8) and instead use actual values where they exist and
assumed values where data is absent.
•
For the 29 elective procedures for which we have data, k varies by HRG as in
the specimen index.
•
For all other elective procedures we assume k = 0.8 as suggested by the mean
of the k for the elective HRGs where there are estimates
•
For non-elective HRGs we assume k = 0.4 on the grounds that non-elective
patients may have worse health (qo) if not treated so that the ratio of health if
not treated to health if treated (k = qo/q*) is smaller.
Given that non-elective activity is growing more rapidly than elective, the lack of
knowledge of health state and health gain for non-elective patients is a serious
problem.
We compare our recommended variant (Q2) based on a health effects adjustment
which varies by HRG with a variant (Q1) with the same health effect adjustment for
all HRGs, elective and non-elective.
Quality variant 1 assumes k = q0/q* = 0.8 if a – k > 0.05 and k = 0 otherwise for all
elective and non-elective HRGs, discounts to date of treatment with charge for wait,
with discount rates on waits and health equal to 1.5% and the waiting time variable is
the 80th percentile wait in each HRG.
Quality variant 2 is our recommended quality variant. This sets k = q0/q* = 0.8 for
electives, k = 0.4 for non-electives, k = actual k for those HRGs included in the
specimen index where this is known, provided a – k > 0.10 and k = 0 otherwise. This
quality variant discounts to date of treatment with charge for wait, with discount rates
on waits and health equal to 1.5% and uses 80th percentile waits.
162
We show the effects of these two quality adjustments variants on the HES hospital
output in Table 7.1, to all hospital output including outpatients and accident and
emergency in Table 7.2 and to all NHS output in Table 7.3.
Table 7.1 shows that the Q1 quality adjustment variant, with survival, health effects,
life expectancy and waiting time adjustments adds just under one percentage point to
the HES hospital unadjusted index average across the five growth periods. The
recommended variant Q2 results in a smaller upward adjustment of just over 0.5%
Table 7.1 HES hospital cost weighted output index with hospital sector quality
adjustments
Unadjusted
Quality variant 1
Survival,
Survival
health
and health
effect, life
effect only
expectancy
and waiting
Quality variant 2
Survival,
Survival
health
and health
effect, life
effect only
expectancy
and waiting
1998/99-1999/00
1.87
0.09
0.49
0.63
1.04
1999/00-2000/01
0.91
1.97
1.73
1.50
1.25
2000/01-2001/02
0.95
1.01
0.87
0.82
0.65
2001/02-2002/03
4.44
7.72
7.48
6.77
6.52
2002/03-2003/04
5.81
8.04
8.15
7.21
7.31
Average
2.80
3.77
3.74
3.38
3.35
Table 7.2 shows the effect of the two variants on a broader definition of hospital
activity. All variants have faster growth than for the narrower HES output indices in
Table 7.1. The main reason for this is the faster growth in the activities not captured
by HES, such as A&E and outpatients, which are excluded from Table 7.1. However,
because fewer of the quality adjustments apply to these non-HES activities the
proportionate effect of Q1 and Q2 is smaller than in Table 7.1. Thus Q1 increases
average annual growth by 0.44% instead of nearly 1% and Q2 increases growth by
0.25% instead of 0.55% over the period.
163
Table 7.2 Hospital sector cost weighted output index with hospital sector quality
adjustments
Unadjusted
Quality variant 1
Quality variant 2
Survival
and health
effect only
Survival,
health
effect, life
expectancy
and waiting
Survival
and health
effect only
Survival,
health
effect, life
expectancy
and waiting
1998/99-1999/00
2.03
0.43
0.79
0.91
1.28
1999/00-2000/01
1.54
2.35
2.16
1.99
1.80
2000/01-2001/02
4.48
4.52
4.43
4.40
4.31
2001/02-2002/03
3.94
5.71
5.57
5.19
5.06
2002/03-2003/04
4.78
5.94
6.00
5.51
5.56
Average
3.35
3.79
3.79
3.60
3.60
Finally Table 7.3 shows the impact of the quality adjustments to the hospital sector
output on the cost weighted output index for the NHS as a whole. We first show the
CWOI without quality adjustments and then add variants of the adjustments for
survival and waiting times for the hospital sector. Overall NHS output growth is
higher than either of the hospital sector output growth rates because of the more rapid
growth in some non-hospital activities such as prescribing and consultation rates, and
because of increasing coverage of NHS activity. Quality adjustment variant Q1
increases average annual growth by 0.29% and Q2 increases it by 0.17%. Notice that
for 2001/02 to 2002/03 both variants have a much larger effect (1.04% for Q1 and
0.71% for Q2) but rather small effects in the middle years and actually reduce growth
from 1998/99 to 1999/00. This negative adjustment is due, as we noted in section
5.4.1, to the fall in survival for a small number of high activity high cost HRG.
Notice also that from 1998/99 to 1999/00 when both Q1 and Q2 lead to downward
adjustments our recommended variant Q2 has a smaller negative effect. Thus in
general variant Q2 has a smaller positive or negative effect than Q1 because it uses
smaller assumed health effects for emergency activities.
164
Table 7.3 Aggregate NHS cost weighted output index with hospital sector quality
adjustments
Unadjusted
Quality variant 1
Survival,
Survival
health
and health
effect, life
effect only
expectancy
and waiting
Quality variant 2
Survival,
Survival
health
and health
effect, life
effect only
expectancy
and waiting
1998/99-1999/00
2.61
1.77
1.96
2.03
2.22
1999/00-2000/01
2.11
2.57
2.46
2.36
2.26
2000/01-2001/02
3.85
3.88
3.82
3.80
3.74
2001/02-2002/03
5.07
6.20
6.11
5.87
5.78
2002/03-2003/04
4.43
5.17
5.20
4.89
4.93
Average
3.62
3.92
3.91
3.79
3.79
We next examine input changes over the period and then in section 9 combine our
output indices and input indices to calculate productivity growth rates.
8 Labour input
8.1
Introduction
Current practice by DH and ONS calculates labour input by deflating payments to
labour by a wage index. It is more usual to estimate labour inputs based on number of
workers or hours worked so it was considered useful to devote effort in the project to
this alternative method of measuring labour input. In addition the Atkinson Report
recommended that labour input should be adjusted to take account of variations in
types of workers employed, in particular the changing use of skilled workers; this
section also addresses this recommendation. Labour is by far the most important input
used in producing health services, accounting for about 75% of total hospital
expenditures. These measures can then be combined with the aggregate and hospital
output measures given in section 7 to calculate labour productivity growth rates and
combined with measures of payment to labour can be used to calculate total factor
productivity growth rates. Productivity estimates are presented in the next section.
165
8.2
Labour input in the NHS
This section summarises results on constructing measures of labour input.
8.2.1 Volume of labour input
The volume of labour input can be calculated using direct or indirect measures. Direct
measures include number of persons engaged (including self-employed), number of
full-time equivalents or total hours worked. The indirect measure is expenditure on
labour deflated by a wage index, employed by ONS for NHS labour input.
Productivity analysts tend to prefer direct measures since reasonable data are
generally available on numbers employed whereas wage indexes are seen as less
reliable. This section follows this tradition of using direct measures. However it
should be noted that in sectors where the self-employed account for a large share of
employment, as is the case for the NHS, there may be more grounds for using an
indirect measure; this is discussed further below.
The simplest direct measure is a headcount of number of persons employed. Since
many persons in the NHS work part-time, a more reliable indicator is full-time
equivalent workers. Such a measure is calculated by the DH where part-time work is
weighted by normal weekly hours of these people. While full-time equivalents is
undoubtedly a better measure than headcounts, it is only a half-way house to the
measure recommended by the OECD productivity manual (OECD, 2001) of annual
actual hours worked. Normal or usual hours worked do not take account of changes in
time lost due to holidays, sickness etc. Over time trends in time paid but not worked
tend to dominate changes in usual weekly hours worked. Adjustments to an annual
actual hours worked basis are discussed below.
8.2.2 Quality of labour input
“Because a worker’s contribution to the production process consists of his/her
“raw” labour (or physical presence) and services from his/her human capital, one
hour worked by one person does not constitute the same amount of labour input as
one hour worked by another person”, OECD productivity manual (p. 41).
166
Volume measures of labour input hide considerable diversity across types of workers.
Obviously the productivity of highly skilled workers is greater than that of less skilled
workers as set out in the quote above. Division by skill type is not the only quality
dimension; other candidates are age or experience, gender or occupation. Nevertheless
most research on measuring labour quality suggests skill is the most important
dimension (Jorgenson, Ho and Stiroh, 2005).
The standard growth accounting formula for adjusting for skills divides labour hours
by skill type and then weights the growth in hours of each type by their wage bill
shares. This captures the fact that more highly skilled workers get paid more than the
unskilled, and under competitive market conditions, the wage paid reflects the
marginal productivity of workers of different types. Merely calculating growth in total
hours worked is equivalent to weighting worker types by their share in employment.
Hence if there is general upskilling of the workforce so that growth in hours is greater
for skilled relative to unskilled workers, weighting by wage bill shares leads to higher
aggregate labour input growth.
Formally, quality adjusted labour input, with s types of skilled labour can be
calculated using a Törnqvist index (see section 3) by:
ln Lt − ln Lt −1 = ∑ϖ s (ln Lst − ln Lst −1 )
(112)
l
where Ls is number of hours worked for worker of skill type s, and
ϖ s = 0.5(
ws ,t −1 Ls ,t −1
wt −1 Lt −1
+
wst Lst
)
wt Lt
is the wage bill share of type s workers in the total wage bill for all workers, averaged
across periods t and t-1. The difference between the equation above and the growth in
total hours worked gives the impact of skills on aggregate labour input growth.
8.2.3 Data sources and volume trends
This report reviewed available data sources relating to health sector labour input to
assess their usefulness in constructing direct measures of labour input for the NHS.
Two sources seemed particularly useful and formed the basis of the calculations
167
presented in this section. These were:
•
NHS Workforce Census – An annual census conducted by the Department of
Health. This source provides data on numbers employed in the NHS, both in
headcount and full-time equivalent terms, by occupation and organisation.
•
Labour Force Survey – This is a quarterly sample survey conducted by ONS.
It contains data on numbers employed and annual hours worked by industry
(SIC92) and occupation (SOC), distinguishing private and public sectors and
whether the person is employed by an NHS Trust. It also contains data on
wages, qualifications, region and nationality of employees and questions
relating to on the job training.10
These sources were supplemented by information from the NHS staff earnings survey.
The NHS Workforce Census has an advantage over the Labour Force Survey in that
being a census it captures all people whose employer is the NHS. The labour force
survey headcounts are derived indirectly through both the industry in which the
individual states they are working (SIC 85.1 – human health activities) and if they
state that they work for an NHS Trust. Against this the LFS includes agency workers
whereas the NHS Census excludes these. The two sources were compared for
consistency and were found to follow similar but not identical trends through time
(shown in Figure 8.1).
10
Note since the LFS is an individual survey, where the person interviewed is frequently reporting for
their spouses or partner, reporting errors can lead to situations where individuals state that they work
for industries such as personal services but that they are employed by the NHS, e.g. contract cleaners
168
Figure 8.1 Numbers employed in the NHS 1995-2003
1400000
1300000
1200000
Census
LFS
1100000
1000000
900000
800000
1995
1996
1997
1998
1999
2000
2001
2002
2003
The decision was made therefore to employ the NHS Census data for the headcount
but to include an adjustment using the ratio of agency to other staff from the LFS to
adjust the Census data. The following table shows growth rates in persons engaged by
broad occupational group. These show significant growth in total numbers employed,
in particular since 2000. The growth rates are fairly uniform across broad categories.
Within categories there is more variation. For example the number of managers,
included in the infrastructure support group, increased by about 6.5% per annum
from 1995 to 2003, with a very large increase of over 11% per annum since 2000.
However it should be noted that managers represent a very small proportion of the
NHS workforce, reaching only 2.7% by 2003 despite the high growth. In contrast
nursing assistants and auxiliaries show growth of less than 2% p.a. since 1995 and
2.7% since 2000.
169
Table 8.1 Trends in the NHS workforce
1995-03
2000-03
Total
2.49
4.58
Professionally qualified clinical staff
All doctors
All qualified nurses (including practice nurses)
2.77
3.30
2.48
4.53
4.03
4.66
Total qualified scientific, therapeutic & technical staff
3.60
4.73
NHS infrastructure support
Support to Clinical Staff
Other
1.26
3.20
0.55
4.66
5.35
1.98
Rather than use full-time equivalents we employ LFS data on weekly hours to convert
these headcounts to total annual hours. Weekly actual hours show a slight decrease
over time from 28.9 in 1995 to 28.6 in 2003 with rises in between.
8.2.4 Quality adjustments based on qualifications
The LFS is used in this project to incorporate quality adjustments. Thus the NHS
Census data are used as control totals with proportions of workers in each skill group
and their wage rates from the LFS used in the skills adjustment. We then refine these
estimates in a number of ways to take account of on the job training, regional
variations in wage rates and country of birth of workers. In addition we include an
adjustment for doctors using data from the NHS Census and the Earnings survey since
the qualification division in the LFS is not fine enough to ensure we are picking up all
skill variations.
Table 8.2 shows the proportion of workers by qualification group for selected years.
Employment growth has been relatively strong among workers with higher degrees
and primary degrees as well as those with A-levels and equivalents. The share of
those with nursing qualifications or other NVQ4 has declined reflecting the growth in
degrees among nurses and health care professionals. There has also been a marked
170
decline in the share of the workforce with no skills. While the figures confirm the well
known high growth in university degrees among doctors, nurses and other health
professionals, the changes in the lower end of the skill distribution suggest upskilling
also among other NHS employees, illustrated by the chart for health care assistants
below. Interestingly, managers have also experienced pronounced changes in their
skill distribution with the percent of managers having a higher degree (mostly masters
degrees) rising from about 3% in 1995 to about 16% in 2003 and the percent with
degrees rising from 22% to 28% over the same time period (Figure 8.2).
Table 8.2 Skill proportions of the Workforce: NHS selected years
1995
2000
2003
Higher degree (Masters, PhDs)
Degree
5.4
13.2
6.9
17.4
8.3
18.2
Nursing qualification/other NVQ4
(higher education below degree)
36.4
31.3
29.2
NVQ3 (A-levels or equivalent)
NVQ1 and NVQ2
(GCSEs or equivalent, vocational
qualifications)
Other
None
6.1
22.8
8.6
23.8
9.8
23.2
6.8
10.0
5.9
6.1
5.9
5.5
171
Figure 8.2 Healthcare assistants: proportions employed by highest qualification
held
60.00
50.00
40.00
1995
2000
30.00
2003
20.00
10.00
0.00
Degree/nursing/other NVQ4
NVQ3
NVQ1 and NVQ2
Other or none
Figure 8.3 Managers: proportions employed by highest qualification held
45.00
40.00
35.00
30.00
25.00
1995
2000
2003
20.00
15.00
10.00
5.00
0.00
PhD and Masters
First degree
Nursing qualification, other
NVQ4
NVQ3
All other
.
172
It is also worth considering skill use in the NHS with that for the economy as a whole
or other service sectors. Table 8.3 shows skill proportions in the total economy and
private market services for the three years shown for the NHS above.
Compared to the total economy, the NHS employs proportionally more workers at the
high end of the skill distribution. These differences are more pronounced when the
comparisons is between the NHS and private market services. Over this time period
growth in proportions of the workforce with the highest qualifications (degree and
above) has been higher in the NHS than in the total or market services and the
reduction in the use of unskilled workers has been greater.
One interesting trend that can be considered from the comparison between the NHS
and other sectors is the extent to which those with nursing qualifications are leaving
the NHS to work in other sectors. In 2003 the LFS shows about 550,000 individuals
whose highest stated qualification is a nursing qualification. Of these 42% work in
NHS Trusts whereas in 1995 nearly 47% worked in NHS Trusts. Some of this
attrition can be traced to other areas of the public health sector or private health
sectors but much is to other non health industries. Thus 42% of persons whose highest
qualification was nursing worked outside the health sector in 1995 and this had risen
to 45% by 2003. While this suggests some increase in attrition rates it may well be an
underestimate since many people leaving may have other qualifications higher than
nursing.
173
Table 8.3 Skill proportions of the Workforce: NHS selected years
Total Economy
Higher degree (Masters, PhDs)
Degree
Nursing qualification/other NVQ4
(higher education below degree)
NVQ3 (A-levels or equivalent)
NVQ1 and NVQ2 (GCSEs or equivalent,
vocational qualifications)
Other
None
Market Services1
Higher degree (Masters, PhDs)
Degree
Nursing qualification/other NVQ4
(higher education below degree)
NVQ3 (A-levels or equivalent)
NVQ1 and NVQ2 (GCSEs or equivalent,
vocational qualifications)
Other
1995
2000
2003
2.7
4.6
5.6
11.6
13.1
13.8
9.5
9.7
9.9
12.1
40.0
16.3
36.5
17.8
34.6
7.9
16.2
8.0
11.8
7.7
10.6
1.9
10.9
5.9
3.1
13.1
6.6
3.7
13.5
6.7
13.8
42.7
17.6
38.9
19.5
36.7
8.6
16.2
8.6
12.1
8.5
11.4
1. Comprising distribution, transport, communications, hotels & catering, business services, financial
services and personal services.
The changes in the skill use pattern noted above changes aggregate labour input only
if there are significant differences in wage rates across skill groups. Table 8.4 shows
wages relative to the lowest group for the qualifications given in the Table above. In
fact there are very large differences in the wages paid to workers in the NHS with on
average those with higher degrees earning about 4 times the average unskilled wages.
Note however that these differentials tend to be smaller in the public sector than in the
private sector.
Also wage differentials have tended to stay reasonably constant
through time. Therefore, based on these data, quality change is driven by greater
employment growth among skilled workers rather than by any movement to
employing more expensive and under competitive assumptions more productive
workers, within these skill groups.
174
Table 8.4 Wages relative to the unskilled, selected years
Higher degree
Degree
Nursing qualification/other NVQ4
NVQ3
NVQ1 and NVQ2
Other
None
1995
2000
2003
3.88
2.72
1.88
1.39
1.14
1.48
1
4.07
2.61
1.87
1.33
1.14
1.41
1
3.81
2.68
1.87
1.21
1.09
1.70
1
Combining the information on skill use and wages gives growth in quality adjusted
labour. Table 8.5 below shows growth in headcount, quality adjusted labour and the
percentage point contribution of quality change for all NHS workers, and selected
occupation groups.
175
Table 8.5 Growth in volume and quality adjusted labour input, Total NHS
Numbers
Quality-adjusted
Difference
1995-03
2.68
3.47
0.79
2000-03
4.50
5.19
0.69
1995-03
3.30
3.33
0.03
2000-03
4.03
4.20
0.17
1995-03
2.75
3.00
0.25
2000-03
4.24
4.32
0.08
1995-03
2.22
2.34
0.13
2000-03
2.71
3.72
1.01
3.30
4.48
3.39
4.64
0.09
0.16
2.89
5.00
3.60
5.64
0.71
0.64
6.59
11.18
8.10
12.90
1.51
1.72
2.41
5.35
2.93
6.17
0.52
0.82
Total
Doctors
Nurses
Nursing aux.
Health associate profs
1995-03
2000-03
Healthcare assistants
1995-03
2000-03
Managers
1995-03
2000-03
Administrative
1995-03
2000-03
Again it is useful to compare these trends with other sectors of the economy. In
addition we also show growth in labour input for the hospital sector alone, since
calculations for other inputs discussed below are carried out mainly for the hospital
sector. In this calculation we also include the adjustment for agency workers and
convert to an annual hours basis. The sample sizes in the LFS were such that it was
not possible to include occupation specific hours or agency adjustments in the
previous table. Adjusting for agency workers raises the annual average growth rate
over the entire period from 2.68% to 2.78% and since 2000 from 4.5% to 4.63%,
176
small upward adjustments. Adjusting for trends in hours raises the growth rate further
to 4.75% from 2000-2003 but lowers the growth rate marginally over the entire
period. Ideally, in the calculations we would like to incorporate trends in hours by
qualification group. However when we attempted such a division in the LFS the
resulting weekly hours turned out to show implausible variations from year to year
and so this calculation was not attempted.
The table shows growth rates over the six years considered in the output calculations.
It shows much higher quality adjusted labour input growth in the total NHS and in
hospitals than in the economy as a whole. Quality adjustments were similar in
percentage points to those in the total economy and larger in the more comparable
market services sector proportionally more important in the total economy or market
services than in the health sector averaged across the entire time period and the period
since 2000. However the aggregate and private sector estimates have been more
sensitive to the stage of the business cycle and hence to the starting year chosen. In
the past few years the NHS adjustments have been greater.
Table 8.6 Growth in volume and quality adjusted labour input, comparison
between the NHS and other sectors, annual average 1999/00-2003/04
Annual hours
Quality-adjusted
Difference
Total NHS
3.37
4.24
0.87
Hospitals
3.58
4.35
0.77
Total Economy
0.92
1.74
0.82
Market Services
0.71
1.36
0.65
8.2.5 Quality adjustments: refinements
So far the calculations assume that the only variation in type of worker is skill. We
also calculated a number of variations which accounted for a further disaggregation of
177
doctors, regional variations, on the job training and country of birth of workers. The
rationale for the first is that since all doctors must have degrees or equivalents in order
to practice, the LFS data are not capturing skills acquired through on the job learning.
To take account of this we need to disaggregate doctors by type, e.g. consultants,
registrars, junior doctors. Data from the Census for the secondary care sector, coupled
with snapshots of relative wages from the NHS earnings survey were employed to
achieve a crude adjustment based on a division of doctors by consultants and others.
Over the period from 2000/01 to 2003/04 this resulted in growth in quality adjusted
doctors about 30% above numbers employed. For the same period quality adjusting
based on certified qualification led to only an increase in growth of 4%. Therefore at
least for doctors, using qualification data alone may not be sufficient to capture all
quality change. However since doctors represent less than 1.5% of the NHS Trusts
wage bill this amount to only a very small adjustment for the hospital sector. As we
do not have comparable data to consider GPs, the adjustment for overall NHS activity
is even smaller.
It is likely that workers have received some additional training beyond that associated
with their certified qualifications. Using LFS data we can divide workers according to
whether they received any job related training or education during the 13 weeks
previous to the survey date. Since the data are averaged across quarters this variable
picks up any job related training carried out over entire years. For 2003/04 just over
50% of NHS workers answered yes to the question of whether they received some job
related training over the past year and the percentage is high across all occupation
groups (table 8.7). Nearly half the training occurs in the employees workforce with
about 30% off site in education institutions (Table 8.8). About half the workers
engaged in training with duration lasting one month or less while about 20% was on
going at the time of the survey. Nearly 30% of training was for more than 6 months.
178
Table 8.7 Percent of NHS workers receiving job related training, 2003/04
Occupational group
%
Total
50.2
Managers
56.5
Medical practitioners
68.2
Other health professionals
62.7
Science & engineering professionals, technicians, IT, research
44.0
Teaching & all other professionals etc, plus skilled trades
34.1
Nurses, midwives
64.0
Other health associate professionals, therapists
57.2
Nursing auxiliaries and assistants
48.8
Other healthcare and related personal services
46.2
All other occupations
25.5
Table 8.8 Location of job related training
Location of job related training
%
Employer/another employer's premises
49.1
Training centres etc
11.5
Educational institution
30.9
None of these
8.5
Job related training can vary from one day courses on health and safety to extensive
formal learning from senior colleagues. If most training were concentrated in the
former then this would have very little impact on overall productivity of workers
whereas the latter type of training is likely to have a significant impact. Consistent
with the method employed in the remainder of this section, we look at relative wages
as an indicator of relative productivity. In 2003/04 workers who gave a positive
response to the on the job training question earned on average 28% more than those
who gave a negative response. The differential was greatest among highest skilled
179
groups but was also large across all skill divisions.
In order to account for on the job training we compared the quality adjustments when
labour is divided by skill level with those where we included the further division
according to whether they received job related training or not. The results of this
calculation were to raise the quality adjusted labour input growth rate on average from
1999/00 by about 5%, a small but not insignificant impact. Therefore our estimates of
quality adjusted labour were scaled up to reflect the impact of on the job training.
Finally we tried two additional calculations using a division by region and by country
of birth but neither significantly altered the results.
8.3
Conclusion
This section considered trends in the use of labour input in the NHS. It showed that
labour input growth has been rising very rapidly in recent years, mainly due to growth
in the numbers of workers employed but also there is a significant contribution from
upskilling of the workforce. The latter is important in understanding why expenditure
on the NHS has been increasing rapidly in recent years. Thus a crude calculation
suggests some 20% of payments to labour is due to paying for higher skilled workers.
9 Experimental productivity estimates
This section considers measurement of inputs and combining these with output
estimates from previous sections to calculate productivity. Lee (2004) reported
productivity estimates for the total NHS with details of calculations given in
Hemingway (2004). The focus has been on refining the estimates of labour input to
take account of the use of various types of skilled labour. Labour is by far the most
important input used in producing health services, accounting for about 75% of total
hospital expenditures. These measures can then be combined with the aggregate and
hospital output measures given in Section 7 to calculate labour productivity growth
180
rates. Total factor productivity (TFP) estimates are then calculated for the hospital
sector and the NHS as a whole. In the hospital sector it is straightforward to combine
the direct measures of labour input discussed in the next section with data on the use
of intermediate inputs and capital from Trust Financial Returns to calculate TFP
growth. ONS data are combined with the estimates for labour input from Section 8 to
derive the aggregate TFP estimates.
9.1
Labour input and labour productivity growth
Using the calculations reported in section 8, annual estimates of the volume of labour
input, and quality adjusted labour input, are shown in the following table for the
aggregate NHS and the Hospital NHS.
Note the hospital sector captures activities that are recorded in HES, which were the
focus of Chapter 5, and non-HES activities such as outpatient, A&E, rehabilitation
and mental health activities. This broader definition of outputs corresponds to the
basis on which hospital inputs are measured. The overall NHS includes activities such
as primary care, prescriptions, community care, dental and ophthalmic.
Table 9.1 Volume of labour input and quality adjusted labour input, annual
estimates of growth rates
Labour input: volume
Labour input: quality adjusted
Total NHS
Hospital
Total NHS
Hospital
1998/99-1999/00
1.58
1.49
2.53
3.47
1999/00-2000/01
1.05
1.77
1.45
2.75
2000/01-2001/02
5.42
5.05
5.31
3.77
2001/02-2002/03
4.69
4.51
5.57
5.91
2002/03-2003/04
4.48
5.47
4.94
5.83
Average
3.43
3.64
3.95
4.34
181
Labour productivity growth rates are derived by taking the growth in output minus the
growth in labour input. Here we show calculations for unadjusted cost weighted
output and the quality variants chosen for illustrative purposes in Section 7 adjusting
for survival and waiting times. Denote these two variants by Q1 and Q2. For
comparison purposes these output growth rates are shown in Table 9.2.
The first panel in Table 9.3 presents labour productivity estimates based on volume of
labour and shows positive average labour productivity growth across the period for all
output variants for the total NHS and for the Q1 variant for hospital output but a small
negative number for unadjusted hospital output. When quality adjusted labour is used
instead, average labour productivity growth becomes negative in the total NHS when
output is not quality adjusted, but is approximately zero using the Q1 variant of
quality adjusted output. Hospital labour productivity growth is negative for all output
variants when quality adjusted labour is used. Note negative labour productivity
growth in health services is not unusual in international comparisons. Data from the
US national accounts suggests labour productivity growth, unadjusted for labour
quality changes, was -0.31% on average from 1999 to 2002. Adjusting for labour
quality would reduce this further.
Table 9.2 Output growth
Total NHS
Hospital
Unadjusted
Q1
Q2
Unadjusted
Q1
Q2
1998/99-1999/00
2.61
1.96
2.22
2.03
0.79
1.28
1999/00-2000/01
2.11
2.46
2.26
1.54
2.16
1.80
2000/01-2001/02
3.85
3.82
3.74
4.48
4.43
4.31
2001/02-2002/03
5.07
6.11
5.78
3.94
5.57
5.06
2002/03-2003/04
4.43
5.20
4.93
4.78
6.00
5.56
Average
3.62
3.91
3.79
3.35
3.79
3.60
0
*
Notes: Q1 is the ‘high’ quality adjustment variant with k = q /q = 0.8 if a – k > 0.05, and k = 0
otherwise, discounts to date of treatment with charge for wait, discount rates on waits and life
expectancy equal to 1.5% and where the waiting time variable is the 80th percentile wait in each HRG.
Q2 is our recommended quality variant and sets k = q0/q* = 0.8 for electives, k = 0.4 for non-electives,
k = actual k for those HRGs where known, if a – k > 0.10, and k = 0 otherwise; discounts to date of
treatment with charge for waits, discount rates on waits and life expectancy equal to 1.5% and uses 80th
percentile waits.
182
Table 9.3 Labour productivity growth
Total NHS
Hospital
Unadjusted
Q1
Q2
Unadjusted
Q1
Q2
1998/99-1999/00
1.02
0.38
0.63
0.53
-0.69
-0.20
1999/00-2000/01
1.05
1.40
1.20
-0.23
0.38
0.03
2000/01-2001/02
-1.49
-1.51
-1.59
-0.54
-0.58
-0.70
2001/02-2002/03
0.35
1.35
1.04
-0.55
1.02
0.53
2002/03-2003/04
-0.05
0.69
0.42
-0.65
0.51
0.09
Average
0.17
0.46
0.34
-0.29
0.13
-0.05
1998/99-1999/00
0.08
-0.56
-0.30
-1.40
-2.60
-2.12
1999/00-2000/01
0.65
1.00
0.79
-1.18
-0.57
-0.92
2000/01-2001/02
-1.38
-1.41
-1.48
0.68
0.64
0.52
2001/02-2002/03
-0.48
0.51
0.20
-1.87
-0.32
-0.81
2002/03-2003/04
-0.48
0.26
-0.01
-0.99
0.16
-0.25
-0.95
-0.54
-0.72
Labour volume
Quality adjusted labour
-0.32
-0.04
-0.16
Average
Note: all productivity estimates use geometric means
9.2
Intermediate and capital inputs
Intermediate input for the hospital sector comes from the Trust Financial Returns
(TFR) and is deflated by a modified version of the DH Health Services Cost Index
(HSCI) to derive a volume measure. Intermediate input was defined as all current non
pay expenditure items in the TFR, and hence excluded all purchases of capital
equipment and capital maintenance expenditures as these items cannot be allocated to
a particular year’s output.
The list of items included and their shares in total intermediate expenditure in selected
183
years is shown in Table 9.4. This shows the share of drugs increasing rapidly and a
declining trend in the miscellaneous category with no other intermediate category
showing much change. Within the final category, external purchase of health care
from non-NHS bodies has shown an increased share through time but remains small at
about 6% of total intermediate in 2003/04.
Table 9.4 Share of intermediate expenditure by type
1998/99 1999/00 2000/01 2001/02 2002/03 2003/04
Drugs
0.238
0.242
0.286
0.303
0.316
0.338
Other Clinical Supplies
0.041
0.042
0.053
0.055
0.052
0.056
General
0.139
0.134
0.150
0.144
0.128
0.122
0.166
0.159
0.187
0.176
0.151
0.143
Non-capital premises3
0.112
0.101
0.115
0.114
0.101
0.104
Other4
0.304
0.322
0.209
0.209
0.252
0.237
Supplies
and
1
Services
Establishment
expenditure2
1. Hotel, catering and cleaning services; 2. Stationery, communications, advertising and transport costs;
3. Energy, rent and external services; 4. Miscellaneous services including external purchase of health
care from non-NHS bodies and other external contract expenditures.
These numbers for intermediate input were deflated by an aggregate price index,
derived as a chain linked index of corresponding HSCI items. This resulted in a very
small upward adjustment in the intermediate input deflator than one using all items in
the HSCI, as the prices of capital items have been growing more slowly than current
items and in the case of computers have been falling.
To be consistent with the methodology employed by ONS, capital input could be
measured by depreciation from the TFR. However this calculation ignores any capital
services from capital purchases in the current year. An alternative is to assume a
proportion of these expenditures are depreciated in the current year. Since much of the
expenditure is on computers, software and medical equipment with low asset lives, we
assume one third of these assets are depreciated. These capital services are deflated by
a chain linked deflator for capital items in the HSCI while depreciation is deflated by
184
the ONS capital consumption deflator for the total NHS. Adding a proportion of
capital expenditures implies average annual real capital growth of 4.76% per annum
from 1998/99 to 2003/04 compared to 4.69% using depreciation alone, a small effect.
However it does raise capital’s share from 0.041 to 0.080. Finally consistent with the
methodology employed in private services the value of business rates are added to
capital’s share.
Calculating input shares is more difficult for the total NHS since we attempt to
combine data from different sources. For the aggregate NHS we use TFR data on
payments to labour in the hospital sector and use ONS data for payments to labour in
other parts of the NHS. Similarly, intermediate inputs are derived combining
expenditures from TFR and PFR with ONS data on other parts of the NHS. Capital
inputs are those employed by ONS in their measures of Health Sector Productivity.
Family Health Drugs are deflated by the cost of all items rather than the ONS quality
adjusted Paasche variant. The estimates for the NHS should be treated with
considerable caution since data are being taken from a number of sources which may
need further reconciling.
Table 9.5 shows average period input shares and average growth in the three inputs
where labour input is the quality adjusted variant.
Table 9.5 Average period input shares and average growth in inputs
NHS
Hospital
Shares
Input growth
Shares
input growth
Labour
0.61
3.95
0.72
4.34
Intermediate
0.33
8.58
0.20
4.93
Capital
0.06
3.12
0.08
4.76
Labour represents a lower share in the total NHS than in the hospital sector, mainly
due to the inclusion of family health prescribing in the former. Growth in intermediate
input is very large in the total NHS, while all three inputs show similar average
growth rates in the hospital sector.
185
9.3
Total factor productivity growth
Combining the input shares with growth in real inputs allow the calculation of total
input growth and subtracting this from output growth yields total factor productivity
growth rates, shown in Table 9.6. Average TFP growth rates are strongly negative for
the total NHS for all quality variants. The numbers based on quality adjusted output
are similar to those calculated by ONS reported in Lee (2004). The ONS estimates
using the most comparable methods to measure inputs suggest average TFP declining
by 1.34% per annum over the same period. However it should be noted that ONS
output measures use reference cost activities which are not directly comparable with
the HES based data employed in this report’s calculations. Average TFP growth is
also negative for the hospital sector but less so than for the total NHS. The results are
sensitive to the allocation of expenditures between the three broad categories of
inputs. Lee (2004) highlighted the sensitivity of the results to the deflators used, in
particular for drugs – see also the discussion in section 10.5. However since drugs are
both an input and an output this sensitivity is surprising. Hence some further
investigation is required and will be carried out following discussions with ONS.
Table 9.6 Total factor productivity growth
Total NHS
Hospital
Unadjusted
Q1
Q2
Unadjusted
Q1
Q2
1998/99-1999/00
-2.33
-2.95
-2.71
-2.82
-4.00
-3.53
1999/00-2000/01
0.55
0.89
0.69
0.30
0.91
0.56
2000/01-2001/02
-2.12
-2.15
-2.22
0.17
0.13
0.01
2001/02-2002/03
-1.86
-0.88
-1.19
-2.01
-0.46
-0.95
2002/03-2003/04
-2.97
-2.25
-2.51
-1.13
0.02
-0.39
Average
-1.75
-1.48
-1.59
-1.11
-0.70
-0.87
The finding that TFP growth is negative is not unusual in the private sector. For
example Basu et al. (2003) report negative gross output based annual average TFP
186
growth rates for a number of sectors in the 1990s including insurance and business
services. Similar results have frequently been reported by the US Bureau of Labour
Statistics. Negative TFP growth is mostly likely to occur in service sectors where
output is poorly measured and quality adjustment is minimal. TFP growth rates for the
private sector using comparable measurement methods are not yet available for the
period under consideration in this report.
When inputs are measured correctly, with adjustments for quality change then the
TFP residual is close to a measure of pure technical change so long as output is also
measured correctly. But as emphasised in many parts of this report, we are only
capturing part of the improvement in quality of care via our proposed adjustments for
survival, health effects and waiting times. Because of this incomplete adjustment for
quality change we expect to underestimate TFP growth. There are also reasons why
in the short term at least we might expect negative growth rates. The literature on the
impact of information technology on productivity in the private sector points to an
important role of organisational changes in facilitating benefits from new technology.
Basu et al. (2003) suggest that these changes can lead to declining TFP in the short
run due to disruption of production processes. There is no doubt that the NHS is
undergoing significant change.
Of more consequence for the health sector is the notion that there are diminishing
returns as increased activity allows treatment of more complex and hence most costly
cases. Activity rates have been increasing more rapidly in recent years. Some
evidence in support of this is provided by the increased average age of patients treated
in hospitals, from 48.6 years in 1999/00 to 50 years in 2003/04. In addition there has
been some increase in the expenditure shares of HRG categories with the title
‘complex elderly’ from 3.4% of expenditures to 4.2% over the same period. Changes
in the case mix are likely to be larger within than across HRGs but we lack the
necessary data to examine this. Data that identified the characteristics of patients
would also be useful in identifying the extent to which changes in NHS productivity
are affected by diminishing returns.
187
10 Improving the data
We appreciate that the Department of Health is trying to reduce the amount of data
collected from the NHS (Review of Central Returns Unit).
However, the data
requirements outlined below are essential to development of robust measures of
output, productivity and quality change in the NHS. These data will be required not
only by the DH and Trusts to improve performance of the NHS but also by outside
bodies such as the Treasury and ONS.
10.1 Outcomes data
10.1.1 Health outcomes
The main aim of the health system is the improvement of the health of the population.
This being so, it would seem reasonable that any measure of health system
performance, output and productivity should include measures of the effect of the
system on health. The challenges associated with measuring the effect of interventions
are discussed in Appendix C.
The construction of a productivity index requires information about changes in health
status attributable to interventions. Such information currently is not collected by the
NHS. We suggest the systematic use of a standardised measure of health status to
improve the effective management of the NHS and to provide the fundamental data
needed to properly reflect changes in NHS productivity.
We have suggested that the NHS should collect data on the health of patients before
and after treatment. An outcome measure based on the difference between snapshot
measures of health status before treatment hb and after treatment ha is an imperfect
measure of the change in the discounted sum of QALYs due to treatment. It does not
measure health with and without care but health before and after care. It also replaces
each time profile with a single snapshot. For some treatments and conditions the
effect of treatment is merely to slow down the rate of decline in health status, so that
h a − h b < 0 even though the treatment increases the sum of QALYs compared with no
188
treatment. For these cases incorporating estimates of hb and ha in estimates of the level
of output in any given year would reduce measured output.
However, since the aim is to measure the rate of growth of output productivity we are
interested in whether the rate of growth of ∆h = ha – hb is a reasonable approximation
to the rate of growth of the effect of treatment on the discounted sum of QALYs. The
important issue is how well the rate of change in measures based on the snapshots hb,
ha approximates the rate of change in the areas under the two time profiles of health
streams with treatment h*(s) and without treatment ho(s)
Both the level of health before treatment hb and the health of treated patients if not
treated depend on the patient population selected for treatment and on the general
health of the population. It is not unreasonable to suggest that the rates of change of
hb and the discounted value of the no treatment health profile ho(s) over time will be
similar. Both the snapshot level of health after treatment ha and the discounted value
of the time profile h* will be measured on the same population and hence are affected
by the same factors including any technological change.
Hence, despite the imperfections of the difference between snapshots of post and pre
treatment heath status for calculating the level of productivity, we suggest that rates of
change of measures based on hb, ha will improve estimates of NHS output growth
compared to estimates where such information is not used.
The following points need to be addressed in such a data collection exercise.

NHS patient sample. Since the scale of NHS activity is so broad and the
potential volume of patients is so large sampling seems a more sensible
strategy than attempting to measure health effects for all NHS patients in a
sector. Sample sizes are likely to vary across different types of patients. While
a random sample of the NHS patient population would be preferable, in the
first instance it may be satisfactory to undertake a pilot exercise at a handful of
Trusts. This might be adequate for national measures of output but would be
inadequate if information on health outcomes are to be used to improve
performance of the NHS.
189

Choice of instrument. A single generic, rather than condition-specific
instrument is required in order to facilitate aggregation across different types
of NHS activity. Profiles such as SF-36 provide multiple measures of outcome
but are unsuitable for most non-clinical purposes since they typically lack the
capacity to form a single aggregate index. Derivatives of SF-36 such as the
SF-6D do not suffer from this deficiency. The EQ-5D is designed to produce a
single index and its five dimensions have been calibrated in terms of social
preference weights of a UK population and is probably the primary candidate
measure.

Timing.
The timing of before and after health status measurement may
depend on the type of activity (emergency or elective) and on diagnostic
category or intervention type since different treatments may have an effect
over shorter or longer periods.
Our analysis of data for two elective
procedures (hip and knee replacement) shows most treatment gain after six
months but clinical advice could be used to determine the appropriate period
for follow-up post treatment for other conditions.

Grouping of NHS activities. Given the enormous range of NHS activities it
is necessary to group them for data analysis. The main grouping of secondary
care activities is at present by HRG which attempts to group activities by their
costs. But a given HRG may contain a large number of procedures which have
very different effects on health. The availability of patient-level health
outcomes data by ICD will permit matching to other datasets (such as HES)
and will make it possible to explore the extent to which health outcomes are
related to other routinely collected patient characteristics, such as age, gender,
diagnoses, procedure, survival rates and mortality. This is essential if we are
to develop disease specific studies of improvement in health care.

Frequency of data collection.
If the pace of technological change in
medicine was slow enough it could be argued that collecting data on health
outcomes was an exercise that needed to be undertaken only at intervals of
several years. But technological change in medicine and pharmaceuticals is
190
rapid, the NHS is subject to frequent organisational change which may affect
the mix of patients receiving particular treatments and the speed at which new
technology spreads. We believe that only a continuous sampling of the NHS
patient population will be adequate to capture trends in the impact of NHS
services on patients.
10.1.2 Feasibility
Outside clinical trials, experience of routine collection of health status data in the UK
is patchy.
Individual clinicians and clinical teams make use of a variety of
standardised measures, but this is largely uncoordinated, its coverage remains
undocumented and aggregation of such data is problematic given the use of different
instruments.
Although there are a limited number of examples of prospective health data collection
these examples demonstrate that such data collection would be feasible in the NHS.

The survey of acute inpatients conducted by Picker International showed
that it was possible to collect EQ-5D data from a sample of patients recently
discharged from all NHS Trust hospitals. The value of these data would have
been enhanced if they been linked with basic HES variables, such as diagnosis
or procedure. Given current DH policy to regularly survey patient experience,
inclusion of questions on health outcomes would involve very low marginal
cost.

The Health Outcomes Data Repository (HODaR) operates a continuous
survey of all inpatients and outpatients at a single large Welsh Trust. These are
now linked to individual level primary and community care data. Data for
more than 30,000 patients have been collected, almost 10% of these having
completed EQ-5D on more than one occasion. However, the data are
predominantly based on post-discharge observations and this limits their value
in measuring health outcomes. Since the advent of this project the HODaR
survey has started to collect data on pre-admission health status.
191

As described in Appendix C, for a number of years BUPA has been routinely
administering health status questionnaires to patients before and three months
after treatment, with some 100,000 having now been surveyed. These data are
restricted primarily to elective procedures. BUPA plan to extend them to four
types of cancer.
10.1.3 Cost
The incremental costs of introducing systematic observation of health status via
existing information systems are difficult to estimate. Currently BUPA estimates that
it costs around £4 per patient to administer their manual system of health status
measurement based on SF-36.
The introduction of systematic health status measurement might be achieved under the
aegis of the National Programme for Information Technology announced in
December 2002 with a budget of £2.3 billion and which the Audit Commission
suggested would provide the Department of Health with the opportunity to improve
NHS data quality.
It would seem sensible to consider an extension to the current HES-based data to
provide maximum scope for exploitation through record linkage. Modification of this
sort ought not to incur a significant cost. However, the data captured from patients
will require additional organisational and administrative costs. Patient-centred
reporting systems using traditional paper and pencil techniques require costly
processing in order to link them to other NHS data. Computer-assisted interview
methods have scope for more efficient data acquisition and transmission but would
need more costly administration. The use of handheld PDA recording systems is now
becoming a feature of many clinical trials that record patient-reported health status
and it can be expected that hardware costs will continue to fall.
10.2 Other outcome measures: patient satisfaction
As we emphasised in our First Interim Report the effect of the NHS on health status is
192
obviously an extremely important dimension of NHS outcomes but other dimensions,
especially process related outcomes, should also be taken into account. However,
existing patient surveys rarely ask similar questions over time. The DH should decide
on core aspects of the patient experience and ensure these questions appear each year
in patient satisfaction surveys. A problem we discuss in this report is that of
identifying a relevant weight to place on patient experience in a quality adjusted
output index. Ryan (2004) recommended that the DH undertake a series of discrete
choice experiments to obtain evidence on the relative value patients attach to different
aspects of process quality. We endorse this recommendation.
10.3 General practice data
10.3.1 GP activity
We have discussed the measures of GP consultations in section 4.4, noting the
unsatisfactory nature of the current source (General Household Survey) and that the
DH is now planning to obtain consultations directly from GP record systems via the
QRESEARCH database. We have agreed to investigate this new data for the DH to
determine what if any adjustments need to be made to improve its reliability.
Ideally NHS productivity measures should be based on numbers of patient journeys of
different types where journeys are likely to involve both primary and secondary care.
In the absence of routine record linkage such measures are not currently feasible but it
would still be worthwhile getting a finer breakdown of GP consultations to allow for
the changing mix of providers and for the changing mix of types of consultations.
We have also discussed measures of general practice quality in section 4.13. Whilst
QRESEARCH and similar databases of GP record systems and the central collection
of the greatly enlarged set of quality indicators linked to GP pay will provide
potentially useful quality adjustment these will need to be based on careful empirical
modelling of GPs responses to the new financial incentives, especially possible
diversion of effort from unremunerated quality efforts.
193
10.3.2 GP cost weights
The
PSSRU
estimates
the
unit
costs
of
GP
and
nurse
consultations
(http://www.pssru.ac.uk/pdf/uc2003/uc2003.pdf) using a variety of official and
unofficial sources. Several of the estimates rest on self reported GP activity from the
1992/3 GP Workload Survey undertaken for the DDRB. There does not appear to be a
more recent survey of GP activity and we recommend that DH should consider
undertaking such a survey at regular intervals.
10.3.3 General practice staff
Prior to April 2004, practice staff such as practice nurses, although practice
employees, were partly paid for by the NHS and so a record was kept, though it was
not a reliable source because not all practices claimed these subsidies. Under the new
GP contract the subsidies have been abolished and there is no record of non-GP staff
in practices. Now practices instead receive a sum of money based on their practice.
We recommend that the DH make it a condition of the practice contract that a full
return of employed staff is made.
10.3.4 Prescribing
The prescription activity measure in the recently revised NHS outputs index is derived
from PPA data. The PPA data are collected in order to remunerate pharmacists (and
dispensing GPs). It is therefore a comprehensive measure of prescriptions dispensed
and can be disaggregated to product type if required.
The data are reliable,
comprehensive and readily available at national levels of aggregation. They have been
used to construct a number of indicators of practice prescribing quality as well as
quantity.
The usefulness of the data could be greatly improved and this would be relatively
simple. The most obvious example is by improving the patient information on the
prescription form. At the moment the only patient data on the form indicates if the
patient is entitled to free prescriptions and on what grounds. The information has
been used by the Prescribing Support Unit to produce the Low Income Scheme Index
(LISI) which measures the proportion of prescriptions which are dispensed without
194
charge on grounds of low income. The LISI is the only direct variable measuring
practice population socioeconomic status which relates directly to practice patients
rather than being attributed from Census or Social Security data on the basis of patient
postcode. Adding a field for diagnosis to the prescription form would greatly enhance
the usefulness of routine prescribing data as a measure of prescribing quality. Adding
gender and age fields would also improve the socioeconomic data and improve
prescribing quality indicators. We recommend that the DH should add these fields to
the prescription form.
10.4 Other primary care data
NHS Direct, NHS Direct Online and Walk-In Centres are recent innovations in the
provision of first contact advice and information. They are likely to reduce the costs to
patients of such first contacts, leading both to an increase in primary care activities
and to a change in the mix of activities in general practice. The organisations are
expected to play an increasing role in the NHS over the coming years and it is
important that their presence is recognised in measures of NHS output and
productivity.
Aggregate data on use of NHS Direct and NHS Direct Online are available. In order
to measure the outputs of the services more accurately it would be helpful to have data
on

the breakdown of enquires between the provision of health advice and
information about the health service

the type of conditions people seek health advice about

actions that are recommended as a result of the request
It is possible that such data have been collected, for example via the website service
for those who seek advice from a nurse which involves self-completion of a detailed
questionnaire on the nature of the symptoms and condition, as well as personal
information. Presumably – although we have not been able to ascertain whether this is
the case – enquiries that result in a self-care recommendation are logged also. The
telephone service seems to be set up in a similar fashion, the difference being that the
195
information is recorded by NHS Direct staff.
It appears, therefore, that these two organisations routinely collect (or, at least, have
the capacity to collect) detailed electronic information from every person making an
enquiry about their (or their family member’s) health condition.
We recommend that the DH should utilise more of the data collected by NHS Direct
to improve measures of output.
10.5 Inputs
It is important to have a comprehensive coverage of inputs used in producing health
services in order to explain changes in outputs and to measure productivity. Thus we
require values of expenditures on inputs, volume measures and price deflators to
convert values to volume measures when the latter are not available. We also require
data on the extent to which the quality of inputs are changing through time. When
volume measures are available, under the assumption that payments to labour equal
marginal products, weighting diverse inputs by their shares in wage bills can be
employed to adjust for quality change. Alternatively hedonic regressions may be
employed to quality adjust price deflators. Both methods depend on competitive
market assumptions which are unlikely to characterise many of the markets in which
the NHS operates.
The analysis of labour input showed that it is possible to derive reasonable volume
measures and adjustments for quality change by linking readily available data sources.
The quality adjustment employed was certified qualifications with an additional
adjustment for job related training. While certified qualifications are useful they may
not be sufficiently detailed to capture differences in the productive capacity of some
employees. Many professionals within the NHS have similar qualifications since there
are minimum requirements set by professional bodies. While the NHS census does
include very detailed data on numbers of professionals by grade, there is no
comparable data on earnings. The persons responsible for the Earnings survey within
DH were unwilling to attempt to match earnings to the Census numbers by type on the
196
valid grounds that the sample size was too small. A larger sample survey of wage
rates and earnings within the NHS would be very useful, not only for the productivity
calculations carried out in this report but also to calculate the extent to which
increases in payments to labour are due to employing more highly skilled personnel as
against mere wage inflation.
The productivity calculations in section 9 above highlight the importance of
intermediate inputs in overall NHS activity and to a lesser extent in hospital activity.
Reliable volume measures require reasonable estimates of intermediate input
deflators. The main problem with intermediate inputs is the price deflator employed
for drugs. As mentioned in section 3.3 above ONS is currently attempting to refine its
deflator for prescription drugs based on detailed data on prices. Data sources for
prices of hospital drugs are not readily available. Even if such data could be collected,
it is doubtful if they would be useful as a tool for quality adjustments. Hedonic type
adjustments are only valid in competitive markets. The market for hospital drugs in
Britain is best characterised as a bilateral monopoly with a monopsony purchaser
buying from powerful oligopoly drug producers. The standard textbook model of
bilateral monopoly shows that the price will depend on the relative bargaining power
of the purchasers and suppliers. It is well known that the NHS buys drugs at a
discount and it may well be the case that discounting on new drugs may swamp any
quality change, at least at the point of entry. The use of prescription drugs are in the
control of independent GPs so the monopsony element is less important but there
remain market imperfections on the producer side.
These remarks suggest that using drugs prices to measure quality requires an
understanding of how markets operate and should be based on a time profile of prices
rather than comparisons between old and new drugs at the time new drugs enter the
market. A more fruitful but also costly approach might be to examine patient
outcomes and drug use from disease registers or to obtain opinions from panels of
experts. This is simply another example of the need for outcomes data if we are to
measure technical change and productivity in health care.
197
In order to quality adjust capital input it would be useful to have separate investment
data on types of equipment that have seen rapid technological change and change in
unit cost (e.g. MRI scanners). Alternatively it would be useful to have the value of the
stock of these assets, numbers of items and age profiles of the stock. One problem that
must be addressed is the gap in data created by PFI confidentiality. We understand
that a significant proportion of new investment in equipment such as scanners is being
undertaken under PFI contracts. Unless it is possible to access information on stocks
and value, it may not be possible to adequately deal with questions of productivity
growth and technical change associated with investment in new equipment. It would
also be useful to have information on investment by GPs.
11 Conclusions and recommendations
11.1 Methods
11.1.1 The preferred approach
Economic theory suggests that the preferred way of measuring NHS output is with a
value weighted output index.
•
The unit of output is the patient treated, the characteristics of output valued by
individuals indicate quality and the weight attached to each characteristic
reflects the marginal social value of the characteristic.
•
The index overcomes the serious problem of a cost weighted index where
movement to more cost-effective ways of treating patients appears as a
reduction in output.
•
Data necessary to estimate this index are not currently available but are
feasible to collect.
•
A condition specific value weighted index can be constructed as data on major
diseases becomes available.
Not only is the value weighted index theoretically correct, it would allow
measurement of improvements in delivery of services intended to raise both
productivity and patient satisfaction.
198
11.1.2 Methods using existing data
It is not possible to calculate a value weighted index with current data. It is possible
to quality adjust the hospital component of a cost weighted NHS output index using
existing data combined with some assumptions. We have
•
spelt out the methods for quality adjustment with existing data in some detail,
taking care to emphasise the necessary assumptions and their implications,
rather than merely presenting plausible ad hoc adjustments which may in fact
contain dubious assumptions or value judgements.
•
shown how it is possible to use routine data on short term survival and waiting
times, coupled possibly with an explicit assumption about the proportionate
effect of treatment on health, to calculate quality adjusted cost weighted output
indices.
•
presented experimental calculations of these indices to compare their effects
on a simple cost weighted output index and to investigate the empirical
implications of the assumptions about important parameters which are
required.
•
used data on the health effects of a limited set of treatments to illustrate the
construction of a value weighted index for the set and to shed further light on
the implications of making possibly inaccurate assumptions about the health
effects when constructing an index for all hospital treatments.
•
described how it is possible in principle to use data on other aspects of care
(readmissions, MRSA, patient satisfaction with food, cleanliness and nonclinical care) to provide an additional quality adjustment and have produced
some illustrative examples of such adjustments based on the current
unsatisfactory data on these characteristics of care.
•
described a method of quality adjustment using the information on longer term
survival which will become available in the near future.
199
11.2 Results
11.2.1 Results for the hospital sector
Results are reported for a cost weighted quality adjusted output index over the period
1999/00 – 2003/04. It was only possible to quality adjust for hospital activity, 47% of
expenditure covered in the DH cost weighted output index. In Section 5 we examine
sensitivity to discount rates and key assumptions. Table 11.1 summarises the central
results.
Table 11.1 Hospital sector cost weighted output index with recommended
quality adjustments (growth rates%)
No
adjustment
Survival and
health effects
adjustment1
With survival. health
effects, and waiting
time adjustments2
1998/99-1999/00
2.03
0.91
1.28
1999/00-2000/01
1.54
1.99
1.80
2000/01-2001/02
4.48
4.40
4.31
2001/02-2002/03
3.94
5.19
5.06
2002/03-2003/04
4.78
5.51
5.56
Average p.a.
3.35
3.60
3.60
1
This sets k = q0/q* = 0.8 for electives, k = 0.4 for non-electives, k = actual k for those HRGs included
in the specimen index where this is known, provided a – k > 0.10 and k = 0 otherwise.
2
Recommended quality variant 2. As note 1 plus discounts to date of treatment with charge for wait,
with discount rates on waits and life expectancy equal to 1.5% and uses 80th percentile waits.
Table 11.2 shows the impact of incorporating quality adjustments for the hospital
sector into the overall NHS cost weighted output index.
200
Table 11.2 Aggregate NHS cost weighted output index with hospital sector
quality adjustments (growth rates %)
Unadjusted CWOI
CWOI with hospital survival
and waiting time adjustments1
1998/99-1999/00
2.61
2.22
1999/00-2000/01
2.11
2.26
2000/01-2001/02
3.85
3.74
2001/02-2002/03
5.07
5.78
2002/03-2003/04
4.43
4.93
Average p.a.
3.62
3.79
1
Recommended quality variant 2. This sets k = q0/q* = 0.8 for electives, k = 0.4 for non-electives, k =
actual k for those HRGs included in the specimen index where this is known, provided a – k > 0.10 and
k = 0 otherwise; discounts to date of treatment with charge for wait, with discount rates on waits and
life expectancy equal to 1.5% and uses 80th percentile waits.
Overall, our results show that quality adjustment with existing data can make an
impact on measures of NHS output and as more routinely collected data becomes
available, the quality adjustment can be improved.
11.2.2 Other quality indicators
We were asked to explore the use of indicators such as patient satisfaction,
readmission rates, clinical errors and incidence of MRSA. The data are not at present
suitable for inclusion in an output index but some “illustrative” adjustments are
reported in Table 11.3.
Table 11.3 Illustrative calculations of hospital CWOI with adjustments for
survival, waiting times, patient satisfaction as measured in patient surveys,
readmissions and MRSA. Average annual growth rates 2001/2 to 2003/4
Average Growth Rates 2001/2 to 2003/4
% p.a.
Unadjusted cost weighted output index
With adjustment for survival and waiting times1
With adjustment for adjustment for satisfaction with food, cleanliness,
and non clinical care2
With adjustment for MRSA, readmissions satisfaction with food,
cleanliness, and non clinical care2
4.34%
5.74%
5.71%
5.71%
1
Variant 1 (k = 0.8, mortality cut off 0.15, discounting to date of treatment, discount rate on waits
1.5%, on life expectancy 1.5%, 80th percentile waiting time measure).
2
5% weight on non-clinical care
201
11.3 Total factor productivity growth
Employing a new quality adjusted index of labour input in the hospital sector,
provisional estimates of productivity growth are reported. Key results appear in Table
11.4. They highlight the importance of quality adjustments to output in evaluating
NHS performance.
Table 11.4 Total factor productivity growth (%)
Total NHS
Unadjusted Recommended
Quality
Variant
Hospital
Unadjusted Recommended
Quality
Variant
1998/99-1999/00
-2.33
-2.71
-2.82
-3.53
1999/00-2000/01
0.55
0.69
0.30
0.56
2000/01-2001/02
-2.12
-2.22
0.17
0.01
2001/02-2002/03
-1.86
-1.19
-2.01
-0.95
2002/03-2003/04
-2.97
-2.51
-1.13
-0.39
Average p.a.
-1.75
-1.59
-1.11
-0.87
11.4 Recommendations
Recommendations for improving quality adjustment were made throughout the report.
We summarise the main ones here.
For the medium term improvement of the output index improvements to the data are
required. We recommend:
•
Routine collection of outcomes data for a range of NHS treatments. The
programme should start with a few high volume elective and medical
conditions that would permit sampling rather than complete coverage. The
data would also be immensely useful for other purposes including monitoring
of Trust performance and improved cost-effectiveness analysis of particular
treatments.
202
•
Collection of longer term survival data by linkage of HES and ONS records to
produce estimates of patient life expectancy.
•
A patient identifier that will permit grouping NHS activities across institutions
and by disease. The DH has plans to implement this change.
•
Stated preference studies of patients to establish their relative valuations of the
characteristics of NHS output from waiting times to being treated with
courtesy and dignity by staff.
The studies should also include a cost
characteristic so that monetary valuations can be inferred and all
characteristics can be valued in a common unit. The studies will enable the
data from patient satisfaction studies to be utilised for quality adjustment as
well as informing decision making in the NHS.
For the short run improvement of the output index with available data:
•
We recommend the use of short term survival coupled with life expectancy to
quality adjust hospital output.
•
The short term survival adjustment will underestimate output growth. We
recommend that it be coupled with an estimated health effect derived from an
estimate of the proportionate effect of treatment:
o
As the data become available from surveys of patient health before and
after treatment and elsewhere, treatment specific estimates of k j = qojt / q*jt
should be used.
o
Where there are no treatment specific estimates, kj should be estimated as
the volume-weighted mean of existing treatment specific estimates for the
relevant class (electives and non-electives).
o
In the absence of any estimates of treatment specific k for non-electives
the estimate for non-electives should be equal to half the volume-weighted
mean k of the electives.
o
The health effects adjustment should be used only for treatments with a
mortality rate of 0.10 or less.
•
We recommend the use of a waiting time adjustment based on discounting to
date of treatment, with a charge for waiting. Theoretical considerations
203
suggest that the discount rate on waits should be the same as the discount rate
on QALYs. We suggest 1.5%.
•
We do not recommend quality adjustments based on patient satisfaction with
food, cleanliness, and non-clinical care until there are data on the relative
marginal values of these outcomes. If it is felt that estimates of the costs of
cleaning and food derived from Trust accounts reasonably reflect marginal
social values then it would be possible to include an adjustment just for these
satisfaction indicators.
•
We do not recommend quality adjustments based on readmission rates and
MRSA because of data problems and because they may reflect aspects of care
which are better captured in the other quality adjustments.
•
We recommend the use of 30 day mortality, rather than in hospital mortality,
as the measure of short term survival, since we believe its greater theoretical
merits outweigh the difficulties in calculating it. As data linkage methods are
improved the advantage of the 30 day mortality will increase.
•
The waiting time measure should be a certainty equivalent wait, to avoid the
need to calculate adjustments on individual data. The 80th percentile wait
seems a reasonable value.
•
Quality adjustments of hospital output should use CIPS rather than FCEs as
the unit of output.
•
HES rather than the Reference Cost data base should be the source of data on
hospital outputs.
204
11.5 Acknowledgements
The outputs and productivity project is funded by the Department of Health. The
views expressed here are not necessarily those of the Department of Health. This
report has benefited greatly from discussion with numerous experts, including the
members of the Steering Group Committee, Jack Triplett (consultant to the project
team), Sir Tony Atkinson, Barbara Fraumeni, Andrew Jackson, Azim Lakhani, Phillip
Lee, Alan Maynard, Alistair McGuire and Alan Williams. A number of individuals
and organisations contributed to assemble the data used in this report. We are grateful
to Kate Byram, Mike Fleming, Geoff Hardman, James Hemingway, Sue Hennessy,
Sue Macran, Paula Monteith, Casey Quinn, Sarah Scobie, Bryn Shorney, Craig
Spence, Karen Wagner, Chris Watson, BUPA, the Cardiff Research Consortium,
Health Outcomes Group and York Hospitals NHS Trust. Any errors and omissions
remain the sole responsibility of the authors.
205
References
Atkinson, T. (2005) ‘Measurement of government output and productivity for the
national accounts’, Atkinson Review: Final Report, HMSO, 31 January 2005.
Baily, M. and Garber, A. (1997). ‘Health care productivity’, Brookings Papers on
Economic Activity. Microeconomics, 143-215.
Baker, D., Einstadter, D., Thomas, C., Husak, S., Gordon, N., Cebul, R. (2002).
‘Mortality trends during a program that publicly reported hospital
performance’, Medical Car,e 40 (10): 879-890
Basu, S., Fernald, J., Oulton, N. and Srinivasan, S. (2003) ‘The case of the missing
productivity, or, does information technology explain why productivity
accelerated in the United States but not in the United Kingdom’, NBER
working paper no. 10010.
Berndt, E.R., Busch, S.H. and Frank, R.G. (2001) ‘Treatment price indexes for acute
phase major depression’, in D.M. Cutler and E.R. Berndt, eds.: Medical Care
Output and Productivity, University of Chicago Press, Chicago.
Berndt, E.R., Bir, A., Busch, S.H., Frank, R.G. and Normand, S.T. (2002) ‘The
medical treatment of depression, 1991-1996: productive inefficiency, expected
outcome variations, and price indexes’, Journal of Health Economics, 21: 373396.
Bruce, J., Russell, E.M., Mollison, J. and Krukowski, Z.H. (2001) ‘The measurement
and monitoring of surgical adverse events’, Health Technology Assessment, 5
(22). Available at: http://www.hta.nhsweb.nhs.uk/fullmono/mon522.pdf
Cairns, J.A. (1994) ‘Valuing future benefits’, Health Economics, 3: 221-229.
Cairns, J.A. and van der Pol, M.M. (1997) ‘Constant and decreasing timing aversion
for saving lives’, Social Science and Medicine, 45 (11): 1653-1659.
Carthy, T., Chilton, S., Covey, J., Hopkins, L., Jones-Lee, M., Loomes, G., Pidgeon,
N. and Spencer, A. (1999) ‘The contingent valuation of safety and the safety
of contingent valuation, part 2: The CV/SG ‘chained’ approach’, Journal of
Risk and Uncertainty, 17: 187-213.
CDRweekly (2002) The first year of the Department of Health’s mandatory MRSA
bacteraemia surveillance scheme in acute NHS Trusts I England: April 2001 –
March 2002, Health Protection Agency, Vol. 12, nr. 25, 20 June 2002
CDRweekly (2003) The second year of the Department of Health’s mandatory MRSA
bacteraemia surveillance scheme in acute NHS Trusts I England: April 2002 –
March 2003, Health Protection Agency, Vol. 13, nr. 25, 19 June 2003.
CDRweekly (2004) The third year of regional and national analysis of the Department
of Health’s mandatory MRSA bacteraemia surveillance scheme in acute NHS
206
Trusts I England: April 2001 – March 2004, Health Protection Agency, Vol.
14, nr. 29, 15 July 2004.
Commission for Health Improvement (2003) NHS performance ratings acute trusts,
specialist trusts, ambulance trusts 2002/2003, Commission for Health
Improvement: London. http://www.chi.nhs.uk/ratings/
Crowcroft, N. S. and M. Catchpole (2002) ‘Mortality from methicillin resistant
Staphylococcus aureus in England and Wales: analysis of death certificates’,
British Medical Journal, 325: 1390-1391.
Cutler, D.M. and Huckman, R.S. (2003) 'Technological development and medical
productivity: the diffusion of angioplasty in New York state', Journal of
Health Economics, 22: 187-217.
Cutler, D.M., McClellan, M., Newhouse, J.P. and Remler, D. (2001) 'Pricing Heart
Attack Treatments', in D.M. Cutler and E.R. Berndt, eds.: Medical Care
Output and Productivity, University of Chicago Press, Chicago.
Dawson, D., Gravelle, H., Kind, P., O’Mahony, M., Street, A. and Weale, M. (2004a)
‘Developing new approaches to measuring NHS outputs and productivity’,
First Interim Report to Department of Health, CHE Technical Paper 31, July
2004. Available at: http://www.york.ac.uk/inst/che/tech.htm
Dawson, D., Gravelle, H., Kind, P., O’Mahony, M., Street, A. and Weale, M. (2004b)
‘Measurement of NHS outputs and productivity growth: memorandum to
Department of Health on data requirements’, 30 September 2004.
Dawson, D., Gravelle, H., Kind, P., O’Mahony, M., Street, A. and Weale, M. (2004c)
‘Developing new approaches to measuring NHS outputs and productivity:
Data for productivity estimates’, Second Interim Report to Department of
Health, November 2004.
Deaton, A. and Muellbauer, J. (1980) Economics and Consumer Behaviour,
Cambridge University Press, Cambridge.
Department of Health (1996) ‘Policy Appraisal and Health: a guide from the
Department of Health’, London GO70/38 3901.
Department of Health (2001) NHS Performance Ratings: Acute Trusts 2000/01,
Department of Health: London.
http://www.doh.gov.uk/performanceratings/2001/index.html
Department of Health (2002a) ‘Reforming NHS Financial Flows: introducing
payment by results’, Department of Health: London.
Department of Health (2002b) ‘Getting ahead of the Curve: a strategy for combating
infectious diseases (including other aspects of health protection)’, Department
of Health: London. Available at:
http://www.dh.gov.uk/assetRoot/04/06/08/75/04060875.pdf
Department of Health (2002c) NHS Performance Ratings and Indicators: Acute
207
Trusts, Specialist Trusts, Ambulance Trusts, Mental Health Trusts 2001/02,
Department of Health: London.
http://www.doh.gov.uk/performanceratings/2002/index.html
Department of Health (2003a) Investing in General Practice, the new General
Medical Services Contract, Department of Health: London.
http://www.dh.gov.uk/assetRoot/04/07/86/58/04078658.pdf
Department of Health (2003b) ‘Surveillance of Healthcare Associated Infections’,
Publication and Statistics, Letters and Circulars, Department of Health:
London. Available at:
http://www.dh.gov.uk/assetRoot/04/01/34/10/04013410.pdf
Department of Health (2004a), ‘The ‘Experimental’ NHS Cost Efficiency Growth
measure’, Department of Health: London.
Department of Health (2004b), Departmental Report, Department of Health: London.
Department of Health (2004c) ‘Chief Executive's Report to the NHS’, Department of
Health: London.
Department of Health (2004d), Bloodborne MRSA infection rates to be halved by
2008 – Reid, Publication and Statistics, Press Release, Department of Health:
London. Available at:
http://www.dh.gov.uk/PublicationsAndStatistics/PressReleases/PressReleases
Notices/fs/en?CONTENT_ID=4093533&chk=MY%2BkD/
Department of Health (2004e), Infection control training for over one million NHS
staff, Publication and Statistics, Press Release, Department of Health: London.
Available at:
http://www.dh.gov.uk/PublicationsAndStatistics/PressReleases/PressReleases
Notices/fs/en?CONTENT_ID=4093432&chk=7bcARL
Department of Health (2004f), MRSA surveillance system – results, Health Protection
Agency Communicable Disease Surveillance Centre for the Department of
Health,
Department
of
Health:
London.
Available
at:
http://www.dh.gov.uk/PublicationsAndStatistics/Publications/PublicationsStat
istics/PublicationsStatisticsArticle/fs/en?CONTENT_ID=4085951&chk=HBt2
QD
Department of Health (2005) CMO Update, ‘New reporting system to improve patient
safety’, March.
Devlin, N. and Parkin, D. (2004) ‘Does NICE have a cost-effectiveness threshold and
what other factors influence its decisions? A binary choice analysis’, Health
Economics, 13:437-452.
EuroQol Group (1990) ‘EuroQol – a new facility for the measurement of healthrelated quality of life’, Health Policy, 16: 199-208.
Eurostat (2001) Handbook on price and volume measures in national accounts.
Luxembourg, Office for Official Publications of the European Communities.
208
Giuffrida A, Gravelle H, Roland M. (1999) ‘Measuring Quality of Care with Routine
Data: Avoiding Confusion between Performance Indicators and Health
Outcomes’, British Medical Journal, 319:94-98.
Gravelle, H. and Smith, D. (2001) ‘Discounting for Health Effects in Cost-Benefit and
Cost-Effectiveness Analysis’, Health Economics, 10:587-599.
Healthcare Commission (2004) 2004 performance ratings, Healthcare Commission:
London. http://ratings2004.healthcarecommission.org.uk/
Healthcare Commission (2005) 2005 performance ratings, Healthcare Commission:
London. http://ratings2005.healthcarecommission.org.uk/
Health Protection Agency (2003), NINSS Partnership, Surveillance of Surgical Site
Infection in English hospitals 1997 – 2002. Available at:
http://www.hpa.org.uk/infections/topics_az/hai/SSIreport.pdf
Health Protection Agency and Office for National Statistics (2004), Trends in MRSA
in England and Wales: analysis of morbidity and mortality data for 1993 –
2002, Health Statistics Quarterly 21 – Spring 2004. Available at:
http://www.statistics.gov.uk/STATBASE/Product.asp?vlnk=6725
Health Protection Agency (2004), Protocol for Surveillance of Surgical Site Infection
– Surgical Site infection Surveillance Service England.
http://www.hpa.org.uk/infections/topics_az/hai/SSI%20Protocol.pdf
Hemingway, J. (2004) ‘Sources and methods for public service productivity: health’,
Economic Trends, 613: 82-90.
Hicks, J.R. (1940) 'The valuation of social income', Economica, 7: 105-124.
HM Treasury (2003) ‘The Green Book: Appraisal and Evaluation in Central
Government’, London: TSO.
Hofer, T.P. and Hayward, R.A. (1995) ‘Can early re-admission rates accurately detect
poor-quality hospitals?’, Medical Care, 33(3): 234-45.
Hurst, J. and Siciliani, L. (2003) ‘Tackling Excessive Waiting Times for Elective
Surgery: A Comparison of Policies in Twelve OECD Countries’, OECD
Health Working Paper 6, OECD, Paris. Available at: www.oecd.org/health
Jackson G, Tobias M. (2001) ‘Potentially avoidable hospitalizations in New Zealand,
1989-98’, Australian and New Zealand Journal of Public Health, 25(3):21221.
Jorgenson D, Ho. M and Stiroh, K. (2005) ‘Labour Input and the Returns to
Education’, Chapter 6 of Information Technology and the American Growth
Resurgence, MIT Press.
Lakhani, A., Coles, J., Eayres, D., Spence, C. & Rachet, B. (2005) ‘Creative use of
existing clinical and health outcomes data to assess NHS performance in
England: Part 1-performance indicators closely linked to clinical care’, British
209
Medical Journal, 330:1426-1431.
Lancaster, K. (1971) Consumer Demand: A New Approach, Columbia University
Press, New York.
Lee, P. (2004) ‘Public service productivity: health’, Economic Trends, 613: 38-59.
Ludke, R.L., Booth, B.M. and Lewis-Beck, J.A. (1993) ‘Relationship between early
readmission and hospital quality of care indicators’, Inquiry, 30: 95-103.
Mai, N. (2004) ‘Measuring health care output in the UK: a diagnosis based approach.’
National Patient Safety Agency (2005) Building a memory: preventing harm,
reducing risks and improving patient safety, July, on line resource available at
http://www.npsa.nhs.uk/site/media/documents/1246_PSO_Report_FINAL.pdf
National Patient Safety Agency (2005) National Patient Safety Agency: helping to
make the NHS safer, July 2005, on line resource available at:
http://www.npsa.nhs.uk/site/media/documents/1260_NPSA_Information.pdf
OECD (2000) A System of Health Accounts. Version 1.0. OECD, Paris.
OECD (2001) Measuring Productivity: Measurement of Aggregate and Industry Level
Productivity Growth, OECD, Paris. Available at:
http://www.oecd.org/dataoecd/59/29/2352458.pdf
Pratt, J. (1964) ‘Risk aversion in the small and the large’, Econometrica, 32: 122-136.
Roland, M. (2004) ‘Linking physician pay to quality of care – a major experiment in
the UK’, New England Journal of Medicine. 351, 1448-1454.
Rosen, S. (2002) ‘Markets and diversity’, American Economic Review, 92 (1): 1-15.
Ryan, M., Odejar, M. and Napper, M. (2004) ‘The Value of Reducing Waiting Time
in the Provision of Health Care: A Review of the Evidence.’ Report to the
Department of Health, Health Economics Research Unit, Aberdeen.
Sefton, J. and Weale, M. (to appear) ‘The concept of income in a general
equilibrium’, Review of Economic Studies.
Shapiro, I., Shapiro, M.D. and Wilcox, D.W. (2001) 'Measuring the Value of Cataract
Surgery', in D.M. Cutler and E.R. Berndt, eds.: Medical Care Output and
Productivity, University of Chicago Press, Chicago.
Simkins, A. (2005) ‘Using the GP contract Quality and Outcomes Framework for
quality adjustment’, mimeo, August 2005.
Street, A. and AbdulHassain, S. (2004), ‘Would Roman soldiers fight for the financial
flows regime? The re-issue of Diocletian’s edict in the NHS’, Public Money
and Management, 24 (5): 301-38.
Thomas, J.W. (1996) ‘Does risk-adjusted readmission rate provide valid information
210
on hospital quality?’ Inquiry 1996; 28: 258-70.
Victorian Government (2004), The Victorian Ambulatory Care Sensitive Conditions
Study, 2001–02, Public Health, Rural and Regional Health and Aged Care
Services, Victoria Government Department of Human Services Melbourne
Victoria, July 2004.
Williams, A. (1985) 'The economics of coronary artery bypass grafting', British
Medical Journal, 291: 326-9.
Yule, G.U. (1934) 'On some points relating to vital statistics, more especially statistics
of occupational mortality', Journal of the Royal Statistical Society, 97: 1-84.
211
Annex: How should NHS output be measured?
Value weighted output index
The value weighted output index is our preferred way to measure NHS output:
I ytxq =
∑x ∑π
∑x ∑π
j
j
q
jt +1
k
kt kjt +1
jt
k
kt kjt
q
where xjt is the volume of output j in period t, qkjt is the amount of outcome or characteristic k produced
by a unit of j, and πkt is marginal value of outcome k.11
The index requires data on both the characteristics produced and on their marginal social value. Since
improving the health of patients is a primary objective of the NHS, improved health outcomes are one
of the most important characteristics of treatment. But other characteristics of treatment also affect
utility, e.g. the length of time waited for treatment, the degree of uncertainty attached to the waiting
time, relationship with doctors, hospital food and safety. These can be incorporated in the value
weighted output index when the necessary data are available.
Cost weighted output index
Continuous
inpatient
spells
(CIPS)
If the data needed to calculate the value weighted output index are not
available, we can instead use unit costs to weight outputs, and make use of
available data to quality adjust these cost weighted outputs. The cost weighted
output index is:
I ctx =
∑x
∑x
c
jt +1 jt
j
j
jt
c jt
Information on survival can be used to adjust the cost weighted output index
30 day
mortality
rates
Data on short term survival can
be used to adjust the index as
follows:
∑
⎛ a jt +1 ⎞
c
x
⎜⎜
⎟⎟
jt
jt
+
1
j
⎝ a jt ⎠
∑ j c jt x jt
Information on long term survival (not
currently available for most treatments) could
be used to adjust the index as follows:
∑
x c
j jt +1 jt
⎞
a jt +1 S ⎛ σ *jt +1 ( s ) ⎞ ⎛
σ *jtδ s
⎜
⎟
⎜
⎟
∑
5
*
s =1 ⎜
*
s
⎟
a jt
⎝ σ jt ( s ) ⎠ ⎜⎝ ∑ s =1σ jt ( s )δ ⎟⎠
∑ j x jt c jt
11
Please refer to the table of notation at the end of this report for further details of the notation used
here.
212
The simple survival adjustment above implies that the patient would have zero quality adjusted life
years if not treated. It is possible to introduce an additional term into the formula to include a
uniform estimate of the difference between health before and after treatment, giving the health
effect survival index:
⎛ a jt +1 − k j
x
∑ j jt +1 ⎜⎜ a − k
j
⎝ jt
∑ j x jt c jt
⎞
⎟⎟ c jt
⎠
Note that for HRGs where the mortality rate is high, we make no health effect adjustment and use
only the change in survival.
Information on changes in the life expectancy of patients treated can also be included as follows:
∑ j x jt +1c jt
(a − k ) ⎛ 1− e
( a − k ) ⎜⎝ 1 − e
∑xc
jt +1
j
jt
j
j
Certainty
equivalent wait –
measured as the
waiting time for
patients at the 80th
percentile of the
waiting time
jt
− rL jt +1
− rL jt
⎞
⎟
⎠
jt
Information on waiting times can be used to
quality adjust the cost weighted output index.
This approach regards reductions in the wait
for treatment as valuable because of their effect
on the discounted value of the health gain from
treatment
There are two main forms of waiting time adjustment
Discount to date of treatment with charge for waiting
(
Discount to date placed on list
) (
)
r w
⎡ 1 − e − rL L jt +1
e w jt +1 − 1 ⎤
⎢
⎥
−
rL
rw
⎥
⎛ a jt +1 − k j ⎞ ⎢⎣
∑ j x jt +1c jt ⎜⎜ a − k ⎟⎟ ⎡ 1 − e− rLL jt erww jt − 1 ⎤ ⎦
j ⎠
⎝ jt
⎢
⎥
−
rL
rw
⎢
⎥
⎣
⎦
∑ j x jt c jt
(
) (
)
This is the form of the CWOI that we recommend should be
used in the interim, where;
•
rL = rW = 0.015,
•
kj = actual kj if known, = mean k for known electives
if kj not known and elective, = ½ mean k for electives
if non-elective,
•
if (ajt+1 – kj) and (ajt – kj) < 0.10 then kj = 0.
•
wjt, wjt+1 are 80th percentile waits
(
(
jt +1
1 − e jt +1
⎞⎛ e
⎟⎟ ⎜ − rw jt
− rL
1 − e jt
⎠ ⎜⎝ e
∑ j c jt x jt
⎛a −k
∑ j c jt x jt +1 ⎜⎜ ajt +1− k j
j
⎝ jt
− rw
− rL
)
(assumes rW = rL = r)
213
) ⎞⎟
⎟
⎠
Additional quality adjustments can also be made to the CWOI. A lack of appropriate data means that
only illustrative adjustments for these additional aspects of quality could be calculated at present.
A quality adjustment for readmissions and MRSA can be
incorporated based on the assumption that their cost is a
deadweight loss which reduces the value of treatment.
Hence the CWOI (ignoring other quality adjustments for
illustrative purposes) can be adjusted as follows,
∑x
∑x
c − ∑ x bjt +1c bjt
jt +1 jt
j
j
j
jt
c jt − ∑ x bjt c bjt
j
where xb denotes the number of readmissions or cases of
MRSA and cb their costs.
Patient satisfaction can also be incorporated in the
CWOI, using measures of patient experience, derived
from patient satisfaction surveys, as summary indicators
of characteristics that patients value.
This can be incorporated as follows;
n
I tComp = ∑ ωk I tk
k =0
where ItComp is calculated as the weighted sum of the
growth rate of one of the quality adjusted output indices
and the growth rates of the other indicators. We denote
the growth rate of indicator k by Itk. If there are n such
indicators, and the relevant weights are denoted by ωk
then the overall index is given by the formula above.
214
Table of Notation
Notation
x jt
π kt
q jkt
p jt = ∑k π kt q jkt
y jt = p jt x jt
Interpretation
quantity of output j at time t (units of
j)
marginal
social
value
of
characteristic k at time t
quantity of characteristic k produced
by one unit of output j at time t
marginal social value of unit of
output j at time t (£s per unit of j)
value of output j at time t (£s)
yt = ∑i y jt
total value of NHS output
I ytxq
value weighted output index
g xjt
growth rate of output xj
g qkjt
growth rate of characteristic k
produced by output j
proportion of marginal value of
output j accounted for by
characteristic k
proportion of the total value of
period t output accounted for by
output j
cost weighted output index CWOI
ω ptkt
ω ytjt
I ctx
c jt
unit (average) cost of output j at
time t (£s per unit of j)
m jt
mortality rate from NHS output j in
period t
survival rate (1-mjt)
vector of mental and physical health
characteristics
health level from having health state
ajt
θ
h(θ )
σ
*
jt
(s )
p *jt (θ , s )
h*jt ( s ) = ∑θ p*jt (θ , s ) h (θ )
δ
q*jt = ∑ s δ s −tσ *jt ( s )h*jt ( s )
h oj ( s )
θ
probability of surviving s periods
given that the patient survived
treatment j at date t
probability of being in health state θ
conditional on surviving s periods
after treatment j at date t
expected level of health conditional
on surviving s periods
discount factor
discounted sum of quality adjusted
life years produced by the treatment
if the patient survives treatment
expected health s periods hence if
215
g ajt
the patient does not receive
treatment j conditional on surviving
s periods
probability of surviving without
treatment
probability of being in health state θ
after s periods conditional on
surviving
without
receiving
treatment
discounted sum of quality adjusted
life years if patient not treated
expected increase in discounted
QALYs from treatment j at time t
growth rate of survival
g q*
growth rate in q *jt
g q0
growth rate in q 0jt
I ctxa
survival adjusted cost weighted
output index
value of a reduction of one day in
waiting time for treatment j in year t
value of health gain
σ 0j (s )
p 0j (θ , s )
q0jt = ∑ s δ sσ oj ( s )h oj ( s )
q jt = (1 − m jt )q *jt − q 0jt
jt
jt
π wjt
π ht
rw , rL
w jt
discount rates on the wait for
treatment, QALYs
waiting time for treatment j in year t
L0t
time path of health with treatment
after wait w
life expectancy without treatment
L*t
life expectancy with treatment
I ctxaw
survival and waiting time adjusted
cost weighted index
h * (s; wt )
216
Developing new approaches to measuring NHS
outputs and productivity
FINAL REPORT - APPENDICES
7 September 2005
(revised 7 November)
Diane Dawson∗, Hugh Gravelle**,
Mary O’Mahony***, Andrew Street*, Martin Weale***
Adriana Castelli*, Rowena Jacobs*, Paul Kind*, Pete Loveridge***,
Stephen Martin****, Philip Stevens***, Lucy Stokes***
∗
Centre for Health Economics, University of York. **National Primary Care Research and
Development Centre, Centre for Health Economics, University of York. ***National Institute for
Economic and Social Research. ****Department of Economics and Related Studies, University of York.
Table of contents
Appendices ................................................................................... 4
A. Data sources other than HES ................................................. 4
1. Data on the patient experience................................................................................4
1.1 Data ......................................................................................................................4
1.2 Indicators.............................................................................................................5
2. Data on clinical errors ...........................................................................................10
3. Data on cleanliness and infections ........................................................................13
3.1 Cleanliness .........................................................................................................13
3. 2 Infection control................................................................................................14
3.2.1 Wound infections ....................................................................................15
3.2.2 Infection control......................................................................................16
3.2.3 MRSA .....................................................................................................16
4. Outpatient waiting times .......................................................................................17
5. Readmission rates ..................................................................................................18
6. Life expectancy.......................................................................................................20
B. Hospital Episode Statistics (HES) ....................................... 26
1. HES regular attender episodes .............................................................................26
2. HES: duplicate records..........................................................................................27
3. HES: episodes and spells .......................................................................................28
4. Identifying continuous inpatient (CIP) spells of NHS care................................29
5. HRG assignment ....................................................................................................31
6. Unit costs of spells ..................................................................................................31
7. Quarterly output measures ...................................................................................35
8. Identifying spells that end in death ......................................................................35
9. Mortality summary statistics ................................................................................37
10. Missing Observations: Maintaining Sample Size..............................................45
C. Measuring the health outcomes secured by treatment...... 46
1. Introduction............................................................................................................46
2. Defining health effects ...........................................................................................46
3. Measurement instruments.....................................................................................50
4. Clinical trial data ...................................................................................................53
5. Observational data.................................................................................................55
5.1 Health Outcomes Data Repository.....................................................................56
5.1.1 Analysis...................................................................................................60
5.1.2 Conclusions.............................................................................................65
5.2 York District Hospital........................................................................................65
5.2.1 Results.....................................................................................................66
5.2.2 Conclusions.............................................................................................70
5.3 BUPA.................................................................................................................72
5.3.1 Temporal trends in health gain ...............................................................72
5.3.2 SF36 data used in the specimen index ....................................................87
D. Comparing marginal and average cost................................ 93
1. Introduction............................................................................................................93
2. Methods...................................................................................................................93
3. Data .........................................................................................................................95
ii
4. Discussion................................................................................................................95
5. Results .....................................................................................................................95
iii
Appendices
A. Data sources other than HES
1. Data on the patient experience
1.1 Data
The following surveys were carried out in 2004:
Acute Trusts Inpatient Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4004319&chk=qlIfTU
Acute Trusts: Young Patients Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4004320&chk=BOfOB4
Primary Health Care Trust Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4004323&chk=Z%2BCM6e
Mental Health Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4004322&chk=yWjJbN
Ambulance Patients Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4004321&chk=uZoE2N
The Acute Trusts Inpatient Survey compared results with an earlier survey in 2003,
while the Primary Health Care Trust Survey compared results with a survey in 2003.
The other surveys did not compare results with earlier data.
4
In 2004/5 two relevant surveys have been completed:
Emergency Department Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4011238&chk=0bcNSV
Outpatient Department Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs
/en?CONTENT_ID=4011237&chk=bn4nNE
These both compare results with figures for 2003.
1.2 Indicators
In tables A1 to A4 the change in the logarithm of the quantified indicator is shown.
5
Table A1 Adult Inpatients 2004 compared to 2002
% change
2004 over
2002
annual rate
In your opinion, how clean was the hospital room or ward that you were in?
-0.20%
How clean were the toilets and bathrooms that you used in hospital?
Average cleanliness
How would you rate the hospital food?
Average food
How organised was the care you received at A&E?
Following arrival at the hospital how long did you wait before admission to a
room or ward and bed
When you were told you would be going into hospital, were you given enough
notice of your date of admission?
Was your admission date changed by the hospital?
From the time that you arrived at the hospital, did you feel that you had to wait a
long time to get to a bed on a ward?
When you had important questions to ask a doctor, did you get answers that you
could understand?
Did doctors talk in front of you as if you weren't there?
When you had important questions to ask a nurse, did you always get answers
that you could understand?
Did nurses talk in front of you as if you weren't there?
Sometimes in a hospital, a member of staff will say one thing and another will
say something quite different. Did this happen to you?
If your family or someone else close to you wanted to talk to a doctor, did they
have enough opportunity to do so?
-1.07%
-0.63%
0.64%
0.64%
1.95%
Were you given enough privacy when discussing your condition or treatment?
Were you given enough privacy when being examined or treated?
Were your scheduled tests, x-rays, or scans performed on time?
Do you think the hospital staff did everything they could do to help control your
pain?
Did a member of staff explain the purpose of the medicines you were to take at
home in a way you could understand?
Did a member of staff tell you about medication side effects to watch for when
you went home?
Did a member of staff tell you about any danger signals you should watch for
after you went home?
Did the doctors or nurses give your family or someone close to you all the
information they need to help you recover?
Average Non-medical Attention
Overall, did you feel that you were treated with respect and dignity while you
were in the hospital?
Overall, how would you rate the care you received?
Average overall care
0.31%
0.00%
-0.93%
4.97%
0%
0.55%
0.96%
0.00%
0.30%
2.18%
-0.28%
-0.92%
2.79%
0.00%
-0.88%
1.05%
-1.49%
0.00%
0.56%
0.00%
0.99%
0.49%
6
Table A2 Accident and Emergency 2004/5 compared to 2003
% change
2004/5 over
2003 annual
rate
In your opinion, how clean was the emergency department?
How clean were the toilets in the emergency department?
Average cleanliness
While you were in the emergency department, how much information about
your condition or treatment was given to you?
Were you given enough privacy when discussing your condition and
treatment
-0.84%
-1.20%
-1.02%
Were you given enough privacy when being examined or treated?
Sometimes in a hospital, a member of staff will say one thing and another
will say something quite different. Did this happen to you in the emergency
department?
Were you involved as much as you wanted to be in decisions about your
care and treatment?
Did a member of staff explain the results of the test in a way you could
understand?
Do you think the hospital staff did everything they could to help control
your pain
While you were in the emergency department, did you feel bothered or
threatened by other patients?
Did a member of staff explain the purpose of the medications you were to
take at home in a way you could understand?
0.76%
Did a member of staff tell you about medication side effects to watch for?
Did a member of staff tell you about any danger signals regarding your
illness or treatment to watch for after you went home?
Average Non-medical Attention
Overall, did you feel you were treated with respect and dignity while you
were in the emergency department?
Overall, how would you rate the care you received in the emergency
department?
Average overall care
1.19%
1.20%
0.00%
0.87%
-0.89%
1.44%
1.07%
-0.38%
3.10%
-0.65%
0.70%
1.15%
2.46%
1.80%
7
Table A3 Outpatients 2004/5 compared to 2003
% change
2004/5
over 2003
annual rate
In your opinion, how clean was the outpatient department?
How clean were the toilets at the outpatient department?
Average cleanliness
How long after the stated appointment time did the appointment start? (Change in
Mean Waiting Time sign reversed)
Did you have enough time to discuss your health or medical problem with the
doctor?
Did the doctor explain the reasons for any treatment or action in a way you could
understand?
Did the doctor listen to what you had to say?
If you had important questions to ask the doctor, did you get answers that you
could understand?
Did you have confidence and trust in the doctor examining and treating you?
Did the doctor seem aware of your medical history?
If you had important questions to ask him/her (non-doctor staff), did you get
answers that you could understand?
Did you have confidence and trust in him/her (non-doctor staff)?
If you had important questions to ask him/her (nurse), did you get answers that
you could understand?
Did you have confidence and trust in him/her (nurse)?
If you had important questions to ask him/her (physiotherapist), did you get
answers that you could understand?
Did you have confidence and trust in him/her (physiotherapist)?
If you had important questions to ask him/her(radiographer), did you get answers
that you could understand?
Did you have confidence and trust in him/her(radiographer)?
Did doctors and/or other staff talk in front of you as if you weren't there
In the outpatient department, how much information about your condition or
treatment was given to you?
Sometimes, a member of staff will say one thing and another will say something
quite different. Did this happen to you?
Were you involved as much as you wanted to be in decisions made about your
care and treatment?
Did a member of staff explain why you needed the test(s) in a way you could
understand?
Did a member of staff tell you how you would find out the results of your test(s)?
Did a member of staff explain the results of the test in a way you could
understand?
-2.62%
-1.92%
-2.27%
3.77%
0.78%
-0.20%
0.00%
0.00%
-0.78%
-0.76%
0.40%
0.00%
0.79%
-0.37%
0.38%
0.37%
-0.42%
-0.73%
0.00%
0.00%
-0.72%
0.00%
-0.82%
-1.73%
-1.40%
8
Before the treatment, did a member of staff explain what would happen?
Before the treatment, did a member of staff explain any risks and/or benefits in a
way you could understand?
Did a member of staff explain to you how to take the new medications?
Did a member of staff explain the purpose of the medications you were to take at
home in a way you could understand?
Did a member of staff tell you about medication side effects to watch for?
Did a member of staff tell you about what danger signals regarding your illness or
treatment to watch for after you went home?
Average Non-medical Attention
Overall, did you feel you were treated with respect and dignity while you were at
the outpatient department?
Overall, how would you rate the care you received at the outpatient department?
Average overall care
% change
2004/5
over 2003
annual rate
0.39%
0.85%
-0.74%
0.00%
1.28%
-3.16%
-0.10%
0.00%
0.00%
0.00%
9
Table A4 Primary Care 2004 compared to 2003
% change
2004 over
2003
The last time you saw a doctor from your GP surgery did you have to wait
for an appointment?
Not used
Did someone tell you how long you would have to wait?
Not used
Were you given enough time to discuss your health or medical problem
with the doctor?
0.00%
Were you involved as much as you wanted to be in decisions about your
care and treatment?
-1.80%
Were you involved as much as you wanted to be in decisions about the
best medicine for you?
4.88%
Did someone explain the results of the tests in a way you could
understand?
-3.70%
Did that person explain the reasons for any action or treatment in a way
that you could understand? (Health professional other than GP)
In the last 12 months, have you ever been put off going to your GP
surgery/health centre because the opening times are inconvenient for you?
Were you involved as much as you wanted to be in decisions about your
dental care and treatment?
Did you have confidence and trust in the dentist?
Average Non-medical Attention
0.00%
-1.16%
-4.20%
-1.16%
-0.89%
2. Data on clinical errors
The National Patient Safety Agency (NPSA) was set up in 2001 with the principal
aims to identify, understand and reduce the risk of harm to patients caused by clinical
errors. It has established a variety of means with which to pursue their goal, including
among others guidance and training to help identify possible sources of errors and
help reduce the risk of recurrence of incidents.
More importantly it started to collect directly the reporting of patient safety accidents,
through the establishment of the National Reporting and Learning System (NRLS).
Pilots were undertaking in June 2002 in 28 Trusts. And as from December 2004 all
607 NHS organisations in England and Wales have been able to report through their
system. NHS organisations covered by the system (NRLS) include all healthcare
10
settings, from primary care to acute care, from learning and disabilities to mental
health and ambulance care. The system has been developed in order that any reporting
is confidential, with the implicit incentive of increasing the number of incidents.
Health care professionals may also report any incident through their own Trust
reporting systems, which will then be passed on to the NPSA via their own local risk
management systems linked to the agency.
It is worth noting that the reporting of incidents affecting patients’ safety is not
mandatory. The Agency expects, however, a high degree of compliance due to the
confidentiality with which incidents are reported.
The NRLS collects data from nine health care areas:

Acute hospitals

Ambulance Service

Community and general dentistry

Community nursing, medical and therapy service (including community
hospitals)

Community optometrists/opticians

Community pharmacy

General Practice

Learning disabilities

Mental Health
The data set has been developed in order to allow the following information to be
collected soon after a single patient safety incident has occurred.
1. Type of incident
2. Time and place of incident
3. Characteristics of the patient(s) involved (age, sex, ethnicity)
4. Outcome for the patient
5. Staff involved in the incident and/ore making the report
6. Contributory factors
7. Factors that might have prevented harm
11
Space for free text is provided
Data is collected at specialty level, as follows:
•
Accident and Emergency
•
Anaesthetics
•
Diagnostic services
•
Medical Specialties
•
Obstetrics and gynaecology
•
Surgical Specialties
•
Other/unknown
Incident types that have been identified so far are:
•
Patient accident
•
Medication
•
Treatment, procedure
•
Infrastructure (including staffing, facilities, environment)
•
Access, admission, transfer, discharge (including missing patient)
•
Documentation (including records, identification)
•
Clinical assessment (including diagnosis, scans, tests, assessments)
•
Medical device/equipment
•
Implementation of care and ongoing monitoring/review
•
Other
Examples of contributory factors are: communication, education and training, patient
factors, work and environment.1
Only 230 NHS organisations in England and Wales have reported incidents so far.
1
All contributory factors with their respective frequencies by incident types can be found at
http://www.npsa.nhs.uk/site/media/documents/1262_PSO_AdditionalInfo.pdf
12
And as shown in the report ‘Building a memory: preventing harm, reducing risks and
improving patient safety (2005)’, the number of reports has increased rapidly over
time, especially during the last few months of 2005. Currently, the amount of ready
available data is non existent.
Further, due to the confidentiality with which health care professionals report these
incidents, data are very sensitive and access to them is not granted, as confirmed to us
by the head of the Observatory.
3. Data on cleanliness and infections
One characteristic of hospital output, which is gaining increasing importance, is that
of hospital cleanliness, and the control of infections. A number of different data
sources exist measuring these attributes. Cleanliness is given a great deal of attention
in performance assessment of Trusts and is one of the key targets in the star ratings of
NHS Trusts. The implicit assumption is that better cleanliness will lead to a reduction
in hospital acquired and other infections, though infections are also now specifically
being measured by various surveillance systems.
3.1 Cleanliness
Patient Environment Action Teams (PEAT) inspect hospitals and report on the patient
environment (cleanliness and tidiness) and food services. PEAT visitors are
volunteers comprising members drawn from Facilities and Estates staff, Hotel
Services Managers, Domestic Managers, Caterers, Infection Control Nurses, patient
and patient representative organisations along with members of the general public. 3
PEAT teams assess cleanliness across 18 criteria. The PEAT scores are used in the
star ratings system and are available from 2002/03 (Commission for Health
Improvement, 2003; Healthcare Commission, 2004). For 2003/04 PEAT teams only
visited individual hospitals which achieved a rating of 'acceptable' or less in the
2002/03 PEAT assessment. Hospitals which achieved a rating of 'good' in 2002/03
undertook a self-assessment.
13
The PEAT score at each Trust is calculated as the total number of PEAT points
achieved at a Trust divided by the total number of PEAT points available, weighted to
account for the relative size of each hospital within a Trust. A weighting process is
then applied which emphasises the areas that patients deem most important. This
includes:
•
Cleanliness in wards
•
Cleanliness in emergency departments
•
Cleanliness in other departments and waiting areas
•
Cleanliness of toilets in wards
•
Cleanliness of toilets in emergency departments
•
Cleanliness of toilets in other departments, and waiting areas
•
Cleanliness of bathrooms in wards
•
Environment in toilets in wards
•
Environment in toilets in emergency departments
•
Environment in toilets in other departments and waiting areas
•
Environment in bathrooms in wards
For the purposes of the star ratings the percentage data is then transformed into a
categorical variable with five bands. The PEAT score (1-5) achieved by each rated
site in a Trust is multiplied by the number of beds in each rated site and summed
across all rated sites and then divided by the total number of beds across the sites that
have at least 10 overnight beds to produce an indicator value between 1 and 5. Both
the overall percentage PEAT score and the transformed categorical variable are
available for each Trust.
3. 2 Infection control
A number of different surveillance systems have been in place (both voluntary and
mandatory) to monitor infections in English hospitals. These fall into three categories:
wound infection, overall measures of infection control, and MRSA (hospital acquired
infection).
In 1996, the Department of Health together with the Public Health Laboratory Service
14
(PHLS), now incorporated in the Health Protection Agency, established and launched
the Nosocomial Infection National Surveillance Scheme (NINSS). The scheme was
driven as a response to the need to identify and monitor infection in English hospitals.
The NINSS scheme ran until November 2002, on a non compulsory basis. Further, it
was a selective scheme in that each participating hospital could select one or more
categories for surgical procedures among the list of 12 designated procedures. It was
as well an intermittent scheme, in that hospitals were required to collect data for a
minimum of three consecutive months each year. Some hospitals decided to collect
the data continuously.
In October 2000 the Minister of Health promised to introduce compulsory monitoring
of hospital acquired infections (referred to as healthcare associated infections) in all
NHS Trusts in England. This monitoring would extend to certain blood stream
infections (including MRSA), as well as infections of wounds following orthopaedic
surgery. The orthopaedic SSI surveillance is compulsory as from 1st April 2004, while
the MRSA surveillance scheme has been in place since 2001. This information is used
as an indicator in the balanced scorecard as part of the Star Ratings for acute hospital
Trusts. (Department of Health, 2003b).
We describe available data under each of these three areas in more detail:
3.2.1 Wound infections
Wound infections can occur as a consequence of many surgical procedures. Bruce et
al (2001) state that “surgical wound infection, although it seldom causes mortality,
leads to morbidity and prolonged hospital stay” and, hence, to increased medical
costs. It appears as well that 50 to 70 per cent of all wound infections occur after
patients are discharged from hospital (Department of Health, 2002b). This leads to the
conclusion that these types of infections are of concern to secondary, primary and
community care providers within the NHS.
A new surveillance of surgical site infection (SSI) scheme has been introduced in
April 2004. It builds upon the previous NINSS experience. It is again a voluntary
scheme; with the difference that data collection on wound infections occurring after
15
orthopaedic procedures is mandatory for all NHS Trusts in England. Data collection
will be kept as far as possible comparable to that collected by NINSS (Health
Protection Agency, 2004).
Data on the volume of wound infections, rate and change is to our knowledge
collected as part of the above-mentioned scheme. A summary of the data collected is
reported in ‘Surveillance of Surgical Site Infections in English Hospitals 1997 – 2002’
(Health Protection Agency, 2003).
3.2.2 Infection control
Data on infection control has been collected by the Department of Health since
2002/03 and an aggregate variable is used in the star ratings of NHS Trusts. It gives
an average of the percentage scores for each of the 15 components in the controls
assurance infection control standard. The NHS Plan Implementation Programme
makes it clear that as a core requirement underpinning the targets in the Plan all
relevant NHS organisations must ensure that they have effective systems in place to
prevent and control hospital acquired infection. This average self-assessment score is
then transformed into a categorical variable for use by the Healthcare Commission in
the star ratings (Commission for Health Improvement, 2003; Healthcare Commission,
2004).
3.2.3 MRSA
Methicillin resistant staphylococcus aureus, better known as MRSA, is a blood stream
infection, which has been increasingly affecting NHS patients in England (Crowcroft
and Catchpole, 2002; Health Protection Agency and Office for National Statistics,
2004). An announcement has been recently made by the Health Secretary, John Reid,
introducing the new objective to halve MRSA bloodstream infections in hospitals by
March 2008 (Department of Health, 2004d). Further, all NHS staff (nurses, healthcare
assistants, porters and cleaners) will receive infection control training (under the
Agenda for Change framework) with the aim of reducing MRSA and any other
healthcare associated infections (Department of Health, 2004e).
16
A mandatory Department of Health Bacteraemia Surveillance Scheme for MRSA has
been in place since 2001. Data is available by NHS Trust for the first three years of its
existence (Department of Health, 2004f). These data include the number of MRSA
bacteraemia reports by Trust and the MRSA rate per 1000 bed-days by Trust from
2001/02 to 2003/04. Data are disaggregated by Trust type: General Acute, Single
Specialty and Specialist Trusts.
The total number of Staphylococcus aureus bacteraemias in England rose from 17,933
in the first year to 18,519 in the second and 19,311 in the third. The numbers of
MRSA bacteraemia were 7,250, 7,384 and 7,647 respectively (CDRweekly, 2002;
2003; 2004). This means an overall increase of MRSA reported cases of 5.5 per cent
over the three years, and an increase of 3.6 per cent in the last year of surveillance.
Some caveats are noted regarding the interpretation of the data:
•
The figures reflect the burden of serious infections associated with MRSA and
not all MRSA infection.
•
The MRSA infections reported by an acute Trust were not necessarily acquired
in that Trust (due to patient transfers between hospitals).
•
The bed occupancy figures used to derive the MRSA bacteraemia rate are from
a period before the MRSA data. This disparity will have an effect on a Trust’s
rates if there has been a significant change in activity in the Trust. This may
occur when there has been a merger of Trusts.
•
The bed occupancy figures apply only to overnight admissions. Consequently
MRSA bacteraemias in patients who are not admitted overnight may make a
Trust’s figures look falsely high, as these patients will feature in the numerator
but not in the denominator. This may apply to certain types of patients, such as
those on renal units.
4. Outpatient waiting times
Although there is extensive information at the individual patient level for those
admitted to hospital, relatively little information exists about those seen as
outpatients. The only information that exists is some aggregate (all outpatient)
17
information about referrals received, activity levels, and waiting times which is
available from the quarterly (QM08) return submitted by NHS Trusts (see
http://www.performance.doh.gov.uk/waitingtimes/index.htm). This quarterly return
gathers data by over 60 specialties on:
1.
2.
the number of written referrals received from GPs during the quarter;
the number of other referral requests received (including those from A&E
departments, a consultant in a department other than A&E, and a prosthetist);
3.
the number of GP written referrals seen during the quarter who had waited:
a) less than 4 weeks
b) between 4 and less than 13 weeks
c) between 13 and less than 17 weeks
d) between 17 and less than 21 weeks
e) between 21 and less than 26 weeks
f) 26 weeks and over
4.
the number of patients with a written referral from a GP who had not yet attended
for a first appointment as at the census date and who had been waiting:
a) between 13 and less than 17 weeks
b) between 17 and less than 21 weeks
c) between 21 and less than 26 weeks
d) 26 weeks and over.
The mean outpatient waiting time for those seen during the quarter was calculated as a
weighted average of the mid-points of each time band with the number of patients
seen employed as the weight for each band (for the over 26 weeks wait we employed
32.5 weeks as the ‘mid-point’ for this band). To convert this quarterly data into annual
waits, a weighted average of the quarterly waits was calculated with weights
reflecting the number of outpatients seen in each quarter.
5. Readmission rates
Readmission rates at NHS Trust level have been published by the Department of
Health for a number of years. Since 2001, readmission rates have been used by the
Department of Health, CHI and the Healthcare Commission, in the construction of the
star ratings for NHS Trusts. These have been calculated as the percentage of
emergency readmissions within 28 days of discharge published from 1998/99
18
onwards. Figure A1 shows the England national average readmission rate. The data
source is the Department of Health and the definition of readmissions is that used by
the Healthcare Commission in the 2005 star ratings.
The data are standardised by age, generating 3 age bands. The data are also
standardised to a base year (2001/02). This definition classifies any emergency
admission at all within 28 days of discharge as a readmission, whether or not the
diagnosis on readmission had any connection with the previous diagnosis or not.
Consequently there is no way of distinguishing in these data whether the readmission
is associated with a failure in the prior hospital spell.
Figure A1 Emergency readmission to hospital within 28 days of discharge, by
age band
14
12
10
8
Age 75+
Age 16-74
Age 0-15
Readmission rate percent
6
4
2
0
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
Source: Department of Health
Data are based on Hospital Episodes Statistics (HES). Data use CIP spells, and
therefore include transfers to other hospitals. Readmissions are attributed to the Trust
from which the patient was discharged. The matching occurs via the HES identifier
which uniquely identifies a patient and exists in the dataset from 1997/98. It is
generated by matching records for the same patient using a combination of their NHS
19
number, local patient identifier, plus the patients’ postcode, sex and date of birth.
The readmissions indicators exclude day cases, maternities and spells with a cancer
diagnosis or chemotherapy anywhere in the spell. Mental health specialties are also
excluded.
The indicator is indirectly standardised by calculating the ratio of a Trust’s observed
number of readmissions and the number of readmissions that would be expected if it
had experienced the same readmission rates as those of patients in England, given the
mix of age, sex, methods of admission, diagnoses and main procedures among its
patients. This standardised ratio is then converted into a rate by multiplying it by the
overall readmission rate of patients in England. Several algorithms have been set up to
try to deal with the difficult task of dealing with problems of double counting and
readmissions which span calendar years.
Because the data are unable to distinguish readmissions due to poor treatment (the
element we wish to capture in the index) from other admissions within 28 days that
are unrelated to the earlier condition, we do not regard the data as useable for
adjusting output indices and have limited ourselves to illustrative calculations.
6. Life expectancy
Estimates of quality-adjusted life expectancy are based on self-reported health states
while overall life expectancy estimates are compiled from mortality data. The most
recent data we have on quality-adjusted life expectancy were compiled in 1996 (table
A5). However the use of these data as a basis for our work is less than completely
satisfactory because it is well known that life expectancies have advanced
considerably since then. We proceed instead by taking up to date estimates of life
expectancy (Interim Life Tables 2000-2002 for the United Kingdom, Annual Abstract
2002, p. 71) and deducting from these the difference between total life expectancy
and quality-adjusted life expectancy shown in the 1996 data (see Dawson et al.,
2004c, section 3.1.1). Thus our assumption is one of constant morbidity though there
is no clear view on whether healthy life expectancy is rising faster of slower than life
20
expectancy as a whole- in other words on whether the expected period of poor health
is getting longer or shorter.
Both life expectancy and quality-adjusted life expectancy tables suffer from the
problem that they provide data only for specific ages in the case of the first, or age
bands in the case of the second. It is necessary to interpolate and extrapolate in order
to obtain the year-specific figures which we require.
The method of interpolation which we use is the cubic spline- a 1-line command
(SPLINE) available in MATLAB. This provides a smooth curve which passes through
all the points for which data are available. We use the spline to interpolate/extrapolate
up to the age of 95. The life table provides clearly the reference years around which
the spline interpolates, while for the data on quality-adjusted life- and thus on years of
poor health, we assume that they relate to the mid-point of the range shown. For
people over 90 the figure is assumed to relate to 94. The method is subject to
inaccuracy particularly outside the range of data and thus for the very elderly.
In table A6 we show (i) life expectancy interpolated/extrapolated from the Annual
Abstract table, (ii) the interpolated/extrapolated difference between total life
expectancy and quality-adjusted life expectancy from the 1996 table and iii) the
resulting estimate of quality-adjusted life expectancy calculated by subtraction. In the
columns marked (i) for males and females the life expectancies at five-year intervals
(0, 5 etc) match those of the table from the Annual Abstract. In the column marked ii)
those for the mid-point years match those in the relevant columns of table A5.
Until further results on quality-adjusted life expectancy become available, only the
figures in the columns (i) for each sex can be updated. Since these calculations were
done we became aware that the Government Actuary’s web site provides life
expectancy figures for each year, avoiding the need for interpolation of these data.
21
Table A5 Life Expectancy and Quality-adjusted Life Expectancy: 1996
Age
<1
1 to 4
5 to 9
10 to 14
16 to 19
20 to 24
25 to 29
30 to 34
35 to 39
40 to 44
45 to 49
50 to 54
55 to 59
60 to 64
65 to 69
70 to 74
75 to 79
80 to 84
85 to 89
90 plus
Period life
expectancy
A
74.76
73.29
68.37
63.41
58.50
53.72
48.95
44.18
39.41
34.70
30.08
25.59
21.33
17.35
13.75
10.64
8.05
5.91
4.28
2.56
MALES
qualityQuality
Life
expectancy
marginal adjusted Adjusted
less QA Life
period life marginal
Life
expectancy
Lex
Expectancy expectancy
B
A-B
4.78
4.77
4.77
4.77
4.71
4.62
4.49
4.26
3.98
3.60
3.11
2.59
2.14
1.63
1.72
2.56
4.37
4.37
4.38
4.36
4.24
4.11
3.84
3.61
3.24
2.83
2.49
2.06
1.62
1.22
1.32
2.00
50.04
45.68
41.31
36.93
32.57
28.33
24.22
20.38
16.77
13.53
10.70
8.21
6.15
4.53
3.32
2.00
8.46
8.04
7.64
7.25
6.84
6.37
5.86
5.21
4.56
3.82
3.05
2.43
1.90
1.38
0.96
0.56
Period life
expectancy
C
79.76
78.21
73.27
68.30
63.36
58.46
53.55
48.65
43.79
38.99
34.27
29.66
25.20
20.94
16.93
13.32
10.14
7.40
5.24
2.95
FEMALES
Life
qualityQuality
expectancy
marginal adjusted Adjusted
period life marginal
Life
less QA Life
expectancy
Lex
Expectancy expectancy
D
C-D
4.90
4.91
4.90
4.86
4.80
4.72
4.61
4.46
4.26
4.01
3.61
3.18
2.74
2.16
2.29
2.95
4.44
4.40
4.43
4.34
4.21
4.08
3.92
3.72
3.33
3.13
2.83
2.34
1.97
1.52
1.61
1.81
52.09
47.65
43.25
38.82
34.48
30.27
26.19
22.27
18.55
15.22
12.09
9.26
6.92
4.95
3.43
1.81
11.27
10.81
10.30
9.83
9.31
8.72
8.08
7.39
6.65
5.72
4.84
4.06
3.22
2.45
1.81
1.14
22
Table A6 Calculation of Quality-adjusted Life Expectancy base on the 2000-2002
Life Tables
MALES
Life
FEMALES
QA Life
Life
QA Life
Expectancy
Quality Adjustment
Expectancy
Expectancy
Quality Adjustment
Expectancy
i
ii
iii
i
ii
iii
0
75.70
11.66
64.04
80.40
11.95
68.45
1
74.84
11.38
63.47
79.57
11.99
67.57
2
73.96
11.11
62.85
78.70
12.03
66.67
3
73.06
10.86
62.21
77.79
12.05
65.75
4
72.14
10.61
61.53
76.86
12.05
64.81
5
71.20
10.39
60.81
75.90
12.05
63.85
6
70.24
10.17
60.07
74.92
12.03
62.89
7
69.27
9.97
59.30
73.93
12.01
61.92
8
68.29
9.78
58.51
72.92
11.97
60.95
9
67.30
9.61
57.69
71.91
11.93
59.98
10
66.30
9.44
56.86
70.90
11.87
59.03
11
65.30
9.28
56.01
69.89
11.81
58.08
12
64.29
9.13
55.16
68.89
11.74
57.14
13
63.29
8.99
54.29
67.89
11.67
56.22
14
62.29
8.86
53.43
66.89
11.59
55.30
15
61.30
8.74
52.56
65.90
11.50
54.40
16
60.33
8.62
51.70
64.91
11.41
53.50
17
59.36
8.51
50.85
63.93
11.32
52.62
18
58.41
8.41
50.00
62.95
11.22
51.73
19
57.45
8.31
49.14
61.98
11.12
50.86
20
56.50
8.22
48.28
61.00
11.02
49.98
21
55.55
8.13
47.42
60.02
10.91
49.11
22
54.59
8.04
46.55
59.04
10.81
48.23
23
53.63
7.96
45.67
58.06
10.71
47.36
24
52.66
7.88
44.79
57.08
10.60
46.48
25
51.70
7.80
43.90
56.10
10.50
45.60
26
50.73
7.72
43.02
55.12
10.40
44.72
27
49.77
7.64
42.13
54.14
10.30
43.84
28
48.81
7.56
41.24
53.16
10.20
42.96
29
47.85
7.48
40.37
52.18
10.11
42.07
30
46.90
7.41
39.49
51.20
10.02
41.18
31
45.96
7.33
38.63
50.22
9.93
40.29
32
45.02
7.25
37.77
49.23
9.83
39.40
33
44.09
7.17
36.91
48.25
9.73
38.52
Age
23
34
43.15
7.09
36.06
47.27
9.63
37.64
35
42.20
7.01
35.19
46.30
9.53
36.77
36
41.24
6.93
34.32
45.34
9.42
35.92
37
40.28
6.84
33.44
44.38
9.31
35.07
38
39.31
6.75
32.56
43.42
9.20
34.22
39
38.35
6.66
31.70
42.46
9.08
33.38
40
37.40
6.56
30.84
41.50
8.96
32.54
41
36.46
6.47
30.00
40.53
8.84
31.69
42
35.53
6.37
29.16
39.57
8.72
30.85
43
34.62
6.27
28.34
38.60
8.60
30.01
44
33.71
6.18
27.53
37.65
8.47
29.18
45
32.80
6.08
26.72
36.70
8.34
28.36
46
31.90
5.97
25.92
35.77
8.21
27.56
47
31.00
5.86
25.14
34.85
8.08
26.77
48
30.10
5.74
24.36
33.93
7.94
25.99
49
29.20
5.61
23.59
33.02
7.81
25.21
50
28.30
5.48
22.82
32.10
7.67
24.43
51
27.40
5.34
22.06
31.17
7.53
23.65
52
26.51
5.21
21.30
30.25
7.39
22.86
53
25.63
5.08
20.55
29.32
7.25
22.07
54
24.76
4.95
19.81
28.40
7.11
21.29
55
23.90
4.82
19.08
27.50
6.97
20.53
56
23.06
4.69
18.36
26.62
6.81
19.80
57
22.23
4.56
17.67
25.75
6.65
19.10
58
21.41
4.42
16.99
24.89
6.48
18.42
59
20.61
4.28
16.33
24.05
6.29
17.75
60
19.80
4.13
15.67
23.20
6.10
17.10
61
19.00
3.98
15.02
22.35
5.91
16.44
62
18.20
3.82
14.38
21.51
5.72
15.79
63
17.42
3.66
13.75
20.66
5.53
15.13
64
16.65
3.50
13.14
19.83
5.35
14.48
65
15.90
3.35
12.55
19.00
5.18
13.82
66
15.18
3.20
11.98
18.19
5.00
13.18
67
14.48
3.05
11.43
17.39
4.84
12.55
68
13.81
2.91
10.89
16.60
4.68
11.92
69
13.15
2.78
10.36
15.84
4.53
11.31
70
12.50
2.66
9.84
15.10
4.37
10.73
71
11.86
2.54
9.32
14.38
4.22
10.16
72
11.24
2.43
8.81
13.68
4.06
9.62
73
10.64
2.32
8.32
13.01
3.90
9.11
24
74
10.06
2.21
7.84
12.35
3.73
8.62
75
9.50
2.11
7.39
11.70
3.56
8.14
76
8.97
2.01
6.97
11.07
3.39
7.68
77
8.48
1.90
6.58
10.45
3.22
7.23
78
8.00
1.79
6.21
9.85
3.06
6.79
79
7.54
1.69
5.86
9.26
2.90
6.37
80
7.10
1.58
5.52
8.70
2.74
5.96
81
6.67
1.48
5.19
8.16
2.59
5.56
82
6.25
1.38
4.87
7.63
2.45
5.18
83
5.85
1.29
4.56
7.13
2.31
4.82
84
5.46
1.20
4.27
6.65
2.18
4.48
85
5.10
1.11
3.99
6.20
2.05
4.15
86
4.76
1.03
3.73
5.77
1.93
3.84
87
4.45
0.96
3.49
5.36
1.81
3.55
88
4.17
0.89
3.28
4.98
1.70
3.28
89
3.92
0.82
3.09
4.63
1.59
3.04
90
3.70
0.76
2.94
4.30
1.49
2.81
91
3.52
0.71
2.82
4.00
1.39
2.60
92
3.39
0.65
2.73
3.73
1.30
2.42
93
3.29
0.61
2.69
3.48
1.22
2.26
94
3.24
0.56
2.68
3.27
1.14
2.13
95
3.24
0.52
2.73
3.08
1.07
2.01
25
B. Hospital Episode Statistics (HES)
Hospital Episode Statistics (HES) provide information on admitted patient care
delivered by NHS hospitals in England from 1989 to the present time. In addition, the
HES system also captures admitted patient care funded by the NHS but provided by
the private sector. HES covers all medical and surgical specialities and includes
private patients treated in NHS hospitals.
HES comprises over 12 million records each year. Each time a person is admitted to
hospital certain details are collected – such as the patient's age, gender and date of
admission – with further information – such as details of any diagnosis made and
surgical procedures undertaken – added to the record in due course. This information
is stored as a large collection of separate records, one for each period of care under the
same consultant, in a secure database. Records are stored according to the financial
year (actually 1st April to 31st March) in which the period of care finished. For a
small proportion of stays in hospital, there is more than one record. This happens
when a patient is transferred from the care of one consultant to another. Each period
of care under the same consultant is known as a ‘consultant episode’.
HES does not provide details of drugs used in hospitals, or information on outpatients.
There are no records on patients treated in A&E departments who are discharged
immediately. Outpatient activity also includes some day surgery which is not included
in HES. However, from 2003/04, outpatient attendances and A&E activity are
collected as part of HES, though it is not currently available yet. HES will therefore in
future cover most secondary care.
1. HES regular attender episodes
Since 1999/00 HES has gathered information on patients who regularly attend
hospital as part of an on-going programme of treatment. Regular day and night
attenders are patients who have regular appointments in hospital for treatment that
may take longer than a day to complete. Previously, this activity was not included in
HES. In 2003/04, there were about 690,000 of these regular attender episodes and
26
these were spread across 39,000 patients. The vast majority of these episodes (over
490,000) were to address conditions associated with renal failure (renal dialysis).
Most of the other episodes were for cancer treatment (e.g., for chemotherapy). Only
10% of these episodes involved an overnight stay in hospital.
It is mandatory for trusts to submit data on regular attenders to HES from April 2003.
However, there may still be low coverage. In previous years the default in HES was
not to supply the regular attenders, but in 2003/04 the default changed to include
regular attenders. Since regular attenders do not appear in HES in previous years, they
have been excluded from 2003/04 to make the annual HES data comparable. Much of
the activity, for instance renal dialysis and chemotherapy, is picked up elsewhere in
the Reference Cost activity which is included in the indices.
Table B1 reports the annual number of regular attender episodes since 1999/00. The
figures for 1999/00 – 2001/02 inclusive should be treated with great caution due to
data reliability issues. The figure for 2002/03 is considered more reliable than those
for previous years but the figure for 2003/04 is considered to be the most accurate.
Data reliability issues for years prior to 2003/04 may hinder the calculation of
accurate growth rates.
Table B1 Regular attender episodes in HES
HES year
Number regular
attender episodes
1999/00
432,024
2000/01
513,434
2001/02
519,177
2002/03
613,389
2003/04
689,713
2. HES: duplicate records
HES is constructed from records submitted by Trusts on a quarterly basis to a central
national clearing-house. Given the number of records involved it is perhaps inevitable
that errors will occur. One problem is that occasionally Trusts will submit duplicate
27
records (i.e. that the same record (episode) will be submitted more than once). The
following method was employed to eliminate these duplicate records from HES
(Lakhani et al., 2005). All episodes were sorted by the HESID, EPISTART,
EPIORDER and EPIEND fields. If two or more consecutive episodes contained
identical values in the fields listed in Table B2 below, then the record with a valid
value in the ELECDUR field was retained. If no record or more than one record had a
valid ELECDUR, then the record with the most non-empty fields was selected. If this
failed to select only a single copy of the duplicated episode then the first occurrence
of the duplicated episode was retained.
Table B2 Fields used to identify duplicate HES records
ADMIDATE
ADMIMETH
ADMISORC
CLASSPAT
DIAG1-14
DISDATE
DISDEST
DISMETH
EPIEND
EPIORDER
EPISTART
EPISTAT
EPITYPE
HESID
MAINSPEF
OP_DTE_1-12
OPER_1-12
RESHA
RESLADST
STARTAGE
TRETSPEF
The number of duplicate records dropped from each year’s HES can be found in
Table B3.
Table B3 Duplicate records (episodes) dropped from HES, 1997/98 - 2003/04
HES data year
Number of duplicate records dropped
1997/98
139,807
1998/99
32,103
1999/00
30,960
2000/01
18,447
2001/02
45,845
2002/03
58,258
2003/04
24,128
3. HES: episodes and spells
A HES record is generated for each episode of admitted patient care under a particular
consultant within a single hospital provider; admitted patient care includes day
28
surgery. The unit of analysis employed by HES is the finished consultant episode
(FCE). Over 90% of all episodes involve a patient remaining under the care of the
same consultant for the duration of their stay in hospital. In the other cases, however,
the patient is discharged from the care of one consultant, but remains in hospital, and
moves to the care of another consultant. This move, from the care of one consultant to
another, terminates one HES episode and triggers the start of another. By grouping
together all those episodes associated with a stay at a given provider (hospital) one
identifies continuous inpatient provider spells of care.
A patient who is admitted to one hospital and then transferred to another - perhaps
because the first hospital is not equipped to provide highly specialised care which the
patient's condition demands - would also generate two HES episodes. Rather than use
episodes or provider spells as the unit of analysis, we employ a third unit of analysis:
continuous inpatient spells of NHS care. These are defined as continuous periods of
patient care anywhere within the NHS. An NHS spell might comprise multiple
episodes, including the transfer from one hospital to another as well as the move from
one consultant to another within a given hospital. There will therefore be more than
one record in the HES database for such patients. The records within the HES
database are not currently linked, but admission details and patient identifier can be
used to link continuous periods of treatment.
4. Identifying continuous inpatient (CIP) spells of NHS care
To identify the episodes associated with each spell of continuous inpatient care, all
episodes were sorted by the HESID, EPISTART, EPIORDER and EPIEND fields in
that order (Lakhani et al., 2005). This grouped together all episodes involving the
same patient (assuming that episodes with the same HESID relate to the same
patient). EPISTART was used to place the episodes in chronological order with
EPIORDER employed to place two episodes that start on the same day in the correct
sequence. Finally, EPIEND puts those spells that involve a transfer from one provider
to another into the correct chronological sequence.
The ordered episodes are linked to form continuous inpatient spells providing that
29
HESID (the patient identifier) remains the same and there is neither a discharge
episode (patient leaves hospital, dead or alive (DISMETH<7)) nor a transfer episode.
If HESID changes then the record with the new HESID is the beginning of a new
spell. If HESID remains unchanged but there is a discharge episode that is not a
transfer to another provider then the next episode also relates to a new spell. Any one
patient might record more than one spell in any given year.
Following Lakhani et al. (2005), an episode is defined as ending with a transfer if:
a) the discharge destination (DISDEST) is 51, 52 or 53; OR the admission source
(ADMISORC) of the following episode is 51, 52 or 53; OR the admission method
(ADMIMETH) of the following episode is 81
AND
b) the HESID field of the following episode is the same as that of the current episode
AND
c) the episode order (EPIORDER) of the following episode is 1
AND
d) the admission date of the following episode is valid (i.e. a positive number of days
since 31 March 1990)
AND
e) the discharge date of the current episode is valid (i.e. a positive number of days
since 31 March 1990)
AND
f) the admission date of the following episode minus the discharge date of the
current episode’s discharge date is between zero and two (inclusive).
Table B4 reports the number of episodes, provider spells and NHS spells available for
analysis.
30
Table B4 Number of episodes, provider spells and NHS spells by HES data year
HES
data
FCE Episodes (millions)
Provider spells (millions)
CIP spells (millions)
1997/98
11,404
10,893
10,799
1998/99
12,001
11,350
11,211
1999/00
12,203
11,285
11,106
2000/01
12,273
11,257
11,075
2001/02
12,316
11,183
10,994
2002/03
12,719
11,739
11,542
2003/04
13,332
year
12,148
5. HRG assignment
HRGs are assigned at episode level and having linked episodes into spells (CIPS), a
spell may have more than one HRG within it. We assign HRGs to each CIP spell on
the basis of the first HRG within the CIPS. An alternative HRG assignment
methodology has been developed for Payment by Results which looks for the
'dominant' HRG across episodes, but this is based on a provider spell (PS). HRG v3.1
was in use until 2002/03 and was then replaced by HRG v3.5 in 2003/04. Several
categories of HRG are affected by this change. Reference Cost data is collected using
HRG v3.5 in 2003/04. However, raw HES data is still generated in v3.1 in 2003/04. In
order for activity data to be comparable to 2002/03, HES data was not recoded into
v3.5 for 2003/04. Since we calculate base weighted Laspeyres indices lagged costs are
used and we do not require unit costs from the Reference Cost data in 2003/04 under
v3.5. Hence all our analysis could be undertaken using v3.1. If the output indices are
however to be updated for 2004/05, the HES data will need to be recoded to v3.5, if it
is not already generated in that format.
6. Unit costs of spells
If CIPS or provider spells are to be used as the output measure, the unit cost weights
from the Reference Costs are inappropriate as they relate to FCEs.
31
To produce a cost weighted output index which is quality adjusted for survival and
waiting times it is necessary to assign costs, survival, and waiting times to each type
of output, whether measured as numbers of CIPS of each type or numbers of FCEs of
each type. We can assign waiting times to CIPS using the waiting time for the first
FCE of the spell and can assign mortality by using the mortality indicator of the last
FCE in the spell. But it is more complicated to compute a cost for CIPS.
The source for hospital unit costs is the Reference Cost database which has the costs
for FCEs. FCEs which are clinically similar and have similar resource use are grouped
together as a Health Resource Groups (HRG). The Reference Cost database has unit
costs for FCEs in the different HRGs. Thus it is straightforward to assign costs to
HES outputs if these are measured as numbers of FCEs in each type of HRG.
Matters are more complicated if one wishes to have a unit cost for CIPS, 8% of which
contain more than one FCE. There are a number of ways of calculating a cost for each
spell and for labelling spell types.
(a) Define the spell type by the set of FCEs it contains and to simplify ignore the order
of FCEs in the spell. The unit cost a spell type is the sum of the unit costs of the
HRGs of its constituent FCEs. The advantage of this approach is that output types are
homogenous: each contains spells with the same set of FCEs. The disadvantage is that
the number of different types of spell is potentially very large. Even if spells consist
of at most two FCEs there are ½ n(n+1) types of spell if there are n types of FCE.
There are over 500 elective HRGs and the same number of non-electives. Restricting
attention to 2002/3 spells whose first episode was elective we found over 17,000
different types of spell. Thus calculation of the indices is cumbersome because of the
great increase in the number of outputs. The procedure satisfies adding up: total costs
calculated as numbers of spells of each type times their unit costs will equal total cost
calculated as numbers of FCEs of each type times their Reference Cost unit costs.
Thus a pure cost weighted activity index with activity measured by spells defined in
this way would be very nearly equal to the pure CWAI with activity based on FCEs.2
2
There would be a slight difference because some spells which have a first FCE finishing in year t and
a second FCE finishing in year t+1 would be assigned to year t+1, whereas the constituent FCEs would
be assigned to different years with a FCE based index.
32
Similarly average waits and mortality also satisfy adding up.
(b) Assign each spell to a type by using the HRG of its first episode and use the cost
of the HRG of the first FCE in the spell as the unit cost of the spell. Thus the spells in
an HRG category may contain disparate types of spell (defined by the set of HRGs of
their constituent FCEs) but all will have the same first FCE type). This will
underestimate the cost of multi-FCE spells. Multiplying the number of spells by the
unit costs of their first FCEs will yield a total cost which is less than actual total cost
(which is the number of FCEs of different types times their unit costs). Thus the
average cost for each output type is not the “true” unit cost i.e. the total cost of all
spells assigned to the type divided by the number of spells.
(c) Define the spell type by the HRG type of the first episode. Calculate the unit cost
of the HRG as the total cost of all spells assigned to it divided by the number of spells
assigned. The cost of a spell is calculated as the sum of the unit costs of the FCEs it
contains. Like (a) this satisfies adding up for costs, waits and deaths. It requires more
HES processing than (b) but less than (a) since the number of output types would be
smaller. Since it produces the same number of output types as (b) the subsequent
calculation of the indices is simpler than (a).
Assume that all CIPS have at most two FCEs. Let zj0t be the number of single FCE
spells of HRG type j in year t and zjkt the number of two FCE spells where the first
FCE has HRG type j and the second has HRG type k.
With method (c) the total number of CIPS assigned to HRG type j by the HRG of the
first FCE in the CIPS is:
x Sc
jt = z j 0 t + ∑ k ≠ j z jkt
(1)
The current period t unit cost of this activity type in period t using method (c) is:
c
Sc
jt
=
z j 0t c Fjt + ∑ k ≠ j z jkt ( c Fjt + cktF )
z j 0 t + ∑ k ≠ j z jkt
=
z j 0t c Fjt + ∑ k ≠ j z jkt ( c Fjt + cktF )
x Sc
jt
(2)
where c Fjt is the unit cost of an FCE of HRG type j in year t
The total cost of CIPS by method (c) is:
33
CtSc = ∑ j x Sjt c Sjt = ∑ j z j 0 t c Fjt + ∑ j ∑ k ≠ j z jkt c Fjt + ∑ j ∑ k ≠ j z jkt cktF
(3)
which equals the total cost actually incurred in year t.
In order to calculate a unit cost for period t+1 CIPS activity based on period t actual
FCE costs to be applied to period t+1 CIPS activity we calculate:
c
St +1c
jt
=
z j 0t +1c Fjt + ∑ k ≠ j z jkt +1 (c Fjt + cktF )
z j 0t +1 + ∑ k ≠ j z jkt +1
=
z j 0t +1c Fjt + ∑ k ≠ j z jkt +1 (c Fjt + cktF )
x Sc
jt +1
(4)
F
Note that cˆ Sjtt+1c ≠ c Sc
jt since although the raw unit costs of FCEs c jt are the same in (4)
and in (2), (4) use the numbers of FCEs of different types and the total number of
spells for period t +1 and (2) uses those from period t.
The CWAI based on CIPS (method (c)) is then:
I
x Sc
c Sc t
∑x
=
∑x
St +1c
Sc
jt +1 jt
j
j
cˆ
Sc Sc
jt jt
c
=
∑
j
⎡ z j 0t +1c Fjt + ∑ z jkt +1 (c Fjt + cktF ) ⎤
k≠ j
⎣
⎦
Sc Sc
x
c
∑ j jt jt
(5)
which is identical to a CWAI based on FCEs. But notice that when we apply quality
adjustment factors the equality will disappear since in general (a) the factors are
applied to different distributions of HRGs (based on spells versus FCEs) and (b) for
some of the types of adjustment the factors will differ for HRGs based on FCEs and
those based on spells (e.g. the simple survival adjustment is the ratio of survival rates
and will differ for a given HRG label j because survival rates for the spell are based
on the survival rate for the last FCEs assigned to the spell and those for the FCE based
HRG on the survival rate of that FCE).
Using CIPS with version (c) would give us a unit of output which was not
homogenous in that different cases in each output group (by HRG of first FCE) may
have different combinations of second, third etc. FCEs. But any output categorisation
will have heterogeneous cases since different individuals wait different lengths of
time for the same type of output. Using (c) involves averaging over waiting times,
costs and mortalities but it does give an accurate answer to a meaningful question:
what is the average cost, wait or mortality of a person whose first FCE was of this
type. We therefore base our calculation of indices with CIPS activity measures on
method (c).
34
7. Quarterly output measures
We were asked to examine the construction of quarterly output indices to be used for
within year estimates of productivity growth. Most of the required activity measures
are available on a quarterly basis, or will shortly become so3. HES produces quarterly
activity counts but there are difficulties in using these data for in year growth
estimates. The quarterly data is not cleaned anywhere near as thoroughly as the annual
data and the data from the later months in a quarter are likely to be incomplete. The
quarterly releases are cumulative so that for example the third quarter release has data
based on all records in the year to date and so overwrites second and first quarter
releases. This would increase the accuracy of estimates made with a one quarter lag. If
spell based activity indices are required the lagged unit costs to be applied to current
period activities would have to be recalculated each quarter, though we suspect that
using the unit costs from the previous period would make little difference. The
quarterly HES data do not contain the date of death field provided by ONS so that it
would not be possible to calculate 30 day death adjustments with quarterly data.
8. Identifying spells that end in death
To calculate death (or survival) rates we need to be able to identify which continuous
inpatient spells of NHS care end in death. Fortunately, there is a field in HES
(DISMETH) which defines the circumstances under which the patient left hospital.
DISMETH takes a value of 4 (patient died) or a value of 5 (baby was stillborn) if the
patient is not alive when discharged. However, the cause of death is not recorded and
may be unrelated to the patient's diagnosis or treatment. Table B5 (column (a)) reports
the number of ‘deaths on discharge’ for spells of NHS care by HES data year. These
vary between 250,000 and 267,000 per annum.
One shortcoming of the HES data is that it cannot identify how long a patient survives
after leaving hospital. Using information gathered from the registration of deaths, the
3
General practitioner consultations have previously been estimated from the General Household
Survey which is conducted annually but these estimates are now likely to be replaced by those derived
from downloads of a large database of practice record systems which could be used to produce
quarterly data.
35
Office for National Statistics (ONS) has attempted to link its date of death data to
HES records by matching various fields including those recording a patient’s sex, date
of birth and postcode. The outcome of this matching process is a file, for each HES
data year, that contains the HESID field and the date of death for all deaths that occur
between the 1st of April of the data year and the 30th of April of the following year.
In principle, this data set can be merged with HES to identify those spells that end
with death within 30 days of discharge from hospital.
Unfortunately, the linkage of the ONS deaths data to HES is not without its
difficulties and work is ongoing to improve this matching process. These difficulties
can generate some surprising outcomes including the result that:
a) it is not uncommon for the ONS date of death to be before the HES date of
discharge from hospital; and
b) the ONS deaths data records no death while HES indicates that the patient was
dead on discharge.
One consequence of these linkage difficulties is that the ONS deaths data tends to
under-estimate the number of deaths. Typically, the number of deaths on discharge
(according to HES) exceeds the number of deaths on the day of discharge (from ONS)
by about 20,000 (compare columns (a) and (b) of Table B5).
The number of deaths within 30 days of discharge from hospital is presented in
column (c) of Table B5. This is based solely on the ONS deaths data. An alternative
‘deaths within 30 days of discharge’ measure can be constructed by combining the
deaths data from HES (for those dead on discharge) and adding those deaths that
occur within 30 days of discharge from hospital (from ONS data). The number of
deaths captured by this combined HES/ONS deaths within 30 days of discharge
measure can be found in column (d) of Table B5. Our empirical work on death rates
employs both this combined HES/ONS 30 day death rate as well as the dead on
discharge (HES) death rate.
36
Table B5 Number of deaths from HES and ONS (thousands)
HES data year
HES deaths
ONS deaths
HES/ONS
on discharge
on discharge day*
within 30 days*
30 days discharge*
(a)
(b)
(c)
(d)
1997/98
250
n/a
n/a
n/a
1998/99
268
238
306
335
1999/00
264
239
305
329
2000/01
254
231
294
315
2001/02
259
238
300
319
2002/03
263
244
304
322
2003/4
270
245
305
327
* This category also includes those deaths where the ONS date of death is one day before the date of
discharge.
9. Mortality summary statistics
This appendix presents some summary data on mortality and survival rates. Figure B1
is a histogram of in-hospital CIPS based survival rates for 548 elective and daycase
HRGs in 2002/03. The mean CIPS based survival rate is 97.8% (98.4% FCE based).
A few HRGs have zero or very low survival rates. They are listed in Table B6. Most
have extremely low levels of activity.
37
0
10
Density
20
30
40
Figure B1 In-hospital CIPS based survival rates, 2002/03, electives and day cases
0
.2
.4
.6
In-hospital survival rates 2002/03
.8
1
Table B6 HRGs with the lowest in-hospital survival rates, 2002/03, electives
HRG
HRG Label
Number of
Number of
spells
deaths
D09
Pulmonary Embolus - Died
20
20
D15
Bronchopneumonia
185
118
E28
Cardiac Arrest
49
22
A32
Head Injury without Significant Brain Injury with complications
3
1
D29
Inhalation Lung Injury or Foreign Body Aspiration
71
20
Summary data for 555 non-elective HRGs are shown in Figure B2. The mean CIPS
based survival rate is 93% (94.9% FCE based) which is lower than for electives.
HRGs with zero or very low survival rates are listed in Table B7.
38
0
5
Density
10
15
20
Figure B2 In-hospital CIPS based survival rates, 2002/03, non-electives
0
.2
.4
.6
In-hospital survival rates 2002/03
.8
1
Table B7 HRGs with the lowest in-hospital survival rates, 2002/03, non-electives
HRG
HRG Label
D09
N01
E28
J13
D15
Q01
Pulmonary Embolus - Died
Neonates - Died <2 days old
Cardiac Arrest
Major Burn >29% TBSA without Significant Graft Procedure >49
Bronchopneumonia
Emergency Aortic Surgery
Number
of spells
733
3127
2313
43
9789
116
Number
of deaths
733
3127
1610
28
5065
22
Figures B3 and B4 present the same information but for 30-day survival. The patterns
are very similar to those for in-patient survival. The unweighted mean CIPS based
survival rate for electives and day cases is 97.3%. (97.9% FCE based). The
unweighted mean CIPS based survival rate for non-electives is 91.7% (94.0% FCE
based).
39
0
10
Density
20
30
40
Figure B3 Thirty day CIPS based survival rates, 2002/03, electives
0
.2
.4
.6
30 day surv ival rates 2002/03
.8
1
0
5
Density
10
15
20
Figure B4 Thirty day CIPS based survival rates, 2002/03, non-electives
0
.2
.4
.6
30 day surv ival rates 2002/03
.8
1
Figures B5 and B6 are scatter plots of the Winsorised growth rates in CIPS based 30
40
day and in-hospital survival rates between 2001/2 and 2002/3 for electives and nonelectives. The correlations between the two survival growth rates is high: 0.984 for
electives and day cases and 0.944 for non-electives.
-.02
Winsorised growth in in-hospital survival rates
-.01
0
.01
.02
Figure B5 Winsorised growth in in-hospital and 30 day CIPS based survival
rates, 2001/02-2002/03, electives
-.02
-.01
0
.01
Winsorised growth in 30 day survival rates
.02
41
-.02
Winsorised growth in in-hospital survival rates
-.01
0
.01
.02
Figure B6 Winsorised growth in in-hospital and 30 day CIPS based survival
rates, 2001/02-2002/03, non-electives
-.02
-.01
0
.01
Winsorised growth in 30 day survival rates
.02
The following table shows descriptive statistics for age and gender compositions of
elective and daycase HRGs. Mean age for both in-hospital and 30 day deaths appears
to have risen marginally. The proportion of males in both in-hospital and 30 day
deaths appears to be constant.
The table also highlights the proportion of electives which are day cases.
42
Table B8 Summary statistics for age, gender, daycase rates for HRGs, electives
Variable
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent daycases of elective admissions
Percent daycases of elective admissions
Percent daycases of elective admissions
Percent daycases of elective admissions
Percent daycases of elective admissions
Percent daycases of elective admissions
Percent daycases of elective admissions
Year n HRG
1997/98 467
1998/99 454
1999/00 460
2000/01 453
2001/02 449
2002/03 446
2003/04 440
1998/99 491
1999/00 501
2000/01 499
2001/02 492
2002/03 503
2003/04 493
1997/98 566
1998/99 566
1999/00 567
2000/01 565
2001/02 563
2002/03 564
2003/04 567
1997/98 468
1998/99 455
1999/00 461
2000/01 454
2001/02 450
2002/03 447
2003/04 441
1998/99 481
1999/00 495
2000/01 487
2001/02 483
2002/03 486
2003/04 484
1997/98 567
1998/99 567
1999/00 568
2000/01 566
2001/02 564
2002/03 565
2003/04 568
1997/98 567
1998/99 567
1999/00 568
2000/01 566
2001/02 564
2002/03 565
2003/04 568
mean
65.623
65.320
65.616
65.607
66.566
66.818
65.430
64.780
64.673
64.574
65.249
65.290
64.120
50.934
50.366
50.293
50.475
50.408
50.792
50.571
0.522
0.533
0.511
0.534
0.527
0.536
0.511
0.672
0.650
0.671
0.664
0.665
0.654
0.492
0.496
0.501
0.498
0.500
0.500
0.501
0.326
0.339
0.362
0.366
0.371
0.374
0.374
sd
17.604
17.424
16.986
17.556
16.979
17.218
18.002
17.189
17.179
17.642
16.774
16.861
18.335
18.693
18.477
18.291
18.323
18.443
18.146
18.465
0.286
0.293
0.288
0.284
0.304
0.288
0.300
0.271
0.284
0.272
0.282
0.286
0.293
0.200
0.200
0.203
0.198
0.199
0.197
0.203
0.278
0.288
0.295
0.294
0.295
0.292
0.294
min
0.003
0.003
0.003
0.003
0.003
0.003
0.056
0.003
0.003
0.003
0.003
0.026
0.050
0.003
0.092
0.010
0.003
0.175
0.170
0.089
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
max
102
98
98
96
97
92
89
92
89
96
97
93
93
97
82
82
82
82
83
82
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0.981
1.000
0.984
1.000
0.983
0.984
43
The following table illustrates the age and gender composition for all non-elective
HRGs over time. Mean age for both in-hospital and 30 day deaths appears to have
increased marginally, whilst the pattern for gender is less clear.
Table B9 Summary statistics for age, gender, deaths for HRGs, non-electives
Variable
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of in-hospital deaths
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of deaths within 30 days
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Mean age of all cases
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male in-hospital deaths
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male deaths within 30 days
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Percent male of all cases
Year
1997/98
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1997/98
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1997/98
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
1997/98
1998/99
1999/00
2000/01
2001/02
2002/03
2003/04
n
528
530
535
530
524
533
528
546
540
543
532
540
534
571
571
571
571
571
571
571
529
531
536
531
525
534
529
539
537
542
531
539
534
572
572
572
572
572
572
572
mean
64.647
64.280
64.189
65.020
65.116
65.086
65.151
63.891
64.206
64.444
64.438
64.666
64.566
51.139
50.567
50.388
50.288
50.373
50.555
50.537
0.495
0.509
0.507
0.502
0.500
0.510
0.500
0.584
0.576
0.578
0.571
0.573
0.562
0.506
0.510
0.507
0.508
0.510
0.513
0.514
sd
19.611
19.909
20.137
19.411
19.678
19.601
19.257
19.888
19.974
19.587
20.013
19.882
19.554
20.119
19.916
19.814
19.684
19.715
19.605
19.564
0.216
0.225
0.219
0.229
0.217
0.227
0.214
0.229
0.219
0.232
0.217
0.227
0.219
0.191
0.194
0.197
0.198
0.196
0.195
0.198
min
0.003
0.003
0.004
0.004
0.014
0.003
0.017
0.005
0.004
0.004
0.018
0.003
0.003
0.253
0.005
0.004
0.004
0.010
0.004
0.011
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
max
94
88
87
90
96
95
94
86
87
90
94
95
91
84
84
84
84
84
84
84
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
44
10. Missing Observations: Maintaining Sample Size
In order to maintain as full a sample as possible, all missing observations have been
replaced with a representative figure, with the exception of those observations lacking
the individual identifier, EPIKEY, or method of admission, ADMIMETH. All missing
values for deaths and waits are replaced with mean for their HRG for electives and
non-electives separately, provided a value for that specific case exists, or else the
population mean is applied. Missing values for cost by HRG are replaced by the
population mean for electives and non-electives. For the age variable, missing
STARTAGE values are replaced with the gender mean for the HRG for electives and
non-electives.
45
C. Measuring the health outcomes secured by treatment
1. Introduction
In section 4 we set out a methodology for an “ideal” index of NHS output. The output
produced by NHS activity should be measured as the health gain and other
characteristics valued by patients. Construction of such an index requires data on
health gain. We explored two main sources from which such information might be
obtained:
1.
Clinical trial data, from which it may be possible to extract before and after
treatment estimates of health status for patients enrolled in clinical trials;
2.
Observational data, from which before-and-after treatment estimates might be
derived for patients in routine clinical practice.
2. Defining health effects
In essence, the measure of health outcome included in a productivity index should
indicate the ‘value added’ to health as a result of contact with the health system. This
has proved difficult to make operational in the health sector, mainly because of
measurement difficulties. The fundamental difficulty is that the counterfactual – what
health status would have been in the absence of intervention – is rarely observed.
Although health status measurement is becoming increasingly routine in many health
care settings, this tends to involve comparisons of health states before and (sometime)
after intervention. The with/without and before/after measures of the value added of
treatment are unlikely to be equivalent.
Figure C1 illustrates the problem. The figure plots an individual’s health status,
measured on the y axis using a scale where 1 indicates full health and 0 death, with
time measured along the x axis. In this illustration the individual suffers a decline in
health status until time t0 when the option of treatment is made available. Without
treatment, the health profile would follow the dashed line, ho(t), with the patient dying
46
at time t2. If the patient accepts treatment, the time path h*(t) is followed. Treatment
initially reduces health - the patient may be temporarily incapacitated even by
successful surgery. After t1, though, the patient begins a recovery and the treatment
improves health status and lengthens life from t2 to t3. The net change in health status
resulting from treatment can be calculated by subtracting the area where h*(t) < ho(t))
from the area where h*(t) > ho(t). This difference is commonly referred to as the total
number of quality-adjusted life years (QALYs) resulting from treatment. A QALY
therefore captures the effect of treatment on both how long patients live and their
quality of life over time. It is thus a more sophisticated measure of the treatment effect
than simply focussing on the change in survival, which would capture only the
difference between t2 to t3. However, QALYs are considerably more difficult to
estimate accurately. There are three major difficulties.
47
Figure C1 With and without treatment health profiles
Health
status
h=1
*
With treatment: h (t)
o
Without treatment: h (t)
t0
t1
t2
t3
First, the “without treatment” time path, ho(t), is typically not monitored. Rarely are
patients denied any sort of treatment, so this counterfactual is simply unobservable.
Some studies have relied on the “expert opinion” of clinicians to estimate expected
outcome in the absence of treatment (Williams, 1985), but it is not clear how their
opinion can be ratified.
In the absence of knowledge of about ho(t), one possibility is to fall back on a
before/after measure of value-added. This brings with it the two remaining
difficulties. A major complication in measuring QALYs is that to identify h*(t) it is
necessary to measure health status continuously. Typically, though, only snapshot
estimates at particular points in time are available. The timing of the snapshots is
clearly crucial: the individual’s health status appears quite different if measured at t2
rather than at t1. So the treatment effect view can look very different depending on the
point of time chosen to measure health status after treatment. We investigate this issue
for hip and knee replacements in the empirical section of this annex.
The third and most critical difficulty is that the difference between with/without and
before/after measures of value-added will depend on the patient’s underlying health
condition. For conditions where the without treatment time path ho(t) is likely to be
constant, the with/without and before/after measures of value-added may be in close
agreement. But for other conditions, the measures will diverge. Divergence is likely to
be most noticeable where people if left untreated would be expected to suffer
48
deterioration in health over time and where the purpose of the intervention is not to
improve health but to stabilise the condition. In such cases the before/after measure
would suggest little or no change in health status, irrespective of the fact that the
intervention might have been highly beneficial. Sole reliance on before/after measures
will systematically favour interventions that lead to health improvements over those
that stabilise or slow the deterioration of conditions. Consequently, such measures
should not be used to make judgements about the relative value of different types of
health care intervention. It is also problematic for the construction of a productivity
index because if, over time, there is a change in the mix of services provided by the
health care system, the true value of this change may not be captured accurately.
As well as requiring a basis by which the relative value of different interventions is
established, accurate measures of health system productivity ought to able to capture
technological improvements that enhance health outcomes. To measure improvements
from one year to the next we would need to compare h*(t) in a baseline year (year 0)
with h*(t) in a later (year 1). If technological changes have generated improvements in
the quality but not length of life of people suffering from a particular condition, this
would lead to an upward shift of the h*(t) time path, as illustrated in figure C2. If both
quality and length of life were improved, the year 1 time path would extend beyond t3.
Figure C2 Improved health profiles over time
Health
Status
h=1
year 1
year 0
t0
t3
49
3. Measurement instruments
The two key requirements for a measure of health status useful for measuring NHS
productivity are a means of describing the health profile of an individual along a set of
dimensions of health and a mapping from the set of dimensions to an overall health
score.
There are two broad groups of health status measure. The first group is of conditionspecific or targeted measures designed for use in a single therapeutic area. Even
within a single diagnostic area such as depression, it is possible to find multiple
measures. These measures often have their origins in clinical practice. An example is
the VF14, which is described in box 1. Data collected by BUPA using this instrument
are reported later in the annex.
The second, and much smaller group of measures, is generic and designed for use in a
wide range of settings. Potentially such measures might be applied in all therapeutic
settings. Practical design considerations mean that generic measures trade-off
complexity and comprehensiveness against reduced administrative burden and
simplicity of coding. Since there are no absolute standards against which to judge the
design of any measure, such compromises are reached on the basis of the experience
of instrument developers and the technical performance of the measures they produce.
All health status measurement systems have two components. First, they require a
means of describing health states. This can be done by an explicit formalised
classification system (for example, the Karnofsky Performance Scale or Health
Utilities Index) or, more often, by implicit combinations of responses to complex
questionnaires (for example, Hospital Anxiety and Depression Scale or SF-36).
Instruments differ in the dimensions of health they assess and the fineness of the
measures employed along the dimensions. Second, all descriptive systems require
some form of valuation or weighting system to reduce the volume of captured
information to manageable proportions. Most weighting systems are characterised by
arbitrary weights imposed by instrument developers. For example, responses to the 5point Excellent, Very Good to Poor categorisation of self-rated health status in the
original form of SF-36 is recoded to 100, 75, …, 0. The SF-36 is described in box 2,
50
and data derived from this instrument are considered later in this annex.
Following recommendations of the Washington Panel and the Technology Appraisal
Guidelines published by NICE, it is widely held that the weights for health states
(particularly in the context of cost-effectiveness analysis) should be based on those of
the relevant general population. Weights for health states could be elicited from many
different sources – such as health care professionals or patients. It is sometimes
argued that since patients have the relevant first-hand experience it is they who should
be consulted about the issue of weights. However, such weights can be expected to
vary in the course of the patient “journey” through ill-health; consequently, there is
scope for seeing different values arising from patients with (say) newly diagnosed
illness and those whose illness is longer-standing. Similarly, patients who have
wholly or partially recovered from an illness might hold different values to those
whose response to treatment has been less positive.
The requirement for a standard metric to measure health outcomes at system level
rules out the consideration of all condition-specific measures based as they are on
different descriptive systems each with an independent valuation or scoring system.
Profile measures (such as the Nottingham Health Profile and SF-36) report generic
health status in terms of separate dimensions of health but require a set of weights for
the different dimensions to produce a single health index. The set of generic measures
which produce a single index is small and includes Health Utilities Index, EQ-5D, 15D and AQLQ. Of these measures, only EQ-5D is calibrated in terms of social
preference weights of a UK population. EQ-5D is described in box 3.
Box 1: VF14
The VF14 is a condition-specific index of functional impairment for patients that suffer cataracts
(Steinberg 1994). Visual functioning is measured across fourteen vision-dependent activities, such as
the ability to read small print, watch television, or take part in sport. Each activity is scored from 1
(great deal of difficulty) to 4 (no difficulty). A single index (the VF-14) is calculated on a 0-100 scale,
by multiplying the average score across each activity by 25. Averaging across activity implies that they
are of equal importance to the determination of the overall level of visual functioning.
51
Box 2: Short Form 36
The SF36 is a general health questionnaire comprising 35 questions (termed “items”) about various
dimensions of current health status and 1 question relating current health status to that experienced a
year previously. The 35 items can be summarised into two measures, physical health and mental
health. This involves a three-stage process, in which items are first scored, collapsed into scales, and
then scales are collapsed into the summary measures.

First, items are scored to a 0-100 scale, from the least to most favourable responses for each
item. An item with three response categories is coded 0-50-100; an item with six responses is
coded as 0-20-40-60-80-100. Hence, a convenient set of cardinal values is imposed on the
ordinal responses for each item.

Second, sets of items are allocated to one of eight scales. Thus, for example, the physical
functioning scale comprises 10 items; social functioning contains just two. A score for each
scale is calculated by averaging the scores across items in each set. This averaging across
items assumes all are of equal importance in construction of the scale.

Third, the two summary health measures are constructed by amalgamating the four scales that
relate to physical health and the four that relate to mental health. The scores for these
summary measures are calculated as the average of the scale scores, without weighting for the
number of items that comprise the scale. This implies, for instance, in the summary physical
health measure items in the bodily pain scale (comprising two items) carry five times the
weight of the items in the physical functioning scale (comprising ten items).
Items
Scales
Summary
Measure
10 }
Physical functioning
4 }
Role-physical
2 }
Bodily pain
5 }
General health
4 }
Physical
Health
Vitality
2 }
Social functioning
3 }
Role-emotional
5 }
Mental health
Mental
Health
52
Box 3: EQ5D
The EQ-5D is a generic health status instrument which measures health along five dimensions. An
overall health status index EQ-5Dvas can be derived using a visual analogue scale on a 0-100 scale
where the endpoints are labelled “worst imaginable health” and “best imaginable health”. Alternatively
the index can be derived by time trade-off methods. A set of weights for the five EQ-5D dimensions
can then be derived by analysis of the scores on the five dimensions describing a given health state and
the overall summary scores for that health state for each individual. The resulting set of average
weights can then be applied to construct a health index EQ-5Dindex for all possible health states defined
in EQ-5D. Such weights have been periodically determined in UK population studies. The set of
weights most widely used in economic evaluation of healthcare are those from the Measurement and
Valuation of Health Project. These are the default values adopted in evaluation studies commissioned
by NICE.
4. Clinical trial data
The use of published studies to estimate change in health status has a number of
precedents (Berndt et al., 2001; Mai 2004) and may be combined with or validated by
expert opinion (Berndt et al., 2002). In principle trial data can yield estimates of
health status before treatment (hb), for those treated (h*) and – in circumstances where
a placebo was offered - those not treated (ho) at the same time point post treatment.
We sought to determine whether published studies contain information in the requisite
form for use in constructing a productivity index. To do this we have surveyed users
of the EQ5D instrument and reviewed studies that have employed the instrument. We
identified thirty published studies that had used the EQ-5D, and extracted and
summarised the published information in a standard format. The EQ-5D data from
this source used for the construction of the specimen index are given in Table C1.
The main conclusions are that:
1.
EQ-5D has been applied to the evaluation of a wide range of interventions but
coverage is a long way from being comprehensive. The studies are neither
representative of the general stream of NHS practice, nor are they concentrated
on high volume/expenditure activities, nor are they fully reflective of areas
53
where most technological progress has been made.
2.
There is a high degree of variability in the summary statistics reported in
studies, with differences in what measure of the average effect is used (mean or
median) and whether or not variance is reported.
3.
There is no common follow-up time across studies.
4.
No studies report results according to the year in which the intervention took
place so that even if there was more than one study of a particular treatment it
would be difficult to use the information to derive estimates of productivity
growth.
Added to these observations, there are substantial drawbacks to using data from
published clinical trials in constructing a productivity index:
•
The study population may not be representative of the patients who receive the
intervention in routine practice, given the exclusion criteria for many studies.
•
The description of interventions in such studies may not map well to the
classification of activities in the index. Usually trial interventions are very
precisely defined, while fairly aggregated classifications are used in the
construction of productivity indices.
•
In most trials alternative interventions are compared, say of a new technology
with existing practice. As such, there will be alternative estimates of effects.
There are no clear grounds for favouring one estimate over the other in
constructing a productivity index.
•
The follow-up time in studies is variable, and hence the full treatment effect
may not be captured by the estimate.
•
Trials are not conducted for all NHS activities and, as such, only partial
coverage can be achieved.
•
Measurement of productivity change requires information about whether the
effect of interventions has changed over time. There are very few examples of
published studies that have attempted to ascertain the extent to which the
effectiveness of the interventions being compared is determined by the time
the intervention was made.
54
Table C1 Estimates of health status before and after treatment from clinical
trials used for specimen index
Study
number
Diagnostic Group
Before
After
treatment
treatment
b
hbj / h*j
*
( hj )
( hj )
1
Rheumatoid arthritis
0.43
0.63
0.68
2
Ischaemic stroke
0.31
0.62
0.50
3
Psychiatric disorder
0.36
0.41
0.88
3
Rheumatoid arthritis
0.38
0.43
0.88
4
Liver transplantation
0.53
0.59
0.90
9
Chest pain
0.63
0.69
0.91
13
Spinal pain
0.59
0.69
0.86
13
Spinal pain
0.62
0.66
0.94
15
Acute mycardial infatction
0.68
0.72
0.94
24
Low back pain
0.7
0.8
0.88
25
Vaginal standard hysterectectomy
0.75
0.91
0.82
25
Vaginal laparoscopic hysterectectomy
0.76
0.92
0.83
25
Abdominal standard hysterectectomy
0.72
0.89
0.81
25
Abdominal laparoscopic hysterectectomy
0.69
0.87
0.79
28
Benign prostate hypertrophy
0.81
0.85
0.95
29
Acute lower back pain
0.69
0.8
0.86
29
Acute lower back pain
0.76
0.76
1.00
29
Acute lower back pain
0.69
1
0.69
30
Varicose veins - PIN stripping
0.73
1
0.73
30
Varicose veins - conventional stripping
0.8
1
0.80
mean
0.83
stddev
0.12
5. Observational data
In view of the difficulties associated with relying on clinical trial data, an alternative
is to measure health status before and after treatment using surveys of the general
stream of NHS patients. In the absence of routinely collected NHS data we explored
three existing sources of more limited observational data:

Health Outcomes Data Repository
55

York District Hospital

BUPA
5.1 Health Outcomes Data Repository
Since June 2002, the Health Outcomes Data Repository (HODaR) (www.cardiffresearch-consortium.co.uk/hodar) has operated a continuous health status survey of all
inpatients and outpatients at a single large Welsh Trust. These are linked to
individual-level hospital, primary and community care data. We were provided with
the HODaR database, containing 29,541 observations. We analysed the information in
order to gain an understanding of the challenges faced in collecting and using
observational data. Data on health status are collected using two generic health survey
instruments, SF36 and EQ-5D. In-patients are sent a postal questionnaire six weeks
post discharge. Outpatients are surveyed at the time of their clinic appointment.
Because the survey is linked to HES data, it is possible to analyse the health status
survey by condition, defined by ICD-10 diagnoses codes, procedure codes or
Healthcare Resource Group (HRG). Table C2 presents the mean and standard
deviation in EQ5D score by HRG for all 29,541 patients in descending order of HRG
frequency (only HRGs containing more than 65 patient surveys are listed).
For the full sample, mean EQ5D score is 0.66, implying that the average health state
experienced by these patients has 66% of the value of being in full health. The
variation in EQ5D score is high, with a standard deviation of 0.32. This is to be
expected, given the wide range of conditions included in the sample. However, the
variance is high even within each HRG category suggesting that, although they might
be resource homogenous, HRGs are not particularly “health outcome” homogenous.
This implies caution in the use of the mean EQ5D estimates as precise measures of
the health states experienced by patients in each HRG.
56
Table C2 EQ5D score by HRG
HRG
N
Mean
EQ5D
Standard
Deviation in
HRG
N
EQ5D
E03
Mean
EQ5D
Standard
Deviation in
EQ5D
Total
29541
0.66
0.32
E14
1141
0.61
0.31
M09
120
0.82
0.30
J37
811
0.76
0.28
Q14
120
0.56
0.33
F06
741
0.69
0.30
F54
118
0.77
0.26
B02
700
0.67
0.29
H37
116
0.63
0.27
M06
631
0.74
0.30
L27
115
0.66
0.32
S22
550
0.63
0.31
L46
114
0.59
0.34
F35
547
0.73
0.29
C22
112
0.72
0.33
M07
515
0.74
0.29
F32
112
0.73
0.26
120
0.69
0.29
L21
479
0.69
0.30
B03
110
0.60
0.31
N12
422
0.85
0.23
M11
105
0.85
0.24
J02
387
0.73
0.26
S25
105
0.57
0.41
E36
338
0.63
0.34
F31
103
0.70
0.27
E15
318
0.65
0.32
S01
99
0.70
0.26
F47
315
0.69
0.31
C02
93
0.77
0.29
F16
310
0.67
0.30
Q12
93
0.58
0.31
H10
296
0.59
0.33
R02
92
0.43
0.38
H04
253
0.51
0.30
E10
90
0.61
0.32
M03
243
0.70
0.28
F82
89
0.85
0.25
C24
237
0.70
0.31
H06
89
0.43
0.33
M02
232
0.75
0.28
M01
89
0.81
0.25
F74
228
0.77
0.26
H19
87
0.64
0.35
N07
225
0.89
0.20
J03
87
0.77
0.28
E04
221
0.63
0.27
B06
82
0.71
0.27
L17
217
0.74
0.27
F37
81
0.61
0.37
H02
204
0.53
0.30
F93
81
0.80
0.28
H17
199
0.67
0.32
H40
81
0.74
0.30
E12
196
0.68
0.28
A22
80
0.54
0.33
E33
194
0.58
0.28
H26
78
0.42
0.34
A07
191
0.25
0.34
L26
77
0.73
0.33
G14
190
0.72
0.29
E09
76
0.68
0.29
E34
178
0.57
0.32
F56
76
0.64
0.36
B09
172
0.69
0.30
F42
75
0.61
0.39
B05
164
0.66
0.31
H33
75
0.49
0.31
J35
154
0.77
0.28
C01
74
0.65
0.36
F73
151
0.70
0.26
K16
74
0.65
0.34
M05
149
0.79
0.28
E31
73
0.62
0.32
S16
147
0.55
0.38
K01
72
0.71
0.28
E30
142
0.71
0.29
E18
70
0.53
0.35
H13
142
0.66
0.32
L10
70
0.71
0.32
57
Q11
142
0.80
0.24
Q10
70
0.71
0.30
D21
140
0.56
0.34
A30
69
0.55
0.37
L19
140
0.70
0.28
D13
68
0.51
0.38
D20
136
0.45
0.34
E32
68
0.71
0.30
E35
133
0.57
0.31
J04
68
0.77
0.24
F46
128
0.60
0.32
L53
68
0.76
0.28
E29
126
0.67
0.24
R03
68
0.31
0.36
E08
125
0.64
0.28
S98
68
0.63
0.31
F36
123
0.52
0.34
D02
67
0.61
0.28
H14
123
0.76
0.28
K04
67
0.67
0.37
C14
121
0.76
0.32
L43
67
0.76
0.26
For 2,587 patients who had two courses of treatment, two EQ5D surveys are available
so that it is possible to measure the change in their health status over time Although
such patients are unlikely to be typical of all patients receiving care, we have
concentrated on analysing data for this subset of patients to see what light it sheds on
changes in health status.
This subset of patients differ along three important dimensions:
1. The elapsed time between the two surveys. This can be calculated because we
have information on the date that each survey was completed.
2. Whether the two surveys relate to the same underlying condition. We have
assessed whether coding of diagnosis remains similar for the two surveys.
Different diagnostic codes may indicate that the surveys relate to different
underlying conditions.
3. The ordering of the settings to which the surveys refer. Ordering is determined by
the survey identification number. The interpretation placed on the change in health
status observed between each survey depends on this ordering. There are four
possible scenarios, depicted in figure C3 below.
58
1.0
Health status
Health status
Figure C3 Timing scenarios of relationship between surveys
t 0+6
t0
t1
1.0
t0
time
1.0
t0
t1
t 1+6
time
Scenario 2
Health status
Health status
Scenario 1
t 0+6
t1
1.0
t0
time
Scenario 3
t1
t 1+6
time
Scenario 4
Scenario 1: Inpatient survey followed by outpatient survey.
Under this scenario, the patient is admitted at t0, completes a post-discharge survey at
t0+6 and completes an outpatient survey during their clinic visit at time t1. The change
in health status is calculated as ∆h1 = EQ 5Dt1 − EQ5Dt0+ 6 . Even if the outpatient visit
relates to the same health problem as the earlier admission it is not possible to predict
the sign of ∆h1. It may be positive if the patient continues to recover during the time
between the surveys. But if the outpatient visit is triggered by deterioration in the
patient’s condition, ∆h1 will be negative. Moreover the full treatment effect will not be
captured, because any recovery occurring between t0+6 and t0 is not recorded. It may
well be that, for many interventions, the most dramatic changes in health status occur
during this period.
Scenario 2: Two inpatient surveys
Under
this
scenario
the
change
in
health
status
is
calculated
as
59
∆h2 = EQ5Dt1+6 − EQ5Dt0+ 6 . There is a high probability that the two inpatient
admissions are unrelated, this probability increasing the longer the elapsed time
between the two surveys. Irrespective of whether or not the admissions are related, it
is not possible to predict the sign of ∆h2.
Scenario 3: Two outpatient surveys
Here the change in health status is calculated as ∆h3 = EQ5Dt1 − EQ 5Dt0 . As for the
previous scenario it is not possible to predict the sign of ∆h3.
Scenario 4: Outpatient survey followed by inpatient survey
The change in health status is calculated as ∆h4 = EQ5Dt1+6 − EQ5Dt0 . This scenario
may represent a before-and-after measurement of the intervention effect. This is more
likely if the outpatient visit and admission are related, the shorter the lapsed time
between the outpatient visit and admission, and if most of the post-treatment recovery
occurs in the six-week post-discharge period. The more these conditions prevail, the
more likely that ∆h4 will have a positive sign.
5.1.1 Analysis
Table C3 shows the number of patients, change in EQ5D score, and time between the
two surveys for each of the above scenarios. The mean change in EQ5D is negative
for all four scenarios, although there is a wide range in scores. There are a number of
EQ5D health states to which negative values are attached, implying that these are
health states considered worse than death. Changes into or out of such health states
may lead to ∆h taking values in excess of ± 1 .
60
Table C3 Change in EQ5D score and time between surveys, by scenario
Scenario
N
Mean change Min change in Max change in
in EQ5D
EQ5D
EQ5D
Mean time
between
surveys (days)
1 Inpatient to Outpatient
502
-0.0075
-1
1
633
2 Inpatient to Inpatient
1612
-0.0015
-1
1.016
634
3 Outpatient to Outpatient
192
-0.0196
-0.912
0.796
489
4 Outpatient to Inpatient
281
-0.0069
-1
1
581
The lapsed time between surveys is often considerable, amounting to around a year
and a half on average. This is the case even for scenario four, where the average time
between surveys amounts to 581 days. This undermines the case for considering the
surveys for this subset of patients as constituting before-and-after measurements. That
said, the distribution over time for these patients is highly skewed. The patients can be
classified into three groups:
1. 48 patients for whom the date of the outpatient and inpatient surveys are identical
(the ordering of surveys is determined by the survey identification number). This
implies that the outpatient survey was completed retrospectively, at the same time
as the inpatient survey. There may be problems of recall for these patients.
2. Those for whom the elapsed time is very long. It is doubtful that these surveys
constitute before-and-after measurements.
3. Remaining patients, where elapsed time is short enough to suggest they may be
before-and-after surveys. Of course, this requires a judgement to be made about
what constitutes a ‘short enough’ time.
The possibility that the two surveys constitute before-and-after measurement is further
undermined when considering the diagnosis recorded at each survey. The ICD-10
codes recorded over the two periods are rarely the same, and there is little consistency
even in the HRG chapter to which the patient is classified. Table C4 shows how the
281 patients under scenario four are classified to HRG chapters at the first and second
survey. Shaded boxes along the diagonal indicate the number of patients who remain
in the same chapter for both surveys with only 43% doing so.
61
The off-diagonal cells show the number of patients who moved HRGs between the
surveys. As an example, the first row shows where the 7 patients originally classified
to Chapter A (Nervous System) were classified in their second survey. Thus, 3
remained in this chapter, one patient was subsequently classified to Chapter C
(Mouth, Head, Neck and Ears) and three to Chapter H (Musculoskeletal). The extent
of this movement across HRG chapters implies that caution should be exercised in
attaching the change in EQ5D score to a specific underlying condition. Moreover,
inconsistency in the condition to which patients are classified over time increases data
requirements substantially (by the power of the additional classes that need to be
considered).
62
Table C4 Change in HRG chapter classifications between surveys, scenario 4 patients
HRG Chapter (2nd Survey)
HRG Chapter (1st Survey)
Nervous System
A
B
3
Eyes and Periorbita
Mouth, Head, Neck and Ears
D
1
1
1
Cardiac Surgery
1
Digestive System
G
6
L
1
2
1
3
2
2
18
4
4
1
5
Urinary Tract and Male Reproductive System
Q
R
S
U
1
2
3
2
4
1
2
3
3
2
2
9
1
2
3
1
4
1
1
1
1
1
2
2
1
1
1
1
3
3
2
1
3
5
2
13
4
53
5
52
2
27
2
17
2
7
2
2
21
1
3
25
1
5
5
1
10
1
2
1
1
10
17
4
16
37
281
1
1
11
1
1
1
11
1
1
1
1
1
4
2
Spinal Surgery and Primal Surgery Conditions
1
Haematology, Infectious Diseases
1
Mental health
1
Invalid Primary Diagnosis
2
5
7
8
6
57
1
35
10
8
7
3
1
Total
7
4
Obstretics and Neonatal Care
Total
N
1
2
Vascular System
M
1
2
1
2
K
1
34
Endocrine and Metabolic System
Female Reproductive System
J
2
1
1
1
H
1
1
Musculoskeletal System
F
3
1
1
Hepato-Biliary and Pancreatic System
E
1
Respiratory System
Skin, Breast and Burns
C
22
1
1
1
1
1
24
20
14
3
3
1
10
63
Table C5 Change in EQ5D score between surveys, scenario 4 patients
HRG Chapter (2nd Survey)
HRG Chapter (1st Survey)
Nervous System
A
B
0.20
Eyes and Periorbita
Mouth, Head, Neck and Ears
D
E
F
G
0.15
0.59
-0.59
Cardiac Surgery
-0.10
Digestive System
Hepato-Biliary and Pancreatic System
-0.16
Musculoskeletal System
-0.07
-0.28
K
L
-0.06
-0.33
0.06
0.06
0.00
0.04
0.12
0.04
-0.09
-0.30
-0.28
-0.03
-0.28
0.08
-0.06
-0.21
0.08
-0.11
-0.14
0.19
0.00
0.03
0.12
0.15
-0.17
-0.26
0.00
Urinary Tract and Male Reproductive System
Q
R
S
0.11
U
Total
-0.16
-0.03
-0.15
-0.31
-0.18
-0.10
0.05
-0.07
Obstretics and Neonatal Care
-0.40
0.05
0.75
0.19
0.01
0.18
0.11
0.56
0.06
-0.19
-0.11
0.31
0.03
0.04
0.05
0.13
-0.07
-0.04
0.06
-0.02
0.05
0.00
0.08
-0.07
-0.24
0.00
-0.05
0.00
-0.14
0.00
-0.08
-0.02
1.00
0.04
-0.03
0.00
-0.03
0.00
0.00
-0.12
-0.54
0.00
1.00
-0.12
0.00
0.20
0.00
-0.15
0.35
0.10
0.15
0.00
Haematology, Infectious Diseases
-0.66
Mental health
0.10
Invalid Primary Diagnosis
1.00
-0.07
0.35
0.04
-0.15
-0.07
0.06
0.00
-0.07
0.02
-0.11
0.00
-0.06
0.00
-0.01
-0.07
0.00
0.11
Spinal Surgery and Primal Surgery Conditions
Total
N
0.00
-0.35
Vascular System
M
-0.11
-0.03
-0.10
-0.04
J
-0.84
Endocrine and Metabolic System
Female Reproductive System
H
-0.39
-0.04
-0.10
Respiratory System
Skin, Breast and Burns
C
-0.16
-0.27
-0.25
0.07
0.20
-0.05
0.01
0.28
0.10
0.00
-0.10
0.00
0.05
-0.62
0.49
0.03
0.07
0.02
0.04
0.01
-0.01
64
Finally for completeness, Table C4 shows the mean change in EQ5D score (∆h4) for
the patients in scenario 4 according to their classifications to HRG chapters across the
two surveys. Obviously, small numbers in each cell preclude drawing any conclusions
from these data.
5.1.2 Conclusions
We have analysed the HODaR set of observational data to ascertain whether the
information can be utilised in the construction of outcome weights for a productivity
index. We have concluded that the HODaR data are unsuitable for this purpose. The
surveys have not been administered with the express intention of collecting before and
after information. Although multiple surveys exist for a subset of patients, it is
unlikely that many of these constitute before-and-after measurements.
However, the HODaR data do provide an indication of the variation in EQ5D scores
for particular conditions (e.g. specific diagnoses or HRGs). This information might be
used to assess the sample size requirements for the collection of other observational
data.
5.2 York District Hospital
The Orthopaedics and Trauma unit at York District Hospital began collecting SF36
data since March 2001 for patients undergoing hip and knee replacements. Patients
completed the SF36 up to December 2002 and SF12 thereafter at the following times:
•
Initial consultation in the outpatients setting
•
Pre-operatively upon admission
•
Between 3-6 months following their operation
•
12 months post-operatively
Post-operative information was collected predominantly from postal questionnaires,
although some patients may have filled in the forms during a follow-up outpatient
appointment.
The SF36 data are analysed to answer the following questions:
•
Does it matter when hb is measured?
65
•
What are the values of hb and h*?
•
Does it matter when h* is measured?
The SF12 data proved of limited value, and hence analytical results are not reported.
5.2.1 Results
Timing of the measurement of hb
Patients in the sample completed SF36 either at initial consultation or at admission.
The rationale was to ascertain whether patients experienced any pronounced
deterioration in their health status prior to admission. If no such deterioration occurred
the pre-treatment measures can be considered equivalent.
•
For patients having hip replacements, there were no differences among
patients in terms of their age and gender as to when the before treatment SF36
was administered (Table C5)
•
Those surveyed at admission reported slightly worse bodily pain than those at
initial consultation (23.8 compared to 26.95), but there were no differences for
these patients along the other SF36 scales.
•
85 patients waiting for hip replacements were surveyed at both periods (Table
C6). These patients reported improved general health between their initial
consultation and at admission (58.37 compared to 64.27), but there were no
differences across the other SF36 scales.
•
For patients having knee operations, a higher proportion of women were
surveyed at the initial consultation than on admission (Table C5).
•
Mean SF36 scores on some scales were higher at admission than they had
been at the initial consultation. These scales were physical functioning,
general health and the role-emotional element of mental health.
•
The patients surveyed at admission reported higher levels of physical health
and of mental health than those surveyed at the initial consultation (physical
health: 35.1 compared to 30.93; mental health: 57.67 compared to 52.28).
•
56 patients were surveyed at both times, and these reported improvements in
66
their general health, vitality and mental health (Table C6). Overall, these
patients experienced an improvement in their mental health summary score
(58.67 compared to 51.03).
•
These findings are contrary to the expectation that patients would have
experienced deterioration in their health status while waiting for admission.
These results suggest that the timing at which the SF36 is administered may have an
impact on the level of health reported.
Table C5 SF36 scores at initial consultation and pre-operatively, all patients
Hips
Initial
consultation
Males
Females
n
Pre-operative
Knees
Initial
consultation
n
Preoperative
n
n
41.50%
58.50%
93
131
41.90%
58.10%
101
140 NS
38.60%
61.40%
64
102
49.10%
50.90%
162
168 *
Age
68.23
224
67.78
241 NS
70.11
166
70.05
328 NS
Physical functioning
Role-Physical
Bodily pain
General health
Physical Health
23.88
10.1
26.95
63.53
31.49
224
217
221
221
212
22.83
10.35
23.8
65.69
30.68
239
236
239
240
231
NS
NS
*
NS
NS
21.65
12.04
28.66
61.49
30.93
165
164
164
163
159
26.49
16.12
31.31
66.71
35.1
328
318
325
322
310
**
NS
NS
**
**
Vitality
Social functioning
Role-Emotional
Mental health
Mental health
43.39
48.65
45.78
69.38
52.06
224
223
217
224
217
41.62
46.99
45.53
68.45
50.73
241
241
235
241
235
NS
NS
NS
NS
NS
44.59
52.33
43.57
69.38
52.28
164
166
158
165
156
48.44
56.4
52.78
72.59
57.67
327
324
318
328
315
NS
NS
*
NS
*
67
Table C6 SF36 scores at initial consultation and pre-operatively, patients
surveyed on both occasions only
Related observations
Hips
Knees
Initial
consultation
Pre-operative
n
Physical functioning
Role-Physical
Bodily pain
General health
Physical Health
19.08
5.95
22.76
58.37
26.74
17.31
8.23
23.18
64.27
28.5
85
84
84
84
82
Vitality
Social functioning
Role-Emotional
Mental health
Mental health
40.67
47.21
39.23
64.16
48.11
41.2
45.29
43.9
67.89
49.95
85
85
82
85
82
Initial
consultation
Pre-operative
n
NS
NS
NS
**
NS
22.82
6.73
25.81
63.9
29.84
21.44
12.5
29.59
70.32
32.86
56
52
54
52
48
NS
NS
NS
*
NS
NS
NS
NS
NS
NS
41.64
51.34
39.74
70.91
51.03
49.82
57.81
50
76.33
58.67
55
56
52
55
51
**
NS
NS
**
**
The values of hb and h* and timing of the measurement of h*
Tables C7 and C8 compare the mean pre-operative scores by SF36 dimension with
the mean scores 3-4 months and 1 year after the operation for those having hip and
knee replacements. The main points are that:
•
Patients showed improved scores across all dimensions.
•
Much of the improvement occurs in the immediate post-treatment period, with
the change in health status being little changed between the first follow-up
period and 1 year after treatment. This implies that a six month follow-up is
sufficient to capture the treatment effect for these patients.
•
A number of patients completed their first follow-up at 6 months rather than 34 months. Some completed the SF36 at both 3-4 months and at 6 months. The
scores for these patients are little different between the two periods. This
implies that the data across these two periods can be considered equivalent for
the purposes of analysis.
•
In other words, the h*(t) time path depicted in figure C1 can be considered flat
between 3 and 12 months for these two types of patient.
68
Table C7 SF36 scores for hip replacement, patients with both before and after
surveys only
York Trust
Hip Replacements
SF36 data: comparison of pre-op and follow-up scores, by dimension
Pre-Op
3/4 months Sig. Level
Pre-Op
139 cases
1 yr
Sig. Level
53 cases
Physical functioning
Role-Physical
Bodily pain
General health
24.8
11.6
27.1
66.1
61.3
44.3
65.1
70.6
**
**
**
**
23.2
8.2
26.1
67.8
61.9
53.8
68.8
70.7
**
**
**
NS
Physical Health
32.5
60.3
**
31.5
63.8
**
Vitality
Social functioning
Role-Emotional
Mental health
45.3
52
51
70.3
59.6
79.2
72.2
78.4
**
**
**
*
43
53.3
41.8
66.1
63.7
80.9
74.8
77.7
**
**
**
**
Mental health
54.6
72.6
**
51.6
74.3
**
* sig at 95%
** sig at 99%
69
Table C8 SF36 scores for hip replacement, patients with both before and after
surveys only
York Trust
SF36 data: comparison of pre-op and follow-up scores, by dimension
Knee Replacements
IC/Pre-op 3/4 months Sig. Level
IC/Pre-op
1 yr
Sig. Level
134 cases
54 cases
Physical functioning
Role-Physical
Bodily pain
General health
27.4
13.4
31.1
69.1
49.2
35.8
58
67.4
**
**
**
NS
24.5
14.2
32.8
66.3
49.6
44.9
57.1
66.1
**
**
**
NS
Physical Health
35.5
52.6
**
34.8
54.4
*
Vitality
Social functioning
Role-Emotional
Mental health
48.4
57
52.3
74.5
53.3
72.5
62.2
76.4
**
**
*
*
45.7
48.4
40.1
68.6
54.6
71.1
72.5
74.8
**
**
*
**
Mental health
58.2
66.1
*
51.5
68.3
**
* sig at 95%
** sig at 99%
5.2.2 Conclusions
A number of lessons emanate from the York experience:
•
Data entry was undertaken using an optical reader. This was highly deficient,
generating a large number of duplicate records and coding inaccuracies. The
technology has to improve substantially for this to be a cost-effective
alternative to professional data entry.
•
For a large proportion of patients originally surveyed there is no information
about their post-treatment experience. This implies that effort must be directed
at ensuring that patients are followed-up. This is not a trivial matter. It requires
a system to alert when a follow-up questionnaire is due to be posted and a
system for ensuring it is returned.
•
For these treatments, there was little difference between the two before
treatment surveys, suggesting that patients did not experience a marked
deterioration in their condition in the immediate period prior to treatment.
70
Indeed, contrary to expectations, patients waiting for knee replacements
reported improvements along some dimensions.
•
There was little difference in SF36 scores at the first follow-up and one year
post-treatment. This implies that, for these conditions, most of the treatment
effect is realised in the first 6 months, and that measurement beyond this
period adds little additional information.
•
Part way through the data collection exercise a decision was made to change
instruments from the SF36 to SF12. These instruments are not directly
comparable, so it was not possible to analyse treatment effects for those
patients who originally completed SF36 but who were sent SF12 postoperatively. Particularly if an objective of the exercise is to identify time
trends, it is important to collect data in a consistent fashion.
71
5.3 BUPA
Since 1998 BUPA have been collecting outcomes data (before and after health status)
on their patients from 70 private hospitals. BUPA provided CHE/NIESR with a
dataset containing 90,000 patient treatment episodes covering the period 1998-2003.
Patients fill in a health status questionnaire before treatment and three or four months
post treatment depending on whether the instrument is SF-36 or VF-14 (for cataract
procedures since 2001). Originally BUPA aimed for comprehensive data collection
but, since 2002, efforts have been concentrated on collecting data from 20 “sentinel”
high volume procedures. We analyse the data for two purposes:
•
To identify temporal trends in health gain that could be attributed to
improvements in each procedure rather than (say) changes in the type of
patients receiving the procedure. For this analysis we concentrate on sentinel
procedures for which more than 1,000 before and after surveys are available.
•
To derive estimates of before and after health status for use in the specimen
index. Full sample information is exploited for this purpose
5.3.1 Temporal trends in health gain
Method
The key attraction of the BUPA data is that it is possible to explore whether any
technological improvements in the provision of care that took place over the period
led to improvements in health status. This is investigated as follows:
•
To ascertain whether any trends in health gain are discernible we compare the
before and after change in health status over time;
•
Changes in health status may be attributable to changes in the characteristics
of patients. We attempt to identify conditional trends by controlling for the age
and gender of patients.
We regressed individual health gain, measured by either SF36 or VF14, on a set of
72
covariates and on time dummies. Covariates include age, gender, and hospital dummy
variables. For both the SF36 and VF14, health gain, ∆hi , is calculated as:
∆hi = hi* − hib
where hi* is the post-treatment measure of health; hib is the pre-treatment measure; and
i indexes patients. We then calculate the time trend, conditioning on various factors
that might explain measured health status. The most comprehensive regression model
took the following form:
∆hi = α + β1agei + β 2 agei2 + β 3 agei3 + β 4 genderi +
J
K
T
j =1
k =1
t =1
∑ δ j providerij + ∑ δ k seasonik + ∑ γ t yearit + ε i
We explored various permutations of this specification and report results for a
reduced model of the following form:
T
∆hi = α + β1agei + β 2 agei2 + β 3agei3 + β 4 genderi + ∑ γ t yearit + ε i
t =1
Data
Since 2002, BUPA has concentrated on collecting data on sentinel (high volume)
procedures. We analyse those for which more there are more than 1,000 observations
in total but restrict analysis only to those patients who completed both a before (preoperative) and an after (3 months post-operative) survey. When the SF36 instrument
is employed this limits the available sample to two-thirds of the original sample,
because not all of patients originally surveyed had completed an “after treatment”
survey at the time at which the data were extracted and because there was item nonresponse in a number of surveys. In contrast, we have complete before and after
surveys for around 95% of those surveyed using the VF14 instrument.
73
Table C9 Numbers of BUPA patients surveyed, selected conditions
HRG code
F73 & F74
G13 & G14
A07
H23 & H24
C14
H02
B02 & B03
B02 & B03
BUPA code and description
T2000 - Primary inguinal hernia
J1830 - Laparoscopic cholecystectomy
A5210 - Epidural injection
M6530 - TURP
F0910 - Wisdom Teeth extraction
W3710 - Primary hip replacement
C7120 & C7180 - Cataract Surgery
C7120 & C7180 - Cataract Surgery
Instrument
SF36
SF36
SF36
SF36
SF36
SF36
SF36
VF14
Initial
sample
3,172
1,322
1,033
1,021
1,686
3,198
3,441
2,759
Usable
sample
2,169
901
633
617
1,254
1,827
1,703
2,613
Results
Before and after health status
The before and after health status of BUPA patients is reported for each SF36 scale
and by the SF36 summary measures in Table C10. On average, patients generally
experienced a significant positive change in both their physical and mental health
status. The exceptions were patients who had wisdom teeth extraction, who
experienced no change in their mental health, and patients who underwent
transurethral resection of the prostate (TURP), who report no change in either their
physical or mental health.
The SF36 scales provide an indication of what aspects of health the treatments have
most impact upon. For instance, patients who have treatment for a primary inguinal
hernia report most improvement in the “role-physical” and “bodily pain” scales of the
SF36, but less change along the other SF36 scales. This “drill down” ability appears
to be an attractive feature to clinicians of the SF36 instrument.
Table C10 SF36 scores by dimension and condition
74
F73 & F74 Primary inguinal
hernia
Pre-Op
3 mths
Sig. Level
C14 Wisdom teeth extraction
Pre-Op
2169 cases
3 mths
80.7
65.9
66.3
75.3
86
79.6
79.2
73
**
**
**
**
94.6
93.1
78.1
79.4
93.8
94.1
87.6
77.4
Physical Health
72.1
79.4
**
86.3
88.2
Vitality
Social functioning
Role-Emotional
Mental health
67.4
85.1
90.8
81.9
68.1
90.5
90.5
82.3
69.5
91.5
94.4
79.8
69.1
91.9
91.9
79.6
Mental health
81.3
82.8
83.8
83.1
**
**
H02 Primary hip replacement
3 mths
Pre-Op
1254 cases
Physical functioning
Role-Physical
Bodily pain
General health
Pre-Op
Sig. Level
G13 & G14 Laparoscopic
cholecystectomy
Sig. Level
1827 cases
3 mths
Sig. Level
Pre-Op
3 mths
**
**
84.7
62.2
51.2
70.7
87.2
83
80.2
72.7
**
**
**
**
79.4
70.4
74.4
70.8
79.1
69.7
77.8
68.4
**
67.2
80.8
**
73.7
73.7
53.9
73
84.4
74.5
64.4
87.9
89.2
79.4
**
**
**
**
63.7
81
88.2
80.3
63.6
84.2
87.4
82
71.4
80.2
**
78.3
79.3
**
Sig. Level
Sig. Level
617 cases
901 cases
A07 Epidural injection
Pre-Op
3 mths
H23 & H24 TURP
**
**
**
**
B02 & B03 Cataract surgery
Pre-Op
633 cases
3 mths
Sig. Level
1703 cases
Physical functioning
Role-Physical
Bodily pain
General health
31.8
17.2
29.3
69.41
59.54
38.68
64.31
70.92
**
**
**
**
48.4
22.8
28.7
61.6
58
39.3
47.4
57.9
**
**
**
**
72.3
73.4
78.5
69.7
71.4
68.8
75
67.7
**
**
**
**
Physical Health
36.93
58.36
**
40.4
50.7
**
73.5
70.7
**
Vitality
Social functioning
Role-Emotional
Mental health
47.68
58.33
72.94
74.91
61.11
76.95
81.66
82.02
**
**
**
**
44.6
54.7
74.1
68.6
51.1
67.2
74.4
71.7
**
**
**
64.3
87.1
88.3
81.5
62
85.2
85.1
80.6
**
**
**
**
Mental health
63.46
75.44
**
60.5
66.1
**
80.3
78.2
**
75
Unconditional time trends
With the exception of the final graph, the graphs below plot the mean and 95%
confidence intervals for the difference in before and after SF36 physical and mental
health summary scores for each of the conditions over time. Time is plotted by
season, with the first set of observations corresponding to winter 1998 and the final
set to spring 2003.
The final graph provides equivalent information derived from the VF14 applied to
patients having cataract surgery. This series is shorter, commencing in winter 2001
and finishing in spring 2004.
•
None of the graphs based on the SF36 provide evidence of a time trend that
might be attributable to technological progress.
•
There does appear to have been a temporal improvement in the change in
health status as measured by the VF14. The 144 cataracts patients surveyed in
winter 2001 reported a 12-point improvement in their VF14 score following
treatment. The 184 patients surveyed in spring 2004 reported a 14-point
improvement. Generally, patients treated in each successive quarter reported
marginal improvements compared to those treated in the preceding quarter.
•
The time trend captured by the VF14 instrument is not in evidence when the
SF36 is applied to cataract patients. This may be because the VF14 is more
sensitive than the SF36 to the specific dimensions of health gain that cataract
patients experienced.
•
It may be that responses to mental health questions, in particular, are
contaminated by seasonal effects. If there is a seasonal effect, it may be
revealed in the type of pattern evident in the SF36 mental health summary
scores for patients who had hip replacements. In quarters 8 and 12 (both JulySeptember), mean summary scores appear higher than those in other quarters.
However, there is no general evidence that these survey responses are
seasonally affected.
76
SF36 physical score by time, hernia
20
95% CI DIFFPHYS
10
0
-10
N=
34
62
95
101
90
161
144
157
192
1
2
3
4
5
6
7
8
9
185
145
128
126
154
136
110
97
52
10 11 12 13 14 15 16 17 18
TIME
SF36 mental score by time, hernia
10
8
6
95% CI DIFFMENT
4
2
0
-2
-4
N=
34
62
95
101
90
161
144
157
192
1
2
3
4
5
6
7
8
9
185
145
128
126
154
136
110
97
52
10 11 12 13 14 15 16 17 18
TIME
77
SF36 physical score by time, wisdom teeth extraction
20
95% CI DIFFPHYS
10
0
-10
N=
12
38
84
78
71
101
74
79
106
1
2
3
4
5
6
7
8
9
107
88
64
65
75
64
64
60
24
10 11 12 13 14 15 16 17 18
TIME
SF36 mental score by time, wisdom teeth extraction
10
95% CI DIFFMENT
0
-10
-20
N=
12
38
84
78
71
101
74
79
106
1
2
3
4
5
6
7
8
9
107
88
64
65
75
64
64
60
24
10 11 12 13 14 15 16 17 18
TIME
78
SF36 physical score by time, laparoscopic cholecystectomy
40
30
95% CI DIFFPHYS
20
10
0
-10
N=
10
25
30
33
54
62
82
50
67
83
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 17 18
74
53
68
51
50
45
53
11
TIME
SF36 mental score by time, laparoscopic cholecystectomy
40
30
95% CI DIFFMENT
20
10
0
-10
N=
10
25
30
33
54
62
82
50
67
83
74
53
68
51
50
45
53
11
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 17 18
TIME
79
SF36 physical score by time, TURP
40
30
20
95% CI DIFFPHYS
10
0
-10
-20
-30
N=
9
19
35
27
20
41
47
44
52
62
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 17 18
36
47
41
36
37
28
25
11
TIME
SF36 mental score by time, TURP
30
20
10
95% CI DIFFMENT
0
-10
-20
-30
N=
9
19
35
27
20
41
47
44
52
62
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 17 18
36
47
41
36
37
28
25
11
TIME
80
SF36 physical score by time, hip replacement
40
30
95% CI DIFFPHYS
20
10
0
N=
13
31
81
71
70
105
109
121
156
1
2
3
4
5
6
7
8
9
145
156
130
149
146
92
113
76
63
10 11 12 13 14 15 16 17 18
TIME
SF36 mental score by time, hip replacment
30
20
95% CI DIFFMENT
10
0
-10
N=
13
31
81
71
70
105
109
121
156
1
2
3
4
5
6
7
8
9
145
156
130
149
146
92
113
76
63
10 11 12 13 14 15 16 17 18
TIME
81
SF36 physical score by time, epidural injection
30
20
95% CI DIFFPHYS
10
0
-10
N=
23
26
28
47
39
45
31
42
28
40
47
42
41
45
43
38
25
2
3
4
5
6
7
8
9
10
11
12 13
14
15
16
17
18
TIME
SF36 mental score by time, epidural injection
20
10
95% CI DIFFMENT
0
-10
-20
N=
23
26
28
47
39
45
31
42
28
40
47
42
41
45
43
38
25
2
3
4
5
6
7
8
9
10
11
12 13
14
15
16
17
18
TIME
82
SF36 physical score by time, cataract surgery
10
95% CI DIFFPHYS
0
-10
-20
N=
27
62
94
101
98
132
141
137
159
1
2
3
4
5
6
7
8
9
148
97
132
81
45
39
26
15
1
10 11 12 13 14 15 16 17 18
TIME
SF36 physical score by time, cataract surgery
10
95% CI DIFFPHYS
0
-10
-20
N=
27
62
94
101
98
132
141
137
159
1
2
3
4
5
6
7
8
9
148
97
132
81
45
39
26
15
1
10 11 12 13 14 15 16 17 18
TIME
83
VF14 by time, cataract surgery
20
18
16
95% CI DIFFVF14
14
12
10
8
6
N=
144
320
238
260
279
271
251
218
217
235
184
1
2
3
4
5
6
7
8
9
10
11
TIME
84
Conditional time trends
The table below reports coefficients from regressing the change in the SF36 physical
and mental health summaries on age, gender (male=1) and year (with 1998 and 1999
being the baseline). The final column of the table reports the change in VF14 for
cataract patients. Coefficients that are significant at 5% are highlighted.
The main messages are:
•
Very few variables are statistically significant;
•
There is no evidence of year-on-year trends in changes in the SF36 physical or
mental health summary scores for any of the conditions;
•
For VF14, the coefficients on the year dummies are increasingly positive. This
suggests that the time trend identified in the unconditional analysis is not
explicable by changes in the age and gender composition of patients having
cataract surgery.
85
n
G13/G14
J1830
H23/H24
M6530
H02
W3710
A07
A5210
Laparoscopic
cholecystectomy
TURP
primary hip
replacement
Epidural injection
2168
1253
900
616
1826
B02/B03
B02/B03
C7120/C7180 C7120/C7180
632
Cataract surgery
C14
F0910
Cataract surgery
F73/F74
T2000
Wisdom teeth
extraction
HRG code
BUPA code
Primary inguinal
hernia
Table C11 Regression results: dependent variable “Change in health status”
1702
VF14
SF36 physical health summary score
Constant
Age
Age^2
Age^3
Gender
2000
2001
2002
2003
13.792
-0.033
-0.001
0.000
-3.666
-1.375
-2.481
0.478
-2.816
-1.280
0.134
0.000
0.000
1.149
0.606
0.841
-0.132
-1.461
-3.463
1.420
-0.031
0.000
1.728
3.580
0.813
-0.578
2.504
2617
16.006
-2.091
0.054
0.000
7.616
-0.274
-1.308
2.389
9.888
54.614
-1.778
0.034
0.000
1.534
-1.082
-0.422
0.553
0.328
-4.076
1.207
-0.026
0.000
-2.310
0.677
-1.349
-0.674
-7.845
-21.141
1.185
-0.019
0.000
0.175
-0.106
-0.475
-1.238
-2.595
-40.681
1.361
-0.010
0.000
11.680
-1.398
-2.191
-2.504
-5.879
36.542
-1.319
0.024
0.000
4.775
-1.769
-1.643
-1.147
1.419
-1.476
0.486
-0.008
0.000
-2.442
2.528
1.954
-2.379
-5.339
-25.284
1.279
-0.020
0.000
0.415
0.560
0.009
-2.194
-1.828
-164.629
8.308
-0.126
0.001
1.231
0.265
1.842
3.132
SF36 mental health summary score
Constant
Age
Age^2
Age^3
Gender
2000
2001
2002
2003
6.692
-0.120
0.000
0.000
-1.450
-0.131
-1.455
0.000
-0.223
-2.937
0.037
0.001
0.000
0.899
0.508
-0.530
0.648
-0.248
-11.132
1.347
-0.026
0.000
2.375
1.511
-0.373
-2.599
4.177
coefficients significant at 5% in bold
86
5.3.2 SF36 data used in the specimen index
Table C12 provides details of the before and after estimates of health status that were
used in constructing the specimen index. Attention is drawn to the following:
•
The BUPA data are reported at procedural level whereas the specimen index
aggregated activities by HRG. We mapped the BUPA procedural codes to
their appropriate HRG. There is a presumption, therefore, that experience of
patients undergoing the BUPA procedures are representative of all those
patients classified within the relevant HRG. This presumption is less likely to
hold the more heterogeneous the HRG, some of which amalgamate a large
number of different types of procedure.
•
Only the physical summary health score from the SF36 is used in the specimen
index. There is no obvious way to combine the physical and mental health
summary scores into a single index.
•
Mean before and after treatment health status is estimated from the full sample
for whom data were available not just those who completed both surveys.
Comparison of the mean values reported in Table C10 with those reported
below shows that the means are similar.
87
Table C12 Estimates of health status before and after treatment from BUPA and
York Trust used for specimen index
BUPA code
HRG
A5210 - Epidural injection
A07
C7120 - Cataract Surgery
B02
C7180 - Cataract Surgery
B03
HRG description
Intermediate Pain Procedures
Phakoemulsification Cataract Extraction
with Lens Implant
Other Cataract Extraction with Lens
Implant
Before
treatment
n
After
treatment
n
977
0.413
909
0.518
2724
0.726
2524
0.688
336
0.699
293
0.656
F0910 - Wisdom Teeth extraction
C14
Mouth or Throat Procedures - Category 2
1625
0.870
1599
0.862
E0360 - Septoplasty
C22
Nose Procedures - Category 3
758
0.830
749
0.826
F3440 - Adult tonsillectomy
C24
Mouth or Throat Procedures - Category 3
618
0.767
607
0.846
K4100 - CABG
E04
747
0.500
725
0.661
K4900 - PCTA
E15
85
0.539
75
0.722
T2000 - Primary inguinal hernia
F73
776
0.644
702
0.692
T2000 - Primary inguinal hernia
F74
Coronary Bypass
Percutaneous Transluminal Coronary
Angioplasty (PTCA)
Inguinal and Umbilical or Femoral Hernia
Repairs >69 or w cc
Inguinal and Umbilical or Femoral Hernia
Repairs <70 and w/o cc
Biliary Tract - Major Procedures > 69 or w
cc
Biliary Tract - Major Procedures <70 and
w/o cc
Primary Hip Replacement
Primary Knee Replacement
Soft Tissue Disorders > 69 or w cc
Soft Tissue Disorders <70 and w/o cc
Complex Breast Reconstruction using
Flaps
Upper Genital Tract major Procedures
Threatened or Spontaneous Abortion
2715
0.738
2612
0.811
144
0.632
135
0.659
1476
0.683
1408
0.808
2991
134
541
413
0.371
0.355
0.768
0.715
2741
134
523
383
0.567
0.526
0.768
0.673
J1830 - Laparoscopic
cholecystectomy
J1830 - Laparoscopic
cholecystectomy
W3710 - Primary hip replacement
Source: York Trust
M6530 - TURP
M6530 - TURP
G13
G14
H02
H04
H23
H24
B3120 - Augmentation mammoplasty J01
Q0740 - Hysterectomy procedures
Q0800 - Hysterectomy procedures
V3310 - Lumbar procedures: posterior
decompression on lumbar spine and
posterior excision of intervertebral
disc
V2530 - Lumbar procedures: posterior
decompression on lumbar spine and
posterior excision of intervertebral
disc
V3410 - Microdiscectomy and
Revision disc
M07
M09
287
0.931
292
0.870
1178
287
0.703
0.715
1143
273
0.723
0.756
R02
Surgery for Degenerative Spinal
Disorders
543
0.365
527
0.613
R03
Spinal Fusdion or Decompression
Excluding Trauma
584
0.365
574
0.559
R09
Revisional Spinal Procedures
40
0.321
38
0.547
Source: BUPA (with exception of HO2 from York Trust)
88
6. Lessons for future routine collection of health outcomes data
The lack of outcome measures is the most obvious barrier to measuring improved
performance of the NHS. Outcome measures are required for a number of different
purposes
• Measurement of NHS quality adjusted output and productivity growth.
• Performance management of provider trusts – how are provider trusts doing
relative to past performance and to other trusts?
• Performance management of consultant teams – how are they doing relative to
their past performance, or relative to other consultant teams doing the same
kind of work?
• Cost effective resource allocation – which treatments are a cost effective use of
resources given the way care is actually delivered in the NHS?
We conclude by offering some recommendations to guide future collection of health
outcomes data.
NHS patient sample. Since the scale of NHS activity is so broad and the potential
volume of patients is so large sampling seems a more sensible strategy than
attempting to measure health effects for all NHS patients in a sector. Sample size
requirements are likely to vary across different types of patients, and the variances in
EQ5D scores reported in Table C2 suggest that sample size would need to be large if
data are required at an HRG level because these groupings are not particularly
“outcome homogenous”.
There are two ways in which the exercise might be made more manageable. The first
would be to start with a small number of salient activities, both elective and
emergency. Even if it is not possible to collect health data before patients receive
emergency care, the post treatment health data will still yield useful information about
trends in the quality of emergency care and comparisons across teams if allowance is
made for case mix. For emergencies where it is not possible to collect no treatment or
pre treatment health outcome data, it may not be unreasonable to make simple
assumptions about the no treatment health outcome in order to estimate a health effect
89
using the actual post treatment outcomes.
The second would be to undertake a pilot exercise at a handful of Trusts. Even in the
longer term there may well be economies of scale from concentrating data collection
within a small number of institutions rather than spreading the data collection burden
more thinly across all Trusts. The drawback is that the selected Trusts may not be
fully representative of the NHS as a whole. CHKS Ltd have approached a number of
Trusts offering to facilitate the collection of health outcomes data, and such efforts
deserve to be encouraged.
Choice of instrument. It is important to collect patient reported outcome measures
(PROMs), using a generic instrument. Since the outcome is health gain for patients it
is their views on the effect of treatment which matter. A single generic, rather than
condition-specific instrument is required in order to facilitate aggregation across
different types of NHS activity in a productivity index. Generic measures also enable
comparisons to be made across activities, to compare the cost-effectiveness of
different activities and to measure quality adjusted output summed over different
activities for performance management of Trusts or for measuring NHS productivity.
Profiles, such as SF-36, provide multiple measures of outcome but are of more limited
use for non-clinical purposes since they typically lack the capacity to form a single
aggregate index. That said, the disaggregated nature of the SF36 instrument appears to
have clinical appeal. In applying the SF36 in this work we focussed on the physical
summary score and ignored the mental health summary score, as there is no obvious
way of combining the two into a single index.
For aggregation and comparative purposes, it is necessary to have a means of
combining the health profiles into an overall score. EQ5D has associated with it a set
of weights derived from samples of the general population (over 10 years ago) which
can be used to aggregate the dimensions to produce an overall score. Methods of
producing an overall score from the SF instruments are less well developed.
Clinicians in some areas argue that some aspects of quality are not measured
sensitively by generic instruments and that condition-specific instruments are
90
necessary. The VF14 used by BUPA and analysed above provides evidence of the
value of such measures. Others may argue that the necessarily relatively short term
follow up of patients cannot detect longer term health impacts and that these are best
measured by using technical process measures.
If necessary, generic instruments can be supplemented with condition-specific process
measures but these must be in addition to, not instead of, generic instruments. It is
essential that the same measure be applied across different activities to enable costeffectiveness comparisons and for aggregate output measurement. Using a conditionspecific instrument alongside a generic measure should not add greatly to the
administrative burden if the data collection system is in place for collecting data.
Comparison of the health outcomes assessed using condition-specific and generic
measures would also provide firmer evidence on whether the generic instrument is
sufficiently sensitive. If it is not then the generic measures should be amended with
additional dimensions for a finer categorisation of existing dimensions.
Timing. The timing of before and after health status measurement may depend on the
type of activity (emergency or elective). Measurements from patient self-report are
preferable but there are circumstances in which proxy or retrospective data provide
the only feasible route (for example, when emergency care is required by patients in
an unconscious state). The timing of the administration of the instrument may also
depend on diagnostic category or intervention type since different treatments may
have an effect over shorter or longer periods. For some activities or conditions it may
be necessary to make several post intervention measurements since the effect on
health may vary substantially with the time elapsed since treatment.
Grouping of NHS activities. Given the enormous range of NHS activities it is
necessary to group them for data analysis. Thus the main grouping of secondary care
activities is by HRG which attempts to group activities by their costs. But a given
HRG may contain a large number of procedures which have very different effects on
health. The availability of patient-level health outcomes data matched to other
datasets (such as HES) will make it possible to explore the extent to which health
outcomes are related to other routinely collected patient characteristics, such as age,
gender, diagnoses and procedures. This information may be used to create
91
homogenous health outcome groupings which, in turn, may allow refinement of how
activities are defined.
Frequency of data collection. If the effect of treatment on health depended only on
the state of medical knowledge and the pace of technological change in medicine was
slow enough it could be argued that collecting data on health effects was an exercise
that needed to be undertaken at intervals of several years. But technological change in
medicine and pharmaceuticals is rapid, the NHS is subject to frequent organisational
change which may affect the mix of patients receiving particular treatments and the
speed at which new technology spreads. We believe that only a continuous sampling
of the NHS patient population will be adequate to capture trends in the effects of the
NHS on its patients. Updates may be concentrated on areas subject to most
technological change.
Stated preference research might be supported to get valuations of the other
dimensions of outcome (waiting times, process experience) in the same units as health
effects measured using the generic instruments. The valuation of waiting times in
terms of willingness to trade off waiting time for health gains as measured in the
generic instrument need not be an annual exercise. The data on waiting times are
generated routinely – what is missing is the valuation and this is not likely to change
greatly from one year to the next. Some data on patient experience is collected
regularly via patient satisfaction questionnaires but there are problems in using the
data to make comparisons over time. Information on trade-offs between the nonhealth dimensions of patient experience and health gain and could also be collected
from regular, though not annual, stated preference research.
92
D. Comparing marginal and average cost
1. Introduction
As we argued in our First Interim Report (section 2.10.3) under certain assumptions,
marginal costs reflect marginal valuations. In such situations, a case can be made for
using marginal costs as a basis for determining the relative values of different
activities. Current NHS practice is to use unit (average) costs derived from the
Reference Costs. In an attempt to test whether the use of marginal costs would make a
difference to calculated productivity growth we estimate marginal costs and compare
them with reference costs.
2. Methods
We compare average cost estimates derived from accounting data and marginal cost
estimates derived from regression models.
Accounting Estimates. Collected annually from all NHS Trusts, the Reference Costs
comprise average costs for different types of activity, defined predominantly by
Healthcare Resource Group. These average costs are accounting costs arrived at by
applying national guidance about how to apportion shared input costs to each activity
type.
We calculated a national weighted average Reference Cost - C ijtRC - for each HRG i in
Trust j at time t for elective, non-elective and daycase activity respectively. The
weights represent the share of activity (FCEs) in each Trust j for each HRG i. Hence
for each time period t we calculated:
C ijRC = ∑ FCEij ( RC ij )
i
∑ FCE
ij
i
where RC ij are the Reference Costs reported by Trusts j in HRG i and
where FCEij are the number of FCEs for each Trust j in HRG i.
93
The annual figures for each time period t were summarised into a single estimate
C ijtRC covering the full period for which data were available, with the annual activity in
each year used to weight the annual mean costs.
In addition to the weighted average Reference Cost figures calculated for elective,
non-elective and daycase activity respectively, we also calculated an overall national
weighted average Reference Cost for all activity. For this, we weighted the mean
Reference Cost for each type of activity by the total activity for each of these 3 types
of activity respectively. These annual figures were then aggregated up in the same
way as the rest of the data, by weighting according to the annual activity in each year.
Regression Estimates. The alternative approach to calculation of marginal costs is to
derive them from a regression in which total costs are regressed on the amount of
activity undertaken in each output activity. The parameters from this regression can be
interpreted as estimates of marginal cost. Hence, we estimated a regression model of
the form:
TC it = α + β 1t x1it + β 2t x 2it + ... + β Jt x Jit + eit
where TC it is the total cost of all types of activity in Trust i at time t. This is
calculated by multiplying the number of FCEs by the reference cost for those FCEs in
each HRG and summing across all HRGs. β 1t can be interpreted as the marginal cost
of treating an additional patient in HRG 1 at time t. We ran regressions for each year
separately, but also created a panel dataset combining all six years’ worth of data,
running regressions for panels of 6, 5 and 4 years (dropping the first two years
respectively to test for robustness of data).
We tested functional form by running linear models, log-log, and log-linear models.
We tested the inclusion of a size variable namely, the average number of beds in the
Trust and various non-linear combinations of this variable. We also tested the
inclusion of time dummy variables. We ran a number of different estimation
procedures including pooled OLS, GLM, GEE, random and fixed effects models.
94
The objective is to compare C ijtRC with estimates of β it from the various specifications
of the regression model.
3. Data
We used six available years of Reference Cost data from 1998/99 to 2003/04. These
give, by NHS Trust, the number of FCEs, and the average cost, in each HRG for
elective, daycase and non-elective (emergency) activity respectively. We excluded
PCTs (some of which began to provide acute care from 2001/02) since their
production and cost structure may differ from those of NHS hospital Trusts.
In the regression model, we would wish to use the number of FCEs in each HRG as
an explanatory variable. This however gives rise to serious degrees of freedom
problems with over 500 HRGs and thus RHS regressors. To address this limitation,
we selected the 60 largest volume HRGs for inclusion in the model. We then created a
composite HRG category for all other activity. Volumes were determined by running
frequencies for all HRGs across all 6 years.
4. Discussion
This exercise produced some significant and plausible coefficient estimates, but also a
few coefficients with negative signs suggesting a negative marginal cost. Possible
reasons for this may be:
1.) issues of scale economies
2.) misspecification in the model
3.) misallocation of costs between elective and daycases and non-electives
5. Results
We present results respectively, for daycases, electives, non-electives and then all
activity together, showing the change in results from including and then excluding
95
data for 1998/99. We also first present results on the proportion of activity represented
by the top 60 volume HRGs in each category. There is notably relatively little overlap
between the top volume HRGs in each type of activity.
Table D1 Proportion of activity explained by top 60 HRGs in each type of
activity and number of overlapping HRGs in top 60, 1998-2003
Daycase
Elective
Non-elective
All
Daycase
Elective
26
Non-elective
5
7
All
28
25
35
Prop of activity
85
61
63
55
Table D2 Estimated marginal cost and average (Reference Cost), Daycases, 19982003
Coeff.
P>[t]
RC
B02
Phakoemulsification Cataract Extraction with Lens Implant
523
0.008
609
C22
Nose Procedures - Category 3
5605
0.004
556
E14
Cardiac Catheterisation without Complications
1144
0.001
627
F06
Oesophagus - Diagnostic Procedures
473
0.024
279
F44
General Abdominal - Endoscopic or Intermediate Procedures <70 w/o cc
829
0.000
468
F54
Inflammatory Bowel Disease - Endoscopic or Intermediate Procedures <70 w/o cc
2208
0.029
324
H17
Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc
4101
0.003
501
H20
Muscle, Tendon or Ligament Procedures - Category 1
5055
0.046
533
H26
Inflammatory Spine, Joint or Connective Tissue Disorders <70 w/o cc
2021
0.008
382
J36
Minor Skin Procedures - Category 1 w cc
3969
0.039
416
L20
Bladder Minor Endoscopic Procedure w cc
1393
0.014
317
L21
Bladder Minor Endoscopic Procedure w/o cc
550
0.012
307
L48
Renal Replacement Therapy w/o cc
230
0.000
158
M06
Upper Genital Tract Intermediate Procedures
1074
0.010
456
M12
Non-Surgical Treatment of Lower Genital Tract Disorders
493
0.000
260
Table D3 Estimated marginal cost and average (Reference Cost), Daycases, 19992003
Coeff.
P>[t]
RC
679
0.014
392
A07
Intermediate Pain Procedures
B02
Phakoemulsification Cataract Extraction with Lens Implant
519
0.031
611
C22
Nose Procedures - Category 3
5476
0.010
571
E14
Cardiac Catheterisation without Complications
1420
0.011
622
F44
General Abdominal - Endoscopic or Intermediate Procedures <70 w/o cc
774
0.002
476
F54
Inflammatory Bowel Disease - Endoscopic or Intermediate Procedures <70 w/o cc
2771
0.026
326
96
H17
Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc
3960
0.006
505
J36
Minor Skin Procedures - Category 1 w cc
5267
0.015
424
L21
Bladder Minor Endoscopic Procedure w/o cc
503
0.048
312
L48
Renal Replacement Therapy w/o cc
230
0.000
158
Table D4 Estimated marginal cost and average (Reference Cost), Electives, 19982003
Coeff.
P>[t]
RC
Mouth or Throat Procedures - Category 3
2311
0.025
734
E04
Coronary Bypass
14351
0.000
5707
E15
Percutaneous Transluminal Coronary Angioplasty (PTCA)
10440
0.000
2598
H02
Primary Hip Replacement
7146
0.007
4133
C24
N07
Normal Delivery w/o cc
Q14
Diagnostic Radiology - Arteries or Lymphatics w/o cc
1022
0.041
860
-11103
0.003
877
Table D5 Estimated marginal cost and average (Reference Cost), Electives, 19992003
Coeff.
P>[t]
RC
E04
Coronary Bypass
16058
0.000
5681
E15
Percutaneous Transluminal Coronary Angioplasty (PTCA)
10657
0.001
2543
H02
Primary Hip Replacement
7266
0.027
4213
N12
Other Maternity Events
525
0.018
457
Q14
Diagnostic Radiology - Arteries or Lymphatics w/o cc
-13631
0.011
898
Table D6 Estimated marginal cost and average (Reference Cost), Non-electives,
log-log model, 1998-2003
Coeff.
Transform
P>[t]
RC
A30
Epilepsy <70 w/o cc
-0.050
-12901
0.032
515
D34
Other Respiratory Diagnoses <70 w/o cc
-0.083
-23098
0.001
555
E18
Heart Failure or Shock >69 or w cc
0.087
8927
0.043
1360
E31
Syncope or Collapse >69 or w cc
-0.070
-10884
0.006
835
F07
Disorders of the Oesophagus >69 or w cc
-0.053
-15792
0.040
1215
P03
Upper Respiratory Tract Disorders
-0.042
-3893
0.044
438
Table D7 Estimated marginal cost and average (Reference Cost), Non-electives,
log-log model, 1999-2003
Coeff.
Transform
P>[t]
RC
N12
Other Maternity Events
0.105
1360
0.012
410
P03
Upper Respiratory Tract Disorders
-0.054
-4688
0.027
440
97
Table D8 Estimated marginal cost and average (Reference Cost), Non-electives,
log-linear model, 1998-2003
Coeff.
Transform
P>[t]
RC
A28
Headache or Migraine <70 w/o cc
-0.008
0.008
489
D21
Asthma >49 or w cc
0.003
0.042
1075
F36
Large Intestinal Disorders >69 or w cc
0.005
0.044
1362
F56
Inflammatory Bowel Disease <70 w/o cc
-0.005
0.042
733
J35
Minor Skin Procedures - Category 2 w/o cc
-0.006
0.026
844
L09
Kidney or Urinary Tract Infections >69 or w cc
-0.005
0.024
1403
M05
Upper Genital Tract Minor Procedures
0.003
0.033
560
Table D9 Estimated marginal cost and average (Reference Cost), Non-electives,
log-linear model, 1999-2003
No results – no significant variables
Table D10 Estimated marginal cost and average (Reference Cost), All, 1998-2003
Coeff.
P>[t]
RC
C14
Mouth or Throat Procedures - Category 2
-2556
0.041
506
E12
Acute Myocardial Infarction w/o cc
6427
0.014
920
E14
Cardiac Catheterisation without Complications
4941
0.002
890
E30
Arrhythmia or Conduction Disorders <70 w/o cc
-13773
0.014
506
H17
Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc
23926
0.000
919
H37
Closed Pelvis or Lower Limb Fractures <70 w/o cc
21000
0.045
1776
L20
Bladder Minor Endoscopic Procedure w cc
5946
0.008
682
N11
Caesarean Section w/o cc
7750
0.007
1857
S01
Haematological Disorders with Minor Procedure
2593
0.001
374
Table D11 Estimated marginal cost and average (Reference Cost), All, 1999-2003
B05
Other Ophthalmic Procedures - Category 2
Coeff.
P>[t]
RC
-4396
0.031
476
C14
Mouth or Throat Procedures - Category 2
-2239
0.032
514
H17
Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc
23612
0.000
929
H42
Sprains, Strains, or Minor Open Wounds <70 w/o cc
10837
0.010
480
L20
Bladder Minor Endoscopic Procedure w cc
5451
0.030
705
M06 Upper Genital Tract Intermediate Procedures
3777
0.032
531
S01
2450
0.002
374
Haematological Disorders with Minor Procedure
98
Table D12 Estimated marginal cost and average (Reference Cost), All, log-log
model, 1998-2003
Coeff.
Transform
P>[t]
RC
E31
Syncope or Collapse >69 or w cc
-0.013
-3228
0.037
830
J35
Minor Skin Procedures - Category 2 w/o cc
-0.014
-3093
0.006
687
-0.012
-1591
0.035
508
M05 Upper Genital Tract Minor Procedures
M09 Threatened or Spontaneous Abortion
-0.010
-1626
0.014
341
P15
-0.011
-2456
0.019
702
Accidental Injury
Table D13 Estimated marginal cost and average (Reference Cost), All, log-log
model, 1999-2003
Coeff.
C22
Nose Procedures - Category 3
0.008
Transform
1925
P>[t]
RC
0.041
821
C24
Mouth or Throat Procedures - Category 3
-0.022
-1900
0.017
708
E31
Syncope or Collapse >69 or w cc
-0.023
-5591
0.016
836
F35
Large Intestine - Endoscopic or Intermediate Procedures
-0.027
-1204
0.015
364
F47
General Abdominal Disorders <70 w/o cc
-0.023
-2157
0.026
593
J37
Minor Skin Procedures - Category 1 w/o cc
0.023
1169
0.003
474
As these results suggest, there are large discrepancies between marginal and average
cost estimates produced. There is also volatility in the coefficient estimates when
using different model specifications, e.g. log-log versus log-linear. Different sets of
HRGs would emerge as significant. As a result of these problems, we did not pursue
this approach as a means of obtaining measures of marginal cost.
99