Evaluation of a Consumer-Oriented Internet Health Care
Report Card: The Risk of Quality Ratings Based on
Mortality Data
Online article and related content
current as of October 20, 2009.
Harlan M. Krumholz; Saif S. Rathore; Jersey Chen; et al.
JAMA. 2002;287(10):1277-1287 (doi:10.1001/jama.287.10.1277)
http://jama.ama-assn.org/cgi/content/full/287/10/1277
Citations
This article has been cited 50 times.
Topic collections
Informatics/ Internet in Medicine; Internet; Quality of Care; Quality of Care, Other
Related Articles published in
the same issue
Public Profiling of Clinical Performance
C. David Naylor. JAMA. 2002;287(10):1323.
March 13, 2002
JAMA. 2002;287(10):1333.
Related Letters
Should Consumers Trust Hospital Quality Report Cards?
Emily V. A. Finlayson et al. JAMA. 2002;287(24):3206.
Downloaded from www.jama.com at Rutgers University Libraries on October 20, 2009
ORIGINAL CONTRIBUTION
Evaluation of a Consumer-Oriented
Internet Health Care Report Card
The Risk of Quality Ratings Based on Mortality Data
Harlan M. Krumholz, MD
Saif S. Rathore, MPH
Jersey Chen, MD, MPH
Yongfei Wang, MS
Martha J. Radford, MD

Context Health care “report cards” have attracted significant consumer interest, particularly publicly available Internet health care quality rating systems. However, the ability of these ratings to discriminate between hospitals is not known.

Objective To determine whether hospital ratings for acute myocardial infarction (AMI) mortality from a prominent Internet hospital rating system accurately discriminate between hospitals’ performance based on process of care and outcomes.

Design, Setting, and Patients Data from the Cooperative Cardiovascular Project, a retrospective systematic medical record review of 141914 Medicare fee-for-service beneficiaries 65 years or older hospitalized with AMI at 3363 US acute care hospitals during a 4- to 8-month period between January 1994 and February 1996 were compared with ratings obtained from HealthGrades.com (1-star: worse outcomes than predicted, 5-star: better outcomes than predicted) based on 1994-1997 Medicare data.

Main Outcome Measures Quality indicators of AMI care, including use of acute reperfusion therapy, aspirin, β-blockers, angiotensin-converting enzyme inhibitors; 30-day mortality.

Results Patients treated at higher-rated hospitals were significantly more likely to receive aspirin (admission: 75.4% 5-star vs 66.4% 1-star, P for trend=.001; discharge: 79.7% 5-star vs 68.0% 1-star, P=.001) and β-blockers (admission: 54.8% 5-star vs 35.7% 1-star, P=.001; discharge: 63.3% 5-star vs 52.1% 1-star, P=.001), but not angiotensin-converting enzyme inhibitors (59.6% 5-star vs 57.4% 1-star, P=.40). Acute reperfusion therapy rates were highest for patients treated at 2-star hospitals (60.6%) and lowest for 5-star hospitals (53.6%, P=.008). Risk-standardized 30-day mortality rates were lower for patients treated at higher-rated than lower-rated hospitals (21.9% 1-star vs 15.9% 5-star, P=.001). However, there was marked heterogeneity within rating groups and substantial overlap of individual hospitals across rating strata for mortality and process of care; only 3.1% of comparisons between 1-star and 5-star hospitals had statistically lower risk-standardized 30-day mortality rates in 5-star hospitals. Similar findings were observed in comparisons of 30-day mortality rates between individual hospitals in all other rating groups and when comparisons were restricted to hospitals with a minimum of 30 cases during the study period.

Conclusion Hospital ratings published by a prominent Internet health care quality rating system identified groups of hospitals that, in the aggregate, differed in their quality of care and outcomes. However, the ratings poorly discriminated between any 2 individual hospitals’ process of care or mortality rates during the study period. Limitations in discrimination may undermine the value of health care quality ratings for patients or payers and may lead to misperceptions of hospitals’ performance.

JAMA. 2002;287:1277-1287
www.jama.com
For editorial comment see p 1323.
Author Affiliations are listed at the end of this article.
Corresponding Author: Harlan M. Krumholz, MD, Yale University School of Medicine, 333 Cedar St, PO Box 208025, New Haven, CT 06520-8025.
©2002 American Medical Association. All rights reserved.
(Reprinted) JAMA, March 13, 2002, Vol 287, No. 10

Increasing interest in the quality of health care has led to the development of “report cards” to grade and compare the quality of care and outcomes of hospitals,1 physicians,2 and managed care plans.3 The organizations that produce these evaluations span the spectrum of popular periodicals, federal and state agencies, nonprofit accreditation organizations, consulting companies, and for-profit health care information companies.4 In addition, the Centers for Medicare and Medicaid Services (formerly called the Health Care Financing Administration) has recently expressed interest in developing a public performance report for hospitals.5

One of the most prominent organizations involved in providing health care quality ratings is HealthGrades.com, Inc. This company has developed “Hospital Report Cards” as part of an effort to provide comparative information about quality of health care providers via the Internet.6-8 The company’s Web site indicates that as “the healthcare quality experts,” it is “creating the standard of healthcare quality.”9 Using primarily publicly available Medicare administrative data to calculate risk-adjusted mortality rates for a variety of conditions, HealthGrades.com claims to provide “accurate and objective ratings” for hospitals to enable patients to make “well-informed decisions about where to
receive their care.” As a free service, public interest in the Web site is substantial, with over 1 million visitors in 2001
and discussion of the company’s rating
system in publications such as Yahoo! Internet Life10 and in print stories in USA
Today and the Los Angeles Times.11,12
HealthGrades.com is publicly traded on
NASDAQ and reported over $7 million
in revenue in 2000, with a 640% increase in ratings revenue over the fourth
quarter of 1999.13 With ratings soon appearing for nursing homes, hospices,
home health agencies, fertility clinics,
linkages to data concerning individual
health plans and providers, and a recently announced partnership with The
Leapfrog Group,14 this is one of the most
ambitious health ratings resources available online today.
While hospital ratings are widely
disseminated to the public, little
information is available about their
validity. The HealthGrades.com rating
system uses publicly available Medicare Part A billing data for many of its
ratings, but its statistical methods
have not been published in the peer-reviewed literature, nor has any published study, to our knowledge, evaluated its performance. By providing
ready access to ratings for all US hospitals via a free, public-access Web
site, this rating system offers consumers, who may be unfamiliar with the
limitations of rating systems, an
option that no other rating system
today provides—the opportunity to
directly compare 2 individual hospitals’ “performance” for a variety of
conditions. Use of such ratings may
have substantial benefit if it encourages hospitals to compete on quality,
but may have significant, unintended,
and potentially deleterious consequences if the ratings do not accurately discriminate between individual
hospitals’ performance. Accordingly,
we sought to determine if these ratings could discriminate between hospitals based on their quality of care
and outcomes.
For this evaluation we used data from
the Cooperative Cardiovascular Project
(CCP), a national initiative to improve
quality of care for Medicare beneficiaries hospitalized with acute myocardial
infarction (AMI). The CCP involved the
systematic abstraction of clinically relevant information from more than
200000 hospitalizations for AMI nationwide. As a highly prevalent condition
with significant morbidity and mortality and established quality of care and
outcomes measures, AMI is well suited
to an assessment of hospital performance. We compared hospital ratings
with process-based measures of the quality of AMI care and risk-standardized 30-day mortality based on medical record
review. Since the public is expected to
be particularly interested in comparisons between individual hospitals, we determined how often individual higher-rated hospitals performed better than
lower-rated hospitals in head-to-head
comparisons.
METHODS
The CCP
The CCP, a Centers for Medicare and
Medicaid Services project developed to
improve the quality of care provided to
Medicare beneficiaries hospitalized with
AMI,15 included a sample (n=234769)
of fee-for-service patients hospitalized
with a principal discharge diagnosis code
of AMI (International Classification of Diseases, 9th Revision, Clinical Modification
[ICD-9-CM] code 410, excluding 410.x2)
at 4834 hospitals between January 1994
and February 1996. Identified hospital
medical records were forwarded to 1 of
2 clinical data abstraction centers and abstracted for predefined variables including demographics, previous medical
history, clinical presentation, electrocardiographic reports, laboratory test results, in-hospital treatments, complications, and vital status. Data quality was
ensured through the use of trained abstractors, clinical abstraction software,
and random record reabstraction.
Study Sample
We excluded patients younger than 65
years (n = 17 593), those in whom a
clinical diagnosis of AMI was not confirmed (n=31186), and those who were
readmitted for AMI (n = 23 773). Patients who transferred into a hospital
(n=34409) were excluded, as we could
not ascertain their clinical characteristics at initial admission. We also excluded patients with a terminal illness
(documentation of anticipated life expectancy <6 months) or metastatic cancer (n=5496) since the focus of their
treatment may not have been targeted
toward improved survival. Patients admitted to the 1059 hospitals that averaged fewer than 10 patients annually
(n=4724) were also excluded to replicate the minimal volume requirements used in the development of the
Internet rating system. Patients admitted to the 66 hospitals for which American Hospital Association data were
unavailable (n=2363) or the 1170 hospitals for which hospital quality ratings were not available (n=17162), and
patients with unverified mortality from
linkage with the Medicare Enrollment
Database and the Social Security Administration’s Master Beneficiary Record or death outside of the study period (n=402) were excluded. In total,
92855 cases (1471 hospitals) met 1 or
more of the above exclusion criteria; the
remaining 141914 patients (3363 hospitals) comprised the study cohort.
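The cohort arithmetic above can be checked with a short sketch. The counts are those reported in the text; the variable names are illustrative only. Because a patient could meet more than one exclusion criterion, the per-criterion counts are expected to sum to more than the 92855 unique exclusions.

```python
# Counts reported in the Study Sample paragraph (variable names are illustrative).
ccp_patients, ccp_hospitals = 234_769, 4_834
excluded_patients, excluded_hospitals = 92_855, 1_471

# Per-criterion patient exclusions as reported; one patient may meet several.
per_criterion = {
    "younger than 65 years": 17_593,
    "AMI not clinically confirmed": 31_186,
    "readmitted for AMI": 23_773,
    "transferred in": 34_409,
    "terminal illness or metastatic cancer": 5_496,
    "hospital averaged fewer than 10 patients annually": 4_724,
    "no American Hospital Association data": 2_363,
    "no hospital quality rating": 17_162,
    "unverified mortality or death outside study period": 402,
}

cohort_patients = ccp_patients - excluded_patients
cohort_hospitals = ccp_hospitals - excluded_hospitals

# Criterion hits beyond the first for patients meeting several criteria.
overlap = sum(per_criterion.values()) - excluded_patients

print(cohort_patients, cohort_hospitals, overlap)  # 141914 3363 44253
```

The subtraction reproduces the reported cohort of 141914 patients at 3363 hospitals.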
Hospital Quality Ratings
We collected individual hospital ratings for AMI outcomes directly from the
HealthGrades.com Web site in summer 1999.9 Using publicly available
Medicare Part A billing data for the period of October 1994 to September 1997
inclusive, the company used a proprietary formula to predict mortality rates
during hospitalization and the 30 days
following discharge for each hospital incorporating demographic, clinical, and
procedural information.9 Each hospital’s predicted mortality rate was then
compared with its observed mortality
rate over the same time period. Hospitals were given a 3-star rating if their
“actual performance (observed mortality) was not significantly different from
what was predicted.”9 Hospitals with
statistically significant differences between their observed and expected mortality rates were divided into 2 groups:
those hospitals that exceeded predicted performance (ie, observed mortality lower than predicted) and those
with poorer performance (ie, higher observed mortality than expected). Among
those hospitals that exceeded performance, up to 10% (of the overall population) with the greatest difference between their observed and predicted
mortality rates were assigned a 5-star
rating to indicate “actual performance
was better than predicted and [that] the
difference was statistically significant”9; all remaining hospitals that exceeded predicted performance were assigned a 4-star rating. Similarly, among
those hospitals in which performance
was significantly worse than predicted, up to 10% (of the overall population) with the greatest difference between their observed and predicted
mortality rates were assigned a 1-star
rating to indicate “actual performance
was worse than predicted and the difference was statistically significant.”9
Due to a skewed left-shifted distribution in the hospital ratings, no hospitals received a 4-star rating in the period we surveyed. The 2-star and 4-star
ratings have since been eliminated and
only 1-star or 5-star ratings are now
used to identify hospitals whose performance significantly exceeds or fails
to meet predicted levels (3-star).
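The star-assignment rules described above can be sketched as follows. This is only an illustration of the published description, not the company's method: the mortality prediction formula and the significance test are proprietary, so the sketch takes each hospital's observed rate, predicted rate, and a precomputed significance flag as inputs. The function name, field names, and the reading of "up to 10% (of the overall population)" as a cap equal to 10% of all rated hospitals are assumptions.

```python
def assign_stars(hospitals, cap_fraction=0.10):
    """Assign 1- to 5-star ratings from the rules described in the text.

    hospitals: list of dicts with keys 'observed', 'predicted' (mortality
    rates) and 'significant' (bool; the actual test is proprietary and is
    treated here as a given input).
    """
    # Assumption: the 10% cap is taken over all rated hospitals.
    cap = int(cap_fraction * len(hospitals))
    stars = {}
    better, worse = [], []
    for i, h in enumerate(hospitals):
        if not h["significant"]:
            stars[i] = 3  # not significantly different from predicted
        elif h["observed"] < h["predicted"]:
            better.append(i)
        else:
            worse.append(i)

    def gap(i):
        return abs(hospitals[i]["observed"] - hospitals[i]["predicted"])

    # Largest observed-vs-predicted gaps get the extreme rating, up to the cap.
    for group, extreme, remainder in ((better, 5, 4), (worse, 1, 2)):
        for rank, i in enumerate(sorted(group, key=gap, reverse=True)):
            stars[i] = extreme if rank < cap else remainder
    return [stars[i] for i in range(len(hospitals))]


example = (
    [{"observed": 0.10, "predicted": 0.20, "significant": True},
     {"observed": 0.15, "predicted": 0.20, "significant": True},
     {"observed": 0.30, "predicted": 0.20, "significant": True},
     {"observed": 0.25, "predicted": 0.20, "significant": True}]
    + [{"observed": 0.20, "predicted": 0.20, "significant": False}] * 6
)
print(assign_stars(example))  # [5, 4, 1, 2, 3, 3, 3, 3, 3, 3]
```

With 10 hypothetical hospitals the cap is 1, so only the hospital with the largest favorable gap receives 5 stars and only the one with the largest unfavorable gap receives 1 star.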
Process of Care Measures
and Outcomes
Six process of care measures, drawn
from clinical guidelines for the management of AMI,16 were used in our
evaluation: (1) use of acute reperfusion therapy (thrombolytic agents or
primary angioplasty within 12 hours of
admission for patients with ST-segment elevation or left bundle branch
block), (2) aspirin within 48 hours of
admission, (3) β-blockers within 48
hours of admission, (4) aspirin at discharge, (5) β-blockers at discharge, and
(6) angiotensin-converting enzyme inhibitors at discharge. Criteria used to
identify patients who were considered
“ideal” candidates for each treatment
are listed in the BOX. Mortality at 30
days’ postinfarction was determined
from the Medicare Enrollment Database and the Social Security Administration’s Master Beneficiary Record.17
Statistical Analysis
Patient characteristics, performance on
process of care measures, in-hospital
outcomes, and 30-day mortality rates
were compared between hospitals with
different ratings using global and test
of trend χ² analyses for categorical variables and analysis of variance for continuous variables.
Hospital ratings were evaluated for
their association with each of the 6 process of care measures using a multivariable logistic regression analysis
among the cohort of patients classified as ideal candidates for each specific therapy. Analyses were adjusted for
patient demographic characteristics; illness severity as assessed by the Medicare Mortality Prediction System, a disease-specific model for predicting 30-day mortality in elderly patients with
AMI18; findings on admission; and comorbid conditions. Separate analyses
were conducted for each process of care
measure, comparing performance in all
hospitals relative to the performance of
5-star (top-rated) hospitals.
Hospitals’ expected mortality rates
were calculated using the mean Medicare Mortality Prediction System predicted probability of 30-day mortality
for all patients treated in that hospital.
Hospitals’ risk-standardized mortality
rates were calculated by dividing each
hospital’s observed 30-day mortality
rate by its predicted 30-day mortality
rate and multiplying this ratio by the
entire cohort’s 30-day mortality rate
(18.2%). The resulting 30-day mortality rate is standardized to the overall
CCP population and provides an estimate for each hospital, assuming that
hospital had the same patient composition as the entire sample. To determine the independent association of
hospital rating groups with patient survival at 30 days, a multivariable logistic regression analysis was conducted
adjusting for patient demographic characteristics, illness severity, admission
findings, and comorbid conditions.
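The risk-standardized mortality rate described above is the ratio of a hospital's observed to expected 30-day mortality, scaled by the overall cohort rate (18.2%). A minimal sketch follows; the function name and the example inputs are hypothetical, and the per-patient predicted probabilities stand in for output of the Medicare Mortality Prediction System, which is not reproduced here.

```python
COHORT_MORTALITY = 0.182  # overall 30-day mortality of the study cohort


def risk_standardized_rate(deaths, predicted_probs, cohort_rate=COHORT_MORTALITY):
    """Observed / expected 30-day mortality, scaled to the cohort rate.

    deaths: number of deaths within 30 days at the hospital.
    predicted_probs: one predicted 30-day mortality probability per patient
    (hypothetical inputs standing in for the prediction model's output).
    """
    n = len(predicted_probs)
    observed = deaths / n
    expected = sum(predicted_probs) / n  # mean predicted probability
    return observed / expected * cohort_rate


# Hypothetical hospital: 40 patients, 10 deaths, mean predicted risk 12.5%.
rate = risk_standardized_rate(10, [0.125] * 40)
print(round(rate, 3))  # 0.364
```

In this example the hospital's observed mortality is twice what its case mix predicts, so its standardized rate is twice the cohort rate.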
To examine variations in treatment
and outcomes within different hospital
rating groups, we plotted the distribution of risk-adjusted treatment rates and
risk-standardized mortality rates for each
hospital rating group using “box and
whisker” graphs. Hospital rating groups
with less variation will have both
“shorter” boxes and whiskers, while
groups with a broader distribution of
rates will have “longer” boxes and whiskers. Box and whisker plots of treatment rates were restricted to those hospitals with 20 or more patients classified
as ideal for each therapy.
To evaluate the discrimination provided by the ratings for individual hospitals, we compared risk-standardized
mortality rates between individual hospitals within each rating group. If the
ratings provided perfect or near perfect discrimination, then all, or nearly
all, hospitals in higher rating groups
would have lower mortality rates than
hospitals in lower rating groups. Thus,
all hospitals with 1-star ratings were
compared with all hospitals with 5-star
ratings to determine the proportion of
comparisons in which a 5-star hospital had a significantly lower risk-standardized mortality rate than the
1-star hospital to which it was compared. Similar comparisons were made
between 2-star and 5-star hospitals,
3-star and 5-star hospitals, 1-star and
2-star hospitals, 1-star and 3-star hospitals, and 2-star and 3-star hospitals.
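The all-pairs comparison described above can be sketched as follows. The text does not state which test was used to decide whether 2 hospitals' rates differed, so the pooled two-proportion z test at the .05 level used here is an assumption made only for illustration, as are the function names and the example data.

```python
import math
from itertools import product


def compare(rate_a, n_a, rate_b, n_b, z_crit=1.96):
    """Return -1 if A's rate is significantly lower than B's, 1 if higher, else 0.

    Assumption: a pooled two-proportion z test; the article's actual test
    for comparing two risk-standardized rates is not described here.
    """
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0
    z = (rate_a - rate_b) / se
    if z <= -z_crit:
        return -1
    if z >= z_crit:
        return 1
    return 0


def pairwise_summary(high_rated, low_rated):
    """Compare every higher-rated hospital with every lower-rated hospital.

    Each hospital is a (mortality_rate, number_of_cases) tuple.
    """
    counts = {-1: 0, 0: 0, 1: 0}
    for (ra, na), (rb, nb) in product(high_rated, low_rated):
        counts[compare(ra, na, rb, nb)] += 1
    total = len(high_rated) * len(low_rated)
    return {
        "high_rated_lower": counts[-1] / total,
        "no_difference": counts[0] / total,
        "high_rated_higher": counts[1] / total,
    }


# Hypothetical data: two 5-star and two 1-star hospitals.
five_star = [(0.10, 400), (0.18, 50)]
one_star = [(0.25, 400), (0.15, 30)]
print(pairwise_summary(five_star, one_star))
```

Under perfect discrimination the "high_rated_lower" share would approach 1; small hospitals with few cases push most pairs into "no_difference".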
Secondary analyses were conducted
incorporating hospital characteristics,
physician specialty, geographic location, and AMI therapies to determine
if these characteristics may have accounted for variations in treatment and
outcomes between the rating groups.
In addition, comparisons of risk-standardized mortality rates between
hospitals in different rating groups were
also repeated, restricting analysis to the
1738 hospitals with 30 or more cases
and evaluating mortality at 60 days’
postadmission. Because the time periods of the CCP cohort and the Internet-based ratings did not exactly overlap, we repeated our analyses by restricting our evaluation of the rating system to those patients admitted after October 1, 1994. We similarly repeated our analyses, including cases that had been excluded as readmissions. Huber-White variance estimates19 were used in all models to provide robust estimates of variance and to adjust for clustering of patients by individual hospitals. All models demonstrated appropriate discrimination and calibration. Odds ratios were converted to relative risk ratios using the conversion formula specified by Zhang and Yu.20 Statistical analyses were conducted using SAS 6.12 (SAS Institute Inc, Cary, NC) and STATA 6.0 (STATA Corp, College Station, Tex).

Box. Treatment Exclusion Criteria to Classify Patients as Ideal Candidates

Acute Reperfusion Therapy
Absence of ST-segment elevation or left bundle branch block on admission electrocardiogram
Transferred into the hospital
Chest pain of more than 12 hours in duration
Bleeding before or at time of admission
Increased risk of bleeding or hemorrhage
Stroke on admission or history of cerebrovascular disease
Warfarin use before admission
Malignant hypertension
Age older than 80 years
Patient or physician refused thrombolytics

Aspirin Within 48 Hours of Admission
Bleeding before or at time of admission
Increased risk of bleeding or hemorrhage
History of allergy to aspirin
Transferred into the hospital

β-Blockers Within 48 Hours of Admission
Heart failure at time of admission or history of heart failure
Shock or hypotension at time of admission
Second- or third-degree heart block
History of asthma or chronic obstructive pulmonary disease
Bradycardia at time of admission (unless taking a β-blocker)
History of allergy to β-blockers
Transferred into the hospital

Aspirin at Discharge
Died during hospitalization
Bleeding during hospitalization
Increased risk of bleeding or hemorrhage
History of allergy to aspirin or reaction to aspirin during hospitalization
History of peptic ulcer disease
Warfarin prescribed at discharge
Transferred out of the hospital

β-Blockers at Discharge
Died during hospitalization
Heart failure at time of admission, during hospitalization, or left ventricular ejection fraction (LVEF) less than 35%
Shock or hypotension during hospitalization
Second- or third-degree heart block
History of asthma or chronic obstructive pulmonary disease
Peripheral vascular disease
Bradycardia during hospitalization (unless taking a β-blocker)
History of allergy to β-blockers or reaction to β-blockers during hospitalization
Transferred out of the hospital

Angiotensin-Converting Enzyme (ACE) Inhibitors at Discharge
Died during hospitalization
LVEF 40% or greater or LVEF unknown
Aortic stenosis
Creatinine level greater than 3 mg/dL at time of admission or during hospitalization
Hypotension (unless taking an ACE inhibitor)
History of allergy to ACE inhibitors or reaction to ACE inhibitors during hospitalization
Transferred out of the hospital

RESULTS
Hospital and Patient Characteristics
Of the 3363 hospitals studied, 10.6%
were classified as 5-star hospitals, 74.0%
as 3-star hospitals, 7.8% as 2-star hospitals, and 7.6% as 1-star hospitals. Hospitals with higher ratings had a higher
AMI volume, were more likely to be
teaching hospitals, not-for-profit in
ownership, and have invasive cardiac
care facilities (TABLE 1).
Patients were elderly, predominantly male, and white, and a significant number had comorbid conditions. Patients were mostly treated at
3-star hospitals (n=98725, 69.6%), a
smaller group at 5-star hospitals
(n=23944, 16.9%), and even fewer at
1-star (n = 5089, 3.6%) or 2-star
(n = 14 156, 10.0%) hospitals. Differences in patient characteristics across
hospital rating groups were small, although many of these small differences were statistically significant because of the large sample (TABLE 2).
Process of Care Measures
A graded association was observed between hospital rating and use of aspirin and β-blockers, both on admission
and at discharge. There was no apparent trend for greater use of angiotensin-converting enzyme inhibitors or acute
reperfusion therapy in higher-rated hospitals (TABLE 3). Multivariable analysis of AMI treatment indicated lower
rates of aspirin (admission and discharge) and β-blocker use on admission in 1-, 2-, and 3-star hospitals, while
only 1- and 2-star hospitals were less
likely to provide β-blockers on discharge (TABLE 4). Patients at 2- and
3-star hospitals were more likely to receive acute reperfusion therapy than patients at 5-star hospitals. In addition,
there was significant heterogeneity in
the use of treatments among each hospital rating group (FIGURE 1). Findings were similar in secondary analyses except for 3-star hospitals, which
were comparable to 5-star hospitals for
use of all therapies and 2-star hospitals’ use of β-blockers on admission.
In-Hospital Outcomes
and Mortality
Patients at 5-star hospitals had lower in-hospital mortality rates and higher total
charges than patients at lower-rated hospitals; no clear trend was observed for
length of stay (Table 3). Crude 30-day
mortality rates were highest for patients treated at 1-star hospitals (23.0%),
lower in 2- and 3-star hospitals, and lowest among patients treated at 5-star hospitals (15.4%). Risk-standardized mortality rates were nearly identical for
patients in 1-star and 2-star hospitals, but
higher than those for patients in 3-star
and 5-star hospitals, with a 6.0% absolute difference in 30-day mortality between 1-star and 5-star hospitals. Multivariable analysis also indicated a higher
30-day mortality risk among patients
treated at 1-star and 2-star hospitals and
a slightly lower, but still increased, mortality risk for patients treated at 3-star
hospitals compared with 5-star hospitals (TABLE 5).
While lower-rated (1-star and 2-star)
hospitals had a higher average mortality risk compared with that of 5-star
hospitals, there was marked intragroup variation in individual hospitals’ 30-day mortality rates. Discrimination in individual hospitals’ risk-standardized mortality rates between
rating groups was poor, as indicated by
the box and whisker plots (FIGURE 2).
Pairwise comparisons of hospitals with
1-star ratings and those with 5-star ratings found that in 92.3% of comparisons, 1-star hospitals had a risk-standardized mortality rate that was not
statistically different from that of a 5-star
hospital and a lower risk-standardized mortality rate in 4.6% of comparisons. Similarly, 95.9% of 2-star hospital comparisons and 94.6% of 3-star
hospital comparisons had risk-standardized mortality rates that were
not statistically different or lower than
those of the 5-star hospitals to which
they were compared. The proportion of
comparisons in which mortality rates
were statistically comparable between
hospitals in different rating groups was
similarly high in the comparison of
1-star and 2-star hospitals, 1-star and
3-star hospitals, and 2-star and 3-star
hospitals (TABLE 6).
Secondary Analyses
Our findings were similar in secondary analyses evaluating hospitals with
30 or more cases, assessing mortality
at 60 days’ postadmission, restricting
Table 1. Hospital Characteristics*
Hospital Ratings
Characteristic
No. (%) of hospitals
Myocardial infarction volume, mean (SD), No.
Rural location, %
Ownership, %
Public
Not-for-profit
For-profit
Teaching status, %
COTH member
Residency affiliated
Nonteaching
Cardiac care facilities, %
Cardiac surgery suite
Cardiac catheterization laboratory
No invasive facilities
Census region, %
New England
Mid-Atlantic
South Atlantic
East North Central
East South Central
West North Central
West South Central
Mountain
Pacific
5 Stars
355 (10.6)
185 (137)
9.1
Global P Value
P Value
for Trend
.001
.001
.001
.001
All Hospitals
3363 (100.0)
117 (99)
20.4
1 Star
257 (7.6)
58 (94)
47.3
2 Stars
262 (7.8)
110 (74)
20.2
3 Stars
2489 (74.0)
104 (84)
21.8
12.1
77.6
10.3
26.6
63.7
9.7
12.6
78.0
9.4
12.7
76.8
10.6
6.3
83.7
9.9
.001
12.0
22.7
65.3
6.4
9.8
83.8
5.2
22.2
72.7
10.2
22.6
67.2
24.7
26.0
49.4
.001
38.7
23.4
37.9
15.9
12.3
71.9
30.6
29.9
39.6
34.6
24.4
41.0
65.3
17.9
16.9
.001
6.3
18.6
22.1
0.2
7.3
15.9
1.3
24.2
20.9
6.3
18.9
21.2
10.5
16.4
28.0
15.9
6.3
6.7
9.6
3.7
10.9
26.7
11.9
10.2
20.0
4.4
3.5
17.3
8.5
6.1
13.1
1.8
6.9
15.6
5.6
6.7
10.2
3.9
11.7
13.9
6.9
6.3
2.9
3.8
11.4
.001
*Hospital ratings in Tables 1 through 6 are from the HealthGrades.com rating system. COTH indicates Association of American Medical Colleges Council of Teaching Hospitals.
©2002 American Medical Association. All rights reserved.
(Reprinted) JAMA, March 13, 2002—Vol 287, No. 10
Downloaded from www.jama.com at Rutgers University Libraries on October 20, 2009
1281
INTERNET HEALTH CARE REPORT CARD
Table 2. Patient Characteristics*
Hospital Ratings
Characteristic
All Hospitals
5 Stars
23 944 (16.9)
76.3 (7.3)
47.6
.01
.001
.008
.001
141 914 (100.0)
Age, mean (SD), y
Female, %
76.4 (7.4)
49.0
76.6 (7.5)
50.8
76.4 (7.4)
49.9
76.4 (7.4)
49.2
90.8
6.1
3.1
90.1
6.7
3.3
90.0
6.5
3.5
90.7
6.3
3.0
91.8
5.2
3.0
.001
62.0
30.6
29.3
14.0
21.1
60.6
31.0
27.3
15.9
21.8
62.4
31.5
28.4
14.3
21.5
61.8
30.7
29.0
14.0
21.1
63.1
29.8
31.3
13.6
20.8
.001
.006
.001
.003
.27
.003
.001
.001
.001
.05
10.4
6.7
12.6
14.9
8.5
5.1
11.0
16.0
9.8
6.2
11.8
15.3
10.3
6.4
12.3
14.9
11.5
8.6
14.6
14.4
.001
.001
.001
.006
.001
.001
.001
.001
9.6 (4.8)
13.2
25.4
12.2
8.9
23.6
40.3
46.3
59.7
29.3
10.0 (4.9)
13.4
27.2
12.1
11.1
22.5
39.9
46.5
60.1
28.0
9.8 (5.0)
12.9
25.9
12.9
9.2
23.7
41.0
46.4
59.0
29.9
9.6 (4.8)
13.2
25.4
12.1
9.0
23.8
40.0
46.2
60.0
29.5
9.5 (4.7)
13.3
25.1
12.3
7.6
22.8
41.4
46.9
58.6
28.2
.001
.67
.009
.06
.001
.003
.001
.25
.001
.001
.001
.62
.002
.47
.001
.48
.09
.34
.09
.03
50.8
12.1
34.7
2.5
52.5
12.5
33.1
2.0
50.7
11.7
35.1
2.5
50.9
12.0
34.7
2.4
50.1
12.8
34.4
2.7
.001
10.7
29.6
21.7
3.2
34.9
7.5
20.4
15.9
2.3
53.9
10.2
29.2
21.6
2.9
36.0
10.7
29.4
21.1
3.1
35.7
11.5
32.3
25.2
3.9
27.0
.001
5.9
4.4
4.4
0.4
20.2
1.1
6.0
4.8
4.1
0.2
22.1
0.9
5.9
5.1
4.4
0.3
20.3
1.1
6.0
4.4
4.4
0.4
20.2
1.1
5.7
4.2
4.8
0.3
19.8
1.0
.29
.001
.02
.12
.005
.31
78.7
15.5
76.2
16.7
77.8
15.7
78.6
15.6
80.3
14.6
Unable to move
Unknown
Urinary incontinence, %
3.0
2.8
7.1
3.8
3.3
8.7
3.6
3.0
7.9
3.0
2.7
7.2
2.5
2.7
6.2
Admitted from a nursing home, %
5.2
6.5
5.5
5.3
4.4
Peripheral vascular disease
PTCA
CABG
Current smoker
Clinical presentation
APACHE II score, mean (SD)
Mean arterial pressure ⬍80 mm Hg, %
Heart rate ⬎100/min, %
Renal insufficiency, %
DNR order on admission, %
CHF on radiograph, %
Subendocardial infarction, %
Anterior infarction, %
Q-wave infarction, %
ST-segment elevation infarction, %
Killip class, %
I
II
III
IV
Left ventricular ejection fraction, %
Normal (⬎55%)
Mild (⬎40%-55%)
Moderate (⬎20%-40%)
Severe (⬍20%)
Missing
Comorbid conditions, %
Dementia
Microalbuminuria
Anemia
Liver disease
Chronic obstructive pulmonary disease
HIV or other immunocompromised status
Mobility, %
Independent
Assisted
3 Stars
98 725 (69.6)
P Value
for Trend
No. (%) of patients
Race, %
White
Black
Other
Clinical history, %
Hypertension
Diabetes
Myocardial infarction
Stroke/cerebrovascular disease
Congestive heart failure
2 Stars
14 156 (10.0)
Global
P Value
1 Star
5089 (3.6)
.20
.001
.02
.13
.003
.74
.001
.001
.001
.001
.001
(continued)
1282 JAMA, March 13, 2002—Vol 287, No. 10 (Reprinted)
©2002 American Medical Association. All rights reserved.
Downloaded from www.jama.com at Rutgers University Libraries on October 20, 2009
INTERNET HEALTH CARE REPORT CARD
Table 2. Patient Characteristics (cont)
Hospital Ratings
Characteristic
All Hospitals
1 Star
2 Stars
3 Stars
Attending physician specialty, %
Cardiology
Internal medicine
32.3
36.3
17.7
36.6
31.1
36.7
31.5
36.8
39.6
33.9
Family or general practice
Other
17.3
14.1
35.3
10.4
17.8
14.4
17.8
13.9
10.9
15.9
5 Stars
Global
P Value
P Value
for Trend
.001
*PTCA indicates percutaneous transluminal coronary angioplasty; CABG, coronary artery bypass graft; APACHE, Acute Physiology and Chronic Health Evaluation; DNR, do not
resuscitate; CHF, congestive heart failure; and HIV, human immunodeficiency virus.
Table 3. Process of Care Measures and In-Hospital Outcomes According to Hospital Rating*
Hospital Ratings
Characteristic
Patients classified as ideal
for therapy, No. (%)
Aspirin on admission
␤-Blockers on admission
Acute reperfusion therapy
Aspirin at discharge
␤-Blockers at discharge
ACE inhibitors at discharge
Ideal patients receiving therapy, No. (%)
Aspirin on admission
␤-Blockers on admission
Acute reperfusion therapy
Overall
Primary PTCA
Thrombolytic therapy
Aspirin at discharge
␤-Blockers at discharge
ACE inhibitors at discharge
In-hospital outcomes
Mortality, %
Length of stay, mean (SD), d
Total charges, mean (SD), $
All Hospitals
Global
P Value
P Value for Trend
1 Star
2 Stars
3 Stars
5 Stars
117 332 (82.7)
58 261 (41.0)
10 605 (7.5)
49 503 (34.9)
27 164 (19.1)
17 281 (12.2)
83.4
40.9
7.0
28.4
14.2
7.7
82.8
40.6
7.8
32.0
17.0
11.0
83.2
41.3
7.6
34.7
18.9
11.8
80.3
40.5
7.0
38.9
22.7
15.5
.001
.10
.009
.001
.001
.001
.001
.64
.12
.001
.001
.001
84 694 (72.2)
29 347 (50.4)
66.4
35.7
69.4
46.6
72.1
50.6
75.4
54.8
.001
.001
.001
.001
6201 (58.5)
646 (6.1)
5711 (53.8)
37 472 (75.7)
16 710 (61.5)
10 480 (60.6)
55.5
1.1
54.9
68.0
52.1
57.4
60.6
4.3
56.9
71.4
58.1
57.6
59.4
5.5
55.3
75.5
61.8
61.5
53.6
10.8
45.3
79.7
63.3
59.6
.001
.001
.001
.001
.001
.004
.008
.001
.001
.001
.001
.40
⬍.001
⬍.001
⬍.001
⬍.001
.005
⬍.001
14.2
16.1
16.7
14.0
12.8
9.2 (9.3)
8.4 (8.2)
9.6 (9.4)
9.1 (9.3)
9.3 (9.2)
12 863 (14 676) 10 486 (12 471) 12 927 (14 277) 12 747 (14 732) 13 736 (15 054)
*Data are presented as percentages unless otherwise indicated. ACE indicates angiotensin-converting enzyme; PTCA, percutaneous transluminal coronary angioplasty.
the cohort to patients admitted after October 1, 1994, and repeating analyses
including readmissions.
COMMENT
In our evaluation of a popular Web-based hospital report card for AMI, we found a gradient in the care
and outcomes of patients in hospitals in different rating categories. In general, patients who received
care in higher-rated hospitals were, on average, more likely to receive aspirin and β-blockers and had
lower risk-standardized mortality rates than patients treated in lower-rated hospitals. This finding
would seem to validate the use of ratings derived from a proprietary model using administrative data.
However, we also found substantial heterogeneity in performance within rating categories. In addition,
when hospitals assigned to any 2 different rating groups were considered individually instead of in
aggregated categories, risk-standardized mortality rates were either comparable or even better in the
lower-rated hospital in more than 90% of the comparisons. These findings suggest that these ratings do
convey some important information in aggregate, but provide little meaningful discrimination between
individual hospitals’ performance in a manner sufficient for a public interested in making informed
hospital choices.
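The pairwise finding above can be illustrated with a minimal sketch. It assumes a pooled two-proportion z-test on raw death counts at a 5% significance level, which is a simplification chosen for illustration and not necessarily the test used in this study (the study compared risk-standardized rates). The hospital counts and the names `compare_hospitals` and `tally` are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def compare_hospitals(deaths_a, n_a, deaths_b, n_b, z_crit=1.96):
    """Classify hospital A's mortality relative to hospital B's.

    Uses a pooled two-proportion z-test (an assumption for illustration).
    Returns "better" if A's mortality is significantly lower,
    "worse" if significantly higher, otherwise "comparable".
    """
    p_a, p_b = deaths_a / n_a, deaths_b / n_b
    pooled = (deaths_a + deaths_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return "comparable"
    z = (p_a - p_b) / se
    if z <= -z_crit:
        return "better"
    if z >= z_crit:
        return "worse"
    return "comparable"


def tally(top_group, side_group):
    """Percentage of all cross-group hospital pairs falling in each category."""
    counts = Counter(
        compare_hospitals(da, na, db, nb)
        for (da, na), (db, nb) in product(top_group, side_group)
    )
    total = sum(counts.values())
    return {k: 100 * counts[k] / total for k in ("better", "comparable", "worse")}


# Hypothetical (deaths, cases) per hospital; these are not data from the study.
one_star = [(12, 40), (9, 35), (30, 110)]
five_star = [(5, 38), (20, 150), (7, 45)]
print(tally(one_star, five_star))
```

With volumes of a few dozen cases, even a 10-percentage-point gap (for example, 8 of 30 vs 5 of 30 deaths) is classified as comparable under this test, which is the pattern the paragraph describes.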
This rating system’s performance at the group and individual hospital level highlights a discrepancy
common to hospital rating and evaluation systems. While such ratings may differentiate between groups
of hospitals in the aggregate when sample sizes are large enough to produce stable estimates, they do
not differentiate well between quality and outcome differences between individual hospitals where
sample sizes are much smaller. Although evaluating more cases at each hospital would increase the
precision of estimates associated with any individual hospital’s performance and the likelihood of
detecting differences when comparing 2 hospitals, the patient volume at many hospitals is insufficient
to produce precise estimates. Even when analyses were restricted to hospitals with an annual volume of
30 or more cases (a large number given the volumes of smaller centers), the proportion of comparisons
in which hospitals in 2 different ratings groups were statistically comparable was relatively
unchanged. Alternatively, multilevel regression analyses may facilitate comparisons incorporating
centers with small volumes. In the absence of this approach, invalid classifications resulting in
mislabeling may have significant unintended consequences by providing consumers with an inaccurate
perception of an individual hospital’s performance. For example, the publication by the then Health
Care Financing Administration of statistical outliers for mortality quickly became known as the
government’s hospital “death list.”21,22

Table 4. Association Between Hospital Rating and Process of Care Measures*

Hospital Ratings, Risk Ratio (95% CI)
Process of Care Measures                               1 Star              2 Stars             3 Stars
Acute reperfusion therapy
  Unadjusted                                           1.04 (0.92-1.14)    1.13 (1.05-1.21)    1.11 (1.05-1.16)
  Adjusted for patient characteristics                 1.02 (0.89-1.14)    1.13 (1.04-1.22)    1.11 (1.04-1.17)
  Adjusted for patient and hospital characteristics    0.99 (0.85-1.12)    1.09 (0.99-1.18)    1.07 (1.00-1.14)
Aspirin on admission
  Unadjusted                                           0.88 (0.84-0.91)    0.92 (0.89-0.94)    0.96 (0.93-0.98)
  Adjusted for patient characteristics                 0.88 (0.84-0.92)    0.92 (0.89-0.95)    0.96 (0.93-0.98)
  Adjusted for patient and hospital characteristics    0.92 (0.88-0.96)    0.95 (0.92-0.98)    0.97 (0.95-0.99)
β-Blockers on admission
  Unadjusted                                           0.65 (0.58-0.73)    0.85 (0.80-0.91)    0.92 (0.88-0.96)
  Adjusted for patient characteristics                 0.66 (0.59-0.74)    0.86 (0.80-0.91)    0.93 (0.89-0.97)
  Adjusted for patient and hospital characteristics    0.83 (0.76-0.90)    0.95 (0.90-1.01)    1.00 (0.96-1.03)
Aspirin at discharge
  Unadjusted                                           0.85 (0.80-0.90)    0.90 (0.86-0.93)    0.95 (0.92-0.97)
  Adjusted for patient characteristics                 0.80 (0.76-0.84)    0.88 (0.85-0.91)    0.93 (0.92-0.95)
  Adjusted for patient and hospital characteristics    0.92 (0.88-0.96)    0.95 (0.92-0.98)    0.98 (0.97-1.00)
β-Blockers at discharge
  Unadjusted                                           0.82 (0.74-0.91)    0.92 (0.86-0.98)    0.98 (0.94-1.01)
  Adjusted for patient characteristics                 0.86 (0.78-0.94)    0.93 (0.88-0.99)    0.99 (0.95-1.03)
  Adjusted for patient and hospital characteristics    0.90 (0.83-0.98)    0.97 (0.91-1.02)    1.00 (0.96-1.04)
Angiotensin-converting enzyme inhibitors at discharge
  Unadjusted                                           0.96 (0.85-1.08)    0.97 (0.90-1.03)    1.03 (0.99-1.07)
  Adjusted for patient characteristics                 0.95 (0.84-1.06)    0.96 (0.90-1.02)    1.03 (0.98-1.07)
  Adjusted for patient and hospital characteristics    0.95 (0.83-1.06)    0.97 (0.91-1.04)    1.03 (0.98-1.07)
*Multivariable logistic regression models for therapy use among ideal patients adjusted for clustering of patients by hospital. Patient characteristics adjusted for demographics,
clinical history, admission characteristics, and comorbid conditions. Hospital characteristics adjusted for ownership, teaching status, cardiac care facilities, acute myocardial
infarction volume, physician specialty, and location. The 5-star group is the comparison category. CI indicates confidence interval.

Figure 1. Risk-Adjusted Rates of Therapy Use Among the Rating Groups
[Box plots of risk-adjusted treatment rate (%) by hospital rating group (1, 2, 3, and 5 stars). Panels: therapy on admission (aspirin, β-blocker, reperfusion) and therapy on discharge (aspirin, β-blocker, ACE inhibitor).]
The outer lines of each “box” correspond to the 25th and 75th percentiles, and the middle line corresponds to the 50th percentile in the distribution of treatment rates.
The upper horizontal line or “whisker” represents upper adjacent values or treatment rates above the 75th percentile that fall within the range of rates defined by the
75th percentile plus 1.5 times the interquartile range (25th-75th percentile). The lower horizontal line or “whisker” represents lower “adjacent” values or treatment
rates below the 25th percentile that fall within the range of rates defined by the 25th percentile minus 1.5 times the interquartile range (25th-75th percentile). ACE
indicates angiotensin-converting enzyme.

Misclassification of hospitals also may be due to the performance of the predictive model. Due to the
proprietary nature of the HealthGrades.com model, we were unable to evaluate it directly. Nevertheless,
even without information about this model, it is likely that these ratings are limited by their
reliance on administrative data. Administrative data are subject to significant limitations, including
insufficient clinical information, the inevitable inclusion of substantial numbers of patients who did
not experience an AMI because of administrative diagnosis imprecision,23 and confusion concerning
complications and preexisting conditions.24-27 Inconsistencies in coding (“overcoding” in “low”
mortality hospitals and “undercoding” in “high” mortality hospitals) are also problematic and often
explain some of the difference between hospitals’ ratings.28 Risk models based on administrative data
can lead to substantial misclassification of hospitals when compared with models based on
higher-quality data.29 Because of issues of patient selection, either as a result of location,
ownership, membership in health plans, or teaching status, hospitals may differ in the kinds of
patients treated in a way that is not accounted for in risk-adjustment models. Administrative data are
far easier and less expensive to obtain than more clinically detailed information that can be derived
from medical records, but they may have limited utility in publicly reported ratings. Concerns about
data quality, adequacy of methods, issues of selection bias in patient populations, inadequate risk
adjustment, and reliable identification of outlier hospitals were some of the reasons why the then
Health Care Financing Administration abandoned its decision to publicly release hospital mortality
statistics after 1993.6 The repackaging of Medicare hospital mortality data in this rating system does
not address the fundamental limitations of administrative data. This is particularly problematic given
that such rating data are provided, with minimal explanation of design concerns, to health care
consumers unfamiliar with basic statistical concepts or the limitations of administrative data and
administrative data-based rating systems.

Table 5. Association Between Hospital Rating and 30-Day Mortality*

Hospital Ratings
                                        1 Star            2 Stars           3 Stars           5 Stars
30-Day mortality rates
  Observed†                             23.0              21.3              18.2              15.4
  Predicted†                            19.2              18.6              18.2              17.6
  Risk-standardized†                    21.9              20.8              18.1              15.9
Logistic regression model, RR (95% CI)
  Unadjusted                            1.50 (1.40-1.60)  1.38 (1.32-1.45)  1.18 (1.14-1.22)  Referent
  Adjusted for patient characteristics  1.50 (1.40-1.61)  1.40 (1.33-1.48)  1.17 (1.13-1.22)  Referent
  Adjusted for patient and hospital
    characteristics                     1.42 (1.32-1.53)  1.36 (1.29-1.43)  1.15 (1.10-1.20)  Referent
  Adjusted for patient and hospital
    characteristics and AMI treatment   1.38 (1.28-1.49)  1.34 (1.27-1.41)  1.14 (1.10-1.19)  Referent
*Risk of 30-day mortality compared with patients hospitalized in 5-star hospitals, adjusted for clustering of patients by
hospitals. RR indicates risk ratio; CI, confidence interval; and AMI, acute myocardial infarction.
†P = .001 for global and test of trend.

Figure 2. Risk-Standardized 30-Day Mortality Rates Among the Rating Groups
[Box plots of risk-standardized 30-day mortality rate (%) by hospital rating group (1, 2, 3, and 5 stars).]
The outer lines of the “box” correspond to the 25th and 75th percentiles, and the middle line corresponds to the 50th percentile in the distribution of 30-day
mortality rates. The upper horizontal line or “whisker” represents upper adjacent values or 30-day mortality rates above the 75th percentile that fall within
the range of rates defined by the 75th percentile plus 1.5 times the interquartile range (25th-75th percentile). The lower vertical line or “whisker” represents lower
“adjacent” values or 30-day mortality rates below the 25th percentile that fall within the range of rates defined by the 25th percentile minus 1.5 times the
interquartile range (25th-75th percentile).

Publicly reported hospital ratings based on patient mortality may result in poorer net clinical
outcomes than observed prior to public reporting.30 Even if mortality ratings were based on
high-quality data and comprehensive risk-adjustment models, mortality has limited utility as a measure
of quality. Although mortality is an important measure, it does not identify specific processes that
require improvement31 and often correlates poorly with quality of care.32 Mortality results may be best
used for internal quality audits in which other supplementary information can be obtained and
evaluated. A more accurate evaluation of hospital quality for the public may be achieved by the use of
process measures. Comparisons of hospitals’ processes of care (eg, the use of β-blockers during
hospitalization for AMI) would directly demonstrate whether a hospital’s care is in compliance with
national treatment guidelines.

Table 6. Comparison of Risk-Standardized Mortality Rates Between Hospital Rating Groups*

Hospital Ratings
                1 Star     2 Stars    3 Stars
2 Stars
  Better        6.2        ...        ...
  Comparable    91.4       ...        ...
  Worse         2.4        ...        ...
3 Stars
  Better        3.9        3.2        ...
  Comparable    92.7       89.5       ...
  Worse         3.4        7.3        ...
5 Stars
  Better        4.6        4.1        5.4
  Comparable    92.3       88.0       90.3
  Worse         3.1        7.9        4.3
*Data presented are the proportion of statistical comparisons made between hospitals in HealthGrades.com rating groups in which 2 hospitals had statistically better,
comparable, or worse risk-standardized 30-day mortality rates. “Better” refers to the proportion of hospital comparisons in which hospitals from a rating group (listed
across the top) had a statistically better risk-standardized mortality rate compared with hospitals from a rating group listed down the side of the table, eg, in
6.2% of comparisons between 1-star hospitals and 2-star hospitals, 1-star hospitals had statistically better risk-standardized outcomes. “Comparable” refers to the
proportion of hospital comparisons in which hospitals from the rating group (listed across the top) had a statistically comparable risk-standardized mortality rate
compared with hospitals from a rating group listed down the side of the table. “Worse” refers to the proportion of hospital comparisons in which hospitals from the
rating group (listed across the top) had a statistically worse risk-standardized mortality rate compared with hospitals from a rating group listed down the side of
the table.

The Joint Commission on the Accreditation of Healthcare Organizations is developing such a
process-based evaluation,33 and the Centers for Medicare and Medicaid Services is currently evaluating
process-based care measures with the goal of reducing preventable mortality.34 Such an approach has its
own limitations, notably how to develop standards for reporting and measuring process of care. However,
this approach may represent an improvement in the measurement of hospitals’ performance by providing
quantifiable measures of quality that can be of benefit to both hospitals and consumers.
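A process measure of this kind can be sketched as the share of ideal candidates who received a therapy, reported with a confidence interval so that small denominators remain visible. The Wilson score interval used here is one common choice and is an assumption, not a method named in this article; the function names and the small-hospital counts are hypothetical. The first call uses the all-hospital counts for β-blockers on admission reported in Table 3.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (roughly 95% when z = 1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def process_rate(treated, ideal):
    """Percentage of ideal candidates treated, plus interval bounds in percent."""
    low, high = wilson_interval(treated, ideal)
    return 100 * treated / ideal, 100 * low, 100 * high


# All hospitals, beta-blockers on admission (counts reported in Table 3).
all_hospitals = process_rate(29347, 58261)
# Hypothetical small hospital with a similar underlying rate.
small_hospital = process_rate(15, 30)
print(all_hospitals)
print(small_hospital)
```

The point estimate is about the same in both calls, but the interval for the 30-patient hospital is far wider, which is why standards for reporting process measures would need to account for hospital volume.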
Several issues should be considered
in evaluating our methods. Although
we sought to replicate HealthGrades.com’s
rating approach, there are several differences between our cohort and
the one it evaluated. First, the period of the
rating system’s data (October 1994 to
September 1997) overlapped with only
half of our study period (January 1994
to February 1996). A perfect overlap was
not feasible because this rating system
first began reporting data (in 1999) for
the 1994-1997 period; thus, no ratings
were available for the entire CCP period.
Lack of a precise temporal overlap, however, would only be of concern if hospitals’ ratings markedly changed between
March 1996 and September 1997. This
would raise even further concerns as to
the stability and validity of these ratings because they are based on admissions that occurred 2 to 5 years earlier.
Second, the rating system was based
on patients admitted with a principal
ICD-9-CM diagnosis code of 410 or a diagnosis related group code of 121, 122,
123, while CCP data only include patients admitted with a principal ICD-9-CM discharge diagnosis of AMI. We believe the use of the principal discharge
diagnosis is the most appropriate method
of identifying AMIs (which are subsequently clinically confirmed) as it identifies the condition chiefly responsible for
a patient’s admission to the hospital.35
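The two case definitions above differ only in their selection rule, which a short sketch can make concrete. The record layout (`principal_dx`, `drg`), the function names, and the sample admissions are hypothetical; the code values (ICD-9-CM 410; diagnosis related groups 121, 122, 123) are those stated in the text. The clinical confirmation step applied to CCP cases is not modeled.

```python
AMI_DRGS = {"121", "122", "123"}


def has_principal_ami(record):
    """True if the principal ICD-9-CM diagnosis code falls under 410 (AMI)."""
    return record["principal_dx"].startswith("410")


def in_rating_system_cohort(record):
    """Rating system rule as described: principal diagnosis 410 or DRG 121-123."""
    return has_principal_ami(record) or record["drg"] in AMI_DRGS


def in_study_cohort(record):
    """Study rule as described: principal discharge diagnosis of AMI only."""
    return has_principal_ami(record)


# Hypothetical admissions, for illustration only.
admissions = [
    {"id": 1, "principal_dx": "410.71", "drg": "122"},
    {"id": 2, "principal_dx": "428.0", "drg": "121"},
    {"id": 3, "principal_dx": "486", "drg": "089"},
]
rating_ids = [a["id"] for a in admissions if in_rating_system_cohort(a)]
study_ids = [a["id"] for a in admissions if in_study_cohort(a)]
print(rating_ids, study_ids)
```

In this toy data, admission 2 enters only the rating system's cohort, because it qualifies through its diagnosis related group rather than its principal diagnosis.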
Third, the rating system retained patients’ readmissions in their evaluation cohort. Because multiple admissions for the same patient may violate
independence assumptions required for
regression analyses, we only included
patients’ first admissions in our main
analysis. However, findings were similar when analyses were repeated incorporating cases that had previously been
excluded as readmissions.
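Restricting the cohort to first admissions, as described above, amounts to keeping the earliest record per patient. A minimal sketch follows; the field names (`patient_id`, `admit_date`) and the sample records are hypothetical.

```python
from datetime import date


def first_admissions(admissions):
    """Return one record per patient: the admission with the earliest date."""
    earliest = {}
    for rec in admissions:
        current = earliest.get(rec["patient_id"])
        if current is None or rec["admit_date"] < current["admit_date"]:
            earliest[rec["patient_id"]] = rec
    return sorted(earliest.values(), key=lambda r: r["admit_date"])


# Hypothetical records; patient "A" has a readmission.
records = [
    {"patient_id": "A", "admit_date": date(1995, 3, 2)},
    {"patient_id": "B", "admit_date": date(1994, 11, 20)},
    {"patient_id": "A", "admit_date": date(1994, 6, 15)},
]
kept = first_admissions(records)
print([(r["patient_id"], r["admit_date"].isoformat()) for r in kept])
```

Each patient then contributes a single observation, which is what the independence assumption of the regression models requires.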
Fourth, the rating system includes
hospitalizations of patients who arrived by means of interhospital transfer in their hospital evaluation while we
excluded these patients from our analysis. Patients with AMI who are admitted by interhospital transfer are generally healthier than those who arrive by
direct admission.36 Including these patients would result in a bias toward lower
estimates of mortality in large, urban,
and advanced cardiac care hospitals that
receive patients by transfer. Furthermore, risk adjustment for “admission”
characteristics for patients who arrive
by transfer would reflect their clinical
status several days postinfarction as
opposed to peri-infarction characteristics for patients who arrive by direct
admission.
Fifth, the rating system excluded hospitalizations of patients who are transferred out of a hospital while we retained these patients in our analysis.
Given that patients who leave a hospital by transfer are generally healthier
than those not transferred, the rating
system’s exclusion of these patients results in a systematically biased higher
estimate of mortality for smaller hospitals, hospitals in rural areas, hospitals without cardiac care facilities, and
others more likely to transfer patients
to other centers.37
Finally, we compared hospitals’ ratings, based on mortality during hospitalization and the 30 days following discharge, with mortality at 30 days’ and
60 days’ postadmission. We used a
slightly different follow-up period to ensure hospitals’ outcomes reflected standardized ascertainment of mortality, not
influenced by variations in length of stay
or discharge practices, thus avoiding the
documented biases associated with using in-hospital mortality to assess hospitals’ performance.38
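The difference between in-hospital mortality and mortality within a fixed window after admission can be sketched as follows. The function names and dates are hypothetical, and counting the final day of the window as inside it is an assumption made for the example.

```python
from datetime import date


def died_within(admit, death, days=30):
    """True if death occurred within `days` days of admission (inclusive)."""
    return death is not None and 0 <= (death - admit).days <= days


def died_in_hospital(death, discharge):
    """True if death occurred on or before the discharge date."""
    return death is not None and death <= discharge


# Hypothetical patient discharged alive on day 5 who died on day 12.
admit = date(1995, 1, 10)
discharge = date(1995, 1, 15)
death = date(1995, 1, 22)
print(died_in_hospital(death, discharge), died_within(admit, death, 30))
```

A hospital that discharges patients early would not count this death under an in-hospital definition, while the fixed 30-day window counts it regardless of length of stay.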
Several possible limitations should be
considered in interpreting these data. We
only considered a single disease in our
evaluation of the rating system, so our
findings may not necessarily be generalizable to ratings for other conditions.
Nonetheless, AMI is a common, high-volume condition at many hospitals with
a clear base of evidence to support recommended treatments. In addition, our
study was limited to data concerning
Medicare fee-for-service patients hospitalized with AMI and may not be relevant to the care of younger patients or
those hospitalized with other conditions. However, the hospital ratings were
also derived from data related to this
group, and thus should be ideally suited
for producing hospital ratings for the
treatment of this population. Also, we
evaluated only 1 “report card” system.
These results may not be generalizable
to other ratings systems, although it is
unlikely that a ratings system focused on
outcomes, using the same data source
and the same methods would achieve
different results.
The increase in the number of publicly available hospital report cards such
as HealthGrades.com reflects the public’s desire for comparative data on quality and outcomes. However, the necessary and often overlooked caveat
associated with such report cards is that
the public (and health care professionals) often become focused on identifying “winners and losers” rather than using these data to inform quality
improvement efforts. Our evaluation of
an Internet hospital rating system highlights the importance of this message.
Although the ratings we evaluated accurately differentiated between large
groups of hospitals, they inadequately
classified individual hospitals, with significant potential consequences for perceptions of an individual institution’s
quality of care, particularly if directly
released to a public unfamiliar with the
design and limitations of administrative data-derived rating systems. As
such, current outcome-based report
card efforts are better used as a tool for
quality improvement, rather than as a
publicly reported means of discriminating between hospital performance.
Author Affiliations: Section of Cardiovascular Medicine, Department of Medicine (Drs Krumholz, Chen,
and Radford, and Messrs Rathore and Wang), and Section of Health Policy and Administration, Department
of Epidemiology and Public Health (Dr Krumholz), Yale
University School of Medicine, New Haven, Conn; Yale-New Haven Hospital Center for Outcomes Research and
Evaluation, New Haven, Conn (Drs Krumholz and Radford); and Qualidigm, Middletown, Conn (Drs Krumholz and Radford). Dr Chen is currently affiliated with
the Department of Radiology, Hospital of the University of Pennsylvania, Philadelphia.
Author Contributions: Study concept and design:
Krumholz, Rathore, Chen, Radford.
Acquisition of data: Krumholz, Chen, Radford.
Analysis and interpretation of data: Krumholz,
Rathore, Chen, Wang, Radford.
Drafting of the manuscript: Krumholz, Rathore, Chen.
Critical revision of the manuscript for important intellectual content: Krumholz, Rathore, Chen, Wang,
Radford.
Statistical expertise: Krumholz, Rathore, Chen, Wang.
Obtained funding: Krumholz.
Administrative, technical, or material support: Chen,
Wang.
Study supervision: Krumholz.
Funding/Support: The analyses upon which this article is based were performed under Contract
500-99-CT01 entitled “Utilization and Quality Control Peer
Review Organization for the State of Connecticut,”
sponsored by the Health Care Financing Administration, US Department of Health and Human Services.
Disclaimer: The content of this publication does not
necessarily reflect the views or policies of the US Department of Health and Human Services, nor does
mention of trade names, commercial products, or organizations imply endorsement by the US government. The authors assume full responsibility for the
accuracy and completeness of the ideas presented. This
article is a direct result of the Health Care Quality Improvement Project initiated by the Health Care Financing Administration, which has encouraged identification of quality improvement projects derived from
analysis of patterns of care, and therefore required no
special funding on the part of the contractor.
Acknowledgment: The authors thank Christopher
Puglia, BS, for assistance in data collection and Maria
Johnson, BA, for editorial assistance.
REFERENCES
1. America’s Best Hospitals: 2001 Hospital Guide. US News and World Report; 2001. Available at:
www.usnews.com/usenews/nycu/health/hosptl/tophosp.htm. Accessed February 12, 2002.
2. Green J, Wintfeld N. Report cards on cardiac surgeons: assessing New York State’s approach. N Engl
J Med. 1995;332:1229-1232.
3. National Committee for Quality Assurance.
NCQA’s Health Plan Report Card. Washington, DC:
National Committee for Quality Assurance; 2000.
4. Health Care Report Cards 1998-1999. 4th ed. Washington, DC: Atlantic Information Services Inc; 1998.
5. Pear R. Medicare shift towards HMOs is planned.
New York Times. June 5, 2001:A19.
6. Morrissey J. Internet company rates hospitals. Modern Healthcare. 1999;29:24-25.
7. Schifrin M, Wolinsky M. Use with care. Forbes Best of the Web, June 25, 2001. Available at:
http://www.forbes.com/bow/. Accessed February 12, 2002.
8. Prager LO. Criteria to identify “leading physicians” yield a long list. American Medical News.
September 6, 1999. Available at: www.ama-assn.org/sci-pubs/amnews/pick_99/prl10906.htm. Accessed
February 12, 2002.
9. Healthgrades.com: The Healthcare Quality Experts. Available at: www.healthgrades.com. Accessed June 18, 2001.
10. Butler R. Fifty most incredibly useful sites. Yahoo! Internet Life. July 2001. Available at:
http://www.yil.com/features/features.asp?volume=07&issue=07&keyword=usefulsites. Accessed February
12, 2002.
11. Appleby J, Davis R. Is your doctor bad? USA Today. October 11, 2000:B1.
12. Carey B. Say “aah”: your health online. Los Angeles Times. July 2, 2001:S2.
13. HealthGrades, Inc announces fourth quarter and year-end results; 2001. Available at:
www.healthgrades.com. Accessed February 12, 2002.
14. HealthGrades announces partnership agreement with the Leapfrog Group; 2002. Available at:
www.healthgrades.com. Accessed February 12, 2002.
15. Marciniak TA, Ellerbeck EF, Radford MJ, et al. Improving the quality of care for Medicare patients with
acute myocardial infarction: results from the Cooperative Cardiovascular Project. JAMA. 1998;279:1351-1357.
16. Ryan TJ, Anderson JL, Antman EM, et al. ACC/
AHA guidelines for the management of patients with
acute myocardial infarction: a report of the American
College of Cardiology/American Heart Association Task
Force on Practice Guidelines (Committee on Management of Acute Myocardial Infarction). J Am Coll Cardiol. 1996;28:1328-1428.
17. Fleming C, Fisher ES, Chang CH, Bubolz TA,
Malenka DJ. Studying outcomes and hospital utilization in the elderly: the advantages of a merged data
base for Medicare and Veterans Affairs hospitals. Med
Care. 1992;30:377-391.
18. Daley J, Jencks SF, Draper D, Lenhart G, Thomas
N, Walker J. Predicting hospital-associated mortality for
Medicare patients: a method for patients with stroke,
pneumonia, acute myocardial infarction, and congestive heart failure. JAMA. 1988;260:3617-3624.
19. White HA. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48:817-838.
20. Zhang J, Yu KF. What’s the relative risk? a method
of correcting the odds ratio in cohort studies of common outcomes. JAMA. 1998;280:1690-1691.
21. Vladeck BC, Goodwin EJ, Myers LP, Sinisi M. Consumers and hospital use: the HCFA “death list.” Health
Aff (Millwood). 1988;7:122-125.
22. How to read the hospital death list. New York
Times. March 17, 1986:A18.
23. Iezzoni LI, Burnside S, Sickles L, et al. Coding of
acute myocardial infarction: clinical and policy implications. Ann Intern Med. 1988;109:745-751.
24. Miller MG, Miller LS, Fireman B, Black SB. Variation in practice for discretionary admissions: impact
on estimates of quality of hospital care. JAMA. 1994;
271:1493-1498.
25. Mennemeyer ST, Morrisey MA, Howard LZ. Death
and reputation: how consumers acted upon HCFA
mortality information. Inquiry. 1997;34:117-128.
26. Green J, Passman LJ, Wintfeld N. Analyzing hospital mortality: the consequences of diversity in patient mix. JAMA. 1991;265:1849-1853.
27. Green J, Wintfeld N, Sharkey P, Passman LJ. The
importance of severity of illness in assessing hospital
mortality. JAMA. 1990;263:241-246.
28. Wilson P, Smoley SR, Wedegar D. Second Report
of the California Hospital Outcomes Project: Acute Myocardial Infarction. Sacramento, Calif: Office of Statewide Health Planning and Development; 1996.
29. Krumholz HM, Chen J, Wang Y, et al. Comparing AMI mortality among hospitals in patients 65 years
of age and older: evaluating methods of risk adjustment. Circulation. 1999;99:2986-2992.
30. Dranove D, Kessler D, McClellan M, Satterthwaite M. Is More Information Better? The Effects of
“Report Cards” on Health Care Providers. Cambridge, Mass: National Bureau of Economic Research; 2002. NBER Working Paper 8697.
31. Lohr KN. Outcome measurement: concepts and
questions. Inquiry. 1988;25:37-50.
32. Thomas JW, Hofer TP. Accuracy of risk-adjusted
mortality rate as a measure of quality of care. Med
Care. 1999;37:83-92.
33. Braun BI, Koss RG, Loeb JM. Integrating performance measure data into the Joint Commission accreditation process. Eval Health Prof. 1999;22:283-297.
34. Jencks SF, Cuerdon T, Burwen DR, et al. Quality
of medical care delivered to Medicare beneficiaries: a
profile at state and national levels. JAMA. 2000;284:
1670-1676.
35. Iezzoni LI. Data sources and implications: administrative data bases. In: Iezzoni LI, ed. Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, Mich: Health Administration Press; 1994.
36. Mehta RH, Stalhandske EJ, McCargar PA, Ruane TJ, Eagle KA. Elderly patients at highest risk with
acute myocardial infarction are more frequently transferred from community hospitals to tertiary centers:
reality or myth? Am Heart J. 1999;138:688-695.
37. Thiemann DR, Coresh J, Powe NR. Quality of care
at teaching and nonteaching hospitals. JAMA. 2000;
284:2994-2995.
38. Jencks SF, Williams DK, Kay TL. Assessing hospital-associated deaths from discharge data: the role of length
of stay and comorbidities. JAMA. 1988;260:2240-2246.