
CLIN. CHEM. 32/8, 1510-1516 (1986)
How to Make Laboratory Information More Informative
Peter E. Politser
I review psychological literature relevant to the design of
medical laboratory data reports and computer display systems, illustrate how to improve such systems, and discuss
needs for further research.
Additional Keyphrases: lactate dehydrogenase isoenzymes; data display; diagnosis; medical decision making; computer-assisted diagnosis; myocardial infarction
In recent years, laboratory data have become more numerous and more difficult to evaluate. Automatic analyzers
and computer storage capabilities have multiplied the potential volume of test information, and new diagnostic
technologies have increased its complexity. Simultaneously,
the growing proportion
of elderly patients with multiple
organ-system
failure has made data interpretation
more
difficult. Thus, it is no surprise to find many published
studies (reviewed in 1) documenting failures to recognize
important
findings and diagnostic possibilities.
Some of these errors may arise from deficiencies in
laboratory reports, which often seem to complicate rather
than simplify information. Irrelevant data in them can
distract us from significant results, as a magician’s gesture
can divert us from a trick. Both exploit the limitations of
human attention.
Fortunately,
these and other deceptions may be preventable if we design laboratory reports with human psychological limitations in mind. Here I illustrate how this may be
done. This paper provides a framework, based on psychological research, for improving standard laboratory reports and
computer-data displays.
Background
Creating better laboratory-reporting methods may help
compensate for human limitations
and support decision
making in ways that other decision-support methods cannot. Computer-assisted diagnostic methods all have serious
deficiencies. Although Bayesian methods can be used to
interpret complex laboratory data automatically (i.e., compute disease likelihoods), they typically do not assist in the
early stages of diagnosis, in discriminating multiple coexisting disorders, or in monitoring changes in disease. Such
applications are possible in theory, but in practice the
possible disease spectra often are too numerous and the data
insufficient. Artificial-intelligence programs, derived from
expert judgment rather than a database, may enlarge the
range of applications. However, they usually do not assist
perceptual recognition of patterns to monitor disease progression or treatment effects, and they also require simple
patient categorizations, as do algorithms and most other
decision-support methods.
Harvard School of Public Health, Departments of Health Policy
and Management, and Biostatistics, and the Institute for Health
Research, a Joint Program of Harvard University and The Harvard
Community Health Plan, 677 Huntington Ave., Boston, MA 02115.
Presented in part at the 5th Annual Conference on Clinical
Laboratory Organization and Management, Haifa, Israel, 1985.
Received October 21, 1985; accepted May 22, 1986.
1510 CLINICAL CHEMISTRY, Vol. 32, No. 8, 1986
Graphic display techniques are free of these deficiencies.
They can assist perception of abnormalities or relationships
very early in the diagnostic process, help monitor subtle
changes, summarize or reduce data, and even suggest the
presence of coexisting diseases. They also leave the usual
judgment tasks to the physician. As a result, they are often
more clinically acceptable than other decision-support
methods.
Many different display techniques have been proposed
(reviewed in 2), including computer-generated time plots,
multiple graphs related to a common system or organ,
summary statistics, and displays only of abnormal results.
These methods, however, can hinder as well as help. They
sometimes overwhelm the clinician with too many results or
omit important findings in an effort to reduce information
overload. Some designers of computer display systems have
begun to acknowledge the need to consider “human factors”
(2, 3). Others (4) have developed general guidelines for
choosing graphic techniques in the presentation of research
data. However, no one has yet adequately demonstrated the
applicability of these new techniques to the improvement of
laboratory
data presentations.
Also, the need for many
additional techniques, motivated by other psychological
literature,
has been ignored. Most seriously, no adequate
framework has existed specifically to guide selection of
methods for display of laboratory data. This paper makes a
contribution toward filling these gaps.
Basic Concepts
In the following sections, I present guidelines to show how
better to tailor routine laboratory and computer-generated
reports to the requirements of human perception. By pre-editing
laboratory reports, as humans do when they read the reports,
we reduce the human effort needed to extract information
from them. These guidelines illustrate the use of
this principle to assist six natural stages of perception:
filtering, simplification, coding, grouping, recognition, and
segregation (5, 6). Later, I discuss further guidelines to help
one decide which editing techniques to use when.
Filtering.
Humans initially edit information by filtering.
For example, physicians screen data in a written laboratory
report to locate key findings. With experience, they may
become reasonably proficient at this; however, superfluous
data may still be highly distracting, and filtering them may
require cognitive effort that would be better spent elsewhere. For example, the report in Figure 1a includes
unnecessary numbers and grids and repetitious reports of
normal ranges. It repeats the copyright number and the
name and location of the company producing the sheet
several times. Regrettably, similar distractions typically
occur on many slips throughout patients’ charts. (On many
slips, in fact, the most prominently displayed piece of
information is the name of the chairman of the department!)
These unnecessary distractions can be easily eliminated,
and guidelines for doing so have already been published (7).
Figure 1b illustrates the results of applying such guidelines.
[Figure 1: two forms of an LDH isoenzyme report: a, the original laboratory slip; b, a simplified bar chart with bars labeled LD1, LD2, LD3, LD4, LD5, and LDT]
Fig. 1. Two forms of report
In report a, a typical LDH isoenzyme report, “1%” refers to isoenzyme (1-5) proportions and “U%” refers to isoenzyme (1-5) values. These terms are not explained in the graph, and we have discovered (informally) that many clinicians do not understand them.
In report b, a simplified bar chart represents the essential information in a. The graph in b shows the LDH isoenzymes (1-5) and total (LDT) with a slash for the upper range of normal
The isoenzyme proportions and total activity of lactate
dehydrogenase (LDH) are given as bar graphs with a slash
for the upper range of normal. This display omits unnecessary markings, repetitions, and legends, yet captures all the
important information in the previous figure. This “prefiltering” reduces the need for human perceptual editing.
Simplification.
Humans also edit data by mentally simplifying them, for
example, by rounding numbers. Unfortunately, this may require considerable effort, e.g., when
laboratory data are reported to four digits as in Figure 1a.
We do not need and cannot remember all of these numbers.
Thus, we may want to simplify such data, perhaps by
rounding them to two significant digits when additional
ones are unlikely to improve judgment.
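The rounding suggested here can be sketched in a few lines. This is only an illustration: the function name `round_sig` and the sample values are assumptions chosen to resemble four-digit laboratory results, not values taken from any particular report.

```python
import math

def round_sig(value, digits=2):
    """Round a result to a given number of significant digits."""
    if value == 0:
        return 0.0
    # Decimal position of the leading digit, e.g. 320.7 -> 2, 19.42 -> 1
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

# Hypothetical four-digit results of the kind a report might print
for raw in (19.42, 260.1, 320.7):
    print(raw, "->", round_sig(raw))
```

Whether two digits suffice depends on the test; the `digits` parameter leaves that judgment to the laboratory.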
Nonnumerical, graphic reports of data also can be simplified. Cleveland and McGill (8) have suggested classifying
graphical
displays hierarchically, from the simpler (e.g.,
point graphs, bar charts) to the more complex (e.g., pie
charts, Chernoff faces). As they demonstrated, the more
complex the display, the less accurate people are in discriminating differences in values. For example, whereas some
investigators have suggested that we report multiple test
results by representing each as a different feature in a
drawing of a human face (a Chernoff face) and varying their
sizes or shapes according to the magnitudes of the result,
discriminating
the differences in the sizes and shapes of
facial features is perceptually complex. The user must
remember the meaning of each feature, and irrelevant
variations in one feature can bias judgments about another
(9). Thus, despite claims that such displays are more memorable, empirical evidence suggests we generally should
avoid them and choose simpler methods (8).
Coding. To reduce quantitative data further, people often
remember them only qualitatively. They mentally “code”
data as changes from a reference point, e.g., as above or
below normal, more or less than expected, etc. (5). Unfortunately, they may choose reference points improperly, by
comparing laboratory findings with the normal range for an
inappropriately general population. For example, failure to
consider sex in using the hematocrit to regulate blood
transfusion and similar omissions are discussed in reference
1. To correct such errors, we commonly normalize reference
values with respect to age, sex, lean body mass, etc. Still
other forms of coding may involve comparing different test
results and noting whether or not inequalities occur, e.g., is
LDH 1 > LDH 2? Many common clinical heuristics take this
form. If enough prior data points are available, we may even
consider the patient’s own past average of results. Because
patients often have different homeostatic set points regulating their own normal values, recomputing the normal range
based on previous results, when feasible, may avoid erroneous judgments about the significance of change (1).
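One way to picture coding against a patient's own reference point is the sketch below. It assumes, purely for illustration, that the personal range is the mean of prior results plus or minus two standard deviations; the function names, the prior values, and the population range are all hypothetical.

```python
from statistics import mean, stdev

def personal_range(previous_results, width=2.0):
    """Reference range from a patient's own prior results (assumed: mean +/- width * SD)."""
    m = mean(previous_results)
    s = stdev(previous_results)
    return (m - width * s, m + width * s)

def code_result(value, low, high):
    """Code a numeric result qualitatively relative to a reference range."""
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"

# Hypothetical prior results for one patient and a hypothetical population range
prior = [0.8, 0.9, 0.8, 0.9, 0.85]
population_low, population_high = 0.6, 1.3
new_value = 1.2

print(code_result(new_value, population_low, population_high))  # within
print(code_result(new_value, *personal_range(prior)))           # above
```

The same value is coded as unremarkable against the population range but as a rise against the patient's own set point, which is the kind of discrepancy the text describes.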
Another common problem in coding results as normal or
abnormal arises from the problem of multiple testing.
Obviously, the likelihood of an abnormal test result for a
normal individual
is not merely .05 (the frequency for an
individual test) but depends on the number of tests done. If
we perform 20 independent tests, the likelihood of at least one false positive rises to .64. Thus the question arises: should we
alter the normal range on the basis of the number of tests
involved to ensure that the likelihood of an abnormal result
from a normal patient still remains below some threshold
(e.g., .05)? (This could be done automatically, with computer
laboratory data displays, and perhaps with the Bonferroni inequality or, ideally, methods considering inter-test correlations)
(10). Such a correction might have disadvantages when
physicians are accustomed to working with standard normal
ranges for a common battery of tests; however, an adjustment in the normal range might be advisable in less
common cases, especially when a large number of tests are
done. If used to supplement rather than replace traditional
reporting methods, this could help the physician “see” the
data in another perspective. It could also diminish the
frequency of unnecessary
repeat tests to rule out false
positives.
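The arithmetic behind the multiple-testing problem, and the Bonferroni adjustment mentioned above, can be checked with a short sketch. It assumes independent tests, as the text does; the function names are illustrative.

```python
def familywise_rate(alpha, n_tests):
    """Chance of at least one abnormal result in a normal person across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_alpha(target, n_tests):
    """Per-test frequency that keeps the overall rate at or below the target."""
    return target / n_tests

n = 20
print(round(familywise_rate(0.05, n), 2))          # 0.64
per_test = bonferroni_alpha(0.05, n)
print(per_test, familywise_rate(per_test, n) < 0.05)
```

Because the Bonferroni bound ignores correlations among tests, it is conservative when results are correlated, which is why methods that consider inter-test correlations would be preferable.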
Grouping. Humans also perceptually edit data by crudely
separating or grouping them (11). However, this can prove
difficult if data are improperly arranged. To illustrate, try to
identify the letter F in Figure 2a and 2b. People are much
slower and less accurate in Figure 2a because the letter F is
not separated from the other group of symbols as it is in
Figure 2b (11); in the first panel they must examine details
within the group to see the letter.
[Figure 2: panels a and b, each showing the letter F among abstract symbols]
Fig. 2. The perceptual effects of grouping abstract symbols (adapted
from 24): a, distinctive element (F) displayed together with others; b,
distinctive element displayed separately
Similarly,
when laboratory
reports are not properly
grouped in a patient’s chart, we may miss related or unusual
findings. However, we can mechanically rearrange the data
to make perception easier. For example, Connelly et al. (3)
developed a computer system to automatically group related
findings such as results of renal- or liver-function tests. We
can also regroup related results at an even more detailed
level. To illustrate, suppose we intentionally disorganize a
panel of renal function tests in the system Connelly developed (Figure 3a). Here, the synchronous patterns for serum
urea nitrogen and serum creatinine are almost indiscernible. However, rearranging the related elements in a single
row or column (Figure 3b) helps us to see much more readily
the pattern correlations: the
improvements in renal function (urea and creatinine) synchronous with the declining
potassium.1 Elsewhere, Connelly et al. (3) provided similar
examples showing synchronous increases in urea nitrogen
and creatinine, suggesting a renal rather than pre-renal
cause. We notice related trends much more easily with such
arrangements
because perception naturally flows across a
single row or column, or in some other continuous linear
direction, as previously illustrated in Figure 2.
Recognition. At a later stage of perception, we often must
recognize less-obvious findings, such as changes in a patient’s condition. This, however, may require the ability to
ignore irrelevant similarities and focus attention on differences in successive test results. Because this is a difficult
task for unaided perception, especially when irrelevancies
are prominent (12), assistance may be needed.
1This example is intended only to illustrate the perceptual effects
of grouping, not to advocate that we always group analyses measuring the same function. In some cases the clinician may wish to
group tests from different organs to examine their associations.
[Figure 3: twelve time plots of serum results for bicarbonate, creatinine, BUN, chloride, sodium, and potassium, shown in two arrangements]
Fig. 3. Effect of grouping methods on information
conveyed (adapted from 2): a (left six graphs), disorganized panel of kidney function test profiles; b
(right six graphs), the same panel, with related profiles linearly organized
For example, Figure 4 (top and middle) shows that
patterns with similar shapes on successive days
may be scarcely distinguishable.
However, subtracting
the raw values on day 1 from those on day 2, and thus
removing the similar pattern features, makes an important change
clearly apparent: a statistically significant increase in
LDH 1, often evidence of recurrent acute myocardial infarction. Thus, even when a characteristic sign of myocardial
infarction (LDH 1 > LDH 2) escapes notice in the daily
patterns, it may become obvious when the change between
them is emphasized (ΔLDH 1 > ΔLDH 2 in Figure 4,
bottom). Such subtraction could, of course, be misleading
when analytical variability is high, but when it is justified
(e.g., when the differences are statistically significant),
it
could eliminate irrelevant similarities that distract us from
important
differences in successive results. Thus, it relieves
the human mind of a difficult editing task.
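The subtraction described above can be sketched as follows. The profiles are hypothetical, not the data of Figure 4, and the flagging rule (a change larger than k times the standard deviation of a difference of two measurements) is an assumption standing in for whatever significance test a laboratory would actually use.

```python
import math

def profile_change(day1, day2):
    """Element-wise change in each isoenzyme from day 1 to day 2."""
    return [b - a for a, b in zip(day1, day2)]

def flag_changes(changes, analytic_sd, k=2.0):
    """Flag changes exceeding k SDs of a difference of two measurements (assumed rule)."""
    limit = k * math.sqrt(2.0) * analytic_sd
    return [abs(c) > limit for c in changes]

# Hypothetical LDH 1-5 activities (U/L); LDH 1 < LDH 2 on both days, so there is no "flip"
day1 = [100, 130, 80, 40, 35]
day2 = [125, 140, 82, 41, 36]

change = profile_change(day1, day2)
print(change)                               # [25, 10, 2, 1, 1]
print(flag_changes(change, analytic_sd=5))  # only the LDH 1 change is flagged
```

In this example neither daily profile shows LDH 1 > LDH 2, yet the change in LDH 1 exceeds the change in LDH 2, which is the pattern the difference display is meant to expose.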
Transformations other than subtraction might also help
eliminate irrelevancies and clarify subtle differences. Displaying ratios, logarithms, reciprocals, or other transformations can, in theory, remove irrelevant curvilinear features
of patterns in time plots of serial results (similar to those
shown in Figure 3). Reciprocal transformations have been
advocated to linearize patterns for serial determinations of
plasma creatinine, so as to distinguish changes due to renal-transplant rejection (12). Unfortunately, patients are often
too heterogeneous for any single transformation method to
linearize patterns for all of them; choosing a transformation
(e.g., a logarithm rather than a reciprocal) after the data
have been obtained would be unreasonable and could lead to
spurious results (missing a rejection when it occurs or seeing
one when it does not). Thus these methods have often had
limited utility in practice.
Other transformations, however, that assume less about
the homogeneity of patients can minimize
irrelevant
changes or highlight important ones. For example, weighted
averages of serial tests sometimes can remove short-term
variability and help detect long-term trends (14). Conversely, a summation of successive results may quickly detect
important short-term changes (15). Even altering the measurement scale to “fill” the graph, as shown in Figure 3 and
as is done automatically in computer graphing
systems, can
magnify trends in serial tests. Unfortunately,
this attempt
to highlight important changes can also accentuate irrelevant ones. For example, the variations in serum sodium (see
Figure 3) appear large only because the scale is considerably
expanded. Thus, the choice of scale and transformation
method requires a careful balancing of the risks of false
positives and false negatives.
[Figure 4: three plots of isoenzyme concentrations (U/L) against LDH isoenzymes (1-5): day 1, day 2, and change in isoenzyme concentrations]
Fig. 4. Organization of information to facilitate recognition of changes:
LDH isoenzymes on day 1 (top) and day 2 (middle) and the change in
isoenzyme concentrations
from day 1 to day 2 (bottom)
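The weighted averages and summations of serial results mentioned above can take many forms. The sketch below uses two common ones as stand-ins, an exponentially weighted moving average and a cumulative sum of deviations from a target; these are assumptions, since the text does not say which specific methods references 14 and 15 describe.

```python
def ewma(values, weight=0.3):
    """Exponentially weighted average: damps short-term variability in a series."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(weight * v + (1 - weight) * smoothed[-1])
    return smoothed

def cusum(values, target):
    """Running sum of deviations from a target: accumulates a sustained shift quickly."""
    total, sums = 0.0, []
    for v in values:
        total += v - target
        sums.append(total)
    return sums

# Hypothetical serial results with a sustained shift after the fourth value
series = [10, 11, 9, 10, 13, 13, 14, 13]
print(ewma(series))
print(cusum(series, target=10))
```

The smoothing weight and the target are both choices that trade false positives against false negatives, the same balance the text describes for scales and transformations.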
The choice of appropriate scales or transformations may
also be influenced by the need to detect relationships
between graphs (see Figure 3) as well as changes within a
particular graph. Because different tests naturally have
different means and variabilities, some changes may appear
more salient than others, even when this conclusion is not
warranted. Equalizing the error ranges in Figure 3 is one
way to overcome this problem. Standardizing
different tests
(subtracting their means and dividing by their standard
errors) also can aid comparisons and sometimes improve
interpretation
(16).
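Standardization of this kind can be sketched briefly. The sketch divides by a reference standard deviation, which is an assumption about the measure of spread intended; the reference means, spreads, and serial values are hypothetical.

```python
def standardize(values, ref_mean, ref_spread):
    """Express results as deviations from a reference mean in units of a reference spread."""
    return [(v - ref_mean) / ref_spread for v in values]

# Hypothetical serial results and reference statistics for two tests on different scales
sodium = standardize([138, 140, 142], ref_mean=140, ref_spread=2)
potassium = standardize([4.0, 4.5, 5.5], ref_mean=4.5, ref_spread=0.5)

print(sodium)     # [-1.0, 0.0, 1.0]
print(potassium)  # [-1.0, 0.0, 2.0]
```

On a common scale the final potassium value stands out more than the final sodium value, although the raw sodium change is numerically larger.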
Overall, the issue of which scale to choose is a complex
one. We should try to strike a balance between different
communication goals, depending on the clinical problems
likely to occur. Any single simplistic principle (e.g., “always
change the scale to fill up the graph”) is likely to fail. Also,
when a single graph (as in Figure 3) cannot satisfy all the
requirements simultaneously, separate ones (individual and
joint plots) may be preferred.
Segregation. Beyond recognizing changes due to a single
disease or event (e.g., transplant rejection), sometimes we
must detect co-existing disorders. Often we intuitively attempt to segregate their effects, trying to decide which
individual diseases have caused the observed test abnormalities and to what extent. Unfortunately, in complex cases
with failures of multiple organ systems, the effects of one
disorder may conceal those of another (17). Thus, when
possible, methods that mechanically segregate the effects of
separate diseases may be useful.
[Figure 5: plot of two LDH isoenzymes (LDH 1 and LDH 2, in U/L) showing the normal mean and the added heart and liver contributions joined by arrows]
Fig. 5. Construction of the four LDH isoenzyme profiles
The arrows note that, to the normal mean values (point A) we add a contribution
from the heart, which has high LDH 1 and low LDH 2 (point D). To this we also
add a contribution from liver, which has low LDH 1 and high LDH 2 (point C). The
arrows from A to B and B to C represent a similar sequence of additions, but in
reverse order (first a contribution from the liver and then one from the heart)
To illustrate how co-existing diseases may be overlooked,
suppose we construct four different LDH isoenzyme profiles:
one by graphing their normal mean activity concentrations,
another by adding a contribution from the heart, another by
including one from the liver, and a final one by adding a
contribution from both organs, according to the reported
proportions of isoenzymes in these tissues (18). (The graph
in Figure 5 illustrates this construction method for two
isoenzymes, but the conclusions that follow apply to all five.)
Such profiles actually do occur in patients with acute
myocardial infarction, congestive heart failure, or coexisting
diseases (myocardial infarction complicated by congestive
heart failure). Suppose next we ask: in theory, how easy
should these profiles be to distinguish? For example, how easy
should it be to detect co-existing disorders (to distinguish
their profile from the normal one)? The answer is that,
statistically, profiles for co-existing diseases and for normal
values are further apart (see Figure 5) and so they clearly
should differ more than the other pair of profiles (heart
alone and liver alone). In fact, this is true for any proportions of contributions representing different severities of
congestive heart failure and acute myocardial infarction,
and virtually
any common measure of similarity.2 So, when
we display the entire set of LDH profiles (as in Figure 6), it
should be easier to distinguish coexisting disorders (6c) from
normal results (6a) than to distinguish individual abnormalities from the heart (6d) and from the liver (6b).
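The statistical claim above can be checked numerically with a small sketch. The mean profile and the heart and liver contributions are hypothetical numbers, not the tissue proportions of reference 18. The comparison works because the distance from normal to the combined profile depends on the sum of the two contributions, whereas the distance between the heart-only and liver-only profiles depends on their difference, and for non-negative contributions each term of the sum is at least as large in absolute value as the corresponding term of the difference.

```python
def minkowski(u, v, p):
    """Minkowski distance; p=1 is city block, p=2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

# Hypothetical LDH 1-5 values (U/L)
normal = [30, 40, 25, 10, 10]
heart = [20, 12, 3, 1, 1]   # assumed contribution, richest in LDH 1
liver = [1, 2, 4, 8, 20]    # assumed contribution, richest in LDH 5

heart_only = add(normal, heart)
liver_only = add(normal, liver)
both = add(heart_only, liver)

for p in (1, 2, 3):
    print(p, minkowski(normal, both, p), minkowski(heart_only, liver_only, p))
```

For each value of p tried, the first distance printed is the larger, in line with the statement that the combined profile should be the easier one to tell from normal.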
Curiously, however, our perceptions of the actual profiles
so constructed seem to violate the predictions
of statistical
theory. The pair we expected to be less similar (Figures 6a
and 6c) appear more similar, and the pair we expected to be
more similar (Figures 6b and 6d) look quite different. In the
latter pair, our attention
naturally
focuses on their obvious
dissimilarities in shape (due to the LDH “flip,” LDH 1 > LDH 2,
characteristic
of acute myocardial
infarction, in the heart
profile, and the increased proportion of LDH 5, characteristic of liver abnormalities,
in the other). However, in the
combined profile (Figure 6c), the liver contributions mask
those of the heart; by increasing LDH 2 relative to LDH 1,
they prevent the LDH flip. Thus the combined profile
(Figure 6c) resembles the normal profile (Figure 6a).
2This holds true in five dimensions (although only two are
pictured in Figure 5) for any proportions of contributions from the
liver or heart, representing different severities of congestive heart
failure and acute myocardial infarction, and according to an infinite
variety of statistical distance measures: Euclidean, city block, or
any Minkowski metric (19) with a parameter between zero and
infinity. It also is true for the Mahalanobis distance, computed from
the data in reference 20.
[Figure 6: four plots of activity (U/L) against LDH isoenzyme (1-5): a, MEAN; b, MEAN + LIVER; c, MEAN + HEART + LIVER; d, MEAN + HEART]
Fig. 6. Actual data from the four LDH isoenzyme profiles whose
construction
was illustrated in Figure 5
This similarity,
however, is an illusion. The similar
shapes of the profiles first catch our eyes and distract us
from their statistically
important differences
(the graph in
Figure 6c is elevated compared with that of Figure 6a).
Indeed, with traditionally reported proportions or electrophoretic profiles (as in Figure 1a), even this difference would
vanish.
To remove this illusion, my colleagues and I have developed a method to estimate the separate contribution from
each organ (20). We first defined five isoenzymatically
similar types of organs by means of a cluster analysis. We
then estimated (by solving a system of linear equations or
by regression analysis) the unknown amounts that each
“type” of organ contributed to the total LDH. A detailed
description of this method can be found in a separate paper
in this same issue (20).
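The estimation step can be pictured with the following sketch, which fits organ amounts by ordinary least squares. It is not the method of reference 20: it uses only two organ types instead of five clusters, the isoenzyme signatures are hypothetical, and no constraint keeps the estimates non-negative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def estimate_contributions(signatures, observed):
    """Least-squares amounts x such that sum_j x[j] * signatures[j] approximates observed."""
    k = len(signatures)
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    # Normal equations: (S^T S) x = S^T y, with one signature per organ type
    sts = [[dot(signatures[i], signatures[j]) for j in range(k)] for i in range(k)]
    sty = [dot(signatures[i], observed) for i in range(k)]
    return solve_linear(sts, sty)

# Hypothetical fractions of each organ type's LDH found in isoenzymes 1-5
heart = [0.50, 0.30, 0.10, 0.05, 0.05]
liver = [0.05, 0.05, 0.10, 0.20, 0.60]

# Hypothetical excess activities (U/L) built from 80 U/L heart and 40 U/L liver LDH
observed = [42, 26, 12, 12, 28]
print(estimate_contributions([heart, liver], observed))  # approximately [80, 40]
```

With five isoenzymes and fewer organ types than isoenzymes, the system is overdetermined, which is why a regression fit rather than an exact solution is the natural choice in this sketch.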
Illustrative example. Figure 7 demonstrates the application of this technique to the data of a patient admitted with
clinical and laboratory evidence of myocardial infarction.
3Although bar graphs as in Figure 1b might clarify differences
between Figures 6a and c, they have their own drawbacks. Cleveland and McGill (8) have pointed out that the longer the bars, the
more difficult it is to distinguish differences within a profile (e.g.,
between the isoenzymes 1-5 within Figure 6a or 6c). So bar graphs
might help correct one problem, but would create another. They
also would not elucidate the sources of the abnormalities in Figure
6c (heart and liver). Moreover, even if point graphs (e.g., Figures 6a
and c) are suboptimal, they are often used in automated systems for
graphing laboratory data (e.g., Figures 3a and b); thus our example
serves to illustrate serious problems with these commoner forms of
display. Failure to use the same scales for the axes or merely
reporting isoenzyme percentages, as laboratories sometimes do,
further compounds these problems.
[Figure 7: two time plots over days 0-5, annotated with clinical events (chest pain, recurrent chest pain, mitral insufficiency, pulmonary edema, death): a, serum activities of LDH isoenzymes LDH 1 to LDH 5; b, estimated organ contributions to total LDH]
Fig. 7. Interpretive display of changes in serial LDH isoenzymes in a patient with mitral prolapse
(a) Raw data (uncorrected actual activities of LDH isoenzymes in serum). (b) Display
of estimates of the amounts of total LDH (from the data in a) attributable to separate
clusters of organs (the arrows in b (unlabeled) refer to the same clinical events described at the corresponding
times in a)
Two days after admission, this patient developed recurrent
chest pain and clinical signs of mitral valve prolapse,
including pulmonary edema. Subsequently, he underwent
surgery for coronary artery bypass with mitral valve replacement, had cardiac respiratory failure, and died. The
right side of the graph showing the organ contributions
clearly indicates that most of the LDH came from the heart,
liver, and lungs (or from iso-enzymatically similar organs).
Interestingly, the increases in lung and liver LDH clearly
mirror the development of congestive heart failure after
mitral prolapse. From the raw data, the pathologist noted
possible liver abnormalities, because of the increase in LDH
5, but failed to consider lung congestion. More importantly,
clear evidence of recurrent infarction appears in the transformed data from the last three days (Figure 7b). The
clinical staff considered re-infarction unlikely because the
electrocardiographic
and clinical findings (chest pain relieved by nitroglycerin) were ambiguous, there was no
report of creatine kinase MB4 isoenzyme, and there was no
new LDH 1:2 flip (LDH 1 > LDH 2). Undoubtedly, the lung
and liver contributions increased LDH 2 relative to LDH 1
and prevented the flip. The estimates of separate organ
contributions, however, clearly reveal the previously hidden
heart abnormalities.5
4Actually the CK MB was never determined because the total
CK activity was not high enough to fractionate the isoenzymes. A
closer inspection of the data revealed, in fact, the presence of a
recurrent peak in the total CK, but this was overlooked
by the
clinical staff.
5Because erythrocyte isoenzyme proportions are similar to those
from the heart, we might have also considered hemolysis (e.g.,
hemolytic anemia) as a potential
source
of LDH abnormalities.
However, no evidence supported this (the patient was not anemic).
The validity of this transformation was confirmed by
autopsy, which revealed evidence of a new infarction in the
anterior papillary muscle, undoubtedly the cause of the
mitral prolapse. An experimental test of this new method
with 73 patients in the intensive-care unit also revealed
gains in the detection of acute myocardial
infarction and
other disorders (such as pulmonary embolism). We performed a split-half cross-validation, taking half of the cases,
determining the optimal threshold based on discriminant
analysis, and applying it to the other half of the cases to
determine sensitivity and specificity. The order of analysis
was then reversed and the results from both analyses were
summed to calculate overall sensitivity and specificity. The
result was that, for the detection of acute MI, the test had
98% sensitivity and 100% specificity. It significantly outperformed unaided pathologists’ judgments and accepted indices (LDH 1:2, LDH 1, total LDH, LDH 1:total LDH) for
interpretation of LDH isoenzymes (20). Moreover, the cases
in which this approach did better were almost always ones
in which other diseases or complications concealed the
effects of acute myocardial infarction. Thus, the uncovering
of hidden disorders by estimating separate organ contributions was not isolated to the case in Figure 7 but appeared
useful for other patients as well. A full discussion of the
methodology
and its empirical validation is beyond the
scope of this paper but is reported in the next paper (20).
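The split-half procedure described above can be sketched as follows. The sketch substitutes a simple accuracy-maximizing threshold for the discriminant analysis used in the study, and the scores and labels are hypothetical, so its output says nothing about the reported 98% and 100% figures.

```python
def best_threshold(scores, labels):
    """Threshold (from the observed scores) that classifies the most training cases correctly."""
    def n_correct(t):
        return sum((s >= t) == bool(y) for s, y in zip(scores, labels))
    return max(sorted(set(scores)), key=n_correct)

def counts(scores, labels, threshold):
    """True positives, false negatives, true negatives, false positives."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        positive = s >= threshold
        if y and positive:
            tp += 1
        elif y:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp

def split_half(scores, labels):
    """Train on each half, test on the other, and pool the counts."""
    half = len(scores) // 2
    a = (scores[:half], labels[:half])
    b = (scores[half:], labels[half:])
    tp = fn = tn = fp = 0
    for train, test in ((a, b), (b, a)):
        c = counts(test[0], test[1], best_threshold(*train))
        tp, fn, tn, fp = tp + c[0], fn + c[1], tn + c[2], fp + c[3]
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical heart-LDH estimates (U/L) and infarction labels (1 = acute MI)
scores = [10, 60, 20, 70, 15, 65, 25, 80]
labels = [0, 1, 0, 1, 0, 1, 0, 1]
print(split_half(scores, labels))  # (0.75, 1.0)
```

Because each threshold is chosen on one half and judged on the other, the pooled sensitivity and specificity are not inflated by fitting the cutoff to the same cases it is tested on.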
Discussion
I have discussed human psychological limitations at different stages of perception, and have suggested possible
methods for displaying laboratory data to attenuate these
vulnerabilities. Natural psychological editing skills serve as
a model for many of the methods.
We can speculate about how and when each may be
useful. Filtering and simplification techniques seem the
most likely to be applied first, because they are the first
steps naturally required to perceive important results and
their use does not omit important information; they could be
used routinely in laboratory reports or computer displays.
Depending on the context, humans may naturally switch
between other editing mechanisms, such as coding and
grouping (21), which might be best used interactively
through computer displays (2, 3). With these, any data
temporarily suppressed or condensed could easily be retrieved. Recognition methods, such as rescaling or subtracting serial results, probably should apply when more subtle
distinctions are critical or are most likely to be obscured. For
example, we may need to rescale results in routine reports
to communicate the impression of change more forcefully
when serum constituents are tightly regulated. An interactive computer system might even be programmed to display
changes automatically when they are most likely to be
obscured, e.g., when two successive multivariate
profiles
(as in Figures 4a and b) are highly correlated or perceptually similar by other standards. Similar principles could guide
use of segregated displays like the estimates of organ-specific LDH, but only under special circumstances, when
appropriate methodologies exist. Inevitably, in the presence
of noise there is a tradeoff concerning what differences (or
segregated components) should be reported. Statistical tests
of their significance, perhaps weighted by their clinical
importance, should help guide their selection.
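A rule of the kind suggested above, displaying changes automatically when two successive profiles are highly correlated, might look like the sketch below. The Pearson correlation is one possible similarity standard, and the cutoff of 0.9 and the profiles are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def should_show_change(profile_a, profile_b, cutoff=0.9):
    """Suggest a difference display when successive profiles look alike (assumed rule)."""
    return pearson(profile_a, profile_b) >= cutoff

# Hypothetical LDH 1-5 activities (U/L) on successive days
day1 = [100, 130, 80, 40, 35]
day2 = [125, 140, 82, 41, 36]
print(round(pearson(day1, day2), 3), should_show_change(day1, day2))
```

A system using such a rule would still need a statistical test of the differences themselves before reporting them, as the paragraph above notes.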
Most importantly, the physicians themselves (who, rather
than pathologists, are more likely to be the end users of the
data) must decide what displays are needed for patient
management and must have some flexibility in choosing
them. Display needs may vary in different settings. In
intensive care, with serious time constraints, for example,
physicians may need simpler, more familiar forms of test
display (perhaps even numeric rather than graphic). Also, a
physician actively searching for expected information may
require different forms of editing than one who is passively
receiving
unexpected
findings. The clinician’s needs may
also differ depending on whether the data are for monitoring
patients or for diagnosis. In the latter case, clinical priorities
for the display of information should center on the clinician’s diagnostic hypotheses. Such methods require extensive knowledge of disease-finding
relationships, as embodied in artificial intelligence systems. In a future paper I will
discuss a system that provides further editing mechanisms
based on hypotheses, whereby the clinician selects, at a
terminal, different segments of a data set on the basis of
various hypotheses.
Clearly, further research is needed to answer many other
questions and to refine the guidelines just presented. Studies must determine when the benefits of adding edited
displays to the raw data outweigh the increased information
burdens they create. We must also ask whether new displays produce patterns with unclear relevance or eliminate
information needlessly; in some cases, redundant information may actually improve human judgment (1, 8). Editing
may also be less useful for physicians with more expertise or
for those more familiar with a case (e.g., consultants vs house staff) (12). Given that natural perceptual editing
techniques appear to change with increased expertise or
familiarity (22, 23), different physicians may require different degrees of prior simplification.
Once we better understand the skills of specialists in
editing complex patterns of laboratory data, we may even
discover new, more effective strategies. By modeling and
applying these refined expert editing skills to data, we may
see new information that is informative even to the experts
themselves.
I thank David Chou, Hang-Yat Tam, G. S. Kumar, Timothy
Clark, Stephen Powell, and Robert Galen for their comments and
assistance. This research was supported in part by grants LM04132,
LM03306 and LM04086 from the National Library of Medicine. Dr.
Politser is recipient of NIH Research Career Development Award
LMOO8O from the National Library of Medicine.
References
1. Politser PE. Decision analysis and clinical judgement: a reevaluation. Med Decision Making 1982;1:368-89.
2. Connelly DP, Lasky LC, Keller R, Moore AA. Graphical representation of clinical laboratory data. In: O’Neill JT, ed. Computer
applications in medical care. Washington, DC: IEEE Computer
Society Press, 1980:841-8.
3. Connelly DP, Lasky LC, Keller RM, Morrison DS. A system for
graphical display of clinical laboratory data. Am J Clin Pathol
1981;78:729-37.
4. Tufte ER. The visual display of quantitative information. Cheshire, CT: Graphics Press, 1983.
5. Kahneman D. Attention and effort. Englewood Cliffs, NJ: Prentice-Hall, 1973.
6. Kahneman D, Tversky A. Prospect theory: an analysis of decision under risk. Econometrica 1979;47:263-91.
7. Tukey JW. Exploratory data analysis. Boston, MA: Addison-Wesley Publishing Co., 1977.
8. Cleveland WS, McGill R. Graphical perception: theory, experimentation, and application to the development of graphical methods. J Am Stat Assoc 1983;79:531-54.
9. Pachella RG, Somers P, Hardzinski M. A psychophysical approach to dimensional integrality. In: Getty DJ, Howard JH Jr, eds.
Auditory and visual pattern recognition. Hillsdale, NJ: Lawrence
Erlbaum Associates, 1981.
10. Ingelfinger JA, Mosteller F, Thibodeau LA, Ware LH. Biostatistics in clinical medicine. New York: Macmillan, 1983.
11. Cooper LA. Recent themes in visual information processing: a
selected overview. In: Nickerson RS, ed. Attention and performance. Hillsdale, NJ: Lawrence Erlbaum Associates, 1980.
12. Tversky A, Gati I. Similarity, separability, and the triangle
inequality. Psychol Rev 1982;89:123-54.
13. Gore SM. Assessing methods - transforming the data. Br Med J
1981;283:348-500.
14. Jacquez JA. Compartmental analysis in biology and medicine.
New York: Elsevier, 1972.
15. Peterson PH, Groth T, Hjelm M. Characterization of increased
synthesis of acute phase proteins from plasma concentration
measurements: correction of sampling errors and of disturbing variations of plasma volume, exchange fluxes, and catabolic rates in the
clinical situation. J Clin Comp 1979;8:180-201.
16. Parkerson GR Jr. Labstand: a computerized system for reporting clinical laboratory data in standard units. J Family Practice
1978;6:611-20.
17. Harvey AM, Johns RJ, McKusick VJ, Owens AH, Ross RB, eds.
The principles and practice of medicine, 20th ed. New York:
Appleton-Century-Crofts, 1980.
18. Roberts B. Diagnostic assessment of myocardial infarction
based on lactate dehydrogenase and creatine kinase isoenzymes.
Heart Lung 1981;10:486-506.
19. Coombs CH, Dawes RM, Tversky A. Mathematical psychology:
an elementary introduction. Englewood Cliffs, NJ: Prentice-Hall,
1970.
20. Politser PE, Powell S, Fink J. A new method for reporting the
sources of abnormal activities of lactate dehydrogenase in serum.
Clin Chem 1986;32:1517-24.
21. Payne JW. Contingent decision behavior. Psychol Bull
1982;92:382-402.
22. Johnson MD, Russo JE. Product familiarity and learning new
information. J Consumer Res 1984;11:542-50.
23. Larkin J, McDermott J, Simon DP, Simon HA. Expert and
novice performance in solving physics problems. Science
1980;208:1335-42.
24. Prinzmetal W, Banks WP. Good continuation affects visual
detection. Perception Psychophysics 1977;21:389-95.