This article was originally published in Brain Mapping: An Encyclopedic
Reference, published by Elsevier, and the attached copy is provided by
Elsevier for the author's benefit and for the benefit of the author's institution,
for non-commercial research and educational use including without limitation
use in instruction at your institution, sending it to specific colleagues who you
know, and providing a copy to your institution’s administrator.
All other uses, reproduction and distribution, including without limitation
commercial reprints, selling or licensing copies or access, or posting on open
internet sites, your personal or institution’s website or repository, are
prohibited. For exceptions, permission may be sought for such use through
Elsevier's permissions site at:
http://www.elsevier.com/locate/permissionusematerial
Warfield S.K., and Tomas-Fernandez X. (2015) Lesion Segmentation. In: Arthur
W. Toga, editor. Brain Mapping: An Encyclopedic Reference, vol. 1, pp. 323-332. Academic Press: Elsevier. http://dx.doi.org/10.1016/B978-0-12-397025-1.00302-X
Author's personal copy
Lesion Segmentation
SK Warfield and X Tomas-Fernandez, Harvard Medical School, Boston MA, USA
© 2015 Elsevier Inc. All rights reserved.
Glossary
Lesion A lesion is any kind of abnormality in the brain.
Magnetic resonance imaging (MRI) MRI is a type of
imaging that uses nonionizing radio frequency energy to
spatially encode the distribution of tissues in the brain and
body.
Segmentation The delineation of the location and extent of
structures visible in images.
Introduction
The development of imaging strategies for the optimal detection and characterization of lesions continues at a rapid pace.
Several modalities are in common use, including magnetic
resonance imaging (MRI), ultrasound, computed tomography,
and positron emission tomography (PET). Each modality is
appropriate for certain types of lesions, but MRI is particularly
attractive due to its lack of ionizing radiation and the flexibility
of contrast mechanisms that it provides.
Expert and Interactive Segmentation
In routine clinical practice, the detection of lesions is important for diagnosis, directing intervention, and assessing
response to therapy. In clinical trials, it is often important to
have effective measures of the number of lesions, the size of
lesions, and how they change over time. Volumetric assessment of lesions is best carried out by segmentation of the
lesion, in which every voxel that is part of the lesion is delineated. This allows characterization of the entire volume of the
lesion and further measures such as lesion heterogeneity and
lesion shape. Furthermore, it allows the assessment of potential imaging biomarkers of response to therapy in the lesion,
such as diffusion-weighted imaging (DWI) measures of cellularity or perfusion, or PET measures of metabolic activity.
Segmentation is usually carried out by an expert who is
trained to recognize normal anatomy and lesions in a particular modality or modalities under study. Most commonly, the
expert will delineate the lesion or lesions that they see in the
images interactively. A number of excellent software tools are
available to facilitate the delineation of user-observed regions
of interest.
However, the task of segmentation is challenging for experts
to carry out and leads to segmentations with errors in which
some voxels are incorrectly labeled. Expert segmentations may
have errors due to loss of attention or fatigue, due to changes in
perception over short or long periods of time, or due to subjective differences in judgment in regions in which the correct
decision is unclear. These errors may be well characterized as
locally random mislabeling and as structurally correlated
errors, such as consistent mislocalization of a segment of a
boundary.
Careful management of perception of the boundary can be
a challenge and depends on characteristics of the image such as
display of contrast and the workspace environment. For example, a laterality bias in visual perception was identified as the
source of left–right asymmetry in some manual segmentations
and was found to be especially prominent in the hippocampus
(Maltbie et al., 2012). If present, this can be managed by
mirroring the images across the left–right plane of symmetry
and segmenting each structure twice, once appearing on the
left-hand side and once on the right-hand side, and then
averaging (Thompson et al., 2009). This is time-consuming
and therefore expensive and may be avoided by careful management of the expert’s perception.
The test–retest reproducibility of interactive segmentation
has been characterized. In general, it has been found that an
expert rater will be more successful when the boundary of the
structure being delineated is readily observed and has a simple shape. Long and complicated boundaries are more difficult
to segment and lead to a reduction in interrater reliability
(Kikinis et al., 1992). Cortical gray matter, for example, can
be challenging to delineate (Warfield et al., 1995).
Variability in Lesion Segmentation
The interactive detection and delineation of the complete
extent of lesions by experts is very challenging to achieve.
As for normal anatomical structures with long and complex
boundaries, or with heterogeneous tissue contrast, the
test–retest reproducibility of lesion detection and lesion segmentation has been low.
Quantitative assessment in multiple sclerosis (MS) is critical
both in understanding the natural history of disease and in
monitoring the effects of available therapies. Conventional
MRI-based measures include central nervous system atrophy
(Bermel & Bakshi, 2006), contrast-enhanced lesion count
(Barkhof et al., 2012), and T2w hyperintense lesion count
(Guttmann, Ahn, Hsu, Kikinis, & Jolesz, 1995). Such measures
have served as primary outcome in phase I and II trials and as
secondary outcome in phase III trials (Miller et al., 2004).
However, the quantitative analysis of lesion load is not without
difficulties. Because the natural change in lesion load year to
year is generally small, measurement error or variation in
lesion load assessment must be reduced as far as possible to
maximize the ability to detect progression. Ideally, measurement errors should be significantly less than the natural variability that occurs in individual patients over time (Wei,
Guttmann, Warfield, Eliasziw, & Mitchell, 2004). Although
several factors influence lesion load measurements in MS, only
the variability introduced by the human operator who performs the measurements has been studied in detail.
Standard image analysis methods currently utilized in clinical trials are largely manual. Manual segmentation is difficult,
time-consuming, and costly. Errors occur due to low lesion
contrast and unclear boundaries caused by changing tissue
properties and partial volume effects. Segmentation inconsistencies are common even among qualified experts. Many studies have investigated the wide variability inherent to manual
MS lesion segmentation, finding an interrater volume variability of 14% and an intrarater volume variability of 6.5%
(Filippi, Horsfield, Bressi, et al., 1995). Further, other studies
have reported interrater lesion volume differences ranging
from 10% to 68% (Grimaud, Lai, Thorpe, & Adeleine, 1996;
Styne et al., 2008; Zijdenbos, Forghani, & Evans, 2002). Moreover, in a longitudinal interferon beta-1b study (Paty
& Li, 1993), the authors attributed a significant decrease in MS
lesion volume during the third year of the study to a
methodological change applied by the single observer who
performed the measurements. Because the same change was
applied consistently to all scans, it did not affect the
intergroup differences found, but it underscored the need for rigorous
quality control checks during long-term studies.
To reduce the intra- and interrater variability inherent in
manual lesion segmentation, many semiautomatic methods
have been proposed. These algorithms require the human
rater to identify the location of each lesion by clicking on the
center of the lesion and then automatically delineate the extent
of the lesion. In this way, the detection of the lesion relies on
the expert judgment, but the extent of the lesion is determined
by an automatic rule. A variety of rules to estimate the boundaries of each identified lesion have been investigated, including
the use of a local intensity threshold (Filippi, Horsfield, Tofts,
et al., 1995), region growing (Ashton et al., 2003), fuzzy connectedness (Udupa et al., 1997), intensity gradient (Grimaud
et al., 1996), or statistical shape priors (Shepherd, Prince, &
Alexander, 2012).
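The click-then-delineate idea can be sketched as seeded region growing with a local intensity threshold. This is a minimal illustration under stated assumptions and not any of the cited methods: the function name `grow_lesion`, the 2D list-of-lists image, the 4-connectivity, and the `tolerance` parameter are all hypothetical choices made for the example.

```python
from collections import deque

def grow_lesion(image, seed, tolerance):
    """Return the set of (row, col) pixels 4-connected to `seed` whose
    intensity lies within `tolerance` of the seed intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                # Local intensity rule (assumed): stay close to the seed value.
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# Toy image: a bright 2x2 "lesion" plus one isolated bright pixel.
toy_image = [
    [10, 10, 10, 10, 10],
    [10, 90, 95, 10, 10],
    [10, 92, 88, 10, 10],
    [10, 10, 10, 10, 91],
]
# The rater's "click" is the seed; the extent is found automatically.
lesion = grow_lesion(toy_image, seed=(1, 1), tolerance=10)
print(sorted(lesion))
```

The isolated bright pixel is not included because it is not connected to the seed, which mirrors the text: detection depends on where the rater clicks, while the extent follows the automatic rule.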
Semiautomatic lesion segmentation has demonstrated
reduced intrarater variability, but interrater variability is still
an issue due to the initialization by manual lesion identification. Given this, a substantial effort has been devoted to the
development of fully automatic segmentation algorithms capable of detecting and delineating lesions, especially in MS.
Lesion Segmentation Validation
Validation of segmentation in medical imaging is a challenging
task due to the scarcity of an appropriate reference standard to
which results of any segmentation approach can be compared.
Comparison to histology is helpful, but rarely available for
clinical data, and directly relating histology to MRI can be
difficult (Clarke et al., 1995). Consequently, validation studies
typically rely on expert evaluation of the imaging data. The
intra- and interexpert variability of manual segmentation
makes it challenging to distinguish the dissimilarities between
manual and automatic segmentation methods caused by errors
in the segmentation algorithm from those caused by variability
in the manual segmentation.
An excellent approach that overcomes the inter- and intraexpert reference variability is evaluation using synthetic image data (Kwan, Evans, & Pike, 1999). Since the
correct segmentation is known, this allows for direct comparison to the results of automatic segmentation algorithms.
Unfortunately, simulated images may not exhibit the wide
range of anatomy and acquisition artifacts found in clinical
data, and therefore, the conclusions may not generalize to the
broader range found in images of patients.
Given that expert measurements are highly variable, any
validation should always evaluate automatic segmentation
accuracy against a series of repeated measurements by multiple
experts. These multiple expert segmentations can be combined
using STAPLE (Akhondi-Asl & Warfield, 2013; Commowick,
Akhondi-Asl, & Warfield, 2012; Commowick & Warfield,
2010; Warfield, Zou, & Wells, 2004), which provides an optimal weighting of each expert segmentation, based on the comparison of each segmentation to a hidden reference standard
segmentation. The confidence of the expert performance estimates can also be estimated, indicating whether or not sufficient data are available to have high confidence in the reference
standard and the expert performance assessments. Ultimately,
the best automated segmentation algorithms should have an
accuracy similar to that of the best expert segmentations, but
with higher reproducibility.
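The idea of weighting raters against a hidden reference can be illustrated for binary labels with a small expectation-maximization loop. The sketch below is a simplified illustration and not the published STAPLE implementation: it assumes a single global prior, flat voxel lists, fixed initial performance values of 0.9, and a fixed iteration count, and the name `staple_binary` is hypothetical.

```python
def staple_binary(decisions, prior=None, iterations=50):
    """decisions[j][i] is rater j's binary label (0 or 1) for voxel i.
    Returns (weights, sensitivity, specificity), where weights[i] is the
    estimated probability that voxel i is lesion in the hidden reference."""
    n_raters = len(decisions)
    n_voxels = len(decisions[0])
    if prior is None:
        # Assumed prior: overall fraction of voxels labeled as lesion.
        prior = sum(map(sum, decisions)) / (n_raters * n_voxels)
    sens = [0.9] * n_raters
    spec = [0.9] * n_raters
    weights = [prior] * n_voxels
    for _ in range(iterations):
        # E-step: probability of lesion at each voxel given rater performance.
        for i in range(n_voxels):
            a, b = prior, 1.0 - prior
            for j in range(n_raters):
                if decisions[j][i]:
                    a *= sens[j]
                    b *= 1.0 - spec[j]
                else:
                    a *= 1.0 - sens[j]
                    b *= spec[j]
            weights[i] = a / (a + b) if a + b > 0 else 0.0
        # M-step: re-estimate each rater's sensitivity and specificity.
        total_fg = sum(weights)
        total_bg = n_voxels - total_fg
        for j in range(n_raters):
            hit = sum(w for w, d in zip(weights, decisions[j]) if d)
            rej = sum(1.0 - w for w, d in zip(weights, decisions[j]) if not d)
            sens[j] = hit / (total_fg or 1.0)
            spec[j] = rej / (total_bg or 1.0)
    return weights, sens, spec

# Hypothetical raters: A and B agree, C oversegments by two voxels.
rater_a = [1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 0, 0, 0, 0, 0]
rater_c = [1, 1, 1, 1, 1, 0, 0, 0]
weights, sens, spec = staple_binary([rater_a, rater_b, rater_c])
print([round(w, 3) for w in weights], [round(s, 3) for s in spec])
```

In this toy case the two voxels marked only by rater C receive a low reference probability, and rater C is assigned a lower specificity than the other two raters.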
Validation Metrics
Two main aspects characterize the validation of a segmentation
algorithm: accuracy and reproducibility.
Accuracy
The accuracy of segmentation can be evaluated in many different ways. A sensible evaluation criterion depends on the purpose of the segmentation procedure. If the goal is to estimate
the lesion volume, a measure often referred to as total lesion
load (TLL), the volumetric error would be the criterion of choice
(García-Lorenzo, Prima, Arnold, Collins, & Barillot, 2011;
Shiee et al., 2010; Van Leemput, Maes, Vandermeulen,
Colchester, & Suetens, 2001). The main limitation of such an
approach is that it does not provide information regarding
the overlap with the reference segmentation. Thus, a segmentation with exactly the same volume as the reference can be
completely wrong if a voxel-by-voxel comparison is made. It
has been demonstrated that high TLL correlation can be
achieved even when the degree of precise spatial
correspondence is poor. For example, Van Leemput et al. (2001)
reported a high TLL correlation but considerable disagreement
in spatial overlap between expert segmentations and between
expert and automatic measurements.
Commonly, brain segmentation literature describes the
spatial overlap of segmentations by means of the Dice similarity coefficient (DSC) (Dice, 1945). The DSC between the automatic and reference segmentations is defined as the ratio of
twice the overlapping area to the sum of the individual areas.
The value of the index varies between 0 (no overlap) and 1
(complete overlap with the reference). This is an excellent
measure if the detection of every voxel of every lesion is critical.
In practice, evaluation of DSC of MS lesion segmentations is
dependent on the TLL of the patients (Zijdenbos, Dawant,
Margolin, & Palmer, 1994). This is in part because scans depicting high lesion burden will typically have some lesions with
unambiguous boundaries. Thus, DSC heavily reflects the presence of lesions with easy to detect boundaries, which are more
likely to be present in patients with an increased lesion burden
and less likely to occur in patients with a lower lesion burden.
The variation in the contrast of the boundaries of different
lesions has led to efforts to find alternative measures of
accuracy.
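The contrast between volumetric agreement and spatial overlap can be made concrete with a short sketch. It assumes segmentations are represented as sets of voxel coordinates, which is a choice made for this example; the toy masks are hypothetical.

```python
def dice(a, b):
    """Dice similarity coefficient: twice the overlap divided by the
    sum of the two segmentation sizes (sets of voxel coordinates)."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty segmentations
    return 2 * len(a & b) / (len(a) + len(b))

reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
shifted = {(5, 5), (5, 6), (6, 5), (6, 6)}   # same volume, wrong place
partial = {(0, 1), (1, 1), (0, 2), (1, 2)}   # same volume, half overlapping

for name, seg in (("shifted", shifted), ("partial", partial)):
    volume_error = len(seg) - len(reference)
    print(name, "volume error:", volume_error, "DSC:", dice(reference, seg))
```

Both toy segmentations have zero volumetric error, yet their DSC values are 0 and 0.5, which is the limitation of purely volumetric criteria described above.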
Given the disagreement in lesion boundaries among manual raters, some authors have proposed to validate lesion segmentation algorithms by reporting the number of correctly
detected lesions (Styne et al., 2008), where a lesion is defined
as detected if it overlaps at all with any lesion present in the
reference. Such a metric has the advantage of being insensitive
to error in the boundary of the lesion localization in the
manual reference standard segmentations. However, such
lesion counting measures cannot give information about the
accuracy of the boundary localization of the lesion. A commonly accepted recommendation is that validation measures
should assess both lesion detection and lesion delineation
accuracy (Wack et al., 2012).
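A lesion-wise detection count of the kind described above can be sketched as follows. The example assumes 2D masks stored as sets of (row, col) coordinates and 4-connectivity to define individual lesions; the function names are hypothetical, and a reference lesion counts as detected if it shares at least one voxel with the automatic segmentation.

```python
def connected_lesions(voxels):
    """Split a set of (row, col) voxels into 4-connected components."""
    remaining = set(voxels)
    lesions = []
    while remaining:
        start = remaining.pop()
        component = {start}
        stack = [start]
        while stack:
            r, c = stack.pop()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    stack.append(nb)
        lesions.append(component)
    return lesions

def lesion_detection_counts(reference, automatic):
    """Return (detected, missed, false_positive) lesion counts."""
    ref_lesions = connected_lesions(reference)
    auto_lesions = connected_lesions(automatic)
    detected = sum(1 for lesion in ref_lesions if lesion & automatic)
    false_pos = sum(1 for lesion in auto_lesions if not (lesion & reference))
    return detected, len(ref_lesions) - detected, false_pos

reference = {(0, 0), (0, 1), (4, 4)}    # two reference lesions
automatic = {(0, 1), (0, 2), (8, 8)}    # one overlapping, one spurious
counts = lesion_detection_counts(reference, automatic)
print(counts)
```

The count ignores how well the boundary of the detected lesion matches, which is why it is usually reported together with an overlap measure.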
Reproducibility
High reproducibility of accurate segmentation is crucial for
longitudinal trials to ensure that differences in segmentations
obtained over time result from changes in the pathology and
not from the variability of the segmentation approach.
To test interscan variability, MS patients may undergo a
scan–reposition–scan experiment. As scans are obtained
within the same imaging session, it is assumed that the disease
has not evolved during this period. Such an approach was used
in Kikinis et al. (1999) and Wei et al. (2002), where reproducibility was measured using the coefficient of variation of
the TLL.
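As a small worked example, the coefficient of variation is the standard deviation of repeated TLL measurements divided by their mean. The measurements below are hypothetical values chosen for illustration, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

# Hypothetical TLL (in cm^3) from three scan-reposition-scan repeats.
tll_repeats = [10.2, 9.8, 10.0]
cv = coefficient_of_variation(tll_repeats)
print(f"CoV = {100 * cv:.1f}%")
```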
Reproducibility is a necessary but not sufficient part of
validation. One still needs to show that the method is accurate
and sensitive to changes in input data. Measuring accuracy
requires an independent estimate of the ground truth, an
often difficult task when using clinical data.
Validation Datasets
In order to provide objective assessments of segmentation
performance, there is a need for an objective reference standard
with associated MRI scans that exhibit the same major segmentation challenges as scans of patients. A database of
clinical MR images, along with their segmentations, may provide the means to measure the performance of an algorithm
by comparing the results against the variability of the expert
segmentations. However, an objective evaluation to systematically compare different segmentation algorithms also needs
an accurate reference standard.
An example of such a reference standard is the synthetic
brain MRI database provided by the Montreal Neurological
Institute that is a common standard for evaluating the segmentations of MS patients. The synthetic MS brain phantom available from the McConnell Brain Imaging Centre consists of
T1w, T2w, and proton density MRI sequences with different
acquisition parameters as well as noise and intensity
inhomogeneity levels (Kwan et al., 1999). The MS brain phantom was based on the original BrainWeb healthy phantom,
which had been expanded to capture three different MS lesion
loads: mild (0.4 cm³), moderate (3.5 cm³), and severe
(10.1 cm³). Each MS phantom was provided with its own MS
lesion ground truth.
Although the BrainWeb synthetic dataset provides a reference standard, it has several limitations. First, the BrainWeb dataset provides only one brain model, which results in a
poor characterization of the anatomical variability present in
the MS population. Also, although the BrainWeb dataset is
based on real MRI data, the final model is not equivalent to
clinical scans in its contrast, and it produces an easier dataset to
segment than real clinical scans.
To overcome these limitations, most lesion segmentation algorithms are also evaluated on a dataset consisting of clinical scans. Such an approach allows for a better
understanding of the performance of the evaluated algorithms
when faced with real data. Unfortunately, because each segmentation algorithm is validated with different datasets, comparison between different methodologies is more difficult.
A publicly available dataset for the
validation of MS lesion segmentation was released for the MS
Segmentation Grand Challenge held during the Medical Image Computing and Computer Assisted Intervention (MICCAI
2008) conference (Styne et al., 2008). For this event, the University of North Carolina at Chapel Hill (UNC) and Boston
Children’s Hospital (BCH) released a database of MS MRI
scans that contains anatomical MRI scans from 51 subjects
with MS.
Images were placed into two groups: a 20-subject training
group and a 31-subject testing group, the balance of the original 51-subject cohort. MS lesion manual reference data were
only available for those subjects in the training group. Organizers retained and continue to hold secret the interactively
delineated reference standard lesion segmentations of the testing group. To evaluate the performance of any segmentation
algorithm, researchers may upload their automatic segmentations of the testing data to the challenge website, where a
number of performance metrics are computed and an overall
performance ranking is provided. Since the competitors do not
have access to the reference standard segmentation, this evaluation of publicly available scans allows for a truly objective
comparison.
Intensity Artifact Compensation, Normalization, and Matching
The MRI intensity scale in conventional structural imaging has
no absolute, physical meaning. Instead, images are formed
with a contrast that is related to spin density, T1 relaxation,
and T2 relaxation, without quantifying the precise value of
these parameters. As a consequence, the image intensities and
contrast are dependent on the particular pulse sequence, static
magnetic field strength, and imaging parameter settings such as
flip angle.
In addition, several phenomena of the physics of acquisition lead to a spatially varying intensity inhomogeneity,
which may be severe enough in some cases to perturb image
segmentation. These intensity nonuniformities arise from
radio frequency coil nonuniformity and coupling with the
patient (de Zwart et al., 2004). They can be compensated for
by measuring the RF receive profiles from a homogeneous
transmit field (Kaza, Klose, & Lotze, 2011). Filtering based on
the concept of separating low-frequency artifact from signal
through homomorphic filtering has also been widely used
(Brinkmann, Manduca, & Robb, 1998; Sled, Zijdenbos, &
Evans, 1998).
For accurate and reproducible segmentation, it is important
that the boundaries between structures in the
images can be located despite these potential variations
in signal intensity. This can be facilitated by the creation of new
images in which the intensities are more similar between
subjects.
Nyúl and Udupa (1999) proposed a piecewise linear mapping that adjusts the intensity histogram of an input image so
that it matches a reference histogram based on a set of predefined landmarks. Similar approaches based on intensity rescaling have been extensively used in MS lesion segmentation
(Anbeek, Vincken, van Osch, Bisschops, & van der Grond,
2004; Datta & Narayana, 2013; Shah et al., 2011).
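A landmark-based piecewise linear mapping of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the published method: the landmarks are taken to be fixed percentiles of the intensity distribution, intensities outside the outermost landmarks are clamped, and all function names are hypothetical.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def landmarks(values, quantiles=(1, 25, 50, 75, 99)):
    """Histogram landmarks, assumed here to be fixed percentiles."""
    ordered = sorted(values)
    return [percentile(ordered, q) for q in quantiles]

def piecewise_linear_map(value, src, dst):
    """Map `value` from the source landmarks onto the reference landmarks."""
    if value <= src[0]:
        return dst[0]      # clamp below the first landmark (simplification)
    if value >= src[-1]:
        return dst[-1]     # clamp above the last landmark (simplification)
    for k in range(len(src) - 1):
        if src[k] <= value <= src[k + 1]:
            if src[k + 1] == src[k]:
                return dst[k]
            t = (value - src[k]) / (src[k + 1] - src[k])
            return dst[k] + t * (dst[k + 1] - dst[k])

# Hypothetical intensities: the reference scan is twice as bright.
input_scan = list(range(101))
reference_scan = [2 * v for v in range(101)]
src = landmarks(input_scan)
dst = landmarks(reference_scan)
mapped = piecewise_linear_map(60, src, dst)
print(src, dst, mapped)
```

After mapping, an intensity of 60 in the input scan lands at 120, the value the same tissue would have on the reference intensity scale in this toy setting.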
An adaptive segmentation algorithm was developed that
achieved tissue segmentation and intensity inhomogeneity
compensation with an expectation–maximization (EM) algorithm (Wells, Grimson, Kikinis, & Jolesz, 1996). The intensity
model was learned through supervised classification, requiring
an interactive training for each imaging protocol. Since the
intensity adaptation utilizes the same intensity model for all
subjects, the final intensity-compensated images have the same
range of intensity distributions. This enables compensation for
intersubject and intrasubject intensity inhomogeneities.
In order to avoid interactive training of the intensity distributions, while still achieving intersubject MRI intensity matching, Weisenfeld and Warfield (2004) developed an algorithm
based on finding a smoothly varying intensity modulation
field that minimized the Kullback–Leibler divergence between
pairs of acquisitions. This algorithm was able to simultaneously use T1w and T2w images, from pairs of scans of subjects, in order to identify an intensity transformation field that
drove the intensity distribution of the scan of one subject to
closely match the intensity distribution of the scan of the
second subject. This achieved intensity matching across scans.
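The objective in such a matching scheme is a divergence between intensity distributions. The sketch below shows only the Kullback-Leibler divergence between two normalized histograms; it does not attempt the estimation of the smooth modulation field, and the histograms and the small `eps` guard are assumptions made for the example.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two normalized
    histograms with the same bins; eps guards against empty bins in q."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical two-bin intensity histograms of two scans.
scan_a = [0.5, 0.5]
scan_b = [0.9, 0.1]
print(kl_divergence(scan_a, scan_a), kl_divergence(scan_a, scan_b))
```

A matching algorithm of the kind described would adjust the intensities of one scan so that this quantity decreases toward zero.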
Automated Lesion Segmentation Algorithms
The challenges of interactive and semiautomated lesion
segmentation have led to the development of fully automated
lesion segmentation algorithms. This work has grown out of early
efforts to develop segmentation algorithms for normal brain
tissue (Clarke et al., 1995; Vannier, Butterfield, & Jordan, 1985;
Vannier, Butterfield, Jordan, Murphy, Levitt, & Gado, 1985).
Segmentation in healthy brain MRI has been the topic of a
great deal of study, with most successful algorithms employing
voxelwise, intensity feature space-based classification. The
basic strategy is usually based on statistical classification theory. Given a multispectral grayscale MRI (i.e., T1w, T2w, and
fluid attenuated inversion recovery (FLAIR)) formed by a finite
set of N voxels, and the multispectral vector of observed
intensities $Y = (Y_1, \ldots, Y_N)$ with $Y_i \in \mathbb{R}^m$, a statistical classifier
algorithm seeks to estimate $Z_i$, a categorical random variable
referring to tissue class label by maximizing $p(Z_i \mid Y_i)$, the probability of the class from the observed intensity at the given
voxel. A Bayesian formulation of voxelwise, intensity-based
classification can be posed as follows:
$$p(Z_i \mid Y_i) = \frac{p(Y_i \mid Z_i)\, p(Z)}{\sum_{j=0}^{K} p(Y_i \mid Z_i = j)\, p(Z = j)}$$
The term $p(Y_i \mid Z_i = j)$ is the likelihood of the observed feature
vector $Y_i$ and $p(Z)$ is the tissue prior probability. The usefulness
of such a classification scheme was demonstrated in Vannier,
Butterfield, and Jordan (1985) with both a supervised
classification and an unsupervised classification on brain
MRI data.
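The Bayesian rule above can be illustrated with a single-channel Gaussian likelihood per tissue class. This is a minimal sketch under stated assumptions: the text describes multispectral feature vectors, whereas the example uses one intensity channel, and the class means, standard deviations, and priors are hypothetical numbers.

```python
import math

def gaussian_pdf(y, mean, std):
    """Univariate Gaussian density, used here as the class likelihood."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(y, classes):
    """classes maps a label to (mean, std, prior).
    Returns the posterior probability of each label given intensity y."""
    joint = {
        label: gaussian_pdf(y, mean, std) * prior
        for label, (mean, std, prior) in classes.items()
    }
    evidence = sum(joint.values())  # denominator of the Bayes rule
    return {label: value / evidence for label, value in joint.items()}

# Hypothetical tissue model: (mean intensity, std, global prior).
tissue_model = {
    "CSF": (30.0, 10.0, 0.2),
    "GM": (80.0, 10.0, 0.4),
    "WM": (120.0, 10.0, 0.4),
}
post = posterior(82.0, tissue_model)
label = max(post, key=post.get)  # maximum a posteriori label
print(label, {k: round(v, 4) for k, v in post.items()})
```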
Tissue segmentation algorithms differ in the estimation of
the likelihood $p(Y_i \mid Z_i = j)$ and the tissue prior models $p(Z)$. In
Wells et al. (1996), an algorithm suitable for images corrupted
by a spatially varying intensity artifact was proposed and
devised as an EM algorithm for simultaneously estimating the
posterior probabilities p(Zi|Yi) and the parameters of a model
of the intensity artifact. They modeled the likelihoods both
parametrically as Gaussians and nonparametrically using Parzen windowing. Van Leemput, Maes, Vandermeulen, and
Suetens (1999) extended Wells’ EM scheme to also update
the means and variances of tissue class Gaussians and also to
include both a spatially varying prior and a Markov random
field (MRF) spatial homogeneity constraint, replacing the
global tissue prior with the product of a spatially varying
prior $p(Z_i)$ and a prior based on the MRF neighborhood
$p_{\partial}(Z_i)$. Updating the model to include a spatially varying
prior and an MRF prior model results in the following Bayesian
formulation of voxelwise, intensity-based classification:
$$p(Z_i \mid Y_i) = \frac{p(Y_i \mid Z_i)\, p(Z_i)\, p_{\partial}(Z_i)}{\sum_{j=0}^{K} p(Y_i \mid Z_i = j)\, p(Z_i = j)\, p_{\partial}(Z_i = j)}$$
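The effect of the two additional priors can be sketched for a single voxel. The form of the neighborhood term is not specified in the text, so the example assumes a Potts-style term that rewards agreement with the current labels of neighboring voxels; the strength `beta`, the atlas probabilities, the class parameters, and the function names are all hypothetical.

```python
import math

def gaussian_pdf(y, mean, std):
    """Univariate Gaussian density, used here as the class likelihood."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mrf_prior(label, neighbor_labels, beta=0.5):
    """Unnormalized Potts-style neighborhood prior (assumed form):
    larger when more neighbors carry the same label."""
    agree = sum(1 for n in neighbor_labels if n == label)
    return math.exp(beta * agree)

def posterior_with_spatial_priors(y, likelihoods, atlas_prior, neighbor_labels, beta=0.5):
    """Posterior proportional to likelihood x atlas prior x neighborhood prior."""
    joint = {}
    for label, (mean, std) in likelihoods.items():
        joint[label] = (
            gaussian_pdf(y, mean, std)
            * atlas_prior[label]
            * mrf_prior(label, neighbor_labels, beta)
        )
    total = sum(joint.values())
    return {label: value / total for label, value in joint.items()}

# A voxel whose intensity is exactly ambiguous between GM and WM.
likelihoods = {"GM": (80.0, 10.0), "WM": (120.0, 10.0)}
atlas_prior = {"GM": 0.5, "WM": 0.5}
neighbors = ["WM"] * 5 + ["GM"]
post = posterior_with_spatial_priors(100.0, likelihoods, atlas_prior, neighbors)
print({k: round(v, 4) for k, v in post.items()})
```

With equal likelihoods and equal atlas priors, the neighborhood term alone decides the label, which is the spatial homogeneity constraint at work.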
Given the success of this approach for healthy brain
MRI tissue segmentation, the first attempts to automate MS lesion segmentation modified these voxelwise, intensity-based
classifiers to model white matter (WM) lesions on MRI as an
additional tissue class. These first attempts produced MS lesion
segmentations burdened with false-positive misclassifications,
mainly in the sulcal gray matter (GM) (Kapouleas, 1989).
Any classification algorithm estimates an optimal boundary
between tissue types on a given feature space. Thus, tissue
classification relies on contrast between tissue types (i.e., WM
and MS lesions) on a particular feature space. However, the MS
lesion intensity distribution overlaps with that from healthy
tissues (Kamber, Louis Collins, Shinghal, Francis, & Evans,
1992; Zijdenbos et al., 1994); thus, an MRI intensity feature
space alone has limited ability to discriminate between MS
lesions and healthy brain tissues. This limitation, in turn,
generally results in lesion segmentation that is inaccurate and
hampered by false positives.
Attempts to deal with the overlapping intensity range of
healthy tissues and MS lesions led to increased development of
model-based systems, which encoded knowledge of brain
anatomy by means of a digital brain atlas with a priori tissue
probability maps. For instance, Kamber, Shinghal, Collins,
Francis, and Evans (1995) proposed a model that compensated
for the tissue class intensity overlap by using a probabilistic
model of the location of MS lesions. Many MS lesions appear
in the WM but have an intensity profile that includes an
unambiguously bright region and a surrounding region more
similar in intensity to gray matter. By confining the search for
MS lesions to those regions with at least a 50% prior probability of being WM, the incorrect classification of gray matter as
MS lesion was greatly reduced. More recently, Shiee et al.
(2010) used a topologically consistent atlas to constrain the
search of MS lesions.
Warfield et al. (1995) used a different approach where the
gray matter was segmented for each patient under analysis,
rather than using a probabilistic model of the average location
of the WM for all patients. By first successfully identifying all of
the gray matter, the segmentation of lesions was then made
possible through an optimal two-class minimum distance
classifier that distinguished normal WM from lesions.
This approach was able to correct both kinds of classification error: gray
matter labeled as MS lesion and MS lesion labeled as gray
matter. Later work by Warfield, Kaus, Jolesz, and Kikinis (2000)
extended the classifier intensity feature space by using a distance map generated from an aligned template segmentation
and demonstrated the efficacy of iterated segmentation and
nonrigid registration. The algorithm iterated between tissue
classification and elastic registration of the anatomical template to the segmentation of the subject generated by the
classifier, which led to an increasingly refined and improved
segmentation of normal anatomical structures and lesions.
An alternative approach attempting to improve lesion segmentation specificity proposed to extend the MRI intensity
feature space by including spatial features. Zijdenbos et al.
(2002) used an MRI intensity feature space that was extended
by the tissue probability of the given voxel based on a probabilistic tissue atlas. Instead of using the tissue prior probability,
Anbeek et al. (2004) and Hadjiprocopis and Tofts (2003)
proposed to extend the MRI intensity feature space by means
of the Cartesian and polar voxel coordinates. An alternative
way to encode spatial information was proposed by Younis,
Soliman, Kabuka, and John (2007), where local neighborhood
information was included by extending the voxel intensity
feature with the MRI intensities of the six neighboring
voxels. To account for the MRI intensity variability observed
at different parts of the brain, Harmouche, Collins, Arnold,
Francis, and Arbel (2006) proposed a Bayesian classification
approach that incorporates voxel spatial location in a standardized anatomical coordinate system and neighborhood information using MRF.
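The spatial extensions of the feature space described above can be illustrated by building one feature vector per voxel. The sketch combines two of the ideas in a single vector purely for illustration and does not reproduce any one cited method: it appends the six face-neighbor intensities and the Cartesian coordinates to the voxel intensity. The nested-list volume, the boundary rule, and the function name are assumptions.

```python
def voxel_features(volume, z, y, x):
    """Feature vector for voxel (z, y, x) of a 3D nested-list volume:
    [intensity, six face-neighbor intensities, x, y, z].
    Neighbors outside the volume reuse the center intensity (assumed rule)."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    center = volume[z][y][x]
    features = [center]
    offsets = ((-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1))
    for dz, dy, dx in offsets:
        nz, ny, nx = z + dz, y + dy, x + dx
        inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
        features.append(volume[nz][ny][nx] if inside else center)
    features.extend([x, y, z])  # Cartesian voxel coordinates
    return features

# Toy 3x3x3 volume whose intensity encodes the position as 100*z + 10*y + x.
volume = [[[100 * z + 10 * y + x for x in range(3)] for y in range(3)] for z in range(3)]
features = voxel_features(volume, 1, 1, 1)
print(features)
```

A classifier trained on such vectors can learn that the same intensity means different things at different locations or in different local contexts.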
More recently, some authors, instead of relying on a specific
set of features, proposed selecting the most discriminative features
from large sets including voxel intensities, spatial coordinates,
tissue prior probabilities, shape filters, curvature filters, and
intensity derivatives. For instance, Morra, Tu, Toga, and
Thompson (2008) and Wels, Huber, and Hornegger (2008)
introduced tens of thousands of features in a classification
process using an AdaBoost algorithm with a probabilistic
boosting tree to improve the training process. Another method
(Kroon et al., 2008) employed principal component analysis
to select those features explaining the greatest variability of the
training data, and then a threshold was computed in the new
coordinate system to perform the lesion segmentation. An
alternative approach was proposed by Geremia et al. (2011)
who used a feature space composed of local and context-rich
features. Context-rich features compare the intensities of the
voxel of interest with distant regions either in an extended
neighborhood or in the symmetrical counterpart with respect
to the midsagittal plane. This set of features was employed with
a random decision forest classifier to segment MS lesions.
Furthermore, after analysis of the decision forest fitting process, the authors reported that the most discriminative features
towards MS lesion segmentation were FLAIR intensities and
the spatial tissue prior probability.
The role of FLAIR was demonstrated by de Boer et al.
(2009), where a model of MS lesions surrounded mostly by
WM voxels was used again. Gray matter, WM, and CSF were
segmented but with false-positives possible due to the intensity
overlap of lesions with normal tissues. An optimal FLAIR
intensity threshold based on the region of gray matter segmentation was then computed, and lesion false-positives were
removed by a heuristic rule of eliminating lesion candidates
outside a region of likely WM. Similarly, Datta and Narayana
(2013) rejected segmented lesions located in the cortical gray
matter or in the choroid plexus by means of the ICBM tissue
atlas. Furthermore, it has been proposed to enhance the
contrast between MS lesions and healthy tissues in FLAIR
scans prior to generating the lesion segmentation by intensity
thresholding (Bijar, Khayati, & Peñalver Benavent, 2013;
Souple et al., 2008).
Approaches to reduce the extent of lesion false-positives are
usually based on postprocessing steps, specifically experimentally tuned morphological operators, connectivity rules, and
minimum size thresholds, among others. However, these postprocessing steps may have to be retuned based on individual
features of each case or tailored to different subjects for different degrees of lesion burden.
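One of the postprocessing steps mentioned above, a minimum size threshold, can be sketched as follows. This is a minimal 2-D illustration under stated assumptions: the mask is a list of rows of 0/1 values, connectivity is 4-neighbor, and the name `remove_small_components` and the value of `min_size` are hypothetical. It is not taken from any of the cited methods.

```python
from collections import deque


def remove_small_components(mask, min_size):
    """Return a copy of a 2-D binary mask that keeps only the 4-connected
    components containing at least min_size pixels."""
    n_rows, n_cols = len(mask), len(mask[0])
    visited = [[False] * n_cols for _ in range(n_rows)]
    cleaned = [[0] * n_cols for _ in range(n_rows)]
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if not mask[r0][c0] or visited[r0][c0]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(r0, c0)])
            visited[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and mask[rr][cc] and not visited[rr][cc]):
                        visited[rr][cc] = True
                        queue.append((rr, cc))
            # Keep the component only if it reaches the size threshold.
            if len(component) >= min_size:
                for r, c in component:
                    cleaned[r][c] = 1
    return cleaned


candidates = [[1, 1, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 0, 0]]
cleaned = remove_small_components(candidates, min_size=2)
```

The need to choose `min_size` by hand is exactly the kind of tuning the paragraph above warns about: a value that suppresses noise in one subject may discard small true lesions in another.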
Considering that MS lesions exhibit a highly heterogeneous appearance, the selection of an appropriately sensitive
and specific classifier feature space has proved to be a daunting
task. Some authors proposed not to model the lesions, but to
consider them as intensity outliers to the model of normal-appearing
brain tissues. The advantage of such an approach is that it
avoids the need to model the heterogeneous intensity,
location, and shape of the lesions.
This approach was examined by Van Leemput et al. (2001),
where lesions were modeled as intensity outliers with respect
to a global Gaussian mixture model (GMM) initialized by an
aligned probabilistic tissue atlas. Similarly, Souple et al. (2008)
used a trimmed likelihood estimator (TLE) to estimate a ten-component GMM and modeled MS lesions as GM intensity
outliers on an enhanced FLAIR image. Additional methods
further combine a TLE with a mean shift algorithm (García-Lorenzo et al., 2011) or a hidden Markov chain (Bricq, Collet,
& Armspach, 2008).
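The idea of treating lesions as intensity outliers with respect to a model of normal tissue can be illustrated with a one-dimensional sketch. Here a voxel is flagged when it lies more than a chosen number of standard deviations from every tissue class, which is the scalar form of a Mahalanobis distance test. The class parameters, the threshold of 3, and the function name are assumptions made for illustration; the cited methods operate on multichannel data and estimate the model robustly.

```python
def flag_intensity_outliers(intensities, class_means, class_variances, threshold=3.0):
    """Flag each intensity that is farther than `threshold` standard
    deviations from every class mean (1-D Mahalanobis distance)."""
    flags = []
    for y in intensities:
        distances = [abs(y - mu) / var ** 0.5
                     for mu, var in zip(class_means, class_variances)]
        # An outlier is not well explained by any of the tissue classes.
        flags.append(min(distances) > threshold)
    return flags


# Hypothetical tissue model: three classes with standard deviation 5.
flags = flag_intensity_outliers([30.0, 60.0, 100.0, 200.0],
                                class_means=[30.0, 60.0, 100.0],
                                class_variances=[25.0, 25.0, 25.0])
```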
Given the presence of structural abnormalities (i.e., WM
lesions, brain atrophy, and blood vessels) in MS patients,
there is a need for estimation algorithms that are robust in
the presence of outliers. For instance, Prastawa, Bullitt, and Ho
(2004) proposed to edit the training data by means of a
minimum covariance determinant. In Cocosco, Zijdenbos,
and Evans (2003), a clustering solution was proposed based
on the geometry of tissue class distributions to reject training
data. Weisenfeld and Warfield (2009) demonstrated a registration and fusion algorithm that was able to automatically learn
training data of normal tissues for an optimal classifier, with an
accuracy indistinguishable from that of the best manual raters.
State-of-the-art lesion segmentation algorithms are primarily based on a patient global MRI intensity feature space, which
has limited sensitivity and specificity for MS lesions and
requires extensive postprocessing to achieve increased
accuracy. This limitation, in turn, results in MS lesion segmentation that is generally inaccurate and burdened with false positives. For instance, during the MS Grand Challenge
(Styne et al., 2008), the winning algorithm (Bricq et al.,
2008) reported a lesion false-positive rate (LFPR) of 55% and
a lesion true-positive rate (LTPR) of 42%. That is, of all the
lesion detections generated by the automatic algorithm,
about half were segmentation errors. Furthermore, the
best lesion segmentation algorithm at the Grand Challenge
was able to detect, on average, less than half of the existing
lesions. These results fall short of the performance of an
average human rater reported by the challenge organizers
(LTPR = 68% and LFPR = 32%).
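To make the two rates concrete, the sketch below computes them from lesion counts, following the reading given in the text: LTPR is the fraction of reference lesions that were detected, and LFPR is the fraction of automatic detections that do not correspond to a reference lesion. The counts in the example are hypothetical and were chosen only to reproduce the quoted percentages; they are not data from the challenge.

```python
def lesion_detection_rates(n_reference, n_reference_detected,
                           n_detections, n_false_detections):
    """Return (LTPR, LFPR) computed from lesion counts."""
    ltpr = n_reference_detected / n_reference   # detected share of true lesions
    lfpr = n_false_detections / n_detections    # erroneous share of detections
    return ltpr, lfpr


# Hypothetical counts: 21 of 50 reference lesions found,
# and 22 of 40 automatic detections are false.
ltpr, lfpr = lesion_detection_rates(50, 21, 40, 22)
```

Note that the two rates have different denominators, so they do not have to sum to one.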
Model of Population and Subject Intensities
To address these limitations, we have experimented with augmenting the imaging data used to identify lesions to include
both an intensity model of the patient under consideration and
a collection of intensity and segmentation templates that provide a model of normal tissue. We call this combination a
model of population and subject (MOPS) intensities.
Unlike the classical approach where lesions are characterized
by their intensity distribution compared to all brain tissues,
MOPS aims to distinguish locations in the brain with an abnormal intensity level when compared to the expected value in the
same location in a healthy reference population. This is achieved
by a tissue mixture model, which combines the MS patient
global tissue intensity model with a population local tissue
intensity model derived from a reference database of MRI
scans of healthy subjects (Tomas-Fernandez & Warfield, 2012).
Global GMM MRI Brain Tissue Segmentation
Consider a multispectral grayscale MRI (i.e., T1w, T2w, and
FLAIR) formed by a finite set of $N$ voxels. Our aim is to assign
each voxel to one of $K$ classes (i.e., GM, WM, and CSF) considering the observed intensities $Y = (Y_1, \ldots, Y_N)$ with $Y_i \in \mathbb{R}^m$. Both
observed intensities and hidden labels are considered to be
random variables denoted, respectively, as $Y = (Y_1, \ldots, Y_N)$ and
$Z = (Z_1, \ldots, Z_N)$. Each random variable $Z_i = e_k = (z_{i1}, \ldots, z_{iK})$ is a
$K$-dimensional vector with each component $z_{ik}$ being 1 or
0 according to whether $Y_i$ did or did not arise from the $k$th class.
It is assumed that the observed data $Y$ are described by the
conditional probability density function $f(Y \mid Z, \phi_Y)$ that incorporates the image formation model and the noise model and
depends on some parameters $\phi_Y$. Also, the hidden labels are
assumed to be drawn according to some parametric probability
distribution $f(Z \mid \phi_Z)$, which depends on parameters $\phi_Z$.
Segmenting the observed image $Y$ is to propose an estimate $\hat{Z}$ of $Z$ on the basis of $Y$. To this purpose, the parameter $\psi = (\phi_{Z1}, \ldots, \phi_{ZK}; \phi_{Y1}, \ldots, \phi_{YK})$ needs to be estimated.
If the underlying tissue segmentation $Z$ was
known, estimation of the model parameters would be
straightforward. However, only the image $Y$ is directly
observed, which makes it natural to treat this as a
missing-data problem and makes the EM algorithm the candidate for model fitting. The EM algorithm finds the
parameter $\psi$ that maximizes the complete data log-likelihood by iteratively maximizing the expected value of
the log-likelihood $\log f(Y, Z \mid \psi)$ of the complete data $(Y, Z)$,
where the expectation is based on the observed data $Y$
and the estimated parameters $\psi^m$ obtained in the previous
iteration $m$:
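$$
\begin{aligned}
\log L_C(\psi) &= \log f(Y, Z \mid \psi) \\
&= \log \prod_{i=1}^{N} \sum_{k=1}^{K} z_{ik}\, f(Z_i = e_k \mid \phi_{Zk})\, f(Y_i \mid Z_i = e_k, \phi_{Yk}) \\
&= \sum_{i=1}^{N} \sum_{k=1}^{K} z_{ik} \left( \log f(Z_i = e_k \mid \phi_{Zk}) + \log f(Y_i \mid Z_i = e_k, \phi_{Yk}) \right)
\end{aligned}
$$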
E-step: The E-step requires the computation of the conditional expectation of $\log L_C(\psi)$ given $Y$, using $\psi^m$ for $\psi$, which
can be written as
$$
Q(\psi; \psi^m) = E_{\psi^m}\left[ \log L_C(\psi) \mid Y \right]
$$
As the complete data log-likelihood $\log L_C(\psi)$ is linear in
the hidden labels $z_{ij}$, the E-step simply requires the calculation
of the current conditional expectation of $Z_i$ given the observed
data $Y$. Then,
$$
E_{\psi^m}(Z_i = e_j \mid Y) = \frac{f(Z_i = e_j \mid \phi^m_{Zj})\, f(Y_i \mid Z_i = e_j, \phi^m_{Yj})}{\sum_{k=1}^{K} f(Z_i = e_k \mid \phi^m_{Zk})\, f(Y_i \mid Z_i = e_k, \phi^m_{Yk})}
$$
which corresponds to the posterior probability that the $i$th member of the sample belongs to the $j$th class.
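The E-step can be sketched in a few lines for the special case of scalar intensities and Gaussian class densities. This is a simplified illustration, not the authors' implementation: the function names are hypothetical, the intensities are one-dimensional rather than multispectral, and no protection against numerical underflow is included.

```python
import math


def gaussian_pdf(y, mean, variance):
    """Density of a 1-D Gaussian evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def e_step(intensities, weights, means, variances):
    """Return posteriors[i][k]: the probability that voxel i belongs to
    class k given its intensity and the current parameter estimates."""
    posteriors = []
    for y in intensities:
        # Numerator of the posterior for each class: prior times likelihood.
        joint = [w * gaussian_pdf(y, mu, var)
                 for w, mu, var in zip(weights, means, variances)]
        total = sum(joint)  # denominator: sum over all classes
        posteriors.append([j / total for j in joint])
    return posteriors


# Hypothetical two-class model with equal mixing proportions.
posteriors = e_step([0.0, 10.0, 5.0], weights=[0.5, 0.5],
                    means=[0.0, 10.0], variances=[1.0, 1.0])
```

An intensity halfway between two equally weighted classes with equal variances receives a posterior of one half for each class, while intensities at a class mean are assigned almost entirely to that class.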
M-step: The M-step on the $m$th iteration requires the maximization of $Q(\psi; \psi^m)$ with respect to $\psi$ over the parameter
space to give the updated estimate $\psi^{m+1}$. The mixing proportions $\pi_k$ are calculated as follows:
$$
\pi_k = f(Z_i = e_k \mid \phi^{m+1}_{Zk}) = \frac{1}{N} \sum_{i=1}^{N} f(Z_i = e_k \mid Y_i, \psi^m)
$$
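The update of $\phi_Y$ on the M-step of the $(m+1)$th iteration
is estimated by maximizing the expectation of $\log L_C(\psi)$ with respect to $\phi_Y$, that is, by solving
$$
\sum_{i=1}^{N} \sum_{k=1}^{K} f(Z_i = e_k \mid Y_i, \psi^m)\, \frac{\partial \log f(Y_i \mid Z_i = e_k, \phi_{Yk})}{\partial \phi_Y} = 0
$$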
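Consider that $f(Y_i \mid Z_i = e_k, \phi_{Yk})$ is described by a Gaussian
distribution parameterized by $\phi_{Yk} = (\mu_k, \Sigma_k)$:
$$
f(Y_i \mid Z_i = e_k, \phi_{Yk}) = \frac{1}{(2\pi)^{m/2} |\Sigma_k|^{1/2}} \exp\left( -\frac{1}{2} (Y_i - \mu_k)^T \Sigma_k^{-1} (Y_i - \mu_k) \right)
$$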
with $\mu_k$ and $\Sigma_k$ being, respectively, the intensity mean vector
and covariance matrix for tissue $k$. Thus, the update equations
may be written as
$$
\mu_k^{m+1} = \frac{\sum_{i=1}^{N} Y_i\, f(Z_i = e_k \mid Y_i, \psi^m)}{\sum_{i=1}^{N} f(Z_i = e_k \mid Y_i, \psi^m)}
$$
$$
\Sigma_k^{m+1} = \frac{\sum_{i=1}^{N} f(Z_i = e_k \mid Y_i, \psi^m)\, (Y_i - \mu_k^m)(Y_i - \mu_k^m)^T}{\sum_{i=1}^{N} f(Z_i = e_k \mid Y_i, \psi^m)}
$$
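The update equations can be illustrated for scalar intensities, where the covariance matrix reduces to a variance. The sketch below is a minimal version under stated assumptions: the function name is hypothetical, the data are one-dimensional, and the variance is centered on the newly updated mean, which is the usual choice in EM for Gaussian mixtures and an assumption of this example.

```python
def m_step(intensities, posteriors):
    """Update the mixing proportions, means, and variances of a 1-D GMM
    from posteriors[i][k] computed in the preceding E-step."""
    n = len(intensities)
    n_classes = len(posteriors[0])
    weights, means, variances = [], [], []
    for k in range(n_classes):
        resp = [posteriors[i][k] for i in range(n)]
        total = sum(resp)  # effective number of voxels assigned to class k
        mean = sum(r * y for r, y in zip(resp, intensities)) / total
        variance = sum(r * (y - mean) ** 2 for r, y in zip(resp, intensities)) / total
        weights.append(total / n)
        means.append(mean)
        variances.append(variance)
    return weights, means, variances


# Hypothetical hard assignment of four voxels to two classes.
weights, means, variances = m_step(
    [0.0, 2.0, 10.0, 12.0],
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
```

With hard (0 or 1) posteriors, as in the usage example, the updates reduce to the per-class sample proportion, mean, and variance, which is a convenient sanity check.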
Local Reference Population GMM Intensity Tissue Model
Consider a reference population $P$ formed by $R$ healthy subjects aligned to the MS patient. Each reference subject is composed of a multispectral grayscale MRI $V$ (i.e., T1w, T2w, and
FLAIR scans) and the corresponding tissue segmentation $L$ (i.e.,
GM, WM, and CSF); thus, $P = (V, L) = (V_1, \ldots, V_R; L_1, \ldots, L_R)$.
Each reference grayscale MRI $V_r = (V_{r1}, \ldots, V_{rN})$ is formed by a
finite set of $N$ voxels with $V_{ri} \in \mathbb{R}^m$. Also, each reference tissue
segmentation $L_r = (L_{r1}, \ldots, L_{rN})$ is formed by a finite set of $N$
voxels where $L_{ri} = e_k = (l_{ri1}, \ldots, l_{riK})$ is a $K$-dimensional vector
with each component $l_{rik}$ being 1 or 0 according to whether $V_{ri}$
did or did not arise from the $k$th class.
At each voxel $i$, the reference population $P$ intensity distribution will be modeled as a GMM parameterized by
$\xi_i = (\pi_{Pi}, \mu_{Pi}, \Sigma_{Pi})$ with $\pi_{Pi}$, $\mu_{Pi}$, and $\Sigma_{Pi}$ being, respectively, the population tissue mixture vector, the population mean intensity
vector, and the population intensity covariance matrix at
voxel $i$.
Because $(V, L)$ are observed variables, $\xi_i$ can be derived
using the following expressions:
$$
\pi_{Pik} = \frac{1}{R} \sum_{j \in N_R} p(L_{ij} = e_k)
$$
$$
\mu_{Pik} = \frac{\sum_{j \in N_R} V_{ij}\, p(L_{ij} = e_k)}{\sum_{j \in N_R} p(L_{ij} = e_k)}
$$
$$
\Sigma_{Pik} = \frac{\sum_{j \in N_R} (V_{ij} - \mu_{Pik})(V_{ij} - \mu_{Pik})^T\, p(L_{ij} = e_k)}{\sum_{j \in N_R} p(L_{ij} = e_k)}
$$
where $p(L_{ij} = e_k)$ is the probability of voxel $i$ of the $j$th reference
subject belonging to tissue $k$ given by $L_j$ and $N_R$ is the neighborhood centered in voxel $i$ of radius $R$ voxels.
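The local population parameters at a single voxel can be sketched as weighted statistics over the reference samples that fall in its neighborhood. The example is a simplified illustration: intensities are scalar, the function name and input layout are hypothetical, and the mixture proportion is normalized by the number of samples so that the proportions sum to one when each label vector does, which is an assumption of this sketch.

```python
def local_population_model(samples, n_classes):
    """Estimate per-class (proportion, mean, variance) at one voxel.

    samples: list of (intensity, label_probs) pairs gathered from the
    reference subjects in the neighborhood of the voxel. Mean and
    variance are None for a class with zero total label probability."""
    model = []
    for k in range(n_classes):
        mass = sum(probs[k] for _, probs in samples)
        proportion = mass / len(samples)
        if mass == 0.0:
            model.append((proportion, None, None))
            continue
        # Label probabilities act as weights on the reference intensities.
        mean = sum(v * probs[k] for v, probs in samples) / mass
        variance = sum(probs[k] * (v - mean) ** 2 for v, probs in samples) / mass
        model.append((proportion, mean, variance))
    return model


# Hypothetical neighborhood: three samples labeled class 0, one labeled class 1.
samples = [(100.0, [1, 0]), (110.0, [1, 0]), (40.0, [0, 1]), (105.0, [1, 0])]
model = local_population_model(samples, n_classes=2)
```

Because the statistics are computed separately at every voxel, the resulting model encodes what intensity is expected at that location in healthy subjects, not what is expected for a tissue class anywhere in the brain.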
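Once the local tissue model is estimated from $P$, the intensity likelihood of $Y$ can be derived as
$$
f(Y, Z \mid \xi) = \prod_{i=1}^{N} \sum_{k=1}^{K} f(Z_i = e_k \mid \xi_{ik})\, \frac{1}{(2\pi)^{m/2} |\Sigma_{Pik}|^{1/2}} \exp\left( -\frac{1}{2} (Y_i - \mu_{Pik})^T \Sigma_{Pik}^{-1} (Y_i - \mu_{Pik}) \right)
$$
with $f(Z_i = e_k \mid \xi_{ik}) = \pi_{Pik}$.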
Combining Global and Local Models
Consider that in addition to the patient scan $Y$, we observe an
aligned template library of $R$ healthy subjects $P = (V, L) =
(V_1, \ldots, V_R; L_1, \ldots, L_R)$.
Since the observed population data $P$ is conditionally independent of the observed patient scan $Y$, the formation model
parametrized by $\psi$ can be expressed as
$$
\begin{aligned}
\log L_C(\psi) &= \log f(Y, P, Z \mid \psi) \\
&= \sum_{i=1}^{N} \sum_{k=1}^{K} z_{ik} \log \left( \pi_k\, f(Y_i \mid Z_{ik}, \psi_k)\, f(P_{ik} \mid Y_i, Z_{ik}, \psi_k) \right) \\
&= \sum_{i=1}^{N} \sum_{k=1}^{K} z_{ik} \log \left( \pi_k\, \pi_{Pik}\, N(Y_i; \mu_k, \Sigma_k)\, N(Y_i; \mu_{Pik}, \Sigma_{Pik}) \right)
\end{aligned}
$$
Given that the underlying tissue segmentation Z is
unknown, the EM algorithm will be used to find the parameters that maximize the complete log-likelihood.
E-step: The E-step requires the computation of the conditional expectation of $\log L_C(\psi)$ given $(Y, P)$, using the current
parameter estimate $\psi^m$:
$$
Q(\psi; \psi^m) = E_{\psi^m}\left[ \log L_C(\psi) \mid Y, P \right]
$$
Since the complete log-likelihood is linear in the hidden
labels $z_{ij}$, the E-step requires the calculation of the current
conditional expectation of $Z_i$ given the observed data $(Y, P)$:
$$
E_{\psi^m}(Z_i = e_k \mid Y, P) = \frac{\pi_k\, \pi_{Pik}\, N(Y_i; \mu_k, \Sigma_k)\, N(Y_i; \mu_{Pik}, \Sigma_{Pik})}{\sum_{k'=1}^{K} \pi_{k'}\, \pi_{Pik'}\, N(Y_i; \mu_{k'}, \Sigma_{k'})\, N(Y_i; \mu_{Pik'}, \Sigma_{Pik'})}
$$
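The combined E-step can be sketched for a single voxel with scalar intensity. Each class contributes the product of its global and local mixing proportions and of its global and local Gaussian densities; the normalizing sum is also returned because it measures how well the voxel is explained by both models together. The function names, the one-dimensional setting, and all numerical values are assumptions made for illustration.

```python
import math


def gaussian_pdf(y, mean, variance):
    """Density of a 1-D Gaussian evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def combined_posterior(y, global_model, local_model):
    """Return (posteriors, total) for one voxel intensity y.

    global_model and local_model are lists of (weight, mean, variance),
    one entry per tissue class; local_model describes the reference
    population at the voxel location."""
    joint = []
    for (w, mu, var), (w_p, mu_p, var_p) in zip(global_model, local_model):
        joint.append(w * w_p * gaussian_pdf(y, mu, var) * gaussian_pdf(y, mu_p, var_p))
    total = sum(joint)  # joint likelihood of the voxel under both models
    return [j / total for j in joint], total


# Hypothetical two-class example at a location that is mostly class 1
# in the reference population.
global_model = [(0.5, 50.0, 100.0), (0.5, 100.0, 100.0)]
local_model = [(0.1, 50.0, 100.0), (0.9, 100.0, 100.0)]
post_normal, total_normal = combined_posterior(100.0, global_model, local_model)
post_bright, total_bright = combined_posterior(160.0, global_model, local_model)
```

In this example the unusually bright intensity still receives normalized posteriors, but its joint likelihood is many orders of magnitude smaller than that of the typical intensity, which is the quantity a lesion detector based on low likelihood would inspect.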
M-step: Because the local reference population model
parameter $\xi$ is constant, the maximization step consists of
the maximization of $Q(\psi; \psi^m)$ with respect to $\psi$, which results
in the same update equations derived in Wells et al. (1996).
In order to be robust to the presence of outliers, we used a TLE
to estimate $\psi$. The TLE was proposed as a modification of the
maximum likelihood estimator in the presence of outliers in the
observed data (Neykov, Filzmoser, Dimova, & Neytchev, 2007).
Using the TLE, the complete log-likelihood can be expressed as
$$
\log L_C(\psi) = \log \prod_{i=1}^{h} f(Y_{\nu(i)}, P_{\nu(i)}, Z_{\nu(i)} \mid \psi)
$$
where, for a fixed $\psi$, $f(Y_{\nu(1)}, P_{\nu(1)}, Z_{\nu(1)} \mid \psi, \xi_1) \geq \ldots \geq f(Y_{\nu(N)}, P_{\nu(N)}, Z_{\nu(N)} \mid \psi, \xi_N)$,
with $\nu = (\nu(1), \ldots, \nu(N))$ being
the corresponding permutation of the indices $i = 1, \ldots, N$ sorted by their probability $f(Y_{\nu(i)}, P_{\nu(i)}, Z_{\nu(i)} \mid \psi)$, and $h$ is the trimming parameter
corresponding to the percentage of values included in the
parameter estimation. In other words, the likelihood is now
computed using only the voxels that are more likely to belong
to the proposed model.
The TLE was computed using the fast-TLE algorithm, in
which, iteratively, the $N h$ voxels with the highest estimated
likelihood are selected to estimate $\psi^{m+1}$ using the update
equations. These two steps are iterated until convergence.
It follows intuitively that the local intensity model downweights
the likelihood of those voxels that have an abnormal intensity
given the reference population. Since MRI structural abnormalities
will show an abnormal intensity level compared to similarly
located brain tissues in healthy subjects, we seek to identify MS
lesions by searching for areas with low likelihood $L_C(\psi)$.
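The alternation between selecting the most likely samples and re-estimating the parameters can be sketched with a deliberately reduced model. The example fits a single one-dimensional Gaussian instead of the full mixture, so that ranking by likelihood reduces to ranking by squared deviation from the current mean. The function name, the `keep_fraction` argument (playing the role of the trimming parameter), and the data are hypothetical; this is not the fast-TLE implementation used by the authors.

```python
import math


def fast_tle_gaussian(values, keep_fraction, max_iter=100):
    """Trimmed estimate of a 1-D Gaussian.

    Returns (mean, variance, outlier_indices), where the outliers are
    the samples excluded from the final estimate."""
    n = len(values)
    n_keep = min(n, max(2, math.ceil(keep_fraction * n)))
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    kept = None
    for _ in range(max_iter):
        # Under a single Gaussian, the most likely samples are those with
        # the smallest squared deviation from the current mean.
        order = sorted(range(n), key=lambda i: (values[i] - mean) ** 2)
        new_kept = sorted(order[:n_keep])
        if new_kept == kept:
            break  # the retained subset no longer changes
        kept = new_kept
        subset = [values[i] for i in kept]
        mean = sum(subset) / n_keep
        variance = sum((v - mean) ** 2 for v in subset) / n_keep
    kept_set = set(kept)
    outliers = [i for i in range(n) if i not in kept_set]
    return mean, variance, outliers


# Hypothetical intensities: five typical values and two bright outliers.
values = [10.0, 11.0, 9.0, 10.5, 9.5, 50.0, 55.0]
mean, variance, outliers = fast_tle_gaussian(values, keep_fraction=0.7)
```

In the usage example the two bright values are excluded and the estimate settles on the statistics of the remaining five samples, whereas an untrimmed fit would be pulled strongly toward the outliers.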
Illustrative Applications of Segmentation with the
MOPS Intensities
We evaluated MOPS using the MS Grand Challenge dataset
(Styne et al., 2008). The MS Grand Challenge website accepts
new segmentations and rates them with a score that summarizes
the performance of the segmentation algorithm. A score of 90
was considered to equal the accuracy of a human rater. MOPS
achieved a score of 84.5, the best among the 17 lesion segmentation
algorithms for which results have been submitted (Figure 1).
The lesion detection rates of MOPS were consistently more
specific than, and at least as sensitive as, those of previously reported
algorithms (Figure 2). This demonstrates that a model of
lesions as global intensity outliers within each subject’s MRI
is less able to discriminate true lesions than the joint MOPS
intensities. MOPS is able to successfully identify lesions in
patients with pediatric-onset multiple sclerosis as will be illustrated later (Figure 3). Furthermore, MOPS is able to detect
atypical local intensities through comparison to images of a
healthy reference population, so MOPS can detect many types
of brain abnormalities. Figure 4 illustrates the successful detection of a pediatric brain tumor.
[Figure 1: bar chart titled "MS Grand Challenge scores"; vertical axis: Score (75 to 85); horizontal axis: Participant team. Bar values: 84.46, 82.12, 82.07, 80, 79.9, 79.1, 78.19.]
Figure 1 Comparison of lesion segmentation performance of different algorithms from the MS Lesion Segmentation Grand Challenge (Styne et al.,
2008). The highest score is best.
[Figure 2 panels: (a) T1w MRI; (b) T2w MRI; (c) 1 − f(Yi, Zi|ψ); (d) 1 − f(Yi, Pi, Zi|ψ); color scale from 0.00 to 1.00.]
Figure 2 Comparison of detection of a brain tumor from (a) T1w MRI
and (b) T2w MRI, using a (c) global intensity model and (d) model of
population and subject (MOPS). The figure demonstrates the improved
lesion sensitivity of the voxel lesion probability derived by MOPS enabling
accurate localization of the brain tumor.
Figure 3 Illustration of lesion segmentation with MOPS from an MRI
scan of a patient with pediatric onset multiple sclerosis.
Figure 4 Illustration of tractography of the corticospinal tract in the
region of the brain tumor detected automatically by MOPS. Careful
assessment of the path of the corticospinal tract allows for optimization
of the surgical approach to minimize the risk of loss of function following
surgery.
Conclusion
Lesion segmentation is an important task, regularly carried out
by experts using interactive and semiautomatic segmentation
tools. Automated algorithms for segmentation of lesions have
explored a wide range of techniques and are increasingly effective for a range of types of pathology. Advances in MS lesion
segmentation enable quantitative and accurate detection of
lesions from high-quality MRI.
See also: INTRODUCTION TO ACQUISITION METHODS:
Anatomical MRI for Human Brain Morphometry; Myelin Imaging;
INTRODUCTION TO CLINICAL BRAIN MAPPING: Demyelinating
Diseases; MRI in Clinical Management of Multiple Sclerosis;
INTRODUCTION TO METHODS AND MODELING: Diffeomorphic
Image Registration; Intensity Nonuniformity Correction; Nonlinear
Registration Via Displacement Fields; Posterior Probability Maps.
References
Akhondi-Asl, A., & Warfield, S. K. (2013). Simultaneous truth and performance level
estimation through fusion of probabilistic segmentations. IEEE Transactions on
Medical Imaging, http://dx.doi.org/10.1109/TMI.2013.2266258.
Anbeek, P., Vincken, K. L., van Osch, M. J., Bisschops, R. H. C., & van der Grond, J.
(2004). Automatic segmentation of different-sized white matter lesions by voxel
probability estimation. Medical Image Analysis, 8, 205–215. http://dx.doi.org/
10.1016/j.media.2004.06.019.
Ashton, E. A., Takahashi, C., Berg, M. J., Goodman, A., Totterman, S., & Ekholm, S.
(2003). Accuracy and reproducibility of manual and semiautomated quantification of
MS lesions by MRI. Journal of Magnetic Resonance Imaging, 17, 300–308. http://
dx.doi.org/10.1002/jmri.10258.
Barkhof, F., Simon, J. H., Fazekas, F., Rovaris, M., Kappos, L., de Stefano, N., et al.
(2012). MRI monitoring of immunomodulation in relapse-onset multiple sclerosis
trials. Nature Reviews. Neurology, 8, 13–21. http://dx.doi.org/10.1038/
nrneurol.2011.190.
Bermel, R. A., & Bakshi, R. (2006). The measurement and clinical relevance of brain
atrophy in multiple sclerosis. Lancet Neurology, 5, 158–170. http://dx.doi.org/
10.1016/S1474-4422(06)70349-0.
Bijar, A., Khayati, R., & Peñalver Benavent, A. (2013). Increasing the contrast of the
brain MR FLAIR images using fuzzy membership functions and structural similarity
indices in order to segment MS lesions. PloS One, 8, e65469. http://dx.doi.org/
10.1371/journal.pone.0065469.
Bricq, S., Collet, C., & Armspach, J. P. (2008). Markovian segmentation of 3D brain
MRI to detect multiple sclerosis lesions. In 15th IEEE International Conference on
Image Processing (pp. 733–736).
Brinkmann, B. H., Manduca, A., & Robb, R. A. (1998). Optimized homomorphic
unsharp masking for MR grayscale inhomogeneity correction. IEEE Transactions on
Medical Imaging, 17(2), 161–171. http://dx.doi.org/10.1109/42.700729.
Clarke, L. P., Velthuizen, R. P., Camacho, M. A., Heine, J. J., Vaidyanathan, M.,
Hall, L. O., et al. (1995). MRI segmentation: Methods and applications. Magnetic
Resonance Imaging, 13, 343–368.
Cocosco, C. A., Zijdenbos, A. P., & Evans, A. C. (2003). A fully automatic and robust
brain MRI tissue classification method. Medical Image Analysis, 7, 513–527. http://
dx.doi.org/10.1016/S1361-8415(03)00037-9.
Commowick, O., Akhondi-Asl, A., & Warfield, S. K. (2012). Estimating a reference
standard segmentation with spatially varying performance parameters: Local MAP
STAPLE. IEEE Transactions on Medical Imaging, 31(8), 1593–1606. http://dx.doi.
org/10.1109/TMI.2012.2197406.
Commowick, O., & Warfield, S. K. (2010). Estimation of inferential uncertainty in
assessing expert segmentation performance from STAPLE. IEEE Transactions on
Medical Imaging, 29(3), 771–780. http://dx.doi.org/10.1109/TMI.2009.2036011.
Datta, S., & Narayana, P. A. (2013). A comprehensive approach to the segmentation of
multichannel three-dimensional MR brain images in multiple sclerosis.
NeuroImage. Clinical, 2, 184–196. http://dx.doi.org/10.1016/j.nicl.2012.12.007.
de Boer, R., Vrooman, H. A., van der Lijn, F., Vernooij, M. W., Ikram, M. A.,
van der Lugt, A., et al. (2009). White matter lesion extension to automatic brain
tissue segmentation on MRI. NeuroImage, 45, 1151–1161.
de Zwart, J. A., Ledden, P. J., van Gelderen, P., Bodurka, J., Chu, R., & Duyn, J. H.
(2004). Signal-to-noise ratio and parallel imaging performance of a 16-channel
receive-only brain coil array at 3.0 Tesla. Magnetic Resonance in Medicine, 51(1),
22–26. http://dx.doi.org/10.1002/mrm.10678.
Dice, L. R. (1945). Measures of the amount of ecologic association between species.
Ecology, 26, 297. http://dx.doi.org/10.2307/1932409.
Filippi, M., Horsfield, M. A., Bressi, S., Martinelli, V., Baratti, C., Reganati, P., et al.
(1995). Intra- and inter-observer agreement of brain MRI lesion volume
measurements in multiple sclerosis. A comparison of techniques. Brain, 118(Pt 6),
1593–1600.
Filippi, M., Horsfield, M. A., Tofts, P. S., Barkhof, F., Thompson, A. J., & Miller, D. H.
(1995). Quantitative assessment of MRI lesion load in monitoring the evolution of
multiple sclerosis. Brain, 118(Pt 6), 1601–1612.
García-Lorenzo, D., Prima, S., Arnold, D. L., Collins, D. L., & Barillot, C. (2011).
Trimmed-likelihood estimation for focal lesions and tissue segmentation in multisequence MRI for multiple sclerosis. IEEE Transactions on Medical Imaging, 1–13.
http://dx.doi.org/10.1109/TMI.2011.2114671.
Geremia, E., Clatz, O., Menze, B. H., Konukoglu, E., Criminisi, A., & Ayache, N. (2011).
Spatial decision forests for MS lesion segmentation in multi-channel magnetic
resonance images. NeuroImage, 57, 378–390.
Grimaud, J., Lai, M., Thorpe, J., & Adeleine, P. (1996). Quantification of MRI lesion
load in multiple sclerosis: A comparison of three computer-assisted techniques.
Magnetic Resonance Imaging, 14, 495–505.
Guttmann, C. R., Ahn, S. S., Hsu, L., Kikinis, R., & Jolesz, F. A. (1995). The evolution of
multiple sclerosis lesions on serial MR. AJNR. American Journal of Neuroradiology,
16, 1481–1491.
Hadjiprocopis, A., & Tofts, P. (2003). An automatic lesion segmentation method for fast
spin echo magnetic resonance images using an ensemble of neural networks.
In IEEE XIII workshop on neural networks for signal processing (IEEE Cat.
No.03TH8718) (pp. 709–718).
Harmouche, R., Collins, L., Arnold, D., Francis, S., & Arbel, T. (2006). Bayesian MS
lesion classification modeling regional and local spatial information. In: Eighteenth
international conference on pattern recognition (ICPR’06) (pp. 984–987).
Kamber, M., Shinghal, R., Collins, D. L., Francis, G. S., & Evans, A. C. (1995). Model-based 3-D segmentation of multiple sclerosis lesions in magnetic resonance brain
images. IEEE Transactions on Medical Imaging, 14, 442–453. http://dx.doi.org/
10.1109/42.414608.
Kamber, M., Louis Collins, D., Shinghal, R., Francis, G. S., & Evans, A. C. (1992).
Model-based 3D segmentation of multiple sclerosis lesions in dual-echo MRI data.
In: Proceedings of the SPIE: Visualization in biomedical computing 1992, Chapel
Hill, NC. vol. 1808 (pp. 590–600).
Kapouleas, I. (1989). Automatic detection of multiple sclerosis lesions in MR brain
images. In: Proceedings of the annual symposium on computer application in
medical care (pp. 739–745).
Kaza, E., Klose, U., & Lotze, M. (2011). Comparison of a 32-channel with a 12-channel
head coil: are there relevant improvements for functional imaging? Journal of
Magnetic Resonance Imaging, 34(1), 173–183. http://dx.doi.org/10.1002/
jmri.22614.
Kikinis, R., Guttmann, C. R., Metcalf, D., Wells, W. M., Ettinger, G. J., Weiner, H. L.,
et al. (1999). Quantitative follow-up of patients with multiple sclerosis using MRI:
Technical aspects. Journal of Magnetic Resonance Imaging, 9, 519–530.
Kikinis, R., Shenton, M. E., Gerig, G., Martin, J., Anderson, M., Metcalf, D., et al. (1992).
Routine quantitative analysis of brain and cerebrospinal fluid spaces with MR
imaging. Journal of Magnetic Resonance Imaging, 2(6), 619–629.
Kroon, D., Oort, E.V., & Slump, K. (2008). Multiple sclerosis detection in multispectral
magnetic resonance images with principal components analysis. MS Lesion
Segmentation (MICCAI 2008 Workshop).
Kwan, R. K., Evans, A. C., & Pike, G. B. (1999). MRI simulation-based evaluation of
image-processing and classification methods. IEEE Transactions on Medical
Imaging, 18, 1085–1097. http://dx.doi.org/10.1109/42.816072.
Maltbie, E., Bhatt, K., Paniagua, B., Smith, R. G., Graves, M. M., Mosconi, M. W., et al.
(2012). Asymmetric bias in user guided segmentations of brain structures.
NeuroImage, 59(2), 1315–1323. http://dx.doi.org/10.1016/j.
neuroimage.2011.08.025.
Miller, D. H., Filippi, M., Fazekas, F., Frederiksen, J. L., Matthews, P. M., Montalban, X.,
et al. (2004). Role of magnetic resonance imaging within diagnostic criteria for
multiple sclerosis. Annals of Neurology, 56, 273–278.
Morra, J., Tu, Z., Toga, A., & Thompson, P. (2008). Automatic segmentation of MS
lesions using a contextual model for the MICCAI grand challenge. MS Lesion
Segmentation, (MICCAI 2008 Workshop).
Neykov, N., Filzmoser, P., Dimova, R., & Neytchev, P. (2007). Robust fitting of mixtures
using the trimmed likelihood estimator. Computational Statistics & Data Analysis,
52, 299–308. http://dx.doi.org/10.1016/j.csda.2006.12.024.
Nyúl, L. G., & Udupa, J. K. (1999). On standardizing the MR image intensity scale.
Magnetic Resonance in Medicine, 42, 1072–1081.
Paty, D. W., & Li, D. K. (1993). Interferon beta-1b is effective in relapsing-remitting
multiple sclerosis. II. MRI analysis results of a multicenter, randomized, double-blind, placebo-controlled trial. UBC MS/MRI Study Group and the IFNB Multiple
Sclerosis Study Group. Neurology, 43(4), 662–667.
Prastawa, M., Bullitt, E., & Ho, S. (2004). A brain tumor segmentation framework based
on outlier detection. Medical Image Analysis, 8, 275–283.
Shah, M., Xiao, Y., Subbanna, N., Francis, S., Arnold, D. L., Collins, D. L., et al. (2011).
Evaluating intensity normalization on MRIs of human brain with multiple sclerosis.
Medical Image Analysis, 15(2), 267–282. http://dx.doi.org/10.1016/j.
media.2010.12.003, pii: S1361-8415(10)00133-7.
Shepherd, T., Prince, S. J., & Alexander, D. C. (2012). Interactive lesion segmentation
with shape priors from offline and online learning. IEEE Transactions on Medical
Imaging, 31, 1698–1712. http://dx.doi.org/10.1109/TMI.2012.2196285.
Shiee, N., Bazin, P. L., Ozturk, A., Reich, D. S., Calabresi, P. A., & Pham, D. L. (2010). A
topology-preserving approach to the segmentation of brain images with multiple
sclerosis lesions. NeuroImage, 49, 1524–1535.
Sled, J. G., Zijdenbos, A. P., & Evans, A. C. (1998). A nonparametric method for
automatic correction of intensity nonuniformity in MRI data. IEEE Transactions on
Medical Imaging, 17, 87–97. http://dx.doi.org/10.1109/42.668698.
Souple, J., Lebrun, C., Ayache, N., & Malandain, G. (2008). An Automatic Segmentation
of T2-FLAIR Multiple Sclerosis Lesions. MS Lesion Segmentation (MICCAI 2008
workshop).
Styne, M., Lee, J., Chin, B., Chin, M.S., Commowick, O., Tran, H., et al. (2008). 3D
Segmentation in the Clinic: A Grand Challenge II, MS Lesion Segmentation
(MICCAI 2008 Workshop).
Thompson, D. K., Wood, S. J., Doyle, L. W., Warfield, S. K., Egan, G. F., & Inder, T. E.
(2009). MR-determined hippocampal asymmetry in full-term and preterm neonates.
Hippocampus, 19(2), 118–123. http://dx.doi.org/10.1002/hipo.20492.
Tomas-Fernandez, X., & Warfield, S. K. (2012). Population intensity outliers or a new
model for brain WM abnormalities. In: Ninth IEEE international symposium on
biomedical imaging (ISBI) (pp. 1543–1546), IEEE.
Udupa, J. K., Wei, L., Samarasekera, S., Miki, Y., van Buchem, M. A., & Grossman, R. I.
(1997). Multiple sclerosis lesion quantification using fuzzy-connectedness
principles. IEEE Transactions on Medical Imaging, 16, 598–609. http://dx.doi.org/
10.1109/42.640750.
Van Leemput, K., Maes, F., Vandermeulen, D., Colchester, A., & Suetens, P. (2001).
Automated segmentation of multiple sclerosis lesions by model outlier detection.
IEEE Transactions on Medical Imaging, 20, 677–688.
Van Leemput, K., Maes, F., Vandermeulen, D., & Suetens, P. (1999). Automated
model-based tissue classification of MR images of the brain. IEEE Transactions on
Medical Imaging, 18, 897–908. http://dx.doi.org/10.1109/42.811270.
Vannier, M. W., Butterfield, R. L., Jordan, D., Murphy, W. A., Levitt, R. G., & Gado, M.
(1985). Multispectral analysis of magnetic resonance images. Radiology, 154(1),
221–224. http://dx.doi.org/10.1148/radiology.154.1.3964938.
Vannier, M. W., Butterfield, R. L., & Jordan, D. (1985). Multispectral analysis of
magnetic resonance images. Radiology, 154, 221–224.
Wack, D. S., Dwyer, M. G., Bergsland, N., Di Perri, C., Ranza, L., Hussein, S., et al.
(2012). Improved assessment of multiple sclerosis lesion segmentation agreement
via detection and outline error estimates. BMC Medical Imaging, 12, 17. http://dx.
doi.org/10.1186/1471-2342-12-17.
Warfield, S. K., Kaus, M., Jolesz, F. A., & Kikinis, R. (2000). Adaptive, template
moderated, spatially varying statistical classification. Medical Image Analysis, 4,
43–55.
Warfield, S., Dengler, J., Zaers, J., Guttmann, C. R., Wells, W. M., 3rd., Ettinger, G. J.,
et al. (1995). Automatic identification of gray matter structures from MRI to improve
the segmentation of white matter lesions. Journal of Image Guided Surgery, 1(6),
326–338. http://dx.doi.org/10.1002/(SICI)1522-712X.
Warfield, S. K., Zou, K. H., & Wells, W. M. (2004). Simultaneous truth and performance
level estimation (STAPLE): An algorithm for the validation of image segmentation.
IEEE Transactions on Medical Imaging, 23, 903–921. http://dx.doi.org/10.1109/
TMI.2004.828354.
Wei, X., Guttmann, C. R., Warfield, S. K., Eliasziw, M., & Mitchell, J. R. (2004). Has your
patient’s multiple sclerosis lesion burden or brain atrophy actually changed?
Multiple Sclerosis, 10, 402–406.
Wei, X., Warfield, S. K., Zou, K. H., Wu, Y., Li, X., Guimond, A., et al. (2002).
Quantitative analysis of MRI signal abnormalities of brain white matter with high
reproducibility and accuracy. Journal of Magnetic Resonance Imaging, 209,
203–209. http://dx.doi.org/10.1002/jmri.10053.
Weisenfeld, N. I., & Warfield, S. K. (2009). Automatic segmentation of newborn brain
MRI. NeuroImage, 47, 564–572. http://dx.doi.org/10.1016/j.
neuroimage.2009.04.068.
Weisenfeld, N. I., & Warfield, S. K. (2004). Normalization of joint image-intensity
statistics in MRI using the Kullback–Leibler divergence. In Proceedings of the 2004
IEEE international symposium on biomedical imaging: From nano to macro,
Arlington, VA, April 15–18, 2004.
Wells, W. M., Grimson, W. L., Kikinis, R., & Jolesz, F. A. (1996).
Adaptive segmentation of MRI data. IEEE Transactions on Medical Imaging, 15,
429–442.
Wels, M., Huber, M., & Hornegger, J. (2008). Fully automated segmentation of multiple
sclerosis lesions in multispectral MRI. Pattern Recognition and Image Analysis, 18,
347–350. http://dx.doi.org/10.1134/S1054661808020235.
Younis, A. A., Soliman, A. T., Kabuka, M. R., & John, N. M. (2007). MS lesions
detection in MRI using grouping artificial immune networks. In IEEE 7th
international symposium on bioinformatics and bioengineering (pp. 1139–1146).
Zijdenbos, A. P., Dawant, B. M., Margolin, R. A., & Palmer, A. C. (1994). Morphometric
analysis of white matter lesions in MR images: Method and validation. IEEE
Transactions on Medical Imaging, 13, 716–724. http://dx.doi.org/10.1109/
42.363096.
Zijdenbos, A. P., Forghani, R., & Evans, A. C. (2002). Automatic ‘pipeline’ analysis of
3-D MRI data for clinical trials: Application to multiple sclerosis. IEEE Transactions
on Medical Imaging, 21, 1280–1291.
Relevant Websites
http://brainweb.bic.mni.mcgill.ca/brainweb/selection_ms.html – BrainWeb Lesion
Simulator.
http://www.spl.harvard.edu/publications/item/view/1180 – Warfield/Kaus database.
http://crl.med.harvard.edu/software – STAPLE validation software.
http://martinos.org/qtim/miccai2013/ – Multimodal Brain Tumor Segmentation.
http://www.sci.utah.edu/prastawa/software.html – Brain Tumor Simulator.
http://www.ia.unc.edu/MSseg/ – Multiple Sclerosis Lesion Segmentation Grand
Challenge.