
On the Usage of the 2D-AR-Model in Texture
Completion Scenarios with Causal Boundary
Conditions: A Tutorial
Martin Köppel (a,∗), Dimitar Doshkov (b), Fabien Racape (b), Patrick Ndjiki-Nya (b), Thomas Wiegand (a,b,∗∗)
(a) School of Electrical Engineering and Computer Science, Berlin Institute of Technology
(b) Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute (HHI)
∗ Principal corresponding author. ∗∗ Corresponding author.
Email addresses: [email protected] (Martin Köppel), [email protected] (Dimitar Doshkov), [email protected] (Fabien Racape), [email protected] (Patrick Ndjiki-Nya), [email protected] (Thomas Wiegand)
Abstract
In recent years, significant progress has been witnessed in several image and
video completion scenarios. Given a specific application, these methods can
produce, reproduce or extend a given texture sample. While there are many
promising algorithms available, there is still a lack of theoretical understanding
of how some of them are designed and under which conditions they perform well.
To that end, we analyze and describe the technique behind one of the most popular
parametric completion algorithms: the autoregressive (AR) model. Furthermore, we address important implementation details, complexity issues and restrictions of the model. Beyond that, we explain how the performance of the AR
model can be significantly improved. In summary, this paper aims to achieve
three major goals: 1) to provide a comprehensive tutorial for both experienced and inexperienced readers, 2) to propose novel methods that improve the performance of 2D-AR completion, and 3) to motivate and guide researchers who are interested in the usage of the AR model for texture completion tasks.
Keywords: Texture completion, texture synthesis, inpainting, autoregressive
model, image reconstruction
1. Introduction
Texture understanding and representation have been a research focus for
many years in human perception, computer graphics and computer vision. Recently, a substantial portion of research activities in this area has emphasized two
main topics: texture synthesis [1] and inpainting [2].
1.1. Definitions
Texture synthesis refers to the generation process of a novel texture pattern
from a limited sample. An arbitrarily large output texture is generated that is
perceptually similar to the input sample. Hence, this method is a way to create
textures for different applications (e.g. texture mapping on surfaces). On the other hand, the term inpainting stands for approaches that regenerate missing or damaged image areas using information from the rest of the image. Most of the
work on inpainting focused on applications such as image/video restoration (e.g.
scratch removal), object removal (e.g. removal of selected image elements) and
error concealment (e.g. filling-in image blocks lost during data transmission).
The main difference between texture synthesis and inpainting is that inpainting
techniques are better suited for complex images containing both texture and
dominant structures.
In this paper, we will use the generic term “texture completion” to describe the seamless reconstruction of missing texture samples of a given picture or of a part of a texture sample.
1.2. Classification of Texture Completion Approaches
Texture completion algorithms can be divided into three main categories:
1) parametric, 2) partial-differential-equation-based (PDE), and 3) non-parametric. An overview of texture completion categories is given in Table 1 (cf. also Ndjiki-Nya et al. [3]).
Parametric completion approaches approximate the probability density function (PDF) of the texture source using a compact model with a fixed parameter
set [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]. That is, these methods extract statistics
from the given input texture that are modeled based on a compact parameter
set. Such approaches also provide information relating to the underlying texture
properties, which can be relevant in identification and recognition applications.
Some of the most commonly used parametric methods are based on the autoregressive (AR), the moving average (MA) and the autoregressive moving average
(ARMA) models.
The second category of texture completion methods, PDE-based algorithms, employs a diffusion process to fill the missing image parts in a visually plausible manner. These techniques commonly use non-linear or high-order partial differential equations to propagate information from the boundary towards the interior of the unknown area. Several approaches based on PDEs have
been developed in the last decade [14, 15, 16, 17].
The last class of methods, non-parametric completion approaches, does not
explicitly model the PDF, but instead measures it from an available texture
sample. In general, in this completion category, a best match is determined
from a source region and copied to a target region [18, 19, 20, 21, 22, 23, 24].
Table 1: Overview of texture completion approaches [3], visual quality and complexity limitations

Category | Models | Completion of texture classes | Limitations | Complexity
Parametric | AR, MA, ARMA | Rigid and non-rigid | Structures | Medium
PDE-based | PDE | Rigid, thin, elongated regions | Smooth results | Medium
Non-parametric | MRF | Rigid and non-rigid structures | Prone to error | High
1.3. Contributions
In this paper, we will describe the technique behind the autoregressive-based
texture completion strategy. Our interest is to present a low-complexity usage of the AR model. Hence, the focus lies in using the prediction equations to extrapolate/synthesize texture very efficiently at a visually pleasing quality, rather than attempting to find an optimal but highly complex solution for the texture filling problem. Furthermore, we specifically address applications where the known data are at the top and left of the missing region (causal model). This is, for example, the case in video coding scenarios at the decoder side. A
theoretical understanding of the AR model will be provided through the detailed
explanation of issues like how the AR approach can be implemented, how correct
results can be reproduced and how complexity and other restrictions apply.
Furthermore, new contributions to further improve the AR technique will also
be presented. They relate to the adaptive definition of a training area, pre- and
post-processing steps, a consistency criterion and a regularization procedure.
In this work, we will describe a robust, multi-application texture completion method that can be integrated into texture completion applications as well as into inpainting scenarios as a sub-module (due to the limitation of the AR model in reconstructing structures, cf. Table 1). In addition, we will emphasize the faster computational performance of the AR framework in comparison to the state-of-the-art, while maintaining visually pleasing completion results.
The remainder of this paper is organized as follows. In Sec. 2, an overview
on the state-of-the-art in AR-related research topics is introduced. Next, the
texture completion problem is explained in Sec. 3. The overall AR-based completion framework is presented in Sec. 4. A detailed description of 2D-AR texture completion together with a proposal on adaptive training area definition
can be found in Sec. 5. The experimental results, followed by various application scenarios, are provided in Secs. 6 and 7, respectively. Finally, conclusions and future steps are given in Sec. 8.
2. State-of-the-Art of AR Modelling
Three decades ago, the autoregressive model, traditionally used on temporal
signals, started being utilized for image processing (e.g. in the area of image
and video texture completion). AR models have been successfully used for texture representation. In the work of Chellappa et al. [7], a 2D non-causal autoregressive (NCAR) model was used to synthesize different texture samples sized 64 × 64 with several neighbor sets and parameters. The authors showed that the AR model can reproduce natural textures. A similar contribution by Deguchi [8] focused on texture characterization and completion of gray-level textures, using the same NCAR model as [7]. The basic properties of the model, the
algorithm and the model identification problem are discussed. Furthermore, the
work of Deguchi was at the time an innovative texture segmentation approach, where blocks with similar AR parameters were merged iteratively. The work of Tugnait [9] investigated the suitability of 2D NCAR models with asymmetric support for the completion of 128 × 128 real-life textures. Here the AR model is fitted to textures with the mean subtracted, i.e. with zero mean. The removed mean is added back to the synthetic image. In [10], the authors used
causal and non-causal neighborhoods for AR parameter estimation and texture
pattern generation. Thus, different image textures were successfully synthesized
using a given set of models and parameters.
In computer vision, the autoregressive model has also been used in image
and video reconstruction applications. A statistical framework for filling-in gaps
in images is presented in [11]. The method proposed in that paper relies on an iterative algorithm, with a block-based model estimation and pixel-based filling. Visual results show effective reconstructions of images with thin elongated holes. In [25], Kokaram and Rayner proposed a 3D autoregressive model which is utilized to remove blotches in old film material. They developed an interpolation method which considers all texture areas in the extended boundary of the hole. In this way, the hole is filled with data that agrees with the
boundary. Furthermore, an improved version of this interpolation method [25]
was proposed in [26]. Janssen et al. [27] developed a deterministic approach
to fill missing samples given the AR coefficients. To determine the coefficients, Janssen et al. considered all known samples in the boundary region of the
hole. In the work of Szummer and Picard [12], temporal textures were modeled by a spatio-temporal AR model. The authors showed that the autoregressive model
can be used to synthesize video sequences, using large causal neighborhoods
containing over 1000 parameters. The approach reproduced the dynamics of
most input sequences effectively. The proposed algorithm was also used to recognize the type of content of temporal textures by applying a purely spatial AR model. During the recognition tests, 95% of the textures were correctly categorized as belonging to a certain texture class.
In general, all previous works on the AR model focused on specific applications, e.g. texture completion, image/video restoration (inpainting), texture
segmentation, etc. However, there is still a lack of theoretical understanding on
how some of them are designed and under which conditions these algorithms
work well. Furthermore, important modules of the AR concept are not taken
into account, e.g. an adequate quality assessment control, quality improvement
possibilities, an appropriate training (sample) decision procedure, solutions for possible texture completion failures, etc. Therefore, our contribution is to provide a complete AR-modelling framework that can be integrated into several texture completion applications. The novel ideas in this paper compared to the
state-of-the-art approaches are the implementation of an adaptive definition of
a training area, a pre- and post-processing step, a consistency criterion and a
regularization procedure in case of erroneous texture completion results.
3. Texture Completion Problem
The major problems that have to be tackled in any texture completion process are roughly two-fold. The first one relates to the proper estimation of the
underlying stochastic process of a given texture based only on a finite sample of
it. The second task refers to the formulation of an efficient procedure (model)
for generating new textures from a sample [18]. The former challenge steers the
accuracy of the synthesized textures, while the latter challenge, also referred
to as PDF sampling, determines the computational complexity of the texture
generation procedure.
Although a variety of texture models have been developed in recent years,
the most successful models for imaging applications are based on Markov Random Field (MRF) assumptions. The MRF model is characterized by statistical
interrelations within local vicinities. Thus the underlying generative stochastic
process of the texture source is assumed to be both local and stationary. That
is, each sample of a texture pattern is characterized by a small set of spatially
neighboring samples, and this characterization is the same for all samples. This
can be formalized as:
$$p(T_i \mid I_{-i}) = p(T_i \mid \Psi_i) \tag{1}$$
The assumption is that any texture pattern Ti extracted from a given sample I (e.g. image area) at location i can be predicted from the corresponding
neighborhood system Ψi and is independent of the rest of the texture. The
homogeneity property of the MRF presumes that the conditional probability $p(T_i \mid \Psi_i)$ is independent of the site i in I.
In this work, we will examine the autoregressive model as a popular MRF
approach for generating new textures from a given sample. The AR model is a
random process often used to predict various types of natural phenomena (e.g.
temporal signals). The underlying method can be extended to more than one
dimension and adapted to the problem of texture completion. This model treats
images as Markov processes, where a single sample is conditioned upon a spatial
neighborhood of samples. The conditional relationship, however, is expressed in
a parametric form through the linear autoregressive model, i.e. the AR model
attempts to predict the values of a data set as a linear combination of previous terms.
4. Overall AR Texture Completion Framework
The 2D-AR texture completion algorithm consists of four main and four optional steps (cf. Fig. 1): 1) definition of the training area (sample), 2) pre-processing of the training data (optional), 3) estimation of the AR coefficients, 4) estimation of the innovation term, 5) estimation of the optimal AR model (optional), 6) completion of the unknown area, 7) consistency assessment of the final result (optional) and finally 8) post-processing (optional). Note that by discarding the optional modules, the AR completion procedure can be simplified significantly, at the cost, however, of the visual quality of the results.

Figure 1: Block diagram of the AR reconstruction process, showing four main (solid blocks) and four optional (dashed blocks) steps.
In the 2D case, it is assumed that the input picture I includes a closed region Ω ⊂ I, containing unknown or damaged information that has to be reconstructed.
The AR algorithm is a parametric technique that reconstructs each sample
in the unknown area Ω using a linear combination of its neighbors plus additive
white Gaussian noise. In the first steps of the algorithm, the parameters of
the AR model are derived from a spatial (2D) training area adjacent to Ω.
It is therefore assumed that this area has similar characteristics as Ω. This
means that the trained model can only be used to reconstruct textures similar
to the training texture. It is usually not feasible to reconstruct several classes
of textures or transitions between different textures with the same parameters.
In the last step of the process, Ω is completed using the previously estimated parameters. An in-depth description of the framework is given in the next section.
5. Spatial Autoregressive Model
In this paper, a two-dimensional AR model is used in order to complete the
content of a missing image area Ω. Initially, the idea is to define each sample
of the image as a linear combination of its spatial neighbors plus a noise term.
Hence, the general definition of a 2D AR model can be expressed as
$$\hat{I}(x, y) = \sum_{j=y_{min}}^{y_{max}} \sum_{i=x_{min}}^{x_{max}} \alpha_{i,j}\, I(x - i, y - j) + \epsilon(x, y) \tag{2}$$

with $\epsilon(x, y) \sim N(0, \sigma^2)$ and $(i, j) \neq (0, 0)$,
where $\hat{I}(x, y)$ represents the completed sample at location (x, y) in the current image I, and (i, j) indexes the known spatial neighborhood values. y_min, y_max, x_min and x_max are constants that characterize the model order [cf. Fig. 2 (a)], and α_{i,j} are the prediction coefficients with j ∈ [y_min, y_max] and i ∈ [x_min, x_max]. The innovation term ε(x, y) is a white noise process with zero mean and variance σ²; ε(x, y) ∼ N(0, σ²) thus denotes the innovation signal which drives the AR model. In theory, white noise is spatially uncorrelated, i.e. the noise for each sample is independent and identically distributed (iid). Commonly, white noise models can be represented by Gaussian, Laplacian, Poisson, Cauchy or Uniform noise [28], among others. Due to the fact that (additive) Gaussian noise provides a good approximation of the noise in many real-world applications (e.g. imaging systems) and generates mathematically tractable models [29], the innovation term ε(x, y) in this work is represented by white Gaussian noise. However, the white-noise-driven AR process is only a subset of the general set of AR models [30]. For models other than causal ones, a different noise class usually leads to better outcomes [26], [31] (cf. Fig. 2).
Fig. 2 illustrates AR models (non-causal, semi-causal and causal) with different neighborhood structures. The pixel $\hat{I}(x, y)$ to be estimated is depicted in red. Note that the semi-causal neighborhood can be extended in the horizontal [cf. Fig. 2 (b)] as well as in the vertical direction. With respect to Fig. 2 (c) and (2), y_min = x_min = 0, x_max = c_x and y_max = c_y, where c_x, c_y represent the horizontal and vertical orders of the AR model. Depending on the application scenario and thus the defined neighborhood area, the neighborhood structure can be chosen accordingly. In the case of 2D texture completion, other neighborhood structures than those defined in Fig. 2 may also apply (cf. [32] and Sec. 7).
In order to predict $\hat{I}(x, y)$, the first steps require 1) defining a training area, 2) estimating the optimal model coefficients and 3) determining the variance σ² of ε.
Due to the predefined application (texture completion) of the 2D-AR approach in this tutorial and in order to keep the explanations as legible as possible, a purely causal AR neighborhood model [cf. Fig. 2 (c)] is considered
for further explanations. The extension to semi-causal neighborhoods can be
derived from the explanations of the causal structure.
Figure 2: Examples of AR models with different neighborhoods: (a-left) non-causal (b-middle)
semi-causal (c-right) causal.
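To make the causal case concrete, the following minimal Python/NumPy sketch evaluates the prediction equation (2) for a single sample with a causal neighborhood [cf. Fig. 2 (c)]. The function and variable names are hypothetical (the authors' implementation is in MATLAB, cf. Sec. 6.1), and row-major indexing I[y, x] is assumed.

```python
import numpy as np

def predict_sample(I_hat, x, y, alpha, cx, cy, sigma, rng):
    """Evaluate the causal prediction equation (2) for one sample.

    alpha[j, i] holds the coefficient for the offset (i, j); the entry
    alpha[0, 0] is unused, since (i, j) != (0, 0). sigma is the standard
    deviation (square root of the innovation variance sigma^2).
    """
    acc = 0.0
    for j in range(cy + 1):
        for i in range(cx + 1):
            if i == 0 and j == 0:
                continue  # the sample itself is excluded from the sum
            acc += alpha[j, i] * I_hat[y - j, x - i]
    # innovation term: zero-mean white Gaussian noise
    return acc + rng.normal(0.0, sigma)

# usage (hypothetical): rng = np.random.default_rng(0)
```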
5.1. Definition of the Training Area
An appropriate training area with valid samples is required in order to derive
the AR coefficients. It is supposed that the texture information in the training
area is stationary (parameters such as the mean and variance, if they exist, do
not vary over spatial locations) and has the same characteristics as the texture
to complete. Theoretically, the latter assumption implies that the texture has
Gaussian distribution. However, the distribution of image data is typically very
far from Gaussian. Methods based on the Gaussian distribution thus neglect
some important aspects of image data [33]. However, the Gaussianity assumption is recommended in many real-world applications [33] as the distribution of
natural or artificial textures has not been clearly determined to date [28].
Following the Markov Random Field theory, the most likely texture with a similar statistical distribution is adjacent to Ω. Fig. 3 shows an example of a training area that is located at the top-left corner of Ω. However, it is also possible to define other known image regions as the training area. For example, the expanded boundary around the missing region could be utilized for this purpose. In Fig. 3, a sub-training area of size s_x × s_y is defined as the training area eroded c_x and c_y times [due to the semi-causal or causal AR model, cf. Fig. 2 (b),(c)] on the left and top side of the training area, respectively. The known
samples I enclosed in the sub-training area are critical for the estimation of the
AR coefficients (cf. Sec. 5.3).
In case the AR model is applied on an area that is non-stationary (e.g.
the image contains multiple textures with different properties), the completion
results may become unreliable. The latter obstacle can be solved by analyzing
the available texture adjacent to Ω and defining an appropriate training area
adaptively. This may be done by defining a criterion which discards non-stationary regions. Depending on the application scenario, different approaches
may be used: e.g. k-means clustering [34], enhanced segmentation algorithms
[35], probability measurement techniques [28], among others.
Figure 3: Implementation of the 2D-AR approach using a top-left corner training area.

In this work, we have designed a robust and fast algorithm which discards the unreliable parts of the training region. We assume that the selected area is large enough (e.g. at least three times the size of Ω), so that the eventual loss of the non-stationary information does not affect the quality of the final result. Thus, the
causal textured area in the vicinity of Ω [cf. Fig. 4 (a)] is considered as a possible
training region. Subsequently, the latter is divided into blocks (b_x × b_y) that are to be clustered into a stationary sub-set. The mean (µ) and variance (δ²) of each block are determined (Gaussianity assumption) and clustering is operated based on the similarity of both features. For that, similarity thresholds t_µ and t_δ are introduced for µ- and δ²-based comparisons, respectively. As a result,
a set of segments are obtained. The largest region is chosen as the validated
training area [cf. Fig. 4 (b)].
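A possible reading of this pruning step is sketched below. The greedy, single-pass cluster-assignment rule is our assumption, as the exact clustering order is not fixed by the description above; all names are illustrative.

```python
import numpy as np

def prune_training_area(T, bx=10, by=10, t_mu=20.0, t_delta=20.0):
    """Divide the candidate training area T into (by x bx) blocks, cluster
    them by the similarity of their mean and variance, and return the
    top-left corners of the blocks in the largest (stationary) cluster."""
    corners, feats = [], []
    for y0 in range(0, T.shape[0] - by + 1, by):
        for x0 in range(0, T.shape[1] - bx + 1, bx):
            block = T[y0:y0 + by, x0:x0 + bx]
            corners.append((y0, x0))
            feats.append((block.mean(), block.var()))
    # greedy single-pass clustering: a block joins the cluster of the first
    # earlier block whose mean and variance both lie within the thresholds
    labels, n_clusters = [-1] * len(corners), 0
    for k, (mu, var) in enumerate(feats):
        for m in range(k):
            if abs(mu - feats[m][0]) < t_mu and abs(var - feats[m][1]) < t_delta:
                labels[k] = labels[m]
                break
        if labels[k] < 0:
            labels[k] = n_clusters
            n_clusters += 1
    largest = max(range(n_clusters), key=labels.count)
    return [c for c, l in zip(corners, labels) if l == largest]
```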
5.2. Sample Normalization
Before the training area can be used for the estimation of the AR coefficients,
it may be normalized in order to scale its features. Different scaling alternatives
can be used, e.g. the mean, the maximum or the minimum value, or the standard deviation of the data to be normalized. The most common method consists
in first calculating the mean and standard deviation of the training area [36].
Then, the mean value is subtracted from the training samples, which are further divided by the standard deviation. As a result, this normalization procedure ensures that the sample values in the training data have zero mean and unit variance.
It has been found that this step is important because, after scaling, the range of values of the image samples does not vary widely and the AR coefficients can therefore
capture the texture characteristics more accurately. This makes it possible to
handle a broader variety of textures.
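A minimal sketch of this scaling step and its inverse (the inverse is applied after completion, cf. Sec. 5.6), assuming the usual z-score convention:

```python
import numpy as np

def normalize(train):
    """Scale the training samples to zero mean and unit variance."""
    mu, std = float(train.mean()), float(train.std())
    return (train - mu) / std, mu, std

def denormalize(completed, mu, std):
    """Inverse scaling, applied to the filled area after completion."""
    return completed * std + mu
```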
Figure 4: Implementation of the stationarity criterion on a training area adjacent to the
unknown area (Ω). (left-a) The training region is divided into blocks (numbered from 1 to
40 in this example) of size bx × by and (right-b) Using a stationarity criterion, non-stationary
blocks are discarded from the training area.
5.3. Estimation of the AR Coefficients
The optimal AR coefficients can be estimated as the solution to the following least squares problem:

$$\boldsymbol{\alpha}_{C\times1} = \arg\min_{\alpha} \|\mathbf{y}_{S\times1} - \mathbf{X}_{S\times C}\,\boldsymbol{\alpha}_{C\times1}\|^2, \tag{3}$$

where $\boldsymbol{\alpha} \in \mathbb{R}^C$ is a vector containing the AR coefficients [cf. Fig. 2 (b),(c)]:

$$\boldsymbol{\alpha} = [\alpha_{1,0}, \alpha_{2,0}, \ldots, \alpha_{c_x,c_y}]^T. \tag{4}$$

$\mathbf{y} \in \mathbb{R}^S$ denotes the known samples I in the sub-training area (cf. Fig. 3):

$$\mathbf{y} = [I(x_0, y_0), \ldots, I(x_0 + s_x - 1, y_0 + s_y - 1)]^T, \tag{5}$$

and $\mathbf{X} \in \mathbb{R}^{S\times C}$ represents the neighboring sample matrix for each of the samples in $\mathbf{y}$:

$$\mathbf{X} = \begin{bmatrix} I(x_0 - 1, y_0) & \ldots & I(x_0 - c_x, y_0 - c_y) \\ \vdots & \ddots & \vdots \\ I(x_0 + s_x - 2, y_0 + s_y - 1) & \ldots & I(x_0 + s_x - c_x - 1, y_0 + s_y - c_y - 1) \end{bmatrix} \tag{6}$$

Furthermore, the subscripts in (3) represent the dimensions of the vectors and matrices, where C is the number of prediction coefficients. For Fig. 2 (c) (causal neighborhood), C = (c_x + 1)(c_y + 1) − 1. S denotes the number of samples in the sub-training area (the number of linear equations); e.g. using the neighborhood example in Fig. 3, S = s_x · s_y.

Hence, equation (3) can be solved with the closed-form solution:

$$\boldsymbol{\alpha} = (\mathbf{X}^T\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{y}). \tag{7}$$
As the set of coefficients α minimizes the model error in a least-square sense,
samples that are unsuitable for completion in the current training area are
assigned smaller coefficients, i.e., the AR model adapts to the local texture
characteristics. In case (7) cannot be solved due to a non-invertible matrix $\mathbf{X}^T\mathbf{X}$, a pseudo-inverse of $\mathbf{X}^T\mathbf{X}$ can be determined [37]. We determine the optimal coefficients in the luminance signal
of an image. During the completion of the hole (cf. Sec. 5.6) these optimal
coefficients are then applied to all channels of the image. Of course, there exist
other efficient ways to estimate the AR coefficients. Alternatively, the covariance
method (Yule-Walker equations) can be used [12].
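For illustration, the following sketch assembles y and X from the sub-training area and solves (3) in closed form, cf. (7). It assumes a causal neighborhood as in Fig. 2 (c) and that the sub-training area is eroded as in Fig. 3; the helper names are hypothetical.

```python
import numpy as np

def build_system(I, x0, y0, sx, sy, cx, cy):
    """Assemble y (S x 1) and X (S x C) from the sub-training area whose
    top-left corner is (x0, y0); requires x0 >= cx and y0 >= cy so that
    every causal neighbor lies inside the training area."""
    rows, targets = [], []
    for y in range(y0, y0 + sy):
        for x in range(x0, x0 + sx):
            targets.append(I[y, x])
            rows.append([I[y - j, x - i]
                         for j in range(cy + 1)
                         for i in range(cx + 1)
                         if (i, j) != (0, 0)])  # C = (cx+1)(cy+1) - 1
    return np.asarray(rows, float), np.asarray(targets, float)

def estimate_coefficients(X, yv):
    """Closed-form solution (7); the pseudo-inverse covers the case of a
    non-invertible X^T X, as noted in the text."""
    return np.linalg.pinv(X.T @ X) @ (X.T @ yv)
```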
By using the prediction equation only, we utilize the top and left borders as initial conditions, but not the right and bottom edges of the hole. This, however, can lead to noticeable artifacts at the boundaries (cf. Sec. 5.8). To
overcome this issue it is possible to extend the proposed AR model according
to the interpolation approaches outlined in [25] and [26], assuming that all texture samples in the extended boundary of the hole are available. Nevertheless,
these interpolation methods are more complex than the method proposed in
this paper.
5.4. Estimation of the Innovation Term
Once the AR coefficients are estimated, the variance σ² of the innovation term ε(x, y) can be calculated from the completion error

$$\mathrm{Err} = \|\mathbf{y}_{S\times1} - \mathbf{X}_{S\times C}\,\boldsymbol{\alpha}_{C\times1}\|^2, \tag{8}$$

which is normalized by the size of the sub-training area [12]:

$$\sigma^2 = \frac{\mathrm{Err}}{S}. \tag{9}$$

Note that Err is estimated on the same area as the one used to learn the AR coefficients, i.e. the sub-training area. Hence, when calculating the error, we simulate a completion Xα of the sub-training area (cf. Fig. 3). During the simulation, I [cf. (2)] is not modified. This ensures that the completion error stems only from the imperfectly predicted AR coefficients and not from the use of completed samples in the simulated filling procedure.
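Given X, y and α from the previous step, the simulated completion error (8) and the normalized innovation variance (9) reduce to a few lines (illustrative sketch, not the authors' implementation):

```python
import numpy as np

def innovation_variance(X, yv, alpha):
    """Completion error (8), normalized by the sub-training size S (9)."""
    err = float(np.sum((yv - X @ alpha) ** 2))
    return err / len(yv)  # sigma^2
```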
5.5. Estimation of the Optimal AR Model Order
The problem of selecting the optimal order C for the AR model has been
widely studied in the area of model identification for many years. Some of the earliest work in this area was performed by Box and Jenkins [38] in their study of AR time series modeling. Their method relies on analyzing the autocorrelation function and the partial autocorrelation function of the signal. At higher dimensions, however, these two functions have complex patterns and are difficult to interpret [12]. Hence, other methods like Akaike's Information Criterion (AIC) [39] and Schwarz's Bayesian Criterion (SBC) [40] are more suitable for the 2D (and 2D+t) estimation of the optimal AR order. The energies of these criteria are defined as follows:
$$E_{aic} = S \ln(\sigma^2) + 2C, \qquad E_{sbc} = S \ln(\sigma^2) + C \ln(S). \tag{10}$$
Both criteria contain an error term [the left term in (10)], which depends
on the noise variance, and a term which penalizes high model orders [the right
term in (10)]. The second term has a stronger impact in the SBC than in the
AIC. Also, the SBC was designed to be an improved version of the AIC [12].
Therefore it is the more accurate criterion. The optimal value of C has to fulfill
the following minimization criterion according to [40]:

$$C = \arg\min_{C}(E_{sbc}). \tag{11}$$
The energy E_sbc must therefore be estimated for several settings of C to find the optimal value. This can be done in a loop where C is varied in order to find the value of C that results in a minimal E_sbc (cf. Fig. 1).
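Such a loop may look as follows, assuming square causal orders c = c_x = c_y and reusing build_system, estimate_coefficients and innovation_variance from the sketches in Secs. 5.3 and 5.4; the candidate range is an assumption:

```python
import numpy as np

def select_order_sbc(I, x0, y0, sx, sy, max_order=4):
    """Sweep square causal orders c = cx = cy and keep the order that
    minimizes E_sbc = S ln(sigma^2) + C ln(S), cf. (10) and (11)."""
    best = None
    for c in range(1, max_order + 1):
        X, yv = build_system(I, x0, y0, sx, sy, c, c)
        alpha = estimate_coefficients(X, yv)
        sigma2 = innovation_variance(X, yv, alpha)
        S, C = len(yv), (c + 1) ** 2 - 1
        e_sbc = S * np.log(sigma2) + C * np.log(S)
        if best is None or e_sbc < best[0]:
            best = (e_sbc, c, alpha, sigma2)
    return best  # (E_sbc, order, coefficients, innovation variance)
```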
5.6. Completion of the Missing Area
Finally, once all the model parameters are obtained, the unknown information can be completed in raster-scan order. Hence, (2) is applied to each sample $\hat{I}(x, y)$ of Ω (cf. Fig. 3), where I(x, y) represents the available pixels within a causal neighborhood. If the original values I are not available (inside of Ω), their completed values $\hat{I}$ are used instead. However, other filling orders besides the raster scan may also provide competitive results.
After Ω is completed, an inverse normalization operation with the determined scaling features (mean and standard deviation) has to be performed to
achieve the final texture.
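A minimal raster-scan filling sketch follows; representing Ω as a boolean mask is our choice, and the neighborhood ordering must match the construction of X in Sec. 5.3:

```python
import numpy as np

def complete_region(I_hat, omega, alpha, cx, cy, sigma, rng):
    """Fill the hole in raster-scan order via Eq. (2). `omega` is a boolean
    mask of the missing area; already completed samples are reused as
    neighbors, exactly as described above. `sigma` is the square root of
    the innovation variance sigma^2."""
    offsets = [(i, j) for j in range(cy + 1) for i in range(cx + 1)
               if (i, j) != (0, 0)]  # must match the ordering used for X
    H, W = I_hat.shape
    for y in range(H):
        for x in range(W):
            if not omega[y, x]:
                continue
            neigh = np.array([I_hat[y - j, x - i] for i, j in offsets])
            I_hat[y, x] = float(alpha @ neigh) + rng.normal(0.0, sigma)
    return I_hat
```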
5.7. AR Consistency Criterion
In the completion framework, the example texture surrounding Ω is finite.
Hence, the best AR settings may still be a bad compromise. In fact, it may
happen that no good fit is found for the selected training area. Furthermore, it
is possible that the estimated AR coefficients overfit the training data. If such
inadequate AR parameters are used to complete Ω, then erroneous propagation
of the existing texture will be the consequence. Hence, an AR consistency
assessment criterion is recommended in this work.
Since the completion process is derived from the initialization samples (cf.
Fig. 3), the properties of the final result and those of the initialization area
should be similar. If Imin and Imax are respectively the lowest and the highest
initialization sample intensity values, the texture completion is considered unsuccessful if

$$I_{min} - \tau > \hat{I}(x, y) \quad \text{or} \quad I_{max} + \tau < \hat{I}(x, y), \qquad \text{with } (x, y) \in \Omega, \tag{12}$$

where τ is a threshold value that allows a small deviation from $I_{min}$ and $I_{max}$.
This is a fairly simple criterion, motivated by the observation that AR distortions typically lead to gross chromatic variations that deviate strongly from the spatial context.
When this criterion detects an erroneous texture completion result, it is
advised to repeat the whole AR process and use a “regularization procedure”
[41] to make the system yield a different set of coefficients.
In the field of machine learning (also mathematics and statistics), regularization involves introducing additional parameter(s) in order to solve an ill-posed
problem or to prevent overfitting. By minimizing the augmented error function
instead of the error on the image data, complex models can be penalized. In
detail, a new parameter λ is defined that allows us to regularize the coefficients
α, so that the variance of α is decreased. Now the least squares problem (3) can be expressed as:
$$\boldsymbol{\alpha}_{C\times1} = \arg\min_{\alpha}\, \|\mathbf{y}_{S\times1} - \mathbf{X}_{S\times C}\,\boldsymbol{\alpha}_{C\times1}\|^2 + \lambda\,\|\boldsymbol{\alpha}_{C\times1}\|^2. \tag{13}$$
Hence, (13) can be estimated with the closed-form solution:

$$\boldsymbol{\alpha} = (\mathbf{X}^T\mathbf{X} + \lambda \mathbf{U})^{-1}(\mathbf{X}^T\mathbf{y}), \tag{14}$$

where $\mathbf{U} \in \mathbb{R}^{C\times C}$ represents the identity matrix.
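Both the consistency check (12) and the regularized re-estimation (14) are straightforward to sketch; the default values τ = 30 and λ = 200 follow Table 2 (illustrative code, hypothetical names):

```python
import numpy as np

def is_consistent(filled, i_min, i_max, tau=30.0):
    """Criterion (12): completion fails if any synthesized sample leaves
    the interval [i_min - tau, i_max + tau]."""
    return filled.min() >= i_min - tau and filled.max() <= i_max + tau

def ridge_coefficients(X, yv, lam=200.0):
    """Regularized closed-form solution (14); the lambda term shrinks the
    coefficient variance and counteracts overfitting."""
    C = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(C), X.T @ yv)
```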
5.8. Post Processing
Depending on the completion scenario, the completed texture in Ω may still feature noticeable perceptual distortions, especially at the right and bottom boundaries between Ω and the original texture [cf. Fig. 5 (e),(f)]. For this
reason, a post-processing module is required to improve the perceived quality.
In this work, we present Poisson cloning [42] as a solution to this issue. In
general, Poisson cloning corrects the reconstructed areas in Ω photometrically
such that subjective impairments are minimized. For further details please refer
to the work of Perez et al. [42].
As a matter of fact, an efficient post-processing step can be designed by
utilizing other tools, e.g. feathering [43].
6. Experimental Results
In this section, detailed experiments are described. In Sec. 6.1, the data
set as well as the evaluation measures used are defined. In Secs. 6.2-6.6, the definition of the training area, the estimation of the optimal AR model order, the AR consistency criterion and the post-processing module are evaluated to assess their respective contribution to the overall performance of AR-based texture completion.

Figure 5: Influence of the sub-training area size [number of linear equations (S)] on the filling results. Average values for (a) PSNR vs. run-time (measured in seconds) and (b) SSIM vs. run-time (measured in seconds) over the whole test set with Ω = 40 × 40. (c-f) Subjective differences for the test images rough paper (top) and cork (bottom): (c) input with Ω = 40 × 40; results of the AR texture completion (without post-processing) with C = 15 and (d) S = 36, (e) S = 841, (f) S = 3025.
6.1. Data Set and Quality Measures
To evaluate the proposed AR algorithm, 20 test images are used: rough
plastic, plaster, rough paper, artificial grass, cork, sponge, lettuce leaf, loofa,
limestone, ribbed paper, straw, corduroy, stones, corn husk, white bread, soleirolia, orange peel, peacock feather, tree bark and moss. The AR model can be
used to synthesize a class of texture with a parameter set that is trained on the same texture class. The data set is therefore deliberately chosen to cover a broad spectrum of different texture characteristics. All images have a resolution of 180 × 180 (cropped from the original resolution of 640 × 480) and are publicly available in the Columbia-Utrecht Reflectance and Texture Database (CUReT) [44]. Furthermore, all tests are conducted with two different hole sizes (Ω = 20 × 20 and 40 × 40, i.e. 1.2% and 5% of the image size). A PC with an
Intel Xeon CPU (3.33 GHz) and 4 GB RAM was used in our experiments. The
software is currently implemented in MATLAB.
The performance of the proposed completion algorithm is assessed with
PSNR and SSIM. For the presented results, PSNR is computed locally, only
for Ω, while SSIM is determined for the entire image as it is not suitable for
arbitrarily small regions. SSIM is provided in addition to PSNR, since PSNR is
not always a reliable measure to judge the quality of texture completion results
[45]. Furthermore, the synthesized image regions are random samples from an
underlying distribution. Therefore, high PSNR results cannot be expected. However, the metric is utilized in conjunction with SSIM to visualize trends of quality changes based on parameter adjustments (cf. Fig. 5, Fig. 8 and Fig.
9).
6.2. Assessment of the Training Area
As mentioned in Sec. 5.1, the first step of the AR process is to identify an
appropriate training area adjacent to Ω to be filled-in. This task incorporates
two main investigations: assessment of 1) the size of the sub-training area and
2) the stationarity of the training texture.
6.2.1. Assessment of the Size of the Sub-Training Area
The first decision to make concerns the size S of the sub-training area, which
also corresponds to the number of linear equations (cf. Sec. 5.3). For this
investigation, experiments were conducted for all test images assuming a square training window, i.e. c_x = c_y, s_x = s_y (S = s_x · s_y), at the top-left corner of Ω (cf.
Fig. 3) without loss of generality. Furthermore, all tests were performed using
a causal model [cf. Fig. 2 (c)] with three different settings of C and the two
hole sizes (Ω) in order to draw reliable conclusions. Note that the PSNR and
SSIM results are measured using the original texture from Ω. Fig. 5 (a) and (b)
depict the average values that were achieved for PSNR and SSIM (blue lines) over the whole test set with Ω = 40 × 40, depending on the number of linear equations (S). Similar results are observed for Ω = 20 × 20. Furthermore, the averaged run-time to synthesize the missing region in an image is also taken into consideration and depicted on the y sub-axis [green lines in Fig. 5 (a) and (b)]. It can be seen that all configurations of C (C = 15, 63, 143) perform similarly although different training sizes (from S = C + 1 to 5000) are considered. Small values of S yield low PSNR and SSIM values. It appears that S should be larger than 500 [cf. Fig. 5 (a), (b)] to contain a sufficient amount of texture information to fit the model. On the other hand, when S > 1000, there is a clear saturation of the quality of the final results although the computational costs progressively increase [green lines in Fig. 5 (a), (b)]. The computational time increases due to the growing number of linear equations that have to be solved. In terms of subjective evaluation, it was found that, in order to obtain satisfying results, S should be at least ten times larger than C. The visual influence of the sub-training size is shown in Fig. 5 (d), (e) and (f). Considering all results (subjective and objective) leads us to the conclusion that a sub-training size of S ≈ 800 (s_x = s_y = 29) is a reasonable compromise between complexity and quality.
Figure 6: Pruning of the training area for (a) orange peel, (b) peacock feather, (c) lettuce leaf, (d) sponge and (e) moss. Examples of (top to bottom): input with Ω = 40 × 40; the non-stationary training area; the training area after applying the new block-based clustering criterion; and the training area after applying k-means clustering.

Figure 7: Influence of the stationarity criteria on the filling results. Subjective differences for the test images orange peel (top) and peacock feather (bottom). (a) Input with Ω = 40 × 40. Results (with sample normalization) using (b) the non-stationary training area; (c) the training area achieved with the block-based clustering; and (d) the training area achieved with k-means clustering (cf. Fig. 6). Note that all results are generated without post-processing.
6.2.2. Pruning of the Training Area
The impact of the content of the training area on the completion results is an
important investigation, which highly depends on the texture characteristics of
the considered image. In general, if the texture information in the training area
is stationary and correlates with the unknown texture in Ω, the missing texture
will be more likely to be well completed. In case the training area is not well
initialized, the methods proposed in section 5.1 can be applied. Fig. 6 illustrates
the effect of processing the training area using the new block-based stationarity
criterion in comparison to the k-means clustering approach. It can be seen that
both methods can successfully recognize and remove the non-stationary texture locations. K-means works sample-wise and is an iterative approach (cf. [34]), whereas our method operates block-wise and non-iteratively. Therefore, the
excluded regions of the final results have irregular boundaries after applying the
k-means method (cf. Fig. 6, bottom row). An essential disadvantage of k-means is the problem of finding the global minimum: in general, this method converges towards a local minimum. Commonly, this problem is addressed by an exhaustive choice of starting conditions; using several replicates with random starting points typically results in a solution that is a global minimum, at the cost of additional run-time. As a result, the new method is approximately three times faster than k-means clustering. In a set of different simulations, it was found that our criterion works well with t_µ = t_δ = 20. Furthermore, the computation of the training area with k-means (k = 2) was optimized: after applying the clustering procedure, small blobs (smaller than 20 samples) were removed, so that only large texture segments were kept. Fig. 7 illustrates the influence of the stationarity criteria on the filling results (electronic magnification may be required). Note that all results are generated with C = 15, t_µ = t_δ = 20 and without post-processing.
It can be seen that 1) the quality of the completion results increases when an
appropriate texture [cf. Fig. 7 (d) vs. (b),(c)] is selected for the training process
and 2) the quality of the completion results is less dependent on the homogeneity
of the training area when sample normalization is applied [cf. Fig. 7 (c), (d)
and Fig. 10 (e), (f)]. Hence, it can be concluded that pruning the training area is recommended when the training samples are not normalized or the texture features complex patterns.
6.3. Assessment of the Optimal Coefficient Order
Concerning the optimal decision of the AR coefficient order C (number of
coefficients), the reliability of the SBC criterion has been investigated. Fig. 8
(a), (b) shows the objective results for the test image ribbed paper. The energy
(E_sbc) of the SBC (on the y sub-axis) has been compared to the obtained SSIM and PSNR values (on the y-axis). As a reminder, the PSNR and SSIM results are measured using the original texture as a reference. Furthermore, all tests were performed using the constellation shown in Fig. 3 with a sub-training size of S ≈ 800. It can be seen [cf. Fig. 8 (a),(b)] that the SBC behaves inversely to
SSIM and PSNR. Similar objective results are observed for all other test images.
Based on the subjective and objective results (cf. Fig. 8) it can be concluded
that there is no correlation between the SBC and the quality of the synthesized
output. This statement is also based on the evaluation of all 20 test images.
Nevertheless, the minimum value of the SBC corresponds to reasonable texture quality. Due to the fact that the minimum value of the SBC commonly lies in the range between C = 8 (c_x = c_y = 2) and C = 24 (c_x = c_y = 4), and the visual quality is always acceptable from C = 15 onwards [cf. Fig. 8 (d), (e), (f)], we conclude that C = 15 is a good compromise between quality
and computational complexity. In case of applications in which the run-time is
not critical, larger AR orders can be used. However, it is not recommended to
utilize values of C > 168 as that may lead to overfitting of the training data.
6.4. Evaluation of the AR Consistency Criterion
In some rare cases, the quality of the output, in spite of all optimization steps, results in an erroneous propagation of the existing texture. As mentioned in Sec. 5.7, this can be caused either by the training area not being stationary or by the coefficients obtained by solving the system not being optimal and overfitting the training examples (in some cases both problems are linked together). Thus, the sample values in Ω oscillate exponentially during the texture completion, in one or more color channels, which results in very noticeably corrupted outputs [cf. Fig. 9 (d)]. To detect these failures, an AR verification criterion
was developed (cf. Sec. 5.7). The parameter τ is set to 30 as shown in Table
2. Fig. 9 depicts the results obtained by applying the regularization criterion
as defined in (13) and (14). It can be seen that in this example the final result
does not depend on the number of AR coefficients C [cf. Fig. 9 (d)].

Figure 8: Influence of the AR model order (number of AR coefficients C) on the filling results. (a) PSNR vs. SBC and (b) SSIM vs. SBC for the test image ribbed paper. Subjective differences for the test images plaster (top) and ribbed paper (bottom). (c) Input with Ω = 40 × 40. Results of the AR texture completion (without post-processing) with S = 841 and (d) C = 3, (e) C = 15, (f) C = 63.

Figure 9: Influence of the regularization procedure on the filling results. (a) PSNR and (b) SSIM for the test image corn husk with C = 15. Subjective evaluation: (c) input with Ω = 40 × 40. Results of the AR texture completion (without post-processing) with S = 841, C = 15 (top), C = 255 (bottom) and (d) λ = 0.1, (e) λ = 200 (top), λ = 500 (bottom) and (f) λ = 5e5.

Hence, by
varying the regularization parameter λ, the texture completion outcome can be
improved. In general, the optimal parameter value of λ depends on the texture
content and C. If λ is set to be very small, the algorithm fails to eliminate
overfitting [cf. Fig. 9 (d)]. Conversely, if λ is too large, the algorithm may result in underfitting (most of the coefficients receive zero or near-zero values), i.e. it fails to fit even the training data well (cf. Fig. 9). Hence, we choose for
λ the value that achieved the best objective result (cf. Fig. 9 and Table 2). A
practical application and an extensive evaluation of our verification method can
be found in [46].
Table 2: Optimized settings of the proposed AR texture completion method

Parameter | Value
S | ≈ 800
C | 15
t_µ, t_δ | 20
b_x × b_y | 10 × 10
τ | 30
λ | 200
6.5. Overall System Evaluation
Based on the experiments presented in the preceding sections, optimized parameter settings have been derived; they are summarized in Table 2.
Normalizing the data is not a standard procedure for AR-based texture
completion. Nevertheless, experimental results have evidenced that using the
proposed normalization leads to an improvement of the final texture quality.
Fig. 10 (e), (f) shows the visual difference without and with the scaling step.
Furthermore, the required AR model order can be significantly reduced through normalization, i.e. fewer (approximately 2-3 times) AR coefficients are required to capture the
texture characteristics accurately. This may, however, be different for other
data sets or specific applications. Nevertheless, using normalized data is not
always an advantage. One of the main drawbacks is that the AR approach tends to produce smoother results [cf. Fig. 10 (e), (f), first and fifth texture]. This is due to the fact that, at the end of the texture completion step, the same mean and variance are applied to rescale all reconstructed samples in the unknown area. Therefore, the local variance of the completed samples in
Ω does not always match the original signal. Furthermore, in a set of different
simulations, it was found that the AR approach provides more reliable results
using the mean and standard deviation of the initialization area than those
obtained on the training area (cf. Fig. 3).
Due to the proposed usage of the prediction equation, the information on
the right and bottom borders of the completed texture in Ω can be perceptibly
inconsistent with the adjacent original information. To solve this limitation, the
Poisson equation is presented as a possible solution. The impact of this post-processing step on the final results is depicted in Fig. 10 (f), (g). As can be seen,
the results achieved with the Poisson image cloning technique [cf. Fig. 10 (g)]
are much more consistent than those presented in Fig. 10 (f). To overcome this
boundary problem it is also possible to extend the proposed AR model following
the interpolation approaches proposed in [25] and [26].
In the next experiments, the AR-based completion is compared to two template matching algorithms and one parametric method: 1) the priority-based algorithm [19], 2) the coherency sensitive hashing (CSH) algorithm [24] incorporated in an inpainting framework [47] and 3) the frequency selective extrapolation (FSE) [13] method. The priority-based method is executed with a patch size of 9 × 9 samples, as recommended by Criminisi et al. [19]. The CSH relies on hashing,
Table 3: Average run-time performances of different frameworks over the whole data set

Methods | Average time (s) | Loss factor
AR | 0.199 | 1.0
AR with post-processing | 0.303 | 1.5
Priority-based [19] | 38.642 | 194.2
Inpainting with CSH [47], [24] | 1.747 | 8.8
Frequency selective [13] | 3.517 | 17.7
which maps similar templates to the same bin, in order to find matching representatives. As a result, inpainting with CSH [47] is a novel, fast, optimized and accurate texture completion approach [24]. For the parametric FSE approach, the settings proposed in [13] were used. Fig. 10 (b), (c), (d), (f), (g) illustrates the results of the four different approaches. According to the results obtained, the template matching algorithms provide slightly better visual results, e.g. details are better preserved in comparison to the AR approach. However, coarse patch transitions can lead to annoying perceptual distortions. Furthermore, the
results achieved by the FSE approach are significantly blurrier in comparison
to those generated by the AR method.
6.6. Complexity
The AR completion method requires low computational effort. In particular, it is approximately 194, 9 and 18 times faster than the priority-based, inpainting with CSH and FSE approaches, respectively. The averaged run-times in Table 3 are estimated over all 20 test images and the gains are calculated in relation to the run-time of the AR approach without post-processing. Using the post-processing module improves the visual quality of the results significantly, but adds only a small complexity overhead (0.1 s) compared to the competing completion methods (cf. Fig. 6 and Table 3). Note that both template matching
methods can be accelerated by reducing the search area [23], [48]. Nevertheless,
they still cannot perform as fast as the AR approach. Hence, the usage of the
AR approach makes the completion process faster, without degrading the visual
texture completion quality.
7. Practical Applicability of the AR Model
Until now, we have presented a controlled texture completion environment using the AR model. In this section, the application of AR-based completion
to real-world scenarios will be discussed. An example of practical applicability
of this approach is shown in Fig. 11. Image reconstruction is an important
topic in image communication, where concealment of image data corrupted by
transmission errors can be obtained. Fig. 11 (a) depicts two sub-images [Lena (512 × 512) and Baboon (512 × 512)] with 16 × 16 block losses as discussed in [13]. Here again, we compare the results with the priority-based [19] and the inpainting with CSH [47], [24] template matching algorithms as well as with the FSE [13] algorithm. All algorithms provide reasonable results. Nevertheless, the priority-based [19] and the inpainting with CSH algorithms introduce blocking artifacts and disturbing edges at the boundary between Ω and the original texture [Fig. 11 (b), (c), eye of the baboon], while the proposed method conceals these transitions in a more graceful manner [cf. Fig. 11 (e)]. The FSE [13] algorithm introduces blur into Ω [cf. Fig. 11 (d)], while the AR model maintains the texture [cf. Fig. 11 (e)]. Additionally, the AR approach requires the lowest computational time using MATLAB (cf. Table 4).

Figure 10: Subjective results for the test images (from left to right) rough plastic, sponge, ribbed paper, lettuce leaf, straw and peacock feather, applying different texture completion methods. (a) Input with Ω = 40 × 40. Completed results with the (b) priority-based [19], (c) inpainting with CSH [47], [24] and (d) FSE [13] algorithms. Results of the AR texture completion with C = 15 and S = 841 (e) without sample normalization, (f) with sample normalization and (g) with sample normalization and post-processing steps.

Figure 11: Concealment of block losses as presented in [13] for the test images Lena (top) and Baboon (bottom). (a) Isolated 16 × 16 block losses. Concealed sub-images with the (b) priority-based [19], (c) inpainting with CSH [47], [24] template matching and (d) FSE [13] algorithms. (e) Result of the AR texture completion with C = 15, sample normalization, the new block-based clustering criterion and the post-processing steps.

Table 4: Average run-time performances in seconds for error concealment of block losses

Methods | Lena | Baboon
AR with post-processing | 9.27 | 9.13
Priority-based [19] | 94.68 | 100.29
Inpainting with CSH [47], [24] | 67.51 | 87.83
Frequency selective [13] | 48.98 | 49.15
The AR-based synthesis is deterministic and can be controlled. This is of
great value for video coding applications [49], [46]. Avoiding the transmission
of texture regions can drastically reduce the bit-rate. These regions can then be
synthesized at the decoder using 2D+t AR synthesis. However, several frames
need to be available for the training of the AR parameter set [46].
8. Conclusion
This paper addressed the 2D autoregressive model for texture completion.
A range of different degrees of freedom were discussed. Furthermore, the parametric completion results were evaluated with a new consistency criterion. In
case of detected errors in the synthesized areas, a regularization procedure was
discussed as a fallback. Finally, pre- and post-processing steps were applied in
order to improve the final results. The potential as well as the limitations of
all AR-related modules were addressed, and their performance was presented when integrated into a texture completion framework.
Acknowledgements
This work was funded in part by the German Research Foundation [Deutsche
Forschungsgemeinschaft (DFG)], under the grant WI 2032/4-1.
Furthermore, this work was carried out during the tenure of an ERCIM
“Alain Bensoussan” Fellowship Programme. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement №246016.
References
[1] L.-Y. Wei, S. Lefebvre, V. Kwatra, G. Turk, et al., State of the art in
example-based texture synthesis, in: Eurographics 2009, State of the Art
Report, EG-STAR, 2009, pp. 93–117.
[2] M. Bertalmío, V. Caselles, S. Masnou, G. Sapiro, Inpainting, Encyclopedia
of Computer Vision, Springer.
[3] P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull, T. Wiegand, Perception-oriented video coding based on image analysis and completion: A review, Signal Processing: Image Communication 27 (6) (2012)
579–594.
[4] J. Portilla, E. P. Simoncelli, A parametric texture model based on joint
statistics of complex wavelet coefficients, International Journal of Computer
Vision 40 (1) (2000) 49–70.
[5] D. J. Heeger, J. R. Bergen, Pyramid-based texture analysis/synthesis, in:
ACM Transactions on Graphics (SIGGRAPH), ACM, 1995, pp. 229–238.
[6] G. Doretto, A. Chiuso, Y. N. Wu, S. Soatto, Dynamic textures, International Journal of Computer Vision 51 (2) (2003) 91–109.
[7] R. Chellappa, R. L. Kashyap, Texture synthesis using 2-d noncausal autoregressive models, IEEE Transactions on Acoustics, Speech and Signal
Processing 33 (1) (1985) 194–203.
[8] K. Deguchi, Two-dimensional auto-regressive model for analysis and synthesis of gray-level textures, Proceedings of the 1st International Symposium
for Science on Form (1986) 441–449.
[9] J. K. Tugnait, Estimation of linear parametric models of non-Gaussian discrete random fields with application to texture synthesis, IEEE Transactions on Image Processing 3 (2) (1994) 109–127.
[10] M. S. Joshi, P. P. Bartakke, M. S. Sutaone, Texture representation using autoregressive models, in: International Conference on Advances in
Computational Tools for Engineering Applications (ACTEA), 2009, pp.
386–390.
[11] A. Kokaram, A statistical framework for picture reconstruction using 2d ar
models, Image and Vision Computing 22 (2) (2004) 165–171.
[12] M. Szummer, R. W. Picard, Temporal texture modeling, in: Proceedings
of the International Conference on Image Processing (ICIP), Vol. 3, IEEE,
1996, pp. 823–826.
[13] A. Kaup, K. Meisinger, T. Aach, Frequency selective signal extrapolation with applications to error concealment in image communication, AEÜ - International Journal of Electronics and Communications 59 (3) (2005) 147–156.
[14] M. Bertalmio, G. Sapiro, V. Caselles, C. Ballester, Image inpainting, in:
Proceedings of the 27th Annual Conference on Computer Graphics and
Interactive Techniques, ACM Press/Addison-Wesley Publishing Co., 2000,
pp. 417–424.
[15] C. Ballester, M. Bertalmio, V. Caselles, G. Sapiro, J. Verdera, Filling-in by
joint interpolation of vector fields and gray levels, IEEE Transactions on
Image Processing 10 (8) (2001) 1200–1211.
[16] A. Levin, A. Zomet, Y. Weiss, Learning how to inpaint from global image
statistics, in: Proceedings of the Ninth IEEE International Conference on
Computer Vision (ICCV), IEEE, 2003, pp. 305–312.
[17] M. Bertalmio, L. Vese, G. Sapiro, S. Osher, Simultaneous structure and
texture image inpainting, IEEE Transactions on Image Processing 12 (8)
(2003) 882–889.
[18] L.-Y. Wei, M. Levoy, Fast texture synthesis using tree-structured vector
quantization, in: Proceedings of the 27th Annual Conference on Computer
Graphics and Interactive Techniques, ACM Press/Addison-Wesley Publishing Co., 2000, pp. 479–488.
[19] A. Criminisi, P. P´erez, K. Toyama, Region filling and object removal by
exemplar-based image inpainting, IEEE Transactions on Image Processing
13 (9) (2004) 1200–1212.
[20] M. Ashikhmin, Synthesizing natural textures, in: Proceedings of the 2001
Symposium on Interactive 3D Graphics, ACM, 2001, pp. 217–226.
[21] V. Kwatra, A. Schödl, I. Essa, G. Turk, A. Bobick, Graphcut textures:
image and video synthesis using graph cuts, in: ACM Transactions on
Graphics (SIGGRAPH), ACM, 2003, pp. 277–286.
[22] P. Ndjiki-Nya, C. Stuber, T. Wiegand, Texture synthesis method for generic
video sequences, in: Proceedings of the IEEE International Conference on
Image Processing (ICIP), Vol. 3, IEEE, 2007, pp. III–397.
[23] P. Ndjiki-Nya, M. Köppel, D. Doshkov, T. Wiegand, Automatic structure-aware inpainting for complex image content, in: Advances in Visual Computing, Las Vegas, USA, 2008, pp. 1144–1156.
[24] S. Korman, S. Avidan, Coherency sensitive hashing, in: Proceedings of
IEEE International Conference on Computer Vision (ICCV), IEEE, 2011,
pp. 1607–1614.
[25] A. Kokaram, P. Rayner, Detection and interpolation of replacement noise in motion picture sequences using 3D autoregressive modelling, in: Proceedings
of the IEEE International Symposium on Circuits and Systems, IEEE,
1994, pp. 21–24.
[26] A. Kokaram, Motion Picture Restoration, Springer, London, 1998.
[27] A. J. E. M. Janssen, R. N. Veldhuis, L. B. Vries, Adaptive interpolation of
discrete-time signals that can be modeled as autoregressive processes, IEEE
Transactions on Acoustics Speech and Signal Processing 34 (2) (1986) 317–
330.
[28] A. Hyvärinen, J. Hurri, P. O. Hoyer, Natural Image Statistics, Vol. 39,
Springer, Feb., 2009.
[29] K. R. Castleman, Digital image processing, Curve and Surface Fitting
(1996) 501–507.
[30] R. L. Kashyap, Univariate and multivariate random field models for images,
Elsevier Computer Graphics and Image Processing 12 (2) (1980) 257–270.
[31] J. S. Lim, Two-Dimensional Signal and Image Processing, Prentice Hall, 1989.
[32] D. Doshkov, O. Jottrand, T. Wiegand, P. Ndjiki-Nya, On the efficiency of
image completion methods for intra prediction in video coding with large
block structures, in: IS&T/SPIE Electronic Imaging, International Society
for Optics and Photonics, 2013, pp. 1–4.
[33] M. Tuceryan, A. K. Jain, Texture analysis, in: Handbook of Pattern Recognition and Computer Vision, 1993, pp. 235–276.
[34] C. M. Bishop, Neural networks for pattern recognition, Oxford University
Press, Oxford, 1995.
[35] P. Ndjiki-Nya, G. Simo, T. Wiegand, Evaluation of color image segmentation algorithms based on histogram thresholding, in: Proceedings of Visual
Content Processing and Representation, 2006, pp. 214–222.
[36] N. J. Salkind (Ed.), Normalizing Data, SAGE Publications, Inc., Encyclopedia
of Research Design, 2010.
[37] G. Golub, W. Kahan, Calculating the singular values and pseudo-inverse
of a matrix, Journal of the Society for Industrial & Applied Mathematics,
Series B: Numerical Analysis 2 (2) (1965) 205–224.
[38] G. E. Box, G. M. Jenkins, G. C. Reinsel, Time series analysis: forecasting
and control, Wiley, New Jersey, 2013.
[39] H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control 19 (6) (1974) 716–723.
[40] G. Schwarz, Estimating the dimension of a model, The Annals of Statistics
6 (2) (1978) 461–464.
[41] E. Alpaydin, Introduction to machine learning, MIT press, Cambridge,
Massachusetts, London, England, 2004.
[42] P. P´erez, M. Gangnet, A. Blake, Poisson image editing, ACM Transactions
on Graphics (SIGGRAPH) 22 (3) (2003) 313–318.
[43] R. Szeliski, Image alignment and stitching: A tutorial, Tech. rep., Microsoft
Research, MSR-TR-2004-92 (2006).
[44] K. J. Dana, B. Van Ginneken, S. K. Nayar, J. J. Koenderink, Reflectance
and texture of real-world surfaces, ACM Transactions on Graphics (SIGGRAPH) 18 (1) (1999) 1–34.
[45] Z. Wang, A. C. Bovik, Mean squared error: love it or leave it? a new
look at signal fidelity measures, IEEE Signal Processing Magazine 26 (1) (2009) 98–117.
[46] F. Racape, D. Doshkov, M. Köppel, P. Ndjiki-Nya, 2D+T autoregressive
framework for video texture completion, in: Proceedings of International
Conference on Image Processing (ICIP), Paris, France, 2014, pp. 1–5.
[47] P. Strandmark, Inpainting with coherency sensitive hashing (2011).
URL http://strandmark.wordpress.com/2011/12/12/inpainting-with-coherency-sensitive-hashing/
[48] P. Ndjiki-Nya, M. Köppel, D. Doshkov, H. Lakshman, P. Merkle, K. Müller,
T. Wiegand, Depth image-based rendering with advanced texture synthesis
for 3-d video, IEEE Transactions on Multimedia 13 (3) (2011) 453–465.
[49] A. Khandelia, S. Gorecha, B. Lall, S. Chaudhury, M. Mathur, Parametric
video compression scheme using ar based texture synthesis, in: Proceedings
of Sixth Indian Conference on Computer Vision, Graphics Image Processing
(ICVGIP), 2008, pp. 219–225.