Why Parameter Control Mechanisms Should Be
Benchmarked Against Random Variation
Giorgos Karafotias, Mark Hoogendoorn, and A.E. Eiben
Computer Science Department, VU University Amsterdam
Email: [email protected]
Abstract—Parameter control mechanisms in evolutionary algorithms (EAs) dynamically change the values of the EA parameters during a run. Research over the last two decades
has delivered ample examples where an EA using a parameter
control mechanism outperforms its static version with fixed
parameter values. However, very few studies have investigated why such parameter control approaches perform better. In principle, it could be that merely using different parameter values is already sufficient and that EA performance can be improved without sophisticated control strategies, which raises a methodological issue in how parameter control mechanisms are evaluated. This
paper investigates whether very simple random variation in
parameter values during an evolutionary run can already provide
improvements over static values. Results suggest that random
variation of parameters should be included in the benchmarks
when evaluating a new parameter control mechanism.
I. INTRODUCTION
When setting up an evolutionary algorithm (EA) one aspect
that needs to be addressed is defining appropriate values for
the various parameters of the algorithm. If inappropriate values are chosen, the performance of the EA can be severely degraded. The question of whether a certain parameter value is appropriate is far from trivial, as different phases of an evolutionary run may require different values. In fact, there
are two principal options to set such values [4]: (1) trying to
find fixed parameter values that seem to work well across the
entire evolutionary run (parameter tuning), and (2) finding a
suitable control strategy to adjust the parameter values during a
run (parameter control). Furthermore, we can distinguish three
forms of parameter control: (a) deterministic parameter control,
which uses a fixed control scheme without using any input
from the state of the process; (b) adaptive parameter control,
utilizing information from the state of the process to determine
good parameter values; and (c) self-adaptive parameter control, whereby the parameter values are part of the evolutionary process itself. In the
literature a variety of evolutionary algorithms equipped with
sophisticated parameter control strategies have been shown
to outperform their static counterparts (see e.g. [12], [3] and
[18]), and many have acknowledged that dynamically adjusting
parameter values is a very good idea (see e.g. [14]).
In the majority of work presenting parameter control
mechanisms, the value of the controller is assessed only by
comparing its performance to the static version of the EA
that keeps parameter values fixed. The motivation of this
paper is based on the idea that such performance benefits
observed when using parameter control mechanisms over
using static parameter values might in fact be a result of simply varying the parameters, and not of the intelligent
strategy itself. Some authors have hinted in this direction (see the next section), but none have performed a rigorous analysis. If it is possible that variation
on its own (without some intelligent strategy) might improve
performance, a methodological issue is raised: when evaluating
a parameter control mechanism the actual contribution of the
intelligent strategy to the performance gain should not be taken
for granted but should be explicitly assessed by also including
‘naive variation’ in the set of benchmarks used.
The goal of this paper is to investigate whether (non-intelligent) ‘variation’ alone might indeed improve EA performance compared to keeping parameter values fixed. To
this end, we implement a few simple random methods to vary
parameter values during the run of an EA and investigate their
impact on a set of standard test problems. In particular, we use
a uniform distribution and a Gaussian distribution and compare
the resulting EAs with an EA whose parameter values are
fixed (by a powerful tuning algorithm) and with an EA whose
parameters change according to a sine-wave-based schedule (enabling
increasing and decreasing the values).
This paper is organized as follows. Section II explains the
motivation in more detail and provides some related work.
Thereafter the experimental setup is discussed in Section III
whereas the results are presented in Section IV. Finally,
Section V concludes the paper and presents avenues for future
work.
II. MOTIVATION AND RELATED WORK
It is widely accepted in EC that parameter control is preferable to static parameter values because different parameter
values are needed at different stages of the evolutionary process
(e.g. switching from global to local search). Additionally,
information about the fitness landscape that is accumulated
during the search can be used to improve parameter values in
the later phases of the process [2]. Several parameter control
methods for evolutionary algorithms have been suggested,
some literature reviews can be found in [4], [16], [7] and [13].
In many of the studies introducing these parameter control
strategies, performance comparisons between the EA using the
control mechanism and the equivalent EA with static parameter
values are presented as a proof of the controller’s value.
However, usually no further investigation is carried out into how exactly the parameters are varied, or into whether the performance gain is a result of the specific control strategy or of the mere fact that parameters change during the run, i.e. whether just adding some variation in the parameter values already brings added value.
The idea that simply changing the values of a parameter,
regardless of how that change is done, can result in better
performance has been hinted at in some previous work. In [17],
Spears experiments with self-adaptive operator selection for
genetic algorithms. Results show that the GA with random operator selection performs similarly to the self-adaptive GA, meaning that it is the mere availability of multiple operators that improves performance, not self-adaptation.
Randomized values are purposefully used in [5] to set the
parameters of different islands for a distributed EA with the
rationale that, at each moment during the search process,
there will be at least one parameter configuration that will
be favorable for further advance.
In this paper we attempt to answer the question of whether
a random variation of parameter values by itself (with no
intelligence, purpose or strategy) can have a positive effect
on the performance of an evolutionary algorithm as compared
to keeping parameter values fixed. Though theoretical studies
on optimal parameter values or ideal control strategies do
exist (see e.g. [6], [9], [8] and [11]), we believe that such
a theoretical approach here would either be impossible or would greatly oversimplify the problem. For this reason we prefer an experimental approach, as described in the following section.
III. EXPERIMENTAL SETUP
As was explained in the previous sections, the purpose of
the experiments presented here is to determine if the mere
variation of parameter values (with no particular method or
strategy) can result in performance gain for the evolutionary
algorithm when compared to keeping parameters fixed. In
order to assess the effect of parameter variation in isolation from the effect of an “intelligent” control method or strategy, we
use the most naive parameter variation approach possible, i.e.
random variation of parameter values. Keeping all other factors
identical, we compare the performance of an evolutionary
algorithm when its parameter values are kept fixed during the
whole search and when its parameter values vary according
to some random distribution. To contrast the random variation with a non-random (but certainly not sophisticated) variation approach, an additional way of varying the parameter values is used, namely a sine-based function, which produces sequences of increasing and decreasing values.
Before describing the experimental setup, two important
points must be emphasized here. First, we are not trying to
establish as a general truth that parameter variation will by
itself lead to better performance, but rather to determine whether it is possible to observe better performance as a result
of only the availability or application of multiple parameter
values regardless of any control strategy. Second, we do not
propose random variation as a parameter control method. The
performance comparison between search with static parameter
values and with randomly varying parameters aims only at exploring the effect of parameter variation, not at designating a winner.
A. The evolutionary algorithm and the test functions
As an evolutionary algorithm we use a (µ + λ) Evolution
Strategy with n-point crossover, Gaussian mutation and tournament selection for both parent and survivor selection. The
parameters used in these experiments are the following six:

• population size µ
• generation gap g (the ratio of the number of offspring to the population size)
• number of crossover points n
• mutation step size σ
• parent selection tournament size kp
• survivor selection tournament size ks
For test problems we use a set of seven standard continuous
optimization test functions (see Table I). All functions are to
be minimized in 10 dimensions. All are multimodal except
one (f3 Rosenbrock). A single EA run is 10000 function
evaluations.
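To make the algorithm concrete, the following is a minimal Python sketch of one generation of such a (µ + λ) ES. It is our illustration, not the authors' code: it assumes real-valued individuals, applies n-point crossover to the coordinate vectors, and reads the generation gap as λ = round(g · µ).

```python
import random

def tournament(pool, fitness, k):
    # Best of k individuals sampled without replacement (minimization);
    # assumes k <= len(pool). Fitness values would be cached in practice.
    return min(random.sample(pool, k), key=fitness)

def n_point_crossover(a, b, n):
    # Cut both parents at n sorted random points and alternate the segments.
    cuts = sorted(random.sample(range(1, len(a)), n)) + [len(a)]
    child, take_a, prev = [], True, 0
    for cut in cuts:
        child += (a if take_a else b)[prev:cut]
        take_a, prev = not take_a, cut
    return child

def mutate(x, sigma):
    # Per-coordinate Gaussian mutation with step size sigma.
    return [xi + random.gauss(0.0, sigma) for xi in x]

def one_generation(pop, fitness, g, n, sigma, kp, ks):
    mu = len(pop)
    lam = max(1, round(g * mu))        # generation gap g = lambda / mu
    offspring = [mutate(n_point_crossover(tournament(pop, fitness, kp),
                                          tournament(pop, fitness, kp), n),
                        sigma)
                 for _ in range(lam)]
    merged = pop + offspring           # (mu + lambda): parents compete too
    return [tournament(merged, fitness, ks) for _ in range(mu)]
```

A run repeats one_generation on an initial population such as [[random.uniform(-5, 5) for _ in range(10)] for _ in range(mu)] until the budget of 10000 function evaluations is spent.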
B. Comparison approach
We use the following workflow to facilitate the desired comparison; in steps (2)-(4) experimental results are generated and comparisons are made:

1) Tune the parameter values of the ES using a dedicated parameter tuner, resulting in a set of basic parameter values.
2) Add variation to the basic parameter values found under (1), using a Gaussian and a uniform distribution with fixed variation widths.
3) Instead of using fixed variation widths as in (2), try to find the best variation widths using a parameter tuning approach, given the basic values found under (1).
4) Tune all parameter values (both basic and variation values) at the same time, and also include a non-randomized parameter value generator which can express basic sequences of increasing and/or decreasing values.
Each of these steps is discussed more elaborately below.
1) Tuning the ES: As a first step, the ES is tuned for
every test function separately (all six parameters are tuned
concurrently with one tuning process per problem). For tuning
we use Bonesa [15], a state-of-the-art method for tuning real-valued parameters. This step results in seven parameter vectors $\vec{p}_i$, one for each problem $f_i$, $i = 1 \ldots 7$, with good static values for each parameter. The
ranges and the results of the tuning process are shown in Table
II. A single Bonesa tuning run was given a budget of 10000
algorithm tests.
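Bonesa's actual interface and search strategy are beyond the scope of this paper; purely as an illustration of how such a budget of algorithm tests is spent, the sketch below substitutes plain random search over the ranges of Table II. This is our stand-in, not Bonesa itself, which is a considerably more sophisticated tuner.

```python
import random

# Tuning ranges from Table II; True marks integer-valued parameters.
RANGES = {"mu": (1, 200, True), "g": (0.0, 15.0, False),
          "sigma": (0.0, 2.0, False), "n": (1, 9, True),
          "kp": (1, 200, True), "ks": (1, 200, True)}

def random_vector():
    return {p: (random.randint(lo, hi) if integer else random.uniform(lo, hi))
            for p, (lo, hi, integer) in RANGES.items()}

def tune(run_ea, budget=10000, repeats=5):
    # run_ea(params) -> best fitness found in one EA run (lower is better).
    best, best_score = None, float("inf")
    while budget >= repeats:
        params = random_vector()
        score = sum(run_ea(params) for _ in range(repeats)) / repeats
        budget -= repeats              # each EA run is one "algorithm test"
        if score < best_score:
            best, best_score = params, score
    return best
```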
TABLE I. TEST FUNCTIONS

f1 (Ackley): $f(\vec{x}) = -20 \cdot \exp\left(-0.2 \cdot \sqrt{\tfrac{1}{D} \sum_{i=1}^{D} x_i^2}\right) - \exp\left(\tfrac{1}{D} \sum_{i=1}^{D} \cos(2\pi x_i)\right) + 20 + e$

f2 (Rastrigin): $f(\vec{x}) = A \cdot D + \sum_{i=1}^{D} \left(x_i^2 - A \cdot \cos(2\pi x_i)\right)$

f3 (Rosenbrock): $f(\vec{x}) = \sum_{i=1}^{D-1} \left(100 \cdot (x_i^2 - x_{i+1})^2 + (1 - x_i)^2\right)$

f4 (Schaffer): $f(\vec{x}) = \sum_{i=1}^{D-1} \left((x_i^2 + x_{i+1}^2)^{0.25} \cdot \left((\sin 50(x_i^2 + x_{i+1}^2)^{0.1})^2 + 1\right)\right)$

f5 (Bohachevsky): $f(\vec{x}) = \sum_{i=1}^{D-1} \left(x_i^2 + 2x_{i+1}^2 - 0.3 \cdot \cos 3\pi x_i - 0.4 \cdot \cos 4\pi x_{i+1} + 0.7\right)$

f6 (Griewangk): $f(\vec{x}) = \sum_{i=1}^{D} \frac{x_i^2}{4000} - \prod_{i=1}^{D} \cos\frac{x_i}{\sqrt{i}} + 1$

f7 (Shekel): $f(\vec{x}) = -\sum_{i=1}^{m} \frac{1}{c_i + \sum_{j=1}^{D} (x_j - \alpha_{ij})^2}$
TABLE II. PARAMETERS AND CORRESPONDING TUNING RANGES AND TUNED VALUES FOR EACH PROBLEM.

Parameter   Range       f1      f2      f3      f4      f5      f6      f7
µ           [1, 200]    199     147     190     150     5       186     195
g           [0, 15]     0.137   4.062   0       1.826   0.001   0.327   0.114
σ           [0, 2]      1.053   0.008   0.09    1.736   1.991   1.808   0.037
N           [1, 9]      6       6       6       5       6       7       6
kp          [1, 200]    21      2       32      3       4       2       121
ks          [1, 200]    163     16      3       9       196     6       127
2) Experiment 1: Adding variation around tuned values:
In order to determine the effect of variation we use the tuned vectors $\vec{p}_i$ as a starting point and add some random variation.
Specifically, at each generation, parameter values are drawn
from a distribution (Gaussian and uniform distributions are
tested). For each parameter, a separate distribution is used and
the “centers” of these distributions are set to the tuned values:
• Gaussian: for problem i, values for parameter j are drawn from a normal distribution $N(\vec{p}_i(j), d \cdot \vec{p}_i(j))$
• uniform: for problem i, values for parameter j are drawn uniformly from the interval $[\vec{p}_i(j) - \frac{w}{2}, \vec{p}_i(j) + \frac{w}{2}]$, where $w = d \cdot \vec{p}_i(j)$
Several width coefficients d are tried, d = 0.01, 0.02, 0.03,
0.05, 0.1, 0.15, 0.2, 0.3, 0.5, 0.8. Separate runs are made with
each parameter varied alone and all parameters varied together.
For every setting (i.e. combination of parameter, distribution
and d), the ES is run 30 times to derive statistically reliable
results.
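As a sketch of this resampling step (our reading: at the start of each generation all controlled parameters are redrawn; the paper does not say how out-of-range or fractional draws for integer parameters are handled, so the rounding and clipping below are our assumptions):

```python
import random

INTEGER_PARAMS = {"mu", "n", "kp", "ks"}   # integer-valued parameters of the ES

def resample(p_tuned, d, dist="gaussian"):
    # Draw each parameter around its tuned value p_tuned[j]:
    # gaussian: N(p, d*p);  uniform: U(p - d*p/2, p + d*p/2).
    out = {}
    for name, p in p_tuned.items():
        if dist == "gaussian":
            v = random.gauss(p, d * p)
        else:
            v = random.uniform(p - d * p / 2, p + d * p / 2)
        # Rounding/clipping for integer parameters is our assumption.
        out[name] = max(1, round(v)) if name in INTEGER_PARAMS else max(0.0, v)
    return out

# Example: the tuned vector for f1 (Table II), redrawn with width d = 0.1.
p1 = {"mu": 199, "g": 0.137, "sigma": 1.053, "n": 6, "kp": 21, "ks": 163}
params = resample(p1, d=0.1)
```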
3) Experiment 2: “Tuning” the range of the variation:
The above process attempts to determine whether adding some
variance around the parameter values can result in improved performance; however, only a small number of hand-picked
ranges (defined by the values of d) are tested. For a more
thorough and rigorous test, the rationale of experiment 1
is maintained but we use Bonesa as a search algorithm to
find good values for the standard deviation of the Gaussian
distribution. Thus now, for problem i, values for parameter j
are drawn from a normal distribution N (~
pi (j), σij ) with every
j
σi being derived through a search process by Bonesa (one
tuning process per problem was executed that concurrently
tuned the deviations of all six parameters). If the tuning process
of Bonesa for a σij converges to a non-zero value, that would
indicate that some random variation is indeed beneficial. A
much longer (25000 algorithm tests) tuning process was used
for this experiment to increase the reliability of the results.
Due to time limitations, this experiment was performed only
for function f1 .
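In code, the only change with respect to experiment 1 is that each parameter's deviation is now a free, independently tuned value rather than d times the tuned mean; a deviation tuned to zero recovers the static EA. A minimal sketch (the names are ours):

```python
import random

def resample_tuned_sigma(p_tuned, sigmas):
    # One tuned standard deviation per parameter; sigmas[name] == 0.0
    # makes that parameter effectively static.
    return {name: random.gauss(p, sigmas[name]) for name, p in p_tuned.items()}
```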
4) Experiment 3: “Tuning” all the settings of the variation:
As a final test we make a fair comparison between the
performance of the ES using static parameter values and
its performance using varying values. Since the static values were derived through a tuning process, to make the comparison fair the settings that determine the varying values must also be calibrated on equal terms. Thus, an identical
tuning process (using Bonesa with the same budget of 10000
algorithm tests) is used. Here, in addition to the normal and uniform random distributions employed previously, we also use an approach based on a sine wave, which is able to generate
sequences of increasing and/or decreasing parameter values.
For each variation mechanism, the following settings are tuned:
• Gaussian: for each problem i, a tuning process calibrates for each parameter j the mean $m_i^j$ and standard deviation $\sigma_i^j$ of the normal distribution. Thus, each tuning process tunes 12 values.
• uniform: for each problem i, a tuning process calibrates for each parameter j the minimum $l_i^j$ and the width $w_i^j$ of the range from which values are drawn. Thus, each tuning process tunes 12 values.
• sine: for each problem i, a tuning process calibrates for each parameter j the amplitude $A_i^j$, frequency $f_i^j$, angular frequency $\omega_i^j$ and phase $\phi_i^j$ that define a sine wave used as a deterministic schedule (one possible reading is sketched below). Thus, each tuning process tunes 24 values.
After the tuning is complete, for each problem and variation
setting combination, the ES is run 30 times to derive statistically reliable results.
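The exact functional form behind the four tuned sine settings is not spelled out (listing both a frequency and an angular frequency is redundant for a pure sine, so we read one of them as a vertical offset that keeps the wave inside the tuning range); the sketch below is therefore one plausible reading, with clipping to the parameter range added as our assumption.

```python
import math

def sine_schedule(t, amplitude, offset, omega, phase, lo, hi):
    # Deterministic schedule v(t) = offset + A * sin(omega * t + phase),
    # clipped to the tuning range [lo, hi] (the clipping is our assumption).
    v = offset + amplitude * math.sin(omega * t + phase)
    return min(hi, max(lo, v))

# Example: mutation step size oscillating per generation within its range [0, 2].
sigma_t = [sine_schedule(t, 0.5, 1.0, 0.8, 0.0, 0.0, 2.0) for t in range(100)]
```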
IV. RESULTS AND ANALYSIS
The results of experiment 1 are presented in Table III. The
table shows the performance of the three ES variants that have
been run in this first experiment: with tuned static parameter
values, with values drawn from a Gaussian distribution (for
various values of d defining the standard deviation) and with
values drawn from a uniform distribution (for various values
of d defining the width). Emphasized numbers indicate a
significant improvement over fixed parameters.
These results suggest that merely varying the value of a parameter around its static value, without any strategy or purpose, may lead to better performance. For 4 out
of 7 problems and for 9 out of 49 combinations of problem
and parameter, there exists some kind of variation that may
significantly improve performance. The Gaussian distribution appears more often, perhaps indicating that mild noise is preferable; however, there are also cases where drawing values from a uniform distribution is beneficial compared to
keeping parameters fixed. An important observation is that, in
most cases where changing parameter values can be beneficial,
performance improves as the range of the change becomes
wider, with the best results achieved when the range is 80%
of the center value. Figure 2 shows some examples where
performance improves with variation and how this performance
gain is influenced by the variation width d.
The parameter that most often responds positively is the mutation step size σ, but there are also cases where
varying the population size and number of crossover points
may result in improvement. Finally, for function f2 varying all
parameters significantly increases performance while varying
each parameter independently does not.
The results of experiment 2 are shown in Table IV. The best
three vectors resulting from the tuning process are presented,
each vector defining the standard deviations of the Gaussian
distributions from which parameter values are drawn (the
results only concern f1 ). For all parameters, except g, the
tuning process converged to deviation values far from zero,
indicating that the existence of variation (non-zero deviation)
was preferred by the tuning process. Using the best vector of deviations (and the tuned vector $\vec{p}_i$ for the mean values), the ES
was run 30 times with Gaussian variation of all parameters. A
comparison with keeping the parameters static (at the tuned values $\vec{p}_i$) is shown in Figure 1. Also, the two best (for this problem)
cases of experiment 1 are considered, namely the case with
variation of all parameters with a width of d = 0.3 and varying
only σ with a width of d = 0.8. The tuned deviations of this
experiment produce better results than static and than varying
all parameter values with d = 0.3 (experiment 1). However,
they are not better than varying only σ. This may be due to one of the parameters other than σ having a very detrimental effect when varied (for this problem and EA), while the tuning process was not able to set that parameter's deviation to zero.
The results of experiment 3 are presented in Table V,
showing the performance of the ES using static parameter
values, and completely tuned variations (Gaussian and uniform
distributions as well as a completely tuned sine function). Underlined values denote the best average performance and bold
values indicate performance not significantly worse than the
best.

Fig. 1. Experiment 2: A comparison of the performance when keeping all parameters fixed to the tuned values (ST), when varying all parameters with a Gaussian distribution with a tuned standard deviation (TD), when varying all parameters with a Gaussian distribution with d = 0.3 (VA) and when varying only σ with a Gaussian distribution with d = 0.8. The function used is f1 (Ackley), which is a minimization problem.

TABLE IV. RESULTS OF EXPERIMENT 2 SHOWING THE BEST THREE PARAMETER VECTORS FOUND BY BONESA. FOR EACH PARAMETER THE TUNED DEVIATIONS ARE PRESENTED.

Parameter   Vector 1   Vector 2   Vector 3
µ           17.923     19.616     25.785
g           0.025      0.06       0.03
σ           0.862      1.049      0.897
N           2.692      1.064      2.771
kp          25.464     21.966     25.662
ks          39.978     22.644     38.916

We can again see that varying the parameter values can
result in better performance in some cases. However, tuning the
settings of the random variations (Gaussian and uniform) did
not produce any improvement compared to the results acquired
by experiment 1 with handpicked d values (see Table III). For
function f5 the performance of the tuned Gaussian is much
worse than the performance acquired simply by setting the
deviation of all parameters to $0.8 \cdot \vec{p}_i(j)$. Furthermore, for
functions f2 and f4 , while experiment 1 showed improvement
when varying all parameters, here we see worse performance.
It might be that the task of the tuning process is too tough
when tuning the settings of the variation mechanisms due to
the number of values tuned: for the Gaussian and uniform
distributions there are double the settings compared to tuning
static values (two settings per parameter) while for the sine
wave the factor is four. Consequently, though the same tuning
effort was spent for static values and variation mechanisms,
the outcomes are unbalanced.
The tuned sine wave variation performs best on problem f7; the corresponding parameter variation is shown in Figure 3. Except for µ, the variation of all other parameters is just a very fast oscillation within a certain range, showing that tuning resulted in a process resembling random variation rather than a more “meaningful” schedule that could be expressed with a sine wave.
TABLE III. RESULTS OF EXPERIMENT 1. THE LEFT PART IS FOR THE GAUSSIAN DISTRIBUTION AND THE RIGHT FOR THE UNIFORM. THERE IS A SUBTABLE FOR EVERY FUNCTION AND DISTRIBUTION COMBINATION. FOR EACH SUBTABLE, EACH LINE DENOTES WHICH PARAMETER IS VARIED. THE FIRST COLUMN OF EACH SUBTABLE SHOWS THE PERFORMANCE WHEN THE PARAMETER IS KEPT STATIC AND THE SUBSEQUENT COLUMNS SHOW THE PERFORMANCE WHEN THE PARAMETER IS VARIED WITH THE CORRESPONDING VALUE OF d. ALL NUMBERS ARE AVERAGES OVER 30 RUNS. EMPHASIZED VALUES SHOW PERFORMANCE THAT IS SIGNIFICANTLY BETTER THAN STATIC (WITH 0.95 CONFIDENCE). ALL FUNCTIONS ARE TO BE MINIMIZED.
[Table III body: for each function f1-f7 and each distribution, rows µ, g, σ, N, kp, ks and "all" give the average performance at the static setting (St) and at widths d = 0.01, 0.02, 0.03, 0.05, 0.10, 0.15, 0.20, 0.30, 0.50, 0.80; the static values per function equal the "static" row of Table V.]
Fig. 2. Four cases from the results of experiment 1: (a) Ackley, σ, Gaussian; (b) Rastrigin, all, Gaussian; (c) Rastrigin, N, uniform; (d) Bohachevsky, σ, uniform. Each subgraph shows the performance when varying a parameter (or all) according to a random distribution. The x-axis is the width d of the distribution. The horizontal dashed line shows the performance when keeping the parameter values static at the tuned values. The caption of each subgraph lists the test function, which parameter is varied and the type of the random distribution. Lower values are better for all functions.
Fig. 3. The parameter values over time when using the sine wave with the tuned settings from experiment 3 with f7: (a) µ, (b) g, (c) σ, (d) N, (e) kp, (f) ks. Each subgraph shows the values of one parameter over the generations.
TABLE V. RESULTS OF EXPERIMENT 3. FOR EACH PROBLEM, THE PERFORMANCES USING THE VARIATION METHODS WITH TUNED SETTINGS AND THE PERFORMANCE KEEPING THE VALUES FIXED TO THE TUNED VALUES ARE SHOWN. LOWER VALUES ARE BETTER FOR ALL FUNCTIONS. FOR EACH COLUMN, UNDERLINED VALUES DENOTE THE BEST AND BOLD VALUES DENOTE PERFORMANCE NOT SIGNIFICANTLY WORSE THAN THE BEST.

            f1     f2     f3     f4     f5     f6     f7
gaussian    1.74   14.57  9.58   15.88  14.28  6.18   -1.26
uniform     3.08   12.17  13.28  18.34  21.76  3.54   -2.24
sine        3.21   48.27  38.76  14.13  15.40  17.14  -3.08
static      2.67   7.32   7.56   11.51  16.16  0.87   -1.97
V. CONCLUSIONS AND FUTURE WORK
In this paper we have investigated the effect of randomly
changing the values of an evolutionary algorithm’s parameters.
To be specific, we put forward the hypothesis that random
variation, without intelligence or strategy, can improve EA
performance, compared to keeping parameters fixed, simply by
making multiple parameter values available to the evolutionary
process. To test this hypothesis we performed three separate
experiments where the effect of randomly varying the parameter values was examined. All three experiments showed that it
is indeed possible to significantly improve the performance of
an evolutionary algorithm by randomly changing its parameter
values.
The results of this paper raise an important issue in
methodology. It is common practice in the literature that presents
parameter control mechanisms to evaluate the controller by
performing a comparison to the equivalent EA with static
parameter values. However, as the results of this paper show, observing an improvement in such a comparison does not necessarily show that the controller is good, since it remains unclear whether the observed improvement is a consequence of the intelligent control strategy itself or merely of the variation
of the values. We believe that a complete evaluation of a
control mechanism should also include an analysis of how
the parameters are varied during a run and we suggest that a
“naive” variation scheme for the same parameters should be
included in the baseline benchmarks.
Future work will focus on making a comparison between sophisticated parameter control approaches and the
non-sophisticated random variation approach presented in the
experimental part of this paper to investigate the differences
between the two in terms of performance.
REFERENCES

[1] J. Costa, R. Tavares, and A. Rosa. An experimental study on dynamic random variation of population size. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, volume 1, pages 607-612, 1999.
[2] K. A. De Jong. Parameter setting in EAs: a 30 year perspective. In Lobo et al. [10], pages 1-18.
[3] A. Eiben, M. Horvath, W. Kowalczyk, and M. Schut. Reinforcement learning for online control of evolutionary algorithms. In Brueckner, Hassas, Jelasity, and Yamins, editors, Proceedings of the 4th International Workshop on Engineering Self-Organizing Applications (ESOA'06), volume 4335, pages 151-160. Springer, 2006.
[4] A. Eiben, Z. Michalewicz, M. Schoenauer, and J. Smith. Parameter control in evolutionary algorithms. In Lobo et al. [10], pages 19-46.
[5] Y. Gong and A. Fukunaga. Distributed island-model genetic algorithms using heterogeneous parameter settings. In IEEE Congress on Evolutionary Computation, pages 820-827, 2011.
[6] J. Hesser and R. Männer. Towards an optimal mutation probability for genetic algorithms. In H.-P. Schwefel and R. Männer, editors, Proceedings of the 1st Conference on Parallel Problem Solving from Nature, number 496 in Lecture Notes in Computer Science, pages 23-32. Springer, Berlin, Heidelberg, New York, 1991.
[7] R. Hinterding, Z. Michalewicz, and A. Eiben. Adaptation in evolutionary computation: a survey. In Proceedings of the IEEE International Conference on Evolutionary Computation, pages 65-69, 1997.
[8] T. Jansen and K. A. De Jong. An analysis of the role of offspring population size in EAs. In W. Langdon et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002), pages 238-246. Morgan Kaufmann, San Francisco, 9-13 July 2002.
[9] J. L. J. Laredo, C. Fernandes, J. J. Merelo, and C. Gagné. Improving genetic algorithms performance via deterministic population shrinkage. In F. Rothlauf, editor, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2009), pages 819-826. ACM, 2009.
[10] F. Lobo, C. Lima, and Z. Michalewicz, editors. Parameter Setting in Evolutionary Algorithms. Springer, 2007.
[11] F. G. Lobo. Idealized dynamic population sizing for uniformly scaled problems. In N. Krasnogor et al., editors, GECCO '11: Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pages 917-924, Dublin, Ireland, 12-16 July 2011. ACM.
[12] B. McGinley, J. Maher, C. O'Riordan, and F. Morgan. Maintaining healthy population diversity using adaptive crossover, mutation, and selection. IEEE Transactions on Evolutionary Computation, 15(5):692-714, 2011.
[13] S. Meyer-Nieberg and H.-G. Beyer. Self-adaptation in evolutionary algorithms. In Lobo et al. [10], pages 47-76.
[14] Z. Michalewicz and M. Schmidt. Parameter control in practice. In Lobo et al. [10], pages 277-294.
[15] S. Smit and A. E. Eiben. Multi-problem parameter tuning using Bonesa. In J. Hao, P. Legrand, P. Collet, N. Monmarché, E. Lutton, and M. Schoenauer, editors, Artificial Evolution, pages 222-233, 2011.
[16] J. E. Smith and T. C. Fogarty. Operator and parameter adaptation in genetic algorithms. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 1:81-87, 1997.
[17] W. M. Spears. Adapting crossover in evolutionary algorithms. In Proceedings of the Fourth Annual Conference on Evolutionary Programming, pages 367-384. MIT Press, 1995.
[18] Y.-Y. Wong, K.-H. Lee, K.-S. Leung, and C.-W. Ho. A novel approach in parameter adaptation and diversity maintenance for genetic algorithms. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 7:506-515, 2003.