
BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI
Publicat de
Universitatea Tehnică „Gheorghe Asachi” din Iaşi
Tomul LVII (LXI), Fasc. 2, 2011
Secţia
AUTOMATICĂ şi CALCULATOARE
DETERMINING THE BEST MUTATION PROBABILITIES OF A
GENETIC ALGORITHM FOR MAPPING TASKS
BY
ADRIAN ALEXANDRESCU ∗, MITICĂ CRAUS and
IOAN AGAVRILOAEI
“Gheorghe Asachi” Technical University of Iaşi,
Faculty of Automatic Control and Computer Engineering
Received: March 9, 2011
Accepted for publication: June 7, 2011
Abstract. An important aspect of heterogeneous computing systems is the
problem of efficiently mapping tasks to processors. There are various methods of
obtaining acceptable solutions to this problem but the genetic algorithm is
considered to be among the best heuristics for assigning independent tasks to
processors.
This paper focuses on how the genetic heuristic can be improved by
determining the best probabilities for a three-step mutation operator. By
computing the probabilities for selecting a mutation combination we concluded
that the most favoured combinations are the ones which select a task from the
processor with the biggest total execution time and then move the selected task
to the processor which executes it the fastest. Also, the probability of applying
the special mutation operator on a chromosome must be much greater than the
probability of applying the crossover operator.
Key words: task mapping, genetic algorithm, mutation.
2000 Mathematics Subject Classification: 68M14, 68M20, 68T20.
∗ Corresponding author; e-mail: [email protected]
1. Introduction
Mapping tasks to processors is a very important problem in
heterogeneous computing systems and it often focuses on determining which
tasks will execute on what processors without necessarily considering the order
in which the tasks are assigned.
The mapping algorithm uses estimates of the execution time of each
task on each processor; these estimates are kept in an Expected Time to
Compute (ETC) matrix. If the processors cannot be ordered so that one
processor executes all the tasks faster than another processor, then the ETC
matrix is considered to be inconsistent (Braun et al., 2001). The goal of the
mapping heuristics is to minimize the makespan, defined as the time it takes for
all the processors to execute all the assigned tasks (Pinedo, 2006; Ucar et al., 2006).
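As a minimal sketch of the makespan definition above, the following Python fragment computes it from an ETC matrix and a task-to-processor mapping. The names `etc`, `mapping` and `makespan` and the sample values are illustrative assumptions, not taken from the GA3SM implementation.

```python
from typing import List, Sequence


def makespan(etc: Sequence[Sequence[float]], mapping: Sequence[int], n_procs: int) -> float:
    """Return the largest total execution time over all processors.

    etc[t][p] is the estimated time of task t on processor p (assumed layout);
    mapping[t] is the index of the processor that task t is assigned to.
    """
    loads: List[float] = [0.0] * n_procs
    for task, proc in enumerate(mapping):
        loads[proc] += etc[task][proc]
    return max(loads)


# Hypothetical example: 3 tasks, 2 processors.
etc = [[3.0, 5.0],
       [2.0, 1.0],
       [4.0, 6.0]]
mapping = [0, 1, 0]  # tasks 0 and 2 on processor 0, task 1 on processor 1
print(makespan(etc, mapping, 2))  # 7.0
```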
There are heuristics such as Min-Min, Min-Max or Simulated
Annealing which provide fast results, but the Genetic Algorithm offers a much
better solution with little time cost (Alexandrescu & Craus, 2010; Dey &
Majumder, 2002; Omara & Arafa, 2009; Page et al., 2010).
The Genetic Algorithm with 3-Step Mutation (GA3SM) described in
(Alexandrescu et al., 2011) is a task mapping heuristic which uses a 3-step mutation
to find a better solution. Our goal is to further improve the aforementioned
algorithm by finding the optimal probabilities for the mutation methods.
The GA3SM algorithm uses the following mutation steps:
1. Select a processor:
a) Randomly,
b) With the biggest total execution time.
2. Select a task from the processor chosen at step 1:
a) Randomly,
b) With the longest execution time on the selected processor,
c) With the shortest execution time on the selected processor,
d) With the biggest difference between the execution time on the
selected processor and the execution time on the fastest processor.
3. Move the selected task to another processor, chosen:
a) Randomly,
b) With the lowest sum of execution times,
c) With the fastest execution time for the selected task.
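The three steps listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `three_step_mutation`, the `probs` dictionary layout, the restriction of step 1 to processors that hold at least one task, and the exclusion of the source processor in step 3 are choices made for this sketch and are not confirmed details of GA3SM. It also assumes at least two processors.

```python
import random


def three_step_mutation(etc, mapping, n_procs, probs, rng=random):
    """Return a mutated copy of `mapping` (illustrative sketch, not the GA3SM code).

    etc[t][p] is the estimated time of task t on processor p (assumed layout).
    probs holds one weight list per step, in the order of the methods above,
    e.g. {"step1": [0.5, 0.5], "step2": [0.25] * 4, "step3": [1/3] * 3}.
    """
    loads = [0.0] * n_procs
    for task, proc in enumerate(mapping):
        loads[proc] += etc[task][proc]

    # Step 1: select a processor (assumption: only processors that hold tasks).
    used = sorted(set(mapping))
    method = rng.choices(["1a", "1b"], weights=probs["step1"])[0]
    if method == "1a":
        src = rng.choice(used)
    else:  # 1b: biggest total execution time
        src = max(used, key=lambda p: loads[p])

    # Step 2: select a task from that processor.
    tasks = [t for t, p in enumerate(mapping) if p == src]
    method = rng.choices(["2a", "2b", "2c", "2d"], weights=probs["step2"])[0]
    if method == "2a":
        task = rng.choice(tasks)
    elif method == "2b":  # longest execution time on the selected processor
        task = max(tasks, key=lambda t: etc[t][src])
    elif method == "2c":  # shortest execution time on the selected processor
        task = min(tasks, key=lambda t: etc[t][src])
    else:  # 2d: biggest gap to the time on the fastest processor
        task = max(tasks, key=lambda t: etc[t][src] - min(etc[t]))

    # Step 3: move the task to another processor (assumption: never back to src).
    others = [p for p in range(n_procs) if p != src]
    method = rng.choices(["3a", "3b", "3c"], weights=probs["step3"])[0]
    if method == "3a":
        dst = rng.choice(others)
    elif method == "3b":  # lowest sum of execution times
        dst = min(others, key=lambda p: loads[p])
    else:  # 3c: fastest execution time for the selected task
        dst = min(others, key=lambda p: etc[task][p])

    child = list(mapping)
    child[task] = dst
    return child
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is convenient when comparing probability settings.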
2. A Genetic Algorithm for Optimizing the Mutation Operator
2.1. General Considerations
In order to improve the mutation operator, we implemented a genetic
algorithm to determine the best probabilities of selecting a mutation method for
each step.
Our Optimization Genetic Algorithm (OGA) uses an initial population
of 50 chromosomes, each representing a candidate solution. A chromosome is
made of genes which occupy a position in the chromosome called locus. Each
gene represents a trait and the possible values for a trait are called alleles.
In our case, a gene represents the probability of selecting a mutation
method and the alleles range from 0.00 to 1.00. An example of this encoding is
shown in Table 1. The two methods from the first mutation step (1a and 1b)
have a probability of 0.72 and, respectively, 0.28 of being selected. Methods 2a,
2b, 2c and 2d have a probability of 0.32, 0.26, 0.14 and 0.28, while the methods
from the last step (3a, 3b and 3c) can be selected with 0.43, 0.51 and,
respectively, 0.06 probability.
Table 1
Example of a Chromosome Encoding
Probability | 0.72  0.28 | 0.32  0.26  0.14  0.28 | 0.43  0.51  0.06
Method      | 1a    1b   | 2a    2b    2c    2d   | 3a    3b    3c
Section     | Section 1  | Section 2              | Section 3
The chromosome has three sections with their lengths corresponding to
the number of methods from each step. The last probability from each section is
calculated as one minus the sum of the other probabilities from that section. The
sum of the probabilities from each step must be 1 and all the values are positive.
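The encoding rule described above (the last probability of a section is one minus the sum of the others) can be illustrated with a short Python sketch. The helper name `complete_section` and the nested-list representation of a chromosome are assumptions made for this example; rounding to two decimals reflects the 0.01 grid of allele values.

```python
def complete_section(free_values):
    """Append the last probability as one minus the sum of the others.

    Values are assumed to lie on a 0.01 grid, so the result is rounded
    to two decimals to hide floating-point noise.
    """
    last = round(1.0 - sum(free_values), 2)
    if last < 0 or any(v < 0 for v in free_values):
        raise ValueError("probabilities in a section must be non-negative and sum to 1")
    return list(free_values) + [last]


# The chromosome from Table 1, built from its freely chosen genes only.
chromosome = [
    complete_section([0.72]),              # Section 1: 1a, 1b
    complete_section([0.32, 0.26, 0.14]),  # Section 2: 2a, 2b, 2c, 2d
    complete_section([0.43, 0.51]),        # Section 3: 3a, 3b, 3c
]
print(chromosome)
```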
Taking into account the aforementioned chromosome restrictions, we
can determine the size of the solution search space. If we consider s to be the
number of possible probability values for a method (the number of alleles for a
gene), then the number of possible combinations for Section 1 is s, because the
probability for method 1b is one minus the probability for method 1a. The number
of combinations for Section 2 is the same as for Section 3 and it is shown in Eq. (1):
$$ s + (s - 1) + (s - 2) + \dots + 2 + 1 = \frac{s(s + 1)}{2}. \tag{1} $$
Considering the number of combinations for each section, the size of
the solution search space is:
$$ s \cdot \frac{s(s + 1)}{2} \cdot \frac{s(s + 1)}{2} = \frac{s^{3}(s + 1)^{2}}{4}. \tag{2} $$
Our algorithm uses probability values in the interval from 0.0 to 1.0
with a step of 0.01. This leads to s = 101 possible alleles which, according to
Eq. (2), gives approximately 2.68·10^9 possible solutions.
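Eqs. (1) and (2) can be evaluated directly for a given number of alleles. The sketch below does this in Python; the function names are illustrative assumptions, and the brute-force count is included only as a cross-check of the closed form in Eq. (1) for a section whose last probability is determined by two free ones.

```python
def section_combinations(s: int) -> int:
    """Closed form of Eq. (1): s + (s - 1) + ... + 2 + 1."""
    return s * (s + 1) // 2


def brute_force_section_combinations(s: int) -> int:
    """Count pairs (a, b) of grid indices with a + b <= s - 1."""
    return sum(1 for a in range(s) for b in range(s - a))


def search_space_size(s: int) -> int:
    """Eq. (2): s choices for Section 1 times Eq. (1) for Sections 2 and 3."""
    return s * section_combinations(s) ** 2


def allele_count(step: float = 0.01) -> int:
    """Number of grid values between 0.0 and 1.0 inclusive."""
    return round(1.0 / step) + 1


s = allele_count(0.01)
print(s, section_combinations(s), search_space_size(s))
```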
For each iteration, the best 20% of the chromosomes from the old
population are copied to the new population (elitism). The algorithm ends once
100 iterations have been performed and it returns the best obtained probabilities
for the mutation combinations.
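A minimal sketch of the elitism step described above, assuming that a higher fitness value is better and that the population is a plain Python list; the function name `elites` and its `percent` parameter are hypothetical.

```python
def elites(population, fitnesses, percent=20):
    """Return the best `percent`% of the population, best first.

    Assumes fitnesses[i] is the fitness of population[i] and that
    larger values are better.
    """
    n_keep = len(population) * percent // 100
    ranked = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
    return [population[i] for i in ranked[:n_keep]]
```

With a population of 50 chromosomes and `percent=20`, the function returns the 10 fittest chromosomes, which would then be copied unchanged into the new population.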
2.2. Fitness Function
The most challenging part is computing the fitness function, which
evaluates the chromosomes and determines the quality of the candidate
solutions (http://www.talkorigins.org/faqs/genalg/genalg.html). In order to
compute the fitness of a chromosome, we must determine the degree of
improvement of the best candidate solution when using the GA3SM algorithm
with the probabilities resulting from OGA.
The fitness function runs the GA3SM algorithm on a predefined set of
inconsistent ETC matrices. For each chromosome, the resulting fitness value is
equal to the mean of the makespan improvements. The same set of ETC
matrices is used when computing the fitness of all the chromosomes.
Because of this, we save each computed fitness value in a map which has
the chromosomes as keys. In order to determine the fitness of a chromosome, we
first check the map to see whether it was computed before; when it was, we do
not need to run the GA3SM algorithm, which is computationally
intensive.
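The fitness computation and the map of already evaluated chromosomes can be sketched as below. The names `mean_improvement` and `CachedFitness` are assumptions for this illustration, and the `evaluate` callable stands in for a full GA3SM run over the fixed set of ETC matrices, which is not reproduced here.

```python
from typing import Callable, Dict, Sequence, Tuple

Key = Tuple[Tuple[float, ...], ...]


def mean_improvement(baseline: Sequence[float], tuned: Sequence[float]) -> float:
    """Mean makespan decrease over the ETC matrices (larger is better)."""
    return sum(b - t for b, t in zip(baseline, tuned)) / len(baseline)


class CachedFitness:
    """Wrap an expensive fitness function with a chromosome-keyed map."""

    def __init__(self, evaluate: Callable[[Key], float]):
        self._evaluate = evaluate          # stand-in for running GA3SM
        self._cache: Dict[Key, float] = {}
        self.evaluations = 0               # number of real (uncached) runs

    def __call__(self, chromosome) -> float:
        # Lists are not hashable, so the chromosome is converted to tuples.
        key = tuple(tuple(section) for section in chromosome)
        if key not in self._cache:
            self._cache[key] = self._evaluate(key)
            self.evaluations += 1
        return self._cache[key]
```

A repeated chromosome, which is common once elitism copies the best individuals forward, then costs a dictionary lookup instead of another set of GA3SM runs.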
The best fit chromosome of our OGA algorithm is the one which
provides the best mutation probabilities for obtaining the biggest makespan
decrease when running the GA3SM algorithm.
2.3. Selection and Crossover
The selection operator uses the roulette-wheel selection method, which
favours the best chromosomes, while less fit chromosomes
also have a small chance of being selected.
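A minimal sketch of roulette-wheel selection in Python follows. It assumes non-negative fitness values where larger is better; the function name `roulette_select` and the fallback to a uniform choice when all fitness values are zero are assumptions of this example.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # Assumption: with no fitness information, fall back to a uniform choice.
        return rng.choice(population)
    cumulative = list(accumulate(fitnesses))
    spin = rng.random() * total
    index = bisect_right(cumulative, spin)
    # Guard against floating-point rounding at the upper end of the wheel.
    return population[min(index, len(population) - 1)]
```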
The crossover process implies taking two chromosomes, called parent
chromosomes, and producing two offspring chromosomes
(http://www.obitko.com/tutorials/genetic-algorithms/crossover-mutation.php).
We use three types of crossover: two one-point crossover methods and one
two-point crossover method. The crossover operator switches sections of the
parent chromosomes.
In Fig. 1, offspring 1-1 is produced by taking Section 1 from parent 2
and the other two sections from parent 1, while offspring 1-2 takes the
first section from the first parent and the last two sections from the second
one; both are obtained by one-point crossover. The same crossover method is used
when generating offspring 3-1 and offspring 3-2, only this time Section 3 is
switched between the parents. By using two-point crossover, Section 2 is switched
between the parents to produce two new offspring chromosomes (offspring 2-1
and offspring 2-2). Because whole sections are exchanged, the offspring satisfy
the restriction of having one as the sum of probabilities within each section.
The crossover is performed on 80% of the selected chromosomes, but later tests
showed that the genetic algorithm for mapping tasks performs better with a much
lower crossover probability.
[Figure: parent 1 and parent 2, followed by offspring 1-1, 1-2, 2-1, 2-2, 3-1 and 3-2]
Fig. 1 − Two-point crossover.
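Since all three crossover variants exchange one whole section between the parents, they can be sketched with a single helper. The name `swap_section` and the nested-list chromosome representation are assumptions of this example; index 0 or 2 corresponds to the two one-point variants and index 1 to the two-point variant.

```python
def swap_section(parent1, parent2, index):
    """Return two offspring obtained by exchanging section `index`.

    Each chromosome is a list of three sections (lists of probabilities).
    Whole sections are swapped, so every section still sums to one.
    """
    child1 = [list(section) for section in parent1]
    child2 = [list(section) for section in parent2]
    child1[index], child2[index] = child2[index], child1[index]
    return child1, child2


# Hypothetical parents; parent1 is the chromosome from Table 1.
parent1 = [[0.72, 0.28], [0.32, 0.26, 0.14, 0.28], [0.43, 0.51, 0.06]]
parent2 = [[0.50, 0.50], [0.25, 0.25, 0.25, 0.25], [0.20, 0.30, 0.50]]
offspring_1_1, offspring_1_2 = swap_section(parent1, parent2, 0)
offspring_2_1, offspring_2_2 = swap_section(parent1, parent2, 1)
offspring_3_1, offspring_3_2 = swap_section(parent1, parent2, 2)
```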
2.4. Mutation
Two types of mutation have been implemented. The first mutation
operator randomly selects a section of the chromosome and then randomly
regenerates all the probabilities from that section. In Table 2, the values from
the third section change from 0.43, 0.51 and 0.06 to 0.24, 0.11 and,
respectively, 0.65.
Table 2
The Chromosome from Table 1 after the First Mutation
Probability | 0.72  0.28 | 0.32  0.26  0.14  0.28 | 0.24  0.11  0.65
Method      | 1a    1b   | 2a    2b    2c    2d   | 3a    3b    3c
Section     | Section 1  | Section 2              | Section 3
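The first mutation operator can be sketched as follows. How the new probabilities are drawn is not specified in the text, so the approach used here (random cut points on a 0 to 100 integer scale, whose gaps become the probabilities) is an assumption of this example, as is the function name `regenerate_section`.

```python
import random


def regenerate_section(chromosome, rng=random):
    """Return a copy with one randomly chosen section regenerated.

    Assumption: new values are drawn by sorting k - 1 random cut points in
    0..100 and using the gaps between them, so they lie on a 0.01 grid,
    are non-negative and sum to one.
    """
    mutated = [list(section) for section in chromosome]
    index = rng.randrange(len(mutated))
    k = len(mutated[index])
    cuts = sorted(rng.randint(0, 100) for _ in range(k - 1))
    bounds = [0] + cuts + [100]
    mutated[index] = [(hi - lo) / 100 for lo, hi in zip(bounds, bounds[1:])]
    return mutated
```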
The second mutation randomly chooses two genes from the same
section and modifies their values by adding or subtracting a random value
between 0 and the minimum of the two probabilities. For example, let us
consider the chromosome from Table 1 and apply the second mutation
operator. The second section is randomly chosen and the genes 2a and 2d are
selected. The minimum of the probabilities for the two selected genes is 0.28.
A random number is generated between 0.00 and 0.28, for example 0.19,
which is then added to gene 2a and subtracted from gene 2d. The resulting
chromosome is shown in Table 3, where gene 2a now has a probability of
0.51 and gene 2d has a probability of 0.09. The first mutation operator is
applied to 20% of the selected chromosomes, while the second one modifies
60% of the chosen chromosomes.
Table 3
The Chromosome from Table 1 after the Second Mutation
Probability | 0.72  0.28 | 0.51  0.26  0.14  0.09 | 0.43  0.51  0.06
Method      | 1a    1b   | 2a    2b    2c    2d   | 3a    3b    3c
Section     | Section 1  | Section 2              | Section 3
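The second mutation operator and the worked example from Table 3 can be reproduced with the sketch below. The function names `shift` and `shift_between_genes` are assumptions of this example; values are kept on the 0.01 grid by rounding to two decimals.

```python
import random


def shift(section, i, j, delta):
    """Move `delta` of probability from gene j to gene i within one section."""
    result = list(section)
    result[i] = round(result[i] + delta, 2)
    result[j] = round(result[j] - delta, 2)
    return result


def shift_between_genes(chromosome, rng=random):
    """Return a copy in which two genes of one random section exchange probability."""
    mutated = [list(section) for section in chromosome]
    index = rng.randrange(len(mutated))
    i, j = rng.sample(range(len(mutated[index])), 2)
    # The shifted amount lies between 0 and the smaller of the two genes,
    # so neither gene can leave the interval [0, 1].
    limit = round(min(mutated[index][i], mutated[index][j]) * 100)
    delta = rng.randint(0, limit) / 100
    mutated[index] = shift(mutated[index], i, j, delta)
    return mutated


# Worked example from the text: Section 2 of Table 1, genes 2a and 2d, value 0.19.
print(shift([0.32, 0.26, 0.14, 0.28], 0, 3, 0.19))  # [0.51, 0.26, 0.14, 0.09]
```

Because the random order of the two sampled genes decides which one gains and which one loses, the sketch covers both the "adding" and the "subtracting" case.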
3. Tests and Results
In order to compute the fitness of the chromosomes we used the
GA3SM algorithm which mapped 256 tasks to 8 processors in 500 iterations.
First, we generated 10 ETC matrices which we then used to run the GA3SM
algorithm 10 times (once for each matrix). Afterwards, we computed the mean
of the resulting makespans for each chromosome of our algorithm. The
following results were obtained by running our OGA algorithm on three sets of
10 ETC matrices.
Both genetic algorithms (GA3SM and OGA) had the same probabilities
for applying the crossover and mutation operators (0.8 and respectively 0.7).
Other tests with different probabilities for the two operators showed that the
outcome of the OGA algorithm is very similar to the results generated with the
aforementioned probabilities.
The best probabilities for the first mutation step started from 0.23 for
mutation method 1a and 0.77 for method 1b. After only 157 iterations, method
1b had an overwhelming probability of 0.99 of being selected, while 1a had
only 0.01. These values were maintained until the end of the algorithm’s
execution as shown in Fig. 2.
[Plot: probabilities of methods 1a and 1b, vertical axis from 0% to 100%, horizontal axis from 0 to 500]
Fig. 2 − The evolution of the probabilities for the first mutation step (1a and 1b).
Regarding the second mutation step (Fig. 3), the probability of
selecting method 2a started from 0.08 and reached 0.25, while the probability
for 2d started from 0.66 and reached 0.72. The values for the other two
methods, 2b and 2c, started from 0.18 and 0.08 and became almost negligible:
0.01 and, respectively, 0.02.
[Plot: probabilities of methods 2a, 2b, 2c and 2d, vertical axis from 0% to 100%, horizontal axis from 0 to 500]
Fig. 3 − The evolution of the probabilities for the second mutation
step (2a, 2b, 2c and 2d).
In Fig. 4, we can see that method 3a had the lowest probability of being
selected from the beginning (0.16) but it decreased further to 0.04. Methods 3b
and 3c started from 0.44 and 0.40 and ended on 0.29 and, respectively, 0.67.
[Plot: probabilities of methods 3a, 3b and 3c, vertical axis from 0% to 100%, horizontal axis from 0 to 500]
Fig. 4 − The evolution of the probabilities for the third
mutation step (3a, 3b and 3c).
The best makespan mean improved over 500 iterations from 733.25
time units to 729.30 t.u. (a decrease of about 0.5%), as can be seen in Fig. 5.
This means that choosing the best probabilities for selecting the mutation
methods can lead to a decrease of the makespan and, therefore, to an
improvement of the algorithm’s solution.
[Plot: makespan in time units, vertical axis from 728 to 734, horizontal axis from 0 to 500 iterations]
Fig. 5 − The makespan mean over 500 iterations of the OGA algorithm.
The best probabilities of selecting the mutation methods for each step
are shown in Fig. 6. The most widely used mutation combination is to select the
processor with the biggest total execution time (1b), then to select the task with
the biggest difference between the execution time on the selected processor and
the execution time on the fastest processor (2d) and, finally, to move that task to
the processor which executes it the fastest (3c).
[Chart: final probabilities per mutation step, vertical axis from 0% to 100%.
Step 1: 1a 0.01, 1b 0.99. Step 2: 2a 0.25, 2b 0.01, 2c 0.02, 2d 0.72.
Step 3: 3a 0.04, 3b 0.29, 3c 0.67.]
Fig. 6 − The final probabilities for each of the three steps of the mutation.
4. Further Optimizations to the Genetic Algorithm
Other parameters which can be optimized are the population size, the
crossover probability and the mutation probability. Depending on the stop
condition, optimizations can also be made by finding suitable values for the total
number of iterations of the genetic algorithm and the number of iterations since
the last fitness improvement. We considered that 500 iterations are acceptable to
find a very good solution; therefore, only the first three parameters are taken
into consideration.
Because the solution search space is somewhat limited for these
parameters, there is no need for a genetic algorithm to determine the best
combination. We considered that the population can have a size of 50, 100, 150 or
200 chromosomes and that the crossover and mutation probabilities vary from 0.0
to 1.0 with a step of 0.1. This results in 484 combinations of these parameters.
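The count of 484 follows from 4 population sizes times 11 crossover probabilities times 11 mutation probabilities. A minimal sketch of such an exhaustive search is given below; `best_combination` and its `mean_makespan` argument are hypothetical names, and the callable stands in for running GA3SM on the set of ETC matrices.

```python
from itertools import product

population_sizes = [50, 100, 150, 200]
probabilities = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

# Every (population size, crossover probability, mutation probability) triple.
grid = list(product(population_sizes, probabilities, probabilities))
print(len(grid))  # 484


def best_combination(combinations, mean_makespan):
    """Return the combination with the lowest mean makespan.

    `mean_makespan` is a hypothetical callable that maps a triple to the
    mean of the best makespans obtained over the set of ETC matrices.
    """
    return min(combinations, key=mean_makespan)
```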
For each combination, we ran the GA3SM algorithm with the same set
of 10 ETC matrices and, as with our Optimization Genetic Algorithm, we
computed the mean of the best makespan of each run of the algorithm.
Tests have shown that a population size of 200 chromosomes, a crossover
probability of 0.1 and a mutation probability of 0.9 provide the best results.
5. Conclusions
There is always room for improvement when it comes to assigning
independent tasks to processors, especially if the mapping heuristic is a genetic
algorithm. While the crossover ensures that new candidate solutions are
generated, a targeted mutation can significantly improve those solutions.
The mutation operator must lighten the load of the most occupied
processors and move the tasks to the least occupied ones while considering the
execution times of the selected tasks on the targeted processors. An algorithm
which has a low probability of performing a crossover and a high probability of
performing the 3-step mutation will, most likely, produce the best results.
Acknowledgements. This paper was realised with the support of EURODOC
“Doctoral Scholarships for research performance at European level” project, financed
by the European Social Fund and the Romanian Government.
REFERENCES
* * * Crossover and mutation - Introduction to Genetic Algorithms. 1998, available at
http://www.obitko.com/tutorials/genetic-algorithms/crossover-mutation.php.
* * * Genetic Algorithms and Evolutionary Computation. 2004, available at
http://www.talkorigins.org/faqs/genalg/genalg.html.
Alexandrescu A., Agavriloaei I., Craus M., A Genetic Algorithm for Mapping Tasks in
Heterogeneous Computing Systems. Accepted at the 11th International
Symposium on Automatic Control and Computer Science, ICSTCC, 2011.
Alexandrescu A., Craus M., Improving Mapping Heuristics in Heterogeneous
Computing. In Proceedings ECIT2010 6th European Conference on Intelligent
Systems and Technologies, October 7-9, Iaşi, 1–12, 2010.
Braun T.D. et al., A Comparison of Eleven Static Heuristics for Mapping a Class of
Independent Tasks onto Heterogeneous Distributed Computing Systems.
Journal of Parallel and Distributed Computing, Vol. 61, 6, 810–837, June 2001.
Dey S., Majumder S., Task Allocation in Heterogeneous Computing Environment by
Genetic Algorithm. In Proceedings of the 4th International Workshop on
Distributed Computing, Mobile and Wireless Computing, ser. IWDC ’02.
London, UK: Springer-Verlag, 348–352, 2002.
Omara F.A., Arafa M.M., Genetic Algorithms for Task Scheduling Problem. Journal of
Parallel and Distributed Computing, Oct. 2009.
Page A.J., Keane T.M., Naughton T.J., Multi-Heuristic Dynamic Task Allocation Using
Genetic Algorithms in a Heterogeneous Distributed System. Journal of Parallel
and Distributed Computing, Vol. Article in Press, 2010.
Pinedo M., Scheduling: Theory, Algorithms and Systems. 2nd Ed. Prentice Hall, 2006.
Ucar B. et al., Task Assignment in Heterogeneous Computing Systems. Journal of
Parallel and Distributed Computing, Vol. 66, 1, 32–46, Jan. 2006.
DETERMINING THE BEST MUTATION PROBABILITIES
OF A GENETIC ALGORITHM FOR MAPPING
TASKS
(Summary)
An important problem of heterogeneous distributed systems is the efficient
assignment of tasks to processors (processing nodes). There are various heuristics
which produce acceptable solutions, such as Min-Min, Min-Max or Simulated Annealing,
but the most widely used mapping method is the Genetic Algorithm.
This paper approaches the mapping of independent tasks from the perspective of a
genetic algorithm which uses a three-step mutation. It also presents a
way of improving this algorithm by determining the optimal probabilities
for each method of carrying out the mutation steps. Tests have shown that the best
combinations of the mutation steps are the ones which move the task from the processor
with the biggest total execution time to the processor which executes that task in
the shortest time. A better solution is also obtained if the probability of selecting
the mutation operator is significantly greater than the probability of
performing the crossover.