BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI
Publicat de Universitatea Tehnică „Gheorghe Asachi" din Iaşi
Tomul LVII (LXI), Fasc. 2, 2011
Secţia AUTOMATICĂ şi CALCULATOARE

DETERMINING THE BEST MUTATION PROBABILITIES OF A GENETIC ALGORITHM FOR MAPPING TASKS

BY ADRIAN ALEXANDRESCU*, MITICĂ CRAUS and IOAN AGAVRILOAEI

"Gheorghe Asachi" Technical University of Iaşi, Faculty of Automatic Control and Computer Engineering

Received: March 9, 2011
Accepted for publication: June 7, 2011

Abstract. An important aspect of heterogeneous computing systems is the problem of efficiently mapping tasks to processors. There are various methods of obtaining acceptable solutions to this problem, but the genetic algorithm is considered to be among the best heuristics for assigning independent tasks to processors. This paper focuses on how the genetic heuristic can be improved by determining the best probabilities for a three-step mutation operator. By computing the probabilities for selecting a mutation combination, we concluded that the most favoured combinations are the ones which select a task from the processor with the biggest total execution time and then move the selected task to the processor which executes it the fastest. Also, the probability of applying the special mutation operator on a chromosome must be much greater than the probability of applying the crossover operator.

Key words: task mapping, genetic algorithm, mutation.

2000 Mathematics Subject Classification: 68M14, 68M20, 68T20.

* Corresponding author; e-mail: [email protected]

1. Introduction

Mapping tasks to processors is a very important problem in heterogeneous computing systems; it often focuses on determining which tasks will execute on which processors, without necessarily considering the order in which the tasks are assigned.
The mapping algorithm uses estimations of the execution time of each task on each processor; these estimations are kept in an Expected Time to Compute (ETC) matrix. If no ordering can be made and no processor can be singled out as executing all the tasks faster than another processor, then the ETC matrix is considered to be inconsistent (Braun et al., 2001). The goal of the mapping heuristics is to minimize the makespan, defined as the time it takes for all the processors to execute all the assigned tasks (Pinedo, 2006; Ucar et al., 2006).

There are heuristics such as Min-Min, Min-Max or Simulated Annealing which provide fast results, but the Genetic Algorithm offers a much better solution with little time cost (Alexandrescu & Craus, 2010; Dey & Majumder, 2002; Omara & Arafa, 2009; Page et al., 2010). The Genetic Algorithm with 3-Step Mutation (GA3SM) described in (Alexandrescu et al., 2011) is a task mapping heuristic which uses a 3-step mutation to find a better solution. Our goal is to further improve the aforementioned algorithm by finding the optimal probabilities for the mutation methods. The GA3SM algorithm uses the following mutation steps:

1. Select a processor:
   a) randomly;
   b) with the biggest total execution time.
2. Select a task from the processor chosen at step 1:
   a) randomly;
   b) with the longest execution time on the selected processor;
   c) with the shortest execution time on the selected processor;
   d) with the biggest difference between the execution time on the selected processor and the execution time on the fastest processor.
3. Move the selected task to another processor, chosen:
   a) randomly;
   b) as the one with the lowest sum of execution times;
   c) as the one which executes the selected task the fastest.

2. A Genetic Algorithm for Optimizing the Mutation Operator

2.1. General Considerations

In order to improve the mutation operator, we implemented a genetic algorithm to determine the best probabilities of selecting a mutation method for each step.
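For illustration, one combination of the steps above (1b, 2d and 3c) can be sketched as follows. This is a minimal sketch under our own assumptions, not the GA3SM implementation: the ETC matrix is taken as etc[t][p] (expected time of task t on processor p), a mapping is a list where mapping[t] is the processor of task t, and the function names are hypothetical.

```python
# Sketch of one 3-step mutation combination (1b, 2d, 3c); names are illustrative.

def total_time(mapping, etc, p):
    """Total execution time of all tasks currently assigned to processor p."""
    return sum(etc[t][p] for t in range(len(mapping)) if mapping[t] == p)

def mutate_1b_2d_3c(mapping, etc, n_procs):
    # Step 1b: select the processor with the biggest total execution time.
    src = max(range(n_procs), key=lambda p: total_time(mapping, etc, p))
    tasks = [t for t in range(len(mapping)) if mapping[t] == src]
    if not tasks:
        return mapping
    # Step 2d: select the task with the biggest difference between its time
    # on the selected processor and its time on the processor that runs it fastest.
    task = max(tasks, key=lambda t: etc[t][src] - min(etc[t]))
    # Step 3c: move the task to the processor which executes it the fastest.
    mapping = list(mapping)
    mapping[task] = min(range(n_procs), key=lambda p: etc[task][p])
    return mapping
```

For example, with etc = [[5, 1], [4, 4]] and both tasks on processor 0, the mutation moves task 0 to processor 1, its fastest processor.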
Our Optimization Genetic Algorithm (OGA) uses an initial population of 50 chromosomes, each representing a candidate solution. A chromosome is made of genes which occupy a position in the chromosome called a locus. Each gene represents a trait and the possible values for a trait are called alleles. In our case, a gene represents the probability of selecting a mutation method and the alleles range from 0.00 to 1.00. An example of this encoding is shown in Table 1. The two methods from the first mutation step (1a and 1b) have a 0.72 and, respectively, a 0.28 probability of being selected. Methods 2a, 2b, 2c and 2d have a probability of 0.32, 0.26, 0.14 and 0.28, while the methods from the last step (3a, 3b and 3c) can be selected with 0.43, 0.51 and, respectively, 0.06 probability.

Table 1
Example of a Chromosome Encoding

  0.72  0.28 | 0.32  0.26  0.14  0.28 | 0.43  0.51  0.06
  1a    1b   | 2a    2b    2c    2d   | 3a    3b    3c
   Section 1 |        Section 2       |    Section 3

The chromosome has three sections, with their lengths corresponding to the number of methods from each step. The last probability from each section is calculated as one minus the sum of the other probabilities from that section. The sum of the probabilities from each section must be 1 and all the values must be positive.

Taking into account the aforementioned chromosome restrictions, we can determine the size of the solution search space. If we consider s to be the number of possible values for a probability (the number of alleles for a gene), then the number of possible combinations for Section 1 is s, because the probability for method 1b is one minus the probability for method 1a. The number of combinations for Section 2 is the same as for Section 3 and it is shown in Eq. (1):

s + (s − 1) + (s − 2) + … + 2 + 1 = s(s + 1)/2.   (1)

Considering the number of combinations for each section, the size of the solution search space is:

s · s(s + 1)/2 · s(s + 1)/2 = s³(s + 1)²/4.   (2)

Our algorithm uses probability values from the interval 0.0 to 1.0 with a step of 0.01. This leads to 101 possible alleles which, according to Eq. (2), can offer approximately 2.7·10⁹ possible solutions.

For each iteration, the best 20% of the chromosomes from the old population are copied to the new population (elitism). The algorithm ends once 500 iterations have been performed and it returns the best obtained probabilities for the mutation combinations.

2.2. Fitness Function

The most challenging part is computing the fitness function, which evaluates the chromosomes and determines the quality of the candidate solutions (http://www.talkorigins.org/faqs/genalg/genalg.html). In order to compute the fitness of a chromosome, we must determine the degree of improvement of the best candidate solution when running the GA3SM algorithm with the probabilities resulting from OGA. The fitness function runs the GA3SM algorithm on a predefined set of inconsistent ETC matrices. For each chromosome, the resulting fitness value is equal to the mean of the makespan improvements.

The same set of ETC matrices is used when computing the fitness of all the chromosomes. Consequently, we save each computed fitness value in a map which has the chromosomes as keys. In order to determine the fitness of a chromosome, we first check in the map whether it was computed before; in some cases we therefore do not need to run the GA3SM algorithm, which is highly computationally intensive. The best-fit chromosome of our OGA algorithm is the one which provides the best mutation probabilities for obtaining the biggest makespan decrease when running the GA3SM algorithm.

2.3. Selection and Crossover

The selection operator uses the roulette-wheel selection method, which favours the best chromosomes, while less fit chromosomes also have a small chance of being selected.
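Roulette-wheel selection can be sketched as follows; this is a minimal illustration, not the paper's code, and it assumes fitness values are non-negative, with selection probability proportional to fitness.

```python
import random

# Sketch of roulette-wheel selection: a random point on a "wheel" whose
# slices are proportional to fitness picks the chromosome, so fitter
# chromosomes are favoured but weaker ones keep a small chance.

def roulette_select(population, fitnesses, rng=random):
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    acc = 0.0
    for chromosome, fit in zip(population, fitnesses):
        acc += fit
        if pick <= acc:
            return chromosome
    return population[-1]  # guard against floating-point drift
```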
The crossover process implies taking two chromosomes, called parent chromosomes, and producing two offspring chromosomes (http://www.obitko.com/tutorials/genetic-algorithms/crossover-mutation.php). We use three types of crossover: two one-point crossover methods and one two-point crossover method. The crossover operator switches sections of the parent chromosomes. In Fig. 1, offspring 1-1 is produced by taking Section 1 from parent 2 and the other two sections from parent 1, while the second offspring takes the first section from the first parent and the last two sections from the second one, by using one-point crossover. The same crossover method is used when generating offspring 3-1 and offspring 3-2, only this time Section 3 is switched between the parents. By using two-point crossover, Section 2 is switched between the parents to produce two new offspring chromosomes (offspring 2-1 and offspring 2-2) which satisfy the restriction of having one as the sum of the probabilities within each section. The crossover is performed on 80% of the selected chromosomes, but later tests showed that the genetic algorithm for mapping tasks performs better with a much lower crossover probability.

Fig. 1 − Two-point crossover.

2.4. Mutation

Two types of mutation have been implemented. The first mutation operator randomly selects a section of the chromosome and then randomly regenerates all the probabilities from that section. In Table 2, the values from the third section change from 0.43, 0.51 and 0.06 to 0.24, 0.11 and, respectively, 0.65.
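The first mutation operator described above can be sketched as follows. This is an illustration under our own assumptions: chromosomes are flat tuples, the section boundaries are (2, 4, 3) as in Table 1, and probabilities are generated in hundredths so each regenerated section still sums to exactly 1.

```python
import random

# Sketch of the first mutation: pick a random section and regenerate all of
# its probabilities, keeping the section sum equal to 1. Section layout and
# function names are illustrative assumptions.

SECTIONS = [(0, 2), (2, 6), (6, 9)]  # (start, end) of each chromosome section

def regenerate_section(length, rng):
    left = 100  # work in hundredths (0.01 grid) to avoid floating-point drift
    values = []
    for _ in range(length - 1):
        v = rng.randint(0, left)
        values.append(v)
        left -= v
    values.append(left)  # last probability is 1 minus the sum of the others
    return [v / 100 for v in values]

def section_mutation(chromosome, rng=random):
    start, end = SECTIONS[rng.randrange(len(SECTIONS))]
    genes = list(chromosome)
    genes[start:end] = regenerate_section(end - start, rng)
    return tuple(genes)
```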
Table 2
The Chromosome from Table 1 after the First Mutation

  0.72  0.28 | 0.32  0.26  0.14  0.28 | 0.24  0.11  0.65
  1a    1b   | 2a    2b    2c    2d   | 3a    3b    3c
   Section 1 |        Section 2       |    Section 3

The second mutation randomly chooses two genes from the same section and modifies their values by adding and subtracting a random value between 0 and the minimum of the two probabilities. For example, let us consider the chromosome from Table 1 and apply the second mutation operator. The second section is randomly chosen and the genes 2a and 2d are selected. The minimum of the probabilities of the two selected genes is 0.28. A random number is generated between 0.00 and 0.28, for example 0.19, which is then added to gene 2a and subtracted from gene 2d. The resulting chromosome is shown in Table 3, where gene 2a now has a probability of 0.51 and gene 2d has a probability of 0.09. The first mutation operator is applied to 20% of the selected chromosomes, while the second one modifies 60% of the chosen chromosomes.

Table 3
The Chromosome from Table 1 after the Second Mutation

  0.72  0.28 | 0.51  0.26  0.14  0.09 | 0.43  0.51  0.06
  1a    1b   | 2a    2b    2c    2d   | 3a    3b    3c
   Section 1 |        Section 2       |    Section 3

3. Tests and Results

In order to compute the fitness of the chromosomes we used the GA3SM algorithm, which mapped 256 tasks to 8 processors in 500 iterations. Firstly, we generated 10 ETC matrices which we then used to run the GA3SM algorithm 10 times (once for each matrix). Afterwards, we computed the mean of the resulting makespans for each chromosome of our algorithm. The following results were obtained by running our OGA algorithm on three sets of 10 ETC matrices. Both genetic algorithms (GA3SM and OGA) had the same probabilities of applying the crossover and mutation operators (0.8 and, respectively, 0.7).
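The evaluation setup just described can be sketched as follows. All names here are our own assumptions, and run_ga3sm is only a runnable placeholder (a greedy fastest-processor assignment) standing in for the real GA3SM run, so the sketch shows the protocol (10 random ETC matrices, mean makespan over the set) rather than the algorithm itself.

```python
import random
from statistics import mean

def random_etc(n_tasks, n_procs, rng):
    """An ETC matrix with unrelated (inconsistent) execution times."""
    return [[rng.uniform(1, 100) for _ in range(n_procs)] for _ in range(n_tasks)]

def run_ga3sm(etc):
    # Placeholder only: assign every task to its fastest processor and report
    # the resulting makespan (the real GA3SM would search much further).
    n_procs = len(etc[0])
    loads = [0.0] * n_procs
    for row in etc:
        p = min(range(n_procs), key=lambda q: row[q])
        loads[p] += row[p]
    return max(loads)

def mean_makespan(matrices):
    """Mean makespan over the fixed set of ETC matrices."""
    return mean(run_ga3sm(m) for m in matrices)

rng = random.Random(7)
matrices = [random_etc(256, 8, rng) for _ in range(10)]
```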
Other tests with different probabilities for the two operators showed that the outcome of the OGA algorithm is very similar to the results generated with the aforementioned probabilities.

The best probabilities for the first mutation step started from 0.23 for mutation method 1a and 0.77 for method 1b. After only 157 iterations, method 1b had an overwhelming probability of 0.99 of being selected, while 1a had only 0.01. These values were maintained until the end of the algorithm's execution, as shown in Fig. 2.

Fig. 2 − The evolution of the probabilities for the first mutation step (1a and 1b).

Regarding the second mutation step (Fig. 3), the probability of selecting method 2a started from 0.08 and reached 0.25, while the probability for 2d started from 0.66 and reached 0.72. The values for the other two methods, 2b and 2c, started from 0.18 and 0.08 and became almost negligible: 0.01 and, respectively, 0.02.

Fig. 3 − The evolution of the probabilities for the second mutation step (2a, 2b, 2c and 2d).

In Fig. 4, we can see that method 3a had the lowest probability of being selected from the beginning (0.16), and it decreased further to 0.04. Methods 3b and 3c started from 0.44 and 0.40 and ended on 0.29 and, respectively, 0.67.

Fig. 4 − The evolution of the probabilities for the third mutation step (3a, 3b and 3c).

The best makespan mean improved over 500 iterations from 733.25 time units to 729.30 t.u., as can be seen in Fig. 5. This means that choosing the best probabilities for selecting the mutation methods can lead to a decrease of the makespan and, therefore, to a significant improvement of the algorithm's solution.
Fig. 5 − The makespan mean over 500 iterations of the OGA algorithm.

The best probabilities of selecting the mutation methods for each step are shown in Fig. 6. The most widely used mutation combination is to select the processor with the biggest total execution time (1b), then to select the task with the biggest difference between the execution time on the selected processor and the execution time on the fastest processor (2d) and, finally, to move that task to the processor which executes it the fastest (3c).

Fig. 6 − The final probabilities for each of the three steps of the mutation (1a: 0.01, 1b: 0.99; 2a: 0.25, 2b: 0.01, 2c: 0.02, 2d: 0.72; 3a: 0.04, 3b: 0.29, 3c: 0.67).

4. Further Optimizations to the Genetic Algorithm

Other parameters which can be optimized are the population size, the crossover probability and the mutation probability. Depending on the stop condition, optimizations can also be made by finding suitable values for the total number of iterations of the genetic algorithm and for the number of iterations since the last fitness improvement. We considered that 500 iterations are enough to find a very good solution, therefore only the first three parameters are taken into consideration. Because the solution search space for these parameters is rather limited, there is no need for a genetic algorithm to determine the best combination. We considered that the population can have a size of 50, 100, 150 or 200 chromosomes and that the crossover and mutation probabilities vary from 0.0 to 1.0 with a step of 0.1. This results in 484 combinations of these parameters.
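The exhaustive sweep over these parameters can be sketched as follows; run_ga3sm here is a hypothetical stand-in for a function that returns the mean makespan obtained with a given population size, crossover probability and mutation probability.

```python
from itertools import product

# Sketch of the exhaustive parameter sweep: every combination of population
# size and crossover/mutation probability is evaluated and the one with the
# lowest mean makespan wins. 4 sizes * 11 * 11 probabilities = 484 combinations.

POP_SIZES = (50, 100, 150, 200)
PROBS = [round(p * 0.1, 1) for p in range(11)]  # 0.0 .. 1.0 with a step of 0.1

def best_parameters(run_ga3sm):
    combos = product(POP_SIZES, PROBS, PROBS)
    return min(combos, key=lambda c: run_ga3sm(*c))
```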
For each combination, we ran the GA3SM algorithm with the same set of 10 ETC matrices and, as with our Optimization Genetic Algorithm, we computed the mean of the best makespan of each run of the algorithm. Tests have shown that a population size of 200 chromosomes, a crossover probability of 0.1 and a mutation probability of 0.9 provide the best results.

5. Conclusions

There is always room for improvement when it comes to assigning independent tasks to processors, especially if the mapping heuristic is a genetic algorithm. While the crossover ensures that new candidate solutions are generated, a targeted mutation can significantly improve those solutions. The mutation operator must lighten the load of the most occupied processors and move the tasks to the least occupied ones, while considering the execution times of the selected tasks on the targeted processors. An algorithm which has a low probability of performing a crossover and a high probability of performing the 3-step mutation will, most likely, produce the best results.

Acknowledgements. This paper was realised with the support of the EURODOC "Doctoral Scholarships for research performance at European level" project, financed by the European Social Fund and the Romanian Government.

REFERENCES

* * * Crossover and Mutation − Introduction to Genetic Algorithms. 1998, available at http://www.obitko.com/tutorials/genetic-algorithms/crossover-mutation.php.
* * * Genetic Algorithms and Evolutionary Computation. 2004, available at http://www.talkorigins.org/faqs/genalg/genalg.html.
Alexandrescu A., Agavriloaei I., Craus M., A Genetic Algorithm for Mapping Tasks in Heterogeneous Computing Systems. Accepted at the 11th International Symposium on Automatic Control and Computer Science, ICSTCC, 2011.
Alexandrescu A., Craus M., Improving Mapping Heuristics in Heterogeneous Computing. In Proceedings of ECIT2010 − 6th European Conference on Intelligent Systems and Technologies, October 7−9, Iaşi, 1−12, 2010.
Braun T.D. et al., A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems. Journal of Parallel and Distributed Computing, Vol. 61, 6, 810−837, June 2001.
Dey S., Majumder S., Task Allocation in Heterogeneous Computing Environment by Genetic Algorithm. In Proceedings of the 4th International Workshop on Distributed Computing, Mobile and Wireless Computing, ser. IWDC '02, London, UK, Springer-Verlag, 348−352, 2002.
Omara F.A., Arafa M.M., Genetic Algorithms for Task Scheduling Problem. Journal of Parallel and Distributed Computing, Oct. 2009.
Page A.J., Keane T.M., Naughton T.J., Multi-Heuristic Dynamic Task Allocation Using Genetic Algorithms in a Heterogeneous Distributed System. Journal of Parallel and Distributed Computing, Article in Press, 2010.
Pinedo M., Scheduling: Theory, Algorithms and Systems. 2nd Ed., Prentice Hall, 2006.
Ucar B. et al., Task Assignment in Heterogeneous Computing Systems. Journal of Parallel and Distributed Computing, Vol. 66, 1, 32−46, Jan. 2006.

DETERMINAREA CELOR MAI BUNE PROBABILITĂŢI DE MUTAŢIE ALE UNUI ALGORITM GENETIC DE MAPARE A TASKURILOR

(Rezumat)

An important problem in heterogeneous distributed systems is that of efficiently assigning tasks to processors (processing nodes). There are various heuristics which produce acceptable solutions, such as Min-Min, Min-Max or Simulated Annealing, but the most widely used mapping method is the Genetic Algorithm. This paper approaches the mapping of independent tasks from the perspective of a genetic algorithm which uses a three-step mutation. It also presents a way of improving this algorithm by determining the optimal probabilities for each method of performing the mutation steps.
The tests demonstrated that the best combinations of the mutation steps are those which move the task from the processor with the biggest total execution time to the processor which executes that task in the shortest time. A better solution is also obtained when the probability of selecting the mutation operator is significantly higher than the probability of performing the crossover.