Dynamic Partitioning Based Scheduling of Real-Time Tasks in Multicore Processors

N. Saranya and R. C. Hansdah
Department of Computer Science and Automation
Indian Institute of Science, Bangalore, India
Email: {saranya.n, hansdah}@csa.iisc.ernet.in

Abstract—Existing real-time multicore schedulers use either the global or the partitioned scheduling technique to schedule real-time tasks. Partitioned scheduling is a static approach in which a task is mapped to a per-processor ready queue before it is scheduled, and it cannot migrate thereafter. Partitioned scheduling makes ineffective use of the available processing power and incurs high overhead when the real-time tasks are dynamic in nature. Global scheduling is a dynamic approach, in which the processors share a single ready queue in order to execute the highest priority tasks. Global scheduling allows task migration, which results in high scheduling overhead. In this paper, we present a dynamic partitioning based scheduling technique for real-time tasks, called DP scheduling. In DP scheduling, jobs of tasks are assigned to cores when they are released, and they remain on the same core until they finish execution. The partitioning in DP scheduling is done based on the slack time and priority of jobs. If a job cannot be allocated to any single core, it is split and executed on more than one core. The DP scheduling technique attempts to retain the good features of both global and partitioned scheduling without compromising on resource utilization and, at the same time, tries to minimize the scheduling overhead. We have tested the DP scheduling technique with the EDF scheduling policy at each core, and we term the resulting scheduling algorithm DP-EDF. The performance of the DP-EDF scheduling algorithm has been evaluated through a simulation study and through its implementation in LITMUS^RT on a 64-bit Intel processor with eight logical cores. Both the simulation and the experimental results show that the DP-EDF scheduling algorithm has better performance in terms of resource utilization, and comparable or better performance in terms of scheduling overhead, in comparison to contemporary scheduling algorithms.

I. INTRODUCTION

The multicore processor architecture provides higher performance without driving up power consumption and heat dissipation. Multicore processors boost performance by integrating two or more processing cores into a single socket. A wide variety of products, from smart phones to desktops to servers, have gained enormously by incorporating multicore technology, and embedded systems can benefit immensely from it as well. Hence, in recent years, there has been a steady increase in research and industrial work aimed at exploiting the advantages of multicore processors for real-time systems.

Real-time tasks in multicore systems are scheduled using two major scheduling techniques, viz., partitioned and global scheduling. In partitioned scheduling, n independent real-time tasks are distributed across the m cores of an m-core system such that each task is assigned to a single core. One of the major advantages of partitioned scheduling is that different uniprocessor scheduling algorithms can be applied to the system after the tasks are partitioned. Since tasks do not migrate across processors, the scheduling overhead incurred in partitioned scheduling is free of migration cost. The mapping of tasks to cores is done off-line, before the real-time tasks are scheduled to run, using bin-packing heuristics, since optimal partitioning is a known NP-hard problem.
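To make the partitioning step concrete, the following C sketch implements the first-fit decreasing heuristic that the P-EDF baseline in Section V also uses. It is a minimal illustration, not code from the paper: the task structure and the per-core utilization bound of 1 (the uniprocessor EDF bound for implicit-deadline tasks) are our own simplifications.

    #include <stdlib.h>

    typedef struct { double util; int core; } task_t;

    /* Sort tasks by decreasing utilization. */
    static int cmp_desc(const void *a, const void *b) {
        double ua = ((const task_t *)a)->util;
        double ub = ((const task_t *)b)->util;
        return (ua < ub) - (ua > ub);
    }

    /* First-fit decreasing: returns 0 on success, -1 if some task cannot
     * be placed. load[k] accumulates the utilization assigned to core k;
     * a task fits on a core as long as the core's total utilization does
     * not exceed 1 (the EDF bound for implicit-deadline tasks). */
    int partition_ffd(task_t *tasks, int n, int m) {
        double load[m];
        for (int k = 0; k < m; k++) load[k] = 0.0;
        qsort(tasks, n, sizeof(task_t), cmp_desc);
        for (int i = 0; i < n; i++) {
            int placed = 0;
            for (int k = 0; k < m && !placed; k++) {
                if (load[k] + tasks[i].util <= 1.0) {
                    load[k] += tasks[i].util;
                    tasks[i].core = k;
                    placed = 1;
                }
            }
            if (!placed) return -1;   /* task set not partitionable by FFD */
        }
        return 0;
    }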
Since the tasks are statically allocated using the bin-packing method, the processors are not utilized to their full capacity. In on-line systems, where only a few tasks are known prior to scheduling, the "task-to-core" mapping has to be redone every time a new task arrives, and this reallocation of tasks at runtime increases the system's scheduling overhead. According to [1], a set of n real-time tasks, in which the utilization of no task is greater than umax, is schedulable using partitioned scheduling on an m-processor system if U < (mβ+1)/(β+1), where β = ⌊1/umax⌋ and U is the total utilization of the set of n real-time tasks. For example, with m = 4 and umax = 0.5, β = 2 and the bound is (2·4+1)/3 = 3, i.e., 75% of the platform capacity.

In global scheduling, the m cores in the system share a single system-wide queue, in which the ready tasks are kept sorted according to their priority. At any instant of time, the m cores in the system execute the m highest priority tasks. In general, global scheduling is better suited for on-line systems. Unlike in partitioned scheduling, the tasks in global scheduling can migrate between cores, resulting in higher scheduling overhead. The migration can be of two types, viz., task-level migration, in which migration occurs only at job boundaries, and job-level migration, in which migration occurs during job execution. Among global scheduling approaches, the Pfair class of algorithms [2] has a utilization bound of 100%, but Pfair algorithms suffer from high scheduling overhead due to a large number of migrations and context switches.

Semi-partitioned scheduling [3], [4] is another scheduling technique, which combines the concepts of global and partitioned scheduling. In semi-partitioned scheduling, a partitioning algorithm is applied to the task set, dividing it into two subsets, viz., the non-split tasks and the split tasks. Most of the tasks are non-split tasks, and each non-split task is allocated to a single core. The split tasks are those that cannot be allocated to any single core (because the available utilization on each core is less than the task's utilization); each of them is split and assigned to more than one core. The non-split tasks are fixed and do not migrate, while the split tasks migrate across their allocated cores. Unlike global scheduling, semi-partitioned scheduling reduces the migration cost by restricting the number of split tasks; moreover, in global scheduling the task migration is dynamic in nature, while in semi-partitioned scheduling it is static and planned prior to the execution of the tasks. Semi-partitioned scheduling tries to make maximum use of the available processor utilization. Sporadic-EKG (SEKG) [5], [6] and Notional Processor Scheduling (NPS-F) [7] are two semi-partitioned scheduling algorithms with utilization bounds of 88.8% and 66.6% respectively. Both these algorithms can be configured to achieve a utilization bound of up to 100%, but this comes at the cost of an increased number of preemptions and context switches.

In this paper, we propose a novel dynamic partitioning based scheduling technique for real-time tasks, called DP scheduling. The DP scheduling technique retains the good features of the existing scheduling techniques, viz., the dynamic nature of global scheduling, the no-migration characteristic of partitioned scheduling, and the high resource utilization of semi-partitioned scheduling, with reduced overhead. Essentially, DP scheduling entails the following.

• A dynamic partitioning approach, in which the "task-to-core" mapping is done for every job at the time of its release.
• A semi-partitioning approach to schedule those jobs that cannot be assigned to a single processor, by allowing them to migrate across different processors.
• A service core to manage all the scheduling activities of the real-time tasks.
• Earliest deadline first (EDF) based priority scheduling at each core to schedule the real-time tasks.

The DP scheduling technique aims to make maximum utilization of the available CPU resources and to minimize the scheduling overhead, while ensuring the schedulability of the real-time system. Both on-line and off-line systems can greatly benefit from the DP scheduling technique.

The rest of the paper is organized as follows. The next section elucidates the system model. Section III presents the details of the DP scheduling technique. Section IV discusses the various execution overheads. The DP scheduling technique is evaluated in Section V. Section VI reviews a few related works, and Section VII concludes the paper.

II. SYSTEM MODEL

We consider a set of periodic real-time tasks, τ, containing n independent tasks, {τ1, τ2, ..., τn}. The real-time tasks execute on a multicore platform containing m identical cores, {P1, P2, ..., Pm}. Each task τi is represented by the tuple (Ei, Di, Ti), where Ei, Di and Ti are the task's worst case execution time (WCET), relative deadline and period respectively. The utilization of τi, denoted by Ui, is equal to Ei/Ti. The utilization of the task set τ, denoted by U(τ), is Σ_{i=1 to n} Ui. A job is an instance of a task, and the j-th instance of τi is denoted by jij. The absolute deadline of a job jij released at time t is Di + t. The inter-arrival time of two successive jobs jij and ji(j+1) is equal to Ti. All jobs complete their execution before the arrival of the next job, i.e., jij completes its execution before the arrival of ji(j+1). At any given time, no two processors can execute the same job, and a job, once released, does not suspend itself.

III. DP SCHEDULING TECHNIQUE

In this work, the DP scheduling technique uses the Earliest Deadline First (EDF) scheduling policy at each core. EDF is a dynamic, event-driven scheduling policy, in which the priority of a task can change over a period of time; however, the priority of a job remains fixed. In the EDF scheduling policy, the priority of a job depends on its absolute deadline: the earlier the absolute deadline, the higher the priority. The system employing the DP scheduling technique with the EDF scheduling policy at each core to schedule real-time tasks is referred to as DP-EDF.

A. System Architecture

Based on their functions, the processors in the multicore system are classified into two types, viz., real-time processors and Linux processors. The m cores in the system are divided into l real-time processors and (m-l) Linux processors. The cores in the multicore system communicate with one another using inter-processor interrupts (IPIs).

1) Linux processors: Linux tasks are scheduled to run only on the Linux processors. All external interrupts (except IPIs and timer interrupts) are redirected to the Linux processors.

2) Real-Time processors: The real-time processors are responsible for executing the real-time tasks and managing their scheduling activities. The l real-time processors are divided into one rt-server and (l-1) rt-clients. The rt-server and the rt-clients together form an rt-scheduling entity. Figure 1 gives an overview of the rt-scheduling entity.

[Fig. 1: Overview of rt-scheduling entity in DP Scheduling Technique.]

• rt-server: The rt-server manages the scheduling activities of the rt-clients. All jobs are released on the rt-server. On a job release, the rt-server performs the task-to-core mapping and dispatches the job to the appropriate rt-client. The rt-server does not execute any real-time tasks; it is only responsible for supervising the execution of the real-time tasks on the rt-clients.

• rt-client: The real-time tasks are executed on the rt-clients. Each rt-client is associated with a queue data structure, called the ready queue, which contains all runnable tasks waiting for the CPU resource, in decreasing order of their priority. On job completion, the rt-client schedules the next highest priority task from its ready queue to execute. Tasks that have completed their execution in the current period and are waiting to be released in the next period are added to the global release queue. There is one global release queue per rt-scheduling entity.
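Before turning to the job model, a small C sketch may help fix the architecture's data structures; all type and field names here are our own illustrative choices, not the paper's implementation.

    #define MAX_CLIENTS 8               /* illustrative upper bound on l-1 */

    typedef struct job {
        double rem_exec;                /* e'ij(t): remaining execution time   */
        double deadline;                /* dij: absolute deadline (EDF prio)   */
        double period;                  /* Ti: period of the task              */
        struct job *next;               /* deadline-ordered singly linked list */
    } job_t;

    typedef struct {
        job_t *head;                    /* earliest deadline first             */
        /* in a real system a lock protects this queue, since the rt-server
         * also inspects it during job mapping (see Section IV)               */
    } ready_queue_t;

    typedef struct {
        ready_queue_t ready[MAX_CLIENTS];  /* one ready queue per rt-client   */
        ready_queue_t release_q;           /* one global release queue per
                                              rt-scheduling entity            */
        int num_clients;                   /* the l-1 rt-clients              */
    } rt_sched_entity_t;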
B. Job Model

A job jij is represented by the tuple (eij, dij), where eij is the job's WCET and dij is its absolute deadline. The release time of jij is denoted by tij. In the DP-EDF scheduling algorithm, in addition to the worst-case execution time and the absolute deadline, every job is associated with four more parameters, viz., remaining execution time, outset time, slack time and instantaneous utilization.

Definition: The remaining execution time of job jij at time t, denoted by e'ij(t), is Ei minus the total amount of time the job has spent executing on the processors.

Definition: The outset time of job jij on Pk's ready queue at time t, denoted by Oij(Pk, t), is the maximum duration of time jij has to wait in Pk's ready queue before it can start its execution on Pk. If jij is not assigned to any core, then, for every rt-client Pr,

    Oij(Pr, t) = Σ_{jmn ∈ H(jij, Pr)} e'mn(t).

If jij is assigned to core Pk, then, for every rt-client Pr,

    Oij(Pr, t) = Σ_{jmn ∈ H(jij, Pr)} e'mn(t), if r = k,
    Oij(Pr, t) = ∞, if r ≠ k,

where e'mn(t) is the remaining execution time of jmn, H(jij, Pr) is the list of jobs on Pr's ready queue such that ρ(jmn) > ρ(jij), and ρ(jij) denotes the priority of jij. If H(jij, Pr) is empty, then Σ_{jmn ∈ H(jij, Pr)} e'mn(t) = 0.

Definition: The instantaneous utilization, u'ij(t), of jij at time t depends on jij's remaining execution time and period, and is defined as

    u'ij(t) = e'ij(t) / Ti.

The instantaneous utilization, U'(Pr, t), of an rt-client Pr is the sum of the instantaneous utilizations of all jobs on Pr's ready queue at time t.

Definition: The slack time, Sij(t), of jij at time t is the maximum duration of time by which jij can delay its execution while still ensuring that it does not miss its deadline:

    Sij(t) = dij - t - e'ij(t).

If Sij(t) = 0, jij can meet its deadline only if it is scheduled to run immediately and without any preemption. If Sij(t) < 0, jij can no longer meet its deadline.
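Since the mapping decisions below are driven entirely by outset time and slack, it may help to see them in code. The following sketch computes both for the deadline-ordered job list introduced above; it is illustrative only and assumes the e'mn(t) values (rem_exec) are kept up to date.

    /* Outset time of job j on ready queue rq: the total remaining execution
     * time of all queued jobs with higher priority, i.e., with an earlier
     * absolute deadline under EDF. Returns 0 if no such job is queued. */
    double outset_time(const ready_queue_t *rq, const job_t *j) {
        double sum = 0.0;
        for (const job_t *cur = rq->head; cur != NULL; cur = cur->next)
            if (cur->deadline < j->deadline)   /* rho(cur) > rho(j) */
                sum += cur->rem_exec;
        return sum;
    }

    /* Slack of job j at time now: Sij(t) = dij - t - e'ij(t). */
    double slack_time(const job_t *j, double now) {
        return j->deadline - now - j->rem_exec;
    }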
C. Task-to-Core Mapping

DP scheduling is a partition-based scheduling technique in which, at any instant of time, each rt-client executes the highest priority job from its ready queue. Unlike in partitioned scheduling, the partitioning of tasks in DP scheduling is dynamic in nature and is done at run time. In DP scheduling, the task-to-core operation is done for every job, thus allowing different jobs of the same task to execute on different cores.

A scheduling point is a juncture in the process of scheduling at which the scheduler makes a decision regarding which task to schedule next. Job release, job completion and job preemption are the scheduling points in the DP-EDF scheduling algorithm, and a job's processor affinity changes only on its release. In DP-EDF scheduling, the task-to-core operation is done in two steps, viz., job mapping and job assignment. In job mapping, the rt-client to which the job must be assigned is determined, and in job assignment, the job is added to the mapped rt-client's ready queue.

1) Job Release: In DP-EDF, all jobs are released on the rt-server, which performs both the job mapping and the job assignment operations.

Job Mapping: In DP scheduling, a newly released job jij is mapped to an rt-client Pr if and only if, on the addition of jij, neither jij nor the other jobs previously assigned to Pr miss their deadlines. For every job jij in DP scheduling,

• H(jij, Pk) is defined as the list of jobs jmn on Pk's ready queue such that ρ(jmn) > ρ(jij),
• L(jij, Pk) is defined as the list of jobs jmn on Pk's ready queue such that ρ(jmn) < ρ(jij),
• Q(jij), called the qualified set of jij, is the set of rt-clients that can accommodate jij without any deadline misses.

From the definition of slack time, we know that a job jij is schedulable on Pk if it starts its execution before its slack time becomes negative. All jobs assigned to an rt-client can meet their deadlines if the outset time of each job assigned to the rt-client is less than its slack time; if this condition still holds while adding a new job to the rt-client, then all jobs on the rt-client remain schedulable. The DP scheduling technique uses this observation to partition the jobs. When a job jij is released at time t, the rt-server performs the job mapping operation by first computing its qualified set, Q(jij), using jij's slack time, its outset time and the rt-clients' instantaneous utilization. A core Pr ∈ Q(jij) if

(a) Oij(Pr, t) < Sij(t),
(b) Omn(Pr, t) + e'ij(t) < Smn(t), ∀ jmn ∈ L(jij, Pr),
(c) U'(Pr, t) + u'ij(t) < 1.

Condition (a) checks whether jij is schedulable on Pr by comparing its slack time with its outset time. The addition of jij to the rt-client's ready queue does not affect the schedulability of the jobs with higher priority, but it does affect the schedulability of the jobs with lower priority; condition (b) checks whether the resulting increase in outset time affects the schedulability of the jobs with lower priority. Condition (c) checks whether the EDF schedulability test is satisfied. The DP-EDF scheduling algorithm deems Pr qualified to schedule jij if Pr satisfies all of the above conditions, and maps the released job jij to an rt-client in Q(jij) on which its outset time is minimum, as depicted in Algorithm 1 (a C rendering is sketched after the listing).

Algorithm 1 job_to_core_mapping
    Job jij is released at time t
    P[1..(l-1)] is an array of the rt-clients in the system
    O[1..(l-1)] is the outset time of jij on P[1..(l-1)]
    Q is a list of qualified rt-clients, initialized to null
    is_qualified is a boolean variable, initialized to false
    P_mapped is the rt-client returned
    for x = 1 to (l-1) do
        O[x] = compute_outset_time(P[x], jij)
        is_qualified = is_rtclient_qualified(P[x], O[x], jij)
        if is_qualified = true then
            Q = Q ∪ {P[x]}
        end if
    end for
    if Q ≠ null then
        P_mapped = minimum_outset_time(Q)
        assign_job(jij, P_mapped)
        return
    else
        job_splitting(jij)
        return
    end if
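Algorithm 1 and conditions (a)-(c) can be rendered in C along the following lines. The helper names follow the pseudocode, but the bodies are our own simplified interpretation built on the sketches above (in particular, priorities are compared purely by absolute deadline).

    /* Check conditions (a)-(c) for qualifying rt-client k to receive job j. */
    int is_rtclient_qualified(rt_sched_entity_t *ent, int k, job_t *j,
                              double now) {
        ready_queue_t *rq = &ent->ready[k];
        if (!(outset_time(rq, j) < slack_time(j, now)))      /* condition (a) */
            return 0;
        double util = j->rem_exec / j->period;               /* u'ij(t)       */
        for (job_t *cur = rq->head; cur != NULL; cur = cur->next) {
            util += cur->rem_exec / cur->period;
            if (cur->deadline > j->deadline &&               /* cur in L(jij,Pk) */
                !(outset_time(rq, cur) + j->rem_exec < slack_time(cur, now)))
                return 0;                                    /* condition (b) */
        }
        return util < 1.0;                                   /* condition (c) */
    }

    /* Job mapping (Algorithm 1): choose the qualified rt-client with the
     * minimum outset time; return -1 when Q(jij) is empty, in which case
     * the caller falls back to job splitting (Algorithm 2). */
    int map_job(rt_sched_entity_t *ent, job_t *j, double now) {
        int best = -1;
        double best_o = 0.0;
        for (int k = 0; k < ent->num_clients; k++) {
            if (!is_rtclient_qualified(ent, k, j, now))
                continue;
            double o = outset_time(&ent->ready[k], j);
            if (best < 0 || o < best_o) { best = k; best_o = o; }
        }
        return best;
    }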
Unlike the traditional partitioning scheme, if Q(jij) is empty for a job jij released at time t, the DP scheduling technique does not regard the job as unschedulable, but rather schedules it by allowing it to migrate across rt-clients. Each individual job in the DP-EDF scheduling algorithm starts its execution at the end of its outset time, i.e., when its outset time becomes zero. However, the job would still be schedulable if it started its execution at the end of its slack time, i.e., when its slack time becomes zero. This implies that an rt-client Pr with a job jmn having slack Smn(t) and outset Omn(Pr, t) can accommodate a higher priority job jij with execution time less than (Smn(t) - Omn(Pr, t)). This difference between the slack and outset times of jmn at time t is denoted by A(jmn, t). Thus, an rt-client Pr containing a set of assigned jobs can accommodate jij if

    e'ij(t) < min(A(jmn, t)), ∀ jmn ∈ L(jij, Pr).

The amount of execution time that an rt-client Pr can spare for jij is denoted by Es(Pr, jij), where

    Es(Pr, jij) = min(A(jmn, t)), ∀ jmn ∈ L(jij, Pr).

When jij arrives and Q(jij) is a null set, the DP-EDF algorithm performs the job splitting operation by computing the value of Es(Pr, jij) for every rt-client Pr, assigning jij the highest priority on each of them. The rt-client Pk with the maximum value of Es(Pk, jij) is chosen for the execution of jij. The DP-EDF scheduling algorithm splits jij into jij1 and jij2, where jij1 is assigned to Pk and jij2 is released after the completion of jij1, with the following parameters:

    e'ij1(t) = Es(Pk, jij)            (a)
    dij1 = e'ij1(t) + tij             (b)
    tij2 = e'ij1(t) + tij = dij1      (c)
    dij2 = Di + tij = dij             (d)
    e'ij2(t) = e'ij(t) - e'ij1(t)     (e)

where tij is the release time of jij; (a) and (e) are the remaining execution times of jij1 and jij2, (b) and (d) are the absolute deadlines of jij1 and jij2, and (c) is the release time of jij2. The modified dij1 and e'ij1(t) of jij1 ensure that it is executed with the highest priority on Pk. When jij2 is released, it is treated like a non-split job: its qualified set is computed, and it is assigned to the best rt-client in Q(jij2). If Q(jij2) is a null set, jij2 is further split, and the whole process is repeated until the job exhausts its execution time.

Algorithm 2 job_splitting(J)
    Job jij = J is released at time t
    P[1..(l-1)] is an array of the (l-1) rt-clients in the system
    Es[1..(l-1)] is an array of the maximum execution time that rt-client P[1..(l-1)] can assign to jij
    for x = 1 to (l-1) do
        Es[x] = compute_spare_time(jij, P[x])
    end for
    Es_max = compute_max(Es[1..(l-1)])
    if Es_max ≠ 0 then
        P_mapped = rtclient_of(Es_max)
        j1 = assign_param(jij, Es_max, null)
        j2 = assign_param(jij, 0, j1)
        assign_job(j1, P_mapped)
        set_release_timer(j2)
    else
        jij cannot be split
    end if
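Equations (a)-(e) translate directly into code. The sketch below fills in the parameters of the two split jobs, reusing the job_t fields introduced earlier; es_max stands for Es(Pk, jij) and is assumed to have been computed by scanning all rt-clients.

    /* Split a job whose qualified set is empty (cf. Algorithm 2 and
     * equations (a)-(e)). t is the release time tij of the original job. */
    void split_job(const job_t *orig, double es_max, double t,
                   job_t *j1, job_t *j2) {
        j1->rem_exec = es_max;                  /* (a) e'ij1 = Es(Pk, jij)  */
        j1->deadline = es_max + t;              /* (b) dij1 = e'ij1 + tij   */
        j1->period   = orig->period;

        double t2    = j1->deadline;            /* (c) tij2 = dij1          */
        j2->deadline = orig->deadline;          /* (d) dij2 = dij           */
        j2->rem_exec = orig->rem_exec - es_max; /* (e) e'ij2 = e'ij - e'ij1 */
        j2->period   = orig->period;
        (void)t2;  /* a release timer for j2 would be armed to fire at t2 */
    }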
Algorithm 3 is_rtclient_qualified(P, O, J)
    jij is released at time t; P is an rt-client; Oij is the outset time of jij on P
    jij = J and Oij = O
    cond_one and cond_three are initialized to false; cond_two is initialized to true
    Sij is the slack time of jij
    if edf_schedulability_test(P, jij) then
        cond_one = true
    else
        return false
    end if
    for all jmn assigned to P do
        if ρ(jij) > ρ(jmn) then
            Omn = compute_outset_time(P, jmn)
            if Omn + e'ij < Smn then
                cond_two = true
            else
                return false
            end if
        end if
    end for
    if Oij < Sij then
        cond_three = true
    end if
    if cond_one and cond_two and cond_three then
        return true
    else
        return false
    end if

Hence, for every job jij (split or non-split) released in the rt-scheduling entity, the DP-EDF scheduling algorithm performs the following job mapping operations.

• It computes the qualified set Q(jij).
• If Q(jij) is not empty, jij is assigned to the rt-client Pr ∈ Q(jij) for which Oij(Pr, t) is minimum in Q(jij).
• If Q(jij) is empty, jij is split into jij1 and jij2; jij1 is assigned to an rt-client and executed with the highest priority immediately, while jij2 is released after jij1 finishes.

Algorithm 1 depicts the job mapping operation on the release of jij. The function compute_outset_time(P[x], jij) calculates Oij(Px, t); is_rtclient_qualified(P, O, J), depicted in Algorithm 3, returns true if P can accommodate J; minimum_outset_time(Q) determines the rt-client having the minimum outset time in the set Q(jij); and assign_job(jij, P_mapped) assigns jij to P_mapped. In Algorithm 3, edf_schedulability_test(P, jij) returns true if, on assigning jij to P, P's utilization does not exceed 1. If L(jij, Px) is a null set, then Px is eligible if cond_one and cond_three are true, and hence cond_two is initialized to true.

Algorithm 2 depicts how a job jij with an empty qualified set is split and mapped to an rt-client. The function compute_spare_time(Px, jij) calculates Es(Px, jij) on Px by assigning jij the highest priority; rtclient_of(Es_max) returns the processor with the maximum Es value; and set_release_timer(j) sets j's release timer. Algorithm 4 calculates and assigns parameters to the split jobs. The job jij is split into two parts, jij1 and jij2. In assign_param(J_org, max_exec, J_split), J_split refers to jij1, the split job that is scheduled to run immediately; hence, jij1 and jij2 are obtained by calling assign_param(J_org, max_exec, J_split) with J_split = null and J_split = jij1 respectively.

Algorithm 4 assign_param(J_org, max_exec, J_split)
    jij = J_org is released at time t
    ret_job is the job returned with modified parameters
    d_ret_job is the absolute deadline of ret_job
    e_ret_job is the remaining execution time of ret_job
    r_ret_job is the release time of ret_job
    if J_split = null then
        e_ret_job = max_exec
        d_ret_job = e_ret_job + t
        return ret_job
    else
        r_ret_job = e_J_split + t
        e_ret_job = e_jij - e_J_split
        d_ret_job = dij
        return ret_job
    end if

Job Assignment: After the job mapping process is completed, the rt-server performs the job assignment operation, in which it enqueues the job on the mapped rt-client's ready queue. If the enqueued job has the highest priority on the rt-client, the rt-server sends a reschedule IPI to the rt-client, as sketched below.
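The enqueue step of the job assignment can be sketched as follows; the deadline-ordered insertion and the return value signalling that a reschedule IPI is needed are our own illustrative choices.

    /* Job assignment on the rt-server: insert job j into the mapped
     * rt-client's deadline-ordered ready queue. Returns 1 if j became the
     * head of the queue, i.e., it is now the highest priority job and the
     * rt-server must send a reschedule IPI to the rt-client. */
    int assign_job(ready_queue_t *rq, job_t *j) {
        job_t **pp = &rq->head;
        while (*pp != NULL && (*pp)->deadline <= j->deadline)  /* EDF order */
            pp = &(*pp)->next;
        j->next = *pp;
        *pp = j;
        return j == rq->head;
    }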
2) Job Completion: When a job completes its execution, the rt-client adds it to the global release queue and schedules the next highest priority job from its ready queue to run.
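The completion path on an rt-client is equally short. The following sketch moves the finished job to the global release queue and returns the next job to run; the queue helpers are the illustrative list operations from the earlier sketches, not LITMUS^RT code.

    /* On job completion: move the finished job from the rt-client's ready
     * queue to the global release queue, and return the next highest
     * priority job (NULL means the idle task runs). The running job is
     * assumed to be at the head of the deadline-ordered ready queue. */
    job_t *on_job_completion(rt_sched_entity_t *ent, int client, job_t *done) {
        ready_queue_t *rq = &ent->ready[client];

        rq->head = done->next;              /* dequeue the finished job    */
        done->next = ent->release_q.head;   /* push onto the shared global */
        ent->release_q.head = done;         /* release queue               */

        return rq->head;
    }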
Example: Figure 2 depicts an example of the DP-EDF scheduling algorithm. When there are no real-time jobs in its ready queue, an rt-client schedules the idle task, idle_task, to run; the priority of idle_task is lower than the priority of every real-time task. In the figure, we consider an implicit-deadline task set, i.e., each task's deadline is equal to its period. Hence, T1(5,10) represents a task with WCET 5 and relative deadline 10. A job jij is represented by the tuple (e'ij, Oij, Sij, dij). At t=0, T1(5,10) and T2(9,20) are released and assigned to P1 and P2 respectively. At t=1, the first instances of T3(6,10) and T4(9,17) are released. The qualified set of each of these jobs includes all three processors, and hence j31 preempts the idle task on P3. The outset time of j41 is minimum, i.e., 0, on P2, and hence j41 is assigned to P2. At t=2, the first instance of T5(6,8), j51, is released with absolute deadline 10, and its qualified set is empty, so j51 is split. The Es values for j51 on P1, P2 and P3 are 5, 2 and 4 respectively, and hence j511 is assigned to P1 with e'511 = 5 and absolute deadline 7, and j512 is set to be released at t=7. At t=7, j512 is released with e'512 = 1 and absolute deadline 10, and is assigned to P3. Note that, in the figure, the split jobs of j51, namely j511 and j512, are both represented as j51.

[Fig. 2: An Example of DP Scheduling]

IV. EXECUTION OVERHEAD

The execution overhead of a real-time task is defined as the time spent by a job in the kernel executing code that is not related to its own function [8]. The overheads and latencies incurred while scheduling real-time tasks, collectively called the scheduler overhead, contribute to the task's response time: the larger a task's scheduler overhead, the higher its response time. Moreover, the current task's execution overhead can affect the response times of the other tasks. Since the correctness of a real-time system depends on its response times, the scheduler should ensure that the overhead is predictable and as small as possible.

DP-EDF is a dynamic, event-driven scheduling technique, in which job releases are effected using interrupts. Release overhead is the time taken to service the release interrupt. The release interrupt handler is responsible for mapping the job to an rt-client and enqueuing the job on the mapped rt-client's ready queue. As discussed previously, all job releases are handled by the rt-server, and hence the release overhead does not affect the tasks executing on the rt-clients.

The rt-server needs an up-to-date global view of the rt-scheduling entity to ensure that the jobs are assigned to the correct rt-clients. The rt-server performs the job mapping operation either by directly accessing the rt-clients' ready queues or by maintaining its own updated copy of all the ready queues. Direct access incurs the extra overhead of acquiring a ready queue's lock while scanning the rt-client; instead of acquiring the locks on all the ready queues at the same time, the locking can be done sequentially, one queue after another. In systems where the rt-server maintains a copy of the ready queues, the copy needs to be updated on every job release, completion and preemption. Since job completion and preemption are events specific to the rt-clients, the rt-server is informed of these changes using IPIs, which is an extra overhead in the job mapping process.

Job mapping overhead is the amount of time the rt-server spends performing the job mapping operation. Once the mapping is done and an rt-client is selected, the job is assigned to the rt-client, which involves locking the rt-client's ready queue and adding the job to it; job enqueue overhead is the time taken to assign a job to an rt-client. Release overhead is the sum of the job mapping and job enqueue overheads. If the assigned job has the highest priority, the rt-server sends a reschedule IPI to the rt-client, which may incur a reschedule IPI latency. Release interrupt latency on the rt-server is the total amount of time the release interrupt waits before its handler is executed. The rt-client shares a small amount of the extra overhead incurred in providing the rt-server with an updated view of the rt-scheduling entity.

When a job completes, the rt-client is responsible for removing the job from its ready queue, adding it to the global release queue and scheduling the next highest priority task to execute. Context switching overhead is the time taken to perform the context switch between two processes. Completion overhead is the time taken, on job completion, to perform the dequeue and enqueue operations on the ready queue and the global release queue respectively. The scheduling overhead of a job on an rt-client is the time taken to schedule it for execution, which is the sum of the context switching overhead and the completion overhead.

Splitting overhead is the total time spent performing operations related to job splitting, both on the job's release and during its execution. Since split jobs can migrate between rt-clients, this overhead also includes the cost of migrating the job, and it is incurred both on the rt-client and on the rt-server. Preemption/migration overhead is the cost incurred in reloading the job's cache contents when it is scheduled to execute again after its migration or preemption.

In DP-EDF scheduling, queue contention is a major contributor to the release and scheduling overheads. The release queue is a global shared data structure, and in the worst case, a processor of the rt-scheduling entity may have to wait until all the other processors have completed their manipulation of the release queue before it can access the queue. The rt-client's ready queue, on the other hand, is local to the rt-client.

V. EVALUATION

In this section, we test the performance of the DP-EDF algorithm in terms of schedulability and scheduler overhead by comparing it with the Global EDF (G-EDF) and Partitioned EDF (P-EDF) scheduling algorithms through a series of experiments. We carry out two types of experiments: one by simulating the DP-EDF algorithm as a user program, and the other by implementing the DP-EDF algorithm in the Linux kernel. The P-EDF scheduling algorithm in our experiments uses the first-fit decreasing heuristic for partitioning the tasks, i.e., the tasks are arranged in non-increasing order of utilization and are mapped to the processors in that order, with each task allocated to the first processor on which it fits.

TABLE I: Maximum number of tasks split

    (umin, umax)    us = 0.5    0.6    0.7    0.8    0.9
    (0.1, 0.5)             0      0      0      0      1
    (0.5, 1)               2      5      9     23     35
    (0.1, 1)               2      2      4     11     30

A. Simulation

1) Setup: The DP-EDF scheduling algorithm was simulated by implementing it in the C programming language. Our process of task generation is similar to that in [4].
The task sets are randomly generated for a given tuple (us, umin, umax). In a task set with (us, umin, umax), each task has utilization between umin and umax, and the sum of the utilizations of all the tasks in the task set does not exceed Umax, where Umax = m * us, m is the number of cores and us is called the system utilization. Task τi's Ti, Di and Ui are generated randomly, and its Ei is computed as Ti * Ui. The simulation is carried out with 4 cores for an implicit-deadline system, where Ti = Di. In the DP-EDF scheduling algorithm, the 4 cores are divided into 1 rt-server and 3 rt-clients; in G-EDF and P-EDF, the real-time tasks execute on 3 cores.

In this paper, we show the simulation results for task sets having us in the range [0.5, 0.95] in steps of 0.05. For each us, we consider three distinct (umin, umax) pairs: (0.1, 0.5), (0.5, 1) and (0.1, 1). For every (us, umin, umax) tuple, we generate 10,000 task sets. The performance of a scheduling algorithm is determined by calculating the number of schedulable task sets for a given (us, umin, umax). We express this performance metric using the success ratio [4], i.e., the ratio of the number of schedulable task sets to the total number of scheduled task sets. We assume the various scheduling overheads, such as task migration, task preemption, lock contention cost, reschedule IPI latency and so forth, to be zero.
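The generation procedure just described can be sketched as follows; the uniform sampling and the period range are our own assumptions, since the paper only states that Ti, Di and Ui are generated randomly.

    #include <stdlib.h>

    /* Uniform sample in [lo, hi]; assumes srand() was called elsewhere. */
    static double uniform(double lo, double hi) {
        return lo + (hi - lo) * ((double)rand() / RAND_MAX);
    }

    /* Generate one implicit-deadline task set for (us, umin, umax): each
     * task utilization lies in [umin, umax] and the total utilization does
     * not exceed Umax = m * us. Returns the number of tasks generated. */
    int generate_taskset(double us, double umin, double umax, int m,
                         double *T, double *E, int max_tasks) {
        double cap = m * us, total = 0.0;
        int n = 0;
        while (n < max_tasks) {
            double u = uniform(umin, umax);
            if (total + u > cap) break;      /* would exceed m * us */
            total += u;
            T[n] = uniform(10.0, 100.0);     /* assumed period range; Di = Ti */
            E[n] = T[n] * u;                 /* Ei = Ti * Ui */
            n++;
        }
        return n;
    }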
2) Simulation Results: Figures 3a, 3b and 3c depict the simulation results.

[Fig. 3: System Utilization v/s Success Ratio; (a) (umin, umax) = (0.1, 0.5), (b) (umin, umax) = (0.5, 1), (c) (umin, umax) = (0.1, 1).]

As seen in Figure 3a, the performance of all three scheduling algorithms is comparable when scheduling light tasks (tasks with utilization between 0.1 and 0.5). From the figures, we observe that the DP-EDF algorithm can successfully schedule task sets having total utilization as high as 85%. For all three scheduling algorithms, the success ratio decreases as the utilization increases. However, in the DP-EDF scheduling algorithm, this drop in success ratio is not as drastic as in G-EDF or P-EDF, and the success ratio drops only at very high (greater than 85%) system utilization. Based on the simulation results, we can conclude that the utilization bound of the DP-EDF algorithm is 85%, and that, in terms of schedulability, DP-EDF is superior to both the G-EDF and the P-EDF scheduling algorithms.

For the DP-EDF scheduling algorithm, Table I shows, for each (us, umin, umax) tuple, the number of task splits in the task set having the maximum number of task splits among the 10,000 executed task sets. The maximum amount of splitting occurs in the task sets containing tasks with utilization between 0.5 and 1, and this number does not exceed 35.

B. Implementation

1) Setup: We evaluate the DP-EDF scheduling algorithm by implementing it as a scheduler plugin on the LITMUS^RT platform. LITMUS^RT [9] is a real-time extension of the Linux kernel which provides a platform to implement and test different multiprocessor scheduling algorithms. The execution overhead of DP-EDF is compared against those of GSN-EDF-DP and PSN-EDF, where DP-EDF, GSN-EDF-DP and PSN-EDF are LITMUS^RT scheduler plugins implementing the DP-EDF, G-EDF and P-EDF scheduling algorithms respectively. GSN-EDF-DP supports dedicated interrupt handling, i.e., one processor is exclusively dedicated to interrupt handling, and this core does not execute any real-time tasks. The experiments are carried out on Intel's Core i7-3632QM platform, a quad-core processor with 8 logical cores running at a 2.2 GHz clock speed. We use LITMUS^RT version 2014.2, which is based on Linux kernel 3.10.42. Execution overheads are measured using Feather-Trace, a light-weight event tracing toolkit included in the LITMUS^RT distribution; this highly portable toolkit uses multiprocessor-safe FIFO buffers to trace events at very low overhead. The experiments are conducted with 4 Linux processors and 4 real-time processors. The 4 real-time processors comprise 1 rt-server and 3 rt-clients, which form the rt-scheduling entity. Hence, in both GSN-EDF-DP and DP-EDF, the real-time tasks are executed on only 3 processors. We conducted experiments on task sets having n tasks, with n in the range (3, 27) in steps of 3. The tasks in the task sets were generated randomly, with each task's utilization in the range (0.1, 1). The experiment was performed on 100 task sets.

In DP-EDF, job splitting is achieved using hrtimers. Each task is associated with two timers, a release hrtimer and a split hrtimer. Both timers are programmed to expire on the rt-server, i.e., the hrtimer interrupts the rt-server, where its handler is executed. On its completion, a job programs its release hrtimer to expire at its next release. When a job is split, before it starts executing on the processor allocated to it, it programs its split hrtimer to expire after it has consumed the execution time allocated to it on that processor. On expiry of the split hrtimer, the timer's interrupt handler preempts the job and assigns it to an appropriate processor.
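The split-timer mechanism can be sketched with the Linux hrtimer API that LITMUS^RT builds on; the handler body and helper names are placeholders, not the plugin's actual code.

    #include <linux/hrtimer.h>
    #include <linux/ktime.h>

    /* Split hrtimer handler: programmed to expire on the rt-server. A real
     * handler would recover the task state via container_of(), preempt the
     * split job and hand it back to the mapping logic of Section III. */
    static enum hrtimer_restart split_timer_fn(struct hrtimer *timer)
    {
        return HRTIMER_NORESTART;
    }

    /* Arm the split timer to fire once the job has consumed the execution
     * time (budget_ns) allocated to it on the current processor. */
    static void arm_split_timer(struct hrtimer *timer, u64 budget_ns)
    {
        hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
        timer->function = split_timer_fn;
        hrtimer_start(timer, ns_to_ktime(budget_ns), HRTIMER_MODE_REL);
    }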
In the current implementation of the DP-EDF plugin, the job mapping operation is performed by acquiring locks on the rt-clients' ready queues. Instead of using locks, one can maintain a copy of each rt-client's ready queue, called the rt-client image, at the rt-server, which is updated on every job release, completion and preemption. The rt-server can then use the rt-client image to perform the job mapping.

2) Experimental Results: Figures 4 and 5 plot the average and maximum release overheads respectively.

[Fig. 4: Average release overhead]
[Fig. 5: Maximum release overhead]

In both GSN-EDF-DP and DP-EDF, the release overhead is incurred by the hrtimer interrupt handler on the dedicated processor (the rt-server), and hence this release overhead does not affect the response times of the tasks executing on the rt-clients. Though the release overhead is smaller in PSN-EDF, it is incurred on the cores executing real-time tasks and can therefore affect the response times of the other real-time tasks. The release overhead of the DP-EDF plugin includes the time taken to perform the job mapping and assignment, and hence it is slightly higher than that of GSN-EDF-DP and PSN-EDF.

Figures 6 and 7 plot the average and maximum scheduling overheads respectively.

[Fig. 6: Average scheduling overhead]
[Fig. 7: Maximum scheduling overhead]

The maximum scheduling overhead of DP-EDF is smaller than that of GSN-EDF-DP and PSN-EDF, though its average scheduling overhead is greater than that of GSN-EDF-DP and PSN-EDF. In DP-EDF, the scheduling overhead is incurred on the rt-client, and from the smaller maximum value we can conclude that DP-EDF is more predictable than GSN-EDF-DP and PSN-EDF. Even though the average scheduling overhead of DP-EDF is higher, DP-EDF has a greater utilization bound than PSN-EDF and GSN-EDF-DP. The scheduling overhead in DP-EDF can be further reduced by using an rt-client image, which is a part of our future work.

VI. RELATED WORKS

The DP scheduling technique uses a dedicated processor to handle the scheduling activities of the cores executing the real-time tasks. This concept of using a dedicated processor was introduced in the Spring kernel [10]. Brandenburg et al. [11] implemented a global scheduler in LITMUS^RT using a dedicated processor to handle release interrupts (GSN-EDF-DP), and showed that the dedicated-processor-based scheduler performed better than the classic global scheduler in LITMUS^RT. Recently, Cerqueira et al. [12] implemented a highly scalable global scheduler using the concepts of a dedicated processor and message passing. SchedISA [8] is a Linux-based multicore scheduler designed to schedule hard real-time tasks and Linux tasks simultaneously. DP's system architecture is similar to that of SchedISA; however, SchedISA implements the P-EDF algorithm, where the tasks are mapped to processors before runtime.

VII. CONCLUSION

In this paper, we have presented a dynamic partitioning based scheduling technique, called DP scheduling, for scheduling real-time tasks on multicore platforms. The basic motivation of DP scheduling is to maximize CPU utilization and minimize scheduling overheads. Dynamic partitioning of the real-time tasks ensures that the processors in the system are used to their fullest capacity, while the division of cores helps in reducing the scheduling overhead. The experimental results show the competitiveness of the DP scheduling technique with a few state-of-the-art real-time scheduling techniques. Our future work includes scaling the DP scheduling technique to a larger number of cores and further reducing the scheduling overheads.

REFERENCES

[1] J. M. López, J. L. Díaz, and D. F. García, "Utilization bounds for EDF scheduling on real-time multiprocessor systems," Real-Time Systems, vol. 28, no. 1, pp. 39–68, 2004.
[2] S. K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel, "Proportionate progress: A notion of fairness in resource allocation," Algorithmica, vol. 15, no. 6, pp. 600–625, 1996.
[3] A. Bastoni, B. B. Brandenburg, and J. H. Anderson, "Is semi-partitioned scheduling practical?" in Proc. 23rd Euromicro Conference on Real-Time Systems (ECRTS), 2011, pp. 125–135.
[4] S. Kato, N. Yamasaki, and Y. Ishikawa, "Semi-partitioned scheduling of sporadic task systems on multiprocessors," in Proc. 21st Euromicro Conference on Real-Time Systems (ECRTS), 2009, pp. 249–258.
[5] B. Andersson and K. Bletsas, "Sporadic multiprocessor scheduling with few preemptions," in Proc. Euromicro Conference on Real-Time Systems (ECRTS), 2008, pp. 243–252.
[6] G. Nelissen, S. Funk, and J. Goossens, "Reducing preemptions and migrations in EKG," in Proc. 18th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 2012, pp. 134–143.
[7] K. Bletsas and B. Andersson, "Notional processors: An approach for multiprocessor scheduling," in Proc. 15th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2009, pp. 3–12.
[8] N. Saranya and R. C. Hansdah, "An implementation of partitioned scheduling scheme for hard real-time tasks in multicore Linux with fair share for Linux tasks," in Proc. IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 2014.
[9] J. M. Calandrino, H. Leontyev, A. Block, U. Devi, and J. H. Anderson, "LITMUS^RT: A testbed for empirically comparing real-time multiprocessor schedulers," in Proc. 27th IEEE Real-Time Systems Symposium (RTSS), 2006, pp. 111–126.
[10] J. A. Stankovic and K. Ramamritham, "The Spring kernel: A new paradigm for real-time systems," IEEE Software, vol. 8, no. 3, pp. 62–72, 1991.
[11] B. B. Brandenburg and J. H. Anderson, "On the implementation of global real-time schedulers," in Proc. 30th IEEE Real-Time Systems Symposium (RTSS), 2009, pp. 214–224.
[12] F. Cerqueira, M. Vanga, and B. B. Brandenburg, "Scaling global scheduling with message passing," in Proc. IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2014.