Dynamic Partitioning Based Scheduling of Real-Time Tasks in Multicore Processors

N. Saranya and R. C. Hansdah
Department of Computer Science and Automation
Indian Institute of Science, Bangalore, India
Email: {saranya.n, hansdah}@csa.iisc.ernet.in

Abstract—Existing real-time multicore schedulers use either the global or the partitioned scheduling technique to schedule real-time tasks. Partitioned scheduling is a static approach in which a task is mapped to a per-processor ready queue before it is scheduled, and it cannot migrate thereafter. Partitioned scheduling makes ineffective use of the available processing power and incurs high overhead when the real-time tasks are dynamic in nature. Global scheduling is a dynamic approach, in which the processors share a single ready queue in order to execute the highest priority tasks. Global scheduling allows task migration, which results in high scheduling overhead. In this paper, we present a dynamic partitioning based scheduling technique for real-time tasks, called DP scheduling. In DP scheduling, jobs of tasks are assigned to cores when they are released, and they remain on the same core until they finish execution. The partitioning in DP scheduling is done based on the slack time and priority of jobs. If a job cannot be allocated to any single core, it is split and executed on more than one core. The DP scheduling technique attempts to retain the good features of both global and partitioned scheduling without compromising on resource utilization and, at the same time, tries to minimize the scheduling overhead. We have tested the DP scheduling technique with the EDF scheduling policy at each core, and we term the resulting scheduling algorithm DP-EDF. The performance of the DP-EDF scheduling algorithm has been evaluated through a simulation study and through its implementation in LITMUS^RT on a 64-bit Intel processor with eight logical cores. Both the simulation and the experimental results show that the DP-EDF scheduling algorithm has better performance in terms of resource utilization, and comparable or better performance in terms of scheduling overhead, in comparison to contemporary scheduling algorithms.

I. INTRODUCTION

The multicore processor architecture provides higher performance without driving up power consumption and heat dissipation. Multicore processors boost performance by integrating two or more processing cores into a single socket. A wide variety of products, from smart phones to desktops to servers, have gained enormously by incorporating multicore technology, and embedded systems can benefit immensely from it as well. Hence, in recent years, there has been a steady increase in research and industrial work aimed at exploiting the advantages of multicore processors for real-time systems.

Real-time tasks in multicore systems are scheduled using two major scheduling techniques, viz., partitioned and global scheduling. In partitioned scheduling, n independent real-time tasks are distributed across the m cores of an m-core system such that each task is assigned to a single core. One of the major advantages of partitioned scheduling is that different uniprocessor scheduling algorithms can be applied to the system after the tasks are partitioned. Since tasks do not migrate across processors, the scheduling overhead incurred in partitioned scheduling is free of migration cost. The mapping of tasks to cores is done off-line, before the real-time tasks are scheduled to run, using bin-packing heuristics, since optimal partitioning is a known NP-hard problem.
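To make the partitioning step concrete, the following C sketch implements the first-fit decreasing heuristic that the P-EDF baseline in Section V also uses. It is a minimal illustration, not code from the paper: the task structure and the per-core utilization bound of 1 (the uniprocessor EDF bound for implicit-deadline tasks) are our own simplifications.

    #include <stdlib.h>

    typedef struct { double util; int core; } task_t;

    /* Sort tasks by decreasing utilization. */
    static int cmp_desc(const void *a, const void *b) {
        double ua = ((const task_t *)a)->util;
        double ub = ((const task_t *)b)->util;
        return (ua < ub) - (ua > ub);
    }

    /* First-fit decreasing: returns 0 on success, -1 if some task cannot
     * be placed. load[k] accumulates the utilization assigned to core k;
     * a task fits on a core as long as the core's total utilization does
     * not exceed 1 (the EDF bound for implicit-deadline tasks). */
    int partition_ffd(task_t *tasks, int n, int m) {
        double load[m];
        for (int k = 0; k < m; k++) load[k] = 0.0;
        qsort(tasks, n, sizeof(task_t), cmp_desc);
        for (int i = 0; i < n; i++) {
            int placed = 0;
            for (int k = 0; k < m && !placed; k++) {
                if (load[k] + tasks[i].util <= 1.0) {
                    load[k] += tasks[i].util;
                    tasks[i].core = k;
                    placed = 1;
                }
            }
            if (!placed) return -1;   /* task set not partitionable by FFD */
        }
        return 0;
    }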
Since the tasks are statically allocated using the bin-packing method, the processors are not utilized to their full capacity. In on-line systems, where only a few tasks are known prior to scheduling, the "task-to-core" mapping has to be redone every time a new task arrives, and this reallocation of tasks at runtime increases the system's scheduling overhead. According to [1], a set of n real-time tasks, in which the utilization of no task is greater than umax, is schedulable using partitioned scheduling on an m-processor system if U < (mβ+1)/(β+1), where β = ⌊1/umax⌋ and U is the total utilization of the set of n real-time tasks. For example, with m = 4 and umax = 0.5, β = 2 and the bound is (2·4+1)/3 = 3, i.e., 75% of the platform capacity.

In global scheduling, the m cores in the system share a single system-wide queue, in which the ready tasks are kept sorted according to their priority. At any instant of time, the m cores in the system execute the m highest priority tasks. In general, global scheduling is better suited for on-line systems. Unlike in partitioned scheduling, the tasks in global scheduling can migrate between cores, resulting in higher scheduling overhead. The migration can be of two types, viz., task-level migration, in which migration occurs only at job boundaries, and job-level migration, in which migration occurs during job execution. Among global scheduling approaches, the Pfair class of algorithms [2] has a utilization bound of 100%, but Pfair algorithms suffer from high scheduling overhead due to a large number of migrations and context switches.

Semi-partitioned scheduling [3], [4] is another scheduling technique, which combines the concepts of global and partitioned scheduling. In semi-partitioned scheduling, a partitioning algorithm is applied to the task set, dividing it into two subsets, viz., the non-split tasks and the split tasks. Most of the tasks are non-split tasks, and each non-split task is allocated to a single core. The split tasks are those that cannot be allocated to any single core (because the available utilization on each core is less than the task's utilization); each of them is split and assigned to more than one core. The non-split tasks are fixed and do not migrate, while the split tasks migrate across their allocated cores. Unlike global scheduling, semi-partitioned scheduling reduces the migration cost by restricting the number of split tasks; moreover, in global scheduling the task migration is dynamic in nature, while in semi-partitioned scheduling it is static and planned prior to the execution of the tasks. Semi-partitioned scheduling tries to make maximum use of the available processor utilization. Sporadic-EKG (SEKG) [5], [6] and Notional Processor Scheduling (NPS-F) [7] are two semi-partitioned scheduling algorithms with utilization bounds of 88.8% and 66.6% respectively. Both these algorithms can be configured to achieve a utilization bound of up to 100%, but this comes at the cost of an increased number of preemptions and context switches.

In this paper, we propose a novel dynamic partitioning based scheduling technique for real-time tasks, called DP scheduling. The DP scheduling technique retains the good features of the existing scheduling techniques, viz., the dynamic nature of global scheduling, the no-migration characteristic of partitioned scheduling, and the high resource utilization of semi-partitioned scheduling, with reduced overhead. Essentially, DP scheduling entails the following.

• A dynamic partitioning approach, in which the "task-to-core" mapping is done for every job at the time of its release.
• A semi-partitioning approach to schedule those jobs that cannot be assigned to a single processor, by allowing them to migrate across different processors.
• A service core to manage all the scheduling activities of the real-time tasks.
• Earliest deadline first (EDF) based priority scheduling at each core to schedule the real-time tasks.

The DP scheduling technique aims to make maximum utilization of the available CPU resources and to minimize the scheduling overhead, while ensuring the schedulability of the real-time system. Both on-line and off-line systems can greatly benefit from the DP scheduling technique.

The rest of the paper is organized as follows. The next section elucidates the system model. Section III presents the details of the DP scheduling technique. Section IV discusses the various execution overheads. The DP scheduling technique is evaluated in Section V. Section VI reviews a few related works, and Section VII concludes the paper.

II. SYSTEM MODEL

We consider a set of periodic real-time tasks, τ, containing n independent tasks, {τ1, τ2, ..., τn}. The real-time tasks execute on a multicore platform containing m identical cores, {P1, P2, ..., Pm}. Each task τi is represented by the tuple (Ei, Di, Ti), where Ei, Di and Ti are the task's worst case execution time (WCET), relative deadline and period respectively. The utilization of τi, denoted by Ui, is equal to Ei/Ti. The utilization of the task set τ, denoted by U(τ), is Σ_{i=1 to n} Ui. A job is an instance of a task, and the j-th instance of τi is denoted by jij. The absolute deadline of a job jij released at time t is Di + t. The inter-arrival time of two successive jobs jij and ji(j+1) is equal to Ti. All jobs complete their execution before the arrival of the next job, i.e., jij completes its execution before the arrival of ji(j+1). At any given time, no two processors can execute the same job, and a job, once released, does not suspend itself.

III. DP SCHEDULING TECHNIQUE

In this work, the DP scheduling technique uses the Earliest Deadline First (EDF) scheduling policy at each core. EDF is a dynamic, event-driven scheduling policy, in which the priority of a task can change over a period of time; however, the priority of a job remains fixed. In the EDF scheduling policy, the priority of a job depends on its absolute deadline: the earlier the absolute deadline, the higher the priority. The system employing the DP scheduling technique with the EDF scheduling policy at each core to schedule real-time tasks is referred to as DP-EDF.

A. System Architecture

Based on their functions, the processors in the multicore system are classified into two types, viz., real-time processors and Linux processors. The m cores in the system are divided into l real-time processors and (m-l) Linux processors. The cores in the multicore system communicate with one another using inter-processor interrupts (IPIs).

1) Linux processors: Linux tasks are scheduled to run only on the Linux processors. All external interrupts (except IPIs and timer interrupts) are redirected to the Linux processors.

2) Real-Time processors: The real-time processors are responsible for executing the real-time tasks and managing their scheduling activities. The l real-time processors are divided into one rt-server and (l-1) rt-clients. The rt-server and the rt-clients together form an rt-scheduling entity. Figure 1 gives an overview of the rt-scheduling entity.

[Fig. 1: Overview of rt-scheduling entity in DP Scheduling Technique.]

• rt-server: The rt-server manages the scheduling activities of the rt-clients. All jobs are released on the rt-server. On a job release, the rt-server performs the task-to-core mapping and dispatches the job to the appropriate rt-client. The rt-server does not execute any real-time tasks; it is only responsible for supervising the execution of the real-time tasks on the rt-clients.

• rt-client: The real-time tasks are executed on the rt-clients. Each rt-client is associated with a queue data structure, called the ready queue, which contains all runnable tasks waiting for the CPU resource, in decreasing order of their priority. On job completion, the rt-client schedules the next highest priority task from its ready queue to execute. Tasks that have completed their execution in the current period and are waiting to be released in the next period are added to the global release queue. There is one global release queue per rt-scheduling entity.
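Before turning to the job model, a small C sketch may help fix the architecture's data structures; all type and field names here are our own illustrative choices, not the paper's implementation.

    #define MAX_CLIENTS 8               /* illustrative upper bound on l-1 */

    typedef struct job {
        double rem_exec;                /* e'ij(t): remaining execution time   */
        double deadline;                /* dij: absolute deadline (EDF prio)   */
        double period;                  /* Ti: period of the task              */
        struct job *next;               /* deadline-ordered singly linked list */
    } job_t;

    typedef struct {
        job_t *head;                    /* earliest deadline first             */
        /* in a real system a lock protects this queue, since the rt-server
         * also inspects it during job mapping (see Section IV)               */
    } ready_queue_t;

    typedef struct {
        ready_queue_t ready[MAX_CLIENTS];  /* one ready queue per rt-client   */
        ready_queue_t release_q;           /* one global release queue per
                                              rt-scheduling entity            */
        int num_clients;                   /* the l-1 rt-clients              */
    } rt_sched_entity_t;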
B. Job Model

A job jij is represented by the tuple (eij, dij), where eij is the job's WCET and dij is its absolute deadline. The release time of jij is denoted by tij. In the DP-EDF scheduling algorithm, in addition to the worst-case execution time and the absolute deadline, every job is associated with four more parameters, viz., remaining execution time, outset time, slack time and instantaneous utilization.

Definition: The remaining execution time of job jij at time t, denoted by e'ij(t), is Ei minus the total amount of time the job has spent executing on the processors.

Definition: The outset time of job jij on Pk's ready queue at time t, denoted by Oij(Pk, t), is the maximum duration of time jij has to wait in Pk's ready queue before it can start its execution on Pk. If jij is not assigned to any core, then, for every rt-client Pr,

    Oij(Pr, t) = Σ_{jmn ∈ H(jij, Pr)} e'mn(t).

If jij is assigned to core Pk, then, for every rt-client Pr,

    Oij(Pr, t) = Σ_{jmn ∈ H(jij, Pr)} e'mn(t), if r = k,
    Oij(Pr, t) = ∞, if r ≠ k,

where e'mn(t) is the remaining execution time of jmn, H(jij, Pr) is the list of jobs on Pr's ready queue such that ρ(jmn) > ρ(jij), and ρ(jij) denotes the priority of jij. If H(jij, Pr) is empty, then Σ_{jmn ∈ H(jij, Pr)} e'mn(t) = 0.

Definition: The instantaneous utilization, u'ij(t), of jij at time t depends on jij's remaining execution time and period, and is defined as

    u'ij(t) = e'ij(t) / Ti.

The instantaneous utilization, U'(Pr, t), of an rt-client Pr is the sum of the instantaneous utilizations of all jobs on Pr's ready queue at time t.

Definition: The slack time, Sij(t), of jij at time t is the maximum duration of time by which jij can delay its execution while still ensuring that it does not miss its deadline:

    Sij(t) = dij - t - e'ij(t).

If Sij(t) = 0, jij can meet its deadline only if it is scheduled to run immediately and without any preemption. If Sij(t) < 0, jij can no longer meet its deadline.
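Since the mapping decisions below are driven entirely by outset time and slack, it may help to see them in code. The following sketch computes both for the deadline-ordered job list introduced above; it is illustrative only and assumes the e'mn(t) values (rem_exec) are kept up to date.

    /* Outset time of job j on ready queue rq: the total remaining execution
     * time of all queued jobs with higher priority, i.e., with an earlier
     * absolute deadline under EDF. Returns 0 if no such job is queued. */
    double outset_time(const ready_queue_t *rq, const job_t *j) {
        double sum = 0.0;
        for (const job_t *cur = rq->head; cur != NULL; cur = cur->next)
            if (cur->deadline < j->deadline)   /* rho(cur) > rho(j) */
                sum += cur->rem_exec;
        return sum;
    }

    /* Slack of job j at time now: Sij(t) = dij - t - e'ij(t). */
    double slack_time(const job_t *j, double now) {
        return j->deadline - now - j->rem_exec;
    }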
C. Task-to-Core Mapping

DP scheduling is a partition-based scheduling technique in which, at any instant of time, each rt-client executes the highest priority job from its ready queue. Unlike in partitioned scheduling, the partitioning of tasks in DP scheduling is dynamic in nature and is done at run time. In DP scheduling, the task-to-core operation is done for every job, thus allowing different jobs of the same task to execute on different cores.

A scheduling point is a juncture in the process of scheduling at which the scheduler makes a decision regarding which task to schedule next. Job release, job completion and job preemption are the scheduling points in the DP-EDF scheduling algorithm, and a job's processor affinity changes only on its release. In DP-EDF scheduling, the task-to-core operation is done in two steps, viz., job mapping and job assignment. In job mapping, the rt-client to which the job must be assigned is determined, and in job assignment, the job is added to the mapped rt-client's ready queue.

1) Job Release: In DP-EDF, all jobs are released on the rt-server, which performs both the job mapping and the job assignment operations.

Job Mapping: In DP scheduling, a newly released job jij is mapped to an rt-client Pr if and only if, on the addition of jij, neither jij nor the other jobs previously assigned to Pr miss their deadlines. For every job jij in DP scheduling,

• H(jij, Pk) is defined as the list of jobs jmn on Pk's ready queue such that ρ(jmn) > ρ(jij),
• L(jij, Pk) is defined as the list of jobs jmn on Pk's ready queue such that ρ(jmn) < ρ(jij),
• Q(jij), called the qualified set of jij, is the set of rt-clients that can accommodate jij without any deadline misses.

From the definition of slack time, we know that a job jij is schedulable on Pk if it starts its execution before its slack time becomes negative. All jobs assigned to an rt-client can meet their deadlines if the outset time of each job assigned to the rt-client is less than its slack time; if this condition still holds while adding a new job to the rt-client, then all jobs on the rt-client remain schedulable. The DP scheduling technique uses this observation to partition the jobs. When a job jij is released at time t, the rt-server performs the job mapping operation by first computing its qualified set, Q(jij), using jij's slack time, its outset time and the rt-clients' instantaneous utilization. A core Pr ∈ Q(jij) if

(a) Oij(Pr, t) < Sij(t),
(b) Omn(Pr, t) + e'ij(t) < Smn(t), ∀ jmn ∈ L(jij, Pr),
(c) U'(Pr, t) + u'ij(t) < 1.

Condition (a) checks whether jij is schedulable on Pr by comparing its slack time with its outset time. The addition of jij to the rt-client's ready queue does not affect the schedulability of the jobs with higher priority, but it does affect the schedulability of the jobs with lower priority; condition (b) checks whether the resulting increase in outset time affects the schedulability of the jobs with lower priority. Condition (c) checks whether the EDF schedulability test is satisfied. The DP-EDF scheduling algorithm deems Pr qualified to schedule jij if Pr satisfies all of the above conditions, and maps the released job jij to an rt-client in Q(jij) on which its outset time is minimum, as depicted in Algorithm 1 (a C rendering is sketched after the listing).

Algorithm 1 job_to_core_mapping
    Job jij is released at time t
    P[1..(l-1)] is an array of the rt-clients in the system
    O[1..(l-1)] is the outset time of jij on P[1..(l-1)]
    Q is a list of qualified rt-clients, initialized to null
    is_qualified is a boolean variable, initialized to false
    P_mapped is the rt-client returned
    for x = 1 to (l-1) do
        O[x] = compute_outset_time(P[x], jij)
        is_qualified = is_rtclient_qualified(P[x], O[x], jij)
        if is_qualified = true then
            Q = Q ∪ {P[x]}
        end if
    end for
    if Q ≠ null then
        P_mapped = minimum_outset_time(Q)
        assign_job(jij, P_mapped)
        return
    else
        job_splitting(jij)
        return
    end if
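Algorithm 1 and conditions (a)-(c) can be rendered in C along the following lines. The helper names follow the pseudocode, but the bodies are our own simplified interpretation built on the sketches above (in particular, priorities are compared purely by absolute deadline).

    /* Check conditions (a)-(c) for qualifying rt-client k to receive job j. */
    int is_rtclient_qualified(rt_sched_entity_t *ent, int k, job_t *j,
                              double now) {
        ready_queue_t *rq = &ent->ready[k];
        if (!(outset_time(rq, j) < slack_time(j, now)))      /* condition (a) */
            return 0;
        double util = j->rem_exec / j->period;               /* u'ij(t)       */
        for (job_t *cur = rq->head; cur != NULL; cur = cur->next) {
            util += cur->rem_exec / cur->period;
            if (cur->deadline > j->deadline &&               /* cur in L(jij,Pk) */
                !(outset_time(rq, cur) + j->rem_exec < slack_time(cur, now)))
                return 0;                                    /* condition (b) */
        }
        return util < 1.0;                                   /* condition (c) */
    }

    /* Job mapping (Algorithm 1): choose the qualified rt-client with the
     * minimum outset time; return -1 when Q(jij) is empty, in which case
     * the caller falls back to job splitting (Algorithm 2). */
    int map_job(rt_sched_entity_t *ent, job_t *j, double now) {
        int best = -1;
        double best_o = 0.0;
        for (int k = 0; k < ent->num_clients; k++) {
            if (!is_rtclient_qualified(ent, k, j, now))
                continue;
            double o = outset_time(&ent->ready[k], j);
            if (best < 0 || o < best_o) { best = k; best_o = o; }
        }
        return best;
    }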
Unlike the traditional partitioning scheme, if Q(jij) is empty for a job jij released at time t, the DP scheduling technique does not regard the job as unschedulable, but rather schedules it by allowing it to migrate across rt-clients. Each individual job in the DP-EDF scheduling algorithm starts its execution at the end of its outset time, i.e., when its outset time becomes zero. However, the job would still be schedulable if it started its execution at the end of its slack time, i.e., when its slack time becomes zero. This implies that an rt-client Pr with a job jmn having slack Smn(t) and outset Omn(Pr, t) can accommodate a higher priority job jij with execution time less than (Smn(t) - Omn(Pr, t)). This difference between the slack and outset times of jmn at time t is denoted by A(jmn, t). Thus, an rt-client Pr containing a set of assigned jobs can accommodate jij if

    e'ij(t) < min(A(jmn, t)), ∀ jmn ∈ L(jij, Pr).

The amount of execution time that an rt-client Pr can spare for jij is denoted by Es(Pr, jij), where

    Es(Pr, jij) = min(A(jmn, t)), ∀ jmn ∈ L(jij, Pr).

When jij arrives and Q(jij) is a null set, the DP-EDF algorithm performs the job splitting operation by computing the value of Es(Pr, jij) for every rt-client Pr, assigning jij the highest priority on each of them. The rt-client Pk with the maximum value of Es(Pk, jij) is chosen for the execution of jij. The DP-EDF scheduling algorithm splits jij into jij1 and jij2, where jij1 is assigned to Pk and jij2 is released after the completion of jij1, with the following parameters:

    e'ij1(t) = Es(Pk, jij)            (a)
    dij1 = e'ij1(t) + tij             (b)
    tij2 = e'ij1(t) + tij = dij1      (c)
    dij2 = Di + tij = dij             (d)
    e'ij2(t) = e'ij(t) - e'ij1(t)     (e)

where tij is the release time of jij; (a) and (e) are the remaining execution times of jij1 and jij2, (b) and (d) are the absolute deadlines of jij1 and jij2, and (c) is the release time of jij2. The modified dij1 and e'ij1(t) of jij1 ensure that it is executed with the highest priority on Pk. When jij2 is released, it is treated like a non-split job: its qualified set is computed, and it is assigned to the best rt-client in Q(jij2). If Q(jij2) is a null set, jij2 is further split, and the whole process is repeated until the job exhausts its execution time.

Algorithm 2 job_splitting(J)
    Job jij = J is released at time t
    P[1..(l-1)] is an array of the (l-1) rt-clients in the system
    Es[1..(l-1)] is an array of the maximum execution time that rt-client P[1..(l-1)] can assign to jij
    for x = 1 to (l-1) do
        Es[x] = compute_spare_time(jij, P[x])
    end for
    Es_max = compute_max(Es[1..(l-1)])
    if Es_max ≠ 0 then
        P_mapped = rtclient_of(Es_max)
        j1 = assign_param(jij, Es_max, null)
        j2 = assign_param(jij, 0, j1)
        assign_job(j1, P_mapped)
        set_release_timer(j2)
    else
        jij cannot be split
    end if
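Equations (a)-(e) translate directly into code. The sketch below fills in the parameters of the two split jobs, reusing the job_t fields introduced earlier; es_max stands for Es(Pk, jij) and is assumed to have been computed by scanning all rt-clients.

    /* Split a job whose qualified set is empty (cf. Algorithm 2 and
     * equations (a)-(e)). t is the release time tij of the original job. */
    void split_job(const job_t *orig, double es_max, double t,
                   job_t *j1, job_t *j2) {
        j1->rem_exec = es_max;                  /* (a) e'ij1 = Es(Pk, jij)  */
        j1->deadline = es_max + t;              /* (b) dij1 = e'ij1 + tij   */
        j1->period   = orig->period;

        double t2    = j1->deadline;            /* (c) tij2 = dij1          */
        j2->deadline = orig->deadline;          /* (d) dij2 = dij           */
        j2->rem_exec = orig->rem_exec - es_max; /* (e) e'ij2 = e'ij - e'ij1 */
        j2->period   = orig->period;
        (void)t2;  /* a release timer for j2 would be armed to fire at t2 */
    }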
Algorithm 3 is_rtclient_qualified(P, O, J)
    jij is released at time t; P is an rt-client; Oij is the outset time of jij on P
    jij = J and Oij = O
    cond_one and cond_three are initialized to false; cond_two is initialized to true
    Sij is the slack time of jij
    if edf_schedulability_test(P, jij) then
        cond_one = true
    else
        return false
    end if
    for all jmn assigned to P do
        if ρ(jij) > ρ(jmn) then
            Omn = compute_outset_time(P, jmn)
            if Omn + e'ij < Smn then
                cond_two = true
            else
                return false
            end if
        end if
    end for
    if Oij < Sij then
        cond_three = true
    end if
    if cond_one and cond_two and cond_three then
        return true
    else
        return false
    end if

Hence, for every job jij (split or non-split) released in the rt-scheduling entity, the DP-EDF scheduling algorithm performs the following job mapping operations.

• It computes the qualified set Q(jij).
• If Q(jij) is not empty, jij is assigned to the rt-client Pr ∈ Q(jij) for which Oij(Pr, t) is minimum in Q(jij).
• If Q(jij) is empty, jij is split into jij1 and jij2; jij1 is assigned to an rt-client and executed with the highest priority immediately, while jij2 is released after jij1 finishes.

Algorithm 1 depicts the job mapping operation on the release of jij. The function compute_outset_time(P[x], jij) calculates Oij(Px, t); is_rtclient_qualified(P, O, J), depicted in Algorithm 3, returns true if P can accommodate J; minimum_outset_time(Q) determines the rt-client having the minimum outset time in the set Q(jij); and assign_job(jij, P_mapped) assigns jij to P_mapped. In Algorithm 3, edf_schedulability_test(P, jij) returns true if, on assigning jij to P, P's utilization does not exceed 1. If L(jij, Px) is a null set, then Px is eligible if cond_one and cond_three are true, and hence cond_two is initialized to true.

Algorithm 2 depicts how a job jij with an empty qualified set is split and mapped to an rt-client. The function compute_spare_time(Px, jij) calculates Es(Px, jij) on Px by assigning jij the highest priority; rtclient_of(Es_max) returns the processor with the maximum Es value; and set_release_timer(j) sets j's release timer. Algorithm 4 calculates and assigns parameters to the split jobs. The job jij is split into two parts, jij1 and jij2. In assign_param(J_org, max_exec, J_split), J_split refers to jij1, the split job that is scheduled to run immediately; hence, jij1 and jij2 are obtained by calling assign_param(J_org, max_exec, J_split) with J_split = null and J_split = jij1 respectively.

Algorithm 4 assign_param(J_org, max_exec, J_split)
    jij = J_org is released at time t
    ret_job is the job returned with modified parameters
    d_ret_job is the absolute deadline of ret_job
    e_ret_job is the remaining execution time of ret_job
    r_ret_job is the release time of ret_job
    if J_split = null then
        e_ret_job = max_exec
        d_ret_job = e_ret_job + t
        return ret_job
    else
        r_ret_job = e_J_split + t
        e_ret_job = e_jij - e_J_split
        d_ret_job = dij
        return ret_job
    end if

Job Assignment: After the job mapping process is completed, the rt-server performs the job assignment operation, in which it enqueues the job on the mapped rt-client's ready queue. If the enqueued job has the highest priority on the rt-client, the rt-server sends a reschedule IPI to the rt-client, as sketched below.
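The enqueue step of the job assignment can be sketched as follows; the deadline-ordered insertion and the return value signalling that a reschedule IPI is needed are our own illustrative choices.

    /* Job assignment on the rt-server: insert job j into the mapped
     * rt-client's deadline-ordered ready queue. Returns 1 if j became the
     * head of the queue, i.e., it is now the highest priority job and the
     * rt-server must send a reschedule IPI to the rt-client. */
    int assign_job(ready_queue_t *rq, job_t *j) {
        job_t **pp = &rq->head;
        while (*pp != NULL && (*pp)->deadline <= j->deadline)  /* EDF order */
            pp = &(*pp)->next;
        j->next = *pp;
        *pp = j;
        return j == rq->head;
    }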
2) Job Completion: When a job completes its execution, the rt-client adds it to the global release queue and schedules the next highest priority job from its ready queue to run.
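The completion path on an rt-client is equally short. The following sketch moves the finished job to the global release queue and returns the next job to run; the queue helpers are the illustrative list operations from the earlier sketches, not LITMUS^RT code.

    /* On job completion: move the finished job from the rt-client's ready
     * queue to the global release queue, and return the next highest
     * priority job (NULL means the idle task runs). The running job is
     * assumed to be at the head of the deadline-ordered ready queue. */
    job_t *on_job_completion(rt_sched_entity_t *ent, int client, job_t *done) {
        ready_queue_t *rq = &ent->ready[client];

        rq->head = done->next;              /* dequeue the finished job    */
        done->next = ent->release_q.head;   /* push onto the shared global */
        ent->release_q.head = done;         /* release queue               */

        return rq->head;
    }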
Example: Figure 2 depicts an example of the DP-EDF scheduling algorithm. When there are no real-time jobs in its ready queue, an rt-client schedules the idle task, idle_task, to run; the priority of idle_task is lower than the priority of every real-time task. In the figure, we consider an implicit-deadline task set, i.e., each task's deadline is equal to its period. Hence, T1(5,10) represents a task with WCET 5 and relative deadline 10. A job jij is represented by the tuple (e'ij, Oij, Sij, dij). At t=0, T1(5,10) and T2(9,20) are released and assigned to P1 and P2 respectively. At t=1, the first instances of T3(6,10) and T4(9,17) are released. The qualified set of each of these jobs includes all three processors, and hence j31 preempts the idle task on P3. The outset time of j41 is minimum, i.e., 0, on P2, and hence j41 is assigned to P2. At t=2, the first instance of T5(6,8), j51, is released with absolute deadline 10, and its qualified set is empty, so j51 is split. The Es values for j51 on P1, P2 and P3 are 5, 2 and 4 respectively, and hence j511 is assigned to P1 with e'511 = 5 and absolute deadline 7, and j512 is set to be released at t=7. At t=7, j512 is released with e'512 = 1 and absolute deadline 10, and is assigned to P3. Note that, in the figure, the split jobs of j51, namely j511 and j512, are both represented as j51.

[Fig. 2: An Example of DP Scheduling]

IV. EXECUTION OVERHEAD

The execution overhead of a real-time task is defined as the time spent by a job in the kernel executing code that is not related to its own function [8]. The overheads and latencies incurred while scheduling real-time tasks, collectively called the scheduler overhead, contribute to the task's response time: the larger a task's scheduler overhead, the higher its response time. Moreover, the current task's execution overhead can affect the response times of the other tasks. Since the correctness of a real-time system depends on its response times, the scheduler should ensure that the overhead is predictable and as small as possible.

DP-EDF is a dynamic, event-driven scheduling technique, in which job releases are effected using interrupts. Release overhead is the time taken to service the release interrupt. The release interrupt handler is responsible for mapping the job to an rt-client and enqueuing the job on the mapped rt-client's ready queue. As discussed previously, all job releases are handled by the rt-server, and hence the release overhead does not affect the tasks executing on the rt-clients.

The rt-server needs an up-to-date global view of the rt-scheduling entity to ensure that the jobs are assigned to the correct rt-clients. The rt-server performs the job mapping operation either by directly accessing the rt-clients' ready queues or by maintaining its own updated copy of all the ready queues. Direct access incurs the extra overhead of acquiring a ready queue's lock while scanning the rt-client; instead of acquiring the locks on all the ready queues at the same time, the locking can be done sequentially, one queue after another. In systems where the rt-server maintains a copy of the ready queues, the copy needs to be updated on every job release, completion and preemption. Since job completion and preemption are events specific to the rt-clients, the rt-server is informed of these changes using IPIs, which is an extra overhead in the job mapping process.

Job mapping overhead is the amount of time the rt-server spends performing the job mapping operation. Once the mapping is done and an rt-client is selected, the job is assigned to the rt-client, which involves locking the rt-client's ready queue and adding the job to it; job enqueue overhead is the time taken to assign a job to an rt-client. Release overhead is the sum of the job mapping and job enqueue overheads. If the assigned job has the highest priority, the rt-server sends a reschedule IPI to the rt-client, which may incur a reschedule IPI latency. Release interrupt latency on the rt-server is the total amount of time the release interrupt waits before its handler is executed. The rt-client shares a small amount of the extra overhead incurred in providing the rt-server with an updated view of the rt-scheduling entity.

When a job completes, the rt-client is responsible for removing the job from its ready queue, adding it to the global release queue and scheduling the next highest priority task to execute. Context switching overhead is the time taken to perform the context switch between two processes. Completion overhead is the time taken, on job completion, to perform the dequeue and enqueue operations on the ready queue and the global release queue respectively. The scheduling overhead of a job on an rt-client is the time taken to schedule it for execution, which is the sum of the context switching overhead and the completion overhead.

Splitting overhead is the total time spent performing operations related to job splitting, both on the job's release and during its execution. Since split jobs can migrate between rt-clients, this overhead also includes the cost of migrating the job, and it is incurred both on the rt-client and on the rt-server. Preemption/migration overhead is the cost incurred in reloading the job's cache contents when it is scheduled to execute again after its migration or preemption.

In DP-EDF scheduling, queue contention is a major contributor to the release and scheduling overheads. The release queue is a global shared data structure, and in the worst case, a processor of the rt-scheduling entity may have to wait until all the other processors have completed their manipulation of the release queue before it can access the queue. The rt-client's ready queue, on the other hand, is local to the rt-client.

V. EVALUATION

In this section, we test the performance of the DP-EDF algorithm in terms of schedulability and scheduler overhead by comparing it with the Global EDF (G-EDF) and Partitioned EDF (P-EDF) scheduling algorithms through a series of experiments. We carry out two types of experiments: one by simulating the DP-EDF algorithm as a user program, and the other by implementing the DP-EDF algorithm in the Linux kernel. The P-EDF scheduling algorithm in our experiments uses the first-fit decreasing heuristic for partitioning the tasks, i.e., the tasks are arranged in non-increasing order of utilization and are mapped to the processors in that order, with each task allocated to the first processor on which it fits.

TABLE I: Maximum number of tasks split

    (umin, umax)    us = 0.5    0.6    0.7    0.8    0.9
    (0.1, 0.5)             0      0      0      0      1
    (0.5, 1)               2      5      9     23     35
    (0.1, 1)               2      2      4     11     30

A. Simulation

1) Setup: The DP-EDF scheduling algorithm was simulated by implementing it in the C programming language. Our process of task generation is similar to that in [4].
The task sets are randomly generated for a given tuple (us, umin, umax). In a task set with (us, umin, umax), each task has utilization between umin and umax, and the sum of the utilizations of all the tasks in the task set does not exceed Umax, where Umax = m * us, m is the number of cores and us is called the system utilization. Task τi's Ti, Di and Ui are generated randomly, and its Ei is computed as Ti * Ui. The simulation is carried out with 4 cores for an implicit-deadline system, where Ti = Di. In the DP-EDF scheduling algorithm, the 4 cores are divided into 1 rt-server and 3 rt-clients; in G-EDF and P-EDF, the real-time tasks execute on 3 cores.

In this paper, we show the simulation results for task sets having us in the range [0.5, 0.95] in steps of 0.05. For each us, we consider three distinct (umin, umax) pairs: (0.1, 0.5), (0.5, 1) and (0.1, 1). For every (us, umin, umax) tuple, we generate 10,000 task sets. The performance of a scheduling algorithm is determined by calculating the number of schedulable task sets for a given (us, umin, umax). We express this performance metric using the success ratio [4], i.e., the ratio of the number of schedulable task sets to the total number of scheduled task sets. We assume the various scheduling overheads, such as task migration, task preemption, lock contention cost, reschedule IPI latency and so forth, to be zero.
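The generation procedure just described can be sketched as follows; the uniform sampling and the period range are our own assumptions, since the paper only states that Ti, Di and Ui are generated randomly.

    #include <stdlib.h>

    /* Uniform sample in [lo, hi]; assumes srand() was called elsewhere. */
    static double uniform(double lo, double hi) {
        return lo + (hi - lo) * ((double)rand() / RAND_MAX);
    }

    /* Generate one implicit-deadline task set for (us, umin, umax): each
     * task utilization lies in [umin, umax] and the total utilization does
     * not exceed Umax = m * us. Returns the number of tasks generated. */
    int generate_taskset(double us, double umin, double umax, int m,
                         double *T, double *E, int max_tasks) {
        double cap = m * us, total = 0.0;
        int n = 0;
        while (n < max_tasks) {
            double u = uniform(umin, umax);
            if (total + u > cap) break;      /* would exceed m * us */
            total += u;
            T[n] = uniform(10.0, 100.0);     /* assumed period range; Di = Ti */
            E[n] = T[n] * u;                 /* Ei = Ti * Ui */
            n++;
        }
        return n;
    }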
2) Simulation Results: Figures 3a, 3b and 3c depict the simulation results.

[Fig. 3: System Utilization v/s Success Ratio; (a) (umin, umax) = (0.1, 0.5), (b) (umin, umax) = (0.5, 1), (c) (umin, umax) = (0.1, 1).]

As seen in Figure 3a, the performance of all three scheduling algorithms is comparable when scheduling light tasks (tasks with utilization between 0.1 and 0.5). From the figures, we observe that the DP-EDF algorithm can successfully schedule task sets having total utilization as high as 85%. For all three scheduling algorithms, the success ratio decreases as the utilization increases. However, in the DP-EDF scheduling algorithm, this drop in success ratio is not as drastic as in G-EDF or P-EDF, and the success ratio drops only at very high (greater than 85%) system utilization. Based on the simulation results, we can conclude that the utilization bound of the DP-EDF algorithm is 85%, and that, in terms of schedulability, DP-EDF is superior to both the G-EDF and the P-EDF scheduling algorithms.

For the DP-EDF scheduling algorithm, Table I shows, for each (us, umin, umax) tuple, the number of task splits in the task set having the maximum number of task splits among the 10,000 executed task sets. The maximum amount of splitting occurs in the task sets containing tasks with utilization between 0.5 and 1, and this number does not exceed 35.

B. Implementation

1) Setup: We evaluate the DP-EDF scheduling algorithm by implementing it as a scheduler plugin on the LITMUS^RT platform. LITMUS^RT [9] is a real-time extension of the Linux kernel which provides a platform to implement and test different multiprocessor scheduling algorithms. The execution overhead of DP-EDF is compared against those of GSN-EDF-DP and PSN-EDF, where DP-EDF, GSN-EDF-DP and PSN-EDF are LITMUS^RT scheduler plugins implementing the DP-EDF, G-EDF and P-EDF scheduling algorithms respectively. GSN-EDF-DP supports dedicated interrupt handling, i.e., one processor is exclusively dedicated to interrupt handling, and this core does not execute any real-time tasks. The experiments are carried out on Intel's Core i7-3632QM platform, a quad-core processor with 8 logical cores running at a 2.2 GHz clock speed. We use LITMUS^RT version 2014.2, which is based on Linux kernel 3.10.42. Execution overheads are measured using Feather-Trace, a light-weight event tracing toolkit included in the LITMUS^RT distribution; this highly portable toolkit uses multiprocessor-safe FIFO buffers to trace events at very low overhead. The experiments are conducted with 4 Linux processors and 4 real-time processors. The 4 real-time processors comprise 1 rt-server and 3 rt-clients, which form the rt-scheduling entity. Hence, in both GSN-EDF-DP and DP-EDF, the real-time tasks are executed on only 3 processors. We conducted experiments on task sets having n tasks, with n in the range (3, 27) in steps of 3. The tasks in the task sets were generated randomly, with each task's utilization in the range (0.1, 1). The experiment was performed on 100 task sets.

In DP-EDF, job splitting is achieved using hrtimers. Each task is associated with two timers, a release hrtimer and a split hrtimer. Both timers are programmed to expire on the rt-server, i.e., the hrtimer interrupts the rt-server, where its handler is executed. On its completion, a job programs its release hrtimer to expire at its next release. When a job is split, before it starts executing on the processor allocated to it, it programs its split hrtimer to expire after it has consumed the execution time allocated to it on that processor. On expiry of the split hrtimer, the timer's interrupt handler preempts the job and assigns it to an appropriate processor.
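The split-timer mechanism can be sketched with the Linux hrtimer API that LITMUS^RT builds on; the handler body and helper names are placeholders, not the plugin's actual code.

    #include <linux/hrtimer.h>
    #include <linux/ktime.h>

    /* Split hrtimer handler: programmed to expire on the rt-server. A real
     * handler would recover the task state via container_of(), preempt the
     * split job and hand it back to the mapping logic of Section III. */
    static enum hrtimer_restart split_timer_fn(struct hrtimer *timer)
    {
        return HRTIMER_NORESTART;
    }

    /* Arm the split timer to fire once the job has consumed the execution
     * time (budget_ns) allocated to it on the current processor. */
    static void arm_split_timer(struct hrtimer *timer, u64 budget_ns)
    {
        hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
        timer->function = split_timer_fn;
        hrtimer_start(timer, ns_to_ktime(budget_ns), HRTIMER_MODE_REL);
    }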
In the current implementation of the DP-EDF plugin, the job mapping operation is performed by acquiring locks on the rt-clients' ready queues. Instead of using locks, one can maintain a copy of each rt-client's ready queue, called the rt-client image, at the rt-server, which is updated on every job release, completion and preemption. The rt-server can then use the rt-client image to perform the job mapping.

2) Experimental Results: Figures 4 and 5 plot the average and maximum release overheads respectively.

[Fig. 4: Average release overhead]
[Fig. 5: Maximum release overhead]

In both GSN-EDF-DP and DP-EDF, the release overhead is incurred by the hrtimer interrupt handler on the dedicated processor (the rt-server), and hence this release overhead does not affect the response times of the tasks executing on the rt-clients. Though the release overhead is smaller in PSN-EDF, it is incurred on the cores executing real-time tasks and can therefore affect the response times of the other real-time tasks. The release overhead of the DP-EDF plugin includes the time taken to perform the job mapping and assignment, and hence it is slightly higher than that of GSN-EDF-DP and PSN-EDF.

Figures 6 and 7 plot the average and maximum scheduling overheads respectively.

[Fig. 6: Average scheduling overhead]
[Fig. 7: Maximum scheduling overhead]

The maximum scheduling overhead of DP-EDF is smaller than that of GSN-EDF-DP and PSN-EDF, though its average scheduling overhead is greater than that of GSN-EDF-DP and PSN-EDF. In DP-EDF, the scheduling overhead is incurred on the rt-client, and from the smaller maximum value we can conclude that DP-EDF is more predictable than GSN-EDF-DP and PSN-EDF. Even though the average scheduling overhead of DP-EDF is higher, DP-EDF has a greater utilization bound than PSN-EDF and GSN-EDF-DP. The scheduling overhead in DP-EDF can be further reduced by using an rt-client image, which is a part of our future work.

VI. RELATED WORKS

The DP scheduling technique uses a dedicated processor to handle the scheduling activities of the cores executing the real-time tasks. This concept of using a dedicated processor was introduced in the Spring kernel [10]. Brandenburg et al. [11] implemented a global scheduler in LITMUS^RT using a dedicated processor to handle release interrupts (GSN-EDF-DP), and showed that the dedicated-processor-based scheduler performed better than the classic global scheduler in LITMUS^RT. Recently, Cerqueira et al. [12] implemented a highly scalable global scheduler using the concepts of a dedicated processor and message passing. SchedISA [8] is a Linux-based multicore scheduler designed to schedule hard real-time tasks and Linux tasks simultaneously. DP's system architecture is similar to that of SchedISA; however, SchedISA implements the P-EDF algorithm, where the tasks are mapped to processors before runtime.

VII. CONCLUSION

In this paper, we have presented a dynamic partitioning based scheduling technique, called DP scheduling, for scheduling real-time tasks on multicore platforms. The basic motivation of DP scheduling is to maximize CPU utilization and minimize scheduling overheads. Dynamic partitioning of the real-time tasks ensures that the processors in the system are used to their fullest capacity, while the division of cores helps in reducing the scheduling overhead. The experimental results show the competitiveness of the DP scheduling technique with a few state-of-the-art real-time scheduling techniques. Our future work includes scaling the DP scheduling technique to a larger number of cores and further reducing the scheduling overheads.

REFERENCES

[1] J. M. López, J. L. Díaz, and D. F. García, "Utilization bounds for EDF scheduling on real-time multiprocessor systems," Real-Time Systems, vol. 28, no. 1, pp. 39–68, 2004.
[2] S. K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel, "Proportionate progress: A notion of fairness in resource allocation," Algorithmica, vol. 15, no. 6, pp. 600–625, 1996.
[3] A. Bastoni, B. B. Brandenburg, and J. H. Anderson, "Is semi-partitioned scheduling practical?" in Proc. 23rd Euromicro Conference on Real-Time Systems (ECRTS), 2011, pp. 125–135.
[4] S. Kato, N. Yamasaki, and Y. Ishikawa, "Semi-partitioned scheduling of sporadic task systems on multiprocessors," in Proc. 21st Euromicro Conference on Real-Time Systems (ECRTS), 2009, pp. 249–258.
[5] B. Andersson and K. Bletsas, "Sporadic multiprocessor scheduling with few preemptions," in Proc. Euromicro Conference on Real-Time Systems (ECRTS), 2008, pp. 243–252.
[6] G. Nelissen, S. Funk, and J. Goossens, "Reducing preemptions and migrations in EKG," in Proc. 18th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 2012, pp. 134–143.
[7] K. Bletsas and B. Andersson, "Notional processors: An approach for multiprocessor scheduling," in Proc. 15th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2009, pp. 3–12.
[8] N. Saranya and R. C. Hansdah, "An implementation of partitioned scheduling scheme for hard real-time tasks in multicore Linux with fair share for Linux tasks," in Proc. IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 2014.
[9] J. M. Calandrino, H. Leontyev, A. Block, U. Devi, and J. H. Anderson, "LITMUS^RT: A testbed for empirically comparing real-time multiprocessor schedulers," in Proc. 27th IEEE Real-Time Systems Symposium (RTSS), 2006, pp. 111–126.
[10] J. A. Stankovic and K. Ramamritham, "The Spring kernel: A new paradigm for real-time systems," IEEE Software, vol. 8, no. 3, pp. 62–72, 1991.
[11] B. B. Brandenburg and J. H. Anderson, "On the implementation of global real-time schedulers," in Proc. 30th IEEE Real-Time Systems Symposium (RTSS), 2009, pp. 214–224.
[12] F. Cerqueira, M. Vanga, and B. B. Brandenburg, "Scaling global scheduling with message passing," in Proc. IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2014.