2014 11th International Joint Conference on Computer Science and Software Engineering (JCSSE)
MapReduce Join Strategies for Key-Value Storage
Duong Van Hieu, Sucha Smanchat, and Phayung Meesad
Faculty of Information Technology
King Mongkut’s University of Technology North Bangkok
Bangkok 10800, Thailand
[email protected], {[email protected],[email protected]}, [email protected]
dataset to generate a set of intermediate <key, value> pairs. A
reduce process with a reduce function merges all of the
intermediate values generated by the map processes that are associated
with the same intermediate key to form a possibly smaller set
of <key, value> pairs, called the final output <key, value> pairs.
Abstract—This paper analyses the MapReduce join strategies
used for big data analysis and mining, known as map-side and
reduce-side joins. The most commonly used joins are analysed,
namely the theta-join algorithms: all pair partition join,
repartition join, broadcasting join, semi join, and per-split
semi join. The paper is intended as a guideline for
MapReduce application developers in selecting a join
strategy. The analysis of each join strategy is accompanied by
comprehensive examples.
Fig. 1 is a simple word counting example. The input string
“Advanced Research Methodology, Advanced
Information Modelling and Database, Advanced Network and
Information Security, Advanced Database and Distributed
Systems” is divided into four blocks, one per subject name,
using the commas as separators. A hash function
mod(code(upper(left(key,1))),k)+1 is used for distributing
intermediate <key, value> pairs to the reduce tasks, where k is
the number of reduce tasks. Here left(key,1) takes the first
letter of key, upper(x) changes x to upper case, code(x) returns
the ASCII code of character x, and mod(m, k) returns
the remainder after m is divided by k.
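The word counting example of Fig. 1 can be sketched in a few lines of Python. This is only an illustrative, single-process simulation: the function names (partition, map_block, word_count) are assumptions made for this sketch, and k = 4 is assumed because Fig. 1 shows four reduce tasks.

```python
from collections import defaultdict


def partition(key: str, k: int) -> int:
    """Hash function from the text: mod(code(upper(left(key,1))),k)+1."""
    return ord(key[0].upper()) % k + 1


def map_block(block: str):
    """Map: emit an intermediate <word, 1> pair for every word in one block."""
    return [(word, 1) for word in block.split()]


def word_count(text: str, k: int = 4):
    # One block per subject name, separated by commas.
    blocks = [b.strip() for b in text.split(",")]
    # Shuffle: route each intermediate pair to a reduce task using the hash function.
    groups = defaultdict(lambda: defaultdict(list))
    for block in blocks:
        for word, one in map_block(block):
            groups[partition(word, k)][word].append(one)
    # Reduce: sum the values collected for each key in each reduce task.
    return {task: {w: sum(v) for w, v in sorted(words.items())}
            for task, words in sorted(groups.items())}


TEXT = ("Advanced Research Methodology, "
        "Advanced Information Modelling and Database, "
        "Advanced Network and Information Security, "
        "Advanced Database and Distributed Systems")
result = word_count(TEXT)
for task, counts in result.items():
    print(f"Reduce{task}: {counts}")
```

With k = 4, the keys starting with D go to reduce task 1, those starting with A, I, or M to task 2, those starting with N or R to task 3, and those starting with S to task 4, which matches the grouping drawn in Fig. 1.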
Keywords—MapReduce; join strategy; NoSQL
I. INTRODUCTION
With the continuous development of big data and cloud
computing, it is believed that traditional database technologies
can no longer meet the requirements for data storage and access,
performance, and flexibility. In the new era of big
data, NoSQL databases are more appropriate than relational
databases [1]. The Key-Value store, a kind of NoSQL database, is
an appropriate choice for applications that use the MapReduce
model for distributed processing. Key-Value stores offer only
four underlying operators: inserting <key, value>
pairs into a data collection, updating the values of existing pairs,
finding the values associated with a specific key, and deleting pairs
from a data collection [2].
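The four operators can be illustrated with a minimal in-memory sketch. The class name KeyValueStore and its method names are hypothetical and chosen only for this illustration; a real Key-Value store would add persistence and distribution on top of the same interface.

```python
class KeyValueStore:
    """Toy in-memory data collection exposing only the four basic operators."""

    def __init__(self):
        self._pairs = {}

    def insert(self, key, value):
        # Insert a new <key, value> pair; an existing key is not overwritten.
        if key in self._pairs:
            raise KeyError(f"key already exists: {key!r}")
        self._pairs[key] = value

    def update(self, key, value):
        # Update the value of an existing pair only.
        if key not in self._pairs:
            raise KeyError(f"no such key: {key!r}")
        self._pairs[key] = value

    def find(self, key):
        # Return the value associated with the key, or None if it is absent.
        return self._pairs.get(key)

    def delete(self, key):
        # Remove the pair if it exists; deleting a missing key is a no-op.
        self._pairs.pop(key, None)


store = KeyValueStore()
store.insert("Aj", "PhMe")
store.update("Aj", "Sup")
found = store.find("Aj")
store.delete("Aj")
missing = store.find("Aj")
```

Note that no join operator appears in this interface, which is why joins have to be written by the application developer.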
Joining two data collections to produce a new dataset based
on joining fields is the responsibility of programmers or
application developers rather than of the database management
system. Several join strategies exist, each with
different advantages and disadvantages. To give
programmers a guideline for selecting a join strategy, this
study analyses several join strategies for big data analysis
and mining, accompanied by comprehensive examples. The
content of this paper is organised into four main sections.
Section 2 gives an overview of the MapReduce programming
model, Section 3 explains the MapReduce join strategies, and
Section 4 concludes with a comparison of the join strategies
used in MapReduce.
[Figure 1 shows the data flow from left to right: input data (Block1 to Block4), map tasks (Map1 to Map4), intermediate <key, value> pairs, <key, value> pairs distribution (Group1 to Group4), reduce tasks (Reduce1 to Reduce4), and the <key, value> pairs produced by the reduce processes.]
Fig. 1. Map and reduce processes of a simple word counting example
II. MAPREDUCE OVERVIEW
MapReduce has been used at Google since February 2003,
and was first introduced in 2004 by Dean and Ghemawat [3]
and in Communications of the ACM in 2008 [4]. It is used for
processing large datasets in a parallel or distributed computing
environment. It is a combination of map processes and reduce
processes. A map process is a function that processes a set of
input <key, value> pairs that is a portion of a large input
III. MAPREDUCE KEY JOIN STRATEGIES
Physically, data in a Key-Value format can be stored in
a data structure such as a B-Tree, a queue, or a hash table
[5, 6]. Logically, each record in a Key-Value store is a single
entry consisting of a key and a value. To make it easy to
understand, a set of <key, value> pairs, called a data collection,
can be considered as a two-column table. The first column stores
the keys and the second one, which can be a combination of
multiple columns, stores the values associated with the keys.
Okcan [16]. Theta join algorithms will be analysed in the
following sections.
Joins using MapReduce can be categorized as map-side
join, reduce-side join, memory-backed join, join using Bloom
Filter, and map-reduce merge [7]. However, this paper follows
the categories proposed by Tom White [8, 9], which group joins
into two types: map-side joins and reduce-side joins.
Map-side joins are performed by the mappers; they join
two large input datasets before the data reach the map
functions. Reduce-side joins are performed by the reducers and
are more general than map-side joins because the inputs do not
need to be structured in any particular way [9]. In some cases,
reduce-side joins are less efficient than map-side joins because
both datasets go through the MapReduce shuffle. Reduce-side
joins involve several components, namely multiple
inputs and secondary sorting [8].
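Input tables, query, and result shown in Fig. 2:

Table L           Table R
Stds   Profs      Stds   Profs
Aj     PhMe       Hin    Sup
Hiu    PhMe       Hiu    Sup
Lo     PhMe       Jia    PhMe
Su     Mar        Ling   Su
Suna   Un         Sul    PhMe

SELECT * FROM L, R WHERE L.Profs = R.Profs

L.Stds  L.Profs  R.Stds  R.Profs
Aj      PhMe     Jia     PhMe
Aj      PhMe     Sul     PhMe
Hiu     PhMe     Jia     PhMe
Hiu     PhMe     Sul     PhMe
Lo      PhMe     Jia     PhMe
Lo      PhMe     Sul     PhMe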
Fig. 3. All pairs partition join
Each compound partition is assigned to a map task.
The output of the map task is a set of <compound key, tagged record>
pairs. A compound key is a combination of the partition numbers
from tables R and L, such as (1, 1), (1, 2), and (1, 3). To identify
which table a record comes from, each record from table
R or L is tagged with its table name and is then called a tagged
record. Each group of <compound key, tagged record> pairs is passed
to a reducer. Before reducing, the reducer splits its input back
into the records of table R and the records of table L, and joins
them in the same way as the traditional join method.
Among join algorithms used in MapReduce literature listed
in [11-15], it is believed that equi-join strategies used in [11]
are more efficient than those used in Yahoo Pig, Facebook
Hive, and IBM Jaql. This paper focuses on the theta-join
implementation strategies proposed by Blanas et al.[11] and
Partitioned input tables shown in Fig. 3:

Table L                  Table R
Part    Stds   Profs     Part    Stds   Profs
Part1   Aj     PhMe      Part1   Hin    Sup
Part1   Hiu    PhMe      Part1   Hiu    Sup
Part2   Lo     PhMe      Part2   Jia    PhMe
Part2   Dit    PhMe      Part2   Ling   Su
Part3   Su     Mar
Part3   Sun    Un
B. All Pair Partition Joins
Given table R with |R| records and table L with |L|
records, the product of R and L is a set of |R|*|L| records.
Computing this product in the traditional way takes a long time
when the two tables are very large. To compute the product in
MapReduce, table R and table L are divided into u and v disjoint
partitions, respectively. The |R|*|L| records can then be obtained
from u*v partial products, partition (1, 1), partition (1, 2), ...,
partition (u, v), each of which can be processed by a map or a
reduce function. This method is
called all pairs partition join in the MapReduce model [16].
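The idea can be sketched as a single-process simulation, using the partitioned tables of Fig. 3 as input. The helper names (split, all_pairs_partition_join) and the use of contiguous, equally sized partitions are assumptions of this sketch, not part of the method as published.

```python
from itertools import product


def split(records, parts):
    """Split a list into at most `parts` contiguous, disjoint partitions."""
    size = -(-len(records) // parts)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def all_pairs_partition_join(R, L, u, v, predicate):
    """Evaluate the join predicate inside every compound partition (i, j)."""
    r_parts, l_parts = split(R, u), split(L, v)
    output = []
    # Enumerate all u*v compound partitions; each one could be handled
    # by an independent map or reduce task.
    for (i, r_part), (j, l_part) in product(enumerate(r_parts, 1),
                                            enumerate(l_parts, 1)):
        for r, l in product(r_part, l_part):
            if predicate(r, l):
                output.append(((i, j), r, l))
    return output


R = [("Hin", "Sup"), ("Hiu", "Sup"), ("Jia", "PhMe"), ("Ling", "Su")]
L = [("Aj", "PhMe"), ("Hiu", "PhMe"), ("Lo", "PhMe"),
     ("Dit", "PhMe"), ("Su", "Mar"), ("Sun", "Un")]

# Equi-join on the field Profs (index 1), with u = 2 and v = 3.
joined = all_pairs_partition_join(R, L, 2, 3, lambda r, l: r[1] == l[1])
for key, r, l in joined:
    print(key, r, l)
```

Only the compound partitions (2, 1) and (2, 2) contribute output rows here, although all six partitions are enumerated.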
A. Theta Joins
A theta join is a join that uses a comparison operator
such as <, <=, >, >=, =, or <> in the join predicate. Among these,
the equi-join, which uses the = operator, is the most commonly used
for joining two datasets to obtain the records they have in common.
Fig. 2 is an example of an equi-join. This join matches every record
from table L with every record from table R that has the same value
in the join field. The join result can be projected to eliminate
redundant fields and keep only the required ones.
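A theta join can be illustrated with a plain nested-loop sketch in which the comparison operator is a parameter. The function name theta_join and the dictionary-based record layout are assumptions made for this illustration; the input data are the tables of Fig. 2.

```python
import operator

# Comparison operators allowed in a theta-join predicate.
THETA = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
         ">=": operator.ge, "=": operator.eq, "<>": operator.ne}


def theta_join(L, R, l_field, r_field, op):
    """Nested-loop theta join of two lists of dict records."""
    compare = THETA[op]
    rows = []
    for l in L:
        for r in R:
            if compare(l[l_field], r[r_field]):
                # Prefix field names with the table name, as in Fig. 2.
                row = {f"L.{k}": v for k, v in l.items()}
                row.update({f"R.{k}": v for k, v in r.items()})
                rows.append(row)
    return rows


L = [{"Stds": "Aj", "Profs": "PhMe"}, {"Stds": "Hiu", "Profs": "PhMe"},
     {"Stds": "Lo", "Profs": "PhMe"}, {"Stds": "Su", "Profs": "Mar"},
     {"Stds": "Suna", "Profs": "Un"}]
R = [{"Stds": "Hin", "Profs": "Sup"}, {"Stds": "Hiu", "Profs": "Sup"},
     {"Stds": "Jia", "Profs": "PhMe"}, {"Stds": "Ling", "Profs": "Su"},
     {"Stds": "Sul", "Profs": "PhMe"}]

# SELECT * FROM L, R WHERE L.Profs = R.Profs
equi_rows = theta_join(L, R, "Profs", "Profs", "=")
```

Passing "=" yields the six rows of the equi-join example, while any other key of THETA gives the corresponding non-equi theta join.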
Fig. 2. A simple equi-join example (using equi-join on the field Profs)
Multiple inputs means that inputs from different sources can
have different formats or representations. To deal with this
situation, each input needs to be parsed separately. Hadoop
supports this by letting the input format and mapper be specified
on a per-path basis [10].
Secondary sorting is needed when reducers obtain inputs from
two sources that are sorted in different orders.
For example, when the first dataset comes from
source A sorted by key1 and the second dataset comes from source
B sorted by key2, the merged data should be sorted by a
composite key (key1, key2) before reducing.
Input of the reduce functions (compound key, list of tagged records)
Key     Value lists
(1,1)   ('R',Hin,Sup), ('R',Hiu,Sup), ('L',Aj,PhMe), ('L',Hiu,PhMe)
(1,2)   ('R',Hin,Sup), ('R',Hiu,Sup), ('L',Lo,PhMe), ('L',Dit,PhMe)
(1,3)   ('R',Hin,Sup), ('R',Hiu,Sup), ('L',Su,Mar), ('L',Sun,Un)
(2,1)   ('R',Jia,PhMe), ('R',Ling,Su), ('L',Aj,PhMe), ('L',Hiu,PhMe)
(2,2)   ('R',Jia,PhMe), ('R',Ling,Su), ('L',Lo,PhMe), ('L',Dit,PhMe)
(2,3)   ('R',Jia,PhMe), ('R',Ling,Su), ('L',Su,Mar), ('L',Sun,Un)

Output of the reduce functions
Key     Value lists
        R.Stds  R.Profs  L.Stds  L.Profs
empty   Jia     PhMe     Aj      PhMe
empty   Jia     PhMe     Hiu     PhMe
empty   Jia     PhMe     Lo      PhMe
empty   Jia     PhMe     Dit     PhMe
Fig. 4. An example of all pairs partition joins (using equi-join on the field Profs)
In Fig. 4, each record from tables L and R is tagged with
‘L’ or ‘R’, respectively, and is then called a tagged
record. Only compound partitions that contain records from both
table L and table R with the same join key are fed to the reduce
functions. In this example, only partition (2, 1) and partition
(2, 2) contain records from tables R and L that share a join key,
so only they are used for joining. The remaining partitions are
ignored. The disadvantage of this join is that every pair of
partitions is enumerated even though many of the pairs may not
be processed by the reducers.
C. Repartition Join
Repartition join is the most commonly used join strategy in
MapReduce. Datasets L and R are dynamically split into parts
based on the join key, and the corresponding pairs of partitions
from L and R are joined [15]. It has two versions, called standard
repartition join and improved repartition join.
The standard version is the same as the partitioned sort-merge
join that is used in parallel Relational Database
Management Systems [11]. In the map phase, each map task
works on a block of either table L or table R. To identify which
table an input record comes from, the map function tags each
record with its originating table and emits the extracted join key
together with the tagged record. The output of the map function
is a set of <join_key, tagged_record> pairs, where join_key is the
attribute used to join the two tables and tagged_record is the
combination of the table name and the record. These outputs are
then partitioned, sorted, and merged, so that all records for each
join key are grouped together and fed to a reducer. In the reduce
phase, for each join key, the reducer first separates and buffers
the input records into two sets according to their table tag, and
then performs a cross-product between the two sets. The following
example uses the hash function mod(code(upper(left(join key,1))),2)+1
for distributing intermediate <key, value> pairs to the reducers
(similar to the hash function used earlier).
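The standard repartition join can be sketched as the following single-process simulation. The function names are assumptions of this sketch, the shuffle is imitated with nested dictionaries, and the reducer assignment uses the hash function mod(code(upper(left(join key,1))),k)+1 with k = 2.

```python
from collections import defaultdict
from itertools import product


def reducer_for(join_key: str, k: int = 2) -> int:
    """Hash function from the text: mod(code(upper(left(join key,1))),k)+1."""
    return ord(join_key[0].upper()) % k + 1


def map_phase(table_name, records, key_index=1):
    """Emit <join_key, tagged_record> pairs; the tag is the table name."""
    return [(rec[key_index], (table_name, *rec)) for rec in records]


def standard_repartition_join(L, R, k=2):
    pairs = map_phase("L", L) + map_phase("R", R)
    # Shuffle: partition by hash of the join key, then group by join key.
    reducers = defaultdict(lambda: defaultdict(list))
    for join_key, tagged in pairs:
        reducers[reducer_for(join_key, k)][join_key].append(tagged)
    output = []
    for task in sorted(reducers):
        for join_key in sorted(reducers[task]):
            records = reducers[task][join_key]
            # Reduce: buffer both sides, then take their cross-product.
            l_buf = [rec[1:] for rec in records if rec[0] == "L"]
            r_buf = [rec[1:] for rec in records if rec[0] == "R"]
            output.extend(l + r for l, r in product(l_buf, r_buf))
    return output


L = [("Su", "Mar"), ("Aj", "PhMe"), ("Hiu", "PhMe"), ("Lo", "PhMe"), ("Sun", "Un")]
R = [("Jia", "PhMe"), ("Sul", "PhMe"), ("Ling", "Su"), ("Hin", "Sup"), ("Hiu", "Sup")]
joined = standard_repartition_join(L, R)
```

Both l_buf and r_buf hold every record of a join key before any output is produced, which is the memory weakness that the improved version addresses.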
Input of map functions

Table L                             Table R
Block    Stds   Profs   Map task    Block    Stds   Profs   Map task
Block1   Su     Mar     Map1        Block1   Jia    PhMe    Map4
Block1   Aj     PhMe    Map1        Block1   Sul    PhMe    Map4
Block2   Hiu    PhMe    Map2        Block2   Ling   Su      Map5
Block2   Lo     PhMe    Map2        Block3   Hin    Sup     Map6
Block3   Sun    Un      Map3        Block3   Hiu    Sup     Map6

Intermediate output (join key, tagged record)
Map1   Mar    ('L', Su, Mar)
Map1   PhMe   ('L', Aj, PhMe)
Map2   PhMe   ('L', Hiu, PhMe)
Map2   PhMe   ('L', Lo, PhMe)
Map3   Un     ('L', Sun, Un)
Map4   PhMe   ('R', Jia, PhMe)
Map4   PhMe   ('R', Sul, PhMe)
Map5   Su     ('R', Ling, Su)
Map6   Sup    ('R', Hin, Sup)
Map6   Sup    ('R', Hiu, Sup)

After merging, sorting, and grouping (join key, tagged record)
Group 1   PhMe   ('L', Aj, PhMe)
Group 1   PhMe   ('L', Hiu, PhMe)
Group 1   PhMe   ('L', Lo, PhMe)
Group 1   PhMe   ('R', Jia, PhMe)
Group 1   PhMe   ('R', Sul, PhMe)
Group 2   Mar    ('L', Su, Mar)
Group 2   Su     ('R', Ling, Su)
Group 2   Sup    ('R', Hin, Sup)
Group 2   Sup    ('R', Hiu, Sup)
Group 2   Un     ('L', Sun, Un)

Reduce process
Reduce 1   Table L: (Aj, PhMe), (Hiu, PhMe), (Lo, PhMe)
           Table R: (Jia, PhMe), (Sul, PhMe)
Reduce 2   Table L: (Su, Mar), (Sun, Un)
           Table R: (Ling, Su), (Hin, Sup), (Hiu, Sup)

Final result from reduce process
L.Stds  L.Profs  R.Stds  R.Profs
Aj      PhMe     Jia     PhMe
Aj      PhMe     Sul     PhMe
Hiu     PhMe     Jia     PhMe
Hiu     PhMe     Sul     PhMe
Lo      PhMe     Jia     PhMe
Lo      PhMe     Sul     PhMe
Fig. 5. An example of standard repartition joins (using equi-join on the field Profs)
All records from tables L and R are buffered before
joining, which may lead to an insufficient memory problem, as
encountered by Yahoo Pig and Facebook Hive [11, 17, 18]. To
deal with this, the improved repartition join was proposed.
In the improved version, the map function is changed so that
its output key is a composite of the join key and the table tag.
The table tags are generated in a way that guarantees that
records from table R are stored ahead of those from table L for a
given join key. The partition function is also customised so that
the hash code is computed from just the join key instead of the
composite key. Records are then grouped by just the join key
instead of the composite key. The grouping function in the reducer
groups records on the join key and ensures that records from table
R are stored ahead of those from table L for a given key. To
decrease the buffer size, only the records whose composite key
contains all table tags are written into the buffer.
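The effect of the composite key can be sketched as follows. This is a simplified, single-reducer simulation: the names TAG_ORDER and improved_repartition_join are assumptions of this sketch, partitioning across several reducers is omitted, and only the records of table R are buffered while the records of table L are streamed past them, which is one way to exploit the guarantee that R records arrive first.

```python
from itertools import groupby

# Tags are chosen so that records of R sort ahead of records of L.
TAG_ORDER = {"R": 0, "L": 1}


def improved_repartition_join(L, R, key_index=1):
    # Map: the output key is a composite of (join key, table tag).
    pairs = [((rec[key_index], TAG_ORDER["R"]), rec) for rec in R]
    pairs += [((rec[key_index], TAG_ORDER["L"]), rec) for rec in L]
    # Sort on the full composite key (stable, so input order is kept within a tag).
    pairs.sort(key=lambda pair: pair[0])
    output = []
    # Group on the join key only, not on the composite key.
    for _join_key, group in groupby(pairs, key=lambda pair: pair[0][0]):
        r_buffer = []
        for (_key, tag), rec in group:
            if tag == TAG_ORDER["R"]:
                r_buffer.append(rec)  # only R records are buffered
            else:
                # L records arrive after all R records of the same key,
                # so each one can be joined immediately and then discarded.
                output.extend(rec + r for r in r_buffer)
    return output


L = [("Su", "Mar"), ("Aj", "PhMe"), ("Hiu", "PhMe"), ("Lo", "PhMe"), ("Sun", "Un")]
R = [("Jia", "PhMe"), ("Sul", "PhMe"), ("Ling", "Su"), ("Hin", "Sup"), ("Hiu", "Sup")]
joined = improved_repartition_join(L, R)
```

For join keys that occur in only one table (Mar, Su, Sup, Un) the loop produces nothing, and for keys found only in L the buffer stays empty.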
Input tables (each table is stored in three blocks, Block 1 to Block 3; the blocks of table R are processed by Map 1 to Map 3 and the blocks of table L by Map 4 to Map 6)

Table R           Table L
Stds   Profs      Stds   Profs
Jia    PhMe       Su     Mar
Sul    PhMe       Aj     PhMe
Ling   Su         Hiu    PhMe
Hin    Sup        Lo     PhMe
Hiu    Sup        Sun    Un

Output of map functions
Comp. Keys   Tagged Records
[PhMe, R]    ('R', Jia, PhMe)
[PhMe, R]    ('R', Sul, PhMe)
[Su, R]      ('R', Ling, Su)
[Sup, R]     ('R', Hin, Sup)
[Sup, R]     ('R', Hiu, Sup)
[Mar, L]     ('L', Su, Mar)
[PhMe, L]    ('L', Aj, PhMe)
[PhMe, L]    ('L', Hiu, PhMe)
[PhMe, L]    ('L', Lo, PhMe)
[Un, L]      ('L', Sun, Un)

Intermediate Results
Keys         Tagged Records
[Mar, L]     ('L', Su, Mar)
[PhMe, R]    ('R', Jia, PhMe)
[PhMe, R]    ('R', Sul, PhMe)
[PhMe, L]    ('L', Aj, PhMe)
[PhMe, L]    ('L', Hiu, PhMe)
[PhMe, L]    ('L', Lo, PhMe)
[Su, R]      ('R', Ling, Su)
[Sup, R]     ('R', Hin, Sup)
[Sup, R]     ('R', Hiu, Sup)
[Un, L]      ('L', Sun, Un)

Input of reduce function
Keys           Lists of Values
[PhMe, R, L]   ([Jia, PhMe], [Aj, PhMe])
               ([Jia, PhMe], [Hiu, PhMe])
               ([Jia, PhMe], [Lo, PhMe])
               ([Sul, PhMe], [Aj, PhMe])
               ([Sul, PhMe], [Hiu, PhMe])
               ([Sul, PhMe], [Lo, PhMe])
[Mar, _, L]    (_, [Su, Mar])
[Un, _, L]     (_, [Sun, Un])
[Su, R, _]     ([Ling, Su], _)
[Sup, R, _]    ([Hin, Sup], _)
               ([Hiu, Sup], _)

Final result from reducer
L.Stds  L.Profs  R.Stds  R.Profs
Aj      PhMe     Jia     PhMe
Aj      PhMe     Sul     PhMe
Hiu     PhMe     Jia     PhMe
Hiu     PhMe     Sul     PhMe
Lo      PhMe     Jia     PhMe
Lo      PhMe     Sul     PhMe
Fig. 6. Example of improved repartition joins (using equi-join on the field Profs)
D. Broadcasting Join
Broadcast join is used when table R is much smaller than
table L. Instead of moving both tables R and L across the
network, only the smaller table is broadcast to the nodes that
hold the larger table. This technique reduces sorting time and
network traffic. At the beginning of each map function, broadcast
join checks whether R is already stored on the local file system.
If not, it retrieves table R from the distributed file system,
splits R into partitions on the join key, and stores these
partitions on the local file system. The hash table is built from
either table L or table R, depending on which one is smaller.
If R is smaller than a split of L, all partitions of R
are loaded into memory to build the hash table. The map
function then extracts the join key value from each record of L
and uses it to probe the hash table and to generate the join output.
If R is bigger than a split of L, the join is not done in the map
function. Instead, the map function maps each partition of L with
each partition of R using other join strategies. The results
from R and L are then joined at the end of the map process.
In Fig. 7 and Fig. 8, table R is smaller than a split of table L,
so it is broadcast to each node. The map function loads all
records from table R to build a hash table. For each record in
a split of table L, the map function looks up its join key in the
hash table and outputs only the records that have a match. All
unreferenced records from table L are ignored.

Table R (used to build the hash table)
StdId   subject
55501   701
55501   371
55502   555
56701   511
56702   814

Hash table, distribution function = (StdId mod 2) + 1
StdId   Group
55501   2
55502   1
56701   2
56702   1

Table L (the join key of each record is used to probe the hash table)
Split    Map task   StdId   Name
Split1   Map1       55701   Lo
Split1   Map1       55702   Mo
Split1   Map1       56700   Bo
Split1   Map1       56701   Dit
Split1   Map1       56702   Hiu
Split1   Map1       56703   Cha
Split2   Map2       55501   Sul
Split2   Map2       55502   Sher
Split2   Map2       55503   Jia
Split2   Map2       55504   Dih
Split2   Map2       55505   Tha
Split2   Map2       56501   Ling
Fig. 7. Building Hash table when R is smaller than any part of L

Hash table L1 (StdId): 55501, 55502
Hash table L2 (StdId): 56701, 56702

Intermediate results of Map1
L.StdId  L.Name  R.StdId  R.subject
56701    Dit     56701    511
56702    Hiu     56702    814

Intermediate results of Map2
L.StdId  L.Name  R.StdId  R.subject
55501    Sul     55501    701
55501    Sul     55501    371
55502    Sher    55502    555

Results grouped by the distribution function
Group    L.StdId  L.Name  R.StdId  R.subject
Group1   55502    Sher    55502    555
Group1   56702    Hiu     56702    814
Group2   55501    Sul     55501    701
Group2   55501    Sul     55501    371
Group2   56701    Dit     56701    511
Fig. 8. Example of broadcasting joins when R is smaller than any part of L (using equi-join)

In some cases, a large portion of table R may not be
referenced by any record from table L. For example, R may be a
table of users containing millions of records while L is a table
of the activities that users perform during one hour. In this
situation, only a few records from table R are referenced by
records from table L. However, when joining by broadcasting, a
large number of records from table R are shipped across the
network and loaded into the hash table. If these records are not
referenced by the join key, the network resources used for
shipping them are wasted.
E. Semi Join
The semi join, proposed to solve the problem mentioned
above, comprises three phases. The first phase
runs as a full MapReduce job. In the map function, a
main-memory hash table is used to determine the set of
unique join key values in a split of table L. Because only
unique key values are sent to the map output, the number of
records that need to be sorted is reduced. The reduce function
then processes the unique join keys. In Fig. 9, all unique join
keys are consolidated by a single reducer, and the result of this
phase is a single file called L.uk.

Table L
Split    Map task   StdId   subject
Split1   Map1       55501   701
Split1   Map1       55501   371
Split1   Map1       55502   555
Split2   Map2       56701   511
Split2   Map2       56701   814
Split2   Map2       56702   814

Output L.uk
StdId
55501
55502
56701
56702
Fig. 9. Example of the first phase in Semi joins (using equi-join)

The second phase, similar to the broadcast join, runs as a
map-only job. First, L.uk is loaded into an in-memory hash table.
The map function then iterates over each record from table R and
outputs it if its join key can be found in L.uk. Each split of
table R produces one file called Ri. The output of this phase is a
list of files Ri, as shown in Fig. 10.
The third phase joins all files Ri with table L using broadcast
join, as shown in Fig. 11. One drawback of semi join is that not
every record in a file Ri will join with a particular split Li of
table L. To solve this issue, the per-split semi join was proposed.
Hash table built from L.uk (used by Map1 and Map2)
StdId
55501
55502
56701
56702

Table R
Split    Map task   StdId   Name
Split1   Map1       55701   Lo
Split1   Map1       55702   Mo
Split1   Map1       56700   Bo
Split1   Map1       56701   Dit
Split1   Map1       56702   Hiu
Split1   Map1       56703   Cha
Split2   Map2       55501   Sul
Split2   Map2       55502   Sher
Split2   Map2       55503   Jia
Split2   Map2       55504   Dih
Split2   Map2       55505   Tha
Split2   Map2       56501   Ling

Output R1           Output R2
StdId   Name        StdId   Name
56701   Dit         55501   Sul
56702   Hiu         55502   Sher
Fig. 10. Example of the second phase in Semi joins (using equi-join)
Table L (joined with each file Ri by Map1 and Map2)
Split    StdId   subject
Split1   55501   701
Split1   55501   371
Split1   55502   555
Split2   56701   511
Split2   56701   814
Split2   56702   814

Output R1           Output R2
StdId   Name        StdId   Name
56701   Dit         55501   Sul
56702   Hiu         55502   Sher

Intermediate results 1 (table L joined with R1)
L.StdId  L.Name  R.StdId  R.subject
56701    Dit     56701    511
56701    Dit     56701    814
56702    Hiu     56702    814

Intermediate results 2 (table L joined with R2)
L.StdId  L.Name  R.StdId  R.subject
55501    Sul     55501    701
55501    Sul     55501    371
55502    Sher    55502    555
Fig. 11. Example of the last phase in Semi joins (using equi-join)
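Because the second and third phases of the semi join reuse the idea of the broadcast join, both strategies can be sketched together. The code below is a single-process simulation with hypothetical function names; records are tuples whose first field (StdId) is the join key, and the file L.uk and the files Ri are represented by in-memory Python collections rather than files in a distributed file system.

```python
def broadcast_join(l_split, small_table):
    """Map-side hash join: build a hash table on the small table, probe with L."""
    table = {}
    for rec in small_table:
        table.setdefault(rec[0], []).append(rec)
    # Records of L without a match in the hash table are ignored.
    return [l + r for l in l_split for r in table.get(l[0], [])]


def semi_join(l_splits, r_splits):
    # Phase 1 (full MapReduce job): each map task emits the unique join keys
    # of its split; a single reducer consolidates them into L.uk.
    l_uk = set()
    for split in l_splits:
        l_uk |= {rec[0] for rec in split}
    # Phase 2 (map-only job): keep only the records of R referenced by L.uk;
    # each split of R produces one file Ri.
    r_files = [[rec for rec in split if rec[0] in l_uk] for split in r_splits]
    # Phase 3: broadcast-join every file Ri with every split of table L.
    output = []
    for r_i in r_files:
        for l_split in l_splits:
            output.extend(broadcast_join(l_split, r_i))
    return sorted(l_uk), r_files, output


L_SPLITS = [[(55501, 701), (55501, 371), (55502, 555)],
            [(56701, 511), (56701, 814), (56702, 814)]]
R_SPLITS = [[(55701, "Lo"), (55702, "Mo"), (56700, "Bo"),
             (56701, "Dit"), (56702, "Hiu"), (56703, "Cha")],
            [(55501, "Sul"), (55502, "Sher"), (55503, "Jia"),
             (55504, "Dih"), (55505, "Tha"), (56501, "Ling")]]
l_uk, r_files, joined = semi_join(L_SPLITS, R_SPLITS)
```

In this run only four of the twelve records of table R survive the second phase, so the third phase ships far less data than a plain broadcast of R would.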
F. Per-Split Semi Join
Per-split semi join consists of three phases. The first and
the last phases are map-only jobs, and the second phase is a full
MapReduce job. The first phase generates the set of unique join
keys in each split Li of table L and stores them in the distributed
file system in a file called Li.uk. The second phase loads all
records from a split of table R into a main-memory hash table,
reads the unique keys from file Li.uk, and probes the hash table
for matching records from R. Each matched record is output
with a tag RLi, which is used by the reduce function to collect all
records from table R that will join with Li. In the last phase, the
results of the second phase and Li are joined directly, as shown
in Fig. 12 and Fig. 13.
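The three phases can be sketched as the following single-process simulation. The function name per_split_semi_join and the in-memory representation of the files Li.uk and of the tagged output are assumptions of this sketch; records are tuples whose first field (StdId) is the join key.

```python
from collections import defaultdict


def per_split_semi_join(l_splits, r_splits):
    # Phase 1 (map-only): unique join keys of each split Li, i.e. the file Li.uk.
    uk = [sorted({rec[0] for rec in split}) for split in l_splits]
    # Phase 2 (full MapReduce job): each map task loads one split of R into a
    # hash table and probes it with every Li.uk; matches are tagged RLi and
    # the reduce step collects all records carrying the same tag.
    tagged = defaultdict(list)
    for r_split in r_splits:
        table = defaultdict(list)
        for rec in r_split:
            table[rec[0]].append(rec)
        for i, keys in enumerate(uk, start=1):
            for key in keys:
                tagged[f"RL{i}"].extend(table.get(key, []))
    # Phase 3 (map-only): join each split Li directly with its own RLi.
    output = []
    for i, l_split in enumerate(l_splits, start=1):
        lookup = defaultdict(list)
        for rec in tagged[f"RL{i}"]:
            lookup[rec[0]].append(rec)
        output.extend(l + r for l in l_split for r in lookup.get(l[0], []))
    return uk, dict(tagged), output


L_SPLITS = [[(55501, 701), (55501, 371), (55502, 555)],
            [(56701, 511), (56701, 814), (56702, 814)]]
R_SPLITS = [[(55701, "Lo"), (55702, "Mo"), (56700, "Bo"),
             (56701, "Dit"), (56702, "Hiu"), (56703, "Cha")],
            [(55501, "Sul"), (55502, "Sher"), (55503, "Jia"),
             (55504, "Dih"), (55505, "Tha"), (56501, "Ling")]]
uk, tagged, joined = per_split_semi_join(L_SPLITS, R_SPLITS)
```

Compared with the semi join, every record collected under a tag RLi is guaranteed to join with split Li, at the cost of an extra probe of each split of R for every file Li.uk.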
[Figure 12: the splits Li of table L, their unique join keys Li.uk, and the records of table R tagged RLi.]
Fig. 12. Example of the first phase and second phase in Per-Split Semi Join

Output of R join Li.uk
Tags   StdId   Name
RL1    55501   Sul
RL1    55502   Sher
RL2    56701   Dit
RL2    56702   Hiu

Table L
StdId   subject
55501   701
55501   371
55502   555
56701   511
56701   814
56702   814

Output of final phase
L.StdId  L.Name  R.StdId  R.subject
55501    Sul     55501    701
55501    Sul     55501    371
55502    Sher    55502    555
56701    Dit     56701    511
56701    Dit     56701    814
56702    Hiu     56702    814
Fig. 13. Example of the last phase in Per-Split Semi Join (using equi-join)

IV. CONCLUSION
Many big data mining problems can be solved by using
MapReduce together with a Key-Value store. Based on the
advantages and drawbacks of the strategies explained in this
paper, in terms of time and network resource consumption, a
comparison of the join strategies is given in Table 1.

TABLE 1. COMPARISON OF JOIN STRATEGIES
Strategy | Pros/Cons | Suggestion
All pair partition join | Easy to implement. Compound partitions transferred to the reducers may not all be processed by the reducers. | Use when the two datasets have much data in common and are sorted by the same fields.
Standard repartition join | Easy to implement. All records from both tables are buffered before joining, which may lead to an insufficient memory problem. | Same as all pair partition join.
Improved repartition join | Reduces the buffer size. The implementation is more complex than the standard version. | Use when the two joined datasets have little data in common and are sorted by the same fields.
Broadcasting join | Reduces sorting time and network traffic. May waste network resources. | Use when one table is much smaller than the other table.
Semi-join | Some records from the parts of a table broadcast to the other table may not be joined. | Use when a large portion of a table may not be referenced by any record from the other table.
Per-split semi join | Complicated implementation; more reading and writing operations. | Same as semi-join.

Which strategy should be used for a given problem depends on
the nature of the data and the available network resources. If the
two joined tables have much data in common, or sufficient
network resources are available, all pair partition join or
repartition join should be used because their implementation is
not as complex as that of the others. If the two joined tables have
little data in common, or network resources are inadequate,
broadcasting join, semi join, or per-split semi join should be used
because they may reduce time and resource consumption.
Data in a NoSQL database can be structured, semi-structured,
or unstructured, and can be stored in many types of data
structures such as the indexed table of a relational database,
B-Tree, queue, or hash table. Therefore, in addition to the
considerations presented in this paper, the selection of a join
strategy is also affected by the data structures. MapReduce
programmers may also need to consider data access time and data
sorting time when selecting a join strategy. This issue is beyond
the scope of this paper and is left for future research.
REFERENCES
[1] Mapanga, I. and P. Kadebu, Database Management Systems: A NoSQL Analysis. International Journal of Modern Communication Technologies & Research (IJMCTR), 2013. 1: p. 12-18.
[2] Hecht, R. and S. Jablonski. NoSQL evaluation: A use case oriented survey. in Cloud and Service Computing (CSC), 2011 International Conference on. 2011.
[3] Dean, J. and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, in OSDI '04: Sixth Symposium on Operating Systems Design and Implementation. 2004, USENIX: San Francisco, California, USA. p. 137–150.
[4] Dean, J. and S. Ghemawat, MapReduce: simplified data processing on large clusters, in Communications of the ACM - 50th anniversary issue: 1958 - 2008. 2008. p. 107-113.
[5] Celko, J., Chapter 6. Key–Value Stores, in Joe Celko's complete guide to NoSQL: what every SQL professional needs to know about nonrelational databases, A. Dierna and H. Scherer, Editors. 2014, Morgan Kaufmann, Elsevier: USA. p. 81-88.
[6] Oracle, Chapter 1. Introduction to Berkeley DB, in Oracle Berkeley DB: Getting Started with Berkeley DB for C. 2013. p. 8-15.
[7] Jadhav, V., J. Aghav, and S. Dorwani, Join Algorithms Using MapReduce: A Survey, in International Conference on Electrical Engineering and Computer Science. 2013, IOAJ INDIA: Coimbatore, Tamil Nadu, India. p. 40-44.
[8] White, T., Chapter 8. MapReduce Features, in Hadoop: The Definitive Guide, Second Edition, M. Loukides, Editor. 2011, O'Reilly Media, Inc.: USA. p. 225-257.
[9] White, T., Chapter 8. MapReduce Features, in Hadoop: The Definitive Guide, Third Edition, M. Loukides and M. Blanchette, Editors. 2012, O'Reilly Media, Inc.: USA. p. 259-295.
[10] White, T., Chapter 7. MapReduce Types and Formats, in Hadoop: The Definitive Guide, Third Edition, M. Loukides and M. Blanchette, Editors. 2012, O'Reilly Media, Inc.: USA. p. 223-258.
[11] Blanas, S., et al., A comparison of join algorithms for log processing in MapReduce, in Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. 2010, ACM: Indianapolis, Indiana, USA. p. 975-986.
[12] Özsu, M.T. and P. Valduriez, Chapter 3. Distributed Database Design, in Principles of Distributed Database Systems, Third Edition. 2011, Springer New York. p. 71-125.
[13] Bernstein, P.A., et al., Query processing in a system for distributed databases (SDD-1). ACM Trans. Database Syst., 1981. 6(4): p. 602-625.
[14] Lee, K.-H., et al., Parallel data processing with MapReduce: a survey. SIGMOD Rec., 2012. 40(4): p. 11-20.
[15] Okcan, A. and M. Riedewald, Processing theta-joins using MapReduce, in Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. 2011, ACM: Athens, Greece. p. 949-960.
[16] Shim, K., MapReduce algorithms for big data analysis, in Proceedings of the VLDB Endowment 2012, VLDB Endowment. p. 2016-2017.
[17] Olston, C., et al., Pig latin: a not-so-foreign language for data processing, in Proceedings of the 2008 ACM SIGMOD international conference on Management of data. 2008, ACM: Vancouver, Canada. p. 1099-1110.
[18] Hive, A., Theta Join. 2013.