Algorithm 2 Cascade-Convergence-based Word Count

Input: S: the continuous string-format "sensor" data.
Output: The word count result.

 1: function Pairify(S)
 2:
 3:     for each word do    ▹ Restructuring data into <key, value> pairs and storing them.
 4:
 5:     end for
 6:     return P
 7: end function
 8:
 9: function MapReduce(F list)
10:     procedure Map(F)
11:         for each line do    ▹ Splitting the file into <key, value> pairs.
12:             Parse l into
13:             EmitIntermediate
14:         end for
15:     end procedure
16:     procedure Reduce(key, value_array)
17:
18:         for each value in value_array do    ▹ Counting the number of a particular word key.
19:
20:         end for
21:         Emit
22:     end procedure
23:     return list
24: end function
25:
26: while receiving S do
27:     repeat
28:         ▹ F is the data block to be cached.
29:         repeat
30:             Pairify(S)
31:         until reaching the threshold size of data block
32:     until having an F list
33:     for each non-end MapReduce converger do
34:         ▹ F list is for the outermost MapReduce convergers,
35:         ▹ while intermediateResult list is for the other intermediate MapReduce convergers.
36:         intermediateResult ← MapReduce(F list or intermediateResult list)
37:     end for
38: end while
39: MapReduce(final intermediateResult list)    ▹ Delivering final result by the end MapReduce converger.
40: return result
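The control flow of Algorithm 2 can be illustrated with a minimal single-process Python sketch. It is not the implementation evaluated here, only an approximation under stated assumptions: each word is paired with the value 1, the cascade is collapsed to two tiers (outermost convergers plus one end converger), and the names `pairify`, `map_reduce`, `cascade_word_count`, `block_size`, and `list_size` are hypothetical, as are the default thresholds.

```python
from collections import defaultdict
from typing import Iterable

Pair = tuple[str, int]   # one <key, value> pair
Block = list[Pair]       # one cached data block F (assumed representation)


def pairify(text: str) -> Block:
    """Restructure a chunk of string data into <word, 1> pairs (assumed value of 1)."""
    return [(word, 1) for word in text.split()]


def map_reduce(blocks: Iterable[Block]) -> Block:
    """Run one word-count MapReduce pass over a list of blocks."""
    grouped: dict[str, list[int]] = defaultdict(list)
    # Map: emit every <key, value> pair and group the values by key.
    for block in blocks:
        for key, value in block:
            grouped[key].append(value)
    # Reduce: sum the values collected for each key.
    return [(key, sum(values)) for key, values in grouped.items()]


def cascade_word_count(
    stream: Iterable[str], block_size: int = 4, list_size: int = 2
) -> dict[str, int]:
    """Two-tier cascade: outermost convergers feed a single end converger."""
    block: Block = []               # F: the data block being cached
    f_list: list[Block] = []        # F list awaiting an outermost converger
    intermediate: list[Block] = []  # results awaiting the end converger

    for chunk in stream:                      # "while receiving S"
        for pair in pairify(chunk):
            block.append(pair)
            if len(block) >= block_size:      # threshold size of the data block
                f_list.append(block)
                block = []
            if len(f_list) >= list_size:      # an F list is ready
                intermediate.append(map_reduce(f_list))
                f_list = []

    # Flush whatever is still cached once the stream ends.
    if block:
        f_list.append(block)
    if f_list:
        intermediate.append(map_reduce(f_list))

    # End converger: merge all intermediate results into the final count.
    return dict(map_reduce(intermediate))
```

For example, `cascade_word_count(["a b a", "c a b", "b b c a"])` returns `{"a": 4, "b": 4, "c": 2}`. Because the intermediate results are themselves lists of `<key, value>` pairs, the same `map_reduce` function serves both the outermost and the end converger, which is the property that lets further intermediate tiers be inserted between them.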