| Algorithm 1: Data Collecting Method (GetDataSet) |
| Input: |
| TA: The task attempt |
| TN: The name of the task attempt |
| P: The progress of a running task |
| TT: The type of the running task |
| DM: The data map storing the data lists of all currently running tasks, keyed by TN with DL as the value |
| DL: The data list storing the historical progress information (HisPro) generated by a running task |
| α: The progress threshold above which the data collected for a task is saved to HDFS |
| Steps: |
| For each TA in the task pool |
|     If the current TA is running |
|         Get the current HisPro |
|         Get the DL from DM according to TN |
|         If DL does not contain HisPro |
|             Add HisPro to the DL |
|         Else |
|             Update the DL using HisPro |
|         EndIf |
|         Update the DL in DM |
|     EndIf |
|     If P > α |
|         savetoHDFS(TA, DL, TT) |
|     EndIf |
| EndFor |
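
The steps of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that are not stated in the algorithm: a HisPro record is modelled as a `(stage, progress)` pair whose `stage` field decides whether the DL already contains it, the threshold α is given the placeholder value `ALPHA = 0.9`, and `savetoHDFS` is replaced by a caller-supplied `save` callback rather than a real HDFS client. The class names `HisPro` and `TaskAttempt` and the function `get_data_set` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

ALPHA = 0.9  # placeholder for the threshold α; the algorithm does not give a value


@dataclass
class HisPro:
    stage: str       # assumed key identifying the record (e.g. a task phase)
    progress: float  # progress observed for that stage


@dataclass
class TaskAttempt:
    name: str        # TN
    task_type: str   # TT
    progress: float  # P
    stage: str       # assumed: the stage the attempt is currently in
    running: bool = True


def get_data_set(
    task_pool: List[TaskAttempt],
    dm: Dict[str, List[HisPro]],
    save: Callable[[TaskAttempt, List[HisPro], str], None],
    alpha: float = ALPHA,
) -> None:
    """One pass over the task pool, following the steps of Algorithm 1."""
    for ta in task_pool:
        # Fetch the DL up front so that it is also defined for the
        # threshold check when the attempt is not running (an assumption).
        dl = dm.get(ta.name, [])
        if ta.running:
            his_pro = HisPro(ta.stage, ta.progress)
            for i, record in enumerate(dl):
                if record.stage == his_pro.stage:
                    dl[i] = his_pro      # DL contains HisPro: update it
                    break
            else:
                dl.append(his_pro)       # DL does not contain HisPro: add it
            dm[ta.name] = dl             # update the DL in DM
        if ta.progress > alpha:
            save(ta, dl, ta.task_type)   # stand-in for savetoHDFS(TA, DL, TT)


# Usage: two running attempts, only one of which exceeds the threshold.
dm: Dict[str, List[HisPro]] = {}
saved = []
pool = [
    TaskAttempt("attempt_1", "MAP", 0.5, "map"),
    TaskAttempt("attempt_2", "REDUCE", 0.95, "sort"),
]
get_data_set(pool, dm, lambda ta, dl, tt: saved.append((ta.name, len(dl), tt)))
```

Passing the save step in as a callback keeps the sketch independent of any particular HDFS client and makes the collection logic easy to exercise in isolation.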