| First step |
| Inputs: X, window sizes $w_d$, $w_m$, and $w_f$, threshold $f_{\mathrm{THRESH}}$, sequence data S |
| Outputs: SUSPECT, the set of candidate indel locations |
| For $k = w_f + 1, \ldots, T - w_f$, where $T$ is the length of the reference sequence, compute the function $f(k)$ as defined above. |
| Let SUSPECT ← $\{k : f(k) > f_{\mathrm{THRESH}}\}$ |
| Second step |
| Inputs: X, SUSPECT, S0, M |
| Outputs: INDEL, the set of indel locations |
| INDEL ← EMPTY |
| For all k ∈ SUSPECT do |
| testseq ← |
| For all $y_i$ ∈ S do |
| Align $y_i$ and testseq using the Smith-Waterman (SW) algorithm. |
| end for |
| if more than M of the $y_i$ align to testseq with fewer than 2 mismatches and with a shared indel interval [p, q], then |
| INDEL ← INDEL ∪ {[p, q]} |
| end if |
| end for |
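
The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function f is passed in as a callable because its definition lies outside this box. The construction of testseq is not given here, so it is supplied through a hypothetical `make_testseq` callable. The Smith-Waterman aligner is likewise abstracted as a hypothetical `align` callable that returns a mismatch count and an indel interval (or `None`). "Shared indel interval" is read here as "several reads report the same interval [p, q]", which is an interpretation rather than something the pseudocode states.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Set, Tuple

Interval = Tuple[int, int]


def first_step(f: Callable[[int], float], T: int, w_f: int,
               f_thresh: float) -> Set[int]:
    """Return SUSPECT = {k : f(k) > f_thresh} for k = w_f + 1, ..., T - w_f."""
    return {k for k in range(w_f + 1, T - w_f + 1) if f(k) > f_thresh}


def second_step(
    suspect: Iterable[int],
    reads: Iterable[str],
    M: int,
    make_testseq: Callable[[int], str],  # hypothetical: builds testseq for k
    align: Callable[[str, str], Tuple[int, Optional[Interval]]],  # hypothetical SW wrapper
) -> Set[Interval]:
    """Return INDEL, the set of intervals supported by more than M reads."""
    reads = list(reads)
    indel: Set[Interval] = set()
    for k in sorted(suspect):
        testseq = make_testseq(k)
        votes: Counter = Counter()
        for y in reads:
            mismatches, interval = align(y, testseq)
            # Count only reads with fewer than 2 mismatches and an indel.
            if mismatches < 2 and interval is not None:
                votes[interval] += 1
        # Assumption: "shared" means the reads agree on the same [p, q].
        for interval, count in votes.items():
            if count > M:
                indel.add(interval)
    return indel


# Toy data, invented for illustration only.
f_values = {k: 0.0 for k in range(1, 21)}
f_values[7] = 3.5
toy_reads = ["ACGT", "ACGA", "ACGT", "TTTT"]


def toy_make_testseq(k: int) -> str:
    # Stand-in: the real testseq definition is not given in the box above.
    return "ACGT"


def toy_align(read: str, testseq: str) -> Tuple[int, Optional[Interval]]:
    # Stand-in for Smith-Waterman: position-wise mismatch count, fixed interval.
    mismatches = sum(a != b for a, b in zip(read, testseq))
    return mismatches, ((7, 9) if mismatches < 2 else None)


suspect = first_step(f_values.get, T=20, w_f=2, f_thresh=1.0)
indel = second_step(suspect, toy_reads, M=2,
                    make_testseq=toy_make_testseq, align=toy_align)
print(suspect, indel)
```

Passing the aligner as a callable keeps the voting logic separate from the alignment itself, so a real local aligner can be substituted for `toy_align` without changing either step.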