Module 1 |
Extract subset of raw Midori database for query taxon and loci |
|
Remove sequences with non-binomial species names; reduce subspecies labels to species level |
|
Add local sequences (optional) |
|
Check NCBI (GenBank and RefSeq) for relevant new sequences for the list of query species (optional) |
|
Select amplicon region and remove primers |
|
Remove sequences with ambiguous bases |
|
Align |
|
End of module: optional check of alignments |
Module 2 |
Compare sequence species labels with taxonomy |
|
Query non-matching labels against the Catalogue of Life to check for known synonyms |
|
Keep remaining mismatches if the genus already exists in the taxonomy; otherwise flag them for removal |
|
End of module: optional check of flagged species labels |
Module 3 |
Discard flagged sequences |
|
Update taxonomy key file for sequences found to be incorrectly labelled in Module 2 |
|
Run SATIVA |
|
End of module: optional check of putatively mislabelled sequences |
Module 4 |
Discard flagged sequences |
|
Finalize consensus taxonomy and relabel sequences with correct species label and accession number |
|
Select one representative sequence per haplotype per species
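
Three of the filtering steps listed above (label clean-up and ambiguous-base removal in Module 1, haplotype selection in Module 4) can be illustrated with a minimal Python sketch. This is not the pipeline's actual code. The function names, the record layout (accession, label, sequence), and the rules for what counts as a binomial name are assumptions made for illustration, and "haplotype" is approximated here as an identical sequence string within a species. Real labels (hybrids, quoted names, and so on) would need more cases, and in the pipeline itself these steps are separated by primer trimming, alignment, and taxonomy checks.

```python
import re

# Assumed naming rules (illustrative only): capitalised genus, lowercase epithet.
GENUS = re.compile(r"^[A-Z][a-z]+$")
EPITHET = re.compile(r"^[a-z][a-z-]*$")
PLACEHOLDERS = {"sp", "spp", "cf", "aff"}  # open-nomenclature terms, not species


def to_binomial(label):
    """Return 'Genus epithet' for binomial or trinomial labels, else None."""
    parts = label.split()
    if len(parts) < 2:
        return None
    genus, epithet = parts[0], parts[1]
    if not GENUS.match(genus) or not EPITHET.match(epithet):
        return None  # e.g. 'Mus sp.' fails because of the trailing dot
    if epithet in PLACEHOLDERS:
        return None
    # Anything after the second word (e.g. a subspecies name) is dropped.
    return f"{genus} {epithet}"


def has_ambiguous_bases(seq):
    """True if the unaligned sequence contains anything other than A, C, G, T."""
    return bool(set(seq.upper()) - set("ACGT"))


def one_per_haplotype(records):
    """Keep the first record seen for each (species, sequence) pair."""
    seen = set()
    kept = []
    for accession, species, seq in records:
        key = (species, seq.upper())
        if key not in seen:
            seen.add(key)
            kept.append((accession, species, seq))
    return kept


# Hypothetical toy records: (accession, species label, sequence).
records = [
    ("A1", "Mus musculus domesticus", "ACGTAC"),  # subspecies, reduced to species
    ("A2", "Mus musculus", "ACGTAC"),             # same haplotype as A1
    ("A3", "Mus sp.", "ACGTAC"),                  # non-binomial label
    ("A4", "Mus musculus", "ACGNAC"),             # ambiguous base
    ("A5", "Mus musculus", "ACGTAA"),             # second haplotype
]

cleaned = []
for acc, label, seq in records:
    species = to_binomial(label)
    if species is None or has_ambiguous_bases(seq):
        continue
    cleaned.append((acc, species, seq))

representatives = one_per_haplotype(cleaned)
print([acc for acc, _, _ in representatives])
```

In this toy input, A3 and A4 are discarded by the filters, and A2 is dropped as a duplicate haplotype of A1. Keeping the first record per haplotype is an arbitrary choice made here; the source does not state how the representative is chosen.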