Table 2:
List of commands available in both BigSeqKit and seqkit. Commands marked with an asterisk support additional functionality not available in seqkit.
| Command | Description |
| --- | --- |
| Basic commands | |
| seq | Transform sequences (extract ID, filter by length, remove gaps, reverse complement, etc.) |
| subseq | Get subsequences by region/gtf/bed, including flanking sequences |
| stats | Simple statistics of FASTA/Q files: #seqs, min/max length, N50, Q20%, Q30%, etc. |
| faidx* | Create FASTA or FASTQ index file and extract subsequences |
| Format conversion | |
| fa2fq | Retrieve the FASTQ records that correspond to the sequences in a FASTA file |
| fq2fa | Convert FASTQ file to FASTA format |
| translate | Translate DNA/RNA to protein sequence |
| Searching | |
| grep | Search sequences by ID/name/sequence/sequence motifs |
| locate | Locate subsequences/motifs |
| Set operations | |
| sample | Sample sequences by number or proportion |
| rmdup | Remove duplicated sequences by ID/name/sequence |
| common | Find common sequences of multiple files by ID/name/sequence |
| duplicate | Duplicate sequences N times |
| head | Print first N FASTA/Q records |
| head-genome | Print sequences of the first genome with common prefixes in name |
| pair | Match up paired-end reads from two FASTQ files |
| range | Print FASTA/Q records in a range (start:end) |
| Edit | |
| concat | Concatenate sequences with the same ID from multiple files |
| replace | Replace name/sequence using a regular expression |
| rename | Rename duplicated IDs |
| Ordering | |
| sort | Sort sequences by ID/name/sequence/length |
| shuffle | Shuffle sequences |
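
To make a few of the descriptions in Table 2 concrete, the following is a minimal plain-Python sketch of what three of the listed operations compute: the reverse complement mentioned under `seq`, the N50 statistic reported by `stats`, and duplicate removal by sequence as in `rmdup`. It does not use the BigSeqKit or seqkit APIs. The function names (`reverse_complement`, `n50`, `rmdup_by_sequence`) and the in-memory `(ID, sequence)` record type are assumptions made for illustration, and the real tools may differ in details such as case handling or which duplicate is kept.

```python
from typing import List, Tuple

# Hypothetical in-memory record type: (sequence ID, sequence).
Record = Tuple[str, str]

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def n50(lengths: List[int]) -> int:
    """Return the length L such that sequences of length >= L
    together cover at least half of the total number of bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def rmdup_by_sequence(records: List[Record]) -> List[Record]:
    """Keep the first record seen for each distinct sequence
    (assumption: exact, case-sensitive comparison)."""
    seen = set()
    unique = []
    for rec_id, seq in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((rec_id, seq))
    return unique


if __name__ == "__main__":
    records = [("r1", "ACGT"), ("r2", "AACC"), ("r3", "ACGT")]
    print(reverse_complement("AACC"))
    print(n50([len(seq) for _, seq in records]))
    print(rmdup_by_sequence(records))
```

The sketch holds all records in memory for clarity. Tools aimed at large FASTA/Q files would typically stream or distribute the records instead.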