2023 Jul 31;12:giad062. doi: 10.1093/gigascience/giad062

Table 2:

List of commands included in both BigSeqKit and seqkit. Commands marked with an asterisk support new functionality not included in seqkit.

Command        Description

Basic commands
  seq          Transform sequences (extract ID, filter by length, remove gaps, reverse complement, etc.)
  subseq       Get subsequences by region/gtf/bed, including flanking sequences
  stats        Simple statistics of FASTA/Q files: #seqs, min/max length, N50, Q20%, Q30%, etc.
  faidx*       Create FASTA or FASTQ index file and extract subsequences

Format conversion
  fa2fq        Retrieve corresponding FASTQ records by a FASTA file
  fq2fa        Convert FASTQ file to FASTA format
  translate    Translate DNA/RNA to protein sequence

Searching
  grep         Search sequences by ID/name/sequence/sequence motifs
  locate       Locate subsequences/motifs

Set operations
  sample       Sample sequences by number or proportion
  rmdup        Remove duplicated sequences by ID/name/sequence
  common       Find common sequences of multiple files by ID/name/sequence
  duplicate    Duplicate sequences N times
  head         Print first N FASTA/Q records
  head-genome  Print sequences of the first genome with common prefixes in name
  pair         Match up paired-end reads from 2 FASTQ files
  range        Print FASTA/Q records in a range (start:end)

Edit
  concat       Concatenate sequences with the same ID from multiple files
  replace      Replace name/sequence using a regular expression
  rename       Rename duplicated IDs

Ordering
  sort         Sort sequences by ID/name/sequence/length
  shuffle      Shuffle sequences
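
To make a few of the table entries concrete, the following minimal Python sketch shows the kind of computation behind the reverse-complement option of seq, the N50 value reported by stats, duplicate removal by sequence (rmdup), and sorting by length (sort). It is an illustration only and is not the BigSeqKit or seqkit implementation. The function names, the in-memory (ID, sequence) record type, the DNA-only alphabet, and the keep-the-first-occurrence rule for duplicates are all assumptions made for this example.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical in-memory record: (ID, sequence). The real tools stream FASTA/Q files.
Record = Tuple[str, str]

# DNA-only complement table (assumption: no RNA or IUPAC ambiguity codes besides N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L cover at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def rmdup_by_sequence(records: Iterable[Record]) -> Iterator[Record]:
    """Yield only the first record seen for each distinct sequence (case-sensitive)."""
    seen = set()
    for rec_id, seq in records:
        if seq not in seen:
            seen.add(seq)
            yield rec_id, seq


def sort_by_length(records: Iterable[Record]) -> List[Record]:
    """Return records ordered from shortest to longest sequence (stable sort)."""
    return sorted(records, key=lambda rec: len(rec[1]))


if __name__ == "__main__":
    reads = [("r1", "ACGTAC"), ("r2", "GG"), ("r3", "ACGTAC"), ("r4", "TTTA")]
    unique = list(rmdup_by_sequence(reads))
    print(unique)
    print(sort_by_length(unique))
    print(reverse_complement("AAGC"))
    print(n50(len(seq) for _, seq in unique))
```

All four helpers hold their input in memory on a single machine, which is adequate for small files. Deduplication and sorting need a view of every record, so on large inputs they would have to be done out of core or across several nodes.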