Table 1.
Dataset | Size of dataset | n | k | Theoretical size of sequence space | Length of protein sequence |
---|---|---|---|---|---|
Cyt P450 | 242 | 8 | 3 | 6561 | 464–466 |
GLP-2 | 31 | 31 | 2 | 2.147 billion | 33 |
Enterotoxin | 12 | 40 | 2 | 1099.5 billion | 233 |
TNF | 21 | 17 | [2, 7, 4, 6, 2, 9, 9, 9, 9, 9, 2, 2, 2, 2, 6, 8, 7] | 213.3 billion | 157 |
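The theoretical size of sequence space, S, is calculated as the product of the k values over all n mutated positions. When k is the same at every position, this reduces to S = k^n (for example, 3^8 = 6561 for Cyt P450).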
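The calculation in the footnote can be checked with a few lines of Python. This is a minimal sketch, assuming that n is the number of mutated positions and that a single k value in the table applies to every position. The function name `sequence_space_size` and the dictionary `k_values` are illustrative and do not come from the original work.

```python
from math import prod

def sequence_space_size(k_per_position):
    """Return the product of the k values over all mutated positions."""
    return prod(k_per_position)

# k values per mutated position, taken from Table 1.
# For datasets with a single k, that value is repeated n times (assumption).
k_values = {
    "Cyt P450": [3] * 8,
    "GLP-2": [2] * 31,
    "Enterotoxin": [2] * 40,
    "TNF": [2, 7, 4, 6, 2, 9, 9, 9, 9, 9, 2, 2, 2, 2, 6, 8, 7],
}

sizes = {name: sequence_space_size(ks) for name, ks in k_values.items()}

for name, size in sizes.items():
    print(f"{name}: n = {len(k_values[name])}, S = {size:,}")
```

The exact products are 6,561, 2,147,483,648, 1,099,511,627,776 and 213,324,668,928, which round to the values reported in the table.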