Author manuscript; available in PMC: 2023 Mar 9.
Published in final edited form as: Science. 2022 Sep 15;378(6615):49–56. doi: 10.1126/science.add2187

Table 1.

Single-chain sequence design performance on the CATH held-out test split.

| Noise level when training: 0.00 Å/0.02 Å | Modification | Number of parameters (millions) | PDB test accuracy (%) | PDB test perplexity | AlphaFold model accuracy (%) |
|---|---|---|---|---|---|
| Baseline model | None | 1.381 | 41.2/40.1 | 6.51/6.77 | 41.4/41.4 |
| Experiment 1 | Add N, Ca, C, Cb, O distances | 1.430 | 49.0/46.1 | 5.03/5.54 | 45.7/47.4 |
| Experiment 2 | Update encoder edges | 1.629 | 43.1/42.0 | 6.12/6.37 | 43.3/43.0 |
| Experiment 3 | Combine 1 and 2 | 1.678 | 50.5/47.3 | 4.82/5.36 | 46.3/47.9 |
| Experiment 4 | Experiment 3 with random instead of forward decoding | 1.678 | 50.8/47.9 | 4.74/5.25 | 46.9/48.5 |

Test accuracy (percentage of correct amino acids recovered) and test perplexity (exponentiated categorical cross-entropy loss per residue) are reported for models trained on the native backbone coordinates (left, normal font) and for models trained with Gaussian noise (std = 0.02 Å) added to the backbone coordinates (right, bold font); all test evaluations use no added noise. The final column shows sequence recovery on 5,000 AlphaFold protein backbone models with average pLDDT > 80.0, randomly chosen from UniRef50 sequences.
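
The two metrics and the training-time noise described in the note can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`sequence_recovery`, `perplexity`, `add_gaussian_noise`) and the input formats (plain strings, lists of probabilities, lists of coordinate tuples) are assumptions made for illustration only.

```python
import math
import random


def sequence_recovery(native, designed):
    """Percentage of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(native, designed))
    return 100.0 * matches / len(native)


def perplexity(native_probs):
    """Exponentiated mean categorical cross entropy per residue.

    native_probs[i] is the probability the model assigns to the native
    amino acid at position i (hypothetical input format).
    """
    mean_nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(mean_nll)


def add_gaussian_noise(coords, std=0.02, rng=None):
    """Add independent Gaussian noise (std in angstroms) to each x, y, z value."""
    rng = rng or random.Random()
    return [tuple(x + rng.gauss(0.0, std) for x in atom) for atom in coords]


# Toy usage: 3 of 4 residues recovered; uniform prediction over 20 amino acids.
recovery = sequence_recovery("ACDE", "ACDF")
uniform_ppl = perplexity([1.0 / 20] * 10)
noisy = add_gaussian_noise([(0.0, 1.0, 2.0)], std=0.02, rng=random.Random(0))
```

For orientation, a model that assigns a uniform probability of 1/20 to every amino acid has a perplexity of 20 under this definition, so lower values indicate more confident, more accurate predictions of the native residue.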