Table 1.
Noise level when training: 0.00Å/0.02Å | Modification | Number of Parameters | PDB Test Accuracy (%) | PDB Test Perplexity | AlphaFold Model Accuracy (%) |
---|---|---|---|---|---|
Baseline model | None | 1.381 mln | 41.2/40.1 | 6.51/6.77 | 41.4/41.4 |
Experiment 1 | Add N, Ca, C, Cb, O distances | 1.430 mln | 49.0/46.1 | 5.03/5.54 | 45.7/47.4 |
Experiment 2 | Update encoder edges | 1.629 mln | 43.1/42.0 | 6.12/6.37 | 43.3/43.0 |
Experiment 3 | Combine 1 and 2 | 1.678 mln | 50.5/47.3 | 4.82/5.36 | 46.3/47.9 |
Experiment 4 | Experiment 3 with random instead of forward decoding | 1.678 mln | 50.8/47.9 | 4.74/5.25 | 46.9/48.5 |
Test accuracy (percentage of correct amino acids recovered) and test perplexity (exponentiated categorical cross-entropy loss per residue) are reported for models trained on the native backbone coordinates (left, normal font) and models trained with Gaussian noise (std = 0.02 Å) added to the backbone coordinates (right, bold font); all test evaluations are with no added noise. The final column shows sequence recovery on 5,000 AlphaFold protein backbone models with average pLDDT > 80.0 randomly chosen from UniRef50 sequences.
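The two metrics defined in the caption can be illustrated with a minimal NumPy sketch. This is not the evaluation code used to produce the table; the function name `recovery_and_perplexity`, the argument layout (per-residue log-probabilities over 20 amino acid classes), and the optional `mask` are assumptions made here for illustration, and the perplexity is assumed to be the exponential of the mean per-residue negative log-likelihood.

```python
import numpy as np

def recovery_and_perplexity(log_probs, native, mask=None):
    """Hypothetical helper: sequence recovery (%) and perplexity.

    log_probs: (L, 20) array of per-residue log-probabilities (natural log).
    native:    (L,) array of integer indices of the native amino acids.
    mask:      optional (L,) boolean array selecting residues to score.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    native = np.asarray(native)
    if mask is None:
        mask = np.ones(native.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)

    # Accuracy: percentage of positions where the most likely amino acid
    # equals the native one.
    predicted = log_probs.argmax(axis=-1)
    accuracy = 100.0 * np.mean(predicted[mask] == native[mask])

    # Perplexity: exp of the mean categorical cross-entropy per residue,
    # i.e. the mean negative log-probability assigned to the native residue.
    nll = -log_probs[np.arange(native.shape[0]), native]
    perplexity = np.exp(nll[mask].mean())

    return float(accuracy), float(perplexity)

# Usage: a predictor that is uniform over the 20 amino acids.
uniform = np.log(np.full((5, 20), 1.0 / 20.0))
acc, ppl = recovery_and_perplexity(uniform, np.array([0, 1, 2, 3, 4]))
```

Under this definition a predictor that is uniform over the 20 amino acids has a perplexity of 20, which gives a reference point for the values between 4.74 and 6.77 reported in the table.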