TABLE 1.
The expanded benchmark set of protein-nucleic acid targets
| Ligand protein name | Native complex PDB | Total complex #residues | Unbound ligand PDB | Ligand conformational difference (Cα RMSD, Å) |
|---|---|---|---|---|
| Signal recognition particle 54 kDa protein | 2V3C (AM:C) | 1172 | 3NDB (C) | 13.8 |
| DNA polymerase IV | 2W9B (CE:A) | 745 | 2RDI (A) | 18.2 |
| *DNA polymerase IV | 2IMW (ST:P) | 379 | 3FDS (A) | 16.3 |
| *DNA polymerase beta | 6NKZ (DPT:A) | 366 | 1BPD (A) | 11.9 |
| 3’-5’ exoribonuclease 1 | 4QOZ (AC:B) | 674 | 1ZBH (A) | 13.4 |
| Ribonuclease E | 6G63 (B:AG) | 1997 | 5F6C (AB) | 11.5 |
| Histone H3.3 | 6NQA (ABCDFGHIJKL:E) | 1456 | 5KDM (A) | 10.1 |
| Transcription factor p65 | 2I9T (BCD:A) | 621 | 1NFI (A) | 10.4 |
| *Transcription initiation factor IIB | 1C9B (BCD:A) | 421 | 5WH1 (A) | 12.2 |
| Nuclear factor of activated T-cells, cytoplasmic 2 | 1P7H (ABM:L) | 1204 | 2AS5 (N) | 16.5 |
| *Transcriptional activator Myb | 1H89 (ABDE:C) | 337 | 1GV2 (A) | 7.1 |
| Elongation factor Tu 2 | 1OB2 (B:A) | 470 | 4ZV4 (A) | 11.8 |
| *Elongation factor Tu | 1TTT (D:A) | 482 | 1AIP (A) | 11.4 |
| Fab heavy/light chain | 2R8S (R:HL) | 592 | 6APC (HL) | 10.7 |
| *RP-A 70 kDa DNA-binding subunit | 1JMC (B:A) | 256 | 1FGU (A) | 8.3 |
| *Antiviral innate immune response receptor RIG-I | 7JL1 (XY:A) | 750 | 4ON9 (A) | 13.2 |
| *Phenylalanine-tRNA ligase, mitochondrial | 3TUP (T:A) | 491 | 5MGU (A) | 18.7 |
Asterisks (*) indicate targets also present in the original Flex-LZerD dataset. The native protein ligands of the two DNA polymerase IV entries, 2W9B and 2IMW, have 100% sequence identity, but their bound DNA molecules have 87% sequence identity over standard bases and additionally contain different nonstandard bases, with an RMSD of 5.4 Å. The native protein ligands of the two elongation factor Tu entries, 1OB2 and 1TTT, have 74% sequence identity, and their bound RNA molecules have 98% sequence identity and additionally contain different nonstandard bases, with an RMSD of 2.0 Å. The remaining targets share less than 25% sequence identity with one another. PDB, Protein Data Bank; RMSD, root-mean-square deviation.
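The sequence identity values quoted in the legend can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and that identity is counted over columns where both sequences have a residue. The legend does not state which alignment tool or denominator was used, so the function name `percent_identity`, the gap convention, and the toy sequences below are all hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are skipped,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up sequences (not taken from any PDB entry in the table).
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Other conventions, such as dividing by the length of the shorter sequence, give different percentages for the same alignment, so the denominator should be stated whenever such values are compared against a redundancy threshold like the 25% cutoff above.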