Table 3.
Sample code for the evaluation of a multistep design pipeline
Action | Code Sample |
---|---|
Load |
import rstoolbox as rs
import matplotlib.pyplot as plt |
Read |
# With Rosetta installed, scoring can be run for a single structure
baseline = rs.io.get_sequence_and_structure('1kx8.pdb', minimize=True)
slen = len(baseline.iloc[0].get_sequence('A'))
# Pre-calculated sets can also be loaded to contextualize the data
# 70% homology filter
cath = rs.utils.load_refdata('cath', 70)
# Length in a window of 10 residues around expected design length
cath = cath[(cath['length'] >= slen - 5) & (cath['length'] <= slen + 5)]
# Designs were performed in two rounds
gen1 = rs.io.parse_rosetta_file('1kx8_gen1.designs')
gen2 = rs.io.parse_rosetta_file('1kx8_gen2.designs')
# Identifiers of selected decoys:
decoys = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6']
# Load experimental data for d2 (best performing decoy)
df_cd = rs.io.read_CD('1kx8_d2/CD', model='J-815')
df_spr = rs.io.read_SPR('1kx8_d2/SPR.data') |
Plot |
fig = plt.figure(figsize=(170 / 25.4, 170 / 25.4))
grid = (3, 4)
# Compare scores between the two generations
axs = rs.plot.multiple_distributions(gen2, fig, (3, 4),
                                     values=['score', 'hbond_bb_sc', 'hbond_sc', 'rmsd'],
                                     refdata=gen1, violins=False, showfliers=False) |
# See how the selected decoys fit into domains of similar size
qr = gen2[gen2['description'].isin(decoys)]
axs = rs.plot.plot_in_context(qr, fig, (3, 2), cath, (1, 0), ['score', 'cav_vol'])
axs[0].axvline(baseline.iloc[0]['score'], color='k', linestyle='--')
axs[1].axvline(baseline.iloc[0]['cavity'], color='k', linestyle='--') |
# Plot experimental validation data
ax = plt.subplot2grid(grid, (2, 0), fig=fig, colspan=2)
rs.plot.plot_CD(df_cd, ax, sample=7)
ax = plt.subplot2grid(grid, (2, 2), fig=fig, colspan=2)
rs.plot.plot_SPR(df_spr, ax, fitcolor='black') |
plt.tight_layout()
plt.savefig('BMC_Fig4.png', dpi=300) |
The code shows how to combine data from multiple Rosetta simulations, compare the scoring features of two design populations, and compare the final designs with the initial structure template. Code comments are presented in italics while functions from rstoolbox are highlighted in bold. Styling commands are skipped to facilitate reading, but can be found in the repository’s notebook.
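
The two selection steps in the table, the length window applied to the reference set and the lookup of the selected decoys by identifier, are ordinary boolean-mask operations on data frames. The following is a minimal sketch of both using plain pandas, without rstoolbox or Rosetta. The frames, column values, and `target_length` are hypothetical stand-ins, not data from the study.

```python
import pandas as pd

# Hypothetical stand-in for a reference set of domains (not real CATH data).
reference = pd.DataFrame({
    "length": [58, 62, 66, 71, 80],
    "score": [-120.0, -131.5, -140.2, -150.8, -171.3],
})

# Hypothetical stand-in for a population of designs.
designs = pd.DataFrame({
    "description": ["d1", "d2", "x7", "d3"],
    "score": [-145.0, -152.3, -139.9, -148.1],
})

target_length = 66            # assumed length of the template chain
decoys = ["d1", "d2", "d3"]   # assumed identifiers of selected decoys

# Keep reference entries within +/- 5 residues of the target length
# (both bounds inclusive, as with >= and <= in the table).
in_window = reference["length"].between(target_length - 5, target_length + 5)
reference_window = reference[in_window]

# Select decoys with a mask built from the same frame that is being indexed,
# so that the mask and the rows share one index.
selected = designs[designs["description"].isin(decoys)]

print(reference_window["length"].tolist())
print(selected["description"].tolist())
```

Building the mask from the frame it filters avoids relying on two different frames happening to share row labels.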