PeerJ. 2018 May 4;6:e4750. doi: 10.7717/peerj.4750

Algorithm 1. Summary of the preprocessing steps applied to each enzyme at training time.

Data: N training enzymes, grid size l, homothetic transformation ratio λ, p interpolation points, probability p_flip of flipping along each axis.
Input: Raw coordinates contained in PDB files
Output: Volumes of binary voxels representing backbone atom occupancy
foreach of the N enzymes of the training set do
  Step 1: structural information extraction
    Extract the coordinates of the backbone atoms from its PDB file
  Step 2: hole completion
    Interpolate p new points between consecutive backbone atoms
  Step 3: size adjustment
    Center the barycenter S of the coordinates on (0, 0, 0)
    Apply a homothetic transformation (uniform scaling) with center S and ratio λ to each point
  Step 4: enzyme orientation
    Apply a principal component analysis (PCA) transformation
  Step 5: random augmentation
    with probability p_flip do
      Flip coordinates with respect to the origin along the x-axis
    with probability p_flip do
      Flip coordinates with respect to the origin along the y-axis
    with probability p_flip do
      Flip coordinates with respect to the origin along the z-axis
  Step 6: voxelization
    Center the barycenter S of the coordinates on (l/2, l/2, l/2)
    Transform coordinate points into binary voxels
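
Steps 2 to 6 of Algorithm 1 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `preprocess_enzyme`, the default parameter values, the choice of linear interpolation, the use of an SVD to obtain the principal axes, and the decision to drop points that fall outside the grid are all assumptions made for illustration. Step 1 (parsing the PDB file) is assumed to have been done already, so the input is an (n, 3) array of backbone atom coordinates.

```python
import numpy as np


def preprocess_enzyme(coords, l=32, lam=1.0, p=5, p_flip=0.2, rng=None):
    """Turn (n, 3) backbone coordinates into an l x l x l binary occupancy grid.

    Hypothetical helper; default values are placeholders, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    coords = np.asarray(coords, dtype=float)

    # Step 2: insert p evenly spaced points between consecutive atoms
    # (linear interpolation is an assumption).
    t = np.linspace(0.0, 1.0, p + 2)[:-1]          # 0, 1/(p+1), ..., p/(p+1)
    deltas = coords[1:] - coords[:-1]               # (n-1, 3)
    segments = coords[:-1, None, :] + t[None, :, None] * deltas[:, None, :]
    pts = np.vstack([segments.reshape(-1, 3), coords[-1:]])

    # Step 3: move the barycenter to the origin, then scale by lambda.
    pts = (pts - pts.mean(axis=0)) * lam

    # Step 4: rotate onto the principal axes (PCA via SVD of the centered points).
    _, _, vt = np.linalg.svd(pts, full_matrices=False)
    pts = pts @ vt.T

    # Step 5: independently negate each coordinate axis with probability p_flip.
    for axis in range(3):
        if rng.random() < p_flip:
            pts[:, axis] *= -1.0

    # Step 6: shift the barycenter to (l/2, l/2, l/2) and mark occupied voxels.
    idx = np.floor(pts + l / 2).astype(int)
    inside = np.all((idx >= 0) & (idx < l), axis=1)  # assumption: drop outliers
    grid = np.zeros((l, l, l), dtype=np.uint8)
    grid[tuple(idx[inside].T)] = 1
    return grid
```

Calling `preprocess_enzyme(coords, l=32, p_flip=0.0)` on a small coordinate array returns a deterministic grid, which is convenient for checking the geometry before enabling the random flips.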