2022 Aug 25;13(35):10486–10498. doi: 10.1039/d2sc02839e

Effect of capturing chain architecture and monomer stoichiometry information on D-MPNN performance. Values are the average R² for the prediction of IP, obtained from 10-fold cross-validation with random splits. Uncertainties are not shown because the standard error of the mean was <0.005 in all cases. Under the header “Representation”, “monomers” indicates that the model was provided with the graph structures of the separate monomer units; “chain architecture” indicates that the model was provided with information on how the monomer units may connect to one another to form an ensemble of possible sequences, via edge weights defined as shown in Fig. 2b and c; “stoichiometry” indicates that the model was provided with the monomer stoichiometry, which was used to weight the learnt node representations as shown in Fig. 2d. An extended version of this table, which also reports results for EA and includes RMSE as a second performance measure, is available in Table S1.

	Representation
Dataset	Monomers	Monomers + chain architecture	Monomers + stoichiometry	Monomers + chain architecture + stoichiometry
Original dataset	0.88	0.90	0.98	1.00
Inflated chain architecture importance	0.65	0.86	0.71	0.98
Inflated stoichiometry importance	0.26	0.27	0.97	0.99
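
The statistic reported in each cell (an average R² over 10 random cross-validation folds, with its standard error of the mean) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, an ordinary least-squares fit stands in for the D-MPNN, and the helper names `r2_score`, `mean_and_sem`, and `random_kfold_indices` are hypothetical rather than taken from the authors' code.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mean_and_sem(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(values.size)


def random_kfold_indices(n_samples, n_folds=10, seed=0):
    """Randomly partition sample indices into n_folds disjoint test sets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)


# Synthetic regression data (NOT the polymer dataset from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=500)

folds = random_kfold_indices(len(y), n_folds=10, seed=1)
fold_r2 = []
for k, test_idx in enumerate(folds):
    # Train on the other nine folds, evaluate on the held-out fold.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
    # Least-squares fit used as a stand-in for the D-MPNN model.
    coef, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
    fold_r2.append(r2_score(y[test_idx], X[test_idx] @ coef))

mean_r2, sem_r2 = mean_and_sem(fold_r2)
print(f"R2 = {mean_r2:.2f} (SEM = {sem_r2:.4f})")
```

When the standard error is below 0.005, as stated for every entry of the table, it does not affect a mean reported to two decimal places, which is why the uncertainties can be left out of the table.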