2023 Dec 5;15(2):500–510. doi: 10.1039/d3sc04610a

Ternary classification accuracies of fine-tuned GPT-3 and trained GNN models for “unknown” molecules.

| Conjugated fragment | Number of molecules | HOMO accuracy (GPT-3) | HOMO accuracy (GNN) | LUMO accuracy (GPT-3) | LUMO accuracy (GNN) |
|---|---|---|---|---|---|
| Naphthalene (1) | 475 | 0.94 | 0.95 | 0.88 | 0.91 |
| Anthracene (2) | 577 | 0.99 | 1.00 | 0.93 | 0.97 |
| Tetracene (3) | 72 | 0.96 | 1.00 | 0.90 | 0.99 |
| Pyrene (4) | 237 | 0.98 | 1.00 | 0.97 | 0.99 |
| Perylene (5) | 41 | 0.98 | 1.00 | 0.98 | 0.95 |
| (1) + (2) + (3) + (4) + (5) [a] | 1402 | 0.97 | 0.98 | 0.93 | 0.95 |
| p-Benzoquinone (6) | 295 | 0.83 | 0.91 | 0.87 | 0.87 |
| 1,4-Naphthoquinone (7) | 282 | 0.82 | 0.91 | 0.94 | 0.96 |
| 9,10-Anthraquinone (8) | 186 | 0.86 | 0.91 | 0.99 | 1.00 |
| (1) + (2) + (3) + (4) + (5) + (6) + (7) + (8) [b] | 2165 | 0.88 | 0.91 | 0.95 | 0.96 |
| 1,8-Naphthalimide (9) | 85 | 0.86 | 0.93 | 1.00 | 1.00 |
| Naphthalenetetracarboxylic diimide (10) | 88 | 0.86 | 0.88 | 0.95 | 0.98 |
| Perylenetetracarboxylic diimide (11) | 76 | 0.85 | 0.89 | 0.97 | 0.97 |
| (1) + (2) + (3) + (4) + (5) + (6) + (7) + (8) + (9) + (10) + (11) [c] | 3177 | 0.88 | 0.88 | 0.97 | 0.97 |

[a] All five families of molecules were excluded from model training. The HOMO/LUMO prediction accuracies reported in this row were measured on these five families.

[b] All eight families of molecules were excluded from model training. The HOMO/LUMO prediction accuracies reported in this row were measured on families 6–8.

[c] All 11 families of molecules were excluded from model training. The HOMO/LUMO prediction accuracies reported in this row were measured on families 9–11.
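
For readers unfamiliar with the metric, the accuracies in the table are fractions of held-out molecules assigned to the correct one of three classes. The following is a minimal sketch of that calculation, not the authors' code. The function name `ternary_accuracy`, the integer class labels 0/1/2, and the example label lists are all hypothetical, and this section does not state how the three HOMO/LUMO classes are defined.

```python
def ternary_accuracy(true_labels, predicted_labels):
    """Return the fraction of molecules whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for eight held-out molecules (0, 1, 2 stand for the
# three classes; the actual class boundaries are not given in this section).
y_true = [0, 1, 2, 2, 1, 0, 1, 2]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

acc = ternary_accuracy(y_true, y_pred)
print(round(acc, 2))  # 6 of 8 predictions match -> 0.75
```

In a held-out evaluation such as the one the footnotes describe, `y_true` and `y_pred` would contain only molecules from the excluded families.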