Author manuscript; available in PMC: 2024 Mar 19.
Published in final edited form as: Nature. 2023 May 31;618(7965):616–624. doi: 10.1038/s41586-023-06139-9

Fig. 4 | Geneformer encoded gene network hierarchy.

a, ROC curve of Geneformer fine-tuned to distinguish central versus peripheral genes within the N1-dependent gene network using limited data (~30K ECs), compared to alternative methods. b, ROC curve of Geneformer fine-tuned to distinguish N1-activated versus non-target genes using limited data (~30K ECs), compared to alternative methods. c, ROC curve of Geneformer fine-tuned to distinguish central versus peripheral genes within the N1-dependent gene network using increasingly limited data (1K-30K ECs). d, ROC curve of Geneformer fine-tuned to distinguish central versus peripheral genes within the N1-dependent gene network using increasingly limited but more relevant data (884 ECs from healthy or dilated aortas). The AUC was higher than that of alternative methods trained on the larger dataset of ~30K ECs (Fig. 3a). e, Attention weights of the pretrained Geneformer indicated that the model learned the relative importance of transcription factors in a completely self-supervised way: transcription factors were more highly attended than other genes in 20% of attention heads (p<0.05, Wilcoxon rank sum, FDR correction) and were more highly attended in earlier layers (p<0.05, Wilcoxon rank sum). (Alternative methods described in Fig. 2.)
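The per-head comparison in panel e can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a one-sided Wilcoxon rank-sum test using the normal approximation (without tie correction of the variance) and a Benjamini-Hochberg procedure for the FDR correction; the caption specifies neither choice. The function names and the input layout (one list of attention weights per head for transcription factors and for other genes) are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks of `values`, averaging ranks across ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_greater(x, y):
    """One-sided rank-sum p-value for 'x tends to be larger than y'.

    Assumption: normal approximation, no tie correction of the variance.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 1.0 - NormalDist().cdf((w - mean) / sd)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=pvals.__getitem__)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def fraction_significant_heads(tf_by_head, other_by_head, alpha=0.05):
    """Fraction of heads where TFs are more attended after FDR correction.

    `tf_by_head[h]` and `other_by_head[h]` are hypothetical inputs: the
    attention weights received by TF genes and by other genes in head h.
    """
    pvals = [rank_sum_greater(tf, other)
             for tf, other in zip(tf_by_head, other_by_head)]
    adjusted = benjamini_hochberg(pvals)
    return sum(p < alpha for p in adjusted) / len(adjusted)
```

Calling `fraction_significant_heads` on per-head attention weights would yield a quantity analogous to the 20% reported in panel e, under the assumptions stated above.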