2023 Oct 16;21:728. doi: 10.1186/s12967-023-04576-8

Table 1.

List of converging functional themes identified by different LLMs for a co-expression gene set (M9.2) from the BloodGen3 repertoire, associated with circulating erythroid cells: ALAS2, BCL2L1, BPGM, C14ORF45, C1ORF128, CA1, EPB42, FAM46C, FECH, GMPR, GPR146, GYPB, GYPE, IFITL1, KRT1, MARCH8, NFIX, OR2W3, OSBP2, PLEK2, SELENBP1, SLC14A1, SLC4A1, SNCA, TMCC2, TMOD1, TNS1, TRIM10, TUBB2A, XK

| Model | Functional Themes | Associated genes | LLMs’ Comments |
|---|---|---|---|
| GPT3.5 | Hemoglobin synthesis and red blood cell function | ALAS2, CA1, EPB42, FECH, GYPB, GYPE, SLC4A1 | These genes are involved in processes related to hemoglobin synthesis, red blood cell membrane stability, and carbon dioxide transport |
| GPT3.5 | Enzymes and metabolic processes | BPGM, FECH, GMPR | These genes encode enzymes involved in glycolysis, heme biosynthesis, and purine nucleotide biosynthesis, respectively |
| GPT3.5 | Cytoskeletal proteins | KRT1, TUBB2A | These genes encode proteins involved in maintaining the structure and function of the cytoskeleton |
| GPT-4 | Erythrocyte development and function | ALAS2, BPGM, EPB42, GYPB, GYPE, SLC4A1, XK, FECH | These genes are all involved in processes related to erythrocyte (red blood cell) development, function, or associated conditions. For example, ALAS2 and FECH are involved in heme biosynthesis, a crucial component of hemoglobin. BPGM, EPB42, SLC4A1, GYPB, GYPE, and XK are directly involved in erythrocyte structure or function, and mutations in these genes can lead to conditions like hereditary spherocytosis, elliptocytosis, or blood group antigen variations |
| GPT-4 | Cytoskeleton and cellular structure | TNS1, TMOD1, TUBB2A | These genes are all involved in cytoskeleton formation and function, which is essential for cell shape, division, and movement |
| GPT-4 | Ubiquitin–proteasome system | MARCH8, TRIM10 | These genes are involved in the ubiquitin–proteasome system, which is essential for protein degradation, cell cycle regulation, and other crucial cellular functions |
| Claude | Erythrocyte biology or iron/heme metabolism | ALAS2, FECH, SLC4A1 | ALAS2 and FECH are both involved in porphyrin/heme biosynthesis. SLC4A1 encodes a major anion exchanger in erythrocytes |
| Claude | Cytoskeletal or cytoskeleton-associated genes | EPB42, KRT1, TMOD1, TNS1, TUBB2A | These genes encode cytoskeletal proteins like keratins, tubulins, actins |
| Claude | Protein binding, degradation or trafficking | MARCH8, TRIM10 | MARCH8 is an E3 ubiquitin ligase involved in protein degradation. TRIM10 is a RING-finger E3 ubiquitin ligase as well that regulates trafficking of GLUT4 glucose transporters |
| Claude | Interferon response | IFITL1, SELENBP1 | IFITL1 and SELENBP1 are both interferon-stimulated genes, though they have different functions |
| Claude | Phospholipid metabolism or vesicle trafficking | OSBP2, PLEK2 | OSBP2 encodes an oxysterol binding protein that regulates lipid metabolism. PLEK2 is involved in regulating vesicle transport |
| Claude | GPCR biology | GPR146, SNCA | GPR146 is an orphan GPCR. SNCA encodes alpha-synuclein, which is involved in dopamine neurotransmission and signaling |
| Bard | Production of Heme | ALAS2, FECH | ALAS2 and FECH are both involved in the production of heme, which is an essential molecule for many cellular processes |
| Bard | Glutathione synthesis | BPGM, GYPB | BPGM and GYPB are both involved in the synthesis of glutathione, which is an antioxidant that protects cells from damage |
| Bard | Cell signaling | GPR146, OR2W3 | GPR146 and OR2W3 are both G protein-coupled receptors that bind to specific ligands and signal to the cell |
| Bard | Solute carrier genes | SLC14A1, SLC4A1 | SLC14A1 and SLC4A1 are both solute carrier genes that transport essential molecules across cell membranes |
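
One way to inspect how far the four models converge is to tally, for each gene, how many models assigned it to any theme. The sketch below is illustrative only and is not the authors' method: the gene lists are transcribed from Table 1, while the dictionary `TABLE_1` and the helper names `models_per_gene` and `genes_named_by_at_least` are hypothetical. Counting shared genes is a crude proxy, since two models can place the same gene under differently worded themes.

```python
from collections import defaultdict

# Gene assignments transcribed from Table 1 (model -> theme -> genes).
# The data structure itself is an assumption made for this illustration.
TABLE_1 = {
    "GPT3.5": {
        "Hemoglobin synthesis and red blood cell function":
            ["ALAS2", "CA1", "EPB42", "FECH", "GYPB", "GYPE", "SLC4A1"],
        "Enzymes and metabolic processes": ["BPGM", "FECH", "GMPR"],
        "Cytoskeletal proteins": ["KRT1", "TUBB2A"],
    },
    "GPT-4": {
        "Erythrocyte development and function":
            ["ALAS2", "BPGM", "EPB42", "GYPB", "GYPE", "SLC4A1", "XK", "FECH"],
        "Cytoskeleton and cellular structure": ["TNS1", "TMOD1", "TUBB2A"],
        "Ubiquitin–proteasome system": ["MARCH8", "TRIM10"],
    },
    "Claude": {
        "Erythrocyte biology or iron/heme metabolism": ["ALAS2", "FECH", "SLC4A1"],
        "Cytoskeletal or cytoskeleton-associated genes":
            ["EPB42", "KRT1", "TMOD1", "TNS1", "TUBB2A"],
        "Protein binding, degradation or trafficking": ["MARCH8", "TRIM10"],
        "Interferon response": ["IFITL1", "SELENBP1"],
        "Phospholipid metabolism or vesicle trafficking": ["OSBP2", "PLEK2"],
        "GPCR biology": ["GPR146", "SNCA"],
    },
    "Bard": {
        "Production of Heme": ["ALAS2", "FECH"],
        "Glutathione synthesis": ["BPGM", "GYPB"],
        "Cell signaling": ["GPR146", "OR2W3"],
        "Solute carrier genes": ["SLC14A1", "SLC4A1"],
    },
}


def models_per_gene(table):
    """Map each gene to the set of models that assigned it to any theme."""
    seen = defaultdict(set)
    for model, themes in table.items():
        for genes in themes.values():
            for gene in genes:
                seen[gene].add(model)
    return dict(seen)


def genes_named_by_at_least(table, min_models):
    """Return, sorted by name, the genes assigned by at least `min_models` models."""
    counts = models_per_gene(table)
    return sorted(g for g, models in counts.items() if len(models) >= min_models)


if __name__ == "__main__":
    for threshold in (4, 3):
        print(threshold, genes_named_by_at_least(TABLE_1, threshold))
```

Calling `genes_named_by_at_least(TABLE_1, 4)` lists the genes that every model placed under some theme; lowering the threshold to 3 widens the set to genes picked up by any three of the four models.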