a. At the preclinical stage, drug candidates undergo toxicity assessment in animal models to characterize the dose-response relationship of the compound based on histological examination. Toxicogenomics can be employed to complement this characterization. b. Overview of TG-GATEs, which comprises 156 preclinical safety studies (and compounds) accounting for 10,234 pairs of hematoxylin and eosin (H&E) whole-slide images and gene expression profiles. TG-GATEs is split into a development set (127 studies, 8,232 slides) and a test set (29 studies, 2,002 slides). c. We developed two independent prediction models: (1) a morphological lesion prediction model (denoted Lesion classifier), which classifies 256×256-pixel (128 μm) image patches into six lesion classes, and (2) a gene expression regression model (GEESE), which predicts bulk expression of 1,536 gene targets from an input tissue section. Feature attribution lets GEESE derive patch-level expression profiles, yielding pseudo-spatially resolved expression maps. d. The resulting output forms a dataset of 25 million predicted patch-level morphology-expression pairs, which we use to infer and validate morphomolecular signatures across several scales: from patches (small regions of interest) to slides (entire tissue sections) to compounds (each of which can include dozens of slides), then across several compounds, and finally across species (rat in vivo to human in vitro).
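As a rough consistency check on the figures quoted in panels b and c, the minimal sketch below recomputes the split totals and the pixel size implied by a 256-pixel patch spanning 128 μm. It is not the authors' code: the helper `patch_grid`, the non-overlapping tiling, and the example region size are assumptions made purely for illustration.

```python
# Illustrative sketch only. The names, the non-overlapping tiling, and the
# example region size are assumptions, not details of the published pipeline.

# Split sizes as stated in panel b.
DEV = {"studies": 127, "slides": 8232}
TEST = {"studies": 29, "slides": 2002}

total_studies = DEV["studies"] + TEST["studies"]
total_slides = DEV["slides"] + TEST["slides"]

# Patch geometry as stated in panel c: 256 pixels span 128 micrometres.
PATCH_PX = 256
PATCH_UM = 128.0
um_per_px = PATCH_UM / PATCH_PX


def patch_grid(width_px: int, height_px: int, patch_px: int = PATCH_PX) -> tuple[int, int]:
    """Return (columns, rows) of full, non-overlapping patches in a region."""
    return width_px // patch_px, height_px // patch_px


if __name__ == "__main__":
    print(total_studies, total_slides)  # 156 10234
    print(um_per_px)                    # 0.5
    # Hypothetical 40,000 x 30,000 pixel tissue region.
    cols, rows = patch_grid(40_000, 30_000)
    print(cols, rows, cols * rows)      # 156 117 18252
```

The totals match the 156 studies and 10,234 slides quoted for the full dataset, and the patch geometry corresponds to 0.5 μm per pixel. Partial patches at the region border are simply dropped in this sketch.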