BMC Genomics. 2025 Dec 13;27:59. doi: 10.1186/s12864-025-12427-7

Transcriptomes of thirteen healthy feline tissue types

Jessica J Hayward 1,2, Susan Garrison 2, Isabel Hernandez 2, Teresa Southard 1, Lin Lin 2, Jennifer K Grenier 3, Faraz Ahmed 3, Sierra Corey 2, Elizabeth Wilcox 2, Michael Boyle 4, Michelle E White 1, Marta G Castelhano 2,5, Leslie A Lyons 6, Rory J Todhunter 5
PMCID: PMC12817795  PMID: 41390362

Abstract

Background

The domestic cat, Felis catus, is a popular worldwide pet but is prone to many genetic diseases in both purebred and random-bred cats. Many of these genetic diseases can serve as important feline models of cognate human diseases. To help understand the fundamental basis of some of these diseases and because healthy tissue sequence is difficult to obtain, we provide a resource of transcriptome sequences from thirteen healthy feline tissue types (adipose, brain cortex, hip joint capsule, cardiac, renal cortex, renal medulla, ileum, jejunum, liver, pancreas, round ligament of the femoral head, skin, and thyroid gland). These represent the tissue types affected with common complex diseases of cats diagnosed at the Cornell University Companion Animal Hospital.

Results

Postmortem tissue specimens were sampled from six healthy cats, for a total of 78 samples. Both mRNA-enriched libraries by poly(A) selection and rRNA-depleted libraries were created for sequencing. These produced transcriptome datasets of high quality, as shown by clustering of samples from the same tissue type in a Principal Components Analysis, by comparison of enriched genes to human tissue in the Genotype-Tissue Expression portal, and by high levels of expected tissue-specific gene expression with relevant pathway enrichment.

Conclusions

We make this dataset publicly available to expand the genomic resources available for future domestic cat research, such as functional studies to confirm variants for complex diseases discovered by genome-wide association studies, and for improving the annotation of the feline genome. Future research could leverage this dataset to explore gene expression profiles of disease-associated genomic variants and their interactions with environmental factors that influence both feline and human health.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12864-025-12427-7.

Keywords: Domestic cat, Felis catus, RNA-seq, Healthy control tissue, mRNA enrichment, rRNA depletion, Transcriptome

Background

Domestic cats serve as valuable models for human disease research, with 463 total traits (disease and non-disease), including 289 potential models of similar human diseases [1]. Among these total traits, 149 exhibit single-gene inheritance patterns, while two-thirds remain with undefined modes of inheritance that likely involve complex genetic mechanisms. For diseases characterized by autosomal recessive or dominant inheritance patterns, genetic tests can identify carriers to direct breeding strategies. Recently, whole genome sequencing has been applied to identify causal variants of feline genetic diseases that are models of similar human diseases, for example Ehlers-Danlos syndrome [2] and sebaceous gland dysplasia [3]. This success was facilitated by the 99 Lives Consortium (http://felinegenetics.missouri.edu/99lives), which pools whole genome sequence resources for access by the feline genetics community.

The genomic infrastructure supporting variant discovery in the domestic cat continues to improve. Buckley and colleagues (2020) produced a long read domestic cat genome, Felis_catus_9.0, to fill the 300,000 gaps in the previous genome sequence [4]. A highly-continuous haploid genome sequence, Fca126, was developed from the DNA of a female F1 Safari cat that had a Geoffroy’s cat sire and a domestic cat dam [5]. The improved reference assemblies have enabled the development of Precision Medicine techniques in cats through whole genome sequencing (WGS) and whole-exome sequencing (WES/WXS), which are likely to improve the diagnosis of rare diseases in veterinary clinical practice [6] and support the identification of tumor-drivers in neoplastic conditions. However, as found in humans, the success rate for WGS/WES techniques to identify causal variants remains below 50%. Given that the cat now possesses a robust, nearly telomere-to-telomere genome reference, poor annotation likely represents the primary weakness in cat Precision Medicine efforts. Complementary genomic approaches like RNA sequencing technology are being used to identify biomarkers and checkpoint targets in feline diseased tissues including oral squamous cell carcinoma [7, 8]. Furthermore, RNA sequencing has been employed to test the effect of a short duration of oral rapamycin in cats with hypertrophic cardiomyopathy [9].

Random-bred domestic cats are commonly admitted to the Cornell University Companion Animal Hospital with diseases that have complex inheritance patterns. One approach to discover the variants underlying these diseases is through genome-wide association studies (GWAS), for which there is no a priori hypothesis about the causal variants. This approach has proven successful for mapping feline single-gene disorders, for example, hypokalemia in Burmese cats [10], where subsequent sequencing of associated regions has led to identification of causal variants. Variant identification in random-bred cats, such as domestic shorthairs, presents additional challenges because the genetic background upon which the variants are expressed creates conditions where linkage disequilibrium decay occurs more rapidly than in purebred cats [11]. To date, genetic mapping of complex diseases has proven more difficult than mapping diseases with simple inheritance, even in purebred cat populations [12].

The most common complex diseases among cats admitted to our hospital and represented by DNA samples in the Cornell Veterinary Biobank are hypertrophic cardiomyopathy, hyperthyroidism, diabetes mellitus, gingivostomatitis, chronic kidney disease, chronic enteropathies, and small cell alimentary lymphoma. A complicating factor in identifying the causal genes of these common diseases is the substantial number of cases and controls necessary to map complex disorders, with hundreds to thousands of individuals in each group predicted to be necessary to achieve genome-wide significance, as demonstrated in canine studies [13].

Successful translation of GWAS requires functional validation. For genes in GWAS-associated regions to be viable causal candidates, they should have altered expression in the affected tissue(s). Therefore, a well-curated transcriptome of healthy domestic cat tissues constitutes an essential first step for supporting functional studies of candidate genes once significant associations are identified. While diseased tissue samples are more readily available from sick cats if they don’t survive or need biopsies (for example to confirm a diagnosis), a comprehensive healthy domestic cat tissue transcriptome can facilitate comparative studies with diseased tissue expression patterns and guide the development of next-generation mapping arrays that adequately represent all expressed genes.

RNA sequencing (RNA-seq) provides expression profiles of entire transcriptomes at single-nucleotide resolution without requiring prior knowledge of genetic sequences and delivers unprecedented high throughput capabilities [14]. Beyond enabling gene expression measurement with superior resolution, RNA-seq facilitates comprehensive studies of alternative splicing, RNA editing and allele-specific expression, all of which can be extended to molecular quantitative trait loci (QTL) investigations when population-level genotypes are available. Previous studies have demonstrated that some diseases exhibit high tissue specificity [15] and may reflect genes with tissue-restricted expression patterns. Additionally, different library construction protocols can produce varying results in terms of sequence coverage, complexity, evenness, and expression level estimates [16].

Two RNA selection methods were used here, poly(A)-enriched and rRNA-depleted. Each removes a specific set of RNAs: the poly(A)-negative RNAs and the ribosomal RNAs, respectively. These two library construction protocols thereby target different portions of the transcriptome, hence both were included in the sequencing protocol.

In this study, RNA was isolated and sequenced from 13 different healthy feline tissues taken from a total of 11 cats (6 cats per tissue). Since the majority of the cats admitted to the hospital are random-bred or mixed-breed cats, the transcriptome sequencing efforts were focused on these populations, including nine domestic shorthair cats, one Siamese and one Siamese mix. Recognizing that gene expression patterns vary with the sex [17] and age of an individual [18] depending on the tissue examined, we included cats of both sexes with ages ranging from < 1 to 19 years old. Since the postmortem interval prior to RNA extraction can significantly impact the integrity of the isolated sample [19], tissues included in the analysis were flash frozen in liquid nitrogen within a short postmortem interval (PMI).

Our analysis revealed that protein-coding and non-coding genes manifest distinct expression patterns, with tissue-specific genes showing strong associations with particular biological processes and development pathways characteristic of each tissue. This comprehensive data set provides a fundamental resource for the investigation of candidate gene expression profiles from tissues of healthy control cats, supporting future comparative studies in feline genetic disease research.

Methods

The methods used in this study meet the ENCODE Guidelines and Best Practices for RNA-Seq (Dec 2016).

Biospecimen collection

Thirteen different tissue types (adipose, brain, joint capsule, heart, ileum, jejunum, kidney cortex, kidney medulla, liver, pancreas, round ligament of hip, skin, and thyroid gland) were harvested postmortem from eleven domestic cats following humane euthanasia. Euthanasias were performed in the time period from Oct 2015 to May 2018. None of the cats were euthanized for the sole purpose of this study. Euthanasia was conducted humanely, either by the hospital or shelter in accordance with veterinary clinical standards of care, or at Cornell under IACUC protocol 2005-0151 using an overdose of pentobarbital following sedation with butorphanol and acepromazine. Six of these cats were < 1 year of age and sourced from Marshall BioResources (North Rose, New York, USA) (Supplementary Table 1). Two cats (4 and 6 years old) were from the Tompkins County SPCA and a further three cats > 16 years old were patients at the Cornell University Companion Animal Hospital (Supplementary Table 1).

Six specimens of each tissue type were evaluated, giving a total of 78 specimens (Table 1). Due to uneven sampling across the eleven cats, 58 of the 78 specimens were from young cats (< 1 year of age). The tissues included in this study were evaluated to confirm their health and structural integrity for transcriptomic analysis. Each tissue underwent gross examination to assess macroscopic features by a board-certified veterinary pathologist, and some had histological examination of formalin-fixed paraffin embedded tissue sections using hematoxylin and eosin (H&E) stained slides.

Table 1.

Tissue samples taken from each of 11 cats, with a total of 6 samples per tissue

13655 13784 15455 20604 23349 23927 23928 23961 23962 24235 24240
A X X X X X X
B X X X X X X
C X X X X X X
H X X X X X X
I X X X X X X
J X X X X X X
KC X X X X X X
KM X X X X X X
L X X X X X X
P X X X X X X
R X X X X X X
S X X X X X X
T X X X X X X

A adipose, B brain frontal cortex, C joint capsule, H cardiac muscle, I ileum, J jejunum, KC kidney cortex, KM kidney medulla, L liver, P pancreas, R round ligament, S skin, T thyroid gland

Postmortem tissue collection was performed as described [20]. Briefly, the collection occurred in Cornell’s Animal Health and Diagnostic Center (AHDC) necropsy laboratory; tissue specimens were harvested by a board-certified veterinary pathologist and preserved, accessioned and stored by a licensed veterinary technician of the Cornell Veterinary Biobank team. The PMI, defined here as the interval between time of death and the time of tissue specimen preservation, ranged from 10 to 81 min with an average PMI of 40 min. Tissue specimens of abdominal adipose, brain cortex, joint capsule of the hip, cardiac muscle (left ventricle), ileum, jejunum, kidney cortex, kidney medulla, liver lobe, pancreas, round ligament of the hip, skin from either thoracic limb or abdomen, and thyroid gland were collected. The specimens from each site were flash frozen in liquid nitrogen and transferred into an automated liquid nitrogen storage tank. All procedures were performed according to the Cornell University Institutional Animal Care and Use Committee (IACUC) protocol 2005-0151.

RNA extraction

Total RNA was extracted from each tissue by the Cornell Veterinary Biobank or by the Transcriptional Regulation and Expression (TREx) Facility at Cornell University. Between 0.007 g and 0.212 g of frozen tissue was placed in a 2.0 mL vial containing 2.8 mm ceramic beads (Hard Tissue Homogenizing Mix, VWR, PA, USA). Subsequently, 1 mL cold Trizol Reagent (Invitrogen, subsidiary of Thermo Fisher Scientific, MA, USA) was added to the vial and the tissue was homogenized at least twice for 30 s each at room temperature using the Mini Bead Mill Homogenizer (VWR, PA, USA) until tissue fragments could not be seen, chilling on ice between runs. After brief centrifugation, the supernatant was transferred to a new 2.0 mL nuclease-free tube. RNA was purified following the Trizol Reagent RNA isolation protocol, or a hybrid protocol in which the aqueous supernatant generated from the Trizol protocol following the addition of chloroform was combined with 2 volumes of 100% ethanol and RNA extracted with the Quick RNA mini-prep kit (Zymo Research, CA, USA). The RNA pellets were then resuspended or eluted in 50–100 µL of RNase-free water. The concentration and purity of the total RNA were determined by spectrophotometry using a NanoDrop 1000 (Thermo Fisher Scientific, MA, USA) and RNA integrity was determined using a Fragment Analyzer system (Agilent Technologies, CA, USA). RNA Quality Numbers (RQN) from the Fragment Analyzer averaged 6.7 across all 78 tissues (range of 2.5 to 10) (Supplementary Table 1).

Library preparation and sequencing

RNA-seq library preparation and sequencing were performed by the Cornell TREx Facility. Two enrichment methods were implemented in this study: ribosomal RNA subtraction and polyA+ RNA isolation. For the former method, rRNA was subtracted by hybridization from 100 ng total RNA per sample using the RiboZero HMR Gold rRNA Depletion Kit (Illumina, CA, USA). For the latter method, polyA+ RNA was isolated with the NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs, MA, USA) from 350 ng total RNA per sample. We purposefully chose higher quality samples (RQN ≥ 6) for polyA enrichment, resulting in an average RQN for the polyA method of 7.8 and an average RQN for the riboZero method of 5.9 (Supplementary Table 1, Supplementary Fig. 1A). The difference in the RQN for polyA and riboZero samples was statistically significant (Welch’s T-test, t = 6.5, P < 0.001). rRNA-depletion library methods are more robust to lower quality and degraded RNA than poly(A) enrichment library methods [21, 22]. We aimed for three samples from each tissue for the polyA method and three samples for the riboZero method, but lower sample quality (measured by RQN) meant we used five and six samples for riboZero for the skin and kidney cortex samples, respectively. The T-tests and raincloud plots comparing specific variables between polyA and riboZero libraries were produced using JASP v 0.95.4 [23].
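The RQN comparison above relies on Welch's T-test, which does not assume equal variances in the two groups. The following is a minimal sketch of the statistic and its approximate degrees of freedom. The function name `welch_t` and the RQN values are hypothetical and for illustration only; they are not the study's data, and the authors ran their tests in JASP.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    # Squared standard errors of each group mean
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical RQN values (illustration only, not from Supplementary Table 1)
polya_rqn = [7.1, 8.4, 6.9, 9.0, 7.6]
ribozero_rqn = [5.2, 6.8, 4.1, 7.3, 5.9, 6.0]
t_stat, dof = welch_t(polya_rqn, ribozero_rqn)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t here means the first group has the higher mean, matching the direction reported for polyA versus riboZero RQN.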

TruSeq-barcoded RNA-seq libraries were generated with the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs, MA, USA). Each library was quantified with a Qubit 2.0 (dsDNA HS kit; Thermo Fisher Scientific, MA, USA) and the size distribution was determined with a Fragment Analyzer (Agilent Technologies, CA, USA). All samples were pooled together and libraries were sequenced on an S4 flowcell on the NovaSeq6000 instrument (Illumina, CA, USA). At least 24 M (average > 39 M) paired-end 150 bp reads were generated per library.

Data quality control

FastQC [24] and MultiQC [25] were used to determine and visualize quality control statistics on the raw fastq files, respectively. Trimmomatic v 0.39 [26] was used to remove low quality bases and adapter sequences from the reads, and FastQC was then run again on these trimmed sequences to determine that trimming had been performed adequately.

Alignment

All commands used for alignment are found in Supplementary Data 1. Using the program STAR v2.7.0d [27], an index genome was generated using the reference genome Felis_catus9.0 (Ensembl release 100, March 2020) and corresponding gene annotation general transfer format (gtf) file. This was performed using the option --runMode genomeGenerate and --sjdbOverhang 149 (since the read lengths are 150 bp). Each sample underwent 1st-pass mapping to this index. During this step, non-canonical splice junctions and those covered by multi-mapping reads were removed, using the options --outFilterIntronMotifs RemoveNoncanonical and --outSJfilterReads Unique, respectively. Further, only junctions supported by at least three reads were kept, using the option --outSJfilterCountUniqueMin -1 3 3 3. Then all splice junctions on mitochondrial DNA and non-chromosomal contigs were manually removed from the output using the Linux command awk. All the splice junctions remaining for each sample were collected and used to regenerate the genome index, using the option --sjdbFileChrStartEnd, again using --runMode genomeGenerate. Finally, 2nd-pass mapping to this regenerated index was performed, using option --quantMode GeneCounts. Picard v. 2.19.2 MarkDuplicates (https://broadinstitute.github.io/picard/) was used to identify duplicate reads from the aligned bam files. Metrics from the mapping (percentage of reads uniquely mapped, mapped to multiple loci, mapped to too many loci, and unmapped) were assembled and compared between the polyA and the riboZero library samples. Bayesian ANCOVA [28] was performed in JASP v0.95.4 to investigate the effect of these metrics on differences in percentage aligned in polyA versus riboZero library samples. Qualimap v2.2.1 [29] was used to look at the genomic origin of the aligned reads.
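The junction clean-up between the two STAR passes can be illustrated with a short Python equivalent of the awk step. This is only a sketch: it assumes the usual tab-delimited SJ.out.tab layout with the chromosome name in the first column, and it assumes the primary chromosome names of the Ensembl Felis_catus_9.0 assembly listed below. The contig name in the toy data is hypothetical, and the authors' actual command is in Supplementary Data 1.

```python
# Assumed primary chromosome names in Ensembl Felis_catus_9.0
# (18 autosomes plus X); MT and unplaced contigs are deliberately absent.
CAT_CHROMOSOMES = {
    "A1", "A2", "A3", "B1", "B2", "B3", "B4", "C1", "C2",
    "D1", "D2", "D3", "D4", "E1", "E2", "E3", "F1", "F2", "X",
}

def keep_junction(line, chromosomes=CAT_CHROMOSOMES):
    """True if a tab-delimited splice junction row lies on a primary chromosome."""
    fields = line.rstrip("\n").split("\t")
    return fields[0] in chromosomes

def filter_junctions(lines):
    """Drop junctions on mitochondrial DNA and non-chromosomal contigs."""
    return [line for line in lines if keep_junction(line)]

# Toy rows: chromosome, start, end, strand, motif, annotated,
# unique reads, multi-mapping reads, max overhang
example = [
    "A1\t1000\t2000\t1\t1\t1\t12\t0\t50\n",
    "MT\t300\t900\t2\t2\t0\t40\t3\t60\n",
    "AANG04004000.1\t10\t500\t1\t1\t0\t5\t0\t33\n",  # hypothetical contig name
]
kept = filter_junctions(example)
print(len(kept))
```

Using an explicit allow-list of chromosome names, rather than excluding MT alone, removes unplaced contigs in the same pass.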

Transcript assembly

To determine all transcripts from aligned reads, we used StringTie v 2.1.4 [30] to catalog annotated transcripts, novel isoforms of annotated genes, and novel transcripts from unannotated genes. All commands used are in Supplementary Data 1. First, transcripts were called for each sample individually (n = 78), with a supplied reference gtf file. Then these were merged together using the --merge option to create a unified set of transcripts. We ran the assembled transcripts through a custom Python script called unique_gene_id.py (available at https://gist.github.com/e2b3e89fc0220e4363765fadc8c1dd73). StringTie merges transcripts that overlap with each other into the same MSTRG ID. This Python script assigns a different suffix to the MSTRG ID if the gene name is different (or, if there is no gene name, the transcript ID is used). Finally, transcripts were assembled again in StringTie, using the -e option to only include transcripts matching those in the reference (-G) file, which was the merged modified gtf file output from the Python script. This final StringTie step produced a gene abundance file for each sample, including transcripts per million (TPM) [31]. TPM is calculated by normalizing the transcript counts by gene length and then by sequencing depth, and provides a useful measure of RNA abundance [32].
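The TPM definition given above (length normalization first, then depth normalization) can be written out as a small worked example. This is a generic sketch of the textbook formula with hypothetical counts and lengths; StringTie derives its TPM values from its own coverage estimates, so this is not a reimplementation of that program.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million: normalize by length, then by depth."""
    # Reads per kilobase of transcript
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    # Per-million scaling factor computed from the length-normalized values
    scale = sum(rpk) / 1_000_000.0
    return [r / scale for r in rpk]

# Toy counts and transcript lengths (hypothetical)
values = tpm([100, 200, 300], [1000, 2000, 1500])
print([round(v) for v in values])
```

Because the scaling factor is computed after length normalization, TPM values always sum to one million within a sample, which makes proportions comparable across samples.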

Identification of novel genes

We used the program GffCompare v0.12.1 [33] to find novel genes and transcripts in our data, when compared to the FelCat9 reference annotation. Using the tmap output file from GffCompare, we determined the number of novel genes/transcripts as defined by a classification code of “u” for unknown.
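Counting class code "u" entries in a tmap file could look like the sketch below. It assumes the tmap file is tab-delimited with a header row containing the columns `class_code`, `qry_gene_id` and `qry_id`, as in the GffCompare output format. The toy rows are hypothetical, and the source does not state exactly how genes and transcripts were tallied, so the de-duplication by ID is an assumption.

```python
import csv
import io

def count_novel(tmap_text):
    """Count distinct query genes and transcripts with class code 'u'."""
    reader = csv.DictReader(io.StringIO(tmap_text), delimiter="\t")
    genes, transcripts = set(), set()
    for row in reader:
        if row["class_code"] == "u":
            genes.add(row["qry_gene_id"])
            transcripts.add(row["qry_id"])
    return len(genes), len(transcripts)

# Toy tmap content with only the columns needed here (hypothetical rows)
toy = (
    "ref_gene_id\tref_id\tclass_code\tqry_gene_id\tqry_id\n"
    "GENE1\tTX1\t=\tMSTRG.1\tMSTRG.1.1\n"
    "-\t-\tu\tMSTRG.2\tMSTRG.2.1\n"
    "-\t-\tu\tMSTRG.2\tMSTRG.2.2\n"
    "-\t-\tu\tMSTRG.3\tMSTRG.3.1\n"
)
n_genes, n_transcripts = count_novel(toy)
print(n_genes, n_transcripts)
```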

Transcriptome quality control

The completeness of our assembled transcript library was assessed by comparison to a set of conserved single-copy orthologs in BUSCO v 5.2.2 (Benchmarking Universal Single-Copy Orthologs) [34]. We used the carnivora_odb10 dataset of 14,502 BUSCO groups. We also compared our catalog to that of the publicly available reference transcriptome library for felCat 9.0 assembly from NCBI.

To look at the clustering of our samples, we performed a PCA in R v 4.1.2 [35] using the function prcomp. The data used for the PCA was log2(TPM) values per transcript, as output from the final StringTie command. To observe clustering between the library types (polyA and riboZero) we performed PCA using all our samples, and then to observe clustering between the specific tissue types, we performed PCA using only the polyA samples and then only the riboZero samples. All transcripts with any TPM values of 0 were excluded from the PCA, as were low variance transcripts using the command varFilter [36] with an interquartile range cut-off of 0.6. PCA and loadings plots were created using the package ggbiplot [37] in R.

To look at the variability within tissue type for polyA samples and riboZero samples, we calculated pairwise Spearman correlation coefficients for TPM values in R (see Supplementary Tables 2 and 3 for raw coefficients). Only the most variable transcripts were used (upper 75th percentile; n = 5918 for polyA, n = 4955 for riboZero) and the range of coefficient values was rescaled to be between 0 and 1 and then also subtracted from 1. Heatmaps were created to visualize these results using the R package pheatmap [38].
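The conversion from Spearman coefficients to the values shown in the heatmaps can be sketched as follows. This is an illustration under stated assumptions: the rescaling is taken to be a min-max rescaling over the whole coefficient matrix, the sample names and TPM values are hypothetical, and the authors performed this step in R.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def rescaled_distance(matrix):
    """Min-max rescale coefficients to [0, 1], then return 1 - value."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    return [[1.0 - (v - lo) / (hi - lo) for v in row] for row in matrix]

# Hypothetical TPM values for four transcripts in three samples
samples = {
    "I1": [5.0, 9.0, 1.0, 7.0],
    "J1": [4.0, 8.0, 2.0, 9.0],
    "K1": [9.0, 1.0, 8.0, 2.0],
}
names = list(samples)
rho = [[spearman(samples[a], samples[b]) for b in names] for a in names]
dist = rescaled_distance(rho)
print(dist)
```

After this transformation, 0 marks the most similar pair of samples and 1 the least similar, so the matrix behaves like a distance for clustering.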

Highest expressed tissue-specific genes

We identified the most highly expressed genes in each tissue type and compared them to human RNA-seq data for the corresponding tissues from the Genotype-Tissue Expression (GTEx) portal v10 [39]. The data used for the analyses described were obtained from the GTEx Portal on Sept 24th, 2025. For each tissue type of the polyA library preparation method, the median TPM was calculated. The genes with the highest TPM median values were directly compared to the 50 highest expressed genes in the human counterpart, as provided on the GTEx website (https://www.gtexportal.org/home/tissue/).
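Ranking genes by median TPM within a tissue is a simple operation, sketched here with hypothetical values. The helper name `top_genes_by_median` and the TPM numbers are illustrative assumptions, not the study's data.

```python
from statistics import median

def top_genes_by_median(tpm_by_gene, n=3):
    """Rank genes by median TPM across replicate samples of one tissue."""
    medians = {gene: median(vals) for gene, vals in tpm_by_gene.items()}
    return sorted(medians, key=medians.get, reverse=True)[:n]

# Hypothetical TPM values for three replicates of one tissue
adipose = {
    "FABP4": [5200.0, 4800.0, 6100.0],
    "ADIPOQ": [900.0, 1500.0, 1100.0],
    "ACTB": [700.0, 650.0, 5000.0],  # a single high replicate barely moves the median
}
top_two = top_genes_by_median(adipose, n=2)
print(top_two)
```

The median is less sensitive than the mean to one unusual replicate, which matters with only three samples per tissue.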

Differential gene expression analysis

We used DESeq2 [40] in R to identify genes that were differentially expressed between different tissue types; the specific options used are listed in Supplementary Data 1. The raw counts for each sample were extracted from the gene counts output file from the STAR 2nd pass mapping step. Since we used a stranded protocol and the reverse complement of mRNA was sequenced, we used the 4th column of counts (reverse counts) from the STAR gene count files.
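Extracting the reverse-strand counts from a STAR gene count file can be sketched as below. The sketch assumes the usual ReadsPerGene.out.tab layout: four columns (gene ID, unstranded counts, forward-strand counts, reverse-strand counts) preceded by summary rows whose names begin with `N_`. The gene IDs and counts are hypothetical.

```python
def read_reverse_counts(lines):
    """Return {gene_id: count} from the 4th column of a STAR gene count file."""
    counts = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[0].startswith("N_"):
            continue  # skip summary rows such as N_unmapped and N_noFeature
        counts[fields[0]] = int(fields[3])
    return counts

# Toy file content (hypothetical gene IDs and counts)
toy = [
    "N_unmapped\t1000\t1000\t1000\n",
    "N_multimapping\t500\t500\t500\n",
    "N_noFeature\t200\t9000\t300\n",
    "N_ambiguous\t50\t5\t40\n",
    "ENSFCAG00000000001\t120\t3\t117\n",
    "ENSFCAG00000000002\t40\t1\t39\n",
]
reverse_counts = read_reverse_counts(toy)
print(reverse_counts)
```

In a reverse-stranded library the forward-strand column stays near zero while the reverse-strand column carries almost all of the reads, as in the toy rows.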

For each library type separately (polyA and riboZero), differential expression comparisons were performed between each pair of tissues (three samples for each, with the exception of five skin samples and all six kidney cortex samples for riboZero, so skin and kidney cortex were not included in the polyA comparisons). One thyroid sample (T114) was shown to be an outlier from the PCA described above and was excluded from this analysis. In total, we performed 133 pairwise comparisons: 55 for the polyA tissues and 78 for the riboZero tissues. For each pairwise tissue comparison, the low expression genes (less than 10 counts across the six samples) were filtered out. Using the DESeq function, size factors and dispersions were estimated, and then the negative binomial model was fitted and tested. The variance stabilizing transformation (vst) was used to transform the raw count data, and then the package apeglm [41] was used for shrinkage of log2 fold change estimates towards zero. We determined the number of differentially expressed genes (DEG) for each pairwise comparison using a Wald Test and significance thresholds of FDR-adjusted P-value = 0.05 and |log2 fold change| = 0.5. MA plots and volcano plots (with FDR-adjusted P-value = 10e-16 and |log2 fold change| = 1.5) were generated in R.
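The comparison counts and filters in this paragraph can be checked with a few lines of Python. The tissue codes follow Table 1. The helper names are hypothetical, and whether the thresholds are applied as strict inequalities is an assumption, since the text gives only the threshold values.

```python
from itertools import combinations

# Tissue codes from Table 1; skin (S) and kidney cortex (KC) have no polyA comparisons
polya_tissues = ["A", "B", "C", "H", "I", "J", "KM", "L", "P", "R", "T"]
ribozero_tissues = polya_tissues + ["KC", "S"]

n_polya = len(list(combinations(polya_tissues, 2)))        # 11 choose 2
n_ribozero = len(list(combinations(ribozero_tissues, 2)))  # 13 choose 2
n_pairs = n_polya + n_ribozero

def passes_prefilter(counts, minimum=10):
    """Keep a gene whose summed counts across the compared samples reach the minimum."""
    return sum(counts) >= minimum

def is_deg(padj, log2fc, alpha=0.05, lfc=0.5):
    """Apply the FDR and fold-change thresholds (strict inequalities assumed)."""
    return padj is not None and padj < alpha and abs(log2fc) > lfc

print(n_polya, n_ribozero, n_pairs)
```

The totals reproduce the 55, 78 and 133 comparisons stated in the text.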

Pathway analysis

To determine the pathways that were enriched in our DEG list results for each pairwise tissue comparison, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) (https://davidbioinformatics.nih.gov/) [42, 43]. For each pairwise comparison performed, we used the Ensembl gene IDs of the identified DEGs as input for the Gene List and the Ensembl gene IDs of all the genes that were included in each analysis as input for the Background List. The terms used for Pathway Enrichment Analysis in DAVID were KEGG_PATHWAY, and the GO terms BP_DIRECT (biological process), CC_DIRECT (cellular component), MF_DIRECT (molecular function). We selected the output as a Functional Annotation Chart with the defaults of EASE threshold = 0.1 and count threshold = 2 genes. A modified Fisher’s Exact Test with Benjamini-Hochberg FDR correction was used to determine the significant enriched terms. Pathway enrichment and UMAP plots were generated in R using the packages ggplot2 [44] and umap [45] respectively. For the UMAP, we performed a log2 transformation on the fold enrichment values and then ran the UMAP with default settings. The first two UMAP dimensions were plotted with ggplot2.
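Two of the quantities used in this step, fold enrichment and the Benjamini-Hochberg adjustment, can be written out as a short sketch. The function names and input numbers are hypothetical. The sketch does not reproduce DAVID's modified Fisher's Exact Test (EASE score) itself; it assumes the p-values have already been obtained.

```python
from math import log2

def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the term's frequency in the gene list to its background frequency."""
    return (list_hits / list_total) / (pop_hits / pop_total)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical term: 3 of 300 list genes versus 40 of 30000 background genes
fe = fold_enrichment(3, 300, 40, 30000)
log2_fe = log2(fe)  # the transformation applied before UMAP

# Hypothetical raw p-values for four terms
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(round(fe, 2), round(log2_fe, 2), [round(p, 3) for p in padj])
```

The log2 transformation makes enrichment symmetric around zero, so a term that is twice as frequent and one that is half as frequent sit equally far from no enrichment.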

Results

Quality control

After trimming low-quality bases and adapter sequences from reads, FastQC was used to determine quality control metrics for each fastq file, including the number of sequences per sample, percentage of duplicates and GC content, and average sequence length (Supplementary Table 4).

One adipose sample (A83) was determined to have a large number of reads, compared to all other samples (134.5 million; average: 38.7 million) (Supplementary Fig. 2). After trimming, no samples were identified with adapter contamination and all samples passed the per sequence quality scores check.

Alignment to reference genome

The total number of aligned reads per sample varied from 16.1 million to 118.1 million, with an average of 33.3 million (Supplementary Table 5). Of the 29 samples with fewer than 30 million aligned reads, 24 were prepared using the riboZero library method. The percentage of reads that mapped uniquely to the feline reference genome ranged from 60.6% to 95.1% with an average of 85.6%. Alignment of polyA libraries performed better than rRNA-depleted libraries, with an average unique mapping of 91.5% and 81.0%, respectively (Supplementary Fig. 1B, Supplementary Tables 4,5). This difference was statistically significant (Welch’s T-test, t=−8.635, P < 0.001). We used Bayesian ANCOVA to model the percentage aligned reads as a function of the library type (polyA or riboZero) and including each covariate (unmapped too short, unmapped other, mapped to too many loci, mapped to multiple loci) one at a time. These indicated that the proportion of short reads and multi-mapped reads strongly affected the percentage of aligned reads (BF10 = 5.9 × 10^10 and BF10 = 9.5 × 10^15, respectively) while the proportion of other unmapped reads and reads mapped to too many loci did not account for differences in library type (BF10 = 428.0 and BF10 = 1.4, respectively) (Supplementary Table 6).

Qualimap results showed that, as expected, the riboZero samples had a smaller proportion of exonic origin reads than the polyA samples (about 20–40% compared to about 50–70%, respectively) (Supplementary Fig. 3).

Identification of novel genes

GffCompare was used to investigate our assembly compared to the FelCat9 genome annotation. We identified 70,265 assembled genes and of these, 28,041 were novel as defined by a classification code of “u” for “unknown” in the tmap output file of GffCompare. When transcripts were included, the number increased to 31,520 novel genes and transcripts (Supplementary Table 7).

Transcript assembly

We used BUSCO to look at the quality of our transcriptome library by comparing to the reference NCBI transcriptome library. Of 14,502 BUSCO groups searched, 98.0% were complete in the RNA-data assembly, compared to 99.5% in the reference assembly (Fig. 1). The RNA-data assembly had many more duplicated BUSCOs than the reference (83.7% versus 56.1%, respectively).

Fig. 1.


BUSCO results comparing the current assembly to the reference assembly

We performed three PCAs of TPM values: using all samples, polyA libraries only, and riboZero libraries only. After excluding all transcripts with any values of 0 and low variance transcripts, we had 5,294 transcripts in the PCA of all libraries, 9,470 transcripts in the polyA PCA, and 7,928 transcripts in the riboZero PCA. The PCA plot of all samples shows that PC1, which explains 43.5% of the variance, separates samples mostly by tissue type (with pancreas on one end and kidney and other tissues on the other) (Fig. 2A). All pancreas samples cluster together, so this PC1 separation is unlikely to be the result of technical variation. Instead, this separation likely reflects the pancreas’s unique expression profile, which is driven by pancreas-specific genes involved in endocrine and exocrine functions and is markedly different from those of the other included tissues. PC2 of all samples explains 11% of the variance. Both PC1 and PC2 separate the samples by library preparation method (polyA vs. riboZero). The loadings plot shows that all top 25 variables point in the same direction, along the PC2 axis (Fig. 2B). The PCA of polyA samples shows tissues mostly grouping by type, with the exception of a thyroid sample (T114), which clusters with joint capsule samples (Fig. 2C). Therefore, sample T114 is assumed to be contaminated or wrongly coded and was removed from differential expression analyses. PC1 explains 40% of the variance and separates the pancreas samples, while PC2 explains 9% of the variance and separates the other tissues from kidney medulla at one end to cardiac and joint capsule at the other end. All but one of the top 25 PCA loadings point in the same direction (Fig. 2D). Some of the genes highlighted in the loadings include FKBP3, a gene expressed in multiple tissues but most highly in skeletal muscle and brain. This variable is pulling out towards the brain samples. 
Likewise, the gene IDH2 is pulling out toward the cardiac samples and this gene is most highly expressed in skeletal muscle and heart. Finally, the variable labeled MSTRG.27976.2 maps to the gene TMEM30B; again, this gene is expressed in multiple tissues but most notably the kidney, liver, heart and thyroid. The PCA of riboZero samples also shows tissues mostly grouping by type (Fig. 2E). PC1 explains 42% of the variance and separates the pancreas samples, while PC2 explains 9% of the variance. Samples of the same tissue type are not clustered as tightly in the riboZero PCA. This increase in variability could be due to the wider range of RNA types captured by this library preparation method, compared to polyA-enrichment. The top 25 variables in the PCA loadings plot contribute primarily to PC2 and include several ubiquitously expressed genes, such as PDCD10, as well as many U1 spliceosomal RNAs (Fig. 2F).

Fig. 2.


PCA plots of log2 TPM values for (A) all samples and (B) loadings plot, (C) polyA samples and (D) loadings plot, (E) riboZero samples and (F) loadings plot. Loadings plots show the top 25 variables only

We calculated the Spearman correlation coefficient of TPM for the polyA samples and for the riboZero samples. The polyA samples had higher values on average, and this difference was statistically significant (Welch’s T-test, t = 35.92, P < 0.001) (Supplementary Fig. 1 C). We created heatmaps for the polyA library samples and the riboZero library samples (Supplementary Fig. 4). The heatmap for the polyA library shows that the samples generally cluster within tissue type, with the exception of T114, which was also identified on the PCA, and the kidney medulla sample (KX1005). Ileum and jejunum samples were most similar, while kidney medulla samples were most different in gene expression compared to all other tissues (Supplementary Fig. 4 A). The riboZero library heatmap also shows clustering within tissue type, with two exceptions: a capsule sample (C11) that clusters with ligament samples, and a kidney cortex sample (KC1004) that does not cluster with other kidney samples (Supplementary Fig. 4B). Again, the ileum and jejunum samples were very similar, and the kidney samples were most different from other tissues.

Highest expressed tissue-specific genes

For the polyA samples, we looked at the most highly expressed genes per tissue type. There is some variability between samples, so we calculated the median of the samples for each tissue type and compared that to human RNA-seq data for the same tissue (sourced from the GTEx Portal). There are no human samples of joint capsule, jejunum, or ligament in the GTEx portal for gene expression comparison.

For adipose samples, the gene with the highest median expression was fatty acid binding protein 4 (FABP4) (Supplementary Fig. 5A), which is expressed in adipocytes and is the 14th most highly expressed gene in the 714 human subcutaneous adipose samples on the GTEx portal. For brain cortex samples, the genes most highly expressed were the mitochondrial genes mitochondrially encoded cytochrome C oxidase I (MT-CO1), mitochondrially encoded cytochrome C oxidase II (MT-CO2) and mitochondrially encoded cytochrome C oxidase III (MT-CO3) (Supplementary Fig. 5B), which were also highly expressed in the 270 human brain cortex samples. For the joint capsule samples, the gene with the highest median expression was actin alpha 1 (ACTA1), which is expressed in skeletal muscle (Supplementary Fig. 5C). In addition to the mitochondrially encoded cytochrome C oxidase genes, the cardiac samples also had high expression of the myosin genes myosin light chain 2 (MYL2) and myosin heavy chain 7 (MYH7) (Supplementary Fig. 5D). These mitochondrial and myosin genes were also highly expressed in the human heart (left ventricle) samples (n = 452). For our ileum samples, the gene with the highest median TPM was fatty acid binding protein 6 (FABP6), which is ileum-specific and involved in the metabolism of bile acids (Supplementary Fig. 5E). The human ileum samples (n = 207) on the GTEx portal did not have high expression of FABP6 but did have high expression of the gene immunoglobulin kappa constant (IGKC), involved in antibody receptor binding activity. However, the human samples were taken from the terminal ileum, which has a greater abundance of Peyer’s patches, an important component of the immune system [46]. The most highly expressed gene in the feline jejunum was fatty acid binding protein 2 (FABP2) (Supplementary Fig. 5F), another protein found in the small intestine that is involved in metabolism and transport of long-chain fatty acids. 
The most highly expressed genes in the kidney medulla samples include insulin like growth factor binding protein 7 (IGFBP7), eukaryotic translation elongation factor 1 alpha 1 (EEF1A1), and secreted phosphoprotein 1 (SPP1) (Supplementary Fig. 5G). The latter, also known as osteopontin (OPN), is expressed in the kidney and may play a role in kidney diseases including stone formation [47]. Both IGFBP7 and EEF1A1 are also seen at relatively high expression levels in the human medulla samples (n = 11). The liver samples had high levels of expression of the genes alpha 2-HS glycoprotein (AHSG), which is synthesized in hepatocytes, and retinol binding protein 4 (RBP4), which transports retinol from the liver to other peripheral tissues. However, the most highly expressed gene in liver is labeled MSTRG.19462.1, a transcript that encompasses the gene albumin (ALB), a liver-specific gene encoding this protein (Supplementary Fig. 5H). Ten of the 20 most abundantly expressed genes in our liver samples were also highly expressed in the 262 human liver samples. For our pancreas samples, the transcript labeled MSTRG.10582.0 had the highest expression. This transcript maps to the Ensembl transcript ENSFCAT00000018174.5 (or gene LOC101085453), which is assigned the UniProt protein description cationic trypsin, a protein secreted by the pancreas. The second most highly expressed transcript in the feline pancreas is chymotrypsin C (CTRC), which is also highly expressed in the human pancreas samples (n = 362) (Supplementary Fig. 5I). For the feline ligament samples, the gene with the highest median expression by 2-fold is collagen type III alpha 1 chain (COL3A1), which is an important component of ligaments and plays a role in wound healing (Supplementary Fig. 5J). 
We only had one polyA skin sample, but it showed highest expression of the gene EEF1A1 (also seen in kidney medulla) and of several keratin and keratin-associated protein genes (such as KRT25 and KRTAP3-1) that play a role in the structure of hair (Supplementary Fig. 5K). In the 651 human skin samples, the non-mitochondrial genes with the highest expression were the keratin genes KRT10, KRT1 and KRT14. For thyroid, we found the most highly expressed gene by about 3.5-fold was thyroglobulin (TG) (Supplementary Fig. 5L). Thyroglobulin is produced in the thyroid and is the substrate for the synthesis of thyroid hormones. This gene also had the highest expression in the human thyroid samples (n = 684).
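The per-tissue ranking described in this section (median TPM across the samples of a tissue, then sorting genes by that median) can be sketched as follows. This is an illustrative outline only, not the authors' pipeline; the gene symbols are real, but every TPM value below is hypothetical.

```python
from statistics import median

# Hypothetical TPM values: gene -> tissue -> per-sample TPMs (not study data).
tpm = {
    "FABP4": {"adipose": [900.0, 1200.0, 1000.0], "liver": [2.0, 1.0, 3.0]},
    "ALB": {"adipose": [5.0, 4.0, 6.0], "liver": [5000.0, 7000.0, 6500.0]},
    "EEF1A1": {"adipose": [400.0, 450.0, 380.0], "liver": [500.0, 520.0, 480.0]},
}


def top_genes(tpm_by_gene, tissue, n=2):
    """Rank genes by their median TPM across the samples of one tissue."""
    medians = {
        gene: median(by_tissue[tissue]) for gene, by_tissue in tpm_by_gene.items()
    }
    # Highest median first; keep the n top-ranked genes.
    return sorted(medians.items(), key=lambda kv: kv[1], reverse=True)[:n]


adipose_top = top_genes(tpm, "adipose")
liver_top = top_genes(tpm, "liver")
```

Using the median rather than the mean keeps a single outlying sample from dominating the ranking, which matters with only about three samples per tissue.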

Differential gene expression analysis

To demonstrate that the samples express the expected tissue-specific genes, we performed pairwise differential expression (DE) between every tissue type pairing within the polyA library samples and within the riboZero library samples. We expected to see, for example, brain samples expressing high levels of brain-specific genes, when compared to all other tissue types.

Our pairwise comparison results show the tissues with the lowest percentage of differentially expressed genes are the ileum and jejunum, with only 274 genes of 18,585 total genes (1.5%) in the polyA and 105 of 18,337 genes (0.6%) for riboZero (Table 1). Conversely, the highest percentage of differentially expressed genes are seen in pairwise comparisons of the brain when compared to almost all other tissues - all tissues except medulla for polyA and capsule for riboZero had > 50% of analyzed genes significantly differentially expressed when compared to brain tissue (Table 1). The polyA kidney medulla samples and riboZero joint capsule samples showed more intra-tissue variation than the other tissues, which may explain the slightly lower number of DE genes.

Table 2.

Percentage of differentially expressed genes in each tissue pairwise comparison: (A) polyA tissues, (B) riboZero tissues

A
         A       B       C       H       I       J       R       L      KM       P       T
A        -  19,106  17,501  17,431  18,799  18,323  17,760  17,592  18,643  17,339  18,092
B     64.6       -  18,904  18,760  19,800  19,458  19,146  18,987  19,634  18,593  19,290
C     41.1    61.0       -  17,017  18,657  18,156  17,552  17,310  18,505  17,070  17,882
H     47.1    60.8    32.4       -  18,573  18,066  17,504  17,093  18,351  16,810  17,799
I     50.2    62.9    47.7    53.0       -  18,585  18,829  18,572  19,296  18,323  18,832
J     47.3    61.8    45.2    51.0     1.5       -  18,392  17,968  18,938  17,767  18,418
R     42.3    61.5    42.8    50.3    42.7    42.7       -  17,629  18,664  17,391  18,122
L     50.9    65.6    50.3    53.8    49.5    45.0    53.4       -  18,403  16,831  17,845
KM    34.1    44.1    34.1    36.6    26.3    28.2    28.0    38.6       -  18,093  18,715
P     50.1    58.9    48.9    52.3    45.0    42.7    49.0    45.8    28.9       -  17,577
T     49.5    58.7    43.3    48.8    32.5    33.0    37.7    45.9    14.0    39.1       -

B
         A       B       C       H      KC       I       J       R       L      KM       P       S       T
A        -  20,494  21,127  19,353  20,527  19,692  19,577  19,149  19,439  20,340  19,009  20,353  19,366
B     50.7       -  21,320  19,347  20,745  19,900  19,749  19,358  19,613  20,489  18,725  20,570  19,514
C     11.3    45.5       -  20,274  21,626  20,663  20,454  19,931  20,379  21,423  19,654  21,317  20,294
H     30.2    53.4    26.5       -  19,794  18,652  18,507  17,918  18,160  19,461  17,284  19,580  18,191
KC    38.5    55.8    36.6    46.5       -  20,052  19,856  19,769  19,756  20,156  19,215  20,653  19,697
I     28.4    55.4    24.3    43.2    43.6       -  18,337  18,539  18,581  19,766  17,890  19,801  18,604
J     34.2    61.9    29.9    49.2    40.8     0.6       -  18,400  18,373  19,556  17,703  19,666  18,457
R     19.1    56.4     1.2    41.2    45.9    33.3    40.5       -  18,071  19,382  17,419  19,377  18,168
L     27.3    57.8    27.9    44.1    38.8    34.7    37.4    40.5       -  19,517  17,326  19,636  18,223
KM    38.2    56.6    30.5    47.4    15.5    39.5    40.3    41.7    43.5       -  18,851  20,405  19,371
P     37.2    54.5    33.7    44.4    45.8    39.1    47.5    46.4    37.5    44.3       -  19,073  17,492
S     36.7    59.5    29.1    49.9    46.9    38.4    38.7    36.8    46.4    38.0    52.0       -  19,573
T     26.3    53.7    24.6    38.4    41.2    31.5    38.8    31.4    35.6    37.4    36.0    40.7       -

Lower triangle is the percentage, upper triangle is the number of genes included in the comparison after filtering. Diagonal cells (-) are self-comparisons. A adipose, B brain, C capsule, H heart, KC kidney cortex, I ileum, J jejunum, R ligament, L liver, KM kidney medulla, P pancreas, S skin, T thyroid

We investigated the percentage of DE genes and the difference in average postmortem interval (PMI) for tissue pairs of polyA samples (Fig. 3). Although we see a linear trend between these two variables, this was a weak relationship (R² = 0.15).
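For readers less familiar with the statistic, an R² of 0.15 means that about 15% of the variance in the percentage of DE genes is accounted for by a straight-line fit on the PMI difference. The sketch below shows how slope, intercept and R² follow from an ordinary least-squares fit. The data points are hypothetical and are not the values plotted in Fig. 3.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # For a simple linear regression, R^2 is the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical tissue pairs: PMI difference (min) vs. percentage of DE genes.
pmi_diff = [0.0, 10.0, 20.0, 30.0, 40.0]
pct_de = [40.0, 45.0, 38.0, 55.0, 50.0]
slope, intercept, r_squared = linear_fit(pmi_diff, pct_de)
```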

Fig. 3.

Scatter plot of the difference in average postmortem intervals (in minutes) and the percentage of DE genes for the polyA libraries. Blue line is the linear regression and R² is 0.149

Volcano plots were created to show the most significant differential expression (DE) results visually, using the thresholds Padj < 10e-16 and |log2FC| > 1.5. We chose to look in more detail at the most similar, least similar, and mid-range similar tissue pairings, which were ileum-jejunum, brain-liver and liver-thyroid, respectively (Fig. 4). Volcano plots for all other pairings are in Supplementary Figs. 6 and 7, and MA plots are in Supplementary Figs. 8 and 9. Of the 4,230 DE genes for the brain-liver polyA comparison, 2,417 (57%) were upregulated in brain (Fig. 4A). For the brain-liver riboZero comparison, there were only 2,921 DE genes and 1,628 (56%) were upregulated in brain (Fig. 4D). For the liver-thyroid DE gene comparisons, there were 1,029 (69%) and 650 (71%) upregulated in liver for polyA and riboZero comparisons, respectively (Fig. 4B, E). Finally, for the ileum-jejunum comparison, there were only nine DE genes for polyA and four of these were upregulated in ileum, while there were no DE genes for riboZero (Fig. 4C, F). For some volcano plots (e.g., liver-thyroid), we observe asymmetry, with a greater number of DE genes upregulated in one tissue type compared to the other. Given the different biological functions of these tissues, we expect the asymmetry is likely biologically meaningful rather than technical. The liver is a transcriptionally active organ involved in protein synthesis, drug metabolism, detoxification, and lipid and carbohydrate metabolism. The thyroid, in contrast, is a specialized gland, focused on the synthesis and secretion of thyroid hormones. Therefore, the liver has a broader and more diverse set of genes expressed at higher levels than the thyroid, which contributes to the observed skew in DE genes.

Fig. 4.

Volcano plots of differential expression results of polyA libraries (A-C) and riboZero libraries (D-F). (A, D) brain vs. liver; (B, E) liver vs. thyroid; (C, F) ileum vs. jejunum. Using significance thresholds of P-value < 10e-16 and |log2 fold change| > 1.5, red points are genes that are significant for both, purple points are genes that reach the P-value significance threshold only, green points are genes that reach the log2 fold change significance threshold only, and gray points are not significant
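The four color classes in these volcano plots follow from two independent tests on each gene. A minimal sketch of that classification is given below, assuming the cutoffs are applied as strict inequalities; the function name and the example values are illustrative, not taken from the authors' code. Note that `10e-16`, as written in the text, is numerically 1e-15.

```python
P_CUTOFF = 10e-16   # as written in the text; numerically equal to 1e-15
LFC_CUTOFF = 1.5    # threshold on the absolute log2 fold change


def volcano_class(padj, log2fc):
    """Assign a gene to one of the four volcano plot color classes."""
    passes_p = padj < P_CUTOFF
    passes_fc = abs(log2fc) > LFC_CUTOFF
    if passes_p and passes_fc:
        return "red"      # significant on both criteria
    if passes_p:
        return "purple"   # P-value threshold only
    if passes_fc:
        return "green"    # fold change threshold only
    return "gray"         # not significant


# Hypothetical genes: (adjusted P-value, log2 fold change).
examples = {
    "gene_a": (1e-30, 2.0),
    "gene_b": (1e-30, 0.5),
    "gene_c": (0.01, -3.0),
    "gene_d": (0.5, 0.1),
}
classes = {gene: volcano_class(p, fc) for gene, (p, fc) in examples.items()}
```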

Our results also demonstrate that the polyA and riboZero libraries showed similar patterns for each pairwise comparison, with generally lower numbers of DE genes identified in the riboZero libraries (Fig. 4; Supplementary Fig. 8, 9).

To determine the quality of our samples, we looked in more detail at the most significant DE genes for each pairwise comparison, for the polyA library. The brain-liver comparison results included 40 DEGs with an FDR-adjusted P-value ≪ 0.0001 (Supplementary Table 8). Of these, seven were downregulated in liver (compared to brain) and included genes involved in microtubule assembly and GTP hydrolysis such as dynamin 1 (DNM1), microtubule-associated protein 1A (MAP1A) and microtubule-associated protein 2 (MAP2). The 33 genes that were upregulated in liver (compared to brain) included genes responsible for the hydrolysis of drugs, protection against oxidative stress, and iron metabolism such as carboxylesterase 2 (CES2), transferrin receptor 2 (TFR2), catalase (CAT), and phosphoenolpyruvate carboxykinase 2, mitochondrial (PCK2). Of the forty most significant DEGs in the liver-thyroid comparison, 39 were downregulated in thyroid (compared to liver) (Supplementary Table 8). These include genes that encode important liver enzymes involved in the urea cycle (carbamoyl-phosphate synthase 1, CPS1), degradation of tyrosine (4-hydroxyphenylpyruvate dioxygenase, HPD), oxidation of monoamines like dopamine and serotonin (monoamine oxidase B, MAOB), and metabolism of fructose (aldolase, fructose-bisphosphate B, ALDOB). The most significant DEGs in the ileum-jejunum comparison were fatty acid binding protein 6 (FABP6) (Padj = 1.4 × 10^-6) and cholecystokinin (CCK) (Padj = 1.4 × 10^-30). The former, FABP6, was downregulated in jejunum (compared to ileum) and is an ileal fatty acid binding protein that is required for transport of bile acids in the ileum, while the latter, CCK, was upregulated in jejunum and is a satiety signal that is released from intestinal cells and neurons after a meal. In rats, it has been shown that CCK increases Fos expression in duodenal and jejunal neurons, but not in the ileum [48].

Pathway analysis

All the significant (Benjamini-Hochberg FDR-adjusted P < 0.05) terms from our DAVID enrichment analysis results are listed in Supplementary Table 9. Generally, we observed enrichment of pathways that were relevant for the tissue types involved. For example, the KEGG pathway term “cardiac muscle contraction” is significantly enriched in several comparisons involving cardiac tissue. Likewise, the GO cellular component term “brush border” is enriched in comparisons involving jejunum or ileum tissue.
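The Benjamini-Hochberg adjustment referred to above can be illustrated with a short, generic implementation. This is the textbook step-up procedure, not DAVID's own code, and the P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four enrichment terms.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Each raw P-value is scaled by the number of tests divided by its rank, so terms with small ranks are penalized most; a term is then reported if its adjusted value falls below the chosen FDR level (0.05 in this study).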

For the DE genes in the brain-liver comparison, the significant categories with the highest fold enrichment included the GO cellular component terms “postsynaptic density membrane”, “dendritic spine” and “presynaptic membrane”, and the KEGG pathway “Glycine, serine and threonine metabolism” (Fig. 5A). For the ileum-jejunum comparison, only 13 terms were significantly enriched; those with the highest fold enrichment were the GO biological process terms “lipoprotein metabolic process”, “embryonic skeletal system morphogenesis” and “proximal/distal pattern formation” as well as the KEGG pathway term “fat digestion and absorption” (Fig. 5B). For the liver-thyroid comparison, the significant terms with the highest fold enrichment included KEGG pathway terms “Butanoate metabolism”, “Arginine biosynthesis”, and “Valine, leucine and isoleucine degradation”, and the GO cellular component term “respiratory chain complex I” (Fig. 5C).

Fig. 5.

Scatter plots showing fold enrichment of significant terms (from DAVID analysis) for the DE genes from the polyA pairwise tissue comparison (A) brain-liver, (B) ileum-jejunum, and (C) liver-thyroid. Points are colored according to -log10 (FDR-adjusted P-value) meaning that the higher values (in red) are more significant. Note that only the 40 terms with the highest fold enrichment are shown for liver-thyroid

We further investigated the KEGG pathway terms only, to look at those that were enriched in specific tissue pairs (Supplementary Table 10). The UMAP of fold-enrichment values for tissue pairings shows some clustering of tissue pairs with similar KEGG enrichment profiles, for example, some brain tissue pairs cluster together near the top of the plot (Supplementary Fig. 10).

Discussion

This study describes transcriptome sequencing results of thirteen healthy control feline tissues. These tissue types were chosen because they are involved in the most common complex diseases of cats diagnosed at the Cornell University Companion Animal Hospital over the last 15 years, in parallel with genome-wide association studies of diseases involving these tissues, such as diabetes mellitus, hyperthyroidism, chronic renal diseases, and hypertrophic cardiomyopathy [49]. As far as the authors are aware, this is the largest study to provide transcriptome-level data for multiple healthy feline tissue types. Previous feline studies typically either use one or a few specific tissue types as healthy controls for disease studies, such as feline chronic gingivostomatitis [50], or use many tissue types to examine a single (or a few) genes, such as cytochrome P450 [51]. Since it is difficult for every geneticist to obtain normal/control tissue for comparative analysis, and this would also result in the repetitive euthanasia of healthy cats, we have made this database publicly available for use by other researchers.

Both poly(A) enrichment and ribosomal RNA depletion methods were used. The poly(A) protocol enriches for transcripts containing poly(A) sequences, which includes mRNAs and many non-coding RNAs, while reducing the number of pre-mRNAs included in the data. This poly(A) method is used extensively due to its focus on mRNAs and the elimination of other housekeeping RNAs (including rRNAs) that can have very high expression levels. In contrast, the rRNA-depleted method depletes both cytoplasmic rRNA and mitochondrial rRNA transcripts from the sample. This leaves the poly(A) positive mRNA, non-coding RNAs and any other transcripts that do not have the poly(A) tail. Inclusion of both poly(A) enrichment and ribosomal RNA depletion protocols meant we were able to include a wide range of transcripts, to best represent the transcriptomic repertoire of our samples. We observed clear differences in the exonic and intronic read distribution produced by these two library selection methods, with higher proportions of exonic reads from polyA-selected libraries, as expected. Generally, the polyA library samples from different tissue types were more similar to each other than the riboZero library samples, as shown by our clustering analysis. However, it is unclear whether this reflects the biology of pre-mRNAs or is a result of noise due to degraded RNA. We intentionally selected lower quality RNA samples (measured by RQN) for RiboZero library preparation, leaving the higher quality RNA samples for polyA preparation. This was done because rRNA-depleted methods are more robust to lower quality, degraded RNA [21, 22].

PCA of both the polyA and rRNA-depleted library preparations demonstrated the robustness and fidelity of our RNA-seq data because the samples clustered distinctly by tissue type. For the polyA library, this separation by tissue type was observed on PC1, allowing us to identify a thyroid sample that showed an expression pattern inconsistent with other thyroid samples. This sample was determined to have likely been contaminated with another tissue or mislabeled during processing. Following established quality control protocols, the sample was removed from all subsequent analyses to ensure the integrity of the dataset.

We assessed the quality of our RNA-seq data using accepted metrics, including BUSCO, a tool that evaluates the completeness of genomes, transcriptomes or proteomes by looking at the expected gene content [34]. Transcriptomic libraries pooled from samples of different ages, sexes, and tissue types should be more complete and therefore have a higher BUSCO completeness score [52]. Model organisms typically have good reference genomes and therefore usually report a BUSCO score of >95% complete [53]. Depending on the biology of the species (for example, the number of repetitive elements), non-model species typically report BUSCO scores ranging from about 70% to 95% complete, for example 86% in the rock pigeon [54] and 69% in the damselfly Calopteryx splendens [55]. Our assembly BUSCO score of 98% compares favorably with these values.

Appropriate tissue-specific genes (for example, the ileal fatty acid binding protein FABP6 expressed in ileum samples) were highly expressed in our RNA-seq data, providing further evidence that it has high fidelity and should be useful to other investigators studying feline disease genetic mechanisms. Generally, similar genes were found expressed in equivalent feline and human tissues. Of the 20 genes with the highest expression in our feline cardiac samples (measured by median TPM), 13 were also highly expressed in human cardiac samples from the GTEx portal. Further, our cardiac expression profiles are very similar to that reported in a recent study of healthy cardiac tissue [56]. Specifically, 16 of the top 25 genes expressed in left ventricular wall tissue reported in Kaplan et al. (2025) were in the top 30 expressed genes in our left ventricle samples. The only gene from our top 10 not present in the Kaplan top 25 list is TNNC1 (Troponin C1, slow skeletal and cardiac type), which is involved in muscle contraction through the binding of calcium. This gene was also featured in the human GTEx left ventricular expression data. Variants in several cardiac genes including TNNC1 have been reported in cases of human hypertrophic cardiomyopathy (HCM) (for example, see Landstrom et al. 2008) [57], a disease that often involves thickening of the left ventricular wall. Given the high prevalence of HCM in cats, it seems likely that TNNC1 is highly expressed in the Kaplan data but did not make the top 25 gene list.

Tissue-specific expression from DE analyses showed that similar tissues (e.g., ileum and jejunum) share more gene expression patterns than dissimilar tissues, such as brain compared to other tissues. Specifically, in polyA-selected libraries, only 1.5% of genes were DE between ileum and jejunum while pairwise comparisons involving brain tissue showed that 44–66% of genes were significantly differentially expressed (average 60%) relative to other tissues. Other tissue comparisons (excluding brain) showed average DEG proportions of 40–50% for polyA libraries, except the kidney medulla showing a slightly lower average of 31%. While these proportions of DE genes may seem high, they are expected given the pairwise comparisons involve entirely different tissue types, which have different cellular compositions and biological functions. Our DEG findings are consistent with a study that found 414 DEGs out of 37,167 tested (1.1%) when comparing two types of adipose tissue (subcutaneous versus omental) [58], supporting that transcriptomic differences scale with tissue dissimilarity. The tissue-specific DE genes in the pairwise comparisons are consistent with the genes and pathways expected to be relevant to the proteins produced and metabolic pathways for these tissues. Indeed, the pathway enrichment analysis performed in DAVID showed the expected significant enrichment of GO terms relevant to the tissue type, further highlighting the value of this resource for future feline disease genomics research. It is important to note that the RNA-seq we report here is an average of all the cellular gene expression at one time point in each tissue. To make relevant conclusions about the many cell lineages in a given tissue, we would need to employ single-cell RNA-seq, which is becoming more widely used but beyond the scope of this study.
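The summary figures quoted in this paragraph can be reproduced directly from the polyA panel of the pairwise DE table and from the gene counts given in the Results. The short check below uses only those published values; the variable names are illustrative.

```python
# Percentage of DE genes for each polyA comparison involving brain,
# read from the brain row and column of the polyA panel of the DE table.
brain_polya = {
    "adipose": 64.6,
    "capsule": 61.0,
    "heart": 60.8,
    "ileum": 62.9,
    "jejunum": 61.8,
    "ligament": 61.5,
    "liver": 65.6,
    "kidney medulla": 44.1,
    "pancreas": 58.9,
    "thyroid": 58.7,
}

brain_min = min(brain_polya.values())
brain_max = max(brain_polya.values())
brain_mean = sum(brain_polya.values()) / len(brain_polya)

# Ileum vs. jejunum: DE genes out of genes analyzed, as reported in the Results.
ileum_jejunum_polya = 100 * 274 / 18585
ileum_jejunum_ribozero = 100 * 105 / 18337

print(f"brain range: {brain_min}-{brain_max}%, mean {brain_mean:.0f}%")
print(f"ileum-jejunum: {ileum_jejunum_polya:.1f}% (polyA), "
      f"{ileum_jejunum_ribozero:.1f}% (riboZero)")
```

The brain comparisons span 44.1% to 65.6% with a mean of about 60%, and the ileum-jejunum comparison gives 1.5% (polyA) and 0.6% (riboZero), matching the rounded figures in the text.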

A strength of this experiment is that the health and integrity of the tissues was confirmed by necropsy and/or histological evaluations, both performed by board-certified veterinary pathologists. Some of the most critical tissues were harvested within 20 min, and all within 81 min, of euthanasia. Our average postmortem interval (PMI) of 40 min compares favorably to studies in which these intervals are reported, for example, a maximum 4 h interval for canine brain tissue samples [59], an interval of 2 h 15 min for porcine tissue sample collection [60], and a range of 5 to 24 h for human heart tissue collection [61]. In addition to PMI, factors like the cause of death [62], tissue type [63], ambient temperature [64], and tissue-sampling methods [65] all affect RNA quality and stability. We controlled for these variables (such as increases in temperature) in this study by using standardized processes of postmortem tissue sample collection performed by trained biobank personnel working with board-certified pathologists.

One important limitation of this study is that we only included three samples for each tissue type (for most tissues), so these data should not necessarily be interpreted as representing all possible transcripts, especially given individual cat variables (such as age and sex) that can affect gene expression levels. Here, we included samples from six young cats (< 1 year old) and five adult cats (> 4 years old), although the majority of the samples (58 of 78) were from young cats, preventing a valid young vs. old DEG analysis. We also included a limited number of tissue types in the study (n = 13) and omitted some important types, including bone, muscle, lung and reproductive organs. These omissions are because our goal was to sample tissues that represented some of the more common diseases seen in feline patients at the Cornell hospital (for example, kidney disease) in order to maximize usage and enhance the significance of our data.

We observed differences in some metrics of our polyA and riboZero library samples. As mentioned, the riboZero samples had lower starting RNA quality, on average, than the polyA samples. We also saw a significant difference in pairwise Spearman correlation coefficients and in percentage of uniquely mapped reads between the riboZero and polyA libraries. Using Bayesian ANCOVA, we found that the difference in uniquely mapped reads was due to increased shorter unmapped reads and multi-mapped reads in the riboZero samples. These are likely due to the inclusion of intronic RNA, rRNA contamination, and degraded RNA (lower RQN) in the riboZero samples. Further, although we do not have direct evidence of genomic DNA contamination in our samples, this has been shown to affect riboZero samples more than polyA samples [66]. The genomic DNA contamination leads to false positive DEG results, especially due to over-estimation of low-abundance transcripts. We mitigated this effect in our DE analysis by filtering out low-abundance transcripts, using exon-only counts, and using DESeq2 for our analysis (which includes options such as lfcShrink to adjust fold change estimates for genes with low counts or high variability). However, as we cannot explicitly rule out that genomic DNA contamination resulted in an inflation in identified DE genes, our riboZero DE genes analyses should be interpreted with caution.

The domestic cat is afflicted with a myriad of diseases that are genetic in origin and for which there is no curative treatment. Once complex and simple traits and diseases have been mapped through genome-wide association studies or whole genome sequencing, investigators can follow up with functional studies. One such investigation is to test the expression of candidate genes in associated intervals. This report describes the global gene expression of 13 control tissues, which should contain relevant targets for genetic investigations. We have shown that our RNA-seq data are of high quality, based on clustering of samples by tissue type, expression of the expected tissue-specific genes, and enrichment of relevant pathways. The expressed genes we report here will also be useful for development of future mapping arrays for domestic cat genetic exploration and improved annotation of the feline genome.

Conclusions

Many of the diseases that cat owners observe and that veterinarians diagnose in their practices have strong similarities to human diseases. Diseased tissues can be collected and gene expression compared between the diseased and control tissues like those reported herein. Identifying the genetic basis of these complex diseases in domestic animals like the cat can support genetic investigations into cognate human diseases.

We have made this transcriptomic dataset publicly available on Zenodo (doi: 10.5281/zenodo.17229096) and on the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra/PRJNA882763) to contribute toward improvement of feline resources and for use by feline genetics researchers.

Supplementary Information

Supplementary Material 1. (17.9KB, docx)
Supplementary Material 2. (18.3MB, docx)

Acknowledgements

The authors are especially grateful to the owners of the cats that participated in this study. Tissue and RNA samples were provided by the Cornell Veterinary Biobank, built with the support of NIH grant R24 GM082910 and the College of Veterinary Medicine, Cornell University.

Authors’ contributions

JJH assisted with study design, performed the data analyses, and wrote the original draft of the manuscript. SG assisted in sample collection, interpreted histology reports, and contributed to manuscript preparation. IH assisted in sample collection and contributed to documentation of tissues. TS performed necropsies. LL performed the RNA extraction and contributed to manuscript preparation. JKG managed the preparation and sequencing of the RNA-seq libraries, including RNA quality control, enrichment. FA provided preliminary analysis of the RNA-seq libraries. SC assisted in sample collection. EW assisted in sample collection. MB assisted with data analyses. MGC assisted with study design (biospecimen science) and contributed to manuscript preparation. MEW assisted with study design. LAL helped secure funding and with study design. RJT assisted with study design, secured funding, supervised the project, and wrote the original draft of the manuscript. All authors read and approved the final version of the manuscript.

Funding

This study was funded by a grant from the Cornell University Feline Health Center to RJT (PI).

Data availability

The raw fastq files generated in this study have been deposited in the Sequence Read Archive under BioProject ID PRJNA882763 (https://www.ncbi.nlm.nih.gov/sra/PRJNA882763) and sample accession numbers are listed in Supplementary Table 11. The assembled transcriptome gtf file has been deposited on Zenodo (https://zenodo.org) (doi: 10.5281/zenodo.17229096).

Declarations

Ethics approval and consent to participate

This study was approved by the Cornell University Institutional Animal Care and Use Committee (IACUC), protocol number #2005-0151.

Informed owner consent was granted for all client-owned animals used in this research.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Materials

Supplementary Material 1. (17.9KB, docx)
Supplementary Material 2. (18.3MB, docx)

Data Availability Statement

The raw FASTQ files generated in this study have been deposited in the Sequence Read Archive under BioProject ID PRJNA882763 (https://www.ncbi.nlm.nih.gov/sra/PRJNA882763), and sample accession numbers are listed in Supplementary Table 11. The assembled transcriptome GTF file has been deposited on Zenodo (https://zenodo.org; doi:10.5281/zenodo.17229096).


Articles from BMC Genomics are provided here courtesy of BMC
