COVID-19 patients and Dementia: Frontal cortex transcriptomic data

Maria Garofalo; Stella Gagliardi; Susanna Zucca; Cecilia Pandini; Francesca Dragoni; Daisy Sproviero; Orietta Pansarasa; Tino Emanuele Poloni; Valentina Medici; Annalisa Davin; Silvia Damiana Visonà; Matteo Moretti; Antonio Guaita; Mauro Ceroni; Livio Tronconi; Cristina Cereda

Data Brief. 2021 Sep 29;38:107432. doi: 10.1016/j.dib.2021.107432

PMCID: PMC8479506 PMID: 34604485

Abstract

Given the association of SARS-CoV-2 infection with Nervous System (NS) manifestations, we performed RNA-sequencing analysis of the Frontal Cortex of COVID-19-positive and COVID-19-negative individuals, with or without Dementia. We examined gene expression differences between individuals with COVID-19 and Dementia and patients with Dementia only by collecting transcript counts in each sample and performing Differential Expression analysis. Eleven genes satisfied our significance criteria, all of them protein-coding.

These data are suitable for integration with additional samples and for re-analysis under different classifications of the individuals. Differential expression evaluation may also be extended to other scientific purposes, such as the search for unannotated genes, mRNA splicing and gene isoforms.

The analysis of Differentially Expressed genes in COVID-19 positive patients compared to non-COVID-19 patients is published in: S. Gagliardi, E.T. Poloni, C. Pandini, M. Garofalo, F. Dragoni, V. Medici, A. Davin, S.D. Visonà, M. Moretti, D. Sproviero, O. Pansarasa, A. Guaita, M. Ceroni, L. Tronconi, C. Cereda, Detection of SARS-CoV-2 genome and whole transcriptome sequencing in frontal cortex of COVID-19 patients, Brain. Behav. Immun. (2021). https://doi.org/10.1016/j.bbi.2021.05.012.

Keywords: SARS-CoV-2, Transcriptomics, Gene expression, Brain

Specifications Table

Subject	Omics: Transcriptomics
Specific subject area	Bulk Whole RNA-sequencing and Differentially Expressed Genes analysis
Type of data	Table Graph Figure
How data were acquired	- Illumina NextSeq 500 Sequencer - Illumina bcl2fastq2 (Version 2.17.1.14 - http://support.illumina.com/downloads/bcl-2fastq-conversion-software-v217.html) - STAR/RSEM (1.3.3) - R (v4.0.2) packages: DESeq2 (1.30.0), enrichR (2.1)
Data format	Raw Analyzed Filtered
Parameters for data collection	We considered four conditions for data collection: (1) COVID-19 individuals with Dementia; (2) NO COVID-19 individuals with Dementia; (3) COVID-19 individuals without Dementia; (4) NO COVID-19 individuals without Dementia.
Description of data collection	Data were collected through bulk RNA-sequencing of total RNA extracted from post-mortem Frontal Cortexes.
Data source location	(1) Institution: IRCCS Mondino Foundation City: Pavia Country: Italy
Data accessibility	Repository name: GEO (Gene Expression Omnibus)-NCBI Data identification number: GSE164332 Direct URL to data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE164332
Related research article	S. Gagliardi, E.T. Poloni, C. Pandini, M. Garofalo, F. Dragoni, V. Medici, A. Davin, S.D. Visonà, M. Moretti, D. Sproviero, O. Pansarasa, A. Guaita, M. Ceroni, L. Tronconi, C. Cereda, Detection of SARS-CoV-2 genome and whole transcriptome sequencing in frontal cortex of COVID-19 patients, Brain. Behav. Immun. (2021). https://doi.org/10.1016/j.bbi.2021.05.012.

Value of the Data

• We used Next Generation Sequencing to provide transcriptomic profiles of the Frontal Cortex of COVID-19-positive and COVID-19-negative individuals, with or without Dementia. Such screenings are important for studying the impact of the current infectious disease on the Central Nervous System, so-called NeuroCOVID-19, and on diverse comorbidities of the elderly, such as Dementia. The aim was to collect information on RNA alterations in the prefrontal cortex, given its contribution to hemodynamic responses.
• These data can help in the study of the molecular features of SARS-CoV-2 in the brain. Moreover, the dysregulation of specific pathways can be inferred from transcriptomic data, making them a source of biomarkers.
• The versatility of both raw and analysed RNA-sequencing data lies in their suitability for several purposes, such as gene expression analysis, discovery of unannotated genes, investigation of mRNA splicing and study of gene isoforms. In addition, data in standard formats, such as FastQ and BAM files, as well as gene expression tables reporting raw counts, FPKM and TPM values, can be easily re-used and integrated with additional samples or exploited to refine the analysis with a different classification of the individuals.

1. Data Description

A summary of the demographic and clinical features of the cases included in the transcriptomic investigation is reported in Table 1. Six individuals had Dementia only, seven had Dementia and COVID-19, two had neither Dementia nor COVID-19, and two had COVID-19 but not Dementia.

Table 1.

Summary of the demographic and clinical characteristics of COVID-19 and NON-COVID-19 cases. In the COVID column, “+” stands for positive and “−” stands for negative.

CASE	COVID	AGE	SEX	PMD (hours)	DEMENTIA
COV1	+	74	F	168	Dem
COV3	+	87	M	168	Dem
COV4	+	67	M	120	No Dem
COV5	+	94	F	72	Dem
COV7	+	80	F	360	No Dem
COV8	+	83	F	312	Dem
COV9	+	92	M	144	Dem
COV10	+	81	M	168	Dem
COV6	+	90	F	264	No Dem
BB247	−	104	F	6	Dem
BB236	−	80	M	15	Dem
BB109	−	79	M	16	No Dem
BB47	−	78	F	8	Dem
BB271	−	84	F	2	Dem
BB138	−	85	F	15	Dem
BB120	−	84	M	10	Dem
BB118	−	79	M	3	No Dem

Abbreviations: PMD = Post Mortem Delay; Dem = Dementia; No Dem = No Dementia.

In Supplementary Table 1, the counts of each gene (specified as Ensembl ID) are indicated for each sample submitted to sequencing.

The number of coding and non-coding counts was evaluated for each sample and, as shown in Fig. 1, coding counts were the most abundant. This result agrees with current knowledge that non-coding transcripts are globally less expressed than coding ones within the cell [1,2]. BB109 was non-uniform in terms of count abundance and did not pass the quality check, so this sample was excluded from further analysis.

A differential expression analysis of genes was performed. We compared the group of individuals with COVID-19 and Dementia (n = 7) with those with Dementia only (n = 6). To evaluate the clustering resulting from this analysis, all the deregulated genes are represented in the heatmap in Fig. 2. The list of genes considered significant in this analysis is available in Supplementary Table 2. We found 11 dysregulated genes, 4 up-regulated and 7 down-regulated, all of them protein-coding. The table reports Ensembl ID, base mean, log2FoldChange, lfcSE, stat, P-value, adjusted P-value, gene name, gene biotype and gene source.

Fig. 2. Heatmap of top 11 Differentially Expressed (DE) genes. Samples from COVID-19 positive individuals with Dementia (n = 7) are marked in light blue, while patients with Dementia only (n = 6) are marked in pink. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).

We also performed a differential expression analysis comparing COVID-19 patients without Dementia (n = 2) with COVID-19-negative individuals without Dementia (n = 1), but no gene was significantly deregulated under our filtering criteria, as reported in Supplementary Table 2.

The volcano plot in Fig. 3 shows statistical significance (P-value) versus magnitude of change (fold change) of differentially expressed (DE) genes in individuals with COVID-19 and Dementia (n = 7) versus individuals with Dementia only (n = 6). The number of genes with |log2(fold change)| ≥ 1 that are also statistically significant is low.
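
As a rough illustration of how volcano-plot coordinates are usually derived, the sketch below maps each gene to an x value (log2 fold change) and a y value (−log10 of the P-value). The function name `volcano_coordinates` and the toy records are hypothetical and are not taken from the dataset or from the authors' code. Skipping genes with a missing P-value is an assumption about how such records might be handled.

```python
import math

def volcano_coordinates(results):
    """Map (gene, log2FC, p-value) records to (gene, x, y) volcano points.

    x is the log2 fold change; y is -log10(p-value), so smaller
    p-values land higher on the plot.
    """
    points = []
    for gene, log2fc, pvalue in results:
        if pvalue is None or pvalue <= 0:
            # -log10 is undefined for missing or non-positive p-values.
            continue
        points.append((gene, log2fc, -math.log10(pvalue)))
    return points

# Hypothetical records for illustration only, NOT values from the dataset.
toy_results = [
    ("geneA", 2.0, 0.001),
    ("geneB", -0.3, 0.5),
    ("geneC", -1.5, None),  # missing p-value, dropped from the plot
]

points = volcano_coordinates(toy_results)
for gene, x, y in points:
    print(f"{gene}: log2FC={x:+.2f}, -log10(p)={y:.2f}")
```

The resulting points could be passed to any plotting library; genes far from x = 0 and high on the y axis are those with both a large change and strong statistical support.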

2. Experimental Design, Materials and Methods

Autopsy human brain samples were used to collect these data. RNA was isolated with Trizol reagent (Life Science Technologies, Italy) according to the manufacturer's instructions and processed as described in Gagliardi et al. [1].

Starting from 1 µg of total RNA, sequencing libraries were prepared with the CORALL Total RNA-Seq Library Prep Kit (Lexogen, Vienna, Austria) and sequenced on an Illumina NextSeq 500 Sequencer (Illumina, San Diego, CA) as described in Gagliardi et al. [1]. FastQ files were generated via Illumina bcl2fastq2 (Version 2.17.1.14 - http://support.illumina.com/downloads/bcl-2fastq-conversion-software-v217.html) starting from the raw sequencing reads produced by the Illumina NextSeq sequencer.

The quality of individual sequences was evaluated using MultiQC software (https://multiqc.info/) after adapter trimming with cutadapt software. UMI sequences were marked and deduplicated with UMI-tools software [2]. Per-base sequence quality plots, showing the mean quality value at each base position in the read, are shown in Fig. 4. Gene and transcript intensities and differential expression analysis for mRNA and non-coding RNAs were computed as in Gagliardi et al. [1]. The human genome reference used for the alignment was GRCh38 (Gencode release 36), containing up-to-date records for both coding and non-coding RNAs. Coding and non-coding genes were considered differentially expressed and retained for further analysis when |log2(disease sample/healthy control)| ≥ 1 and FDR ≤ 0.1. Inter- and intra-group variability was assessed and is shown in Fig. 5. On average, 29.2 M reads were available for each sample and 22.7 M reads were aligned against the reference genome (average overall alignment rate: 77.9%). The number of input reads, average read length, number of aligned reads and alignment rate are reported in Table 2 for each sample. Transcripts with a count value of at least 5 were retained for differential expression analysis. On average, 16734.8 coding genes and 5370.6 non-coding genes were expressed in each sample.
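
The significance criteria stated above (|log2 fold change| ≥ 1 and FDR ≤ 0.1) can be expressed as a small filter. The following is a minimal sketch, not the authors' pipeline, which used DESeq2 in R. The function name `classify_gene`, the parameter names and the toy records are hypothetical. Treating a missing adjusted P-value as "not significant" is an assumption.

```python
def classify_gene(log2fc, padj, lfc_min=1.0, fdr_max=0.1):
    """Return 'up', 'down' or None for one gene.

    A gene is called differentially expressed when |log2FC| >= lfc_min
    and the adjusted p-value (FDR) is <= fdr_max.
    """
    if padj is None or padj > fdr_max:
        # Assumption: genes without an adjusted p-value are not called.
        return None
    if log2fc >= lfc_min:
        return "up"
    if log2fc <= -lfc_min:
        return "down"
    return None

# Hypothetical (log2FC, adjusted p-value) pairs, NOT values from the dataset.
toy_results = {
    "geneA": (1.8, 0.02),    # passes both thresholds, up-regulated
    "geneB": (-2.1, 0.09),   # passes both thresholds, down-regulated
    "geneC": (0.4, 0.001),   # significant but change too small
    "geneD": (-1.2, 0.30),   # large change but FDR too high
    "geneE": (1.0, 0.1),     # exactly on both thresholds, retained
    "geneF": (3.0, None),    # no adjusted p-value available
}

calls = {gene: classify_gene(lfc, padj) for gene, (lfc, padj) in toy_results.items()}
up = sorted(gene for gene, call in calls.items() if call == "up")
down = sorted(gene for gene, call in calls.items() if call == "down")
print("up-regulated:", up)
print("down-regulated:", down)
```

Both thresholds are inclusive here, following the "≥ 1" and "≤ 0.1" wording of the criteria.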

Fig. 4. Quality assessment of FASTQ sequence data for paired end and right reads. Each plot shows the mean quality value across each base position in the read for all the analyzed samples.

Fig. 5. Panel A shows the result of Principal Component Analysis on the whole dataset. Panel B shows an estimate of the dispersion parameter for each gene. Panel C shows the heatmap of the sample-to-sample distances, obtained with the DESeq2 package on regularized-logarithm transformed counts. The color code is reported above the heatmap.

Table 2.

For each sample indicated in “Sample_name” column, the total number of input reads, the average read length, the number of reads uniquely mapped to the reference genome and the overall alignment rate are reported.

Sample_name	Number of input reads	Average input read length	Uniquely mapped reads num	Uniquely mapped reads percentage
BB109	2043229	136	1465303	71.72%
BB118	7818855	136	6856173	87.69%
BB120	41915410	130	34790052	83.00%
BB138	14439548	136	12677228	87.80%
BB236	8773432	137	7747656	88.31%
BB247	72990663	133	62361438	85.44%
BB271	52689242	131	43149563	81.89%
BB47	42194239	135	36628979	86.81%
COV1	3558188	136	2844285	79.94%
COV10	37740062	132	28378973	75.20%
COV3	37337535	126	13450528	36.02%
COV4	35786418	134	29279441	81.82%
COV5	2309778	136	1981240	85.78%
COV6	39798363	134	33643489	84.53%
COV7	11548099	136	9483185	82.12%
COV8	38311594	131	24019227	62.69%
COV9	46815183	132	37817645	80.78%
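
The summary figures quoted in the text can be recomputed from Table 2. The sketch below copies the read counts from the table and derives the per-sample alignment rate, the mean number of input and uniquely mapped reads, and a pooled alignment rate (total mapped reads divided by total input reads). Reading the reported overall rate as this pooled ratio is an assumption; the names `TABLE_2` and `alignment_rate` are illustrative.

```python
# (sample, number of input reads, uniquely mapped reads), copied from Table 2.
TABLE_2 = [
    ("BB109", 2043229, 1465303),
    ("BB118", 7818855, 6856173),
    ("BB120", 41915410, 34790052),
    ("BB138", 14439548, 12677228),
    ("BB236", 8773432, 7747656),
    ("BB247", 72990663, 62361438),
    ("BB271", 52689242, 43149563),
    ("BB47", 42194239, 36628979),
    ("COV1", 3558188, 2844285),
    ("COV10", 37740062, 28378973),
    ("COV3", 37337535, 13450528),
    ("COV4", 35786418, 29279441),
    ("COV5", 2309778, 1981240),
    ("COV6", 39798363, 33643489),
    ("COV7", 11548099, 9483185),
    ("COV8", 38311594, 24019227),
    ("COV9", 46815183, 37817645),
]

def alignment_rate(input_reads, mapped_reads):
    """Uniquely mapped reads as a percentage of input reads."""
    return 100.0 * mapped_reads / input_reads

n_samples = len(TABLE_2)
total_input = sum(row[1] for row in TABLE_2)
total_mapped = sum(row[2] for row in TABLE_2)

mean_input_m = total_input / n_samples / 1e6    # millions of reads
mean_mapped_m = total_mapped / n_samples / 1e6  # millions of reads
pooled_rate = alignment_rate(total_input, total_mapped)
per_sample_rate = {name: alignment_rate(i, m) for name, i, m in TABLE_2}

print(f"mean input reads:  {mean_input_m:.1f} M")
print(f"mean mapped reads: {mean_mapped_m:.1f} M")
print(f"pooled alignment rate: {pooled_rate:.1f}%")
```

Under this reading, the script gives about 29.2 M input reads and 22.7 M mapped reads per sample and a pooled rate of about 77.9%, in line with the values reported in the text.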

Ethics Statement

The study protocol was approved by the Ethics Committee of the University of Pavia on October 6th, 2009 (Committee report 3/2009). For deceased subjects, consent is not required, as the samples had been taken in any case for clinical/forensic purposes and because it is not possible to contact the next of kin in such circumstances. The reference law is authorization no. 9/2016 of the Guarantor of Privacy, later replaced by Regulation (EU) 2016/679 of the European Parliament and of the Council.

CRediT authorship contribution statement

Maria Garofalo: Writing – original draft, Methodology. Stella Gagliardi: Writing – original draft, Methodology. Susanna Zucca: Software, Writing – original draft. Cecilia Pandini: Methodology, Data curation. Francesca Dragoni: Methodology, Data curation. Daisy Sproviero: Writing – review & editing. Orietta Pansarasa: Writing – review & editing. Tino Emanuele Poloni: Writing – original draft. Valentina Medici: Writing – original draft. Annalisa Davin: Writing – review & editing. Silvia Damiana Visonà: Data curation. Matteo Moretti: Data curation. Antonio Guaita: Supervision. Mauro Ceroni: Supervision. Livio Tronconi: Supervision. Cristina Cereda: Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.

Acknowledgments

We thank the Abbiategrasso Brain Bank (ABB) donors and the COVID-19 patients who donated the noblest organ of their body. We would also like to thank Dr. Antonio Traversi and Prysmian Group for their support.

Footnotes

Supplementary material associated with this article can be found in the online version at doi:10.1016/j.dib.2021.107432.

Appendix. Supplementary materials

mmc1.xlsx (7.1 MB, xlsx)

mmc2.xlsx (10.9 KB, xlsx)

References

1. Gagliardi S., Poloni E.T., Pandini C., Garofalo M., Dragoni F., Medici V., Davin A., Visonà S.D., Moretti M., Sproviero D., Pansarasa O., Guaita A., Ceroni M., Tronconi L., Cereda C. Detection of SARS-CoV-2 genome and whole transcriptome sequencing in frontal cortex of COVID-19 patients. Brain. Behav. Immun. 2021. doi: 10.1016/j.bbi.2021.05.012.
2. Smith T., Heger A., Sudbery I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 2017;27:491–499. doi: 10.1101/gr.209601.116.
