[Preprint]. 2025 Jan 2:2023.12.11.571168. Originally published 2023 Dec 12. [Version 3] doi: 10.1101/2023.12.11.571168

Table 2:

Availability of data generated in this paper. The data is available on Caltech Data under the DOIs 10.22002/krqmp-5hy81 and 10.22002/k7xqw-88d74.

| File | Description | Category |
|---|---|---|
| viral sequences in laboratory reagents.h5ad | Count matrix containing virus-like sequences found in sequencing libraries comprised of only sterile water and laboratory reagents | Alignment of ‘blank’ sequencing libraries to the PalmDB |
| host alignment results.zip | Raw alignment results obtained by kallisto after alignment to the macaque and dog (to account for the MDCK spike-in) transcriptomes | Alignment of the macaque PBMC data [37] to the host transcriptome(s) |
| host QC.h5ad | Filtered count matrix containing all host cells | |
| canis QC norm leiden.h5ad | Filtered and clustered count matrix containing MDCK cells | |
| macaque QC norm leiden.h5ad | Filtered and clustered count matrix containing macaque cells | |
| macaque QC norm leiden celltypes.h5ad | Filtered and clustered count matrix containing macaque cells with cell type assignments | |
| virus no mask alignment results.zip | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB without masking host sequences | Alignment of the macaque PBMC data [37] to the PalmDB for the detection of viral RNA with different workflows for the masking of host genome(s) and transcriptome(s) |
| virus no mask.h5ad | Count matrix obtained through the alignment above with added metadata | |
| virus dlist cdna alignment results.zip | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB while masking host transcriptome(s) using the D-list | |
| virus dlist cdna.h5ad | Count matrix obtained through the alignment above with added metadata | |
| virus dlist dna alignment results.zip | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB while masking host genome(s) using the D-list | |
| virus dlist dna.h5ad | Count matrix obtained through the alignment above with added metadata | |
| virus dlist cdna dna alignment results.zip | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB while masking host genome(s) and transcriptome(s) using the D-list | |
| virus dlist cdna dna.h5ad | Count matrix obtained through the alignment above with added metadata | |
| virus dlist cdna dna amb alignment results.zip | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB while masking host genome(s) and transcriptome(s) using the D-list + forcing ambiguous sequences to be discarded | |
| virus dlist cdna dna ambiguous.h5ad | Count matrix obtained through the alignment above with added metadata | |
| virus host capture alignment results.tar.gz | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB + reads that align to the host transcriptome(s) were captured | |
| virus host-captured.h5ad | Count matrix obtained through the alignment above with added metadata | |
| virus host capture dlist cdna dna alignment results.tar.gz | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB while masking host genome(s) and transcriptome(s) using the D-list + reads that align to the host transcriptome(s) were captured | |
| virus host-captured dlist cdna dna.h5ad | Count matrix obtained through the alignment above with added metadata | |
| bwa unmapped reads.tar.gz | Raw sequencing files obtained after removal of host sequences based on alignment with bwa | |
| virus bwa alignment results.zip | Raw alignment results obtained by kallisto translated search after alignment to the PalmDB after reads that align to the host genome(s) with bwa were removed | |
| virus bwa.h5ad | Count matrix obtained through the alignment above with added metadata | |
| models.zip | Logistic regression models to predict viral presence based on host gene expression | Logistic regression models |
| palmdb human dlist cdna dna.idx | Pre-computed PalmDB reference index with human genomic and transcriptomic sequences masked using D-list | Pre-computed references for future use with kallisto translated search |
| palmdb mouse dlist cdna dna.idx | Pre-computed PalmDB reference index with mouse genomic and transcriptomic sequences masked using D-list | |
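
The files listed in Table 2 come in four formats: .h5ad count matrices, .zip and .tar.gz archives, and .idx kallisto indices. The following is a minimal Python sketch of how a downloaded file might be inspected, not part of the paper's own workflow. The helper names (file_kind, list_archive, load_count_matrix) are hypothetical, the internal layout of the archives is not assumed, and loading an .h5ad file assumes the third-party anndata package is installed.

```python
from __future__ import annotations

import tarfile
import zipfile


def file_kind(name: str) -> str:
    """Classify a Table 2 file by its extension (hypothetical helper)."""
    lower = name.lower()
    if lower.endswith(".h5ad"):
        return "count matrix (AnnData)"
    if lower.endswith(".zip") or lower.endswith(".tar.gz"):
        return "archive"
    if lower.endswith(".idx"):
        return "kallisto index"
    return "unknown"


def list_archive(path: str) -> list[str]:
    """Return the sorted member names of a .zip or .tar.gz archive without extracting it."""
    lower = path.lower()
    if lower.endswith(".zip"):
        with zipfile.ZipFile(path) as zf:
            return sorted(zf.namelist())
    if lower.endswith(".tar.gz"):
        with tarfile.open(path, "r:gz") as tf:
            return sorted(tf.getnames())
    raise ValueError(f"not a supported archive: {path}")


def load_count_matrix(path: str):
    """Load an .h5ad count matrix as an AnnData object.

    Assumption: the third-party `anndata` package is installed. It is
    imported lazily so the helpers above work without it.
    """
    import anndata

    return anndata.read_h5ad(path)
```

For example, list_archive("models.zip") would return the names of the files stored in that archive once it has been downloaded, and load_count_matrix("virus bwa.h5ad") would return the corresponding count matrix with its metadata.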