2023 Aug 23;2023:gigabyte87. doi: 10.46471/gigabyte.87

Table 2.

File types processed during the testing phase of the aws-s3-integrity-check tool.

File type Description
Bam Compressed binary version of a SAM file that represents aligned sequences up to 128 Mb.
Bed Browser Extensible Data format. This file format is used to store genomic regions as coordinates.
Csv Comma-Separated Values.
Docx File format for Microsoft Word documents.
Fa FASTA format. Text-based file for storing nucleotide or protein sequences.
Fastq Text-based format for storing genome sequencing data and quality scores.
Gct Gene Cluster Text. This is a tab-delimited text format file that contains gene expression data.
Gff General Feature Format. File format used for describing genes and other features of DNA, RNA, and protein sequences.
Gz A file compressed by the standard GNU zip (gzip).
Html HyperText Markup Language file.
Ibd Pre-processed mass spectrometry imaging (MSI) data.
imzML Imaging Mass Spectrometry Markup Language. Contains raw MSI data.
Ipynb Computational notebooks that can be opened with Jupyter Notebook.
Jpg Compressed image format for storing digital images.
JSON JavaScript Object Notation. Text-based format to represent structured data based on JavaScript object syntax.
md5 Checksum file.
Msa Multiple sequence alignment file. It generally contains the alignment of three or more biological sequences of similar length.
Mtx Sparse matrix format. This contains genes in the rows and cells in the columns. It is produced as output by Cell Ranger.
Npy Standard binary file format in NumPy [27] for saving NumPy arrays.
Nwk Newick tree file format to represent graph-theoretical trees with edge lengths using parentheses and commas.
Pdf Portable Document Format.
Py Python file.
Pyc Compiled bytecode file generated by the Python interpreter after a Python script is imported or executed.
R R language script format.
Svg Scalable Vector Graphics. This is a vector file format.
Tab Tab-delimited text or data files.
Tif Tagged Image File Format. Used to store raster graphics and image information.
Tsv Tab-separated values to store text-based tabular data.
Txt Text document file.
Vcf Variant Call Format. Text file for storing gene sequence variations.
Xls Microsoft Excel Binary File format.
Zip A file containing one or more compressed files.
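
The file types above span both text and binary formats, and a checksum-based integrity check can treat all of them the same way because it operates on raw bytes. The table does not describe how the tool computes its checksums, so the following is only an illustrative sketch: the function names `md5_of_file` and `matches_md5_file`, the 8 MiB chunk size, and the assumed layout of a companion md5 file (digest as the first whitespace-separated token) are hypothetical choices, not taken from aws-s3-integrity-check.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=8 * 1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    The chunk size is an arbitrary illustrative default (hypothetical).
    """
    digest = hashlib.md5()
    # Binary mode: the digest depends only on the bytes, not on the file type.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(data_path, md5_path):
    """Compare a file's digest with the first token of a companion md5 file.

    Assumes the md5 file starts with the hex digest, as written by `md5sum`.
    """
    expected = Path(md5_path).read_text().split()[0].lower()
    return md5_of_file(data_path) == expected
```

Reading in chunks keeps memory use bounded for large files such as Bam or Fastq, and the chunk size does not change the resulting digest.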