DCMDSM: a DICOM decomposed storage model

Alexandre Savaris; Theo Härder; Aldo von Wangenheim

doi:10.1136/amiajnl-2013-002337

2014 Feb 3;21(5):917–924. doi: 10.1136/amiajnl-2013-002337

PMCID: PMC4147623 PMID: 24491269

Abstract

Objective

To design, build, and evaluate a storage model able to manage heterogeneous digital imaging and communications in medicine (DICOM) images. The model must be simple, but flexible enough to accommodate variable content without structural modifications; must be effective in answering query/retrieval operations according to the DICOM standard; and must provide performance gains in querying/retrieving content to justify its adoption by image-related projects.

Methods

The proposal adapts the original decomposed storage model, incorporating structural and organizational characteristics present in DICOM image files. Tag values are stored according to their data types/domains, in a schema built on top of a standard relational database management system (RDBMS). Evaluation includes storing heterogeneous DICOM images, querying metadata using a variable number of predicates, and retrieving full-content images for different hierarchical levels.

Results and discussion

When compared to a well established DICOM image archive, the proposal is 0.6–7.2 times slower in storing content; however, in querying individual tags, it is about 48.0% faster. In querying groups of tags, DICOM decomposed storage model (DCMDSM) is outperformed in scenarios with a large number of tags and low selectivity (being 66.5% slower); however, when the number of tags is balanced with better selectivity predicates, the performance gains are up to 79.1%. In executing full-content retrieval, in turn, the proposal is about 48.3% faster.

Conclusions

DCMDSM is a model built for the storage of heterogeneous DICOM content, based on a straightforward database design. The results obtained through its evaluation attest to its suitability as a storage layer for projects where DICOM images are stored once and queried/retrieved whenever necessary.

Keywords: Information Storage and Retrieval, Database Management Systems, Decomposed Storage Model, Digital Imaging and Communications in Medicine

Introduction

The Digital Imaging and Communications in Medicine (DICOM) standard, first released in 1985 as ACR/NEMA 300, comprises a set of non-proprietary specifications regarding structure, format, and exchange protocols for digital-based medical images.¹ ² Combining alphanumerical and binary content into the same image file, the standard defines a self-contained approach for data storage and communication, organized through a hierarchy composed of patient, study, series, and image levels.

Once acquired, DICOM images are stored according to the particular needs of the involved stakeholders. The storage policies vary from simple file persistence in ordinary file systems, to extraction and indexing of particular image attributes in metadata catalogs and/or databases, demanding full-content parsing on the former and index lookups on the latter in executing query/retrieval operations.³ Usually, simpler strategies on storage demand complex and time-consuming routines for searching and retrieving content.

Aiming to contribute to reducing the time spent for both query and retrieval workloads, this work defines and evaluates a data model designed to provide full-content storage for DICOM images, as well as full-metadata indexing, allowing the execution of flexible search operations. Originally conceived to accept heterogeneous content, the proposal is well adapted to manage images from different examination modalities and medical device manufacturers, being able to boost query/retrieval through extraction/indexing of attributes according to their data types/domains.

Background and significance

Structure and organization of DICOM image files

Physically, the content of a DICOM image file can be seen as structured at the attribute level and as semi-structured at the file level. At the lowest organizational level, tags identified by group/element ordered pairs represent attributes. DICOM tags are characterized by value representations (VRs) and value multiplicities (VMs), which specify content data types/domains, formatting rules, and number of data elements allowed per tag.⁴ The DICOM standard defines a data dictionary composed of a set of tags with reserved group/element identifiers, allowing its expansion through the use of proprietary tags.⁵ At the file level, in turn, a DICOM image is structured as a set of tags. The number of tags in a file varies according to the availability of information during the examination scheduling and execution, as well as the examination modality to be performed (eg, CT, MRI), and the medical device manufacturer.
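As a minimal illustration of the attribute-level structure described above, the sketch below models a tag as a group/element pair carrying a VR and one or more values. The class and field names are illustrative assumptions, and the third tag is a made-up proprietary example. The odd-group test for proprietary tags follows the usual DICOM convention for private groups.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Tag:
    """One DICOM attribute: group/element pair, VR code, and its values."""
    group: int
    element: int
    vr: str
    values: Tuple[str, ...]

    @property
    def is_private(self) -> bool:
        # Proprietary (private) tags use odd group numbers.
        return self.group % 2 == 1

    @property
    def multiplicity(self) -> int:
        # Number of data elements actually present for this tag.
        return len(self.values)

    def label(self) -> str:
        # Conventional (gggg,eeee) hexadecimal notation.
        return f"({self.group:04X},{self.element:04X})"


patient_id = Tag(0x0010, 0x0020, "LO", ("PAT-001",))
position = Tag(0x0020, 0x0032, "DS", ("-125.0", "-125.0", "40.5"))
vendor_tag = Tag(0x0009, 0x0010, "LO", ("HYPOTHETICAL VENDOR",))  # made-up value
```

A file can then be treated as an ordered list of such tags, whose length varies with modality and manufacturer.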

Motivations for metadata indexing and adaptive, full-content storage

Although the image-specific data stored in DICOM files is the most relevant part of the standard-defined content, the accompanying metadata play an important role as a complement or as a self-contained dataset. As a complement, metadata can be used in searching similar images based on attribute-value matching between an image source and an image database.⁶ ⁷ The DICOM standard itself specifies query (C-FIND) and retrieval (C-GET, C-MOVE) operations in terms of comparisons performed on key attribute values.⁸ The prediction of compressed-image quality is also possible, through the evaluation of specific DICOM tags.⁹ As a self-contained dataset, in turn, metadata can be used in calculations and in monitoring radiation dose levels to which patients are exposed, allowing the identification of relevant variations and level shifts.¹⁰–¹²

Considering that the achievement of different goals implies the use of different subsets of data, storage strategies capable of managing heterogeneous and evolving content become a necessity. Current approaches to managing DICOM image data offer limited or even nonexistent support for content-variant datasets, which reduces their suitability for scenarios where new tags can become available over time.

Strategies for managing DICOM content

The DICOM structure and organization allow the adoption of a number of strategies aiming at the execution of storage, query, and retrieval operations, according to constraints like hardware and software availability, volumes of data to be managed, and usage contexts (eg, image visualization and/or manipulation, image exchange, statistical and operational analysis over metadata). In the simplest approach, the content storage is made using common file system architectures running on ordinary hardware; the DICOM semantically defined hierarchy is physically constructed using directories and subdirectories, translating the patient/study/series/image levels into a directory tree.¹³ It is a low-cost option for quick setups, easing expansions in terms of storage capacity. However, deep content searches are quite restrictive, demanding individual file-content parsing and imposing a significant overhead considering high volumes of data. Improvements using combined techniques based on enhanced data models (eg, hierarchical data format (HDF), network common data format (NetCDF)), and enhanced file systems (eg, parallel virtual file system (PVFS)), bring distribution and partitioning to the file system strategy.¹⁴–¹⁶ The drawback of such an approach remains in the search for specific attribute values, which still demands file-content parsing.
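The directory-tree strategy can be sketched as a simple path construction. The function name, the root directory, and the `.dcm` suffix are assumptions made for illustration; the cited systems may lay out their trees differently.

```python
from pathlib import PurePosixPath


def image_path(root, patient_id, study_uid, series_uid, sop_uid):
    """Map the four hierarchy identifiers to one file location."""
    # One directory per patient, study, and series; one file per image.
    return PurePosixPath(root) / patient_id / study_uid / series_uid / f"{sop_uid}.dcm"


path = image_path("/archive", "PAT-001", "1.2", "1.2.3", "1.2.3.4")
```

Locating content by hierarchy identifiers is then a path lookup, whereas a search on any other attribute still has to open and parse every candidate file.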

The lack of metadata for query/retrieval in file-system-based storage can be addressed through low- and high-level strategies, using extended file attributes and distributed metadata catalogs.¹⁷ ¹⁸ Both approaches improve the hierarchical and distributed alternatives based on file systems, allowing searches using metadata comparisons. The restrictions in these strategies are directly related to which metadata are used, considering that the extended attributes and metadata catalogs are defined in terms of a subset of the available content.

Relational database management systems (RDBMSs) are alternatives available for structured storage. In these systems, the hierarchical relationship between patients, studies, series, and images is usually implemented through joins performed among tables defined for each level of the hierarchy, improving search performance through the use of indexes.¹⁹–²¹ The database schemas follow the horizontal model, characterized by a reduced number of tables with numerous fields per table. Such organization maps the conceptual data model accordingly; however, it imposes restrictions regarding its use in heterogeneous use-case scenarios.

For fixed-structured datasets and applications designed for individual healthcare institutions and/or examination modalities, the approaches mentioned above are quite consistent. However, they lack flexibility, demanding maintenance whenever new data elements become available. In scenarios of dynamic content like DICOM, such fixed structures limit the data content and, consequently, the search capabilities.

Materials and methods

The decomposed storage model

DSM (decomposed storage model) is a storage model based on the decomposition of relations from a conceptual schema into a set of simpler, binary relations. For each attribute originally defined in the conceptual schema, the model proposes creating a binary relation composed by a surrogate key and the attribute value (clustered on the surrogate key), and a binary relation with the same structure, but clustered on the attribute.²² This storage model can be considered the predecessor for the current column-oriented architectures, providing a number of improvements (eg, optimization on a per-column access, reduction/elimination of data sparsity), and known drawbacks (eg, reduced performance on manipulating sets of correlated attributes, for both reading and writing operations, involving a large number of inter-table joins and database insertions), when compared to row-oriented architectures.²³
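The decomposition can be sketched in a few lines. This is an in-memory illustration under assumed names (`decompose`, the sample attributes), not the original DSM implementation: each attribute becomes a list of (surrogate key, value) pairs, with a second copy ordered by value standing in for the relation clustered on the attribute.

```python
def decompose(rows, key="id"):
    """Split horizontal rows (dicts) into one binary relation per attribute."""
    relations = {}
    for row in rows:
        surrogate = row[key]
        for attribute, value in row.items():
            if attribute == key or value is None:
                continue  # missing values are simply not stored (no sparsity)
            relations.setdefault(attribute, []).append((surrogate, value))
    return relations


rows = [
    {"id": 1, "modality": "CT", "rows": 512, "kvp": 120},
    {"id": 2, "modality": "MR", "rows": 256, "kvp": None},
]

# Relations in surrogate-key order (rows arrive ordered by key here).
by_key = decompose(rows)
# Second copy of each relation, ordered by attribute value.
by_value = {
    attribute: sorted(pairs, key=lambda pair: pair[1])
    for attribute, pairs in by_key.items()
}
```

Reading one attribute touches a single small relation, while rebuilding a full row requires one lookup per attribute, which mirrors the trade-off described above.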

Due to its simplicity, the DSM storage model is suitable for extensions and customizations. In its original/modified form, the model is adopted in a number of scenarios including database self-tuning, multi-tenant Software as a Service (SaaS) applications, management of semantic web data and heterogeneous biomedical data.²⁴–²⁷ In this work, the original DSM architecture is adapted to incorporate characteristics found in DICOM image files, aiming to provide a full-content storage model with performance gains for query/retrieval operations.

The proposal: DICOM decomposed storage model

The DICOM decomposed storage model (DCMDSM) proposed in this work is based on the original DSM, adapting its characteristics to a scenario of full-content storage for medical image data organized according to the DICOM standard, expecting a 1:n storage-query/retrieval ratio. Assuming that a DICOM file is stored once, and its content is queried/retrieved whenever necessary, the proposal focuses on enhancing query/retrieval aiming to reduce its execution time, to the detriment of the storage execution time. The model differs from already known approaches in the following:

All standard and proprietary tags extracted from DICOM image files are stored/indexed, aiming to provide full flexibility on query construction and execution. This generalization allows managing content from heterogeneous examination modalities, as well as content acquired from devices of different manufacturers, without schema modifications.
Metadata access through Hierarchical Search Methods and Relational-Queries, as defined by the DICOM standard, can be performed using predicates combining any unique/required/optional search key.
Content retrieval can be performed at pixel data (image) level, or at full-content (metadata+pixel data) level.

The proposal is built on top of a standard RDBMS, and physically it is centered on the hierarchical_key table. This table is responsible for the management of surrogate keys (a characteristic from the original DSM), and their binding to the four tag values that identify each level of the DICOM hierarchy (ie, patientid, studyinstanceuid, seriesinstanceuid, sopinstanceuid). For each DICOM image file stored, a new record is inserted in the hierarchical_key table, generating a new surrogate key.

Tags extracted from DICOM image files during parsing time are stored in different tables, according to their VRs. This approach modifies the definition of the original DSM, which states that values for the same attribute must be stored in particular/exclusive tables, clustered by key and value. Mapping the DICOM structure to the original DSM definition implies the creation of a physical model with more than 6000 tables, only for standard tags, incurring schema modifications for each new standard/proprietary tag added over time. The vertical partitioning by VR used in this proposal is simpler and quite consistent, considering that VR definitions have suffered minimal changes since their adoption as part of the DICOM standard, allowing the inclusion of new tags without further modifications in the database schema. The model is further simplified by creating a single table per VR (instead of two tables), replacing the clustering on key/value by indexes.
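A minimal sketch of the VR-driven partitioning follows, assuming parsed tags arrive as (group, element, VR, values) tuples. The `<vr>_value` table-naming pattern mirrors the table names shown in figure 1 (eg, lt_value, da_value); the function names and record layout are illustrative assumptions.

```python
from collections import defaultdict


def table_for(vr: str) -> str:
    """Name of the VR-specific table that receives values of this VR."""
    return f"{vr.lower()}_value"


def partition_by_vr(tags):
    """Group parsed tags into per-table batches of records.

    Each record is (tag_order, sequential, group, element, value):
    tag_order is the tag position in the file, sequential the position
    of the value inside a multi-valued tag.
    """
    batches = defaultdict(list)
    for tag_order, (group, element, vr, values) in enumerate(tags):
        for sequential, value in enumerate(values):
            batches[table_for(vr)].append(
                (tag_order, sequential, group, element, value)
            )
    return dict(batches)


tags = [
    (0x0008, 0x0020, "DA", ["20130821"]),
    (0x0020, 0x0032, "DS", ["-125.0", "-125.0", "40.5"]),
]
batches = partition_by_vr(tags)
```

Because the target table depends only on the VR, a tag never seen before lands in an existing table and no schema change is needed.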

Another difference between the proposal and the original DSM is the number of fields per table. While the conceptual DSM is based on binary relations (composed by the surrogate key and one attribute from the conceptual schema), the physical DCMDSM uses n-ary tables, with n varying according to each VR. The surrogate key field is used as a foreign key to the hierarchical_key table, allowing the establishment of a relationship between tag values and hierarchical level identifiers, related to a specific DICOM image file. For all VR tables, indexes are created on the primary key and on the group/element fields; textual and numerical VRs are indexed, also, by value. An excerpt extracted from the proposed model is presented in figure 1.

Figure 1. Excerpt extracted from the DICOM decomposed storage model (DCMDSM). Centered on the hierarchical_key table, the model distributes data in value representation-specific structures (eg, *lt_value* for the VR *Long Text*, *da_value* for the VR *Date*), binding tags from the same original files through surrogate keys. Each VR record is uniquely identified by the surrogate key, the *tag order* in the original file (generated at parsing time), and the *sequential* value of the record inside the tag (which is relevant for tags whose value multiplicity is greater than 1). The *type* field classifies each record as pertaining to a header/data tag, and the *group*/*element* fields identify the meaning of the tag, as well as its classification as standard/proprietary. The *value* field stores the content for each data component of the tag, varying its data type/domain according to the VR specifications in the DICOM standard. The *length* field is used to store the real DICOM length for the tags’ content (which can be different from the real length of the physical content, due to the padding rules established by the standard), and is available whenever the VR supports variable-length data. DICOM, Digital Imaging and Communications in Medicine.
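The excerpt can be approximated with the sketch below. It uses SQLite through Python's standard sqlite3 module only to stay self-contained (the evaluated implementation runs on MariaDB), and the column names, SQL types, and the value index on da_value are assumptions inferred from the description rather than the authors' actual DDL.

```python
import sqlite3

# Assumed column names; "tag_group" avoids the reserved SQL word GROUP.
SCHEMA = """
CREATE TABLE hierarchical_key (
    surrogate_key     INTEGER PRIMARY KEY,
    patientid         TEXT NOT NULL,
    studyinstanceuid  TEXT NOT NULL,
    seriesinstanceuid TEXT NOT NULL,
    sopinstanceuid    TEXT NOT NULL
);
-- Long Text (LT) is variable-length, so it carries a length field.
CREATE TABLE lt_value (
    surrogate_key INTEGER NOT NULL REFERENCES hierarchical_key (surrogate_key),
    tag_order     INTEGER NOT NULL,
    sequential    INTEGER NOT NULL,
    type          TEXT    NOT NULL,
    tag_group     INTEGER NOT NULL,
    tag_element   INTEGER NOT NULL,
    value         TEXT,
    length        INTEGER,
    PRIMARY KEY (surrogate_key, tag_order, sequential)
);
-- Date (DA) is fixed-length, so no length field is needed.
CREATE TABLE da_value (
    surrogate_key INTEGER NOT NULL REFERENCES hierarchical_key (surrogate_key),
    tag_order     INTEGER NOT NULL,
    sequential    INTEGER NOT NULL,
    type          TEXT    NOT NULL,
    tag_group     INTEGER NOT NULL,
    tag_element   INTEGER NOT NULL,
    value         TEXT,
    PRIMARY KEY (surrogate_key, tag_order, sequential)
);
CREATE INDEX lt_value_tag ON lt_value (tag_group, tag_element);
CREATE INDEX lt_value_val ON lt_value (value);
CREATE INDEX da_value_tag ON da_value (tag_group, tag_element);
CREATE INDEX da_value_val ON da_value (value);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# One hierarchical_key record per stored file yields the surrogate key.
cur = conn.execute(
    "INSERT INTO hierarchical_key "
    "(patientid, studyinstanceuid, seriesinstanceuid, sopinstanceuid) "
    "VALUES (?, ?, ?, ?)",
    ("PAT-001", "1.2.3", "1.2.3.4", "1.2.3.4.5"),
)
key = cur.lastrowid

# Study Date (0008,0020), VR DA, stored as the first tag of the file.
conn.execute(
    "INSERT INTO da_value VALUES (?, ?, ?, ?, ?, ?, ?)",
    (key, 0, 0, "data", 0x0008, 0x0020, "20130821"),
)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```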

The model is complemented with a table designed to store the whole content of each original DICOM image file, unmodified, aiming to simplify the retrieval operation and improve its performance. Although it is possible to fully rebuild a file from the individually stored tag values, empirical tests show that transposing the vertical database schema to the original, horizontal representation is a very time-consuming task, unfeasible for practical purposes.

Experiments

In order to evaluate the proposed storage model, a number of experiments were performed to observe its behavior and ratify its suitability to the following operations: storing heterogeneous DICOM images without schema modifications, querying metadata using single- and multi-criteria predicates, and retrieving full-content images for study, series and image hierarchical levels. The evaluation was performed using a number of heterogeneous DICOM image datasets, freely available for research activities, detailed in table 1.²⁸

Table 1.

Image datasets used in the evaluation of DCMDSM

Examination modality	Number of studies	Number of series	Average image files per study	Average image files per series	Average tags per image file	Average metadata size per file (bytes)	Average image size per file (bytes)	Size on disk (MB)
Computed radiography	1	6	6	1	80	802	2 278 594	14
X-ray angiography	2	3	30	20	120	5662	1 442 097	83
Secondary capture	4	6	229	153	64	932	168 897	151
Positron emission tomography	9	23	578	226	161	3085	16 211	111
MRI	16	140	307	35	159	2704	72 006	363
CT	35	105	846	282	132	3888	109 054	3272

DCMDSM, DICOM decomposed storage model.

For comparative purposes, the same operations were executed on a well-established DICOM archive and image manager (dcm4chee v2.17.1), used worldwide by healthcare providers, commercial/open source applications, and research projects.²⁹ The archive uses a hybrid approach, storing metadata in RDBMSs for answering queries, and full-content images in file systems for retrieval. Despite its fixed horizontal database schema (modeled to store a reduced number of tags as fields), the archive allows the storage of the remaining tag values in binary large object (BLOB) fields. The remaining tag values related to the same DICOM hierarchical level are stored in the same table, concatenated into a BLOB field, using a high-level approach similar to the interpreted attribute storage format.³⁰ The following configurations were tested:

Original configuration (OC), the default configuration for the archive. Tags with corresponding fields in the database schema are persisted directly, the remaining tags are ignored, and the full image content is stored in the file system.
Extended configuration (EC), to provide full-content storage of metadata in the database schema. Tags with corresponding fields in the database schema are persisted directly, the remaining tags are concatenated and stored in BLOB fields related to their hierarchical levels, and the full image content is stored in the file system.

All experiments were performed on the following hardware/software setup: Intel Core i7 2.7 GHz, 4 GB DDR3 RAM, 500 GB SATA HD, OS X 10.8.4. The RDBMS used for both DCMDSM and the archive was MariaDB V.10.0.3, without any fine-tuning aimed at performance enhancement.³¹

Results and discussion

Storage

Experiments involving storage measured the time needed for storing the whole dataset described in table 1, comparing the current proposal with both configurations of the archive. For all cases, the dataset's full content was stored consistently, despite its heterogeneity. The VR-driven vertical partitioning used in DCMDSM proved to be effective, and flexible enough to be used in scenarios of variable content.

For all examination modalities, the originally configured archive outperformed the other approaches in storing DICOM content. Considering the whole evaluation scenario, it was on average 57.9% faster than its extended configured pair, and 78.6% faster than DCMDSM. These results are directly related to the reduced number of database insertions (due to the horizontal organization of the archive's database schema), as well as to the fact that only a subset of the available tags is effectively stored.

Configured to perform full metadata storage, the archive presented a decrease in performance for all examination modalities. Considering that the database schema is the same as its originally configured pair, and the number of database insertions remains constant, the measured overhead is mainly related to the processing step needed to prepare the multi-valued, binary content for insertion (concatenating group values, element values, and tag values). This overhead is particularly relevant for positron emission tomography (PET) (which presents the largest number of tags per file). Being 85.7% and 38.9% slower than the OC and the DCMDSM, respectively, the EC turns out to be the worst choice for this examination modality.

Despite its effectiveness in storing heterogeneous datasets without schema modifications, the DCMDSM approach performed poorly when compared to the other configurations. Decomposing the original, conceptual schema into simpler physical tables, the data model requires n+2 variable-length database insertions for an image file with n tags (ie, one insertion for the hierarchical key, one insertion for each tag, and one final insertion for the full content). This high number of insertions (which is a known drawback inherited from the original DSM) compromised the storage performance, which was on average 3.7–7.2 times slower when compared to the originally configured archive, and 0.6–4.8 times slower when compared to the EC. The values acquired through the execution of the experiments are depicted in figure 2.

Figure 2. Storage time per examination modality. The acquired results are related, on an examination-modality basis, to their corresponding dataset sizes, and include the time needed for parsing/extracting individual tags from the original Digital Imaging and Communications in Medicine (DICOM) image files. A combination of dataset size and number of tags per file defines the results, as can be seen by comparing the times for secondary capture and positron emission tomography examination modalities.
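To make the n+2 insertion count concrete, the sketch below applies it to the per-modality averages from table 1. The totals are rough estimates only, since they multiply rounded averages (studies, image files per study, tags per file); the function and variable names are illustrative.

```python
def insertions_per_file(n_tags: int) -> int:
    """One hierarchical key + one record per tag + one full-content record."""
    return 1 + n_tags + 1


# modality: (studies, average image files per study, average tags per file)
TABLE_1 = {
    "Computed radiography": (1, 6, 80),
    "X-ray angiography": (2, 30, 120),
    "Secondary capture": (4, 229, 64),
    "Positron emission tomography": (9, 578, 161),
    "MRI": (16, 307, 159),
    "CT": (35, 846, 132),
}


def estimated_insertions(studies, files_per_study, tags_per_file):
    """Approximate number of database insertions for a whole modality."""
    return studies * files_per_study * insertions_per_file(tags_per_file)


estimates = {
    modality: estimated_insertions(*row) for modality, row in TABLE_1.items()
}
```

Under these assumptions the insertion count grows with both the number of files and the number of tags per file, in line with the storage times reported for secondary capture and positron emission tomography.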

Query

Experiments involving querying were performed on a hierarchical-level basis. For each level (ie, patient, study, series, image), a set of metadata tag values was queried/retrieved, one tag at a time. The number of metadata tags varied according to the level, and the tags were chosen based on sets of tags related to each level, listed in the DICOM standard (a complete list of tags can be found in the online supplementary data section). Query predicates were built using tags defined as unique keys from higher levels, together with the tag defined as the unique key for the current level (eg, patient level was queried using patientid, study level was queried using patientid and studyinstanceuid). Queries were executed 10 times each, using values for the unique keys selected from the stored data for each examination modality, keeping average execution times as results for comparison.
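The predicate construction described above can be sketched as follows. The helper name and the parameterized SQL fragment are assumptions; only the four key names and their ordering come from the text.

```python
# Unique key per level, ordered from the top of the hierarchy down.
LEVEL_KEYS = [
    ("patient", "patientid"),
    ("study", "studyinstanceuid"),
    ("series", "seriesinstanceuid"),
    ("image", "sopinstanceuid"),
]


def level_predicate(level, key_values):
    """Build a WHERE fragment using the keys of this level and all higher ones."""
    names = [name for name, _ in LEVEL_KEYS]
    depth = names.index(level) + 1
    columns = [column for _, column in LEVEL_KEYS[:depth]]
    clause = " AND ".join(f"{column} = ?" for column in columns)
    params = [key_values[column] for column in columns]
    return clause, params


keys = {
    "patientid": "PAT-001",
    "studyinstanceuid": "1.2",
    "seriesinstanceuid": "1.2.3",
    "sopinstanceuid": "1.2.3.4",
}
study_clause, study_params = level_predicate("study", keys)
```

Lower levels add more equality conditions, which is why queries at the image level are the most selective.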

In a global evaluation, it is possible to perceive that the dominant aspect in terms of performance is the number of tags queried per level. Independently of dataset size per examination modality and number of tags per image file, more tags to be individually queried/retrieved directly implies more time spent on the operation.

The query performance was quite similar for both archive configurations, despite the need for fetching/parsing values from BLOB fields in the extended setup. The overall difference between configurations was less than 1.0% in cumulative query time, implying that the time for managing binary-retrieved tag values is negligible when compared to accessing/fetching data from the database.

Due to the simpler structure of its physical tables, DCMDSM was able to perform on average 48.0% faster in single-attribute queries, when compared to the better result obtained using the archive. The best time-reduction achieved was 82.6%, comparing DCMDSM and the EC archive, querying the series level of CT examinations. These results are derived from a well-known advantage of the original DSM over horizontal database schemas, based on the reduction of fetching time by defining a small number of fields per table. Fetching smaller records for projecting the required tag values is a suitable strategy, considering scenarios where tags are selected dynamically. In scenarios where multiple tag values are queried at once, though, horizontal database schemas tend to perform better, reducing the number of physical accesses to database files and fetching larger records. The average execution times obtained through the experiments can be seen in figure 3.

Figure 3. Query time per examination modality, using single tag projections. Unique, required and optional tag values are included in the projection lists. Queries executed in lower levels of the Digital Imaging and Communications in Medicine (DICOM) hierarchy present greater selectivity, due to multi-criteria predicates.

To observe the impact on querying multiple tag values at once, DCMDSM and both configurations of the archive were queried varying the number of tags projected simultaneously, using the CT dataset (the largest dataset available). For each level of the DICOM hierarchy, related tags were grouped in numbers of 5, 10, 15, 20, 25, and 30; due to the variable number of tags per level, the number of groups per level varies as well. A query was executed for each group of tags, and the sum of all query times for the groups with the same number of tags was kept as a partial result. The whole process was executed 10 times, using values for the unique keys selected from the stored data; an average of all partial results is presented as the final result.
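The grouping step can be sketched as below. The text does not say how a remainder smaller than the group size was handled, so this sketch assumes that only complete groups are kept; the function name and the 12-tag example level are likewise hypothetical.

```python
GROUP_SIZES = (5, 10, 15, 20, 25, 30)


def tag_groups(tags, size):
    """Split a level's tag list into consecutive groups of exactly `size` tags.

    Assumption: a trailing remainder shorter than `size` is dropped.
    """
    return [tags[i:i + size] for i in range(0, len(tags) - size + 1, size)]


# Hypothetical level with 12 related tags.
level_tags = [f"tag{i:02d}" for i in range(12)]
groups_per_size = {size: tag_groups(level_tags, size) for size in GROUP_SIZES}
```

With 12 tags, only the sizes 5 and 10 produce any group, which illustrates why the number of groups differs from level to level.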

The behavior of the original and extended configurations of the archive was quite similar, with query times decreasing as the number of tags increases. As expected, a horizontal database schema performs better when more attributes are retrieved simultaneously from each fetched record. A difference of 1.3% in cumulative query time was observed between configurations, favoring the archive's original setup. When compared to DCMDSM, though, the behavior varies among levels of the hierarchy.

At study level, DCMDSM was outperformed by both archive configurations considering all variations in the number of tags per group. The projection of multiple tags simultaneously in vertical-oriented layouts implies access to multiple tables, demanding more time on reading, filtering, and joining results. In contrast, simultaneous retrieving of multiple tags by queries executing on horizontal database schemas favors the archive configurations. In cumulative query time, the archive's extended setup, which presented the best results, was 66.5% faster than DCMDSM.
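The multi-table access can be seen in a small sketch: projecting two tags with different VRs requires one join per VR table. SQLite, the reduced column set, and the sample values are assumptions chosen to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchical_key (surrogate_key INTEGER PRIMARY KEY, studyinstanceuid TEXT);
CREATE TABLE da_value (surrogate_key INTEGER, tag_group INTEGER, tag_element INTEGER, value TEXT);
CREATE TABLE lo_value (surrogate_key INTEGER, tag_group INTEGER, tag_element INTEGER, value TEXT);
""")
conn.execute("INSERT INTO hierarchical_key VALUES (1, '1.2.3')")
# Study Date (0008,0020) has VR DA; Study Description (0008,1030) has VR LO.
conn.execute("INSERT INTO da_value VALUES (1, ?, ?, '20130821')", (0x0008, 0x0020))
conn.execute("INSERT INTO lo_value VALUES (1, ?, ?, 'CHEST CT')", (0x0008, 0x1030))

# Each additional VR in the projection list adds another joined table.
QUERY = """
SELECT d.value, l.value
FROM hierarchical_key AS h
JOIN da_value AS d
  ON d.surrogate_key = h.surrogate_key AND d.tag_group = ? AND d.tag_element = ?
JOIN lo_value AS l
  ON l.surrogate_key = h.surrogate_key AND l.tag_group = ? AND l.tag_element = ?
WHERE h.studyinstanceuid = ?
"""
row = conn.execute(QUERY, (0x0008, 0x0020, 0x0008, 0x1030, "1.2.3")).fetchone()
```

A horizontal schema would return both values from a single fetched record, which is consistent with the archive's advantage at the study level.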

The decrease in performance derived from the multiple-table access in vertical-oriented layouts can be minimized by the use of more selective predicates, as can be seen by comparing the query times obtained for the image level. Despite the total number of tags and groups of tags, DCMDSM was 79.1% faster when compared to the OC of the archive. In intermediate scenarios, in turn, where the number of tags/groups of tags is balanced with a medium selectivity, the proposal outperformed the archive's configurations by 64.1% and 44.2%, respectively, at patient and series levels. The query times obtained for this experiment are summarized in figure 4.

Figure 4. Query time per approach, using projections based on groups of tags. Results are shown according to the number of tags per level, in groups composed of 5–30 tags queried simultaneously.

Retrieval

Experiments on retrieval of full-content images were performed, like the previous query experiments, on a hierarchical-level basis. From study to image level, sets of images were searched, retrieved, and persisted on disk, one image per file, according to the unique keys from higher levels together with the unique key for the current level. After 10 executions, using values for the unique keys selected from the available data, average results were computed for comparison.

To be effective, performance comparisons between examination modalities must consider both image size/complexity and number of images retrieved per hierarchical level. It is common sense to infer that modalities with bigger images perform poorly; however, big studies/series composed by simpler images can also be slow. For instance, retrieval performed at study level for PET examinations (characterized by simple, small images) was approximately 2.7 times slower than x-ray angiography examinations (characterized by bigger, complex images), considering results obtained from all approaches.

As stated earlier, the DICOM archive (independently of its configuration) stores images directly on file systems, using a self-managed directory structure. Retrieving images from the archive consists of an initial search performed in the database (in order to find out the file system path for each image file), followed by reading the file(s) content, and the persistence of such content in new file(s) on disk. As expected, both configurations performed similarly, with slight variances related to the file system access. The global variance between configurations was about 1.1% for the retrieval time.

Retrieving entire images from DCMDSM, in turn, involves joining the hierarchical_key table (for searching by level keys) with the original_content table (responsible for storing the original content of all images, one image per record, using a BLOB field). Once found, images are retrieved directly from the database and persisted on disk as raw data. Searching/retrieving data directly from the database contributed to reducing the whole operation time, allowing an overall performance gain of, approximately, 48.3%. The performance improvement was most significant at the most selective level (retrieval of individual images), achieving the best individual result in secondary capture examinations: a reduction of about 91.3% in retrieval time, when compared to the correspondent modality on the EC archive. These numbers justify the adoption of a table for the storage of unmodified content, despite its impact on the final database size and on the storage time. The average resulting times involving search, retrieval, and persistence on disk are depicted in figure 5.

Figure 5. Retrieval time per examination modality. Measurements include operations involving the file system (eg, reading files, writing files, navigating through directories). The patient level is omitted, since the retrieval of all images from a particular patient (regardless of their examination modalities) is an unusual procedure.
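The retrieval path through DCMDSM can be sketched as a join between hierarchical_key and original_content. SQLite, the column names, and the helper functions are assumptions made for illustration, and the payloads are placeholder bytes (a 128-byte preamble followed by the "DICM" prefix) rather than real images. Writing each BLOB to disk is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchical_key (
    surrogate_key     INTEGER PRIMARY KEY,
    patientid         TEXT NOT NULL,
    studyinstanceuid  TEXT NOT NULL,
    seriesinstanceuid TEXT NOT NULL,
    sopinstanceuid    TEXT NOT NULL
);
CREATE TABLE original_content (
    surrogate_key INTEGER PRIMARY KEY REFERENCES hierarchical_key (surrogate_key),
    content       BLOB NOT NULL
);
""")


def store(conn, patientid, study, series, sop, raw):
    """Register the file in the hierarchy and keep its unmodified bytes."""
    cur = conn.execute(
        "INSERT INTO hierarchical_key "
        "(patientid, studyinstanceuid, seriesinstanceuid, sopinstanceuid) "
        "VALUES (?, ?, ?, ?)",
        (patientid, study, series, sop),
    )
    conn.execute("INSERT INTO original_content VALUES (?, ?)", (cur.lastrowid, raw))
    return cur.lastrowid


def retrieve_series(conn, patientid, study, series):
    """Return (sopinstanceuid, bytes) for every image of one series."""
    rows = conn.execute(
        """
        SELECT h.sopinstanceuid, c.content
        FROM hierarchical_key AS h
        JOIN original_content AS c ON c.surrogate_key = h.surrogate_key
        WHERE h.patientid = ? AND h.studyinstanceuid = ? AND h.seriesinstanceuid = ?
        ORDER BY h.surrogate_key
        """,
        (patientid, study, series),
    )
    return [(sop, bytes(blob)) for sop, blob in rows]


placeholder = b"\x00" * 128 + b"DICM"
store(conn, "PAT-001", "1.2", "1.2.1", "1.2.1.1", placeholder + b"first")
store(conn, "PAT-001", "1.2", "1.2.1", "1.2.1.2", placeholder + b"second")
store(conn, "PAT-001", "1.2", "1.2.2", "1.2.2.1", placeholder + b"other series")
images = retrieve_series(conn, "PAT-001", "1.2", "1.2.1")
```

Search and content fetch happen in one database round trip here, whereas the archive first resolves a file system path and then reads the file.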

Conclusion

This work presents the DCMDSM, a proposal for the management of medical images stored according to the DICOM standard, through the extension of the original DSM approach. The proposed model incorporates file- and attribute-level characteristics of DICOM images, providing flexibility and consistency for storage operations, as well as noticeable performance gains on executing query/retrieval instructions when compared to existing alternatives.

Adopting a vertical partitioning strategy based on value representations, the proposal is suitable for storing DICOM images acquired for different examination modalities, from devices of different manufacturers, without modifications to the database schema. Due to its vertically oriented layout, DCMDSM is outperformed by the other evaluated approaches in executing storage operations (a side effect of numerous database insertions). Despite the replacement of the two tables per attribute defined in the original DSM by a single table per VR, DCMDSM is up to 7.2 times slower in comparison to other approaches; however, the proposal performs better in single-tag queries, being about 48.0% faster. In queries involving groups of tags, results vary according to the combination of number of tags and predicate selectivity. In the worst case (large number of tags, low selectivity), the proposal is outperformed by about 66.5%; however, it is about 79.1% faster in the best case (medium/large number of tags, high selectivity). In full-content retrieval, DCMDSM is about 48.3% faster. The model can be adopted as a storage layer for new projects involving DICOM and RDBMSs, or as a substitute for limited horizontal database schemas already in use, with significant gains in scenarios characterized by a 1:n ratio involving storage-query/retrieval operations.

The proposed model was evaluated using a dataset that emphasizes content heterogeneity rather than size. As future work, we propose to evaluate the same model with large datasets (not necessarily heterogeneous), comparing results from the current single-node implementation with results from a version built over a distributed RDBMS or a column-oriented DBMS.

Supplementary Material

Web supplement

amiajnl-2013-002337-s1.pdf (73 KB, PDF)

Footnotes

Contributors: AS was responsible for the conceptual, logical, and physical database design of DCMDSM, for the execution of the experiments, and for the first version of the manuscript. TH contributed background knowledge on horizontally and vertically oriented database designs, including the Decomposed Storage Model. AvW contributed background knowledge on the DICOM standard, regarding its organization and the particularities of storage, query, and retrieval operations. All authors were involved in the design of the experiments, in the evaluation of the results, and in the revision of the manuscript.

Funding: This work was supported by CNPq—National Council for Scientific and Technological Development—Brazil.

Competing interests: None.

Provenance and peer review: Not commissioned; externally peer reviewed.

References

1. Bidgood WD Jr, Horii SC, Prior FW, et al. Understanding and using DICOM, the data interchange standard for biomedical imaging. J Am Med Inform Assoc 1997;4:199–212.
2. Mildenberger P, Eichelberg M, Martin E. Introduction to the DICOM standard. Eur Radiol 2002;12:920–7.
3. Chandrashekar N, Gautam SM, Srinivas KS, et al. Design considerations for a reusable medical database. Proceedings of the 19th IEEE International Symposium on Computer-Based Medical Systems 2006:69–74.
4. National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM): Part 5: Data Structures and Encoding. ftp://medical.nema.org/medical/dicom/2011/11_05pu.pdf (accessed 21 Aug 2013).
5. National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM): Part 6: Data Dictionary. ftp://medical.nema.org/medical/dicom/2011/11_06pu.pdf (accessed 21 Aug 2013).
6. Korenblum D, Rubin D, Napel S, et al. Managing biomedical image metadata for search and retrieval of similar images. J Digit Imaging 2011;24:739–48.
7. Möller M, Regel S, Sintek M. RadSem: semantic annotation and retrieval for medical images. In: Aroyo L, Traverso P, Ciravegna F, et al., eds. The Semantic Web: Research and Applications: 6th European Semantic Web Conference, ESWC 2009, Heraklion, Crete, Greece, May 31–June 4, 2009, Proceedings. Springer Berlin Heidelberg, 2009:21–35.
8. National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM): Part 4: Service Class Specifications. ftp://medical.nema.org/medical/dicom/2011/11_04pu.pdf (accessed 23 Aug 2013).
9. Kim KJ, Kim B, Lee H, et al. Predicting the fidelity of JPEG2000 compressed CT images using DICOM header information. Med Phys 2011;38:6449–57.
10. Källman H-E, Halsius E, Folkesson M, et al. Automated detection of changes in patient exposure in digital projection radiography using exposure index from DICOM header metadata. Acta Oncol 2011;50:960–5.
11. Jahnen A, Kohler S, Hermen J, et al. Automatic computed tomography patient dose calculation using DICOM header metadata. Radiat Prot Dosimetry 2011;147:317–20.
12. Dave JK, Gingold EL. Extraction of CT dose information from DICOM metadata: automated Matlab-based approach. AJR Am J Roentgenol 2013;200:142–5.
13. Yakami M, Ishizu K, Kubo T, et al. Development and evaluation of a low-cost and high-capacity DICOM image data storage system for research. J Digit Imaging 2011;24:190–5.
14. Macedo DDJ, Wangenheim AV, Dantas MAR, et al. An architecture for DICOM medical images storage and retrieval adopting distributed file systems. Int J High Perform Syst Archit 2009;2:99–106.
15. Magnus M, Prado TC, Wangenheim AV, et al. A study of NetCDF as an approach for high performance medical image storage. J Phys Conf Ser 2012;341. http://iopscience.iop.org/1742-6596/341/1/012016 (accessed 24 Aug 2013).
16. Soares TS, Prado TC, Dantas MAR, et al. An approach using parallel architecture to storage DICOM images in distributed file system. J Phys Conf Ser 2012;341. http://iopscience.iop.org/1742-6596/341/1/012021 (accessed 24 Aug 2013).
17. Corriero N, Covino E, D'amore G, et al. HSFS: a compress filesystem for metadata files. In: Snasel V, Platos J, El-Qawasmeh E, eds. Digital Information Processing and Communications: International Conference, ICDIPC 2011, Ostrava, Czech Republic, July 7–9, 2011, Proceedings, Part II. Chennai, India: Springer Berlin Heidelberg, 2011:289–300.
18. Koblitz B, Santos N, Pose V. The AMGA metadata service. J Grid Computing 2008;6:61–76.
19. Power D, Politou E, Slaymaker M, et al. A relational approach to the capture of DICOM files for Grid-enabled medical imaging databases. Proceedings of the 2004 ACM Symposium on Applied Computing 2004:272–9.
20. Chandrashekar N, Gautam SM, Shivakumar KR, et al. COTS-like generic medical image repository. Proceedings of the Fifth International Conference on Commercial-off-the-Shelf (COTS)-Based Software Systems 2006:199–205.
21. Scott W, Ryan A, Jacobs IJ, et al. OSPACS: Ultrasound image management system. Source Code Biol Med 2008;3:11.
22. Copeland GP, Khoshafian SN. A decomposition storage model. Proceedings of the 1985 ACM SIGMOD International Conference on Management of Data 1985:268–79.
23. Abadi D. Column stores for wide and sparse data. Third Biennial Conference on Innovative Data Systems Research 2007:292–7.
24. Rahman SS, Schallehn E, Saake G. ECOS: evolutionary column-oriented storage. Proceedings of the 28th British National Conference on Advances in Databases 2011:18–32.
25. Aulbach S, Grust T, Jacobs D, et al. Multi-tenant databases for software as a service: schema-mapping techniques. Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data 2008:1195–206.
26. Abadi DJ, Marcus A, Madden SR, et al. SW-Store: a vertically partitioned DBMS for Semantic Web data management. VLDB J 2009;18:385–406.
27. Corwin J, Silberschatz A, Miller PL, et al. Dynamic tables: an architecture for managing evolving, heterogeneous biomedical data in relational database management systems. J Am Med Inform Assoc 2007;14:86–93.
28. DICOM sample image sets. OsiriX Imaging Software Web site. http://www.osirix-viewer.com/ (accessed 26 Aug 2013).
29. Open Source Clinical Image and Object Management. dcm4che.org Web site. http://www.dcm4che.org/ (accessed 27 Aug 2013).
30. Beckmann JL, Halverson A, Krishnamurthy R, et al. Extending RDBMSs to support sparse datasets using an interpreted attribute storage format. Proceedings of the 22nd International Conference on Data Engineering 2006:58–67.
31. MariaDB. MariaDB Foundation Web site. https://mariadb.org/ (accessed 27 Aug 2013).


