Abstract
In this paper, the authors describe a methodology to programmatically transform structured reporting (SR) templates defined by the Digital Imaging and Communications in Medicine (DICOM) standard into an XML schema representation. Such schemas can be used in the creation and validation of XML-encoded SR documents that use templates. Templates are a means to put additional constraints on an SR document to promote common formats for specific reporting applications or domains. As the use of templates becomes more widespread in the production of SR documents, it is important to ensure validity of such documents. The work described in this paper is an extension of the authors' previous work on XML schema representation for DICOM SR. Therefore, this paper inherits and partially modifies the structure defined in the earlier work.
The Digital Imaging and Communications in Medicine (DICOM) standard1 aims at creating a public and license-free standard method for the transmission of digital medical images and their associated information. The addition of structured reporting (SR) improves the expressiveness, precision, and comparability of not only images and waveforms, but also documentation of diagnostic observations. More specifically, DICOM SR provides a means to encode structured information to enable unambiguous exchange of clinical information and documentation among systems from different vendors. Structured reporting is defined online,2 and there is a good book on this topic.3
As with any structured data in health care, benefits exist in outcome analysis and point-of-care applications. Patient information and data acquisition are typically performed at a location different from that of interpretation and analysis, necessitating exchange of information. Analysis of historical data is facilitated by an accurate representation of the relevant data as well as the interpretation process. Structured reporting is designed to capture unambiguously structured medical data. Structured reporting in its most general form (referred to throughout as general SR) is very flexible, and the same content can be expressed in different forms and structures, hampering interoperability. Templates are a means to put additional constraints on an SR document to promote common formats for specific reporting applications or domains. For example, DICOM standard Part 16 (DICOM Content Mapping Resource2,4) defines reusable SR templates and mammography computer-aided detection SR information object definition (IOD) templates. Additional templates for other specialties are being developed and standardized, such as:
Supplement 26: Ultrasound OB-GYN Procedure Reports
Supplement 66: Catheterization Lab SR SOP (Service-Object Pair) Classes
Supplement 71: Vascular Ultrasound Procedure Reports
Supplement 72: Echocardiography Procedure Reports
Currently, the DICOM standard is maintained as a set of Word documents. The rich structure of SR in general and templates in particular is described using tables, leading to inevitable errors and inconsistencies as more and more templates are developed by different teams of specialists. Such errors and inconsistencies can be typographical errors, inconsistent naming, incorrect structure, and other errors that occur as a consequence of maintaining formal structures using word-processing tables. When the standard is imposed on real-world medical documents, there is no easy way to ensure that the document structure actually conforms to the standard. Therefore, it is highly desirable to capture the structure with a more formal notation that is amenable to machine processing to overcome the above limitations.
In the past few years, we have been focusing our efforts on applying XML technologies to DICOM SR. Extensible Markup Language (XML) is a standard format for encoding structured data and has been widely adopted by standards and industries. As the Web now has global acceptance for human information access, XML has become an emerging standard for data exchange. Health care standards organizations including HL75 are transitioning to XML. In previous work,6,7 we developed XML schemas8 from the DICOM SR specification that can be used to create and validate general SR documents encoded in XML. As an extension of that work and adaptation to the emerging standards, this paper proposes new XML schemas for DICOM SR template specifications and an approach to generate such schemas automatically. These schemas can be used to guide the creation and validation of XML-encoded SR reports created with templates and to promote interoperability of such reports.
The organization of this paper is as follows. In the first section, we discuss the relationship between general SR and templates. In the next section, we briefly describe DICOM SR and its XML encoding using approaches described in our previous work. The following section describes the process of generating XML schemas for SR templates. We then discuss some of the issues encountered when adapting our previous approach to include additional template constraints, together with our solutions. Finally, we provide our conclusions and suggestions for further work.
The reader is assumed to have some knowledge of the DICOM standard, various SR template supplements, and XML technologies, specifically XML schema. The results described in this paper are extensions of previous work.6,7
General SR and Templates
General SR specifies three IODs: basic text, enhanced, and comprehensive. They have the same basic data structures but with more advanced features added to the enhanced and comprehensive IODs. Each IOD contains several normalized or composite object classes called information entities (IEs). An IE consists of a number of modules. For instance, the first column in Table 1 (from the DICOM standard) specifies five IEs within a comprehensive IOD. For each IE, the second column lists modules that belong to the specific IE. Each module is defined in another section of the standard. The last column indicates whether the use of the module is mandatory, a user option, or conditional. The subtle difference between user option and conditional is that the presence of optional elements is dependent on user preference, but the presence of conditional elements is dictated by the presence or absence of other elements. These constraints can be used in combination on a single element.
Table 1.
Comprehensive SR IOD Definition (Table A.35.3-12)
| IE | Module | Reference | Usage |
|---|---|---|---|
| Patient | Patient | C.7.1.1 | M |
| | Specimen identification | C.7.1.2 | C–Required if the observation subject is a specimen |
| | Clinical trial subject | C.7.1.3 | U |
| Study | General study | C.7.2.1 | M |
| | Patient study | C.7.2.2 | U |
| | Clinical trial study | C.7.2.3 | U |
| Series | SR document series | C.17.1 | M |
| | Clinical trial series | C.7.3.2 | U |
| Equipment | General equipment | C.7.5.1 | M |
| Document | SR document general | C.17.2 | M |
| | SR document content | C.17.3 | M |
| | SOP common | C.12.1 | M |
C = conditional; IE = information entity; IOD = information object definition; M = mandatory; SOP = service-object pair; SR = structured reporting; U = user option.
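The IE, module, and usage structure of Table 1 can be held as plain data, which makes a check such as "are all mandatory modules present?" mechanical. The following is a minimal sketch in Python (not the authors' implementation); the class `ModuleUsage` and the helper `missing_mandatory` are hypothetical names introduced only for illustration, and conditional (C) modules are not evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleUsage:
    ie: str         # information entity the module belongs to
    module: str     # module name
    reference: str  # section of the DICOM standard that defines the module
    usage: str      # "M" (mandatory), "U" (user option), or "C" (conditional)


# Rows transcribed from Table 1 (Comprehensive SR IOD).
COMPREHENSIVE_SR = [
    ModuleUsage("Patient", "Patient", "C.7.1.1", "M"),
    ModuleUsage("Patient", "Specimen identification", "C.7.1.2", "C"),
    ModuleUsage("Patient", "Clinical trial subject", "C.7.1.3", "U"),
    ModuleUsage("Study", "General study", "C.7.2.1", "M"),
    ModuleUsage("Study", "Patient study", "C.7.2.2", "U"),
    ModuleUsage("Study", "Clinical trial study", "C.7.2.3", "U"),
    ModuleUsage("Series", "SR document series", "C.17.1", "M"),
    ModuleUsage("Series", "Clinical trial series", "C.7.3.2", "U"),
    ModuleUsage("Equipment", "General equipment", "C.7.5.1", "M"),
    ModuleUsage("Document", "SR document general", "C.17.2", "M"),
    ModuleUsage("Document", "SR document content", "C.17.3", "M"),
    ModuleUsage("Document", "SOP common", "C.12.1", "M"),
]


def missing_mandatory(present_modules):
    """Hypothetical helper: list mandatory modules absent from a document."""
    return [
        row.module
        for row in COMPREHENSIVE_SR
        if row.usage == "M" and row.module not in present_modules
    ]


print(missing_mandatory({"Patient", "General study"}))
```

A conditional module would need its condition evaluated against the document before it could be reported as missing, which this sketch deliberately leaves out.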
The document IE is SR specific. Within this IE, the SOP common module is common to all SR IODs. The SR document general module contains general information of an SR document. The SR document content module establishes the information model of observation contexts; it is defined in Section C.17.3 of the DICOM standard but is more easily visualized as a tree, as shown in Figure 1. Such a tree is known as an SR content tree, and each node is known as a content item.
Figure 1.

A sample SR content tree.
General SR basically specifies the data structures and the allowed types of content items. It does have some semantic constraints such as enumerated values for a few attributes. However, different reporting applications have their own specific observation contexts or vocabularies, which are beyond the general SR constraint scope. This issue is handled through the use of SR templates that apply content constraints on the SR document content module and its components. SR templates are used to put additional constraints on an SR content tree to promote common formats and vocabularies for specific reporting applications (e.g., OB-GYN ultrasound9). They are patterns that specify the concept names, relationship with parent, value type, value multiplicity, requirement type, and value set constraint attributes of content items for a particular application. Currently, these templates are represented as tables. For examples, see Tables 2 through 5.9
Table 2.
TID 5000 OB-GYN Ultrasound Procedure Report
| Row | NL* | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | | CONTAINER | EV (125000, DCM, “OB-GYN Ultrasound Procedure Report”)† | 1 | M | | |
| … | | | | | | | | |
| 10 | > | CONTAINS | INCLUDE | DTID (5006) Fetal Long Bones Section | 1-n | U | | |
| … | | | | | | | | |
| 17 | > | CONTAINS | INCLUDE | DTID (5013) Follicles Section | 1 | U | | $Laterality = EV (G-A101, SRT, “Left”); $Number = EV (11879-4, LN, “Number of follicles in left ovary”) |
| … | | | | | | | | |
DCID = defined context group identifier; DCM = DICOM; DTID = defined template identifier; EV = enumerated values; LN = logical observation identifier names and codes; M = mandatory; NL = nested level; SRT = systemized nomenclature of medicine-RT; U = user option; VM = value multiplicity.
*NL indicates how deep an element is located in the tree.
†The three values in parentheses are code value, coding scheme designator, and code meaning.
Table 3.
TID 5006 Fetal Long Bones Section
| Row | NL | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | | CONTAINER | DT (125003, DCM, “Fetal Long Bones”)* | 1 | M | | |
| 2 | > | HAS OBS CONTEXT | INCLUDE | DTID (1008) Subject Context, Fetus | 1 | MC | IF this template is invoked more than once to describe more than one fetus | |
| 3 | > | CONTAINS | INCLUDE | DTID (5008) Fetal Biometry Group | 1-n | M | | $BiometryType = MemberOf {DCID (12006) Fetal Long Bones Biometry Measurements} |
DCID = defined context group identifier; DCM = DICOM; DT = defined terms; DTID = defined template identifier; M = mandatory; MC = mandatory conditional; NL = nested level; TID = template identifier; VM = value multiplicity.
*The three values in parentheses are code value, coding scheme designator, and code meaning.
Table 4.
TID 5008 Fetal Biometry Group
| Row | NL | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | | CONTAINER | DT (125005, DCM, “Biometry Group”)* | 1 | M | | |
| 2 | > | CONTAINS | INCLUDE | DTID (300) Measurement | 1-n | MC | At least one of rows 2 and 3 shall be present | $Measurement = $BiometryType; $Derivation = DCID (3627) Measurement Type |
| 3 | > | CONTAINS | NUM | EV (18185-9, LN, “Gestational Age”) | 1 | MC | At least one of rows 2 and 3 shall be present | Units = EV (d, UCUM, days) |
| 4 | >> | INFERRED FROM | CODE | DCID (228) Equation or Table | 1 | U | | DCID (12013) Gestational Age Equations and Tables |
| … | | | | | | | | |
DCID = defined context group identifier; DCM = DICOM; DT = defined terms; DTID = defined template identifier; EV = enumerated values; LN = logical observation identifier names and codes; M = mandatory; MC = mandatory conditional; NL = nested level; NUM = numeric; UCUM = unified code for units of measure; VM = value multiplicity.
*The three values in parentheses are code value, coding scheme designator, and code meaning.
Table 5.
TID 300 Measurement
| Row | NL | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | | NUM | $Measurement | 1 | M | | Units = $Units |
| 2 | > | HAS CONCEPT MOD | CODE | $ModType | 1-n | U | | $Mod Value |
| … | | | | | | | | |
M = mandatory; NL = nested level; NUM = numeric; TID = template identifier; U = user option; VM = value multiplicity.
Figure 2 shows part of the top-level template identifier (TID) 5000 OB-GYN Ultrasound Procedure Report template (each template is identified by a unique TID). To make the relationship between general SR and templates clear, compare Figure 1 and Figure 2. Figure 1 indicates that, besides the hierarchical relationship,* general SR imposes no constraints on, for example, how a content item should be named or what the relationship between content items should be. However, in a template such as the one shown in Figure 2, specific constraints have been imposed: the tree root must be named “OB-GYN Report” and its value type must be CONTAINER, the relationship between the content items “Image Library” and “Image Entry” must be CONTAINS, etc.
Figure 2.

Part of the top-level OB-GYN Ultrasound Procedure Report template.
Each reporting application has only one root template. This root template invokes one or more subtemplates. Templates are represented in a tabular format with the same number of columns. Each row represents a content item in the SR content tree. The Rel with Parent column specifies the relationship of the content item with its parent, and the Value Type column constrains its content type. If the value of this column is INCLUDE, another template is invoked. The Concept Name column specifies the coded concept of the content item. Table 2 is the root SR template, TID 5000 OB-GYN Ultrasound Procedure Report. Row 1 defines the constraints on the root content item: mandatory (M) without a relationship constraint. The other rows may invoke other subtemplates such as TID 5006 (Table 3) to further constrain subcontent items or impose a single set of constraints on a branch or atomic content items.
The first cell of the Rel with Parent column of Table 3 is blank. Its actual value is CONTAINS, passed from its invoker, row 10 of Table 2. There are three types of constraints for concept name and value set constraint: coded concept, context group reference, and parameter. A coded concept is a coded vocabulary such as the defined terms (DT) (125005, DCM, “Biometry Group”) in row 1 of Table 4 or the enumerated values (EV) (18185-9, LN, “Gestational Age”) in row 3 of Table 4. A context group reference is composed of a context group type and an identifier (ID) number, like defined context group identifier (DCID) (228) in row 4 of Table 4. A context group contains a group of coded concepts or other context group references. In this case, the value of concept name or value set constraint is limited to one of the coded concepts given in the referenced context group.
A parameter declared in a template conveys a value passed from the invoking template, much as an argument is passed to a function in a programming language. The passed value can be a coded concept such as EV (G-A101, SRT, “Left”) in row 17 of Table 2, a context group reference such as DCID (3627) Measurement Type in row 2 of Table 4, or another parameter such as $BiometryType in row 2 of Table 4. The sequence from row 3 of Table 3 to row 2 of Table 4 and then to row 1 of Table 5 illustrates how the context group reference DCID (12006) Fetal Long Bones Biometry Measurements is passed from TID 5006 (Table 3) to the parameter $BiometryType declared in TID 5008 (Table 4) and then to the parameter $Measurement declared in TID 300 (Table 5).
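The chain of parameter bindings described above can be traced with a few lines of Python. This is only an illustrative sketch: the function `resolve` and the dictionaries are hypothetical, and values are treated as opaque strings rather than parsed coded concepts.

```python
def resolve(value, bindings):
    """Return the bound value if `value` is a $parameter, else `value` itself."""
    if value.startswith("$"):
        return bindings[value]
    return value


# TID 5006, row 3: invokes TID 5008 and binds $BiometryType.
bindings_5008 = {
    "$BiometryType": "DCID (12006) Fetal Long Bones Biometry Measurements",
}

# TID 5008, row 2: invokes TID 300 with $Measurement = $BiometryType,
# so the value is looked up in the bindings TID 5008 itself received.
bindings_300 = {
    "$Measurement": resolve("$BiometryType", bindings_5008),
}

# TID 300, row 1: the concept name is the parameter $Measurement.
concept_name = resolve("$Measurement", bindings_300)
print(concept_name)
```

Each template sees only the bindings supplied by its direct invoker, which is why the value has to be resolved once per invocation level.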
The value multiplicity (VM) and Req Type values constrain the number of occurrences of the content item specified in a row and whether it is mandatory. When the Req Type is conditional, i.e., conditional (C), mandatory conditional (MC), or user option conditional (UC), the Condition column holds a statement expressed with logical or (OR), exclusive or (XOR), or if and only if (IFF), or as free text like that in row 2 of Table 3.
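One plausible way to carry VM and Req Type into a schema is through the XML Schema occurrence attributes `minOccurs` and `maxOccurs`. The sketch below is an assumption about such a mapping, not a rule stated in this paper: the function name `occurrence_bounds` is hypothetical, and every Req Type other than M is treated as optional because free-text conditions cannot be evaluated by a schema validator.

```python
def occurrence_bounds(vm, req_type):
    """Map a VM string ("1", "1-n", ...) and a Req Type to (minOccurs, maxOccurs).

    Assumption: only Req Type "M" forces the lower bound; U, C, MC, and UC
    all become optional because their conditions are not machine-checkable here.
    """
    low, _, high = vm.partition("-")
    min_occurs = low if req_type == "M" else "0"
    max_occurs = "unbounded" if high == "n" else (high or low)
    return min_occurs, max_occurs


for vm, req in [("1", "M"), ("1-n", "U"), ("1-n", "M"), ("1", "MC")]:
    print(vm, req, occurrence_bounds(vm, req))
```

Relaxing conditional rows to optional accepts some documents that violate the condition, so such a schema would need a separate check for the Condition column.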
One approach to deploying SR templates is to transform the template tables into XML schemas, as we did for general SR, so that the schemas can be used to generate and validate the corresponding SR documents. The next section discusses the relationship between the XML schemas for general SR and those for the templates.
DICOM SR and XML Schema Representation
XML schema for general SR is our previous work, and the basis of the methodology described in this paper. A detailed description has been previously published.7 Here we summarize the framework of generating XML schemas for general SR and demonstrate how to adapt it for templates.
XML Schemas for General SR
The XML schemas for general SR are the XML representation of SR IODs. Figure 3 is a high-level structural diagram of the SR schema. The schema definitions patient_ie, study_ie, series_ie, equipment_ie, and sop_common_module are for the common modules. The schema sr_document_general_module represents general information of SR. Since our focus is on SR templates, we concentrate on the schema sr_document_content_module, which is used to represent content trees. We use XML Spy10 to display the details of the schema for the content tree as illustrated in Figures 4 through 8. This is a useful way of visualizing schema fragments: The type of an element is shown under the name and other unnecessary details are suppressed. The schema sr_document_content_module and its children serve as the meta-model of the schemas of the templates. Later we refer back to Figure 3 to show how it is modified by templates.
Figure 3.

XML schema structure for representing general SR.
Figure 4.

XML schema for DICOM general SR.
Figure 5.

sr_document_content_module element.
Figure 6.

document_relationship_macro element.
Figure 7.

content_sequence element.
Figure 8.

content_sequence_item element.
The content_sequence element is a nonempty sequence of content_sequence_item elements, which correspond to content items mentioned before. For this case, we show both the graphic and the schema fragment because we refer back to this in the next section.
Note the recursive reference to document_relationship_macro. This is how the SR content tree is built up.
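The recursion that builds the content tree can be mirrored by a recursive data structure. The sketch below is a simplified Python analogue with hypothetical names (`ContentItem`, `walk`); it keeps only the value type, concept name, and relationship of each content item and omits the other elements of the schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentItem:
    value_type: str
    concept_name: str
    relationship_type: Optional[str] = None  # None for the root item
    # Children play the role of the nested content_sequence.
    children: List["ContentItem"] = field(default_factory=list)


def walk(item, level=0):
    """Yield (nesting level, item) pairs in document order."""
    yield level, item
    for child in item.children:
        yield from walk(child, level + 1)


root = ContentItem("CONTAINER", "OB-GYN Ultrasound Procedure Report")
summary = ContentItem("CONTAINER", "Summary", "CONTAINS")
summary.children.append(ContentItem("TEXT", "Comment", "CONTAINS"))
root.children.append(summary)

outline = [(level, item.concept_name) for level, item in walk(root)]
print(outline)
```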
XML Schemas for Templates
DICOM SR templates define domain-specific frameworks to allow uniform expression of structured data for various medical specialties. Templates are (additional) constraints on a general SR content tree and can specify, for example, what values are permissible as the value type of a given content item, what relationships with the parent are allowed, what concept names can appear, etc.
In general, one can specify with templates the shape of the subtree hierarchy as well as the allowed values and types at each node. Our approach in generating XML schemas for templates is to modify our methodology for schemas for general SR to take into account these additional constraints.
If we treat the schema definition of sr_document_content_module and its children as a meta-model of the SR content item tree, then the schema definitions of templates are semi-instances of this meta-model. To differentiate between them, we use template-specific names for those complexTypes or groups having the same structures and elements but with specified values. To illustrate the relationship between the schemas of general SR and those of the templates, let us take template TID 5002 OB-GYN Procedure Summary Section shown in Table 6 as an example.
Table 6.
TID 5002 OB-GYN Procedure Summary Section
| Row | NL | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | | CONTAINER | DT (121111, DCM, “Summary”)* | 1 | M | | |
| 2 | > | CONTAINS | DATE | DCID (12003) OB-GYN Dates | 1-n | U | | |
| 3 | > | CONTAINS | INCLUDE | DTID (300) Measurement | 1-n | U | | $Measurement = BCID (12001) OB-GYN Summary |
| 4 | > | CONTAINS | TEXT | EV (121106, DCM, “Comment”) | 1-n | U | | |
| 5 | >> | | INCLUDE | DTID (320) Image or Spatial Coordinates | 1-n | U | | |
| 6 | > | CONTAINS | INCLUDE | BTID (5003) OB-GYN Fetus Summary | 1-n | UC | No more than 1 inclusion per fetus | |
BCID = baseline context group identifier; BTID = baseline template identifier; DCID = defined context group identifier; DCM = DICOM; DT = defined terms; DTID = defined template identifier; EV = enumerated values; M = mandatory; NL = nested level; TID = template identifier; U = user option; UC = user option conditional; VM = value multiplicity.
*The three values in parentheses are code value, coding scheme designator, and code meaning.
The graphic representation of the generated XML schema for template TID 5002 is given in Figure 9. The schema has more constraints on elements. For example, the relationship_type element in procedure_summary_5002_1_content carries a fixed value “CONTAINS”†. A clearer example is shown in Figure 10. Compare it with Figure 3, and one can see the additional constraints that the templates put on a general SR content tree (highlighted in shaded boxes).
Figure 9.

XML schema for TID 5002.
Figure 10.

XML schema structure for OB-GYN Ultrasound Procedure Reports templates.
The generation process is described in the following section.
Generating XML Schemas for SR Templates
The discussion in the last section constitutes the basis for the programmatic generation of XML schemas for SR templates. The process consists of two steps (Figure 11):
Conversion of template tables (which have a flat structure) to an intermediate XML representation that tries to recover some of the structural information inherent in templates. We use a tool called Majix11 for this purpose. The conversion is not perfect, and some cleaning is needed before the next step. Occasionally, the conversion detects typographical and structural errors in the tables. These, as well as mistakes introduced by Majix, are corrected manually. This step is necessary because at present templates are defined with text and tables in a Word document. DICOM is in the process of defining an XML representation of the specification. When this representation is completed, this step will become unnecessary.
Transformation of XML representations to XML schemas. This is accomplished with a Java program using the Xerces Document Object Model Parser from Apache12 to generate the schemas. This can also be done with XSLT transformations.13 The general approach has been described in the last section. More details are discussed in the next section. The schema generated for template TID 5002 OB-GYN Procedure Summary Section is included in Appendix 1.
Figure 11.

XML schema generation process.
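The second step can be sketched as follows. The authors' generator is a Java program built on the Xerces DOM parser; the Python below, using the standard `xml.etree.ElementTree` module, is only a stand-in to show the shape of the transformation. The intermediate `<row>` format and the function `row_to_complex_type` are assumptions made for illustration, since the layout of the intermediate XML produced in the first step is not given here.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD)

# Hypothetical intermediate representation of row 1 of TID 5002.
ROW_XML = """
<row tid="5002" number="1">
  <value_type>CONTAINER</value_type>
  <concept_name>Summary</concept_name>
</row>
"""


def row_to_complex_type(row):
    """Build an xsd:complexType that fixes the value type of one template row."""
    concept = row.findtext("concept_name").strip().lower().replace(" ", "_")
    name = "document_content_{}_{}_{}".format(
        concept, row.get("tid"), row.get("number")
    )
    complex_type = ET.Element("{%s}complexType" % XSD, {"name": name})
    sequence = ET.SubElement(complex_type, "{%s}sequence" % XSD)
    ET.SubElement(
        sequence,
        "{%s}element" % XSD,
        {
            "name": "value_type",
            "type": "pms:value_type",
            "fixed": row.findtext("value_type").strip(),
        },
    )
    return complex_type


complex_type = row_to_complex_type(ET.fromstring(ROW_XML))
print(ET.tostring(complex_type, encoding="unicode"))
```

The generated type name follows the pattern of document_content_summary_5002_1 in Appendix 1; a full generator would also emit the concept name and relationship elements.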
Issues in Defining Schema for SR Templates
In this section, we discuss issues encountered as we refine the methodology for general SR to accommodate templates and how they are dealt with. Consider the utility TID 1000 Quotation4 shown in Table 7:
Table 7.
TID 1000 Quotation
| Row | NL | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | HAS OBS CONTEXT | CODE | EV (121001, DCM, “Quotation Mode”)* | 1 | M | | EV (121003, DCM, “Document”); EV (121004, DCM, “Verbal”) |
| 2 | | HAS OBS CONTEXT | COMPOSITE | EV (121002, DCM, “Quoted Source”) | 1 | MC | Required if quoted material source is a DICOM composite object | |
| 3 | | HAS OBS CONTEXT | INCLUDE | DTID (1001) “Observation Context” | 1 | M | | |
DCM = DICOM; DTID = defined template identifier; EV = enumerated values; M = mandatory; MC = mandatory conditional; NL = nested level; TID = template identifier; VM = value multiplicity.
*The three values in parentheses are code value, coding scheme designator, and code meaning.
A template can refer to another template by inclusion, e.g., row 3 indicates a reference to TID 1001 Observation Context. A domain-specific template set such as the OB-GYN Ultrasound Procedure Reports9 consists of a set of cross-referenced tables.
At the subtree where TID 1000 is used, the table above indicates that the content sequence is such that the first item must have a value type of CODE, and the second item (if present) a value type of COMPOSITE. The concept name of the first item must be “Quotation Mode” (with a well-defined coding scheme designator and code value), and the concept name of the second item is “Quoted Source.” The remaining items are defined in another template (TID 1001).
A first attempt to design a schema for this template is to treat it as a content sequence where the content items have further constraints (Figure 12).
Figure 12.

First schema for TID 1000 (not valid).
Here the element tid_1000 takes the place of the content_sequence element in Figure 7, and the sequence is constrained to possess specific kinds of content items. (The last element in the diagram, tid_1001, is a reference to the schema for TID 1001.) Constraints on the individual content items can be specified in the respective types as shown in Figure 13.
Figure 13.

Second schema for TID 1000 (not valid).
The new types are specializations of the type for the content_sequence_item element (Figure 8) and constrain their components with values from the corresponding rows of the template table. For example, the first element in Figure 13 will have a relationship_type element with a fixed value of “HAS OBS CONTEXT” as given by row 1 of the template table. This means that the type corresponding to each row must have a unique name. We adopt the convention that each type name is the concatenation of the value from the concept name column, the template ID, the row number, and a suffix indicating the role of the type. Thus, the first content item above has a type called quotation_mode_1000_1_content (Figure 14).
Figure 14.

quotation_mode_1000_1_content type.
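The naming convention for row types can be written as a small function. This is a sketch in Python; the function name `row_type_name` and the exact normalization of the concept name (lowercasing, with runs of non-alphanumeric characters replaced by underscores) are assumptions inferred from names such as quotation_mode_1000_1_content.

```python
import re


def row_type_name(concept_name, tid, row, suffix="content"):
    """Concatenate concept name, template ID, row number, and role suffix."""
    slug = re.sub(r"[^a-z0-9]+", "_", concept_name.lower()).strip("_")
    return f"{slug}_{tid}_{row}_{suffix}"


print(row_type_name("Quotation Mode", 1000, 1))
print(row_type_name("Quoted Source", 1000, 2))
```

Because the template ID and row number are part of the name, two rows that share a concept name in different templates still receive distinct type names.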
There is an additional problem: Since the content_sequence_item elements in Figure 13 are children of the same content_sequence element, they cannot have different types according to the “Element Declarations Consistent” constraint of the XML Schema Recommendation.8 We resolve this by introducing group references, as shown in the schema fragment below and in Figure 15.
<xsd:group name="tid_1000">
  <xsd:sequence>
    <xsd:group ref="pms:group_quotation_mode_1000_1_content"/>
    <xsd:group ref="pms:group_quoted_source_1000_2_content" minOccurs="0"/>
    <xsd:group ref="pms:tid_1001"/>
  </xsd:sequence>
</xsd:group>
<xsd:group name="group_quotation_mode_1000_1_content">
  <xsd:sequence>
    <xsd:element name="content_sequence_item" type="pms:quotation_mode_1000_1_content"/>
  </xsd:sequence>
</xsd:group>
<xsd:group name="group_quoted_source_1000_2_content">
  <xsd:sequence>
    <xsd:element name="content_sequence_item" type="pms:quoted_source_1000_2_content"/>
  </xsd:sequence>
</xsd:group>
Figure 15.

Schema for TID 1000.
This allows us to generate a valid schema for TID 1000. Although the resulting schema is larger in size, this is not an issue since it is generated programmatically.
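Because the wrapper groups are purely mechanical, they are easy to emit from a list of row type names. The Python sketch below assembles the groups as text and parses the result to confirm that it is well-formed XML; it does not perform XML Schema validation. The helper names `wrapper_group` and `template_group` are hypothetical, and the namespace URI for the `pms` prefix is the one used in Appendix 1.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
PMS_NS = "http://www.philips.com/pms"


def wrapper_group(type_name):
    """Wrap one typed content_sequence_item in its own named group."""
    return (
        f'<xsd:group name="group_{type_name}"><xsd:sequence>'
        f'<xsd:element name="content_sequence_item" type="pms:{type_name}"/>'
        "</xsd:sequence></xsd:group>"
    )


def template_group(tid, rows, includes=()):
    """Emit the tid_<n> group: one group reference per row, then includes."""
    refs = []
    for type_name, optional in rows:
        occurs = ' minOccurs="0"' if optional else ""
        refs.append(f'<xsd:group ref="pms:group_{type_name}"{occurs}/>')
    refs.extend(f'<xsd:group ref="pms:tid_{t}"/>' for t in includes)
    body = "".join(refs)
    return f'<xsd:group name="tid_{tid}"><xsd:sequence>{body}</xsd:sequence></xsd:group>'


rows = [
    ("quotation_mode_1000_1_content", False),
    ("quoted_source_1000_2_content", True),  # row 2 is conditional (MC)
]
schema_text = (
    f'<xsd:schema xmlns:xsd="{XSD_NS}" xmlns:pms="{PMS_NS}">'
    + template_group(1000, rows, includes=[1001])
    + "".join(wrapper_group(name) for name, _ in rows)
    + "</xsd:schema>"
)
schema = ET.fromstring(schema_text)  # raises ParseError if not well-formed
print(len(schema), "top-level groups generated")
```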
For another example template with a more complicated structure, refer to TID 5002 OB-GYN Procedure Summary Section (Table 8).9 It illustrates another issue that must be addressed.
Table 8.
TID 5002 OB-GYN Procedure Summary Section
| Row | NL | Rel with Parent | Value Type | Concept Name | VM | Req Type | Condition | Value Set Constraint |
|---|---|---|---|---|---|---|---|---|
| 1 | | | CONTAINER | DT (121111, DCM, “Summary”)* | 1 | M | | |
| 2 | > | CONTAINS | DATE | DCID (12003) OB-GYN Dates | 1-n | U | | |
| … | | | | | | | | |
DCID = defined context group identifier; DT = defined terms; M = mandatory; NL = nested level; TID = template identifier; U = user option; VM = value multiplicity.
*The three values in parentheses are code value, coding scheme designator, and code meaning.
Row 2 of TID 5002 refers to context group 12003 in the Concept Name column. As explained previously, a context group refers to a list of codes from which one code may be chosen in a given context and may appear in the concept name or value set constraint columns of template tables. Context groups are one of the most important structures used by SR templates. A context group is specified as a table (Table 9). Figure 16 is a graphic representation of TID 5002.
Table 9.
Context ID 12003 OB-GYN Dates
| Coding Scheme Designator (0008,0102) | Coding Scheme Version (0008,0103) | Code Value (0008,0100) | Code Meaning (0008,0104) |
|---|---|---|---|
| LN | | 11778-8 | EDD |
| LN | | 11779-6 | EDD from LMP |
| LN | | 11781-2 | EDD from average ultrasound age |
| … | | … | … |
ID = identifier; LN = logical observation identifier names and codes.
Figure 16.

Graphic representation of TID 5002.
A context group is incorporated into template schemas by defining a type for each group. The type specifies a choice of the elements listed in a given table. The type for context group 12003 is as follows:
<xsd:complexType name="cid_12003">
  <xsd:choice>
    <xsd:group ref="pms:cid_12003_1"/>
    <xsd:group ref="pms:cid_12003_2"/>
    … …
  </xsd:choice>
</xsd:complexType>
<xsd:group name="cid_12003_1">
  <xsd:sequence>
    <xsd:element name="code_value" type="pms:code_value" fixed="11778-8"/>
    <xsd:element name="coding_scheme_designator" type="pms:coding_scheme_designator" fixed="LN"/>
    <xsd:element name="code_meaning" type="pms:code_meaning" fixed="EDD"/>
  </xsd:sequence>
</xsd:group>
<xsd:group name="cid_12003_2">
  <xsd:sequence>
    <xsd:element name="code_value" type="pms:code_value" fixed="11779-6"/>
    <xsd:element name="coding_scheme_designator" type="pms:coding_scheme_designator" fixed="LN"/>
    <xsd:element name="code_meaning" type="pms:code_meaning" fixed="EDD from LMP"/>
  </xsd:sequence>
</xsd:group>
…
Subsequently, an element declaration can refer to this context group simply through its type attribute, as in type="pms:cid_12003".
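Generating the definitions for a context group from its table is again mechanical. The sketch below uses Python's standard `xml.etree.ElementTree` to build a choice among one group per code, with each group fixing the code value, coding scheme designator, and code meaning. The function `context_group` and the tuple layout of `CODES` are hypothetical; the three codes are the rows shown in Table 9.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD)


def q(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return "{%s}%s" % (XSD, tag)


def context_group(cid, codes):
    """Return (choice type, [one group per code]) for a context group table."""
    choice_type = ET.Element(q("complexType"), name=f"cid_{cid}")
    choice = ET.SubElement(choice_type, q("choice"))
    groups = []
    for index, (scheme, value, meaning) in enumerate(codes, start=1):
        name = f"cid_{cid}_{index}"
        ET.SubElement(choice, q("group"), ref=f"pms:{name}")
        group = ET.Element(q("group"), name=name)
        sequence = ET.SubElement(group, q("sequence"))
        for field, fixed in (
            ("code_value", value),
            ("coding_scheme_designator", scheme),
            ("code_meaning", meaning),
        ):
            ET.SubElement(
                sequence, q("element"), name=field, type=f"pms:{field}", fixed=fixed
            )
        groups.append(group)
    return choice_type, groups


# (coding scheme designator, code value, code meaning), from Table 9.
CODES = [
    ("LN", "11778-8", "EDD"),
    ("LN", "11779-6", "EDD from LMP"),
    ("LN", "11781-2", "EDD from average ultrasound age"),
]

choice_type, groups = context_group(12003, CODES)
print(ET.tostring(choice_type, encoding="unicode"))
```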
Conclusion and Future Work
Creating and validating SR documents based on the existing standard description is a labor-intensive process and is subject to misinterpretations and errors. Templates for various application domains are typically developed by different domain specialists, and the resulting templates may have inconsistencies and may be difficult to maintain and update. We have described in this paper a methodology that generates XML schemas for DICOM SR templates from the existing specification. These schemas can be used in the creation and validation of SR documents that use templates and have been encoded in XML. The method is a refinement of our earlier work on generating schemas for general SR. A substantial collection of SR templates is being developed in the DICOM community, and it is expected that, in the future, most SR documents will use templates. It is therefore necessary to extend our earlier work to handle templates. We see two main directions for future work.
Constraints: Templates themselves can have constraints on usage. Constraints are currently expressed mostly in natural language. The semantics of such constraints need to be defined more precisely before they can be adequately represented in schemas.
Templates with Parameters: Templates with parameters have been defined and will be incorporated into the next version of the DICOM standard. Such templates are useful, but, as the current version of XML schema does not allow type definitions with parameters, it is not straightforward to generate schemas for templates with parameters.
At the same time, easy-to-use ancillary tools for validating and verifying the use of templates in SR documents need to be developed.
The complexity of this process as illustrated in this paper underscores the need for a more formal representation for templates, without which more widespread use of templates will be hindered. We believe that an XML-based framework is appropriate, and our work is a good starting point for the development of a rigorous and machine-processable mechanism for templates. As an extension of this, other highly structured portions of the DICOM standard should likewise be made available in a form that can be processed and validated by software. Such efforts have already begun in the DICOM committee14 but will take time to complete. In the meantime, the only normative references are the published documents (which can be freely downloaded1). We strongly believe that the complexity of handling DICOM standards illustrated by this paper will not be reduced significantly as long as natural language is used as the primary mechanism for defining structured information and suggest to the DICOM community that a more formal approach be adopted as soon as possible.
Appendix 1
XML schema fragment for template TID 5002
Included here are portions of the generated schema for TID 5002, illustrating the solutions to the issues discussed in the section “Issues in Defining Schema for SR Templates.”
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://www.philips.com/pms" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pms="http://www.philips.com/pms">
  <xsd:annotation>
    <xsd:documentation>
      TID 5002 OB-GYN PROCEDURE SUMMARY SECTION
    </xsd:documentation>
  </xsd:annotation>
  <xsd:group name="tid_5002">
    <xsd:sequence>
      <xsd:group ref="pms:group_summary_5002_1_content"/>
    </xsd:sequence>
  </xsd:group>
  <xsd:group name="group_summary_5002_1_content">
    <xsd:sequence>
      <xsd:element name="content_sequence_item" type="pms:summary_5002_1_content"/>
    </xsd:sequence>
  </xsd:group>
  <xsd:complexType name="summary_5002_1_content">
    <xsd:sequence>
      <xsd:element name="relationship_type" type="pms:relationship_type" fixed="CONTAINS"/>
      <xsd:element name="document_relationship_macro" type="pms:document_relationship_summary_5002_1"/>
      <xsd:element name="document_content_macro" type="pms:document_content_summary_5002_1"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="document_content_summary_5002_1">
    <xsd:sequence>
      <xsd:element name="value_type" type="pms:value_type" fixed="CONTAINER"/>
      <xsd:element name="concept_name_code_sequence" type="pms:summary_5002_1_concept_name"/>
      <xsd:element name="continuity_of_content" type="pms:continuity_of_content"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="summary_5002_1_concept_name">
  </xsd:complexType>
  <xsd:complexType name="document_relationship_summary_5002_1">
    <xsd:sequence>
      <xsd:element name="observation_datetime" type="pms:observation_datetime" minOccurs="0"/>
      <xsd:element name="content_template_sequence" type="pms:content_template_sequence_summary_5002_1"/>
      <xsd:element name="content_sequence" type="pms:content_sequence_summary_5002_1"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="content_sequence_summary_5002_1">
    <xsd:sequence>
      <xsd:group ref="pms:group_date_5002_2_content" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:group ref="pms:tid_300" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:group ref="pms:group_comment_5002_4_content" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:group ref="pms:tid_5003" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attributeGroup ref="pms:content_sequence_attributes"/>
  </xsd:complexType>
  <xsd:group name="group_date_5002_2_content">
    <xsd:sequence>
      <xsd:element name="content_sequence_item" type="pms:date_5002_2_content"/>
    </xsd:sequence>
  </xsd:group>
  <xsd:complexType name="date_5002_2_content">
    <xsd:sequence>
      <xsd:element name="relationship_type" type="pms:relationship_type" fixed="CONTAINS"/>
      <xsd:element name="document_content_macro" type="pms:document_content_date_5002_2"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="document_content_date_5002_2">
    <xsd:sequence>
      <xsd:element name="value_type" type="pms:value_type" fixed="DATE"/>
      <xsd:element name="concept_name_code_sequence" type="pms:date_5002_2_concept_name"/>
      <xsd:element name="date" type="pms:date"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="date_5002_2_concept_name">
    <xsd:sequence>
      <xsd:element name="concept_name_code_sequence_item">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="code_sequence_macro" type="pms:date_5002_2_code"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
    <xsd:attributeGroup ref="pms:concept_name_code_sequence_attributes"/>
  </xsd:complexType>
  <xsd:complexType name="date_5002_2_code">
    <xsd:choice>
      <xsd:group ref="pms:cid_12003"/>
    </xsd:choice>
  </xsd:complexType>
  <xsd:group name="group_comment_5002_4_content">
    <xsd:sequence>
      <xsd:element name="content_sequence_item" type="pms:comment_5002_4_content"/>
    </xsd:sequence>
  </xsd:group>
  <xsd:complexType name="comment_5002_4_content">
    <xsd:sequence>
      <xsd:element name="relationship_type" type="pms:relationship_type" fixed="CONTAINS"/>
      <xsd:element name="document_relationship_macro" type="pms:document_relationship_comment_5002_4"/>
      <xsd:element name="document_content_macro" type="pms:document_content_comment_5002_4"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="document_relationship_comment_5002_4">
    <xsd:sequence>
      <xsd:element name="observation_datetime" type="pms:observation_datetime" minOccurs="0"/>
      <xsd:element name="content_template_sequence" type="pms:content_template_sequence_comment_5002_4"/>
      <xsd:element name="content_sequence" type="pms:content_sequence_comment_5002_4"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="content_sequence_comment_5002_4">
    <xsd:sequence>
      <xsd:group ref="pms:tid_320" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attributeGroup ref="pms:content_sequence_attributes"/>
  </xsd:complexType>
</xsd:schema>
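The regular naming in the fragment above suggests how such declarations can be produced programmatically. The following Python sketch is an illustration only and is not the authors' generator: the function names (`xsd`, `content_group`, `content_sequence_type`) and the choice of Python's standard `xml.etree.ElementTree` module are assumptions made for this example. It reproduces two patterns visible in the fragment, namely the `group_<label>_<tid>_<row>_content` wrapper around a `content_sequence_item` and the `content_sequence_<label>_<tid>_<row>` type that lists child groups. The occurrence constraints are hard-coded to match the fragment; a real generator would presumably derive them from each template row.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD_NS)  # serialize with the xsd: prefix


def xsd(tag):
    """Return the namespace-qualified name of an XML Schema element."""
    return f"{{{XSD_NS}}}{tag}"


def content_group(label, tid, row, prefix="pms"):
    """Build the xsd:group that wraps one template row.

    Naming follows the pattern seen in the appendix, for example
    group_date_5002_2_content referring to pms:date_5002_2_content.
    """
    base = f"{label}_{tid}_{row}_content"
    group = ET.Element(xsd("group"), {"name": f"group_{base}"})
    sequence = ET.SubElement(group, xsd("sequence"))
    ET.SubElement(
        sequence,
        xsd("element"),
        {"name": "content_sequence_item", "type": f"{prefix}:{base}"},
    )
    return group


def content_sequence_type(label, tid, row, child_groups, prefix="pms"):
    """Build the complexType that lists the child groups of a row.

    Assumption: every child is optional and repeatable, as in the
    appendix fragment; this is hard-coded here for brevity.
    """
    ctype = ET.Element(
        xsd("complexType"), {"name": f"content_sequence_{label}_{tid}_{row}"}
    )
    sequence = ET.SubElement(ctype, xsd("sequence"))
    for name in child_groups:
        ET.SubElement(
            sequence,
            xsd("group"),
            {"ref": f"{prefix}:{name}", "minOccurs": "0", "maxOccurs": "unbounded"},
        )
    ET.SubElement(
        ctype, xsd("attributeGroup"), {"ref": f"{prefix}:content_sequence_attributes"}
    )
    return ctype


if __name__ == "__main__":
    # Regenerate two declarations from the TID 5002 fragment.
    for node in (
        content_group("date", 5002, 2),
        content_sequence_type("comment", 5002, 4, ["tid_320"]),
    ):
        print(ET.tostring(node, encoding="unicode"))
```

Building the declarations as element trees rather than concatenating strings keeps the output well formed by construction, which matters when the result is itself meant to be consumed by a schema validator.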
The authors thank the anonymous reviewers for their useful comments, which improved the presentation of this paper.
Footnotes
Relationships are directional. Thus, the lower left Rel. (▶) indicates a relationship from the lower content item to the upper content item.
Due to the display limitation of XML Spy, the fixed value is not shown in the figure.
References
1. National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine. Available at: http://medical.nema.org/. Accessed Nov 9, 2004.
2. DICOM 2004, Part 3. Available at: http://medical.nema.org/dicom/2004/04_03PU.PDF. Accessed Nov 9, 2004.
3. Clunie DA. DICOM Structured Reporting. Bangor, PA: PixelMed Publishing, 2000.
4. DICOM 2004, Part 16. Available at: http://medical.nema.org/dicom/2004/04_16PU.PDF. Accessed Nov 9, 2004.
5. Health Level Seven. Available at: http://www.hl7.org/. Accessed Nov 9, 2004.
6. Tirado-Ramos A, Hu J, Lee KP. IOD-based UML representation of DICOM structured reporting: a case study on transcoding DICOM to XML. J Am Med Inform Assoc. 2002;9:63–71.
7. Lee KP, Hu J. XML Schema Representation of DICOM Structured Reporting. J Am Med Inform Assoc. 2003;10:214–24.
8. XML Schema. Available at: http://www.w3.org/XML/Schema/. Accessed Nov 9, 2004.
9. DICOM 2003, Supplement 26. Available at: ftp://medical.nema.org/medical/dicom/final/sup26_ft.pdf. Accessed Nov 9, 2004.
10. XML Spy. Available at: http://www.altova.com. Accessed Nov 9, 2004.
11. Majix. Available at: http://tetrasys.dhs.org/majix.html. Accessed Nov 9, 2004.
12. Xerces. Available at: http://xml.apache.org/xerces2-j/index.html. Accessed Nov 9, 2004.
13. Extensible Stylesheet Language. Available at: http://www.w3.org/Style/XSL/. Accessed Nov 9, 2004.
14. Loef C. Status and way forward with publishing DICOM in XML. Available at: http://medical.nema.org/dicom/minutes/committee/2003/2003-12-04/reports/dicom_in_xml_12_2003.ppt. Accessed Nov 9, 2004.
