Abstract
Given the lack of mechanisms for specifying, sharing and checking the compliance of consent permissions, we focus on building and testing novel approaches to address this gap. In our previous work, we introduced a “permission ontology” to capture in a precise, machine-interpretable form informed consent permissions in research studies. Here we explain how we built and evaluated a framework for specifying subject’s permissions and checking researcher’s resource request in compliance with those permissions. The framework is proposed as an extension of an existing policy engine based on the eXtensible Access Control Markup Language (XACML), incorporating ontology-based reasoning. The framework is evaluated in the context of the UCSD Moores Cancer Center biorepository, modeling permissions from an informed consent and a HIPAA form. The resulting permission ontology and mechanisms to check subject’s permission are implementation and institution independent, and therefore offer the potential to be reusable in other biorepositories and data warehouses.
Introduction
It is well known that biosamples and clinical data for research are currently in high demand, but they can be limited in availability. Moreover, it is argued that larger sample sets ensure greater generalizability of findings and improve the validity of results, leading to research that can benefit the public at large1. Currently, the research community mainly relies on paper-based informed consent documents. Some institutions might require that those paper documents be scanned and kept in an electronic form linked to the subject’s medical records. Recently there has been an increasing interest in exploring the replacement of those paper-based forms with electronic versions 2,3. A motivation behind this movement is that electronic forms support optional multi-media resources (e.g., videos, links to educational material) that could enhance subjects’ understanding of different aspects related to the informed consent process (e.g., risks of procedures agreed to be performed, consequences of participating in a research study). Another important motivation is that paper forms can contain subjects’ consent permissions that affect future research (e.g., permission to re-contact the subject, or to use the subject’s clinical data and biosamples for future research). If the informed consent process is electronically supported, it could eventually facilitate capturing subjects’ permissions that affect future research in a machine-interpretable formalism supporting automatic reasoning. Therefore, in order to maximize the use of available research resources, there is a need for developing computer-based reasoning mechanisms that can maximize researchers’ access to available clinical data and biosamples, while maintaining compliance with subjects’ permissions.
There have been some attempts to address the problem of maximizing researchers’ access to research resources available in the form of clinical data and biosamples. The National Cancer Institute (NCI) provides a web-based specimen resource locator (see http://pluto3.nci.nih.gov/tissue/search1/search_cancer.cfm) that enables locating samples and clinical data from patients with or without tumors. For instance, researchers can query the availability of 500 to 999 specimens of frozen or fresh serum/plasma and diagnosis information available from patients with bone tumor diagnosis. The NCI resource locator allows researchers to know if the requested resources are available within the network of NCI participating organizations. However, each resource has an established review process that must be completed before access is granted. Furthermore, those review processes can vary between organizations, or even within the same organization. While NCI can inform researchers if the requested resources are available, it cannot check subjects’ consent permissions. This is due to the lack of mechanisms for uniformly specifying and checking permission compliance. Consequently, NCI solely provides the researcher with information on how to contact the organization or organizations that hold the requested resources, which can grant access after manually or (semi)automatically checking permission compliance.
To address the current lack of machine-interpretable mechanisms for expressing and reasoning over permission consents we have proposed an Ontological Approach for the Management of Informed Consent Permissions4 that could enable permission information from multiple institutions to be combined into a single computable data representation. The proposed ontology would provide the semantic foundation for representing and validating permissions data in a variety of data capture forms and relational databases, that could be expressed via a web-based system or embedded in point-of-care clinical and research applications. Initially, we evaluated our ontology with a resource locator prototype defined as a set of Semantic Web Rule Language (SWRL) queries 5. While the engine helped us refine the ontology and determine its level of granularity, it was neither efficient nor scalable. Both limitations arise from choosing SWRL as the query mechanism. Therefore, here we investigate the feasibility of using eXtensible Access Control Markup Language (XACML) 6 for building a new scalable framework for specifying and checking subjects’ permissions. The new framework is proposed as an extension of an existing general-purpose policy engine implemented using XACML. The extension incorporates ontology-based reasoning to support complex reasoning on medical record specifications.
In this article, we propose a resource locator that maximizes researchers’ access to resources, while providing compliance with subjects’ consents. The new resource locator consists of two modules to check:
availability of requested resources, depending on the healthcare institution databases technologies;
compliance with subject’s permission, using XACML, independently of the technologies used by the institution.
The proposed resource locator prototype was evaluated in the context of the University of California San Diego (UCSD) Moores Cancer Center (MCC) biorepository for collecting and banking biosamples for use in cancer research. Evaluation results show that the new resource locator successfully retains the powerful reasoning features of the SWRL-based resource locator that we previously proposed 4, while improving its performance.
The proposed permission ontology and the mechanism to check subject’s permission supported by our resource locator are implementation and institution independent, and therefore reusable. Our research is novel, and constitutes a step forward on addressing the practical need of developing standards for expressing, sharing, and reasoning on permission consents to maximize availability of clinical data and biosamples for research.
The article is organized as follows. In the Background section we explain on-going approaches to deal with the general problem of specifying and checking policy compliance, and consent permission compliance in particular; we also revisit the ontology introduced in 4 and exemplify its use. Next, in the Methods section we provide details on how we built our XACML-based resource locator. The Results section discusses our testing procedure with de-identified data from patients who signed an MCC informed consent and HIPAA form; we also compare the proposed engine with our previous SWRL-based reasoning engine. Finally, we present a discussion of the results and suggest future research directions in the Conclusion section.
Background
In the field of business policy compliance the Organization for the Advancement of Structured Information Standards (OASIS) has proposed XACML as a standard specification, to promote common terminology and interoperability between authorization implementations by multiple vendors. Prior to XACML, every application vendor had to create its own proprietary method to specify access control policies, and these applications could not understand each other’s language. XACML has become the de facto standard for specifying access control policies. XACML is extensively used in enterprise policy modeling, because it defines: a) an XML-based declarative access control policy language and, b) a processing model describing how to evaluate authorization requests according to the rules defined in policies. In XACML a subject (e.g., an administrator) requests permission to perform an action on a particular resource (e.g., read a billing record). The subject’s request is compared with applicable access policies to determine whether it can be granted (e.g., administrators can read billing information, therefore the request is granted). XACML can support role-based access control (RBAC), a security mechanism for restricting information system access to authorized users. RBAC decides whether specific users should be allowed to perform certain operations or actions on specific objects within the context of sessions. Each user is assigned a set of role(s), and each role is assigned permission(s) to perform a set of operations on a set of objects. Proprietary and open-source policy engines implementing XACML have been proposed. For instance Sun Microsystems, Inc. (http://www.oracle.com/us/sun/index.htm) has introduced and maintains a Java-based XACML open source policy engine. A well-known drawback of XACML 2.0 policy engines is their limited ability to reason over hierarchical data structures, which motivated the deployment of built-in components to incorporate ontology-based reasoning 7,8.
In the area of health care, HL7 has proposed a Security and Privacy Ontology 9 to name, define, formally describe and interrelate key security and privacy concepts within the scope of HealthCare Information Technology (HIT). The HL7 ontology is also based on RBAC and shares some of the concepts of XACML, such as users, actions (called operate), resources (represented as objects) and policies. Moreover, it refines some concepts for the HIT context, such as defining a permission policy for reading (as a type of action) medical record entries (as a kind of resource).
Recently, the Research Permissions Management System (RPMS) 3 was built and tested by the University of South Carolina with Grand Opportunity grant funding from the National Library of Medicine to Health Sciences South Carolina. Leveraging the HL7 ontology, the RPMS 3 proposed a permission ontology that reuses the main relationships between consenters and the policies they have consented to. This ontology includes: 1) Policy – a collection of policy rules and descriptions that can be consented to or not; and 2) Policy Rule – the smallest action that can be consented to (e.g., contact for research). In RPMS, the HL7 ontology is extended with three new concepts: 3) Consenter – the patient or participant who is being asked to consent to a policy; 4) Encounter – the event that triggered the consent process, e.g., hospital visit; and 5) Consent – relates whether a consenter has granted consent or not to a policy. Using mobile tablet computing devices, patients in the RPMS could electronically consent to have clinical procedures (e.g., blood draw) performed, and those consent permissions were expressed in an ontology and loaded into a clinical data warehouse along with the patient’s clinical data. The stored information could then be leveraged for research purposes based on clinical criteria coupled with patient permission status (e.g., a researcher could request all male patients 50 years old or older who consented to be contacted for future research). This was accomplished by expressing the clinical data and consent permissions using Informatics for Integrating Biology and the Bedside (i2b2) 10.
As in RPMS, we proposed an Ontological Approach for the Management of Informed Consent Permissions4 based on the HL7 Security and Privacy Ontology. While for the design of our ontology we did not consider the HL7 restricted information system access (users, roles and sessions) relevant, we reused from HL7 the notion of policies. Hence, in our ontology when a consent is signed a set of consent rules are created. We defined our ontology using the Web Ontology Language (OWL)11 and the Protégé 3.4 tool. Our ontology is available at the National Center for Biomedical Ontology bioportal (http://www.bioontology.org/), and the main concepts and relationships are depicted in Figure 1. Our ontology supports expressing permissions or obligations (types of consent rules) that allow or oblige subjects or organizations to perform operations over biological specimens or medical records (types of information objects) under constraints (e.g., the organization is non profit and US based, or the information is only used for cancer research).
Figure 1:
Class diagram depicting main classes and properties of the proposed Permission Ontology
In order to build the taxonomy we adopted terms from standard biomedical terminologies (e.g., SNOMED CT) and Unified Medical Language System (UMLS) semantic types. An example of informed consent permission is: “The following information obtained from the subject’s medical record may be provided to research collaborators when specimens are made available: […] diagnosis (tumor stage and prognostic histologic markers) […]”. This snippet can be modeled as the permission that a research collaborator of UCSD MCC has to perform the operation of reading certain types of medical records (subclass Read of class Operation has the attribute operatesOn, restricted to subclass Diagnosis of MedicalRecord), in a certain state (for instance, the isDeidentified attribute in class InformationObject indicates whether the state is deidentified or non-deidentified), from the moment the patient signed the informed consent (attributes hasStartingDate and hasEndingDate in class Permission indicate the time frame for the given permission). For more details and practical examples on the use of our ontology we refer the reader to 4. In summary, both the RPMS and our ontology are based on the HL7 Security and Privacy ontology. However, our ontology allows the specification of more granular consent permissions, including the set of information objects affected by the permission/obligation policy and the constraints that affect that policy.
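As a rough illustration of how the permission in this example might be held in a program, the sketch below mirrors a few ontology terms (Read, Diagnosis, isDeidentified, hasStartingDate, hasEndingDate) in a Python data class. The class layout, the grantee field, and the signing date are assumptions made for illustration only; the ontology itself is defined in OWL, not in Python.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Permission:
    """Illustrative stand-in for a Permission individual in the ontology."""
    grantee: str                 # who may act (hypothetical field name)
    operation: str               # subclass of Operation, e.g. "Read"
    operates_on: str             # subclass of MedicalRecord, e.g. "Diagnosis"
    is_deidentified: bool        # required state of the information object
    has_starting_date: date      # start of the permission time frame
    has_ending_date: Optional[date] = None  # None means no stated end

    def is_active(self, on: date) -> bool:
        """True if the permission time frame covers the given day."""
        if on < self.has_starting_date:
            return False
        return self.has_ending_date is None or on <= self.has_ending_date


# Hypothetical values; the signing date is invented for the example.
example = Permission(
    grantee="ResearchCollaborator",
    operation="Read",
    operates_on="Diagnosis",
    is_deidentified=True,
    has_starting_date=date(2010, 1, 15),
)
```

A request dated before the consent was signed falls outside the time frame, which is why the starting date is checked first.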
Methods
The aim of our work was to deploy and evaluate a resource locator prototype (Figure 2) for supporting researchers’ access to clinical data and biosamples available in clinical data warehouses and biorepositories, while checking compliance with subjects’ consent permissions.
Figure 2:
Main components of the Resource Locator prototype. Through the graphical interface proposed in 4 the researcher can (1) search for clinical data and samples from patients with or without tumor diagnosis. The new Resource Availability Module (2) receives the request and (3,4) queries the data warehouse and biobank. Based on the (5, 6) query results, the new Resource Availability Module (7) sends to the Permission Compliance Module the set of patient and research study identifiers that satisfy the search criteria. For each of those patients the Permission Compliance Module (8) checks compliance with the corresponding research study’s consent permissions. The Permission Compliance Module (9) counts the patients who satisfy the search criteria and have given permission to share his/her data and samples for research. The number of available cases is (10) displayed through the interface.
Through an interface (Figure 3), inspired by the NCI resource locator’s interface, researchers can request clinical data and biosamples from patients with or without tumor/s diagnosis (e.g., plasma and diagnosis information from patients with breast cancer). However, researchers are required to indicate whether they belong to a profit/non profit organization, in/outside U.S., because this is a common constraint that occurs in patient’s permission consents. The resulting resource locator is built around two interacting modules.
Figure 3:
Graphical interface of the resource locator prototype.
The first module, called Resource Availability module, is institution and technology dependent. Therefore, it is not reusable. This module is responsible for mapping the researcher’s request, as captured by the interface, into a set of databases queries that can check the availability of the requested resources in the institution’s biosample repositories and data warehouses. After querying the databases, the output of this module is a set of patient and study identifiers. Those identifiers denote patients who satisfy the search criteria (e.g., have diagnosis of breast cancer, and their plasma and diagnosis information is available) and have signed research studies (e.g., signed UCSD MCC research study) that could potentially permit the sharing of data and samples for future research.
The second module, called Permission Compliance module, is institution and technology independent. Therefore, it is reusable. The latter module is based on the XACML extended engine and takes as input the output of the first module. Next, for each received patient identifier the module checks whether the available resources can be accessed as specified by the consent permissions contained in the signed research study (e.g., UCSD MCC informed consents and HIPAA privacy constraints). Those permissions are kept in a XACML permission repository. Finally, the resource locator indicates to the researcher how many cases satisfy his/her search criteria while fulfilling subjects’ permission compliance.
In the development of the resource locator prototype our permission ontology plays a key role. Using the permission ontology it is possible to specify institution-independent knowledge relevant to the decision process, allowing the second module of the resource locator to be fully reusable. Moreover, through mappings between concepts in the permission ontology (used by the second module) and the processed information (search criteria introduced through the interface and information used by the first module to query the databases) it is possible to provide knowledge consistency throughout the resource locator. For instance, through the interface it is possible to request Treatment Information, which can be mapped into the concept PatientTreatmentHistory in our permission ontology; and PatientTreatmentHistory can be subsequently mapped into the table Cancer Treatment in the set of MCC biorepository databases.
Below we explain in detail the two modules that constitute the resource locator:
1. Module for checking resource availability
This module should be customized to an institution, and to the institution’s technology and database structure. UCSD MCC biorepository provided us with SQL databases containing information on the clinical data and biosamples shared for future research by patients who signed the informed consent entitled “Collection and banking of tissue, blood and urine for use in cancer research”. Their databases contained information on 1223 patients. For each patient, MCC records contain the patient’s demographics, cancer history, cancer treatment, family cancer history, and information on the banked biosamples (urine, blood and tissue).
This module is responsible for mapping researcher’s request into a set of database queries that check the availability of the requested resources. For instance, assume that a researcher wants to retrieve treatment and demographic information and frozen samples of plasma from patients with the diagnosis of breast cancer. When this request is mapped into SQL queries that access MCC databases, the following considerations should be taken into account: 1) MCC considers three variants of breast cancer (breast cancer female, breast cancer male and breast cancer carcinoma in situ); 2) treatment information corresponds in MCC to the subset of cancer treatment information; and 3) demographics information in MCC is limited to race, gender and year of birth.
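The mapping from a request to database queries can be sketched as follows, assuming a small relational schema. The table names, column names, and stored diagnosis strings are hypothetical, since the MCC schema is not described here; only the three breast cancer variants come from the text above.

```python
import sqlite3

# The three MCC variants of breast cancer; the exact stored strings are assumed.
BREAST_CANCER_VARIANTS = (
    "breast cancer female",
    "breast cancer male",
    "breast cancer carcinoma in situ",
)


def find_available_cases(conn, diagnoses, sample_type, preservation):
    """Return (patient_id, study_id) pairs that match the request."""
    placeholders = ",".join("?" for _ in diagnoses)
    sql = f"""
        SELECT DISTINCT p.patient_id, p.study_id
        FROM patient AS p
        JOIN cancer_history AS h ON h.patient_id = p.patient_id
        JOIN biosample AS b ON b.patient_id = p.patient_id
        WHERE h.diagnosis IN ({placeholders})
          AND b.sample_type = ?
          AND b.preservation = ?
        ORDER BY p.patient_id
    """
    params = (*diagnoses, sample_type, preservation)
    return conn.execute(sql, params).fetchall()


# Tiny in-memory database with invented rows, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, study_id TEXT);
    CREATE TABLE cancer_history (patient_id INTEGER, diagnosis TEXT);
    CREATE TABLE biosample (patient_id INTEGER, sample_type TEXT, preservation TEXT);
    INSERT INTO patient VALUES (1, 'HRPP090401');
    INSERT INTO patient VALUES (2, 'HRPP090401');
    INSERT INTO patient VALUES (3, 'HRPP090401');
    INSERT INTO cancer_history VALUES (1, 'breast cancer female');
    INSERT INTO cancer_history VALUES (2, 'prostate cancer');
    INSERT INTO cancer_history VALUES (3, 'breast cancer male');
    INSERT INTO biosample VALUES (1, 'plasma', 'frozen');
    INSERT INTO biosample VALUES (2, 'plasma', 'frozen');
    INSERT INTO biosample VALUES (3, 'urine', 'frozen');
""")
cases = find_available_cases(conn, BREAST_CANCER_VARIANTS, "plasma", "frozen")
```

In this toy data only patient 1 has both a breast cancer diagnosis and frozen plasma, so only that patient and study identifier pair is returned for the permission check.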
The output of this module is the set of the identifiers of the patients that satisfy the researcher’s request criteria, and the identifiers of the research studies that each patient signed agreeing to contribute with his/her data and samples to the data warehouse and biobank.
2. Module for checking subject’s permission compliance
XACML 2.0 provides an XML-based mechanism to define attributes for an entity (tumor stage is specified as an attribute of DiagnosisTumor), and hierarchical knowledge (DiagnosisAlcoholAbuse is specified as a type of Diagnosis). But it does not provide mechanisms to dynamically reason on those types of specifications. For instance we cannot infer the following:
Is there a relationship of subclass or superclass within two entities belonging to the same taxonomy (is alcohol abuse diagnosis a type of diagnosis information)?
Does a property apply to an entity (does a record on tumor diagnosis have information on tumor stage)?
If for instance a researcher is requesting permission to read diagnosis information of patients with a breast cancer diagnosis, our resource locator should allow access to: a) diagnosis alcohol abuse, as a type of diagnosis information, and b) tumor stage, because it is part of the tumor diagnosis records.
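The two kinds of inference listed above can be illustrated with a small, self-contained sketch. The dictionaries below are a hypothetical fragment of the taxonomy written by hand; in the prototype this knowledge lives in the OWL permission ontology, and the property name tumorStage is assumed.

```python
# Hypothetical taxonomy fragment: child class -> parent class.
PARENT = {
    "DiagnosisTumor": "Diagnosis",
    "DiagnosisAlcoholAbuse": "Diagnosis",
    "Diagnosis": "MedicalRecord",
    "MedicalRecord": "InformationObject",
}

# Properties declared directly on a class (assumed names).
PROPERTIES = {
    "DiagnosisTumor": {"tumorStage"},
}


def is_subclass_of(cls, ancestor):
    """True if cls equals ancestor or reaches it by following parent links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False


def has_property(cls, prop):
    """True if prop is declared on cls or on one of its ancestors."""
    while cls is not None:
        if prop in PROPERTIES.get(cls, set()):
            return True
        cls = PARENT.get(cls)
    return False
```

With these helpers, a request for diagnosis information can be recognized as covering alcohol abuse diagnosis (a subclass) and tumor stage (a property of tumor diagnosis records).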
In this paper, we built an extension of the open source Sun-XACML engine. Our extension consists of building a bridge that supports the type of reasoning mentioned in a).
The bridge implementation consisted of defining two new XACML functions that enact SWRL rules to reason on our permission ontology: permit-ontology-reasoner and deny-ontology-reasoner. If an access request is received related to an instance of a resource class A, and the mentioned functions find a policy rule that permits(denies) access to resource class B, and class A and B are equivalent or A is a subclass of B, then the access should be permitted(denied). For instance if a request is received to access DiagnosisTumor information, and there is permission to access Diagnosis information, then the access should be granted because DiagnosisTumor is in our permission ontology a subclass of Diagnosis (as shown in Figure 1).
But there may be cases where the XACML engine determines that no policy is applicable. For instance, there is an access request for information on Diagnosis, but because there is a policy rule permitting access to DiagnosisTumor and there are policy rules denying access to all sensitive diagnosis information (DiagnosisAlcoholAbuse, DiagnosisDrugAbuse, DiagnosisMentalHealth), it is not possible to assign a single permit/deny policy rule to Diagnosis. Nevertheless, our XACML-extended engine should grant access to DiagnosisTumor information because it is a subclass of Diagnosis. In order to support these types of cases we extended the XACML engine with an algorithm that, when a policy is not applicable, traverses the taxonomy supported by our permission ontology and determines whether there are permit/deny policy rules that apply to subclasses of the requested resource. To determine the subclasses of the requested resource the algorithm enacts SWRL rules.
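The behavior of the bridge functions and of the fallback algorithm can be approximated in a few lines. This is a simplified sketch, not the Sun-XACML extension itself: the taxonomy and the rules are hard-coded dictionaries, subclass lookup is done in plain Python rather than through SWRL rules, and letting Deny override Permit among applicable rules is an assumption of the sketch.

```python
# Hypothetical taxonomy fragment: parent class -> direct subclasses.
CHILDREN = {
    "Diagnosis": [
        "DiagnosisTumor",
        "DiagnosisAlcoholAbuse",
        "DiagnosisDrugAbuse",
        "DiagnosisMentalHealth",
    ],
}
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}

# Policy rules of the example: resource class -> effect.
RULES = {
    "DiagnosisTumor": "Permit",
    "DiagnosisAlcoholAbuse": "Deny",
    "DiagnosisDrugAbuse": "Deny",
    "DiagnosisMentalHealth": "Deny",
}


def ancestors_or_self(cls):
    """Yield cls followed by each of its superclasses."""
    while cls is not None:
        yield cls
        cls = PARENT.get(cls)


def direct_decision(requested, rules):
    """Apply rules on the requested class or a superclass of it."""
    effects = {rules[c] for c in ancestors_or_self(requested) if c in rules}
    if "Deny" in effects:
        return "Deny"
    if "Permit" in effects:
        return "Permit"
    return "NotApplicable"


def decide(requested, rules):
    """Return {class: effect}, descending into subclasses when no rule applies."""
    decision = direct_decision(requested, rules)
    if decision != "NotApplicable":
        return {requested: decision}
    result = {}
    for child in CHILDREN.get(requested, []):
        result.update(decide(child, rules))
    return result
```

For a request on Diagnosis no rule applies directly, so the sketch reports a Permit for DiagnosisTumor and a Deny for each of the three sensitive subclasses, which matches the outcome described above. With a single rule permitting Diagnosis, a request on DiagnosisTumor is permitted through the superclass check.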
We built the module for checking subject’s permission compliance based on the explained extension. The input of this module is the output of the module that checks resource availability, i.e., the set of identifiers of the patients that satisfy the researcher’s request criteria, and the set of identifiers of the research studies signed by each patient. First, for each patient identifier and research study this module creates a XACML access request, which is expressed in terms of our permission ontology. A XACML access request is defined in terms of Subject, Resource and Action elements. The subject represents the entity making the access request (who wants access to the resource). The resource element defines the data, service, or system component that the subject wants to access. An action describes the operation that the subject wishes to perform on the resource. Let us consider an example: a request from a non profit US research organization to access diagnosis information of a patient with identifier patientID who signed the MCC research study. The subject is the non profit US research organization, diagnosis information of a patient who signed MCC research study HRPP090401 is the resource, and the action is to read that information. Diagnosis information is mapped in our permission ontology as the class Diagnosis, a subclass of InformationObjects (see Figure 1). Additionally, the action is mapped into our permission ontology as Read, a subclass of the class Operation.
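To make the structure of such a request concrete, the sketch below assembles a XACML 2.0 request context with Python's standard XML library. The subject-id, resource-id and action-id attribute identifiers are the standard XACML ones; the two urn:example identifiers for patient and study, and the value used for the organization type, are assumptions, since the attribute identifiers actually used by the prototype are not given here.

```python
import xml.etree.ElementTree as ET

XACML_CTX = "urn:oasis:names:tc:xacml:2.0:context:schema:os"
XS_STRING = "http://www.w3.org/2001/XMLSchema#string"


def _section(parent, tag, attributes):
    """Add a Subject/Resource/Action element holding string attributes."""
    section = ET.SubElement(parent, f"{{{XACML_CTX}}}{tag}")
    for attribute_id, value in attributes.items():
        attribute = ET.SubElement(
            section,
            f"{{{XACML_CTX}}}Attribute",
            {"AttributeId": attribute_id, "DataType": XS_STRING},
        )
        ET.SubElement(attribute, f"{{{XACML_CTX}}}AttributeValue").text = value
    return section


def build_request(organization, resource_class, patient_id, study_id, action_class):
    """Build a request expressed with permission ontology class names."""
    request = ET.Element(f"{{{XACML_CTX}}}Request")
    _section(request, "Subject", {
        "urn:oasis:names:tc:xacml:1.0:subject:subject-id": organization,
    })
    _section(request, "Resource", {
        "urn:oasis:names:tc:xacml:1.0:resource:resource-id": resource_class,
        "urn:example:patient-id": patient_id,  # hypothetical identifier
        "urn:example:study-id": study_id,      # hypothetical identifier
    })
    _section(request, "Action", {
        "urn:oasis:names:tc:xacml:1.0:action:action-id": action_class,
    })
    ET.SubElement(request, f"{{{XACML_CTX}}}Environment")  # left empty
    return request


request = build_request(
    "NonProfitUSResearchOrganization",  # assumed label for the subject
    "Diagnosis", "patientID", "HRPP090401", "Read",
)
xml_text = ET.tostring(request, encoding="unicode")
```

The resulting document carries one Subject, one Resource with three attributes, one Action, and an empty Environment element.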
Second, the XACML access request is sent to the extended XACML engine, which checks the content of the XACML permission repository to determine if the permission can be granted. In the case of the MCC biorepository, the XACML permission repository contains the set of permission policies agreed by the subjects who signed the informed consent called “Collection and banking of tissue, blood and urine for use in cancer research”.
Third, the XACML extension evaluates for each request if the permission should be granted based on the request and the taxonomy of medical records specified in the permission ontology.
Fourth, the decision is sent as a response. Finally, the module counts how many patient cases could be shared with the researcher and generates as output that number.
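The four steps can be summarized in a short loop. In this sketch the extended XACML engine is replaced by a stand-in function, requests are plain dictionaries rather than XACML documents, and counting each patient at most once even when several signed studies permit access is an assumption.

```python
def count_shareable_cases(cases, resource_class, action_class, evaluate):
    """Count patients for whom the engine stand-in answers 'Permit'."""
    permitted_patients = set()
    for patient_id, study_id in cases:
        # Step 1: one access request per patient and signed study.
        request = {
            "patient": patient_id,
            "study": study_id,
            "resource": resource_class,
            "action": action_class,
        }
        # Steps 2 to 4: the engine evaluates the request and responds.
        if evaluate(request) == "Permit":
            permitted_patients.add(patient_id)
    return len(permitted_patients)


def stand_in_engine(request):
    """Hypothetical policy: study HRPP090401 permits reading DiagnosisTumor."""
    permitted = (
        request["study"] == "HRPP090401"
        and request["resource"] == "DiagnosisTumor"
        and request["action"] == "Read"
    )
    return "Permit" if permitted else "Deny"


# Invented identifiers standing in for the output of the first module.
cases = [("P1", "HRPP090401"), ("P2", "OTHER-STUDY"), ("P3", "HRPP090401")]
available = count_shareable_cases(cases, "DiagnosisTumor", "Read", stand_in_engine)
```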
Results
1. Evaluation of the resource locator prototype with the informed consent used by the MCC Biorepository
The evaluation required extracting from the IRB approved research plan for the study “Collection and banking of tissue, blood and urine for use in cancer research” (HRPP090401) the permissions requested from participating subjects. For instance: “The following information obtained from subject’s medical record may be provided to research collaborators when specimens are made available: age at time of blood collection, [tumor] diagnosis (tumor stage, prognostic histologic markers and site), clinical outcome […] and demographic data”. Those permissions were expressed in XACML, and were saved in the XACML Permission Repository. The process of interpreting the research plan of the clinical trial used by MCC, and extracting the consent permissions, was manually performed a posteriori. To populate the XACML Permission Repository we wrote a program that extracted the required information from the databases provided by the MCC Biorepository and expressed that information as XACML policy rules.
XACML policy Rules are the core part of the XACML policies. Rules are constituted of a Target and Condition. Rule targets specify Subjects/Resources/Actions for matching between rules and requests. Conditions describe constraints for the specified actions to be performed on given resources by chosen subjects. A rule’s effect is either Permit or Deny.
Figure 5 shows how we specified in XACML the policy rule with effect Permit that permits to share for research diagnosis tumor information of patients who signed MCC research study. The policy rule gives permission to perform the action of reading (Read as subclass of Operation) over the resource of tumor diagnosis information (DiagnosisTumor as subclass of Diagnosis), under the condition that the access request is linked to a patient who has signed the MCC research study with identifier HRPP090401. No restriction is set on the type of subject (e.g., profit or non profit research organization) who can get access to the resource.
Figure 5:
XACML policy specifying that the Resource corresponding to patient’s diagnosis tumor information can be accessed (as denoted by the Action read), under the Condition that the patient signed the MCC research study with identifier HRPP090401.
Given that all the patients signing the MCC informed consent are agreeing to the same conditions it was not necessary to save for each patient the consent permissions, but it was enough to save them once and assign them the identifier HRPP090401.
It is worth mentioning that the references to the function permit-ontology-reasoner throughout the XACML policy depicted in Figure 5 are part of the bridge that we built for incorporating ontology reasoning into the XACML engine.
XACML provides a library of rule-combining algorithms that combine the effects of all the rules in a policy to arrive at a final authorization decision (although custom algorithms can be used as well). From the library of XACML rule-combining algorithms, we chose the deny-overrides one. So, if any rule evaluated to Deny, then the final authorization decision is also to Deny.
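The rule structure and the deny-overrides behavior described above can be sketched as follows. This is a simplified model: targets are matched by exact class name, requests are dictionaries, and the Indeterminate outcome that the XACML specification also defines is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, str]


def always_true(request: Request) -> bool:
    """Default condition for rules that carry no extra constraint."""
    return True


@dataclass
class Rule:
    effect: str                  # "Permit" or "Deny"
    resource: str                # target: resource class
    action: str                  # target: operation class
    condition: Callable[[Request], bool] = always_true

    def evaluate(self, request: Request) -> str:
        target_matches = (
            request["resource"] == self.resource
            and request["action"] == self.action
        )
        if target_matches and self.condition(request):
            return self.effect
        return "NotApplicable"


def deny_overrides(rules: List[Rule], request: Request) -> str:
    """Deny if any rule denies; otherwise Permit if any rule permits."""
    decision = "NotApplicable"
    for rule in rules:
        result = rule.evaluate(request)
        if result == "Deny":
            return "Deny"
        if result == "Permit":
            decision = "Permit"
    return decision


def signed_mcc_study(request: Request) -> bool:
    return request.get("study") == "HRPP090401"


rules = [
    Rule("Permit", "DiagnosisTumor", "Read", signed_mcc_study),
    Rule("Deny", "DiagnosisAlcoholAbuse", "Read"),
]
tumor_request = {"resource": "DiagnosisTumor", "action": "Read", "study": "HRPP090401"}
decision = deny_overrides(rules, tumor_request)
```

Choosing deny-overrides means that a single restriction expressed by a subject wins over any number of permissions, which is the conservative choice for consent compliance.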
Once the resource locator prototype was built, MCC biorepository provided us with 32 requests for resources received from researchers. Currently, in MCC the first checking is (semi)automatically done by defining SQL queries based on the researcher’s request, while the second checking is carried out manually. From those requests 12 could not be specified with the graphical interface provided in Figure 3 because they required modeling additional constraints. For instance: select only African-American patients, or patients with diagnosis of tumor stage I and II.
This limitation is based on our choice of using a similar graphical interface as the one provided by NCI to allow researchers to request the access to resources. Future versions of our prototype could provide more specific search criteria, like the ability to choose the gender and race of the patients, or types of tumor stages.
For the 20 requests that could be enacted in our resource locator, the evaluation showed that the outcome of the prototype always included the set of resources that MCC shared with the requesting researcher. For instance, a researcher requested 50 patient cases meeting certain criteria, and our resource locator determined that 90 cases could have been shared. The 50 cases that were shared with the researcher were all included in the set of 90 cases. We interpret these results as evidence that our resource locator can accurately and automatically determine resource availability and permission compliance.
2. Evaluation of the resource locator prototype with the informed consent and HIPAA forms used by the MCC Biorepository
Subjects signing MCC informed consent also sign a “Permission to use personal health information for research”, which allows the subjects to restrict the access to sensitive information contained in their medical record, like: diagnosis and treatment information related to drug and alcohol abuse, HIV/AIDS testing information, genetic testing information, and information pertaining to mental health diagnosis or treatment. This document is a consent designed by the University of California to comply with the HIPAA laws. Electronic records of these consents were not kept by MCC because no subject chose to restrict the access to their sensitive data. Nevertheless, we decided to further test our resource locator by randomly simulating, for the 1223 patients that signed MCC research study cases, restrictions to the access to some or all the sensitive information listed above. These simulated cases helped us explore in more depth the benefits that extending the XACML engine with ontological reasoning could bring to the problem of checking permission compliance. For instance, we could model cases where a patient signed the MCC research study giving permission to share cancer diagnosis information, but signed the “Permission to use personal health information for research” denying access to his/her alcohol and drug abuse diagnosis information. Therefore, when the resource locator received a request asking for diagnosis information, it could only grant access to the cancer diagnosis information of that patient (modeled in the permission ontology as a type of Diagnosis information), but could give no access to the patient’s alcohol and drug abuse diagnosis information (also modeled in the permission ontology as a type of Diagnosis information).
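A simulation of this kind could be set up roughly as shown below. The procedure actually used to generate the restrictions is not detailed here, so the sampling scheme, the seed, and the class names for HIV and genetic testing information are assumptions; only the three sensitive diagnosis classes are named in the text.

```python
import random

SENSITIVE_CLASSES = [
    "DiagnosisAlcoholAbuse",      # named in the article
    "DiagnosisDrugAbuse",         # named in the article
    "DiagnosisMentalHealth",      # named in the article
    "HIVTestingInformation",      # hypothetical class name
    "GeneticTestingInformation",  # hypothetical class name
]


def simulate_restrictions(patient_ids, rng):
    """Give each patient a random, possibly empty, set of denied classes."""
    restrictions = {}
    for patient_id in patient_ids:
        how_many = rng.randint(0, len(SENSITIVE_CLASSES))
        restrictions[patient_id] = set(rng.sample(SENSITIVE_CLASSES, how_many))
    return restrictions


def can_read(restrictions, patient_id, resource_class):
    """A class is readable unless the patient restricted it."""
    return resource_class not in restrictions[patient_id]


rng = random.Random(2024)  # fixed seed so the simulation is repeatable
restrictions = simulate_restrictions(range(1, 1224), rng)
readable_tumor = sum(
    can_read(restrictions, pid, "DiagnosisTumor") for pid in restrictions
)
```

Because tumor diagnosis is never among the restricted classes, it stays readable for all 1223 simulated patients, while access to each sensitive class depends on the individual restrictions.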
3. Evaluation of the scalability and performance of the proposed resource locator prototype
The first resource locator prototype that we built4 was purely ontological. We wrote a Java program that, based on the researcher’s resource request, dynamically generated and enacted SWRL queries that could inspect and reason over the content of the permission ontology. To populate the permission ontology, the MCC biorepository provided us with Excel spreadsheets containing 700 patient cases. To extract the content of the Excel sheets into our permission ontology we used the Protégé plugin MappingMaster12. While the first prototype helped us to further test the expressiveness and granularity of the proposed ontology, it was neither scalable nor efficient. For some of the queries the resource locator took up to one hour to provide the expected outcomes. While SWRL is highly expressive, its expressivity comes at the expense of decidability. In our context this means that when the number or the complexity of the modeled permissions or patient cases increases, we cannot guarantee that the resource locator can determine whether a request is in compliance with subjects’ permissions.
On the other hand, the Sun XACML engine that we used for implementing our second prototype has been shown to be very efficient13,14. Turkmen and Crispo13 analyzed the performance of different XACML engines and reported that the Sun XACML engine took 4 seconds to evaluate 10000 random policies. When we tested the resource locator with the MCC informed consent and HIPAA form, our performance analysis showed that the expected outcomes were generated in the order of seconds: it took from 4 to 18 seconds for the resource locator to evaluate 1232 policies. We assume that the enactment of SWRL queries affected the engine’s performance. Nevertheless, compared with the previous prototype, which was purely SWRL-based, the performance improved dramatically.
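To put these figures side by side, the timings can be converted to an average time per policy. The sketch below uses only the numbers quoted above; the function name is hypothetical. The comparison is indicative at best, because the benchmark and our tests differ in policies, requests and hardware.

```python
def ms_per_policy(total_seconds, n_policies):
    """Average evaluation time per policy, in milliseconds."""
    return 1000.0 * total_seconds / n_policies


benchmark = ms_per_policy(4, 10000)  # reported Sun XACML benchmark
fastest = ms_per_policy(4, 1232)     # resource locator, best case
slowest = ms_per_policy(18, 1232)    # resource locator, worst case

print(round(benchmark, 2))  # 0.4
print(round(fastest, 2))    # 3.25
print(round(slowest, 2))    # 14.61
```

On these averages, the resource locator spends roughly 8 to 37 times longer per policy than the plain engine in the reported benchmark, which is consistent with the assumption that the SWRL query enactment adds overhead.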
Although our current prototype implements a SWRL-based ontology bridge, our future plan is to explore replacing the bridge with a SPARQL-based one and to analyze how this affects the performance and efficiency of the engine. Given that SPARQL query enactment is known to be more efficient than SWRL query execution, we expect to improve the overall resource locator performance.
Discussion
In our Resource Locator prototype, checking compliance with subjects’ consent is achieved through an ontology-based component built on XACML, a well-developed and robust policy compliance standard. The use of the standard and the ontology makes this component fully reusable. On the other hand, a drawback of our current Resource Locator prototype is that we have built our own module for checking the availability of resources. Our module converts the researcher’s resource request into a set of queries to be enacted on the institution’s databases. We are currently considering replacing this module with an i2b2 10 component that could provide similar functionality. Because the Resource Locator is modular, the existing module for checking resource availability should be easily replaceable with a new i2b2 component. Given that an i2b2 system has been used before in RPMS3 to check simple consent permission compliance (e.g., check if there is a record of the patient consenting to be recontacted for research), the possibility of extending an i2b2 component to support more complex compliance checking (e.g., reasoning in the presence of consent permissions and denials) remains open.
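The modularity argument can be made concrete with a minimal sketch. All class and method names below are hypothetical and do not correspond to the prototype's actual Java code; the two in-memory classes merely stand in for the institution-specific availability module and for the XACML-based compliance component.

```python
from typing import Dict, Protocol, Set


class AvailabilityChecker(Protocol):
    """Institution-specific: which resources match the request criteria."""
    def available(self, criteria: Dict[str, str]) -> Set[str]: ...


class ComplianceChecker(Protocol):
    """Reusable: does the subject's consent permit this use of the resource."""
    def permits(self, resource_id: str, purpose: str) -> bool: ...


class InMemoryAvailability:
    """Stand-in for a module that queries the institution's databases."""
    def __init__(self, records: Dict[str, Dict[str, str]]):
        self.records = records

    def available(self, criteria: Dict[str, str]) -> Set[str]:
        return {rid for rid, attrs in self.records.items()
                if all(attrs.get(k) == v for k, v in criteria.items())}


class ConsentTable:
    """Stand-in for the policy-based compliance component."""
    def __init__(self, consents: Dict[str, Set[str]]):
        self.consents = consents

    def permits(self, resource_id: str, purpose: str) -> bool:
        return purpose in self.consents.get(resource_id, set())


class ResourceLocator:
    """Combines the two modules; either can be swapped independently."""
    def __init__(self, availability: AvailabilityChecker,
                 compliance: ComplianceChecker):
        self.availability = availability
        self.compliance = compliance

    def locate(self, criteria: Dict[str, str], purpose: str) -> Set[str]:
        return {rid for rid in self.availability.available(criteria)
                if self.compliance.permits(rid, purpose)}


records = {"s1": {"tissue": "breast"}, "s2": {"tissue": "breast"},
           "s3": {"tissue": "colon"}}
consents = {"s1": {"cancer-research"}, "s2": set(), "s3": {"cancer-research"}}
locator = ResourceLocator(InMemoryAvailability(records), ConsentTable(consents))
print(sorted(locator.locate({"tissue": "breast"}, "cancer-research")))  # ['s1']
```

Under this arrangement, replacing the availability module means supplying a different object with an `available` method; the compliance component and the locator itself are left unchanged.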
Exploring the use of the Integrated Rule-Oriented Data System (iRODS)15 as a framework to check resource availability and consent permission compliance also remains an open problem. iRODS was proposed for building the next generation of data management cyberinfrastructure. At the core of iRODS is a rule engine that enforces and executes data management policies for replication, distribution, pre- and post-processing, and metadata extraction and assignment. iRODS is a well-developed and robust system that is already in use and shows competitive performance. For instance, the Integrated BIOMolEcular Simulations (iBIOMES) system was built using iRODS to create a virtual data warehouse that facilitates data handling for researchers in the field of biomolecular simulation. In iBIOMES, data can be distributed among multiple servers and searched through metadata queries. Given its high performance and expressiveness, the iRODS framework could be a good candidate for implementing data access policies based on consent permissions.
The research reported here was part of the Integrating Data for Analysis, Anonymization, and Sharing (iDASH) project 16. Also as part of the iDASH project, we are currently building a web-based tool to allow subjects from the UCSD Health System to understand the consequences of sharing their clinical data for research, and to choose which information from their medical records they want to share. The proposed web tool could be further improved and tested as part of a 200-patient pilot randomized controlled trial to understand subjects’ choices on data sharing. In the context of the tool’s one-year trial, there is room for a resource locator such as the one we proposed here, because for the duration of the evaluation study the UCSD clinical data warehouse will honor the consent permissions chosen by the participating patients. Consequently, when IRB-approved clinical data requests are received by the UCSD clinical data warehouse, a resource locator could decide which available clinical data could be shared in compliance with subjects’ choices. Therefore, as future work we would like to adapt and evaluate the XACML-based resource locator proposed here in the context of the planned iDASH randomized controlled trial. Although we expect to reuse the module for checking permission compliance from our current resource locator, addressing this new problem will require creating a new module for checking resource availability. The new module will be adapted to the database implementation choices and structure of the UCSD clinical data warehouse. A more challenging evaluation scenario that we would like to address in the future is the use of the resource locator in contexts where clinical data and/or biosamples could be subject to different consent permissions, as in the NCI scenario described in the Introduction Section. These types of scenarios could require retrieving resources collected by different institutions that use different consent permissions.
In conclusion, given the current lack of automatic mechanisms for sharing, integrating and checking compliance with subjects’ consent, we believe that our research is novel and could have an important future practical impact that should be further explored.
Acknowledgments
The research was supported by iDASH: Integrating Data for Analysis, Anonymization, and Sharing (1U54HL108460), funded by the NIH National Heart, Lung and Blood Institute. We would also like to acknowledge Neda Alipanah, Claudiu Farcas and Mona Wong for useful discussions on the tool development.
References
- 1. Hewitt RE. Biobanking: the foundation of personalized medicine. Curr Opin Oncol. 2011;23(1):112–119. doi: 10.1097/CCO.0b013e32834161b8.
- 2. Boxwala AA, Barker J, Johnstone E, Gupta A. Presentation at CTSA Annual Informatics Meeting: Electronic Capture and Management of Informed Consent for Research; 2011.
- 3. Obeid J, Gerken K, Madathil KC, et al. Development of an Electronic Research Permissions Management System to Enhance Informed Consents and Capture Research Authorization Data. AMIA Summit on Clinical Research Informatics. 2013.
- 4. Grando MA, Boxwala A, Schwab R, Alipanah N. Ontological Approach for the Management of Informed Consent Permissions. 2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology (HISB); 2012. pp. 51–60.
- 5. Horrocks I, Patel-Schneider P, Boley H, Tabet S, Grosof B, Dean M. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. 2004. Available at: http://www.w3.org/Submission/SWRL/. Accessed October 5, 2012.
- 6. OASIS. XACML 2012. Available at: https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml.
- 7. Abou-Tair DD, Berlik S, Kelter U. Enforcing Privacy by Means of an Ontology Driven XACML Framework. Third International Symposium on Information Assurance and Security, 2007 IAS 2007; 2007. pp. 279–284.
- 8. Ferrini R, Bertino E. Supporting RBAC with XACML+OWL. Proceedings of the 14th ACM symposium on Access control models and technologies; New York, NY, USA: ACM; 2009. pp. 145–154. SACMAT ’09.
- 9. HL7. HL7 Security and Privacy Ontology (Ballot Reconciliation Version; In progress) 2011. Available at: http://www.hl7.org/. Accessed June 30, 2012.
- 10. Murphy SN, Weber G, Mendis M, et al. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc. 2010;17(2):124–130. doi: 10.1136/jamia.2009.000893.
- 11. W3C OWL Working Group. OWL 2 Web Ontology Language Document Overview. W3C; 2009. Available at: http://www.w3.org/TR/owl2-overview/.
- 12. O’Connor MJ, Halaschek-Wiener C, Musen MA. Mapping Master: a Flexible Approach for Mapping Spreadsheets to OWL. 9th International Semantic Web Conference (ISWC); China: Springer-Verlag; 2010. pp. 194–208. Available at: http://data.semanticweb.org/conference/iswc/2010/paper/414. Accessed May 17, 2012.
- 13. Turkmen F, Crispo B. Performance evaluation of XACML PDP implementations. Proceedings of the 2008 ACM workshop on Secure web services; New York, NY, USA: ACM; 2008. pp. 37–44. SWS ’08.
- 14. Liu AX, Chen F, Hwang J, Xie T. XEngine: A Fast and Scalable XACML Policy Evaluation Engine. Proceedings of the 2008 ACM SIGMETRICS international conference on measurement and modeling of computer systems; 2008. pp. 265–276.
- 15. iRODS: Data Grids, Digital Libraries, Persistent Archives, and Real-time Data Systems. Available at: https://www.irods.org/index.php/IRODS:Data_Grids,_Digital_Libraries,_Persistent_Archives,_and_Real-time_Data_Systems. Accessed July 9, 2013.
- 16. iDASH: integrating Data for Analysis, Anonymization, and SHaring. Available at: http://idash.ucsd.edu/.