Author manuscript; available in PMC: 2016 Jan 1.
Published in final edited form as: Neuroinformatics. 2015 Jul;13(3):383–386. doi: 10.1007/s12021-015-9263-8

The Three NITRCs: A Guide to Neuroimaging Neuroinformatics Resources

David N Kennedy 1, Christian Haselgrove 1, Jon Riehl 2, Nina Preuss 3, Robert Buccigrossi 3
PMCID: PMC4470758  NIHMSID: NIHMS691099  PMID: 25700675

Introduction

Information management is critical as the landscape of neuroscience-related shared resources (data, software, computation, etc.) expands. Since 2006, the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC) has provided a comprehensive support infrastructure for resources in the neuroimaging domain1. NITRC is funded by the NIH Blueprint for Neuroscience Research as well as four NIH institutes 2 3, and its mission is to facilitate finding and comparing neuroimaging resources for neuroimaging analyses. Over the years the scope of these resources has expanded to support scientific domains from MR to PET, SPECT, CT, MEG/EEG, optical imaging, genetic imaging, clinical neuroinformatics and computational neuroscience. A broad set of initiatives has been developed to support these research areas.

Early in the development of NITRC, it was realized that a ‘clearinghouse’ of resources was just one component of the infrastructure needed for a complete integration of neuroimaging resources. Specifically, while the original NITRC website (referred to as the NITRC Resources Repository, NITRC-R) facilitated the finding of software and data, there was still a need to expand the capacity for data hosting. Furthermore, once a user finds software and data, the current model of downloading each resource to one’s own local computer quickly becomes rate limiting as shared datasets grow larger and processing software becomes more complex, specific, and CPU intensive. Datasets such as the ‘1000 Functional Connectomes’4, ‘Pediatric Imaging Neurocognition and Genetics (PING)’5 and the ‘Autism Brain Data Exchange (ABIDE)’6, for example, contain thousands of subjects, each with structural and resting-state fMRI data for which standard processing can take days per subject. Such examples quickly outstrip the local analysis capability of many investigators and thus motivate the development of additional solutions. To address these challenges, we embarked upon the creation of the NITRC Image Repository (NITRC-IR) to provide a data sharing solution that is closely integrated with the resource description, support, promotion, and management functions provided by NITRC-R for its hosted projects. We also embarked upon the development of the NITRC Computational Environment (NITRC-CE), a cloud-based, dynamic, high-performance, and easy-to-use computational platform that can be tailored to the computational needs of the NITRC community.

NITRC Resources Repository – An Update

NITRC-R continues to be the go-to site for neuroimaging analysis resources. It currently hosts 729 publicly accessible projects and has 11,251 registered users; user registrations, file downloads, and unique monthly visitors new to the site are all increasing. Since tracking via Google Analytics began in May 2009, NITRC-R has had more than 3.7 million page views, comprising 878,325 visits by 399,983 unique visitors who viewed, on average, 4.2 pages per visit.

NITRC Image Repository – Featuring XNAT

As researchers pursue data sharing, they discover that the easiest way to share data is to make an archive file of the data and post it to a website. This was the initial path followed by the 1000 Functional Connectomes consortium 4 7. This path has few barriers to data release, other than preferably storing the data in a format that can be understood by other investigators (in this case, the NIfTI image format8). Releasing data to a website is quick and avoids any need to deal with thorny matching of data schema or mediation of values with existing databases or other sharing mechanisms. The trade-off is that this ‘ease’ of data sharing is matched by a concomitant ‘difficulty’ of data use. Any user who is interested in a subset of the data must download the entire archive and sort through the data on their own to identify the specific data of interest. When the data contain thousands of subjects spread across hundreds of archive files, this task can quickly daunt the end user. This ‘data alignment’ problem is exacerbated when multiple groups use archive files with varying data formats and metadata encodings (age, gender, diagnosis, etc.). As this type of sharing proliferates, the ability of data users to integrate data will be greatly diminished.

The alternative to archive file sharing is to promote data sharing in the context of a searchable repository, where data subsets can be identified and specific data selected prior to download. This requires that all the data be associated with a set of searchable metadata to support the query mechanism. A searchable data repository can typically be achieved using an image database or data management platform. In order to construct a query system, the metadata associated with the image data must conform to a common data representation schema. This places requirements upon the data providers to supply the data needed by the underlying schema, and upon the data hosting system to be extensible, so that it can accommodate additional metadata fields as they appear in shared data.
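The difference between downloading whole archives and querying a repository can be illustrated with a minimal sketch. It assumes a tiny in-memory table of session metadata that follows one common schema; the `Session` fields, the `query` helper, and the sample records are all hypothetical illustrations and do not reflect the actual XNAT schema or any real dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One imaging session described by a shared (hypothetical) metadata schema."""
    project: str
    subject_id: str
    sex: str
    age_years: int
    field_strength_tesla: float
    tr_seconds: float


def query(sessions, **criteria):
    """Return the sessions whose metadata match every key=value criterion."""
    return [
        s for s in sessions
        if all(getattr(s, field) == value for field, value in criteria.items())
    ]


# Made-up records from two made-up projects, for illustration only.
sessions = [
    Session("project_a", "a01", "M", 11, 3.0, 3.0),
    Session("project_a", "a02", "F", 11, 3.0, 3.0),
    Session("project_b", "b07", "M", 11, 3.0, 3.0),
    Session("project_b", "b08", "M", 14, 1.5, 2.0),
]

# Cross-project query: select matching sessions before downloading anything.
hits = query(sessions, sex="M", age_years=11,
             field_strength_tesla=3.0, tr_seconds=3.0)
for s in hits:
    print(s.project, s.subject_id)
```

Because every record uses the same field names, a single query can pool matching sessions across projects. That is the property that ad hoc archive files with differing metadata encodings do not offer.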

Numerous data management platforms were in use at the time NITRC sought to provide an Image Repository system in 2009. After an assessment of NITRC community requirements and the available systems, the eXtensible Neuroimage Archive Toolkit (XNAT)9 10 was selected as the image database system to pair with the NITRC resource registry.

We customized an XNAT instance to seamlessly interoperate with the NITRC-R environment. This included establishing a common user-base mapping between the NITRC and XNAT applications as we desired a ‘single sign-on’ and needed to manage project-level permissions via the NITRC-R users and projects. NITRC-R projects can be associated with NITRC-IR (XNAT) ‘projects’ and these can be interlinked. In addition to within-project viewing and searching functions, users can also interact with the entire NITRC-IR portfolio of data (that they have permission to access) at a level ‘above’ the individual projects11. This greatly facilitates searches across the projects and permits ease of identification and aggregation of comparable data within multiple projects.

As of December 2014, there are 7620 imaging sessions available in the repository associated with 12 projects. These 12 NITRC-IR projects average 635 sessions per project.

NITRC Computational Environment – Featuring NeuroDebian

The NITRC-CE computational environment is currently deployed both for the Amazon Elastic Compute Cloud (EC2) environment and as a standalone virtual machine builder. The NITRC-CE system is built upon the Ubuntu 12.04 Linux release using NeuroDebian12 13. It currently supports pre-installed versions of many software packages (FreeSurfer14, FSL15, AFNI16, etc. – see the NITRC-CE software listing17 for the complete list). Additional pre-installed software is being added with each release. Users can add further software to their instances as necessary.

The core objective, a common, easy-to-deploy, fully functional neuroimaging analysis operating system, is met by the NITRC-CE install script. The purpose of this script is to convert any Ubuntu 12.04 machine (real or virtual) into a NITRC-CE instance. Once the script has completed, the user can use a web browser to connect to the machine and set up a NITRC-CE user. Please see the NITRC-CE webpage18 for full usage instructions.

The NITRC-CE install script is used to generate an Amazon Machine Image (AMI): a fully specified computer (from operating system to installed software) that can be run on any of the available Amazon computational architectures (from single-core to multi-core systems). The NITRC-CE AMI can be accessed from one’s own Amazon Web Services (AWS) account or through the AWS Marketplace19 – a site where many different pre-configured EC2 solutions can be found. Extensions to other cloud-computing providers, such as Microsoft’s Windows Azure20, will also be provided in the future.

Using NITRC-CE on AWS EC2 requires an AWS account, as the EC2 costs are borne by the user. Amazon instance types21 from ‘micro’ (a very low-cost option providing a small amount of CPU resources) to ‘cc2’ (compute-optimized computational clusters with up to 32 cores) are supported. Example performance characteristics on real-world examples are being aggregated and updated at the NITRC-CE project at NITRC22. While many factors influence the time (and hence the price) needed for a specific task, a FreeSurfer run on example T1-weighted structural MRI data from the National Database for Autism Research (NDAR)23 took approximately 11 hours on an ‘m1.large’ instance. At $0.175 per hour for on-demand instances, this is approximately $2 per case. By utilizing the second core and spot pricing, users can reduce this cost further. Parallel processing via SGE24 is supported, and multiple instances can be harnessed for even more powerful processing through the use of StarCluster25.
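The per-case cost quoted above follows from simple arithmetic, sketched below. The function name and its parameters are hypothetical, and the sketch assumes that jobs running side by side on one instance do not slow each other down and that billing is proportional to run time. Real EC2 billing granularity, spot prices, and run times will differ.

```python
import math


def estimate_cost_usd(hours_per_case, hourly_rate_usd, n_cases=1, jobs_per_instance=1):
    """Estimate total compute cost for n_cases on a single instance.

    Assumes each batch of `jobs_per_instance` concurrent jobs takes
    `hours_per_case` hours (no slowdown from sharing the instance).
    """
    batches = math.ceil(n_cases / jobs_per_instance)
    instance_hours = batches * hours_per_case
    return instance_hours * hourly_rate_usd


# Figures from the text: about 11 hours at $0.175 per hour (on-demand).
single_case = estimate_cost_usd(hours_per_case=11, hourly_rate_usd=0.175)
print(f"one case, one core: ${single_case:.3f}")

# Using the second core: two cases share the same 11 instance-hours.
two_cases = estimate_cost_usd(11, 0.175, n_cases=2, jobs_per_instance=2)
print(f"two cases, two cores: ${two_cases / 2:.4f} per case")
```

Under these assumptions, running two cases concurrently halves the per-case cost, which is the saving the text attributes to utilizing the second core. A lower hourly rate, such as a spot price, scales the estimate down proportionally.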

Conclusion

The overall NITRC enterprise has been developed over the past six years to meet community needs for comprehensive support of software and computing for neuroscience. Each of these services (NITRC-R, NITRC-IR and NITRC-CE) is modular and interoperable, designed both to complement other solutions and to integrate with them (e.g. NIF – the Neuroscience Information Framework26 27). NITRC indexes a vast majority of the existing neuroimaging software, hosts a substantial fraction of the shared neuroimaging data and has facilitated thousands of hours of neuroimaging computation. As of December 2014, NITRC has 1870 literature citations that are indicative of this success28. Taken together, this triumvirate of coordinated resources enhances interoperability and accessibility, promoting better, more reproducible, and more cost-effective neuroscience research.

Figure 1.

Components of the NITRC system. The upper panel shows a project homepage from NITRC-R for the ‘1000 Functional Connectomes’ project. From a project homepage, the user has easy and standardized access to all materials that the developers want to make available, such as downloads, documentation, help, etc. The middle panel shows a view of the NITRC-IR displaying the results of a cross-project query for imaging data from 11-year-old male subjects with resting-state imaging acquired at 3 Tesla field strength with a repetition time (TR) of 3 seconds. This result pools data from multiple projects (ADHD-200 and ABIDE in this figure). The bottom panel shows the desktop view of NITRC-CE as accessed through a VNC connection. This example shows AFNI running on the desktop of this cloud computational system.
