Author manuscript; available in PMC: 2019 Jan 24.
Published in final edited form as: Cell Syst. 2017 Nov 29;6(1):13–24. doi: 10.1016/j.cels.2017.11.001

The Library of Integrated Network-based Cellular Signatures (LINCS) NIH Program: System-level Cataloging of Human Cells Response to Perturbations

Alexandra B Keenan 1, Sherry L Jenkins 1, Kathleen M Jagodnik 1, Simon Koplev 1, Edward He 1, Denis Torre 1, Zichen Wang 1, Anders B Dohlman 1, Moshe C Silverstein 1, Alexander Lachmann 1, Maxim V Kuleshov 1, Avi Ma’ayan 1,*, Vasileios Stathias 2, Raymond Terryn 2, Daniel Cooper 2, Michele Forlin 2, Amar Koleti 2, Dusica Vidovic 2, Caty Chung 2, Stephan C Schürer 2, Jouzas Vasiliauskas 3, Marcin Pilarczyk 3, Behrouz Shamsaei 3, Mehdi Fazel 3, Yan Ren 3, Wen Niu 3, Nicholas A Clark 3, Shana White 3, Naim Mahi 3, Lixia Zhang 3, Michal Kouril 3, John F Reichard 3, Siva Sivaganesan 3, Mario Medvedovic 3, Jaroslaw Meller 3, Rick J Koch 4, Marc R Birtwistle 4, Ravi Iyengar 4, Eric A Sobie 4, Evren Azeloglu 4, Julia Kaye 5, Jeannette Osterloh 5, Kelly Haston 5, Jaslin Kalra 5, Steve M Finkbiener 5, Jonathan Li 6, Pamela Milani 6, Miriam Adam 6, Renan Escalante 6, Karen Sachs 6, Alex Lenail 6, Divya Ramamoorthy 6, Ernest Fraenkel 6, Gavin Daigle 7, Uzma Hussain 7, Alyssa Coye 7, Jeffrey Rothstein 7, Dhruv Sareen 8, Loren Ornelas 8, Maria Banuelos 8, Berhan Mandefro 8, Ritchie Ho 8, Clive N Svendsen 8, Ryan Lim 9, Jennifer Stocksdale 9, Malcolm Casale 9, Terri Thompson 9, Jenny Wu 9, Leslie M Thompson 9, Victoria Dardov 8, Vidya Venkatraman 8, Andrea Matlock 8, Jenny Van Eyk 8, Jacob D Jaffe 10, Malvina Papanastasiou 10, Aravind Subramanian 11, Todd R Golub 11,12, Sean D Erickson 13, Mohammad Fallahi-Sichani 13, Marc Hafner 13, Nathanael S Gray 13, Jia-Ren Lin 13, Caitlin E Mills 13, Jeremy L Muhlich 13, Mario Niepel 13, Caroline E Shamu 13, Elizabeth H Williams 13, David Wrobel 13, Peter K Sorger 13, Laura M Heiser 14, Joe W Gray 14, James E Korkola 14, Gordon B Mills 15, Mark LaBarge 16,17, Heidi S Feiler 14, Mark A Dane 14, Elmar Bucher 14, Michel Nederlof 14,18, Damir Sudar 14,18, Sean Gross 14, David F Kilburn 14, Rebecca Smith 14, Kaylyn Devlin 14, Ron Margolis 19, Leslie Derr 19, Albert Lee 19, Ajay Pilai 19; The LINCS Consortium
PMCID: PMC5799026  NIHMSID: NIHMS918387  PMID: 29199020

Abstract

The Library of Integrated Network-based Cellular Signatures (LINCS) is an NIH Common Fund program that catalogs how human cells globally respond to chemical, genetic and disease perturbations. Resources generated by LINCS include experimental and computational methods, visualization tools, molecular and imaging data, and signatures. By assembling an integrated picture of the range of responses of human cells exposed to many perturbations, the LINCS program aims to better understand human disease and to advance the development of new therapies. Perturbations under study include drugs, genetic perturbations, tissue micro-environments, antibodies, and disease-causing mutations. Responses to perturbations are measured by transcript profiling, mass spectrometry, cell imaging, and biochemical methods, among other assays. The LINCS program focuses on cellular physiology shared among tissues and cell types relevant to an array of diseases, including cancer, heart disease, and neurodegenerative disorders. This Perspective describes LINCS technologies, datasets, tools, and approaches to data accessibility and reusability.

Overview of the NIH LINCS Common Fund Program

The LINCS Program aims to create a network-based understanding of human biology by cataloging changes in gene and protein expression, signaling processes, cell morphology, and epigenetic states that occur when cells are exposed to a variety of perturbing agents. By generating and providing publicly available data on how human cells respond to various genetic and environmental stressors, the LINCS Program is collecting the data required for detailed understanding of cell signaling and gene regulatory pathways involved in human disease. This will aid efforts to develop therapies that restore disease-perturbed pathways and networks to their normal physiological state. Several LINCS projects are based on the premise that disrupting any one of the many components of a biological process causes related changes to molecular characteristics and functions of the cell – the observable composite of which is the cellular phenotype. Observing how and when cellular phenotypes are altered by specific stressors can provide clues about underlying mechanisms of disease while facilitating the identification of new therapeutic targets.

LINCS data are made openly available as a community resource through a series of data releases in order to enable scientists to address a broad range of basic research questions. Results are obtained in cultured and primary cells whose state has been perturbed experimentally, with the term “perturbagen” used to refer to any condition that can alter the cellular state. LINCS datasets therefore consist of assay results from cells treated with bioactive small molecules, antibodies, ligands such as growth factors and cytokines, microenvironment proteins, and genetic perturbations, as well as comparisons of disease vs. normal primary cells from patients and healthy control subjects. Many different assays are used to measure cell responses, including measurements of mRNA and protein expression; epigenomic status; and cellular, molecular and morphological phenotypes captured by biochemical and imaging readouts (Figs. 1 and 2). Assays typically are carried out on multiple cell types at multiple time points, and perturbagens are applied at multiple doses. The LINCS Program has been implemented in two phases. The pilot phase of the program was completed in 2013 and was focused on initial production of perturbation-induced molecular and cellular signatures, assay development, development of data standards, as well as tools and databases for accessing the data. Phase 2, which began in 2014 and is the focus of this Perspective, supports six LINCS Data and Signature Generation Centers (DSGCs) and one Data Coordination and Integration Center (DCIC). The data coordination center synergizes LINCS-related activities with the NIH Big Data to Knowledge (BD2K) program (Margolis et al., 2014).
The focus in Phase 2 remains on the large-scale production of perturbation-induced molecular and cellular signatures, as well as computational tool development, integrative data analyses, integration of external public datasets with data generated by LINCS, metadata annotation that strictly follows the Findable, Accessible, Interoperable, and Reusable (FAIR) guidelines (Wilkinson et al., 2016a), and outreach activities to engage the next generation of biomedical data scientists and promote the LINCS consortium. A summary of the LINCS program by the numbers is provided in Figure 1. Box 1 contains web URLs to major LINCS program and project websites. To aid the reader, abbreviations used in this Perspective may be found in Box 2.

Fig. 1. An overview of the multi-institutional LINCS program.

Fig. 2. An overview of the LINCS centers, assays, tools and platforms (not all tools are listed).

Acronyms: MEMA, microenvironment microarrays; CyCIF, cyclic immunofluorescence; MS, mass spectrometry; RPPA, reverse phase protein array; GCP, global chromatin profiling; ATAC, assay for transposase accessible chromatin; OMERO, open microscopy environment; CLUE, CMap and LINCS unified environment; GR, growth response.

Box 1. URLs.

General program information sites

LINCS program informational site: http://www.lincsproject.org

LINCS program on the Common Fund site: https://commonfund.nih.gov/lincs

LINCS program on Twitter: https://twitter.com/lincsprogram

LINCS mobile app: http://lincsproject.org/LINCS/mobile

LINCS Data and Signature Centers

DToxS Center: http://dtoxs.org

HMS LINCS Center: http://lincs.hms.harvard.edu/

LINCS Center for Transcriptomics: https://clue.io/

LINCS PCCSE: https://panoramaweb.org/labkey/lincs.url

MEP LINCS Center: https://www.synapse.org/mep_lincs

NeuroLINCS Center: http://www.neurolincs.org/

BD2K-LINCS Data Coordination and Integration Center

http://lincs-dcic.org

Box 2. Abbreviations.

Abbreviations and Acronyms Glossary

ALS: Amyotrophic Lateral Sclerosis

API: Application Program Interface

BD2K: Big Data to Knowledge

CCA: Consortium Coordination and Administration

CEDAR: Center for Expanded Data Annotation and Retrieval

CLUE: CMAP and LINCS Unified Environment

CMAP: Connectivity Map

CTO: Community Training and Outreach

CyCIF: Cyclic Immunofluorescence

CyTOF: Cytometry by Time of Flight

DCIC: Data Coordination and Integration Center

DSGC: Data and Signature Generation Center

DSR: Data Science Research

DToxS: Drug Toxicity Signatures

DWG: Data Working Group

ECM: Extracellular Matrix

FAIR: Findable, Accessible, Interoperable, and Reusable

FFPE: Formalin-fixed, paraffin-embedded

FI: Fluorescence Imaging

GCP: Global Chromatin Profiling

GR: Growth rate inhibition

HMS: Harvard Medical School

ICV: Integrated Connectivity Viewer

IKE: Integrated Knowledge Environment

L1000CDS2: L1000 Characteristic Direction Search Engine

LDP: LINCS Data Portal

LINCS: Library of Integrated Network-based Cellular Signatures

LPDP: LINCS Proteomics Data Portal

ME: Microenvironment

MEMA: Microenvironment Microarray

MEP: Microenvironment Perturbagen

MIBBI: Minimum Information for Biological and Biomedical Investigations

MIC: Microscopy Imaging Commons

MOOC: Massive Open Online Course

MWA: Microwestern Array

OMERO: Open Microscopy Environment

PCCSE: Proteomic Characterization Center for Signaling and Epigenetics

QC: Quality Control

RPPA: Reverse Phase Protein Array

SMA: Spinal Muscular Atrophy

While there are many published works using LINCS data, particularly the L1000 small molecule perturbation dataset (Fallahi-Sichani et al., 2017; Iwata et al., 2017; Mirza et al., 2017; Wang et al., 2016b), as well as a review of the transcriptomic efforts of the LINCS program in the context of the Connectivity Map (Musa et al., 2017) and specifications for LINCS metadata (Vempati et al., 2014), there is no comprehensive description of the LINCS program in the literature to date. Given the pre-publication availability of many of the LINCS datasets, with associated tools designed to make the data accessible to both computational and experimental researchers, LINCS resources as a whole represent a tremendous community asset for scientists from a variety of disciplines. This Perspective is targeted toward computational systems biologists from academia, pharma and biotech who could use the widely applicable LINCS resources to enrich their own research programs. Readers of this article should come away with a better understanding of what the NIH LINCS program encompasses; the LINCS program vision; what datasets are available and where to access them; how these datasets were generated, including the novel technologies developed as part of the LINCS program; and the tools available to aid in the analysis of these datasets.

The LINCS Data and Signature Generation Centers

The LINCS Consortium supports six data and signature generation centers (DSGCs) and the BD2K-LINCS DCIC. Below we provide a summary of the DSGCs. Each center has distinct experimental approaches and strategies, which, taken together, provide multiple, complementary facets of molecular perturbations.

The Drug Toxicity Signature (DToxS) LINCS Center

The Drug Toxicity Signature (DToxS) Generation LINCS Center at the Icahn School of Medicine at Mount Sinai in New York aims to generate cellular signatures that predict adverse effects of FDA-approved drugs, particularly cardiotoxicity, hepatotoxicity, and peripheral neuropathy. These signatures may also be used to predict how co-administration of drug pairs may either exacerbate or mitigate adverse effects. Drugs and drug pairs are selected as perturbagens based on adverse events reporting of commonly-used FDA-approved drugs. The initial efforts of DToxS are focused on cardiotoxicity caused by cancer therapeutics, particularly heart failure and reduced ventricular function resulting from treatment with kinase inhibitors. The center uses genomic and high-throughput proteomic measurements (with proteomics experiments being conducted at the Center for Advanced Proteomics at Rutgers New Jersey Medical School) and medium-throughput experimental measurements of perturbagen-induced changes in protein state. Multiple levels of data analysis are performed, including: (1) correlating expression changes with clinical risk scores; (2) identifying common protein structural motifs; (3) predicting the biological processes involved in toxicity through network analysis; and (4) developing mechanistic understanding by integrating data with models that describe dynamics. The different levels of analysis can help to identify signatures for both toxicity and toxicity mitigation.

DToxS unique featured assay: Microwestern Array

The microwestern array (MWA) is a scalable miniaturization of a Western blot, performing either 24, 48 or 96 traditional western blots at a time. It is employed for measuring protein levels and post-translational modifications in a medium throughput format that allows for multiple conditions to be tested in parallel (Ciaccio and Jones, 2017). Similar to the reverse phase protein array (Tibes et al., 2006), in which whole cell lysates are applied to nitrocellulose slides, probed with antibodies, and measured via a chemiluminescent reporter, the microwestern has the additional step of electrophoretic protein separation by molecular weight. Briefly, cell lysates are spotted onto an acrylamide gel, subjected to electrophoresis, transferred to nitrocellulose, blotted with antibodies in a gasket apparatus to isolate samples, and then imaged and quantified via infrared fluorescent secondary antibodies.

The Harvard Medical School (HMS) LINCS Center

The HMS LINCS Center uses multiple measurement methods to collect data and compute signatures from cells exposed to small-molecule drugs and naturally occurring ligands. The Center emphasizes integration of imaging, mass spectrometry and transcript profiling approaches and collects both single cell and population-average data. The Center also develops data analysis pipelines and informatics systems, which are particularly important in the case of microscopy data for which few such systems exist. The research products of the HMS Center comprise new measurement technologies, multi-dimensional data sets and open source software.

The HMS Center focuses on kinase inhibitors and epigenome-modifying compounds as perturbagens because they are prototypical of drugs that aim for selective targeting of members of multi-gene families. The great majority of such drugs are active against several targets, and a key goal of the HMS Center is to characterize such poly-selectivity and understand the therapeutic implications. The majority of data in the HMS Center is collected from transformed cell lines, in part because they are simple to grow in large numbers; however, human primary cells (e.g. fibroblasts) and iPSC-derived, trans-differentiated cells (e.g. cardiomyocytes) are also studied. Signatures and tools developed by the HMS Center are designed to help others analyze drug and ligand dose-response relationships, biomarkers of drug resistance and the origins and consequences of cell-to-cell variability. Signatures and systematic perturbagen-response data are frequently combined with follow-up studies in Center publications as a means to illustrate at least one use of LINCS data.

HMS LINCS unique featured assay: Cyclic Immunofluorescence

Developed by the HMS LINCS Center, Cyclic Immunofluorescence (CyCIF) is an open source single-cell imaging method that enables 20 to 30-plex immunofluorescence imaging of cells grown in culture and 60-plex imaging of formalin-fixed, paraffin-embedded (FFPE) tissues. Multiplexing is achieved by successive cycles of four to six-color staining and imaging followed by chemical inactivation of fluorophores. CyCIF uses standard reagents and instrumentation and is simple to implement (Lin et al., 2015; Lin et al., 2016), making it well-suited to high-throughput characterization of perturbagen response. CyCIF data can be analyzed using computational algorithms developed for other multiplex single-cell methods, for example CyTOF mass spectrometry.
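The relationship between plex level and number of staining cycles can be illustrated with a small calculation. The sketch below is only an illustration and rests on a simplifying assumption that is not stated in the text: every color channel in a cycle is taken to carry one distinct antibody, whereas in practice a channel may be reserved for a nuclear stain. The function name cycles_needed is hypothetical.

```python
import math


def cycles_needed(n_markers: int, channels_per_cycle: int) -> int:
    """Return how many stain/image/inactivate rounds are needed.

    Assumes (for illustration) that each channel in a cycle images
    exactly one marker, so a cycle covers `channels_per_cycle` markers.
    """
    if n_markers < 0 or channels_per_cycle < 1:
        raise ValueError("need n_markers >= 0 and channels_per_cycle >= 1")
    return math.ceil(n_markers / channels_per_cycle)


# Plex levels and channel counts quoted in the text above.
for plex in (20, 30, 60):
    for channels in (4, 6):
        rounds = cycles_needed(plex, channels)
        print(f"{plex}-plex at {channels} colors per cycle: {rounds} cycles")
```

Under this assumption, a 60-plex tissue panel needs between 10 and 15 cycles depending on how many colors are imaged per cycle.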

The LINCS Center for Transcriptomics

The Broad Institute’s LINCS Center for Transcriptomics and the Connectivity Map (CMAP) team aim to develop a comprehensive resource of cellular state expression signatures spanning millions of genetic and small-molecule perturbations. Genetic perturbations include CRISPR knock-out, shRNA knock-down, and open reading frame (ORF) overexpression. Each genetic or small-molecule perturbation readout includes approximately 1,000 genes and uses the L1000 assay, a high-throughput, low-cost gene expression profiling platform. These data are publicly available via databases and applications with attention to features that enable meaningful user interaction with the data to foster biological discovery. The CMAP team catalogs the connections between these gene expression signatures to find drugs, diseases, pathways and targets (Lamb et al., 2006). Systematic determination of cellular effects of application of a small molecule and monitoring of the downstream consequences of perturbing a gene of interest on a massive scale allows the research community to elucidate protein function, determine small-molecule mechanism of action, and dissect biological pathways in physiological and disease states.

LINCS Center for Transcriptomics unique featured assay: L1000 Transcriptomics

Developed by the LINCS Center for Transcriptomics, the L1000 transcriptomics assay directly measures 978 landmark transcripts from crude cell lysates with a ligation-mediated amplification method coupled with Luminex-based detection on 384-well plates (Subramanian et al., 2017). Given the highly correlated structure of gene expression, the non-measured transcriptome can be computationally inferred from these landmark genes that were selected to enable faithful reconstruction using an algorithm trained on a large set of complete transcriptomes. Additionally, 80 invariant genes are measured to allow for scaling and normalization. The method has been shown to be comparable to RNA-sequencing at a fraction of the cost and has produced a compendium of over a million publicly available cellular perturbation profiles.
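The idea of inferring a non-measured transcript from measured landmark transcripts can be sketched with ordinary least squares. This is a toy illustration, not the L1000 inference algorithm itself: the data are synthetic, only three stand-in "landmark" genes are used instead of 978, and the function names (fit_gene_model, infer_gene) are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - rest) / aug[i][i]
    return x


def fit_gene_model(landmark_rows, target_values):
    """Least-squares weights for one target gene; last weight is the intercept."""
    design = [row + [1.0] for row in landmark_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target_values)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def infer_gene(weights, landmark_row):
    """Predict the target gene from one profile of landmark measurements."""
    return sum(w * x for w, x in zip(weights, landmark_row + [1.0]))


# Synthetic "complete transcriptomes": the target gene is constructed as an
# exact linear function of three landmark genes (an assumption of this toy).
random.seed(0)
TRUE_WEIGHTS = [0.8, -0.3, 1.5]
TRUE_INTERCEPT = 2.0
training_landmarks = [[random.gauss(0, 1) for _ in TRUE_WEIGHTS] for _ in range(50)]
training_target = [
    sum(w * x for w, x in zip(TRUE_WEIGHTS, row)) + TRUE_INTERCEPT
    for row in training_landmarks
]

weights = fit_gene_model(training_landmarks, training_target)
new_profile = [0.5, -1.0, 0.25]  # landmark values from a new, hypothetical sample
print(infer_gene(weights, new_profile))
```

Because the synthetic target is noise-free, the fitted weights recover the generating weights; with real expression data the inferred values would carry prediction error that depends on how well the landmarks capture each target gene.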

The LINCS Proteomic Characterization Center for Signaling and Epigenetics (PCCSE) at the Broad Institute

The Broad Institute’s LINCS Proteomic Characterization Center for Signaling and Epigenetics (PCCSE) seeks to understand how changes in cellular signaling, transcriptional, and epigenetic states influence one another via feed-forward and feed-back processes. Perturbations to these states are induced by drug treatments or genetic manipulations (via CRISPR) of cancer cell models, neuronal lineages, or primary vascular cells. The PCCSE uses mass spectrometry-based targeted proteomics to assay cellular phosphosignaling and histone modification responses elicited by these perturbations. Given the strong focus on signaling and epigenetics, the drug and genetic perturbations being selected include a number of kinase inhibitors, epigenetically active drugs, and knockdowns of core chromatin modifying genes. These experiments, coupled with matched L1000 data obtained via collaboration with the LINCS Center for Transcriptomics, are designed to test the hypothesis that early cell signaling responses to perturbation may establish new cellular states by altering epigenetic landscapes.

LINCS PCCSE unique featured assays: P100 and GCP Proteomics

The LINCS PCCSE employs two unique targeted proteomics assays: the P100 and global chromatin profiling (GCP) proteomics platforms. P100 measures the level of 96 phosphopeptides that are commonly observed and modulated in diverse cell types (Abelin et al., 2016) and serves as a “sentinel assay” for the phosphosignaling state of the cell (Soste et al., 2014). By comparing P100 profiles of uncharacterized perturbations with those induced by drugs known to inhibit distinct signaling pathways, this method can infer specific alterations in cellular signaling states. Global chromatin profiling (GCP) is a similar targeted mass spectrometry-based assay that profiles global histone post-translational modifications in bulk chromatin (Gopal et al., 2016). The probes that monitor combinations of post-translational histone modifications serve to generate epigenetic signatures of genetic or small-molecule disruption of the epigenetic machinery. Data from both assays are compatible with the Connectivity Map framework developed by the LINCS Center for Transcriptomics to interrogate and integrate large-scale data sets.
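Comparing the profile of an uncharacterized perturbation against reference profiles can be sketched as a similarity ranking. The example below uses Pearson correlation as a stand-in similarity measure; the text does not specify which measure is used, and the reference names and numeric values are invented purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_references(query, references):
    """Return (name, correlation) pairs, most similar reference first."""
    scores = {name: pearson(query, profile) for name, profile in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical phosphopeptide-level changes (five probes instead of 96).
REFERENCES = {
    "pathway_A_inhibitor": [-1.2, -0.8, 0.1, 0.0, 0.3],
    "pathway_B_inhibitor": [0.2, 0.1, -1.5, -1.1, 0.0],
}
query_profile = [-1.0, -0.9, 0.2, 0.1, 0.2]  # uncharacterized perturbation

ranking = rank_references(query_profile, REFERENCES)
for name, score in ranking:
    print(f"{name}: r = {score:.3f}")
```

In this toy case the query resembles the first reference and is anti-correlated with the second, which is the kind of evidence that would suggest a shared signaling effect.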

The Microenvironment Perturbagen (MEP) LINCS Center

The MEP-LINCS Center aims to generate datasets and computational strategies to illuminate how combinatorial signals from the microenvironment (ME) affect intracellular molecular networks and their resultant cellular phenotypes. This is accomplished by profiling cells after treatment with select soluble and insoluble proteins (extracellular matrix proteins, growth factors, and cytokines) identified to be variably enriched in primary tissues. The MEP-LINCS center studies both transformed cell lines and normal human primary cells in order to learn how cell phenotypes change in response to growth with >2500 different microenvironmental perturbagen combinations. Regulatory relationships are inferred by extracting quantitative ME response phenotype features through multi-color imaging of biomarkers associated with specific cellular phenotypes, measurement of protein expression using RPPA (Tibes et al., 2006), and measurement of gene transcription using the L1000 platform (Subramanian et al., 2017). The development of analytical pipelines and data management systems for large-scale, high-content imaging data is a key component of the MEP-LINCS center, and this effort is coordinated with the HMS-LINCS and NeuroLINCS centers, both of which have large imaging components. Use cases of MEP-LINCS data include identification of microenvironmental perturbagens that elicit specific phenotypes, understanding how microenvironmental signals modulate response to therapy, and identification of cellular networks associated with particular cellular phenotypes.

MEP LINCS unique featured assay: MEMA Platform

The MEP-LINCS Center has developed a Microenvironment Microarray (MEMA) platform to systematically interrogate ME effects on cellular phenotypes (Lin et al., 2012; Watson et al., 2014). MEMA consists of thousands of unique combinations of soluble and insoluble ME-associated proteins. Insoluble proteins are printed on a solid substrate to form pads upon which cells can be grown while soluble factors are added to the culture medium within each well of a multi-well plate, yielding thousands of unique combinations of microenvironment perturbagens (MEPs). MEMA includes factors that are secreted by multiple different cell types, including macrophages, infiltrating lymphocytes, stromal fibroblasts, and vascular endothelial cells, though the platform could be customized to assess nearly any soluble or insoluble protein. Following incubation, cells are fixed, stained for endpoints targeting morphology, metabolism, cell cycle, nuclear activity, and differentiation status, and subjected to high-content imaging.
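The combinatorial layout described above, with printed insoluble proteins crossed with one soluble factor per well, can be sketched as follows. The factor names and counts are placeholders, not the actual MEMA contents, and build_mema_layout is a hypothetical helper.

```python
# Placeholder factor names, for illustration only.
INSOLUBLE_PROTEINS = ["ecm_protein_1", "ecm_protein_2", "ecm_protein_3"]
SOLUBLE_FACTORS = ["ligand_1", "ligand_2"]


def build_mema_layout(insoluble, soluble):
    """Map each well (one soluble factor) to the MEPs formed with every printed pad.

    Each microenvironment perturbagen (MEP) is an (insoluble, soluble) pair.
    """
    layout = {}
    for well_index, ligand in enumerate(soluble):
        well_name = f"well_{well_index + 1}"
        layout[well_name] = [(pad, ligand) for pad in insoluble]
    return layout


layout = build_mema_layout(INSOLUBLE_PROTEINS, SOLUBLE_FACTORS)
all_meps = [mep for meps in layout.values() for mep in meps]
print(f"{len(all_meps)} unique MEPs across {len(layout)} wells")
```

The number of unique MEPs is the product of the two factor counts, which is how a modest number of printed proteins and ligands can yield thousands of combinations.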

The NeuroLINCS Center

The NeuroLINCS Center currently studies the motor neuron diseases amyotrophic lateral sclerosis (ALS) and spinal muscular atrophy (SMA) by characterizing the molecular networks within patient-derived induced pluripotent stem cells (iPSCs) and their differentiated motor neuron progeny. However, it is expanding to develop a global cortical neuron assay from iPSCs that will be of interest to groups working in other neurodegenerative diseases such as Alzheimer’s. Assays used include high-throughput imaging, RNA-seq, ATAC-seq, and SWATH mass spectrometry to inspect differences in the cellular physiological, transcriptomic, epigenetic, and proteomic landscapes, respectively, of ALS and SMA patient neurons compared to unaffected controls. Perturbations include ALS- and SMA-relevant mutations in patient-derived iPSCs as well as additional chemical perturbations. NeuroLINCS is a collaborative effort between multiple research groups at the University of California, Irvine; Cedars-Sinai Medical Center; the Gladstone Institute; MIT; and Johns Hopkins University. It combines expertise in iPSC technology, disease modeling, transcriptomics, epigenomics, metabolomics, proteomics, whole genome sequencing, cell-based assays, bioinformatics, statistics, and computational biology. A unique feature of NeuroLINCS is the bioinformatics integration of the diverse high-throughput ‘omic’ data sets to provide a network-based understanding of underlying pathways. In addition, the NeuroLINCS Center is collaborating with Google to integrate signatures across platforms into highly predictive models of responses to perturbagens using machine learning.

NeuroLINCS unique featured assay: Automated Imaging System to Track Live Neurons

The NeuroLINCS Center uses automated robotic microscopy to individually track live neurons to relate physiological changes in the cell over time to the fate of the cell (Arrasate and Finkbeiner, 2005; Finkbeiner et al., 2015). This high-throughput, high-content imaging technique allows for the generation of predictive dynamic models of cell fate. The models function as LINCS signatures used to understand disease pathways and to help understand the relationship between -omics signatures and cellular phenotypes.

The BD2K-LINCS Data Coordination and Integration Center

The Big Data to Knowledge (BD2K) Data Coordination and Integration Center (DCIC) for LINCS consists of four major components: Integrated Knowledge Environment (IKE), Data Science Research (DSR), Community Training and Outreach (CTO), and Consortium Coordination and Administration (CCA). The IKE is enabling federated access, intuitive querying, and integrative analysis and visualization across all LINCS resources and many additional external data types from other relevant resources. The IKE resources are built on the infrastructure, analysis tools, and data that were established in the LINCS pilot phase. For the DSR component, the DCIC is managing several internal research projects and supports several external data science research projects, addressing various data integration and intracellular molecular regulatory network challenges. The CTO efforts of the DCIC have established several educational resources including a LINCS massive open online course (MOOC) on Coursera (https://www.coursera.org/learn/bd2k-lincs), a new PhD track in Big Data Biostatistics at the University of Cincinnati, and an intensive 10-week summer research training program for graduate and undergraduate students. In addition, the DCIC is initiating and supporting diverse collaborative projects that leverage LINCS resources and disseminate LINCS data and tools. One central effort within LINCS is to standardize workflows and pipelines to ensure that they can be version controlled, shared, and evaluated. This is achieved by publishing pipelines as Notebooks (Shen, 2014; Wang and Ma’ayan, 2016) and containerizing pipelines using platforms such as Docker (Merkel, 2014).
The Center also aims to develop and deploy a next-generation computational infrastructure and novel analysis tools and methods that enable researchers to glean new insights from integrative models of biological systems while linking complex diseases/phenotypes with drugs and the pathways targeted by those drugs in different cells and tissues.

LINCS Program Websites, Portals and Databases

Sites that cover content from the entire LINCS Consortium

lincsproject.org: Informational site about the LINCS Program

The entry point to access data and information about the LINCS Program is http://www.lincsproject.org — a central hub for both the research community and general public. This website, along with the LINCS Data Portal, contains details about the assays, cell types, and perturbagens that are currently part of the library, as well as the LINCS DSGCs and DCIC, LINCS-related publications, news, events, video tutorials, workflows, and software that can be used for analyzing LINCS data. The LINCS Mobile App, available on iTunes and Google Play, is a streamlined version of the lincsproject.org website.

The LINCS Data Portal (LDP): Consolidated access to LINCS data and metadata

The LINCS Data Portal, available at http://lincsportal.ccs.miami.edu/, provides comprehensive access to LINCS data, including transcriptomics, binding, imaging, proteomics, and epigenomics datasets. Users can browse available datasets by experimental method, type of data collected, LINCS Center name, individual projects, and relevant biological processes. Users can also conduct searches, for example, by cell line, small molecule, gene, or protein of interest. All data is publicly available and no account is required for access. Datasets may be added to a checkout basket and later consolidated into a downloadable package, a compressed file containing both the data and the relevant metadata. Each LINCS dataset has its own landing page with metadata details, analysis methods, and a data level designation. Data level designations range from 1 to 4, with raw data assigned level 1 and fully processed signatures assigned level 4. The data level concept was adopted from The Cancer Genome Atlas (TCGA) project (Akbani et al., 2014), with data level definitions adjusted and customized for data originating from different assays. Table 1 summarizes LINCS data and signature generation center assays and datasets available on the LINCS Data Portal as of September 1, 2017.
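The data level designation lends itself to simple filtering when deciding which datasets to download. The sketch below encodes the data levels of a handful of assays as listed in Table 1 and selects those that offer a given level. It is a local illustration only and does not call the LINCS Data Portal; the dictionary and function names are hypothetical.

```python
# Data levels for a subset of assays, transcribed from Table 1.
# Level 1 is raw data and level 4 is fully processed signatures; levels 2
# and 3 are intermediate stages whose definitions vary by assay.
ASSAY_DATA_LEVELS = {
    ("HMS LINCS Center", "KINOMEscan"): {2, 3},
    ("HMS LINCS Center", "FI Apoptosis"): {1, 2, 3, 4},
    ("DToxS Center", "RNA-seq"): {1, 2, 3},
    ("LINCS Center for Transcriptomics", "L1000"): {1, 2, 3, 4},
    ("LINCS PCCSE", "P100"): {3, 4},
    ("LINCS PCCSE", "GCP"): {3},
    ("MEP LINCS", "MEMA"): {1, 2, 3, 4},
    ("NeuroLINCS", "SWATH-MS"): {1, 2, 3, 4},
}


def assays_with_level(level, catalog=ASSAY_DATA_LEVELS):
    """Return sorted (center, assay) pairs that provide the requested data level."""
    if level not in (1, 2, 3, 4):
        raise ValueError("data levels range from 1 to 4")
    return sorted(key for key, levels in catalog.items() if level in levels)


for center, assay in assays_with_level(4):
    print(f"{center}: {assay} has level 4 (signature) data")
```

A user interested in reprocessing from raw measurements would query level 1 instead, which in this subset excludes KINOMEscan, P100, and GCP.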

Table 1.

LINCS Data and Signature Generation Centers Assays and Datasets

| Data and Signature Generation Center | Assay | Cell Types | Perturbations | Signatures | Data Points | Data Level | Latest Release* |
|---|---|---|---|---|---|---|---|
| HMS LINCS Center | KINOMEscan | 0 | 147 small molecule | 67,000 | 82,000 | 2,3 | April 2015 |
| HMS LINCS Center | KiNativ | 8 | 29 small molecule | 50 | 8,000 | 2,3 | Jan 2016 |
| HMS LINCS Center | ELISA Protein State | 39 | 15 growth factor | 1,200 | 9,000 | 3 | Sep 2013 |
| HMS LINCS Center | Bead-based Immunoassay | 7 | 12 small molecule; 10 ligand | 230 | 66,000 | 3 | Aug 2015 |
| HMS LINCS Center | RPPA Protein State | 10 | 5 small molecule | 140 | 720,000 | 2,3,4 | April 2015 |
| HMS LINCS Center | FI Cell Count | 54 | 159 small molecule | 5,000 | 150,000 | 2,3,4 | Dec 2015 |
| HMS LINCS Center | FI Cell Viability | 7 | 17 small molecule | 700 | 3.4 × 10^7 | 3 | May 2017 |
| HMS LINCS Center | FI Apoptosis | 112 | 167 small molecule | 200 | 145,000 | 1,2,3,4 | July 2015 |
| HMS LINCS Center | FI Protein State | 42 | 8 small molecule; 5 ligand | 2,400 | 128,000 | 2,3 | Sep 2015 |
| HMS LINCS Center | FI Cell Cycle State | 18 | 9 small molecule | 1,200 | 152,000 | 1,2 | July 2011 |
| HMS LINCS Center | FI Morphology | 6 | 2 small molecule | 36 | 3,000 | 3 | Dec 2015 |
| DToxS Center | RNA-seq | 4 | 25 small molecule; 1 antibody | 120 | 9.2 × 10^6 | 1,2,3 | March 2016 |
| DToxS Center | Mass Spectrometry | 2 | 15 small molecule | 20 | 155,000 | 1,2,3 | March 2016 |
| LINCS Center for Transcriptomics | L1000 | 53 | 25,581 small molecule; 978 genetic | 476,000 | 1.3 × 10^9 | 1,2,3,4 | Dec 2015 |
| LINCS PCCSE | P100 | 8 | 105 small molecule | 700 | 200,000 | 3,4 | July 2016 |
| LINCS PCCSE | GCP | 7 | 90 small molecule | 475 | 84,000 | 3 | Aug 2016 |
| MEP LINCS | MEMA | 3 | 2945 microenvironment | 27,000 | 3.4 × 10^6 | 1,2,3,4 | March 2016 |
| NeuroLINCS | RNA-seq | 20 | disease background | 6 | 3.2 × 10^6 | 3,4 | May 2017 |
| NeuroLINCS | SWATH-MS | 19 | disease background | 6 | 197,000 | 1,2,3,4 | Sep 2016 |
| NeuroLINCS | ATAC-seq | 18 | disease background | 6 | 2.0 × 10^7 | 2,3,4 | May 2017 |

FI = Fluorescence Imaging

* As of September 1, 2017

DSGCs official center web sites

In addition to accessing LINCS data and tools through lincsproject.org and the LINCS Data Portal, each center has an official website providing additional center-specific information. Centers have made tools available on these sites to allow users to uniquely explore the data through web-based applications, command line interfaces and APIs.

Clue.io: Platform to interface with the L1000, P100, and GCP data

CMap and LINCS Unified Environment (CLUE), accessible at http://clue.io, is a cloud-based computational environment with pre-loaded gene expression profiles and perturbation datasets, analytical tools, and web applications. CLUE serves data generated and applications developed by the two LINCS Broad DSGCs. The aim of CLUE is to facilitate engagement with the high-dimensional L1000, P100, and GCP data by users who lack a substantial computational background. Access to this data is provided by user-friendly web-based applications, as well as command line interfaces and APIs. Free accounts are available to academic and not-for-profit users.

HMS LINCS Database: Access to data and tools produced by the HMS LINCS Center

The HMS LINCS Website and its associated database (http://lincs.hms.harvard.edu/) provide a public point of access to all data generated by the HMS LINCS Center as well as experimental protocols and computational tools developed by the Center. Users can search by cellular context, perturbagens, proteins, antibodies, and other reagents, as well as by PubMed identification number for datasets relevant to specific HMS LINCS publications. For selected datasets, specialized browsers and visualization tools are available. Programmatic access to all data is possible via an API.

MEP LINCS Synapse Site: Access to workflows, tools, and analysis from the MEP LINCS Center

The MEP LINCS Synapse site (https://www.synapse.org/mep_lincs) offers access to raw and processed imaging data, molecular profiling data, data descriptions, assay overviews, and protocols. Interactive dashboards are available for several profiled cancer cell lines, as well as interactive reports for each staining set and cell line combination. There is also an MEP data explorer that provides box and scatter plot visualizations for experiments involving each cell line. A free account is required to access data and tools.

The DToxS Web Portal: Access to data and tools produced by the DToxS LINCS Center

The DToxS web portal (http://dtoxs.org) provides access to experimental datasets by data type—including both transcriptomic and proteomic—and allows users to browse metadata by cell lines and drugs, which are classified as either offending drugs or toxicity-mitigating drugs. Users can also access the DToxS Center standard operating procedures for cell culture, assays, and computational analysis. Registration is required and is open to all users.

The NeuroLINCS Center site: People, Technologies, Publications, Data and Tools

The NeuroLINCS informational website (http://www.neurolincs.org/) provides an overview of the entire NeuroLINCS pipeline, which starts with the development of iPSC cell lines from patients and their propagation for analysis by the various NeuroLINCS Center laboratories. The site features the investigators, the unique technologies the center employs, publications, and links to datasets and tools. Instructions for principal investigators wishing to request access to NeuroLINCS data not publicly released through the LINCS Data Portal may be found through this site.

The LINCS Panorama Repository: Access to Targeted Proteomics Data from the PCCSE

The LINCS Panorama Repository (https://panoramaweb.org/labkey/lincs.url) allows users to access P100 and GCP profiles in matrix form and download custom subsets of profile data (e.g., by drug or cell type). Users may also access the primary mass spectrometry data in the form of Skyline documents, the standard in the field for analysis of targeted proteomics. Panorama serves as PCCSE’s cloud computing infrastructure, and the complete data QC and normalization pipeline for targeted proteomics data generated by PCCSE is executed programmatically in Panorama to ensure uniform data treatment.

LINCS Tools and Workflows

The LINCS consortium has developed software tools and analysis platforms to facilitate interaction with LINCS data, including visualization and analysis of LINCS data across LINCS data types, and in the context of other data (Table 2). Below we highlight some of the LINCS tools and platforms. Each of these tools and platforms may also be accessed through lincsproject.org.

Table 2.

Selected LINCS Software Tools

Tool | Summary | Website
--- | --- | ---
iLINCS | Transcriptomics and proteomics dataset analysis tools for a biologist audience | http://www.ilincs.org/
piLINCS | Facilitates proteomics data access, integrated with iLINCS | http://www.pilincs.org
GR Calculator, GR Metrics, GR Browser | Access and visualize LINCS dose-response data, upload user data for analysis | http://www.grcalculator.org/grbrowser/
LINCS Proteomics Data Portal | Proteomics data access and analysis, integrated with iLINCS | http://lincsproteomics.org
L1000CDS2 | LINCS perturbation search engine that returns signatures that either reverse or mimic a user input signature | http://amp.pharm.mssm.edu/L1000CDS2/
HMS Breast Cancer Browser | Access to LINCS data relevant to breast cancer | http://www.cancerbrowser.org/
CLUE Platform | A collection of apps that query, visualize, and analyze LINCS transcriptomics and proteomics data | https://clue.io/
OMICS Integrator | Identify candidate molecular pathways underlying integrated ‘omics data | http://fraenkel-nsf.csbi.mit.edu/omicsintegrator/
Enrichr | Gene set search engine | http://amp.pharm.mssm.edu/Enrichr
Harmonizome | Omics integration tool | http://amp.pharm.mssm.edu/Harmonizome/
Slicr | L1000 signature store | http://amp.pharm.mssm.edu/Slicr
LINCS Canvas Browser | L1000 visualization | http://www.maayanlab.net/LINCS/LCB/
SEP-L1000 | Side effect predictions for LINCS small molecules | http://maayanlab.net/SEP-L1000

iLINCS: An integrated system to analyze LINCS and other data

iLINCS (http://www.ilincs.org/) is a web portal aimed at providing LINCS transcriptomic and proteomic dataset analysis tools to a biologist audience. Users are guided through four pipelines: Genes, Datasets, Signatures, and Maps. The Genes pipeline allows users to input a gene set of interest, and then to select a LINCS or non-LINCS dataset and signatures to query, analyze, and visualize. The Datasets pipeline permits users to construct a signature from a selected dataset, perform functional enrichment and pathway analysis on that signature, and search for concordant (or discordant) LINCS and non-LINCS signatures. The Signatures pipeline allows users to upload or select a signature, identify similar signatures, and analyze their common features, including the biological pathways that may underlie groups of similar signatures. The Maps pipeline permits users to select a 2D map of two signature libraries, mine relationships among these signatures, and analyze the features of similar signatures.

piLINCS: An integrated system to analyze LINCS proteomics data

piLINCS (http://www.pilincs.org) provides both interactive and API access to the proteomics data generated by the LINCS program, including P100 and GCP profiles that are automatically imported from Panorama. Users can filter profiles based on assay, cell type, perturbation, and dose; merge multiple profiles; download datasets in multiple formats; and send selected profiles to iLINCS for further analysis. In addition to processed protein profiles, raw data is also available with links to the associated Panorama chromatograms.

GR Metrics, GR Calculator and GR Browser: visualize and calculate dose-dependent sensitivities of cancer cell lines

The GR Calculator (http://www.grcalculator.org/grbrowser/) allows users to browse, visualize, and download LINCS dose-response datasets and to upload their own data for analysis. The GR Browser supports analysis of dose-response data using growth rate inhibition (GR) metrics, which correct for the effects of variation in growth rate in the evaluation of drug response (Hafner et al., 2016a; Hafner et al., 2017a; Hafner et al., 2017b; Niepel et al., 2017). Growth rate-corrected response metrics such as GR50, GRmax, and GRAOC are more robust and biologically informative than traditional metrics such as IC50 and Emax and are used to evaluate drug response in the LINCS MCF10A Common Project, the MEP-HMS LINCS Joint Project, the HMS LINCS Seeding Density Project, and the Broad-HMS LINCS Joint Project datasets (Hafner et al., 2016b). GR and traditional response metrics can be visualized at the GR Calculator as scatterplots and dose-response grids sorted by drug or cell type. LINCS dose-response data can also be downloaded by users, and GR tools are available as stand-alone algorithms for integration into Bioconductor and other off-line pipeline applications.
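
As a rough illustration of the growth rate correction, the GR value defined by Hafner et al. (2016a) compares the growth rate of treated cells with that of untreated controls over the same assay period. The sketch below assumes three cell counts per condition: at the time of treatment, in the untreated control at the end of the assay, and in the treated well at the end of the assay. The function name and the example counts are illustrative assumptions and are not taken from the GR Calculator code.

```python
import math


def gr_value(x_treated: float, x_control: float, x_start: float) -> float:
    """Growth rate inhibition (GR) value for a single drug concentration.

    x_start   : cell count at the time of treatment
    x_control : cell count in the untreated control at the end of the assay
    x_treated : cell count in the treated well at the end of the assay
    """
    if min(x_treated, x_control, x_start) <= 0:
        raise ValueError("cell counts must be positive")
    if x_control == x_start:
        raise ValueError("control did not grow; the GR value is undefined")
    # Ratio of treated to control growth, both measured in population doublings.
    rate_ratio = math.log2(x_treated / x_start) / math.log2(x_control / x_start)
    return 2.0 ** rate_ratio - 1.0


# Hypothetical counts: 1,000 cells at treatment, 4,000 in the untreated control.
for treated in (4000, 2000, 1000, 500):
    print(treated, round(gr_value(treated, 4000, 1000), 3))
```

Under this definition a GR value of 1 indicates no effect on growth, 0 indicates complete cytostasis, and negative values indicate net cell loss. Summary metrics such as GR50 and GRmax are then derived from GR values measured across a range of concentrations.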

LINCS Proteomics Data Portal (LPDP)

LPDP (http://lincsproteomics.org) has been developed by DCIC in coordination with the LINCS proteomics community to facilitate exploration of LINCS proteomic data and related resources. The portal facilitates finding and downloading proteomic data, provides tailored “data-centric” and “assay (or annotation)-centric” views, and integrates tailored tools, e.g., piLINCS and piNET, in order to facilitate navigation, search, and interpretation of LINCS proteomics data.

L1000CDS2: LINCS L1000 Characteristic Direction Signatures Search Engine

L1000CDS2 (http://amp.pharm.mssm.edu/L1000CDS2/) is a search engine that returns LINCS L1000 drug perturbation signatures that either mimic or reverse a user-input signature of gene symbols (Duan et al., 2016). The search is completed against a subset of the L1000 data in which the differentially expressed genes were calculated using the characteristic direction method (Clark et al., 2014). The user either enters a list of gene symbols that are up- and down-regulated, or a list of gene symbols and associated expression values. The search engine then returns the top 50 perturbation conditions that either mimic or reverse the input.
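
The idea of retrieving signatures that mimic or reverse an input can be sketched with a toy ranking. The example below assumes that each signature is a mapping from gene symbol to a differential expression value and uses cosine similarity as a stand-in score. The gene names, perturbation names, and function names are hypothetical, and the code is not the L1000CDS2 implementation or its API.

```python
import math
from typing import Dict, List, Tuple

Signature = Dict[str, float]  # gene symbol -> differential expression value


def cosine_similarity(a: Signature, b: Signature) -> float:
    """Cosine similarity computed over the genes shared by two signatures."""
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    dot = sum(a[g] * b[g] for g in shared)
    norm_a = math.sqrt(sum(a[g] ** 2 for g in shared))
    norm_b = math.sqrt(sum(b[g] ** 2 for g in shared))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_signatures(
    query: Signature,
    library: Dict[str, Signature],
    mode: str = "reverse",
    top_n: int = 50,
) -> List[Tuple[str, float]]:
    """Return up to top_n library signatures that mimic or reverse the query."""
    if mode not in ("mimic", "reverse"):
        raise ValueError("mode must be 'mimic' or 'reverse'")
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    # Mimicking signatures score highest; reversing signatures score lowest.
    scored.sort(key=lambda item: item[1], reverse=(mode == "mimic"))
    return scored[:top_n]


# Hypothetical query and reference library.
query = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5}
library = {
    "perturbation_1": {"GENE_A": 1.8, "GENE_B": -0.9, "GENE_C": 0.4},
    "perturbation_2": {"GENE_A": -2.1, "GENE_B": 1.2, "GENE_C": -0.3},
    "perturbation_3": {"GENE_A": 0.1, "GENE_B": 0.2, "GENE_C": -0.1},
}

print(rank_signatures(query, library, mode="mimic", top_n=1))
print(rank_signatures(query, library, mode="reverse", top_n=1))
```

In this toy library, perturbation_1 ranks first as a mimic and perturbation_2 ranks first as a reverser. The default of 50 results mirrors the number of conditions that the search engine is described as returning.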

HMS Breast Cancer Browser

The HMS LINCS Breast Cancer Browser (http://www.cancerbrowser.org/) is a web interface that provides users access to LINCS data pertinent to breast cancer biology and breast cancer drug response. Users may easily filter datasets based on cell line, receptor status, molecular subtype, and mutation status. The user may also filter datasets based on perturbagen development phase, target gene, target gene class, target pathway, and target biological function. Available datasets include growth factor-induced pAKT/pERK response assays, drug dose-response profiles, and basal RTK phosphorylation profiles, as well as total protein mass spectrometry and phosphoprotein mass spectrometry datasets.

CLUE Platform: an integrated system to analyze LINCS data collected by the Broad Institute DSGCs

The Clue.io platform (https://clue.io/) includes the Query, Touchstone, Repurposing, Morpheus, and Integrated Connectivity Viewer (ICV) apps. The Touchstone app provides access to approximately 5,000 well-characterized genetic and small-molecule perturbagens. The Touchstone dataset can thus serve as a benchmark for exploring connectivities among perturbagens. The ICV app visualizes connectivity data as an interactive heatmap so that users can intuitively explore relationships within the data. The Morpheus app provides an additional level of interaction to ICV by allowing users to manipulate and annotate a user-provided or existing dataset. The Repurposing app is a drug repurposing tool that accesses the Broad Institute’s collection of over 5,000 compounds that have approved clinical indications and known safety profiles. Users can filter these compounds by mechanism of action, target, disease area, clinical phase, and vendor. The Query app enables users to find positive and negative connections between a user gene signature and all signatures in CMap.

Other tools

Other tools and interfaces that have been developed by the consortium or customized for LINCS data include OMICS Integrator (http://fraenkel-nsf.csbi.mit.edu/omicsintegrator/), which can be used to identify putative molecular pathways underlying integrated ‘omics data (Tuncbag et al., 2016); Enrichr (http://amp.pharm.mssm.edu/Enrichr), which is a search engine for gene sets (Chen et al., 2013; Kuleshov et al., 2016); and Harmonizome (http://amp.pharm.mssm.edu/Harmonizome/), a database that serves integrative ‘omics data from 66 resources, including LINCS (Rouillard et al., 2016). Slicr (http://amp.pharm.mssm.edu/Slicr) is a signature store for L1000 signatures, whereas LINCS Canvas Browser (http://www.maayanlab.net/LINCS/LCB/) visualizes L1000 signatures on canvases that cluster signatures based on their expression vector similarity (Duan et al., 2014). Additional LINCS-specific tools include SEP-L1000 (http://maayanlab.net/SEP-L1000), an interface for predictions of side effects for >20,000 LINCS-profiled small molecules (Wang et al., 2016a); Panorama, a web-based platform for storing, sharing, and analyzing proteomics data (Sharma et al., 2014) analyzed by Skyline (MacLean et al., 2010), a Windows-based client commonly used for processing data from proteomics experiments; and the Open Microscopy Environment (OMERO), an image data management platform that allows users to view, organize, analyze, and share imaging data securely with collaborators using a variety of different permission levels (Allan et al., 2012). The OMERO platform has been adopted by the LINCS consortium to manage the images that are collected at various DSGCs. Several members of the consortium are also working on developing a Microscopy Imaging Commons (MIC), an open environment to store and share microscopy images.

LINCS Data Findability, Accessibility, Interoperability, Reproducibility and Reusability

Metadata and Data Standards

The LINCS Data Working Group (DWG) has established metadata standards for LINCS reagents, assays, and experiments in order to ensure cross-consortium data compatibility. Annotations for perturbagens, such as small molecules, siRNAs, growth factors and other ligands, as well as for cells and some elements of experimental metadata, are standardized across all LINCS data and signature generation centers. Harmonization of these annotations facilitates the data analysis, data formatting, and data visualization strategies being used by the LINCS community. The standardized metadata is also important for the development of databases and data repositories that store and share LINCS data. Led by the DCIC and in collaboration with the Center for Expanded Data Annotation and Retrieval (CEDAR) (Musen et al., 2015), the consortium is developing automated mechanisms that use customized web forms to capture metadata at the time it is produced. Furthermore, the LINCS DWG developed standards to annotate all LINCS assay protocols, data analysis strategies, and datasets; these standards are deposited into BioSharing (McQuilton et al., 2016), a repository for metadata standards. The DWG efforts are aligned with standards developed by other groups, such as the Investigation/Study/Assay (ISA) infrastructure project (Rocca-Serra et al., 2010), Minimum Information for Biological and Biomedical Investigations (MIBBI) efforts (Taylor et al., 2008), and NCBI PubChem (Bolton et al., 2008).

Data Processing and Analysis Pipelines

In an effort to make the process of generating LINCS data completely transparent, the DCIC is leading an effort to validate, document, and release all computational processing pipelines used by the DSGCs to process and analyze LINCS data. The two main objectives of this effort are to 1) facilitate de novo reconstruction of computational pipelines by completely describing all processing steps; and 2) facilitate the re-use of computational pipelines developed and used by the DSGCs. To facilitate de novo reconstruction, the DCIC is providing assay-specific standards for documenting all details of data processing and analysis. To facilitate the re-use of existing pipelines, the DCIC works closely with the DSGCs to test and validate all computational pipelines. Whenever possible, the DCIC is also creating and releasing ready-to-use Docker containers implementing individual processing and analysis pipelines.

Data Availability

In general, all LINCS data is available to all, pre-publication, with no restrictions. The only exception is the raw RNA-seq data collected from patients, which will be made available for download from dbGaP to adhere to patient privacy protocols. Some websites developed by the data generation centers require login and an account; these include clue.io, which hosts the L1000 data and related tools, the DToxS data portal, and Synapse, which hosts the MEP LINCS data and tools. However, versions of the data are made openly available by the DCIC and through public repositories such as GEO. The LINCS consortium established a data release policy, which is available at: http://lincsproject.org/LINCS/data/release-policy.

The MCF10A Common Project

As the biomedical research community is greatly concerned with the reproducibility of molecular and cell biology studies (Begley, 2013), the LINCS investigators jointly work on cross-center projects. The consortium is currently working toward identifying the molecular networks that determine how MCF10A cells receive and integrate external signals in ways that influence MCF10A physiology such as proliferation, differentiation, and motility. This is a dynamic process that can best be approached by integrating information from the unique assays that are conducted by several LINCS Data and Signature Generation Centers. It is likely that responses involve transcriptional and proteomic network reconfigurations in the first few hours, followed by epigenomic changes that are induced in hours to days. The consortium has decided to test responses to eight diverse drugs that are known to target different canonical pathways, applying them at different concentrations and measuring cell viability, mRNA levels, protein levels, epigenetic markers, and cell morphology at different time points. Understanding the extent of agreement among these assays in perturbing canonical pathways will allow the consortium to evaluate the extent to which the assays and data across centers allow for reproducible and cohesive findings.

Conclusions

The six LINCS DSGCs are producing unique datasets and resources using state-of-the-art high-throughput transcriptomics, proteomics, epigenomics, biochemical, and imaging assays. These data are collected using assays such as L1000 and RNA-seq for transcriptomics; P100, GCP, RPPA, microwestern, and other techniques for expression proteomics; and MEP, CycIF, and other methods for cell phenotypic and morphological imaging. These data are organized, analyzed, visualized, and integrated jointly by the DSGCs and the DCIC, who actively develop metadata standards, tools, and portals (Fig. 2). Along these lines, the LINCS consortium is centrally involved in making LINCS datasets and tools adhere to the findable, accessible, interoperable and reusable (FAIR) principles (Wilkinson et al., 2016b). The consortium is working on developing specific evaluations for LINCS datasets and tools so that they are maximally FAIR. The overall aim of LINCS is to produce a long-term resource that will assist individual investigators to form novel hypotheses about the inner workings of human cells and tissues in normal physiology and in disease.

Acknowledgments

The authors of this article are partially supported by the NIH grants U54HL127624 (BD2K-LINCS, DCIC), U54HG008098 (DToxS), U54HL127366 (LINCS Center for Transcriptomics), U54HG008097 (LINCS PCCSE), U54NS091046 (NeuroLINCS), U54HL127365 (HMS LINCS) and U54HG008100 (MEP-LINCS).

References

  1. Abelin JG, Patel J, Lu X, Feeney CM, Fagbami L, Creech AL, Hu R, Lam D, Davison D, Pino L. Reduced-representation phosphosignatures measured by quantitative targeted MS capture cellular states and enable large-scale comparison of drug-induced phenotypes. Molecular & Cellular Proteomics. 2016;15:1622–1641. doi: 10.1074/mcp.M116.058354. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Akbani R, Ng PKS, Werner HM, Shahmoradgoli M, Zhang F, Ju Z, Liu W, Yang J-Y, Yoshihara K, Li J. A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nature communications. 2014;5 doi: 10.1038/ncomms4887. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Allan C, Burel JM, Moore J, Blackburn C, Linkert M, Loynton S, MacDonald D, Moore WJ, Neves C, Patterson A. OMERO: flexible, model-driven data management for experimental biology. Nature methods. 2012;9:245–253. doi: 10.1038/nmeth.1896. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Arrasate M, Finkbeiner S. Automated microscope system for determining factors that predict neuronal fate. Proceedings of the National Academy of Sciences of the United States of America. 2005;102:3840–3845. doi: 10.1073/pnas.0409777102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Begley CG. Reproducibility: six red flags for suspect work. Nature. 2013;497:433–434. doi: 10.1038/497433a. [DOI] [PubMed] [Google Scholar]
  6. Bolton EE, Wang Y, Thiessen PA, Bryant SH. PubChem: integrated platform of small molecules and biological activities. Annual reports in computational chemistry. 2008;4:217–241. [Google Scholar]
  7. Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, Clark NR, Ma’ayan A. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC bioinformatics. 2013;14:128. doi: 10.1186/1471-2105-14-128. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Ciaccio MF, Jones RB. Microwestern Arrays for Systems-Level Analysis of SH2 Domain-Containing Proteins. SH2 Domains: Methods and Protocols. 2017:453–473. doi: 10.1007/978-1-4939-6762-9_27. [DOI] [PubMed] [Google Scholar]
  9. Clark NR, Hu KS, Feldmann AS, Kou Y, Chen EY, Duan Q, Ma’ayan A. The characteristic direction: a geometrical approach to identify differentially expressed genes. BMC bioinformatics. 2014;15:79. doi: 10.1186/1471-2105-15-79. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Duan Q, Flynn C, Niepel M, Hafner M, Muhlich JL, Fernandez NF, Rouillard AD, Tan CM, Chen EY, Golub TR. LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures. Nucleic acids research. 2014:gku476. doi: 10.1093/nar/gku476. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Duan Q, Reid SP, Clark NR, Wang Z, Fernandez NF, Rouillard AD, Readhead B, Hodos R, Tritsch S, Hafner M, et al. L1000CDS2: LINCS L1000 Characteristic Direction Signatures Search Engine. npj Systems Biology and Applications. 2016;2:16015. doi: 10.1038/npjsba.2016.15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Fallahi-Sichani M, Becker V, Izar B, Baker GJ, Lin JR, Boswell SA, Shah P, Rotem A, Garraway LA, Sorger PK. Adaptive resistance of melanoma cells to RAF inhibition via reversible induction of a slowly dividing de-differentiated state. Molecular systems biology. 2017;13:905. doi: 10.15252/msb.20166796. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Finkbeiner SM, Ando DM, Daub AC. Automated robotic microscopy systems. Google Patents 2015 [Google Scholar]
  14. Gopal S, Creech A, Officer A, Egri S, Davison D, Jaffe JD, Jaffe IZ. The Impact of Chemotherapeutic Agents on Signaling and Epigenetics in Vascular Endothelial Cells. Am Soc Hematology 2016 [Google Scholar]
  15. Hafner M, Niepel M, Chung M, Sorger PK. Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs. Nature methods. 2016a;13:521–527. doi: 10.1038/nmeth.3853. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Hafner M, Niepel M, Chung M, Sorger PK. Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs. Nature methods. 2016b;13:521–527. doi: 10.1038/nmeth.3853. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Hafner M, Niepel M, Sorger PK. Alternative drug sensitivity metrics improve preclinical cancer pharmacogenomics. Nature biotechnology. 2017a;35:500–502. doi: 10.1038/nbt.3882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Hafner M, Niepel M, Subramanian K, Sorger PK. Designing Drug-Response Experiments and Quantifying their Results. Current protocols in chemical biology. 2017b;9:96–116. doi: 10.1002/cpch.19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Iwata M, Sawada R, Iwata H, Kotera M, Yamanishi Y. Elucidating the modes of action for bioactive compounds in a cell-specific manner by large-scale chemically-induced transcriptomics. Scientific reports. 2017;7:40164. doi: 10.1038/srep40164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic acids research. 2016:gkw377. doi: 10.1093/nar/gkw377. [DOI] [PMC free article] [PubMed]
  21. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, et al. The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease. Science. 2006;313:1929–1935. doi: 10.1126/science.1132939. [DOI] [PubMed] [Google Scholar]
  22. Lin C-H, Lee JK, LaBarge MA. Fabrication and use of microenvironment microarrays (MEArrays) JoVE (Journal of Visualized Experiments) 2012:e4152–e4152. doi: 10.3791/4152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Lin J-R, Fallahi-Sichani M, Sorger PK. Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method. Nature communications. 2015;6 doi: 10.1038/ncomms9390. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Lin JR, Fallahi-Sichani M, Chen JY, Sorger PK. Cyclic Immunofluorescence (CycIF), A Highly Multiplexed Method for Single-cell Imaging. Current Protocols in Chemical Biology. 2016:251–264. doi: 10.1002/cpch.14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler DC, MacCoss MJ. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26:966–968. doi: 10.1093/bioinformatics/btq054. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Margolis R, Derr L, Dunn M, Huerta M, Larkin J, Sheehan J, Guyer M, Green ED. The National Institutes of Health’s Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data. Journal of the American Medical Informatics Association. 2014;21:957–958. doi: 10.1136/amiajnl-2014-002974. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. McQuilton P, Gonzalez-Beltran A, Rocca-Serra P, Thurston M, Lister A, Maguire E, Sansone S-A. BioSharing: curated and crowd-sourced metadata standards, databases and data policies in the life sciences. Database: the journal of biological databases and curation. 2016;2016 doi: 10.1093/database/baw075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Merkel D. Docker: lightweight linux containers for consistent development and deployment. Linux Journal. 2014;2014:2. [Google Scholar]
  29. Mirza N, Sills GJ, Pirmohamed M, Marson AG. Identifying new antiepileptic drugs through genomics-based drug repurposing. Human molecular genetics. 2017;26:527–537. doi: 10.1093/hmg/ddw410. [DOI] [PubMed] [Google Scholar]
  30. Musa A, Ghoraie LS, Zhang SD, Glazko G, Yli-Harja O, Dehmer M, Haibe-Kains B, Emmert-Streib F. A review of connectivity map and computational approaches in pharmacogenomics. Brief Bioinform. 2017 doi: 10.1093/bib/bbx023. [DOI] [PMC free article] [PubMed]
  31. Musen MA, Bean CA, Cheung KH, Dumontier M, Durante KA, Gevaert O, Gonzalez-Beltran A, Khatri P, Kleinstein SH, O’Connor MJ. The center for expanded data annotation and retrieval. Journal of the American Medical Informatics Association. 2015;22:1148–1152. doi: 10.1093/jamia/ocv048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Niepel M, Hafner M, Chung M, Sorger PK. Measuring Cancer Drug Sensitivity and Resistance in Cultured Cells. Current protocols in chemical biology. 2017;9:55–74. doi: 10.1002/cpch.21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Rocca-Serra P, Brandizi M, Maguire E, Sklyar N, Taylor C, Begley K, Field D, Harris S, Hide W, Hofmann O. ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level. Bioinformatics. 2010;26:2354–2356. doi: 10.1093/bioinformatics/btq415. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, Ma’ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database. 2016;2016:baw100. doi: 10.1093/database/baw100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Sharma V, Eckels J, Taylor GK, Shulman NJ, Stergachis AB, Joyner SA, Yan P, Whiteaker JR, Halusa GN, Schilling B. Panorama: a targeted proteomics knowledge base. Journal of proteome research. 2014;13:4205–4210. doi: 10.1021/pr5006636. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Shen H. Interactive notebooks: Sharing the code. Nature. 2014;515:151. doi: 10.1038/515151a. [DOI] [PubMed] [Google Scholar]
  37. Soste M, Hrabakova R, Wanka S, Melnik A, Boersema P, Maiolica A, Wernas T, Tognetti M, von Mering C, Picotti P. A sentinel protein assay for simultaneously quantifying cellular processes. Nat Meth. 2014;11:1045–1048. doi: 10.1038/nmeth.3101. [DOI] [PubMed] [Google Scholar]
  38. Subramanian A, Narayan R, Corsello SM, Peck DD, Natoli TE, Lu X, Gould J, Davis JF, Tubelli AA, Asiedu JK. A Next Generation Connectivity Map: L1000 Platform And The First 1,000,000 Profiles. bioRxiv. 2017:136168. doi: 10.1016/j.cell.2017.10.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Taylor CF, Field D, Sansone SA, Aerts J, Apweiler R, Ashburner M, Ball CA, Binz PA, Bogue M, Booth T. Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project. Nature biotechnology. 2008;26:889–896. doi: 10.1038/nbt.1411. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Tibes R, Qiu Y, Lu Y, Hennessy B, Andreeff M, Mills GB, Kornblau SM. Reverse phase protein array: validation of a novel proteomic technology and utility for analysis of primary leukemia specimens and hematopoietic stem cells. Molecular cancer therapeutics. 2006;5:2512–2521. doi: 10.1158/1535-7163.MCT-06-0334. [DOI] [PubMed] [Google Scholar]
  41. Tuncbag N, Gosline SJ, Kedaigle A, Soltis AR, Gitter A, Fraenkel E. Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package. PLoS Comput Biol. 2016;12:e1004879. doi: 10.1371/journal.pcbi.1004879. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Vempati UD, Chung C, Mader C, Koleti A, Datar N, Vidovic D, Wrobel D, Erickson S, Muhlich JL, Berriz G, et al. Metadata Standard and Data Exchange Specifications to Describe, Model, and Integrate Complex and Diverse High-Throughput Screening Data from the Library of Integrated Network-based Cellular Signatures (LINCS) Journal of biomolecular screening. 2014;19:803–816. doi: 10.1177/1087057114522514. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Wang Z, Clark NR, Ma’ayan A. Drug-induced adverse events prediction with the LINCS L1000 data. Bioinformatics. 2016a;32:2338–2345. doi: 10.1093/bioinformatics/btw168.
  44. Wang Z, Ma’ayan A. An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study. F1000Research. 2016;5. doi: 10.12688/f1000research.9110.1.
  45. Wang Z, Monteiro CD, Jagodnik KM, Fernandez NF, Gundersen GW, Rouillard AD, Jenkins SL, Feldmann AS, Hu KS, McDermott MG, et al. Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd. Nature communications. 2016b;7:12846. doi: 10.1038/ncomms12846.
  46. Watson S, Korkola J, Rantala J, Gray J. Interrogating HER2+ plasticity and lapatinib resistance with MicroEnvironment MicroArrays (AACR) 2014
  47. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten J-W, da Silva Santos LB, Bourne PE. The FAIR Guiding Principles for scientific data management and stewardship. Scientific data. 2016a;3. doi: 10.1038/sdata.2016.18.
  48. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten JW, da Silva Santos LB, Bourne PE. The FAIR Guiding Principles for scientific data management and stewardship. Scientific data. 2016b;3:160018. doi: 10.1038/sdata.2016.18.

Data Availability Statement

In general, all LINCS data are available to everyone, pre-publication, with no restrictions. The only exception is the raw RNA-seq data collected from patients, which will be made available for download from dbGaP to adhere to patient privacy protocols. Some websites developed by the data generation centers require a login and an account; these include clue.io, which hosts the L1000 data and related tools, the DToxS data portal, and Synapse, which hosts the MEP LINCS data and tools. However, versions of the data are made openly available by the DCIC and through public repositories such as GEO. The LINCS consortium established a data release policy, which is available at: http://lincsproject.org/LINCS/data/release-policy.
