Using the Reactome Database

Karen Rothfels; Marija Milacic; Lisa Matthews; Robin Haw; Cristoffer Sevilla; Marc Gillespie; Ralf Stephan; Chuqiao Gong; Eliot Ragueneau; Bruce May; Veronica Shamovsky; Adam Wright; Joel Weiser; Deidre Beavers; Patrick Conley; Krishna Tiwari; Bijay Jassal; Johannes Griss; Andrea Senff-Ribeiro; Timothy Brunson; Robert Petryszak; Henning Hermjakob; Peter D’Eustachio; Guanming Wu; Lincoln Stein

doi:10.1002/cpz1.722

Author manuscript; available in PMC: 2024 Jun 18.

Published in final edited form as: Curr Protoc. 2023 Apr;3(4):e722. doi: 10.1002/cpz1.722

PMCID: PMC11184634 NIHMSID: NIHMS1882651 PMID: 37053306

Abstract

Pathway databases provide descriptions of the roles of proteins, nucleic acids, lipids, carbohydrates and other molecular entities within their biological cellular contexts. Pathway-centric views of these roles may allow for the discovery of unexpected functional relationships in data such as gene expression profiles and somatic mutation catalogues from tumor cells. For this reason, there is a high demand for high-quality pathway databases and their associated tools. The Reactome project (a collaboration between the Ontario Institute for Cancer Research, New York University Langone Health, the European Bioinformatics Institute, and Oregon Health & Science University) is one such pathway database. Reactome collects detailed information on biological pathways and processes in humans from the primary literature. Reactome content is manually curated, expert-authored and peer-reviewed and spans the gamut from simple intermediate metabolism to signaling pathways and complex cellular events. This information is supplemented with likely orthologous molecular reactions in mouse, rat, zebrafish, worm, and other model organisms.

Basic Protocol 1: Browsing a Reactome pathway

Basic Protocol 2: Exploring Reactome annotations of disease and drugs

Basic Protocol 3: Finding a pathway involving a gene or protein

Alternate Protocol 1: Finding the pathways involving a gene or protein using UniProtKB (SwissProt), Ensembl, or Entrez gene identifier

Alternate Protocol 2: Using the Advanced search

Basic Protocol 4: Using the Reactome pathway analysis tool to identify statistically overrepresented pathways

Basic Protocol 5: Using the Reactome pathway analysis tool to overlay expression data onto Reactome pathway diagrams

Basic Protocol 6: Comparing inferred model organism and human pathways using the Species Comparison tool

Basic Protocol 7: Comparing tissue-specific expression using the Tissue Distribution tool

Keywords: Reactome database, biological pathway, pathway analysis, pathway visualization, interaction network

INTRODUCTION:

The availability of whole genome sequences from numerous species coupled with an explosion of techniques for querying and analyzing these reference genomes, including at a single cell level, has led to a high demand for sophisticated tools to facilitate visualization and interpretation of the resulting large data sets. Biological pathway databases are uniquely positioned to play a key role in the interpretation of such data sets. Pathway databases capture what is already known about the interplay of genes, proteins, and small molecules using a data model that is accessible to computation, and position experimental outcomes on proteins or other biological molecules in their relevant cellular context. For example, a perturbation experiment that changes the expression pattern of thousands of genes may only affect the expression patterns of a small handful of biochemical pathways. Pathway analysis has the potential to reveal unexpected connections between disparate areas of biology that are not readily apparent by simple inspection. Hence, there is a high degree of interest in the bioinformatics community in creating pathway databases. The Reactome project, covered in this unit, is one such database. It is a curated collection of well-documented human molecular reactions grouped into pathways that span the gamut from simple intermediary metabolism (e.g., sugar catabolism) to complex cellular events such as the mitotic cell cycle. Reactome annotations also document how normal biological pathways are affected during disease, and the effects of drugs on pathway activities. Reactome annotations are manually curated by PhD level scientists and peer-reviewed by experts in the field prior to being published in the database. A semi-automated procedure supplements this manually curated information by identifying likely orthologous molecular reactions in mouse, rat, zebrafish, worm, and other model organisms, extending the use of the database to support research in other species.

The protocols in this unit illustrate how to use Reactome to learn the steps of a biological pathway and how a suite of data analysis tools can assist with the interpretation of user-supplied experimental data sets. Basic Protocol 1 describes how to navigate and browse through the Reactome database. Basic Protocol 2 describes how to navigate and browse through the drug and disease annotations of Reactome. Basic Protocol 3 and Alternate Protocol 1 explain how to identify the pathways in which a molecule of interest is involved using either the common name or accession number, respectively. Alternate Protocol 2 describes how to use the Advanced Search Feature. Basic Protocol 4 details how to use Pathway Analysis to perform identifier mapping and overrepresentation analysis. Basic Protocol 5 explains how to overlay pathway diagrams with expression data. Basic Protocol 6 describes the use of the Species Comparison tool to compare model organisms and human pathways. Basic Protocol 7 describes how to compare expression in different tissues using the Tissue Distribution tool.

NOTE: This information is based on Reactome in December 2022. Some of the web pages may have changed somewhat since the unit was written.

BASIC PROTOCOL 1

Browsing a Reactome pathway

This protocol will introduce the basic navigational techniques needed to browse the Reactome website.

Materials:

Hardware

Computer capable of supporting a Web browser and an Internet connection

Software

Any modern Web browser such as Firefox, Safari and Chrome will work to display Reactome Web pages

Point the browser to the Reactome home page at https://reactome.org.

The home page (Figure 1) has several elements.

At the top left of the home page is the Reactome logo. Clicking on this from any page on the website will return the user to the home page.

The navigation bar, at the very top of the page, provides access to the top-level sections, tools, and resources of the Reactome site. “About” is a description of the project as a whole, including Reactome team and Scientific Advisory Board members, details of our open source licenses, the upcoming editorial calendar and current statistics; “Content” provides links to resources within the database, including the table of contents, database object identifiers, a detailed description of the Reactome data schema as well as information on specific features such as the Reactome Research Spotlight, the COVID-19 Disease project and the ORCID integration project; “Docs” provides access to user guides and information about the Reactome data model, icon library and computationally inferred events (described more fully in step 3, below), as well as instructions on how to link to or cite Reactome material; “Tools” provides links to key Reactome functions, including the pathway browser and tools for analyzing gene lists and gene expression data, for species comparisons, for tissue distribution and for disease overlay. This tab also has links to the Reactome analysis and content services and Reactome FIViz, the Reactome functional interaction network app (Wu et al, 2014). “Community” has information on outreach and events, Reactome publications, partnerships and collaborations as well as access to training guides and tutorials; “Download” provides access to the whole database as a single bulk or individual data set download, pathway downloads in a variety of formats including BioPAX, SBML and PDF, as well as physical entity and event mapping files.

Below the header is a simple search bar that permits flexible keyword, accession number, and database identifier queries on the Reactome database. Below the search bar are four large buttons linking to key features of the Reactome website: “Pathway Browser”, which takes the user into the curated pathway hierarchy of human biological pathways (described in step 2 below); “Analysis Tools”, which opens the analysis window allowing users to analyze gene lists and expression patterns, conduct species comparisons and examine tissue distribution; “ReactomeFIViz”, which takes users to the documentation page for the Reactome functional interaction network app (Wu et al, 2014); and “Documentation”, which links to useful information about the website for users and developers.

Below these buttons is a section of the home page (Figure 2) that contains the “Reactome Research Spotlight,” (a feature that highlights recent publications that have incorporated Reactome data or tools into their research), as well as news items, the Twitter feed, curation statistics from the most recent release, and information about the project.

Scrolling farther down on the home page (Figure 3) reveals a “help” panel, with buttons linking to guides for users and developers, a button linking to information on citing Reactome and a “Contact Us” button, for users requiring help. Below the help panel is a panel for API and data access, including the Content and Analysis Services, the icon library and the graph database.
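
For readers who prefer scripted access, the Content Service listed in this panel can be reached over HTTP. The following is a minimal, unofficial sketch rather than a description of the Reactome client libraries: it assumes the `/ContentService/data/query/{id}` endpoint described in the Reactome developer documentation, the helper names `query_url` and `fetch_entry` are hypothetical, and the identifier is supplied by the caller (for example, the stable identifier shown in the “Description” tab for an event of interest).

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL of the Content Service (assumption: as advertised on reactome.org).
CONTENT_SERVICE = "https://reactome.org/ContentService"


def query_url(identifier: str) -> str:
    """Build the URL that returns the database entry for one identifier."""
    return f"{CONTENT_SERVICE}/data/query/{quote(identifier, safe='')}"


def fetch_entry(identifier: str, timeout: float = 30.0) -> dict:
    """Download and decode the JSON entry (requires network access)."""
    with urlopen(query_url(identifier), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_entry` with a stable identifier should return a dictionary describing that event or entity. Because field names can change between releases, it is safer to inspect the returned keys than to rely on specific ones.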
To begin exploring the curated Reactome pathways, click on the “Pathway Browser” button on the home page. This will load the page shown in Figure 4.
The Reactome Pathway Browser consists of four key elements:
1. The header bar, at the top of the page. This has the Reactome logo, which returns users to the home page when clicked. Next to this is a species selector, with a drop-down list of species. Selecting an organism from the species selector will refresh the pathway browser with the inferred pathway diagram from the selected model organism if it is conserved. Reactome data is human-centric. Data for other species is inferred from human pathways and pathway steps may be missing for other organisms if they are not identified by the inference process. The “Analysis” button provides access to the interactive tools associated with the pathway diagrams, described below in Protocols 4, 5, 6 and 7. Clicking the “Tour” button in the header opens a brief video tutorial on the key Reactome website functions, while selecting one of the layout buttons in the top right of the header bar allows users to personalize the website panels that are displayed to optimize viewing.
2. The pathway hierarchy panel, occupying the vertical rectangle on the far left of the screen, provides a scrolling display of all Reactome pathways in a hierarchy. The plus (“+”) symbol indicates that there are subheadings underneath the pathway headings. Clicking on a plus (“+”) symbol will expand the topic to show its subsections. The subpathways and reactions within each pathway can be hidden by clicking on the minus (“–”) symbol to the left of the pathway name. Next to the plus/minus signs is a small pathway icon in blue or black, indicating the presence or absence of an “Enhanced High Level Diagram” (EHLD, see below) associated with that pathway. A red “N” or “U” next to a pathway name indicates that the pathway is new or has been updated since the last release, respectively. A red cross next to a pathway name indicates that the pathway contains disease annotations.
3. The visualization panel, to the right of the hierarchy panel, displays an interactive pathway diagram that can be panned and zoomed in Google Map style. The visualization panel is synced with the pathway hierarchy on the left, such that selecting pathways or subpathways in the hierarchy will change what is displayed in the visualization panel. There are three primary views that can be displayed in this panel. The first view, “Pathway Overview”, displays the entire pathway hierarchy as interconnected nodes, with nodes representing pathways and edges representing relationships. If a user selects a pathway in the hierarchy or in the graphical display, the corresponding node is outlined in orange. The second view, “Enhanced High Level Diagram (EHLD)”, where available, displays a textbook style interactive illustration of a user-selected pathway (Sidiropoulos et al, 2017). The third view, “Entity Level View” (ELV), displays the reaction-level molecular details of the user-selected pathway. The ELV pathway diagrams apply the conventions of the Systems Biology Graphical Notation (SBGN) format (Le Novère et al, 2009) to distinguish the molecules and reactions by shape and cellular location, providing a dynamic framework for pathway visualization and data analysis.
  
  Users can toggle between the Pathway Overview and the ELV views by clicking the second of three blue icons beside the search bar at the top left of the visualization panel. EHLDs, where available, appear as a thumbnail in the bottom left of the visualization panel, and can be accessed by clicking the pathway name in the pathway hierarchy at the left of the visualization panel.
  
  An alternate view of the entire pathway hierarchy can be accessed by clicking the third blue icon to the right of the search bar from the Pathway Overview view. This opens a separate window and displays the Reactome pathways as a Voronoi diagram, with sizes of pathway clusters proportional to the number of events the pathway contains (Jassal et al, 2020). To move from the Voronoi diagram back to the Pathway Browser, click and hold on a pathway name within the Voronoi image.
  
  At the top left of the visualization panel is a search bar, featuring results grouping and filtering, hit highlighting and text auto-completion (Fabregat et al, 2016). The visualization panel also contains an icon (top right, first blue icon) that allows export of the pathway visualization in several different formats, a “compass” icon that provides a pathway overview key (top right, second blue icon) and an expandable panel on the right side which provides information and allows users to customize color preferences.
  
  At the bottom right of the visualization panel are navigation arrows and zoom features. Users can also zoom using the mouse wheel and can click and drag the diagram. The thumbnail image, in the lower left corner of the visualization panel, can be used to navigate quickly to a region of interest in the pathway diagram.
4. The details panel is located below the visualization panel, and its contents change in sync with user selections in the visualization panel or the pathway hierarchy. The details panel has 6 tabs, each of which contains a general description of what will be displayed in that panel once an event or entity is selected. The “Description” tab displays molecular details related to the selected event or entity, including inputs and outputs of reactions, catalysts, regulators, preceding and following events, linkouts to other databases with entity information, etc. This tab also displays event summations, literature references and editorial information. The “Molecules” tab shows downloadable details of the molecules (proteins, small molecules/chemical compounds, nucleic acid sequences) involved in the selected event. The “Structures” tab shows reaction details from Rhea (Bansal et al, 2022) or structural information from ChEBI (Hastings et al, 2016) or PDBe (Armstrong et al, 2020), as appropriate. The “Expression” tab displays gene expression information from Gene Expression Atlas for genes corresponding to the selected items. The “Analysis” tab displays the pathway-specific results generated by the Reactome analysis tools, and finally, the “Download” tab allows users to download the selected pathway in several different formats.
This protocol will illustrate features of the Reactome pathway browser by exploring the events contained within the “DNA Repair” pathway. To begin, click on the “DNA Repair” pathway title in the pathway hierarchy at left.

In response to this selection, the visualization panel zooms in on the DNA Repair node in the Pathway Overview (see Figure 5) and brings up the pathway level information in the “Description” tab of the details panel, including pathway level summation, editorial attributes (authors and reviewers), literature references, GO Biological Process where appropriate, and cellular compartment. Each of these attributes is expandable: clicking on the plus (“+”) symbol on the right reveals further details, including linkouts to PubMed, ORCID, GO or other cross-referenced resources. The “Description” tab of the details panel also contains a stable identifier for the event displayed, including the three letter species code (HSA for Homo sapiens is the default unless species is changed). Scrolling down in the details panel to the section labeled “View computationally predicted event in” reveals a species selector bar. Reactome’s manual annotations are human-focused but are computationally extended to other species based on protein conservation as described under “Computationally inferred events” in the “Docs” section of the website. Selecting a different species from the species selector bar will update the events and information displayed in the visualization and details panel for the selected species. Note that the species may also be changed using the drop-down menu to the right of the Reactome logo in the header of the pathway browser.

Other tabs in the details panel are also updated in response to the selection of the “DNA Repair” pathway in the hierarchy (note that the “Structures” tab is not populated with data when a pathway-level event is selected in the hierarchy, and the “Analysis” tab is only populated once an analysis has been performed). The “Molecules” tab is updated to provide information on all the chemical compounds (55 in the DNA Repair pathway), proteins (315), drugs (28), sequences (0) or other entities (0) contained in the pathway, and the total number of molecules (398) is displayed in a red bubble at the top of the “Molecules” header. Inside the “Molecules” panel, expandable sections provide detailed information on each of the entities contained in the event, linked out to the appropriate reference database. This information is downloadable in full or in part from within the “Molecules” tab. The “Expression” tab displays expression data from Gene Expression Atlas for each of the genes contained within the pathway, and the “Download” tab has pathway reports and diagrams for the selected pathway available for download in a variety of formats.
Double click on the “DNA Repair” pathway title in the hierarchy, or double click on the corresponding node in the visualization panel.

Double clicking on the DNA Repair pathway title in the hierarchy opens the interactive EHLD for this pathway (users know an EHLD is available because the pathway icon between the plus sign and the pathway name in the hierarchy is blue rather than black; pathways with black icons don’t have EHLDs but may have static, non-interactive illustrations).
Click on the plus (“+”) symbol beside “DNA Repair” in the hierarchy to reveal the subpathways.

Users can navigate to any of the 7 subpathways of DNA Repair by clicking on the plus (“+”) symbol beside DNA Repair in the hierarchy or by clicking on the subpathway name in the EHLD. Clicking on a subpathway name either in the hierarchy or from the EHLD will either open another EHLD, if the selected subpathway itself contains multiple subpathways with ELV diagrams (as is the case for the Base Excision Repair subpathway) or will open an ELV pathway as is the case for the other 6 subpathways of DNA Repair.
Click on the “Nucleotide Excision Repair” pathway either in the hierarchy or from within the “DNA Repair” EHLD.

An ELV diagram containing reactions curated at the level of molecular participants appears in the visualization panel, and the details panel updates to reflect information appropriate for this pathway. Note that the total molecules displayed in the “Molecules” tab for this pathway is 119 (9 “Chemical Compounds”, 110 “Proteins”), fewer than the parent DNA Repair pathway, as expected.

ELV pathways open at a fully zoomed out level. Diagram zoom level is controlled either by a mouse or with the zoom icons in the bottom right of the visualization panel. Diagrams can be easily recentered to the fully zoomed out view by clicking on the icon to the immediate right of the search bar at the top of the visualization panel.

Depending on the diagram size, commonly occurring reaction participants such as H₂O or ATP may not be displayed in the diagram at the fully zoomed out level. As the user zooms in, these entities are dynamically added to the pathway diagram.

Nucleotide Excision Repair has two subpathways, both laid out in the same ELV: “Global Genomic NER (GG-NER)” and “Transcription-coupled NER (TC-NER)”. These subpathway names are displayed in the zoomed-out view of the ELV and are contained by colored subpathway boundary boxes. Clicking on either of the NER subpathway names in the hierarchy highlights the events encompassed by that subpathway in the visualization panel in blue, while hovering over a pathway name highlights the corresponding events in yellow.

The pathway diagram uses the conventions of the Systems Biology Graphical Notation (SBGN) format to distinguish the molecules and reactions by shape and cellular location, to provide a dynamic framework for pathway visualization and data analysis. A key to the shapes used in the ELV diagrams is available by clicking on the compass arrow at the top right of the visualization panel; doing so reveals the key shown in Figure 6.
Continue to drill down into the hierarchy to reach reaction level events as follows: Click on the pathway title “Global Genomic Nucleotide Excision Repair”. Note that the details panel shows that this pathway contains 92 of the 119 molecules present in the NER pathway (8/9 “Chemical Compounds” and 84/110 “Proteins”). Expand this subpathway in the hierarchy by clicking on the adjacent plus sign to reveal the four subpathways (“DNA Damage Recognition in GG-NER”, “Formation of Incision Complex…”, etc.). Continue to expand the hierarchy by clicking the “DNA Damage Recognition in GG-NER” pathway in the hierarchy to reveal the 5 molecular level events contained within, noticing that at each subsequent pathway level the fraction of molecules represented is adjusted relative to the parent NER pathway.
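
The fractions reported in the “Molecules” tab follow directly from the per-category counts. As a worked illustration, the short sketch below reproduces the 92/119 figure from the counts quoted in this step; the class and function names are hypothetical and are not part of any Reactome software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoleculeCounts:
    """Molecule tallies for one pathway, split by Molecules-tab category."""
    chemical_compounds: int
    proteins: int

    @property
    def total(self) -> int:
        return self.chemical_compounds + self.proteins


def bubble(child: MoleculeCounts, parent: MoleculeCounts) -> str:
    """Format a child/parent fraction in the style quoted in the text."""
    return f"{child.total}/{parent.total}"


# Counts quoted in this protocol (Reactome, December 2022).
ner = MoleculeCounts(chemical_compounds=9, proteins=110)
gg_ner = MoleculeCounts(chemical_compounds=8, proteins=84)

print(bubble(gg_ner, ner))  # 92/119
```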
Click on “XPC binds RAD23 and CETN2”, the first reaction in the “DNA Damage Recognition in GG-NER” pathway, as shown in Figure 7.

Clicking a reaction in the pathway hierarchy will cause the reaction name and the name of the subpathway(s) and parent pathways to be highlighted, as seen in Figure 7, above. The visualization panel will recenter on the selected reaction and the reaction will be highlighted in blue. Furthermore, the description tab of the “Details” panel will update to show particulars of the selected reaction, which will include some or all of the following, where appropriate: inputs, outputs, catalysts and positive and negative regulators, preceding event, and “inferred from” reaction.
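
The highlighting of a reaction together with its parent pathways amounts to finding the path from the top of the hierarchy down to the selected event. The sketch below illustrates that idea on a small fragment of the hierarchy built only from names quoted in this protocol; the nested-dictionary representation and the `path_to` helper are illustrative assumptions, not the data structure used by the Pathway Browser.

```python
from typing import Optional

# A fragment of the hierarchy, using only names quoted in this protocol.
# Empty dicts stand for reactions or for subpathways not expanded here.
HIERARCHY = {
    "DNA Repair": {
        "Base Excision Repair": {},
        "Nucleotide Excision Repair": {
            "Global Genomic NER (GG-NER)": {
                "DNA Damage Recognition in GG-NER": {
                    "XPC binds RAD23 and CETN2": {},
                    "UV-DDB ubiquitinates XPC": {},
                },
            },
            "Transcription-coupled NER (TC-NER)": {},
        },
    },
}


def path_to(target: str, tree: dict, trail: tuple = ()) -> Optional[list]:
    """Depth-first search returning the chain of ancestors ending at target."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return list(here)
        found = path_to(target, children, here)
        if found is not None:
            return found
    return None


# Print the selected reaction beneath its parent pathways, indented by level.
for level, name in enumerate(path_to("XPC binds RAD23 and CETN2", HIERARCHY)):
    print("  " * level + name)
```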

In the case of the “XPC binds RAD23 and CETN2” reaction, there are three inputs, one output, no catalyst or regulators, no preceding event and no “inferred from” reaction. The inputs to the reaction are the proteins XPC and CETN2 and a set of RAD23 proteins. Reactome sets are groups of molecules that are known or predicted to function in the same way in a given reaction and may either be “member” sets, where each participant is known to have the indicated role, or “candidate” sets, where some participants are verified members, and others are candidates based on sequence or structural similarity. Here the RAD23 set is a “candidate” set, with RAD23B a verified member and RAD23A a candidate. The output of this reaction is the complex formed by the binding of the three inputs (XPC, the RAD23 set and CETN2).
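
The distinction between “member” and “candidate” sets can be summarized in a few lines of code. The following is a hypothetical model written for this explanation only (the class name, its fields and the `kind` rule are assumptions and do not reproduce the Reactome data schema); it encodes the RAD23 example described above.

```python
from dataclasses import dataclass, field


@dataclass
class EntitySet:
    """Hypothetical model of a set: verified members plus optional candidates."""
    name: str
    members: list = field(default_factory=list)
    candidates: list = field(default_factory=list)

    @property
    def kind(self) -> str:
        # A set with any unverified participant is treated as a candidate set.
        return "candidate" if self.candidates else "member"


# RAD23B is a verified member; RAD23A is a candidate.
rad23 = EntitySet("RAD23", members=["RAD23B"], candidates=["RAD23A"])
print(rad23.kind)  # candidate
```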
Clicking on the “XPC binds RAD23 and CETN2” reaction simultaneously updates information in the other details panel tabs:
- the red bubble on the “Molecules” header now indicates that the reaction contains 4/119 molecules from the Nucleotide Excision Repair pathway, all of them proteins, and the “Molecules” panel is updated with information on these four proteins.
- the “Structures” header has been decorated with a red bubble indicating the fraction of the molecules in the reaction for which structural information is available (here, 4/4), and the panel provides linkouts to those structures in Protein Data Bank.
- the expression for these 4 proteins is displayed in the “Expression” tab
- the “Analysis” tab continues to be blank unless an analysis has been performed
- the “Download” tab: note that although a reaction is selected in hierarchy, the download available from this tab remains the immediate parent ELV pathway (here the “Nucleotide Excision Repair” subpathway of the top-level “DNA Repair”)
In the context of the “DNA Damage Recognition in GG-NER” pathway, the reaction “XPC binds RAD23 and CETN2” is the first in a series of connected reactions, in which the output of one reaction is the input to the next reaction. Because there are no annotated reactions that occur prior to the binding of RAD23 and CETN2 to XPC, there is no indication of a “preceding event” in the “Description” tab of the details panel for this reaction. If the following reaction in the hierarchy (“XPC:RAD23:CETN2 and UV-DDB bind distorted dsDNA site”) is selected, the “Description” tab of the details panel is updated with reaction-specific information. In particular, a new field describing the positive regulator (INO80 complex) is added, as is the new field “Preceding Event”, which lists the “XPC binds RAD23 and CETN2 [Homo sapiens]” reaction described above. Clicking on the plus (‘+’) symbol to the right of this “preceding event” reaction name expands this field to reveal the summation and literature references associated with the reaction. This allows users to put the current reaction into a more complete biological context.
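
The output-to-input relationship that links these two reactions can be sketched as follows. This is an illustration of the concept only: in Reactome, preceding events are annotations made by curators and are not derived by a rule like this one. The participant lists are simplified, and the inputs of the second reaction are read off its title rather than taken from the database.

```python
# Simplified participants; the second reaction's inputs are read off its title.
REACTIONS = {
    "XPC binds RAD23 and CETN2": {
        "inputs": {"XPC", "RAD23", "CETN2"},
        "outputs": {"XPC:RAD23:CETN2"},
    },
    "XPC:RAD23:CETN2 and UV-DDB bind distorted dsDNA site": {
        "inputs": {"XPC:RAD23:CETN2", "UV-DDB", "distorted dsDNA"},
        "outputs": {"XPC:RAD23:CETN2:Distorted ds DNA:UV-DDB"},
    },
}


def preceding_events(reaction: str, reactions: dict) -> list:
    """Return reactions with at least one output that this reaction consumes."""
    needed = reactions[reaction]["inputs"]
    return sorted(
        name
        for name, parts in reactions.items()
        if name != reaction and parts["outputs"] & needed
    )
```

With these inputs, the first reaction has no preceding event, while the second lists “XPC binds RAD23 and CETN2”, matching what the “Description” tab shows.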

The relationship between the “levels” of the pathway hierarchy on the one hand and the “Preceding event(s)” links, on the other hand, may not be immediately clear. The nested levels of the pathway hierarchy reflect levels of abstraction in the conceptual organization of pathways. As one moves deeper into the hierarchy, the contents of the pathway diagram become more and more specific and move closer to the biochemical reaction level. The “Preceding event(s)” link only appears at the reaction level.

Reactions in Reactome are human-centric. Wherever possible, the molecular events that are depicted in Reactome are supported by direct experiments that make use of human cells, tissues, protein or other entities. This evidence is cited in the literature references associated with the reaction. Often, however, knowledge of human biology is derived indirectly from work using model organisms. If the direct experimental evidence supporting a given reaction is model organism-based, an “inferring” reaction is created using the molecules from the relevant species, and a corresponding human reaction is inferred from it. Human reactions that are inferred in this way are indicated with a double arrow adjacent to the reaction title in the hierarchy, and by the presence of an “inferred from another species” field in the “Description” tab of the details panel. Expanding the field by clicking on the plus (‘+’) symbol at the right reveals the summation and references for the experiment(s) in the other species. “Inferred from” reactions may also be marked as chimeric if the experiment being cited as evidence contains molecules (proteins, nucleic acids, etc.) from multiple species.
There are no inferred reactions within the “Nucleotide Excision Repair” subpathway. To see an example in the “Base Excision Repair” pathway, expand the “Base Excision Repair” hierarchy to reveal the subpathway “Base-Excision Repair, AP Site Formation”, then expand its child pathway “Depurination” and, within that, the child pathway “Recognition and association of DNA glycosylase with site containing an affected purine”.

This pathway contains 10 reactions, the ninth of which is “NEIL3 recognizes and binds to spiroiminodihydantoin in telomeric DNA”, with the inferred double arrow indicator beside its event name in the hierarchy. The “Description” tab of the details panel contains an “Inferred from another species” field, listing the corresponding reaction from Mus musculus, along with its summation and literature references, as shown in Figure 8.
In addition to the reaction level information described above (summations, literature references, editorial attributes, etc.), Reactome also provides detailed information about each of the entities involved in a reaction. To explore this, return to the first reaction of the “DNA Damage Recognition in GG-NER” pathway, “XPC binds RAD23 and CETN2”. Clicking on any of the inputs or outputs of the reaction (or regulators and catalysts where applicable) updates the “Description” tab of the details panel with information and linkouts for the corresponding molecule. In the “XPC binds RAD23 and CETN2” reaction, click on the “XPC” entity in the pathway diagram.

This will update the Details panel to display information about the protein, including synonyms, cellular compartment (linked out to the GO ontology), external identifiers to resources such as UniProtKB, Ensembl, GeneCards, HMDB Protein, HPA, OpenTargets, Orphanet, PDB, PRO, Pharos and RefSeq and a selector bar to view the entity in other species. Small molecule inputs or outputs (none in this reaction) are similarly linked to appropriate reference databases and provided with synonyms and cellular compartments in the “Description” tab of the details panel. Small molecules are given species-agnostic identifiers beginning with R-ALL.
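
The species code embedded in a stable identifier can be extracted programmatically. The sketch below assumes the pattern implied by the text: the prefix “R-”, a three-letter code such as HSA or ALL, a hyphen, and a number. Both the pattern and the helper names are assumptions, and the identifiers in the usage lines are made-up placeholders rather than real database entries.

```python
import re
from typing import NamedTuple


class StableId(NamedTuple):
    species_code: str  # e.g. HSA for Homo sapiens, ALL for species-agnostic
    number: int


# Assumed pattern: "R-", a three-letter code, "-", then digits.
_STABLE_ID = re.compile(r"R-([A-Z]{3})-(\d+)")


def parse_stable_id(text: str) -> StableId:
    """Split an identifier into its species code and numeric part."""
    match = _STABLE_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognised stable identifier: {text!r}")
    return StableId(match.group(1), int(match.group(2)))


def is_species_agnostic(text: str) -> bool:
    """True for identifiers that carry the ALL code, as small molecules do."""
    return parse_stable_id(text).species_code == "ALL"


# Placeholder identifiers, for illustration only.
print(parse_stable_id("R-HSA-12345").species_code)  # HSA
print(is_species_agnostic("R-ALL-67890"))  # True
```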

Note that in the “XPC binds RAD23 and CETN2” reaction, the XPC and CETN2 icons in the visualization panel are decorated with red circles on the upper right corner. This mark indicates the number of interacting proteins in the IntAct database for that entity. Clicking on the red icon displays the interacting proteins in a halo around the pathway entity, as shown for XPC in Figure 9; clicking on an interactor takes the user to the UniProtKB record for that protein. In cases where there are too many interactors to display in the context of the pathway diagram, only the top 18 interactors are shown. Users can customize the confidence level for these interactors with the sliding scale at the bottom of the visualization panel; this toolbar also allows users to download all pathway interactors as a CSV file.
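
The combined effect of the confidence slider and the display cap can be pictured as a filter followed by a truncation. The sketch below is a guess at that logic for illustration only: it assumes that each interactor carries a numeric score, that higher scores mean higher confidence, and that the “top” interactors are those with the highest scores. The interactor names and scores are placeholders, not IntAct data.

```python
MAX_DISPLAYED = 18  # display cap quoted in the text


def interactors_to_display(scores: dict, min_score: float) -> list:
    """Keep interactors at or above min_score, best first, capped at 18."""
    kept = [(name, score) for name, score in scores.items() if score >= min_score]
    # Sort by descending score, breaking ties alphabetically by name.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in kept[:MAX_DISPLAYED]]


# Placeholder data: 25 made-up interactors with scores 0.50, 0.52, ..., 0.98.
example = {f"INTERACTOR_{i:02d}": (50 + 2 * i) / 100 for i in range(25)}

print(len(interactors_to_display(example, 0.0)))  # 18
print(len(interactors_to_display(example, 0.9)))  # 5
```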

Although the interacting protein overlay by default is set to IntAct, users can change the referring database by opening the panel on the right side of the visualization panel as shown in Figure 9 and selecting one of the other highlighted resources from the middle “Interactors” tab. These interactors are loaded on an on-demand basis through PSICQUIC. Note that at this time, interactors can only be displayed for individual pathway protein entities and not for complexes or molecule sets.

Also note that zooming in on XPC or other proteins within the visualization panel reveals within the pathway icon the UniProtKB identifier and, if available, an associated structure from PDB.
Reactome provides information about the subunits of a complex, as well as the larger ensembles of proteins that a complex participates in. In this example, from the “XPC binds RAD23 and CETN2” page, click on the “XPC:RAD23:CETN2” entity in the pathway diagram. This will update the Details panel to display information about this complex.

The “Description” tab of the details panel will now include a new field, “Components”, in which each of the constituents of the complex is listed, each with an expandable window that links to further information about that entity (synonyms, compartment, reference entities, external identifiers, etc., as outlined above for XPC). In addition, clicking on the icon to the left of the component name (here, the green protein circle for XPC) also reveals a window with entity-specific information.

In addition, the “Description” tab of the details panel now includes a “Produced by” and a “Consumed by” field, listing events across Reactome as a whole in which the complex is either an output or an input, respectively. These fields are expandable (plus (‘+’) symbol at right) to reveal summation and literature references for the reaction-like-event in question. Clicking on the reaction icon to the left of the “Produced by” or “Consumed by” reaction titles will highlight the reaction node and recentre the visualization panel on the corresponding reaction. Note that this may move the user to a different pathway diagram.
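
The relationship between a complex and its “Produced by” and “Consumed by” events can be sketched as a lookup over reaction inputs and outputs. This is a simplified illustration and not Reactome's data model; the second reaction and its product are hypothetical placeholders added so that the complex has a consuming event.

```python
from typing import Dict, List, NamedTuple


class Reaction(NamedTuple):
    name: str
    inputs: List[str]
    outputs: List[str]


def produced_and_consumed(entity: str,
                          reactions: List[Reaction]) -> Dict[str, List[str]]:
    """List reactions in which the entity is an output or an input."""
    return {
        "produced_by": [r.name for r in reactions if entity in r.outputs],
        "consumed_by": [r.name for r in reactions if entity in r.inputs],
    }


reactions = [
    # Participants as described in the text.
    Reaction("XPC binds RAD23 and CETN2",
             ["XPC", "RAD23", "CETN2"], ["XPC:RAD23:CETN2"]),
    # Hypothetical downstream step, named only for illustration.
    Reaction("downstream step (hypothetical)",
             ["XPC:RAD23:CETN2"], ["product (hypothetical)"]),
]

print(produced_and_consumed("XPC:RAD23:CETN2", reactions))
```

Because the lookup runs over all reactions rather than those of a single diagram, the events it returns may belong to other pathways, which is why following such a link can change the displayed diagram.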

More detailed information is also provided about the components of sets; to see this, click on the RAD23 set that is an input to the “XPC binds RAD23 and CETN2” reaction. Note, however, that unlike complexes, sets are not associated with “Produced by” or “Consumed by” fields.
To explore the information Reactome provides about catalysts, click on the third reaction in the “DNA Damage Recognition in GG-NER” pathway, “UV-DDB ubiquitinates XPC”. Catalysts are shown regulating the reaction node by virtue of an edge ending in a circle (see Figure 10). Catalysts may either be independent of other reaction participants, or, as in this case, may be one of the reaction inputs. Reflecting this dual role, the “XPC:RAD23:CETN2:Distorted ds DNA:UV-DDB” complex has both a reaction edge and a catalyst edge associated with it.
In addition to the fields described above for reactions without catalysts, the “Description” tab of the details panel for an enzyme-catalyzed reaction also contains the following information about the catalyst (Figure 10):
- Physical Entity: whichever molecule in the pathway diagram is associated with the catalyst activity. This may be a single protein, a set of proteins, or a complex (here the complex “XPC:RAD23:CETN2:Distorted ds DNA:UV-DDB”).
- Active Unit: in cases such as this one where a complex is the catalyst, the specific component that contributes the catalytic activity is identified. Here, the active unit is the UV-DDB subcomplex consisting of DDB1, DDB2, RBX1 and CUL4.
- Molecular Function: the most appropriate term is taken from (and linked out to) the Gene Ontology (GO).
The catalyst name is a concatenation of the GO Molecular Function and the name of the Physical entity.
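
The three catalyst fields and the derived name can be represented as a small record. This is a sketch only: the class, the " of " joiner and the molecular function term used in the example are assumptions for illustration, and the actual term shown in the details panel should be taken from the Pathway Browser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalystActivity:
    physical_entity: str               # molecule carrying the activity
    molecular_function: str            # GO molecular function term
    active_unit: Optional[str] = None  # component providing the activity

    @property
    def name(self) -> str:
        # Assumption: the two parts are joined with " of ".
        return f"{self.molecular_function} of {self.physical_entity}"


catalyst = CatalystActivity(
    physical_entity="XPC:RAD23:CETN2:Distorted ds DNA:UV-DDB",
    # Placeholder GO term, not confirmed against the Reactome record.
    molecular_function="ubiquitin-protein transferase activity",
    active_unit="UV-DDB",
)
print(catalyst.name)
```

Keeping the active unit optional reflects the case where the catalyst is a single protein and no subcomponent needs to be singled out.
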
Reactome provides inter-pathway connections for physical entities contained within a given pathway diagram. Hovering over any entity in the visualization panel reveals an arrowhead at the right side of the entity icon. Clicking on this arrowhead reveals an interactive information panel (Contextual Information Panel, CIP) with three tabs (see Figure 11): “Molecules”, “Pathways” and “Interactors”. Similar to the descriptions above, the “Molecules” tab provides the components of the selected entity, and the “Interactors” tab provides a table listing the interacting proteins along with scores and evidence. The display name of components or interactors can be toggled between common name and reference identifier by clicking on the small “id” button at the top right of the interactive panel. Clicking on the pin icon locks the interactive panel to the pathway diagram; to close the panel, click on the ‘x’ icon.

To explore the “Pathways” tab, click on the arrowhead revealed by hovering above the XPC protein input in the first reaction of the “DNA Damage Recognition in GG-NER” subpathway, “XPC binds RAD23 and CETN2”.

The “Pathways” tab lists other Reactome pathways in which the selected entity takes part; here the only other pathway in which XPC has an annotated function in Reactome is the “SUMOylation of DNA damage response and repair proteins” pathway. Clicking on the pathway name within the interactive panel moves the user to the new pathway in which that entity participates.
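
Conceptually, the “Pathways” tab answers a reverse lookup: given an entity, which other pathways contain it? The sketch below builds such an index from a toy mapping. The pathway memberships include only the participants named in the text (real pathways contain many more), and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_entity_index(pathways: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert a pathway -> entities mapping into entity -> pathways."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for pathway, entities in pathways.items():
        for entity in entities:
            index[entity].add(pathway)
    return index


def other_pathways(entity: str, current: str,
                   index: Dict[str, Set[str]]) -> List[str]:
    """Pathways containing the entity, excluding the one being viewed."""
    return sorted(index.get(entity, set()) - {current})


# Toy memberships limited to what the text states.
pathways = {
    "DNA Damage Recognition in GG-NER": {"XPC", "RAD23", "CETN2"},
    "SUMOylation of DNA damage response and repair proteins": {"XPC"},
}
index = build_entity_index(pathways)
print(other_pathways("XPC", "DNA Damage Recognition in GG-NER", index))
```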

This connection between pathways, mediated by shared participants, highlights potentially unexpected linkages between disparate areas of biology and illustrates the power of Reactome to bridge these domains.

The Reactome home page (https://reactome.org) features a header panel with drop-down menus to access web content, a search bar and four large buttons linking to key features of the website - Pathway Browser, Analysis Tools, the Reactome FIViz app, and documentation.

The Reactome home page also contains announcements (news, Twitter, research Spotlight, project information), and statistics from the most recent release.

Lower down on the home page is a help panel with buttons linking to guides for users and developers, and buttons for API and data access.

The Pathway Browser, including the event hierarchy (left panel), the details panel (bottom) and the visualization panel displaying the “Pathway Overview” view.

High-level view of the DNA Repair pathway in the Pathway Browser, showing pathway-level summation in the details panel, a zoomed-in view of the Pathway Overview and a thumbnail of the textbook-style illustration in the visualization panel.

A key to the SBGN-based icons used in the molecular level events in Reactome.

Selection of a particular reaction (“XPC binds RAD23 and CETN2” in the “DNA Damage Recognition in GG-NER” pathway) updates the Pathway Browser appropriately.

The reaction “NEIL3 recognizes and binds to spiroiminodihydantoin in telomeric DNA” is inferred from a corresponding reaction in *Mus musculus*, as evidenced by the double arrowhead beside the event name in the hierarchy.

Proteins that interact with XPC as described in the IntAct database are displayed as a halo around XPC in the visualization panel.

The details of a Reactome catalyst are shown in the context of the reaction “UV-DDB ubiquitinates XPC”.

The Contextual Information Panel (CIP) displays information about a given pathway entity, including constituent molecules, other Reactome pathways where that entity occurs, and interactors.

BASIC PROTOCOL 2

Exploring Reactome annotations of disease and drugs

In addition to normal human biology, Reactome annotates abnormal or pathological events arising from genetic mutation or interaction with an infectious agent in a separate top-level pathway called “Disease”. Reactome disease pathways are designated with a red “+” symbol to the left of the pathway name and include cancer, metabolic, immune and infectious diseases, among others. Where possible, Reactome disease pathways also include the interaction of relevant therapeutic drugs.

Consistent with Reactome’s pathway-centric view, disease events (with the exception of infectious processes) are annotated as changes to normal molecular level reactions and are displayed in the context of the relevant non-disease pathway background. As a result, there is no single diagram representing a given disease (for instance, bladder cancer or diabetes); rather, individual events that are perturbed in the course of that disease are labeled with the appropriate disease tag and displayed as overlays to normal pathway events. Events with the same disease tag may therefore be distributed across multiple normal pathways and pathway diagrams. Infectious diseases, by contrast, represent novel events that do not have a corresponding normal state and therefore have their own pathway diagrams.
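
The consequence of tagging individual events, rather than drawing one diagram per disease, can be illustrated by grouping events by tag. This is a sketch under stated assumptions: the record layout is hypothetical, and the event, pathway and disease names are placeholders, not Reactome annotations.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Event(NamedTuple):
    name: str
    normal_pathway: str                # pathway whose diagram hosts the event
    disease_tag: Optional[str] = None  # None for a normal event


def pathways_by_disease(events: List[Event]) -> Dict[str, List[str]]:
    """Collect, per disease tag, the normal pathways that host tagged events."""
    grouped = defaultdict(set)
    for event in events:
        if event.disease_tag is not None:
            grouped[event.disease_tag].add(event.normal_pathway)
    return {tag: sorted(hosts) for tag, hosts in grouped.items()}


# Placeholder records, for illustration only.
events = [
    Event("event A", "pathway 1", "disease X"),
    Event("event B", "pathway 2", "disease X"),
    Event("event C", "pathway 1"),
]
print(pathways_by_disease(events))
```

In this toy data a single disease tag maps to two host pathways, mirroring how events for one disease can appear as overlays on several normal pathway diagrams.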

Reactome’s disease and drug annotations will be explored using the disease pathways “Signaling by ERBB2 in Cancer” and “SARS-CoV-2 Infection”. This module will highlight where and how the disease pathway annotations diverge from those of normal pathways; many of the key annotation features, however, are functionally equivalent and these will not be detailed here.