Table 1 | Troubleshooting table
Step | Problem | Possible reason | Solution |
---|---|---|---|
2 | User does not know whether Java is installed on the computer or which version is installed | | The Java website (https://www.java.com) has a help page, ‘How to find Java version in Windows or Mac – Manual method’ (https://www.java.com/en/download/help/version_manual.xml), that explains how to check which version of Java is already installed |
6B(i) | Unable to launch GSEA | Unable to associate .jar file with Java application | When launching GSEA on macOS for the first time, you may get the error ‘gsea.jnlp cannot be opened because it is from an unidentified developer’. Click on OK. Then, instead of double-clicking on the gsea.jnlp icon/file, right-click on it and select Open. The same error, ‘gsea.jnlp can’t be opened because it is from an unidentified developer’, will appear, but this time it will give you the option to Open or Cancel. Click on Open. After this initial opening, subsequent double-clicks on gsea.jnlp will launch GSEA without errors or warnings. If GSEA still fails to launch, it can be launched from the command line instead. Go to the GSEA download site (http://www.broadinstitute.org/gsea/downloads.jsp) and download the javaGSEA JAR file (the second option on the download site). Open a command-line terminal. On macOS, the terminal can be found in Applications → Utilities → Terminal. On Windows, type cmd in the Windows program files search bar. Then use the command cd to navigate to the directory where the JAR file was downloaded. For example, on macOS, run cd ~/Downloads if you downloaded the JAR file to your Downloads folder. Run the command java -Xmx4G -jar gsea-3.0.jar (substituting the name of the JAR file that you downloaded), where -Xmx specifies how much memory is given to GSEA (4 GB in this example) |
6B(ii) | User needs more information about GSEA file formats (GMT, RNK, CLS, GCT) | User wants to format their own files for GSEA | GSEA provides help documentation on file formats at https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats |
6B(v) | GSEA seems nonresponsive | A large GMT file is being loaded | It may take 5–10 s for GSEA to load input files. The files are loaded successfully once a message appears on the screen, e.g., ‘Files loaded successfully: 2/2. There were no errors’ |
6B(xiii) | GSEA looks nonresponsive, but it is actually computing enrichments | No progress bar | GSEA has no progress bar to indicate estimated time to completion. A run can take a few minutes or hours, depending on the size of the data and the computer speed. Click on the ‘+’ in the bottom left corner of the screen to see messages such as ‘shuffleGeneSet for GeneSet 4661/4715 nperm: 1000’ (circled in red at the bottom of Fig. 3). This message indicates that GSEA is shuffling 4,715 gene sets 1,000 times each, 4,661 of which are complete. Once the permutations are complete, GSEA generates the report |
6B(xiv) | Error message ‘Java Heap space’ | GSEA was launched with insufficient memory | The error message ‘Java Heap space’ indicates that the software has run out of memory. If you are running the GSEA desktop application, you need a version that launches with more memory. Multiple options are available for download from the GSEA website, including webstart applications that launch GSEA with 1, 2, 4 or 8 GB of RAM. Switch to a webstart that launches with more memory than the one you are using. If you are already using the webstart that launches with 8 GB, then you need the GSEA Java JAR file, which can be executed from the command line with increased memory (see Troubleshooting for Step 6B(i) for details) |
6B(xv) | User needs to access previous results but cannot find them | GSEA application was closed since running the analysis | If the GSEA software has been closed, you can access previous results by opening the working folder and opening the ‘index.html’ file. Alternatively, you can re-launch GSEA, click on Analysis history, then History, and then navigate to the date of your analysis. All analyses are listed under History, regardless of where the results files were saved, but they are organized by the date on which the analyses were run. If you cannot remember when you ran a specific analysis, you may have to manually search through a few directories to find it |
6B(xvi) | Few or no results returned by GSEA | Possible identifier mapping issue | Check the number of gene sets that were analyzed. If the number is low (e.g., low hundreds), it could indicate gene ID mapping problems |
9A(ii) | Autoload of g:Profiler results creates many datasets with incorrect file specifications | There are too many text files within the directory specified | To simplify loading g:Profiler results into EnrichmentMap and populating the correct fields in the EnrichmentMap interface, place the g:Profiler results file and gene set file (i.e., Supplementary_Table4_gprofiler_results.txt and Supplementary_Table5_hsapiens.pathways.NAME.gmt) into a directory together by themselves |
9A(xi) | User cannot create a g:Profiler map with more than one phenotype | Although an individual g:Profiler analysis has only one phenotype, it is possible to modify a single results file to contain two analyses. This is relevant when the phenotypes are mutually exclusive. For the analysis that you want to associate with the additional phenotype (which would correspond to downregulated genes in GSEA PreRanked and is thus called ‘negative’), open the g:Profiler results file (preferably in a spreadsheet, so that you can easily modify a single column). The fifth column specifies the phenotype. Update the column to have the value of ‘−1’ for each result in the file. Open the second analysis file. Copy all the results from the second file and paste them into the updated negative g:Profiler file. Save the file and use it as the g:Profiler enrichment results file in the EnrichmentMap interface instead of the original results files. Pathways corresponding to the two phenotypes will be colored red and blue in the resulting enrichment map. One limitation of this approach is that a pathway cannot be included in both the positive and the negative sets | |
9B(vi), (vii), (ix) | A random number is appended to the GSEA directory name | Each GSEA analysis generates a random number that is appended to the names of the files and directories. The number will be different for each new analysis | |
9B(viii) | EnrichmentMap uses a GMT file that was not the original file input to GSEA | The original GMT file was moved or no longer exists in the location in which GSEA saved it | EnrichmentMap will not be able to find your original GMT file if you have moved it since running the GSEA analysis. If EnrichmentMap cannot find the original GMT file, it will use a filtered GMT file found in the GSEA ‘edb’ results directory. Although this is also a GMT file, it has been filtered to contain only genes found in the expression file. If you use this filtered file, you will obtain different pathway connectivity depending on the expression data being used. You should always use the original GMT file used for the GSEA analysis and not the filtered one in the results directory |
9B(xiii) | User cannot provide a sufficiently precise Q value | Scientific notation is not enabled | To set the threshold to a small number, select Scientific Notation and set a Q value cutoff such as 1E−04 |
10 | Few or no pathways are present in EnrichmentMap | Input dataset may not contain enough signal to find enriched pathways | A pathway enrichment analysis resulting in few or no enriched pathways may be caused by suboptimal statistical processing used to define the original gene list. Enriched pathways are unlikely to be found if the gene list ranks are too noisy and the most important genes are not at the top of the list, no genes are highly significant, or a large fraction of genes are highly significant. If the gene list has been correctly defined, analyzing further databases of pathways and gene sets or setting more liberal filters may improve results |
11 | User cannot find any pathways when searching by gene | The gene identifier type used for the search may not match the identifier type used in the analysis | Multiple genes separated by spaces can be entered into the search bar. Any pathway that contains at least one of the genes will be selected and highlighted in the network. Adding keywords with ‘AND’ into the search bar will show only pathways that contain all genes in the search query (e.g., ‘geneA AND geneB’). If the original analysis did not use gene symbols, then you will not be able to search by gene symbols. Instead, use the identifier type that the analysis was based on, for example, Entrez Gene ID or Ensembl gene ID |
12 | There are very few entries in the node table, although the network contains many nodes | Some nodes in the network are already selected | If there are very few records in the node table, make sure that no nodes are selected in the network. Alternatively, click on the gear icon and change the setting from Auto to Show all |
13A(i) | Leading-edge genes are not highlighted when clicking on a pathway node | Analysis was not done with GSEA, or GSEA rank file or enrichment results were not supplied when the enrichment map was built | The leading edge can be displayed only if the rank file is provided when the network is built. The rank file supplied needs to be identical to the one used for the GSEA analysis for the leading-edge calculation to function |
13A(ii) | User does not know which sort option to choose | In the case of multiple conditions or conditions with variable expression profiles (e.g., cancer patient samples), hierarchical clustering tends to generate a more informative visualization | |
13A(vii) | Heat map column names are not colored by dataset | No CLS file was loaded or there is a mismatch between the CLS file and the phenotype definition | If the heat map columns are not colored for a GSEA analysis, make sure the phenotype names specified in the EnrichmentMap input panel match the class names specified in the class file (MesenchymalvsImmunoreactive_RNASeq_classes.cls). Also see Troubleshooting for Step 9B(xiii) |
13A(ix) | The option to save only leading-edge genes is not available | Selection includes more than one node or dataset contains no leading-edge information (i.e., was not built from GSEA results) | The leading edge is available only for GSEA analyses. The option will appear only if the enrichment map was built with GSEA results and a rank file was specified |
13C(ii) | AutoAnnotate has many tunable parameters | The default parameters are likely to work well with EnrichmentMap; however, there are many parameters within the AutoAnnotate application that can fine-tune the results. See the AutoAnnotate user manual at https://autoannotate.readthedocs.io/en/latest/ | |
13C(iii) | Labels contain uninformative words | Node names contain uninformative words that are not excluded by default or are not considered during network normalization | If particular non-informative words keep appearing in the labels generated by AutoAnnotate, try adjusting the WordCloud normalization factor. The significance of each word is calculated on the basis of the number of occurrences in the given cluster of pathways. This causes frequent words such as ‘pathway’ or ‘regulation’ to be prominent. By increasing the normalization factor, we reduce the priority of such recurrent words in cluster labels. If that doesn’t help, you can add the non-informative words to the WordCloud word exclusion list |
13C(iii) | Labels contain ‘-’ | | If a specific character other than a space is used to separate words (e.g., ‘-’ or ‘|’), it should be added as a delimiter in the WordCloud application. Launch the WordCloud application (Apps → WordCloud). In the WordCloud input panel, expand Advanced options. Click on Delimiters… Add your delimiters. Click on OK. In the AutoAnnotate input panel, click on the menu button (icon with three horizontal lines). Select Recalculate Labels… for this change to take effect |
13C(iv) | Labels are bigger for bigger clusters, but user wants all the labels to be the same size | Setting to scale labels to the size of the cluster is enabled | The number of nodes in a cluster determines label size by default. Thus, the cluster size may relate to pathway popularity instead of importance in the experiment. Annotation labels can all be set to the same size by unchecking the option Scale font by cluster size in the AutoAnnotate results panel |
13D(iii) | Pop-up after selecting Collapse all shows up every time user collapses the clusters | User has not selected the Don’t ask me again option in the pop-up | When you click on Collapse All, a pop-up window shows the message ‘Before collapsing clusters please go to the menu Edit → Preferences → Group preferences and select “Enable attribute aggregation”’. There is no need to adjust this parameter repeatedly. If you have already set this parameter, click on Don’t ask me again and then OK |
13D(iii) | Collapsing the network takes a long time | The larger the network or the more clusters in a network, the longer collapsing will take | For large networks, collapsing and expanding may take time. For a quick view of the collapsed network, you can create a summary network by selecting the Create summary Network… option. There are two options for the summary network: clusters only, which creates a summary network with just the circled clusters, or clusters and unclustered nodes, which creates a summary network that also includes the singleton nodes that are not part of any cluster |
13D(iii) | Collapsed network contains gray nodes instead of colored nodes as they were in the precollapsed network | Attribute aggregation is not enabled | If the nodes in the resulting collapsed network are gray, then attribute aggregation was not enabled. Expand the clusters and, before collapsing clusters again, go to Edit → Preferences → Group preferences and select Enable attribute aggregation |
13E(ii) | In the EnrichmentMap input panel, the bottom options Publication Ready and Set Signature Edge Width are not visible | The Node Layout Tools window is open | Close the Node Layout Tools window using the × symbol located at the top right corner |
13F(x) | The created subnetwork is empty | Nodes were not selected before the creation of the subnetwork | Make sure that the nodes that will be part of the subnetwork are selected before creation of the subnetwork |
16 | Exported image contains only a small subset of the network | Only what is visible in the view is exported | In image export, only the visible part of the map will be exported. Make sure that the entire network is visible on your screen before exporting |
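
The spreadsheet procedure described for Step 9A(xi) in the table above (set the fifth column of the ‘negative’ g:Profiler results file to −1, then append the results of the second analysis) can also be scripted. The following is a minimal sketch, not part of the protocol. It assumes that both results files are tab-delimited, share one identical header row, and hold the gene set identifier in the first column. The function name and its parameters are hypothetical, and the value is written as the plain ASCII string "-1".

```python
import csv


def merge_gprofiler_results(positive_path, negative_path, out_path,
                            phenotype_col=4, id_col=0):
    """Combine two g:Profiler result files into one two-phenotype file.

    Assumptions (not from the protocol): both files are tab-delimited,
    start with the same single header row, and keep the gene set ID in
    the first column. phenotype_col=4 is the fifth column (0-based).
    """
    def read_rows(path):
        with open(path, newline="") as handle:
            # Skip completely empty lines so indexing below is safe.
            return [row for row in csv.reader(handle, delimiter="\t") if row]

    neg_rows = read_rows(negative_path)
    pos_rows = read_rows(positive_path)

    header, neg_body = neg_rows[0], neg_rows[1:]
    pos_body = pos_rows[1:]  # drop the second copy of the header

    # Mark every result of the 'negative' analysis with phenotype -1.
    for row in neg_body:
        row[phenotype_col] = "-1"

    # A pathway cannot be included in both the positive and negative sets.
    shared = {r[id_col] for r in neg_body} & {r[id_col] for r in pos_body}
    if shared:
        raise ValueError(f"pathways present in both files: {sorted(shared)}")

    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows([header, *neg_body, *pos_body])
    return len(neg_body), len(pos_body)
```

For example, calling merge_gprofiler_results("up_results.txt", "down_results.txt", "combined_results.txt") with two hypothetical file names would write a combined file that can be supplied as the g:Profiler enrichment results file in the EnrichmentMap interface. The function raises an error rather than silently dropping a pathway that appears in both inputs, so that the limitation noted in the table is made visible.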
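
For the Step 13A(vii) problem in the table above (heat map columns not colored by dataset), the phenotype names can be compared with the class names in the CLS file before the enrichment map is built. The sketch below is illustrative only. It assumes the categorical CLS layout described in the GSEA data format documentation (first line: number of samples, number of classes and the value 1; second line: ‘#’ followed by the class names; third line: one class label per sample). The function names are hypothetical, and treating the comparison as case-sensitive is an assumption.

```python
def read_cls_class_names(cls_text):
    """Return the class names declared on the second line of a CLS file.

    Assumes the categorical CLS layout: a counts line, a '#' line with
    the class names, and a line with one label per sample.
    """
    lines = [ln.strip() for ln in cls_text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected three non-empty lines in the CLS file")
    n_samples, n_classes = (int(tok) for tok in lines[0].split()[:2])
    if not lines[1].startswith("#"):
        raise ValueError("second line of a CLS file should start with '#'")
    class_names = lines[1].lstrip("#").split()
    labels = lines[2].split()
    if len(class_names) != n_classes:
        raise ValueError("number of class names does not match the header")
    if len(labels) != n_samples:
        raise ValueError("number of sample labels does not match the header")
    return class_names


def missing_phenotypes(cls_text, phenotypes):
    """List phenotype names that do not occur among the CLS class names."""
    known = set(read_cls_class_names(cls_text))
    return [name for name in phenotypes if name not in known]


# Made-up four-sample example using the two class names from the protocol.
example = (
    "4 2 1\n"
    "# Mesenchymal Immunoreactive\n"
    "Mesenchymal Mesenchymal Immunoreactive Immunoreactive\n"
)
print(missing_phenotypes(example, ["Mesenchymal", "Immunoreactive"]))  # []
print(missing_phenotypes(example, ["mesenchymal"]))  # ['mesenchymal']
```

An empty list means that every phenotype name has a matching class name. Any name that is returned should be corrected in the EnrichmentMap input panel or in the class file.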