Abstract
Small cell lung cancer (SCLC) is an aggressive disease and challenging to treat due to its mixture of transcriptional subtypes and subtype transitions. Transcription factor (TF) networks have been the focus of studies to identify SCLC subtype regulators via systems approaches. Yet, their structures, which can provide clues on subtype drivers and transitions, are barely investigated. Here, we analyze the structure of an SCLC TF network by using graph theory concepts and identify its structurally important components responsible for complex signal processing, called hubs. We show that the hubs of the network are regulators of different SCLC subtypes by analyzing first the unbiased network structure and then integrating RNA-seq data as weights assigned to each interaction. Data-driven analysis emphasizes MYC as a hub, consistent with recent reports. Furthermore, we hypothesize that the pathways connecting functionally distinct hubs may control subtype transitions and test this hypothesis via network simulations on a candidate pathway and observe subtype transition. Overall, structural analyses of complex networks can identify their functionally important components and pathways driving the network dynamics. Such analyses can be an initial step for generating hypotheses and can guide the discovery of target pathways whose perturbation may change the network dynamics phenotypically.
Subject terms: Dynamic networks, Computer modelling, Computational science, Cancer, Regulatory networks
Introduction
Throughout their evolution, cells differentiate and specialize into different subtypes that are often controlled by underlying molecular-level mechanisms1–3. This process is generally pictured by the famous metaphor of a ball rolling down a hill, called the Waddington Landscape4. Analogous to a ball rolling down a hill, which may change its direction by the effect of obstacles in its way, lose its kinetic energy, slow down, and eventually reside at a stable point, cells may change their trajectories and differentiate to different subtypes due to some regulatory or evolutionary triggers while they are maturing. Similarly, due to abnormalities, stochasticity, or other unknown reasons, they may diverge from their trajectories and become cancerous cells5. Moreover, cancerous cells may also evolve and differentiate into other subtypes6–8. Therefore, developing effective treatments for cancer has been a challenge due to heterogeneous cell subpopulations that appear within a tumor. Genetic or non-genetic mechanisms can drive the cancerous cell subpopulations via plasticity, drug-induced selection, or state transitions between the subtypes and have them escape the treatment or recur with a resistance to the treatment9–11, which is the case in multiple cancer types such as breast cancer12,13, melanoma14, and small cell lung cancer (SCLC)15–20.
SCLC is an extremely aggressive disease with a low survival rate21–25 (7% 5-year survival as of 202226). Although it was characterized as molecularly homogeneous due to loss of TP53 and RB1, and neuroendocrine/epithelial differentiation27,28, SCLC was shown to be heterogeneous29–37 by the identification of its mixtures of transcriptional subtypes such as neuroendocrine (NE) stem-cell-like subtype centered on the expression of the transcription factors ASCL1 and NEUROD135 and non-neuroendocrine (NON-NE) subtype centered on the expression of the transcription factor YAP136. Overall, the SCLC subtypes have been classified into four classes, SCLC-A (also labeled as NE), SCLC-N (also labeled as NEv1), SCLC-Y (also labeled as NON-NE), and SCLC-P, defined by the expression of the transcription factors ASCL1 (A), NEUROD1 (N), YAP1 (Y), and POU2F3 (P), respectively29–37. Recently, a fifth subtype named SCLC-A2 (also labeled as NEv2), which is driven by ASCL1 but distinct from the SCLC-A neuroendocrine subtype, has also been proposed38. At the early stages of the disease, the cancerous cell population contains the NE type cells, and then over time the population begins to include the NON-NE subtype that is more treatment-resistant34,39,40, indicating that a subtype transition is happening. In addition to various subtypes with different levels of resistance to treatment, such transitions between the subtypes further complicate the treatment of the disease. Therefore, understanding molecular heterogeneity in SCLC is essential for developing more precise, tailored treatments to cure the pathology.
Transcription factor (TF) networks have been the focus of studies to understand the mechanism of the disease and to identify different SCLC subtypes as they are associated with the overexpression of different transcription factors30,34,37,38,41. These networks have been mechanistically analyzed at the systems level, which led to the identification of regulators and destabilizers of different subtypes30,34,38 and has contributed to our understanding of the underlying gene regulatory system. However, the structures of these networks have barely been studied beyond work from about a decade ago42. It has been shown in many studies that the structure of a network can be as informative as its dynamical features and that its analysis may help to identify key components associated with fundamental functional behaviors43–45. Specifically, hubs (Box 1) of the networks are shown to have key functional properties46–51.
In this study, we analyze the topology of the SCLC TF network (Fig. 1), which was provided in34,38 and has been key in the identification of different SCLC subtypes. It comprises literature-based connections that are verified against ChEA, a database of ChIP-seq-derived interactions52. Overall, the network consists of 35 TFs connected through 239 activatory and inhibitory interactions (red and green arrows in Fig. 1, respectively). Combinational ON–OFF states of the TFs in this network have been shown to drive cells toward different subtypes34. Here, one of our goals is to identify the hubs of the SCLC TF network, which are the special nodes that interconnect several key pathways and play an important role in collecting, processing, and distributing key signals throughout the signaling mechanism. We hypothesize that the hubs might be important for the overall network dynamics and may help to identify specific TFs that regulate SCLC subtypes. Furthermore, although the earlier studies elucidate regulators of different SCLC subtypes, they lack mechanisms of subtype transitions, whose understanding is critical to controlling disease progression. We also hypothesize that the pathways connecting the functionally distinct hubs may have roles in the subtype transitions.
To identify the hubs of the SCLC TF network, we implement a graph theory concept called Dense Spanning Tree (DST, see Box 1), which can be found by solving an optimization problem (Methods Section Dense Spanning Trees of the unbiased SCLC TF network)53–55. We initially analyze a relatively unbiased network structure by considering the undirected and unweighted network. In other words, we only consider whether two nodes are interacting and do not consider the type and direction of interaction. Later, we integrate previously published RNA-seq data into our analysis in the form of the probability of each interaction occurring34,38, which is assigned to each interaction as a weight. To identify the hubs given the weighted network graph, we extend the DST concept into the Minimum Dense Spanning Tree (MDST, see Box 1) concept, for which the DST optimization problem is extended into a multi-objective optimization problem (Methods Section Integrating data into the structural analysis: Minimum Dense Spanning Tree). Interestingly, all the found hubs are either regulators or destabilizers of the previously identified SCLC subtypes, as elaborated in the Results section. Next, we test a pathway connecting the two functionally distinct hubs via simulations and observe a transition from the NON-NE to NE subtype. Furthermore, running and tracking several asynchronous NON-NE to NE transition simulations suggests additional TFs other than the hubs that may have a role in this transition.
The paper is organized as follows. First, we present the results of the DST and MDST analyses of the SCLC TF network in the Results Sections Structural analysis of the unbiased SCLC TF network identifies some of the known SCLC subtype regulators and destabilizers and Data-driven structural analysis of the SCLC TF network highlights MYC as a hub in addition to those previously identified as subtype regulators and destabilizers. Then, we present the results of the asynchronous subtype transition simulations in the Results section The pathways connecting the SCLC TF network hubs may have a role in SCLC subtype transitions: NON-NE to NE transition occurs when FLI1–ASCL1–MITF pathway is active. Next, we provide the mathematical details of the DST and MDST analyses as well as the details of the transition simulations in Methods Sections Dense Spanning Trees of the unbiased SCLC TF network, Integrating data into the structural analysis: minimum dense spanning trees, and SCLC TF network subtype transition simulations, respectively. In addition, we compare the DST and MDST analysis results in the Supplementary Information. Finally, we close the paper with concluding remarks.
Box 1: Brief Definitions.
Graph is a collection of objects (points) linked together based on some pairwise relations. Figure B1–1 is an example of a graph (G) with the vertex set V = {a, b, c, d, e}. Some random weights are assigned to the edges for exemplary purposes.
Tree is an acyclic graph, i.e., a graph that does not contain any cycles (loops). Figure B1–2 is an example of a tree.
Node (Vertex) is an individual object (point) in a graph. “a” in Figure B1–1 is an example of a node in the graph.
Edge is a link connecting two nodes in a graph. The link connecting “a” and “b” in Figure B1–1 is an example of an edge.
Node Degree is the number of edges connected to the node.
For more details on basic Graph Theory definitions, please see67.
Given a graph G with a vertex set V:
Spanning Tree (ST) is a subset of G that contains all the vertices in V with the minimum number of edges (N-1 edges for a graph with N nodes) connecting all the nodes54. Spanning trees are not unique and are known as the basis of the graph. Figure B1–2 is an example of an ST. It contains all the vertices in G with the minimum number of edges.
Minimum Spanning Tree (MST) is a spanning tree that minimizes the total weights assigned to the edges. Figure B1–3 is an example of an MST. It is an ST and it minimizes the total edge weights.
Dense Spanning Tree (DST) is a special spanning tree that minimizes the total distances between the vertices54. Figure B1–4 is an example of a DST. It ignores the edge weights but minimizes the total distances between the nodes. Note that the distance between two nodes here is defined as the number of edges in the shortest path between the nodes, e.g., the distance between “a” and “e” in Figure B1–1 is two.
Minimum Dense Spanning Tree (MDST) is a special spanning tree of a weighted graph that minimizes the total distances between the vertices while minimizing the total weights assigned to the edges. Figure B1–5 is an example of an MDST. It minimizes both the total distances between the nodes and the total weights assigned to the edges.
Hub is a node (component) of a graph (network) that has an above-average number of connections66. Node “b” in Figure B1–4 is an example of a hub, which has a higher node degree and connects multiple nodes.
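As an illustration of the definitions above, the following minimal Python sketch computes node degrees and the total pairwise distance for two spanning trees on five nodes. The node labels follow Box 1, but both edge lists are hypothetical and are not the trees drawn in Figure B1; the function names are ours.

```python
from collections import deque
from itertools import combinations

def adjacency(nodes, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def hop_distances(adj, source):
    """Breadth-first search: number of edges on the shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def total_distance(nodes, edges):
    """Sum of shortest-path lengths over all unordered node pairs."""
    adj = adjacency(nodes, edges)
    return sum(hop_distances(adj, u)[v] for u, v in combinations(nodes, 2))

nodes = ["a", "b", "c", "d", "e"]
# Hypothetical spanning trees (4 edges each), not the trees of Figure B1.
path_tree = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
star_tree = [("b", "a"), ("b", "c"), ("b", "d"), ("b", "e")]

path_total = total_distance(nodes, path_tree)  # chain of nodes
star_total = total_distance(nodes, star_tree)  # everything attached to "b"
star_degrees = {n: len(nb) for n, nb in adjacency(nodes, star_tree).items()}
```

In this toy case the star-shaped tree has the smaller total distance, and its center "b" is the only node with a degree above one, which is the sense in which a DST highlights hubs.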
Results
In our analyses, given the SCLC TF network (Fig. 1), we search for hubs of the network by finding the substructure DSTs (Box 1). The DST of a given network contains hubs that are known to be structurally important nodes interconnecting several pathways. Due to their high and strategic connectedness, they are very likely to have functional importance as well. This concept has many applications in different areas such as telecommunications networks, social networks, resource allocation, and biological networks55.
In biological networks, the DSTs of the network are substructures that preserve the shortest pathways between the nodes (TFs) and hence they preserve the maximum influence among the individual components while highlighting a few nodes as the hubs. Since the identified hubs connect several pathways, they receive many signals from their peripherals, process them, and distribute them to multiple other nodes. Therefore, in general, they have functional importance as well46–51. Also, depending on the size of the initial network, the identified DSTs may contain multiple hubs. Because the hubs are individually important, the pathways connecting them might also be important, as these pathways carry complex signaling between the hubs. In this section, we show that the hubs of the SCLC TF network are relevant to the SCLC subtypes. Additionally, we test a pathway connecting two identified hubs via network simulations. All the results are elaborated in the following subsections.
Structural analysis of the unbiased SCLC TF network identifies some of the known SCLC subtype regulators and destabilizers
We start our analysis by converting the SCLC TF network (Fig. 1) into an undirected, unweighted network (see Methods Section Dense spanning trees of the unbiased SCLC TF network). In this way, we just focus on whether interactions between two nodes exist without considering their interaction types, directionality, or weights (i.e., probabilities), which allows us to minimize bias on the network structure. Then, we searched for the DSTs of the SCLC TF network following the approach of Ref. 55. Upon solving the global optimization problem in Eq. (1) (Methods Section Dense spanning trees of the unbiased SCLC TF network), we observed 146,143 DSTs, all having the same optimum total distances between the TFs. Examples of the found DSTs are presented in Fig. 2. In one of the DSTs, FLI1 and MITF are identified as the hubs (Fig. 2a) while in the other DST, FLI1, ASCL1, and FOXA1 are identified as the hubs (Fig. 2b). Since different DSTs may highlight different TFs as the hubs, we computed the average node degrees (Box 1) of the nodes among all the found 146,143 DSTs, which is collectively presented in Fig. 3. As seen in the figure, FLI1 is a major hub with about 20 connections on average among all the found DSTs. In addition, MITF, ASCL1, NR0B1, and FOXA1 are the other hubs with relatively high average node degrees in some DSTs.
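The averaging step described above can be sketched as follows. This is a minimal illustration that assumes each DST is available as a list of undirected edges; the two small trees are hypothetical and use the node labels of Box 1 rather than the actual TFs.

```python
from collections import Counter

def average_degrees(trees):
    """Average degree of each node over a collection of trees (edge lists)."""
    totals = Counter()
    for edges in trees:
        for u, v in edges:
            totals[u] += 1
            totals[v] += 1
    return {node: count / len(trees) for node, count in totals.items()}

# Two hypothetical spanning trees on the same five nodes (illustrative only).
tree_1 = [("b", "a"), ("b", "c"), ("b", "d"), ("c", "e")]
tree_2 = [("b", "a"), ("a", "c"), ("b", "d"), ("b", "e")]

avg = average_degrees([tree_1, tree_2])
# Nodes ranked from the highest to the lowest average degree.
ranked = sorted(avg, key=avg.get, reverse=True)
```

A node that is highly connected in every tree, such as "b" here, keeps a high average, whereas a node that is well connected in only some trees receives an intermediate value.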
The major and side hubs found here are not only structurally important but have also been shown to be biologically important for the identified SCLC subtypes. For instance, FLI1, the major hub in Fig. 3, is shown to be one of the regulators of the SCLC NE subtype34,56,57. Similarly, ASCL1, NR0B1, and FOXA1 are reported to be among the regulators of the SCLC NE and NEv2 subtypes, and MITF is reported as one of the regulators of the SCLC NON-NE subtype34, which shows the specificity of the hubs of the SCLC TF network.
Data-driven structural analysis of the SCLC TF network highlights MYC as a hub in addition to those previously identified as subtype regulators and destabilizers
Next, we repeat our hub search by integrating experimental data into the analysis. The data are the individual probabilities of each interaction between the TFs in the SCLC TF network (Fig. 1), extracted from RNA-seq data34. The probabilities are integrated into the network structure as the weights that are assigned to the associated edges. Then, to identify the hubs of the weighted SCLC TF network, we extend the DST concept into MDST (Box 1), for which we solve an extended multi-objective optimization problem (Methods Section Integrating data into the structural analysis: minimum dense spanning trees). Unlike DSTs, MDSTs allow us to highlight the hubs while preserving the maximum likelihood of the interactions.
Upon solving the optimization, we observed only 46 MDSTs, far fewer than the number of DSTs (146,143) found with the unbiased network structure. This means that an analysis guided by prior knowledge, i.e., the experimental data, can constrain the search space more efficiently. Once we compute the average node degrees among the found MDSTs, we observe that FLI1 is still the major hub (Fig. 4). Similarly, ASCL1 and MITF are still identified as hubs but this time with higher average node degrees compared to the unbiased network analysis (Fig. 4). In other words, they become more prominent hubs, which coincides with their biological importance in SCLC as reported in the literature30,31,34,38,40,58–60. Interestingly, the data-driven structural analysis further reveals MYC as another hub (Fig. 4), which does not appear in the unbiased network analysis (Fig. 3). Recently, MYC was shown to be one of the key TFs for SCLC32,61–63, which initiates Notch signaling to reprogram neuroendocrine fate from NE to NEv1 to NEv2 to NON-NE states40. Overall, our observations support that structurally important nodes are very likely to be functionally significant as well. Therefore, such structural analyses could be an initial step in the analysis of complex intracellular networked processes because of their potential to pinpoint important network components, which would guide experimental target discovery.
The pathways connecting the SCLC TF network hubs may have a role in SCLC subtype transitions: NON-NE to NE transition occurs when FLI1 – ASCL1 – MITF pathway is active
The SCLC TF network contains multiple hubs with varying average node degrees. These hubs are shown to have distinct functional features in terms of SCLC subtypes, as elaborated in the previous sections, which leads us to a question: Do the pathways connecting different hubs that are identified as regulators of different SCLC subtypes have any role in subtype transition? For instance, FLI1 and MITF are the two major hubs identified in both the unbiased (Fig. 3) and data-driven (Fig. 4) structural analyses. One of the pathways connecting these two hubs is through FLI1–ASCL1–MITF. That FLI1 is a regulator of the SCLC NE subtype, MITF is a regulator of the NON-NE subtype, and ASCL1 is a destabilizer of the NON-NE subtype and a regulator of the NE subtypes34 suggests that this pathway has a potential role in the NON-NE to NE subtype transition. One can also identify such structurally important pathways by checking the interactions remaining in the found DSTs and MDSTs with high probability, as exemplified in Supplementary Information.
To test the possible role of this pathway in the NON-NE to NE subtype transition, here we simulate the SCLC TF network using a tool called BooleaBayes34 that automatically infers gene regulatory mechanisms, based on Boolean logic models, and links inputs and output states tailored to -omics datasets such as those from RNA-seq data. Upon setting the network’s initial state to NON-NE subtype based on previously identified combinational ON-OFF states of the TFs34, keeping the FLI1–ASCL1–MITF pathway active, and running asynchronous network simulation (i.e., one TF is randomly picked and updated at each iteration) using the extracted logic rules (Methods section SCLC TF network subtype transition simulations), we observe a transition from NON-NE to NE subtype (Fig. 5).
Dynamic analysis of asynchronous NON-NE to NE subtype transition simulations
Although the NON-NE to NE subtype transition was observed by keeping the FLI1–ASCL1–MITF pathway active, there are possibly other TFs and dominant pathways that contribute to the transition. Identifying those TFs and dominant pathways may reveal how the system mechanistically executes such transitions and allow us to identify potential other TFs playing a role in the transition. Therefore, as the next step, we run 700 asynchronous NON-NE to NE subtype transition simulations and keep track of all the iterations. Then, we compute the Longest Common Sequence (LCS) based distance (Methods section Distance measure between instantaneous network state and NE subtype) between the target SCLC Boolean NE state and the instantaneous network state at each iteration (Methods section SCLC TF network subtype transition simulations). As seen in Fig. 6, throughout the NON-NE to NE transition, the network state dynamically alternates between NON-NE and NE subtypes through many distance-increasing and -decreasing patterns until it finally converges to the NE state. This means that some reaction patterns drive the cells toward the NE subtype (distance-decreasing patterns in Fig. 7) whereas some other reaction patterns drive the cells toward the NON-NE subtype (distance-increasing patterns in Fig. 7).
Overall, the 700 asynchronous NON-NE to NE subtype transition simulations, in which transition occurs in the order of 10^5 asynchronous iterations, contain about 7 × 10^5 distance-increasing and 5 × 10^5 distance-decreasing patterns. To see which TFs appear most in the distance-increasing and -decreasing patterns, we compute their frequencies (Fig. 8). Interestingly, four TFs, namely ASCL1, FLI1, NR0B1, and CEBPD, appear more than the other TFs in the distance-decreasing patterns (Fig. 8a) whereas the same four TFs appear less than the others in the distance-increasing patterns (Fig. 8b). This means that in addition to ASCL1 and FLI1, which are part of the identified NON-NE to NE transition pathway, NR0B1 and CEBPD may have a regulatory involvement in this transition as well. Moreover, throughout all the asynchronous iterations among the 700 NON-NE to NE transitions, we compute, for each TF, the number of iterations on which an update of the TF causes an increase in the distance between the network’s instantaneous state and the NE subtype. As seen in Fig. 9a, in addition to ASCL1 and FLI1, which never drive the cells toward the NON-NE subtype, NR0B1 and CEBPD are the two TFs that have a lower effect on the increase in the distance between the network state and the NE subtype compared to the others, which further supports their possible regulatory involvement in the NON-NE to NE subtype transition. Furthermore, we compute the probability of TFs being ON at the network state during the initiation of distance-decreasing patterns (Fig. 9b). With about 0.2 probability of being ON, NR0B1 seems to drive the cells toward the NE subtype by mostly being OFF, whereas the activity status of CEBPD seems not very important as its probability of being ON is very close to 0.5. Additionally, Fig. 9b suggests that whenever ISL1 and FOXA2 appear in the distance-decreasing patterns, which is very likely as seen in Fig. 8a, they are mostly ON with relatively high probabilities, which implies that they may have a role in the NON-NE to NE transition.
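The per-TF bookkeeping behind such counts can be sketched as below. This is a simplified illustration that assumes a simulation trace is stored as pairs of the updated TF and the distance to the target state after that update; the trace, the placeholder TF names, and the function name are hypothetical and do not reproduce the reported data.

```python
from collections import Counter

def tally_distance_changes(trace, initial_distance):
    """Count, per TF, the updates that raised or lowered the distance to the target."""
    increases, decreases = Counter(), Counter()
    previous = initial_distance
    for tf, distance in trace:
        if distance > previous:
            increases[tf] += 1
        elif distance < previous:
            decreases[tf] += 1
        previous = distance  # updates that leave the distance unchanged are not counted
    return increases, decreases

# Hypothetical trace: (TF updated at this iteration, distance to target afterwards).
trace = [("TF1", 5), ("TF2", 6), ("TF3", 4), ("TF1", 4), ("TF2", 5), ("TF3", 3)]
increases, decreases = tally_distance_changes(trace, initial_distance=6)
```

In this made-up trace, TF2 only ever raises the distance while TF3 only lowers it, which is the kind of asymmetry the frequency comparison is meant to expose.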
Overall, the presented results suggest that structural analysis of biological networks may guide the identification of functionally important molecules. More specifically, the DST concept, extended here to MDST by integrating data, can identify hubs of the networks, which can be potential targets in experiments due to their involvement in complex biological processes. Focusing on the SCLC TF network analyzed in this work, all the identified hubs in both the unbiased and data-driven analyses show biological importance in terms of SCLC subtype regulation and destabilization, as supported by the literature. Moreover, integrating data into the structural analysis highlights MYC as another hub whose importance in SCLC subtypes has recently been discovered32,61–63. This observation further supports those previously reported results. Furthermore, the ability to identify multiple hubs that have distinct functional roles in SCLC subtypes lets us scrutinize the pathways connecting the hubs. Upon asynchronously simulating the network by keeping the pathway connecting FLI1 and MITF, the two major hubs, active, we observed a transition from the NON-NE to the NE subtype. In addition, analysis of 700 asynchronous NON-NE to NE transition simulations suggests other TFs that may play a role in this transition. As a result, an analysis that starts from a pure network structure can lead toward an understanding of the underlying mechanism of a complex biological system, which is noteworthy.
Methods
Dense spanning trees of the unbiased SCLC TF network
Given the SCLC TF network (Fig. 1), to analyze its structure and identify the hubs (Box 1) that are potentially fundamental in terms of their roles in complex biological processes, we search for the substructures called dense spanning trees (DSTs, Box 1). Suppose G is a graph that represents the SCLC TF network, V(G) is the set of nodes that represent the TFs in the network and E(G) is the set of edges that represent the interactions between the TFs in the network. Then, the DST of G is a substructure that minimizes the total distances between the TFs and contains all the TFs in V(G) with a minimum number of interactions while highlighting some nodes with high connectedness, i.e., the hubs. In other words, the DSTs are the subnetworks of the SCLC TF network that comprise the hubs and the shortest pathways from the hubs to all other TFs, preserving the maximum biological influence.
To identify the hubs of the SCLC TF network, we minimize possible bias to the network structure by removing all the edge directions, i.e., the information on which node influences the other, and the edge types, i.e., the information on activating and inhibitory interactions, and by not using any data on the strength of the connections, i.e., the probabilities of the interactions (Supplementary Figure 1). Then, the DSTs of the network are obtained by solving the following optimization55:
For the graph $G$ with vertex set $V(G) = \{v_1, v_2, \ldots, v_N\}$ where $N = |V(G)|$, and edge set $E(G) = \{e_1, e_2, \ldots, e_M\}$ where $M = |E(G)|$,
$$\min_{E_s \subseteq E(G)} \; \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{T(E_s)}(v_i, v_j) \tag{1}$$
in which $T(E_s)$ denotes the minimum spanning tree obtained from $E_s$ that is a subset of $E(G)$, and $d_{T(E_s)}(v_i, v_j)$ is the distance between nodes $v_i$ and $v_j$ defined as the total number of edges in the shortest pathway between $v_i$ and $v_j$. The main idea here is to find the optimal subset(s) of edges E(G) from which the constructed DST has the optimal objective value which is the total distances between the individual nodes. For more mathematical details and possible applications of this approach, we refer the reader to54,55. Upon solving the optimization problem (1) via Genetic Algorithm (GA), which is a metaheuristic optimization method that attempts to find the global optimum or at least its good approximation64, we observed 146,143 DSTs with the same objective value.
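For intuition, the same objective can be solved exactly on a very small graph. The sketch below is an illustration only: it replaces the Genetic Algorithm with an exhaustive enumeration of spanning trees, which is feasible only for toy graphs, and it uses a hypothetical five-node graph rather than the SCLC TF network. The function names are ours.

```python
from collections import deque
from itertools import combinations

def tree_total_distance(nodes, edges):
    """Sum of pairwise hop distances, or None if the edges do not connect all nodes."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return None  # disconnected, so not a spanning tree
        total += sum(dist.values())
    return total // 2  # each unordered pair was counted twice

def dense_spanning_trees(nodes, edges):
    """Exhaustively find all spanning trees with the minimum total pairwise distance."""
    best, best_trees = None, []
    # Any connected set of N-1 edges on N nodes is a spanning tree.
    for subset in combinations(edges, len(nodes) - 1):
        total = tree_total_distance(nodes, subset)
        if total is None:
            continue
        if best is None or total < best:
            best, best_trees = total, [subset]
        elif total == best:
            best_trees.append(subset)  # keep ties, as several DSTs may share the optimum
    return best, best_trees

# Hypothetical undirected, unweighted toy graph (not the SCLC TF network).
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("b", "e"),
         ("c", "d"), ("d", "e"), ("a", "c")]
best, best_trees = dense_spanning_trees(nodes, edges)
```

On this toy graph the only optimal tree is the star centered on "b". The number of candidate edge subsets grows combinatorially with network size, which is why a metaheuristic is a natural choice for a network with hundreds of interactions.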
Integrating data into the structural analysis: minimum dense spanning trees
As the next step, we blend this pure structural analysis with data, namely the probabilities of the interactions, i.e., the strengths of the connections estimated from RNA-seq data via the probabilistic Boolean rules by Wooten et al.34. They are the difference of means for a particular node when the parent node is on versus off. To elaborate, suppose FLI1 regulates ASCL1. Then, the weight for the edge between FLI1 and ASCL1 is the mean probability of ASCL1 turning on when FLI1 is on minus the mean probability of ASCL1 turning on when FLI1 is off across the samples, i.e., P(ASCL1 = 1 | FLI1 = 1) - P(ASCL1 = 1 | FLI1 = 0). So, if ASCL1 is always on when FLI1 is on, and always off when FLI1 is off, then the edge weight = 1. These probabilities are integrated into the network structure as the weights that are assigned to the associated edges. The source code for computing these probability values was provided in Wooten et al.34 (see their BooleaBayes source code on GitHub).
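The difference of conditional probabilities above can be illustrated with a short sketch. It is a simplification that estimates both probabilities directly from paired binary samples, whereas the actual values come from the BooleaBayes probabilistic rules; the sample vectors and the function name are hypothetical.

```python
def edge_weight(parent_states, child_states):
    """P(child = 1 | parent = 1) - P(child = 1 | parent = 0) from paired binary samples."""
    child_when_on = [c for p, c in zip(parent_states, child_states) if p == 1]
    child_when_off = [c for p, c in zip(parent_states, child_states) if p == 0]
    if not child_when_on or not child_when_off:
        raise ValueError("need samples with the parent both ON and OFF")
    p_on = sum(child_when_on) / len(child_when_on)
    p_off = sum(child_when_off) / len(child_when_off)
    return p_on - p_off

# Hypothetical binarized samples (1 = ON, 0 = OFF); not real expression data.
fli1 = [1, 1, 1, 1, 0, 0, 0, 0]
ascl1_perfect = [1, 1, 1, 1, 0, 0, 0, 0]  # child mirrors the parent exactly
ascl1_noisy = [1, 1, 1, 0, 0, 0, 0, 1]    # one mismatch in each group

w_perfect = edge_weight(fli1, ascl1_perfect)  # 1.0 - 0.0
w_noisy = edge_weight(fli1, ascl1_noisy)      # 0.75 - 0.25
```

The first case reproduces the limiting example in the text, where the child is always on when the parent is on and always off otherwise, giving a value of 1.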
To identify the hubs of the weighted SCLC TF network, here we reformulate the optimization problem constructed to find DSTs in Eq. (1) as a multi-objective optimization problem given in Eq. (2) and call the resulting optimal trees minimum dense spanning trees (MDSTs, Box 1). MDSTs add another layer of information to the found trees by preserving the maximum likelihood of the interactions in addition to the minimum total distances between the nodes while highlighting the hubs of the network. More precisely, MDSTs of the SCLC TF network are the subnetworks that preserve the most probable interactions as well as the maximum biological influence between the TFs via the shortest pathways through the hubs. Note that one can assign different weights to the interactions by different means, such as the mutual information between the TFs extracted from experimental data. In this case, the MDSTs will be the substructures that preserve the highest mutual information in addition to the shortest pathways through the hubs.
To find the MDSTs of the SCLC TF network, we extend Eq. (1) as follows: Suppose for each interaction $e_k \in E(G)$, we are given a probability $p_k$, that is, the probability of the existence of the interaction. Then, for the graph $G$ with vertex set $V(G) = \{v_1, v_2, \ldots, v_N\}$ where $N = |V(G)|$, and edge set $E(G) = \{e_1, e_2, \ldots, e_M\}$ where $M = |E(G)|$ with associated weights $\{w_1, w_2, \ldots, w_M\}$:
$$\min_{E_s \subseteq E(G)} \left( \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{T(E_s)}(v_i, v_j), \;\; \sum_{k=1}^{M} w_k \, \mathbb{1}_{T(E_s)}(e_k) \right) \tag{2}$$
in which $T(E_s)$ denotes the minimum spanning tree obtained from $E_s$ that is a subset of $E(G)$, $d_{T(E_s)}(v_i, v_j)$ is the distance between nodes $v_i$ and $v_j$, and $\mathbb{1}_{T(E_s)}(e_k)$ results in 1 if the edge $e_k$ is in $T(E_s)$ and 0 otherwise. Here, the first objective function is the minimization of the total sum of distances between the nodes whereas the second objective function is the minimization of the sum of weights assigned to each edge, which is the same as the maximization of the sum of the probabilities that the selected interactions exist, based on the definition of weights. Once we solved the multi-objective optimization problem (2) by GA, we observed 46 MDSTs all having the same objective value, which shows the effect of prior knowledge on narrowing down the search space.
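The trade-off between the two objectives can be made concrete on a toy graph. The sketch below is an illustration under stated assumptions: it enumerates all spanning trees exhaustively instead of using GA, and it reports the trees that are not dominated in either objective (a Pareto filter) rather than reproducing how the two objectives are combined into a single value in our analysis. The graph, the weights (lower meaning a more probable interaction), and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def total_distance(nodes, edges):
    """Sum of pairwise hop distances, or None if the edges do not connect all nodes."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return None
        total += sum(dist.values())
    return total // 2  # each unordered pair was counted twice

def pareto_spanning_trees(nodes, weights):
    """Enumerate spanning trees and keep those not dominated in (distance, weight)."""
    scored = []
    for subset in combinations(sorted(weights), len(nodes) - 1):
        dist = total_distance(nodes, subset)
        if dist is not None:
            scored.append((dist, sum(weights[e] for e in subset), subset))
    # A tree is dropped if another is no worse in both objectives and better in one.
    return [
        s for s in scored
        if not any(o[0] <= s[0] and o[1] <= s[1] and (o[0] < s[0] or o[1] < s[1])
                   for o in scored)
    ]

# Hypothetical weighted toy graph; weights are illustrative only.
nodes = ["a", "b", "c", "d"]
weights = {("a", "b"): 0.25, ("b", "c"): 0.25, ("b", "d"): 0.75, ("c", "d"): 0.25}
front = pareto_spanning_trees(nodes, weights)
objectives = [(dist, weight) for dist, weight, _ in front]
```

Here the star centered on "b" has the smallest total distance but must use the heaviest edge, while the chain a-b-c-d has the smallest total weight but a larger total distance; a third tree is worse on both counts and is filtered out.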
SCLC TF network subtype transition simulations
To see how important the pathways connecting the hubs having distinct functional features are, we simulate the SCLC TF network using a tool called BooleaBayes34. BooleaBayes is a Boolean rule-fitting algorithm that infers local regulatory mechanisms near stable cell subtypes from gene expression data. The approach has previously been applied to the SCLC TF network (Fig. 1) to identify and rank master regulators and master destabilizers of SCLC subtypes assuming binary, i.e., ON and OFF, activity states of each transcription factor (Supplementary Figure 2). Further details of BooleaBayes and how it infers the logic rules can be found in34.
Using the Boolean rules extracted via BooleaBayes, we test the role of the FLI1–ASCL1–MITF pathway, in which FLI1 and MITF are the two major hubs found by both DST and MDST approaches, in the NON-NE to NE subtype transition. This is hypothesized due to FLI1 being a regulator of the SCLC NE subtype, MITF being a regulator of the NON-NE subtype, and ASCL1 being a destabilizer of the NON-NE subtype and regulator of the NE subtype34. In other words, FLI1 and MITF are two functionally distinct hubs identified by DST/MDST analyses and ASCL1 connects these hubs. Note that FLI1–ASCL1–MITF is only one of the candidate pathways connecting these two hubs. We picked this pathway based on prior knowledge from the literature. Nevertheless, if one does this analysis in the same way without any prior knowledge and tries all possible candidates, the FLI1–ASCL1–MITF pathway will still be identified as one of the candidate pathways that result in a subtype transition.
First, we set the initial state of the network to the NON-NE subtype using the logic TF states in Supplementary Figure 2. Then, we simulate the network using a general asynchronous update scheme with the inferred Boolean rules while keeping the FLI1–ASCL1–MITF pathway active by setting ASCL1 and FLI1 always ON. At each iteration, we randomly select a node and fetch its probability of being ON based on its parent nodes’ instantaneous state from the Boolean lookup tables generated by BooleaBayes. Then, based on the probability value, we flip a weighted coin to set the selected node’s state to ON or OFF. After updating the selected node’s state, we compare the overall network’s state to the target state. After several asynchronous update/compare iterations (usually in the order of 10^5), the network converged to one of the NE subtype Boolean states (Supplementary Figure 2). The stopping criterion for the simulation is that either the network state equals the target state or the simulation reaches the maximum number of iterations, which we set to 10^6 (three times more than the typical number of iterations needed for such a transition based on our experience).
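The update loop described above can be sketched in a few lines. This is a minimal illustration on a hypothetical three-node chain: it does not use the BooleaBayes code or its inferred lookup tables, and the node names, probabilities, and the function `simulate_async` are assumptions made for the example.

```python
import random

def simulate_async(initial, target, parents, p_on, pinned, max_iters=10_000, seed=0):
    """Asynchronous probabilistic Boolean simulation; returns (final_state, iterations)."""
    rng = random.Random(seed)
    state = dict(initial)
    state.update(pinned)  # pinned nodes are held at their forced value
    names = sorted(state)
    for step in range(1, max_iters + 1):
        node = rng.choice(names)  # pick one node at random
        if node not in pinned:
            key = tuple(state[p] for p in parents[node])
            # Weighted coin flip using the looked-up probability of being ON.
            state[node] = 1 if rng.random() < p_on[node][key] else 0
        if state == target:  # stop as soon as the target state is reached
            return state, step
    return state, max_iters

# Hypothetical chain X -> Y -> Z with made-up ON probabilities per parent state.
parents = {"X": (), "Y": ("X",), "Z": ("Y",)}
p_on = {"Y": {(1,): 0.9, (0,): 0.1}, "Z": {(1,): 0.8, (0,): 0.2}}
initial = {"X": 0, "Y": 0, "Z": 0}
target = {"X": 1, "Y": 1, "Z": 1}

final, steps = simulate_async(initial, target, parents, p_on, pinned={"X": 1})
```

Because one node is drawn at random per iteration, the number of iterations needed to reach the target differs between seeds even when the rules and the pinned nodes are the same.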
We have also tested various activity statuses of this pathway to see under which conditions such a transition occurs. Keeping FLI1 and ASCL1 always inactive does not result in a NON-NE to NE transition, which is intuitive because the target NE state requires them to be active and they are forced to be inactive. Similarly, keeping FLI1 active and ASCL1 inactive, or vice versa, does not result in a transition either. Keeping one of them active and not forcing the other one to any state resulted in a NON-NE to NE transition in a few instances (5% of the simulations). We believe this is due to the random nature of the update scheme, which occasionally produced the “right” conditions for such a transition. On the other hand, keeping FLI1 and ASCL1 always active results in this transition at every single run (100% of the simulations). Note that due to the nature of the asynchronous update scheme, the convergence of the system to the NE subtype may occur in a different number of iterations and with different update patterns at each run of the simulations.
Distance measure between instantaneous network state and NE subtype
To track the network state and understand its dynamic behavior throughout the NON-NE to NE transition, we compute the distance between the network’s instantaneous state at each iteration and the target NE subtype. The distance metric we chose is the longest common subsequence (LCS) metric65 because it is sensitive to order differences between the network state and the target state and because it can be applied to vectors of the same or different lengths. Overall, the LCS-based distance measures the difference between two sequences as the cost of the insertion and deletion operations required to transform one sequence into the other. Given two vectors $x$ and $y$ of length $n$, which in our case represent the network state and the target state, respectively, the LCS-based distance is defined as follows:
$$d_{\mathrm{LCS}}(x, y) = 2n - 2\,\mathrm{LCS}(x, y) \qquad (3)$$
where $\mathrm{LCS}(x, y)$ is the number of elements in $x$ that uniquely match the elements of $y$ in the same order (not necessarily contiguous). Note that one can use other distance metrics, such as the Hamming distance, to perform the same analysis if the vectors are of equal length.
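The insertion and deletion cost described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard indel form of the LCS-based distance (the sum of the two lengths minus twice the LCS length); the function names and the example vectors are hypothetical and are not taken from the authors' MATLAB code.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y (dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                curr[j] = prev[j - 1] + 1               # extend a common subsequence
            else:
                curr[j] = max(prev[j], curr[j - 1])     # skip one element
        prev = curr
    return prev[len(y)]

def lcs_distance(x, y):
    """Number of insertions plus deletions needed to turn x into y."""
    return len(x) + len(y) - 2 * lcs_length(x, y)

def hamming_distance(x, y):
    """Number of positions at which two equal-length vectors differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs vectors of equal length")
    return sum(a != b for a, b in zip(x, y))

# Hypothetical Boolean states: instantaneous network state vs. target state.
state = [1, 0, 1, 1, 0]
target = [0, 1, 1, 0, 1]
print(lcs_distance(state, target), hamming_distance(state, target))
```

For these two example vectors the LCS has length 4 (the subsequence 0, 1, 1, 0), so the LCS-based distance is 2 while the Hamming distance is 4, which shows that the two metrics can rank the same pair of states differently.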
Computing the LCS-based distance between the instantaneous network state and the NE subtype throughout the asynchronous transition simulations shows how the network converges toward and diverges from the NE subtype when starting from the NON-NE subtype. Furthermore, it allows us to identify patterns that cause increases and decreases in the distance between the two network states and, hence, to identify other TFs that may contribute to this transition.
Discussion
Small cell lung cancer (SCLC) is an aggressive disease with a mixture of transcriptional subtypes such as neuroendocrine (NE) and non-neuroendocrine (NON-NE), the latter being more treatment-resistant, regulated by the expression of different transcription factors (TFs). In addition to the heterogeneity in cancerous cell types, transitions between the subtypes make the disease even harder to treat. To date, SCLC TF networks have been broadly studied via systems approaches to reveal regulators and destabilizers of different subtypes. Yet, these studies do not address the mechanisms of subtype transitions, whose understanding is critical to control disease progression and perhaps to develop ways toward a permanent cure. In this work, we hypothesize that analysis of the SCLC TF network structure (Fig. 1), which is barely investigated to the best of our knowledge, can provide clues on distinct subtype drivers and further reveal pathways controlling subtype transitions. To test this hypothesis, we use graph theory concepts called Dense Spanning Trees and their extended version, Minimum Dense Spanning Trees (DSTs and MDSTs; see Box 1 and Methods sections Dense Spanning Trees of the unbiased SCLC TF network and Integrating data into the structural analysis: Minimum Dense Spanning Trees). DSTs and MDSTs are special subnetworks of the initial TF network that feature strategic nodes called hubs and the pathways connecting the hubs. Hubs are critical nodes because they interconnect several key pathways and collect, process, and distribute key signals throughout the signaling mechanism. Moreover, the pathways connecting the hubs are also important as they are potential probes for controlling complex signaling across hubs. Therefore, given two hubs regulating different SCLC subtypes, we hypothesize that the pathways connecting these hubs could be targets to control subtype transitions.
First, with DSTs, we analyze a relatively unbiased network structure by removing all the edge directions, i.e., the information on activating and inhibitory interactions, and not using any data on the strength of the connections (Fig. 3). Next, we integrate data into this pure structural analysis, assigned to each edge as weights that are the probability of the existence of the interactions, i.e., the strength of the connections estimated from RNA-seq data34. Then, we extend the DST into the MDST (Methods section Integrating data into the structural analysis: Minimum Dense Spanning Trees) to identify the hubs of the weighted network structure (Fig. 4). Interestingly, all the hubs, such as ASCL1, FLI1, and MITF, identified in both unbiased and data-driven structural analyses are either regulators or destabilizers of different SCLC subtypes as reported in the literature, which supports our hypothesis on the importance of hubs. Additionally, the structural analysis driven by the data highlights MYC as another hub in addition to those identified in the unbiased analysis (Fig. 4), which supports its importance in SCLC subtypes as shown in recent studies32,61–63. To test the roles of pathways connecting functionally distinct hubs, we asynchronously simulate the SCLC TF network using a Boolean modeling framework extracted by a tool called BooleaBayes34 (Methods section SCLC TF network subtype transition simulations). After several asynchronous iterations with the pathway connecting FLI1 and MITF (the two major hubs in both unbiased and data-driven analyses) kept active, we observe a transition from the NON-NE to the NE subtype (Fig. 5), supporting our hypothesis on the importance of hub-connecting pathways. Furthermore, after analyzing increasing and decreasing patterns in the distance between the network state and the NE subtype (Figs. 6 and 7) in 700 asynchronous NON-NE to NE transition simulations, we conclude that the TFs NR0B1 and CEBPD may also play a role in this transition in addition to FLI1 and ASCL1 (Figs. 8 and 9).
Note that one can integrate different data into this analysis, assigned as the weights of the edges. For instance, instead of assigning probabilities of interactions extracted from experimental data, the mutual information between each pair of nodes can be used. In this case, the resulting MDSTs would contain the hubs while preserving the highest mutual information and the maximum influence among the nodes. Similarly, one can assign the weights manually, guided by prior knowledge, to keep the preferred interactions in the resulting substructures. Also, one can apply the tools presented here to any network type, such as protein–protein interaction networks (PPINs), gene regulatory networks (GRNs), cell signaling networks, and metabolic networks. In addition, they can be applied to any network structure, whether directed or undirected and weighted or unweighted. Note that although preserving the directedness of interactions would integrate more information into the structural analysis, it would also require adding new constraints to the optimization problems (1) and (2), which may become harder to solve due to increased complexity, leaving room for potential improvement to the DSTs and MDSTs found for the SCLC network. Moreover, as this is a structural network analysis, the results will be sensitive to the given network structure. Here, we analyzed the SCLC TF network provided in refs. 34,38. Given different SCLC TF networks with different sets of nodes and interactions, the observations might change.
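As an illustration of the mutual-information alternative mentioned above, the following sketch computes the mutual information (in bits) between two discretized node profiles. It is a minimal example under stated assumptions: the function name and the binarized expression vectors are hypothetical, and how such values would be mapped onto the edge weights of problems (1) and (2) is left open here.

```python
from collections import Counter
from math import log2

def mutual_information(a, b):
    """Mutual information (bits) between two equal-length discrete vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    n = len(a)
    joint = Counter(zip(a, b))      # joint counts of (a_i, b_i)
    count_a = Counter(a)            # marginal counts of a
    count_b = Counter(b)            # marginal counts of b
    mi = 0.0
    for (u, v), c in joint.items():
        p_uv = c / n
        mi += p_uv * log2(p_uv / ((count_a[u] / n) * (count_b[v] / n)))
    return mi

# Hypothetical binarized expression of three TFs across four samples.
tf1 = [0, 1, 0, 1]
tf2 = [0, 1, 0, 1]   # identical to tf1: fully informative
tf3 = [0, 0, 1, 1]   # statistically independent of tf1 in this sample
print(mutual_information(tf1, tf2), mutual_information(tf1, tf3))
```

An edge between `tf1` and `tf2` would receive a high value and an edge between `tf1` and `tf3` a value of zero, so a weighting based on this quantity would favor keeping the former interaction in the resulting substructure.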
There are other ways to define and identify the hubs of a given network. One can define the hub as the node that has the most connections (highest node degree) or as the node whose connections make it most central in the network (see Supplementary Information for the application of different hub definitions and their results on the SCLC TF network). However, we believe these definitions are not well suited for biological applications because they are purely structural concepts and do not account for closeness, i.e., the influence of the nodes on each other. Moreover, such hubs are expected to occur only in scale-free networks, i.e., networks whose degree distribution follows a power law66. On the other hand, the concept of DSTs and MDSTs can identify hubs for any given network because, in DSTs and MDSTs, hubs are defined as the central nodes that minimize the total distance between every pair of nodes, and such substructures can be found for any random network. Additionally, there are other ways to find the DSTs of a given network, such as the edge-swap heuristic algorithms presented in refs. 53,54. However, we have previously shown that optimization-based approaches outperform such edge-swap heuristic algorithms55 both in accuracy and in how the computational complexity scales with network size. Lastly, to identify the DSTs and MDSTs, we solve the optimization problems (1) and (2) using a genetic algorithm (GA), a metaheuristic optimization method that attempts to find a globally optimal solution. GA does not guarantee a global solution because it does not guarantee exploration of the entire search space, and the solution quality and optimality depend on several parameters that need to be properly selected by the user, including the population size and the rates of mutation and crossover64. However, GA is well suited for problems that are discrete and combinatorial in nature, providing at least a good approximation of the global solution. Nevertheless, one can solve these optimization problems via other algorithms such as particle swarm optimization.
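To make the distance-based hub definition concrete, the sketch below compares two spanning trees of the same five nodes by their total pairwise distance. It is a simplified illustration, assuming unweighted edges and hop counts as distances; the function name and the two toy trees are hypothetical, and the optimization problems (1) and (2) and the GA solver are not reproduced here.

```python
from collections import deque

def total_pairwise_distance(n, edges):
    """Sum of hop distances over all unordered node pairs of a connected graph."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2                     # each pair was counted twice

# Two spanning trees on five nodes (toy examples).
star = [(0, 1), (0, 2), (0, 3), (0, 4)]   # node 0 acts as a hub
path = [(0, 1), (1, 2), (2, 3), (3, 4)]   # no hub, nodes chained in a line
print(total_pairwise_distance(5, star), total_pairwise_distance(5, path))
```

The star tree has a total pairwise distance of 16 and the path tree 20, so a search that minimizes this quantity prefers trees organized around central nodes, which is the sense in which hubs emerge from dense spanning trees.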
Overall, the presented results have shown that the hubs of the SCLC TF network identified via DSTs and MDSTs are either regulators or destabilizers of different SCLC subtypes. This implies that structural analyses of the networks can be advantageous as the initial analysis step as their results can be used as guidance to generate hypotheses to be tested in experiments. Moreover, the pathways connecting the functionally distinct hubs may have major roles in SCLC subtype transitions as shown by the simulations, which may allow the control of such transitions and help develop better treatment strategies by driving the cancerous cells toward more sensitive states. Furthermore, targeting those pathways in the experiments may lead to the identification of other dominant components in such transitions and hence help to understand the underlying mechanism of this complex signaling process. As a result, pure as well as data-driven structural analyses of the networked processes could be a plausible first step and may result in important biological observations in complex systems as well as help generate hypotheses to be tested.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We would like to thank Vito Quaranta, Sarah Maddox Groves, and Lopez Lab members at Vanderbilt University for insightful conversations and critical feedback on this work. This work was supported by the following funding sources: C.F.L. was supported by the National Science Foundation (NSF) [M.C.B. 1411482] and NSF CAREER Award [M.C.B. 1942255]; and the National Institutes of Health (NIH) [U54-CA217450 and U01-CA215845].
Author contributions
M.O. developed the methods, performed the simulations and computations, and wrote the manuscript. C.F.L. conceived the ideas and concepts, developed the methods, and wrote the manuscript.
Data availability
Data sharing is not applicable to this article as no new datasets were generated during the current study.
Code availability
The source MATLAB codes are available for reproducing the results or redoing the analyses on GitHub: https://github.com/LoLab-MSM/SCLC-TF-Network-Analysis.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41540-023-00316-2.
References
- 1. Slack J. Metaplasia and transdifferentiation: from pure biology to the clinic. Nat. Rev. Mol. Cell Biol. 2007;8:369–378. doi: 10.1038/nrm2146.
- 2. MacArthur B, Ma’ayan A, Lemischka I. Systems biology of stem cell fate and cellular reprogramming. Nat. Rev. Mol. Cell Biol. 2009;10:672–681. doi: 10.1038/nrm2766.
- 3. Newman SA. Cell differentiation: what have we learned in 50 years? J. Theo. Biol. 2020;485:110031. doi: 10.1016/j.jtbi.2019.110031.
- 4. Waddington, C. H. The strategy of the genes. George Allen & Unwin, London (1957).
- 5. Huang S. Genetic and non-genetic instability in tumor progression: link between the fitness landscape and the epigenetic landscape of cancer cells. Cancer Metastasis Rev. 2013;32:423–448. doi: 10.1007/s10555-013-9435-7.
- 6. Kim Y, Lin Q, Glazer PM, Yun Z. Hypoxic tumor microenvironment and cancer cell differentiation. Curr. Mol. Med. 2009;9:425–434. doi: 10.2174/156652409788167113.
- 7. Jögi A, Vaapil M, Johansson M, Påhlman S. Cancer cell differentiation heterogeneity and aggressive behavior in solid tumors. Upsala J. Med. Sci. 2012;117:217–224. doi: 10.3109/03009734.2012.659294.
- 8. Saghafinia S, et al. Cancer cells retrace a stepwise differentiation program during malignant progression. Cancer Discov. 2021;11:2638–2657. doi: 10.1158/2159-8290.CD-20-1637.
- 9. Yuan S, Norgard RJ, Stanger BZ. Cellular plasticity in cancer. Cancer Discov. 2019;9:837–851. doi: 10.1158/2159-8290.CD-19-0015.
- 10. Tomasetti C, Vogelstein B. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science. 2015;347:78–81. doi: 10.1126/science.1260825.
- 11. Qin S, et al. Emerging role of tumor cell plasticity in modifying therapeutic response. Sig Transduct. Target Ther. 2020;5:228. doi: 10.1038/s41392-020-00313-5.
- 12. Kong D, Hughes CJ, Ford HL. Cellular plasticity in breast cancer progression and therapy. Front Mol. Biosci. 2020;7:72. doi: 10.3389/fmolb.2020.00072.
- 13. Nguyen A, Yoshida M, Goodarzi H, Tavazoie SF. Highly variable cancer subpopulations that exhibit enhanced transcriptome variability and metastatic fitness. Nat. Commun. 2016;7:11246. doi: 10.1038/ncomms11246.
- 14. Rambow F, Marine JC, Goding CR. Melanoma plasticity and phenotypic diversity: therapeutic barriers and opportunities. Genes Dev. 2019;33:1295–1318. doi: 10.1101/gad.329771.119.
- 15. Calbo J, et al. A functional role for tumor cell heterogeneity in a mouse model of small cell lung cancer. Cancer Cell. 2011;19:244–256. doi: 10.1016/j.ccr.2010.12.021.
- 16. George J, et al. Comprehensive genomic profiles of small cell lung cancer. Nature. 2015;524:47–53. doi: 10.1038/nature14664.
- 17. Carney DN, et al. Establishment and identification of small cell lung cancer cell lines having classic and variant features. Cancer Res. 1985;45:2913–2923.
- 18. Hann CL, Rudin CM. Fast, hungry and unstable: finding the Achilles’ heel of small-cell lung cancer. Trends Mol. Med. 2007;13:150–157. doi: 10.1016/j.molmed.2007.02.003.
- 19. Marusyk A, Almendro V, Polyak K. Intra-tumour heterogeneity: a looking glass for cancer? Nat. Rev. Cancer. 2012;12:323–334. doi: 10.1038/nrc3261.
- 20. Sutherland KD, et al. Cell of origin of small cell lung cancer: inactivation of Trp53 and Rb1 in distinct cell types of adult mouse lung. Cancer Cell. 2011;19:754–764. doi: 10.1016/j.ccr.2011.04.019.
- 21. Rudin CM, et al. Treatment of small-cell lung cancer: American Society of Clinical Oncology Endorsement of the American College of Chest Physicians Guideline. J. Clin. Oncol. J. Am. Soc. Clin. Oncol. 2015;33:4106–4111. doi: 10.1200/JCO.2015.63.7918.
- 22. Byers LA, Rudin CM. Small cell lung cancer: where do we go from here? Cancer. 2015;121:664–672. doi: 10.1002/cncr.29098.
- 23. Sutherland KD, et al. Cell of origin of small cell lung cancer: inactivation of Trp53 and Rb1 in distinct cell types of adult mouse lung. Cancer Cell. 2011;19:754–764. doi: 10.1016/j.ccr.2011.04.019.
- 24. Park K-S, et al. Characterization of the cell of origin for small cell lung cancer. Cell Cycle. 2014;10:2806–2815. doi: 10.4161/cc.10.16.17012.
- 25. Song H, et al. Functional characterization of pulmonary neuroendocrine cells in lung development, injury, and tumorigenesis. Proc. Natl Acad. Sci. USA. 2012;109:17531–17536. doi: 10.1073/pnas.1207238109.
- 26. American Cancer Society. Cancer facts and figures 2022. Atlanta: American Cancer Society; 2022.
- 27. Semenova EA, Nagel R, Berns A. Origins, genetic landscape, and emerging therapies of small cell lung cancer. Gene Dev. 2015;29:1447–1462. doi: 10.1101/gad.263145.115.
- 28. Gazdar AF, Bunn PA, Minna JD. Small-cell lung cancer: what we know, what we need to know and the path forward. Nat. Rev. Cancer. 2017;17:725. doi: 10.1038/nrc.2017.87.
- 29. Gazdar AF, Carney DN, Nau MM, Minna JD. Characterization of variant subclasses of cell lines derived from small cell lung cancer having distinctive biochemical, morphological, and growth properties. Cancer Res. 1985;45:2924–2930.
- 30. Udyavar AR, et al. Novel hybrid phenotype revealed in small cell lung cancer by a transcription factor network model that can explain tumor heterogeneity. Cancer Res. 2017;77:1063–1074. doi: 10.1158/0008-5472.CAN-16-1467.
- 31. Rudin CM, et al. Molecular subtypes of small cell lung cancer: a synthesis of human and mouse model data. Nat. Rev. Cancer. 2019;19:289–297. doi: 10.1038/s41568-019-0133-9.
- 32. Mollaoglu G, et al. MYC drives progression of small cell lung cancer to a variant neuroendocrine subtype with vulnerability to aurora kinase inhibition. Cancer Cell. 2017;31:270–285. doi: 10.1016/j.ccell.2016.12.005.
- 33. Horie M, Saito A, Ohshima M, Suzuki HI, Nagase T. YAP and TAZ modulate cell phenotype in a subset of small cell lung cancer. Cancer Sci. 2016;107:1755–1766. doi: 10.1111/cas.13078.
- 34. Wooten, D. J. et al. Systems-level network modeling of small cell lung cancer subtypes identifies master regulators and destabilizers. PLoS Comput. Biol. 15 (2019).
- 35. Borromeo MD, et al. ASCL1 and NEUROD1 reveal heterogeneity in pulmonary neuroendocrine tumors and regulate distinct genetic programs. Cell Rep. 2016;16:1259–1272. doi: 10.1016/j.celrep.2016.06.081.
- 36. Huang YH, et al. POU2F3 is a master regulator of a tuft cell-like variant of small cell lung cancer. Genes Dev. 2018;32:915–928. doi: 10.1101/gad.314815.118.
- 37. Gay CM, et al. Patterns of transcription factor programs and immune pathway activation define four major subtypes of SCLC with distinct therapeutic vulnerabilities. Cancer Cell. 2021;39:346–360. doi: 10.1016/j.ccell.2020.12.014.
- 38. Groves SM, et al. Archetype tasks link intratumoral heterogeneity to plasticity and cancer hallmarks in small cell lung cancer. Cell Syst. 2022;13:690–710. doi: 10.1016/j.cels.2022.07.006.
- 39. Lim JS, et al. Intratumoural heterogeneity generated by Notch signalling promotes small-cell lung cancer. Nature. 2017;545:360–364. doi: 10.1038/nature22323.
- 40. Ireland AS, et al. MYC drives temporal evolution of small cell lung cancer subtypes by reprogramming neuroendocrine fate. Cancer Cell. 2020;38:60–78. doi: 10.1016/j.ccell.2020.05.001.
- 41. Viktorsson K, Lewensohn R, Zhivotovsky B. Systems biology approaches to develop innovative strategies for lung cancer therapy. Cell Death Dis. 2014;5:e1260. doi: 10.1038/cddis.2014.28.
- 42. Zhang W, et al. Network analysis in lung cancer. Thorac. Cancer. 2014;5:556–564. doi: 10.1111/1759-7714.12134.
- 43. Santolini M, Barabasi A-L. Predicting perturbation patterns from the topology of biological networks. Proc. Natl Acad. Sci. 2018;115:E6375–E6383. doi: 10.1073/pnas.1720589115.
- 44. Klein C, et al. Structural and dynamical analysis of biological networks. Brief. Fun Gen. 2012;11:420–433. doi: 10.1093/bfgp/els030.
- 45. Doncheva N, et al. Topological analysis and interactive visualization of biological networks and protein structures. Nat. Protoc. 2012;7:670–685. doi: 10.1038/nprot.2012.004.
- 46. He X, Zhang J. Why do hubs tend to be essential in protein networks? PLoS Genet. 2006;2:826–834. doi: 10.1371/journal.pgen.0020088.
- 47. Helsen J, Frickel J, Jelier R, Verstrepen KJ. Network hubs affect evolvability. PLoS Biol. 2019;17:e3000111. doi: 10.1371/journal.pbio.3000111.
- 48. Liu Y, et al. Identification of hub genes and key pathways associated with bipolar disorder based on weighted gene co-expression network analysis. Front. Physiol. 2019;10:1081. doi: 10.3389/fphys.2019.01081.
- 49. Di Silvestre D, et al. Network topological analysis for the identification of novel hubs in plant nutrition. Front. Plant Sci. 2021;10:629013. doi: 10.3389/fpls.2021.629013.
- 50. Dietz K-J, Jacquot J-P, Harris G. Hubs and bottlenecks in plant molecular signalling networks. N. Phytologist. 2010;188:919–938. doi: 10.1111/j.1469-8137.2010.03502.x.
- 51. Sulaimanov N, et al. Inferring gene expression networks with hubs using a degree weighted Lasso approach. Bioinformatics. 2019;35:987–994. doi: 10.1093/bioinformatics/bty716.
- 52. Lachmann A, et al. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics. 2010;26:2438–2444. doi: 10.1093/bioinformatics/btq466.
- 53. Silva R, et al. An edge-swap heuristic for generating spanning trees with minimum number of branch vertices. Optim. Lett. 2014;8:1225–1243. doi: 10.1007/s11590-013-0665-y.
- 54. Ozen M, Wang H, Wang K, Yalman D. An edge-swap heuristic for finding dense spanning trees. Theory Appl. Graphs. 2016;3:1–10.
- 55. Ozen M, Lesaja G, Wang H. Globally optimal dense and sparse spanning trees, and their applications. Stat. Optim. Inf. Comput. 2020;8:328–345. doi: 10.19139/soic-2310-5070-855.
- 56. Li L, et al. Friend leukemia virus integration 1 promotes tumorigenesis of small cell lung cancer cells by activating the miR-17-92 pathway. Oncotarget. 2017;8:41975–41987. doi: 10.18632/oncotarget.16715.
- 57. Li L, et al. FLI1 exonic circular RNAs as a novel oncogenic driver to promote tumor metastasis in small cell lung cancer. Clin. Cancer Res. 2019;25:1302–1317. doi: 10.1158/1078-0432.CCR-18-1447.
- 58. Augustyn A, et al. ASCL1 is a lineage oncogene providing therapeutic targets for high-grade neuroendocrine lung cancers. Proc. Natl Acad. Sci. USA. 2014;111:14788–14793. doi: 10.1073/pnas.1410419111.
- 59. Baine MK, et al. SCLC subtypes defined by ASCL1, NEUROD1, POU2F3, and YAP1: a comprehensive immunohistochemical and histopathologic characterization. J. Thorac. Oncol. 2020;15:1823–1835. doi: 10.1016/j.jtho.2020.09.009.
- 60. Olsen RR, et al. ASCL1 represses a SOX9+ neural crest stem-like state in small cell lung cancer. Genes Dev. 2021;35:847–869. doi: 10.1101/gad.348295.121.
- 61. Chalishazar MD, et al. MYC-driven small-cell lung cancer is metabolically distinct and vulnerable to arginine depletion. Clin. Cancer Res. 2019;25:5107–5121. doi: 10.1158/1078-0432.CCR-18-4140.
- 62. Patel, A. S. et al. Prototypical oncogene family Myc defines unappreciated distinct lineage states of small cell lung cancer. Sci. Adv. 7 (2021).
- 63. Chen J, et al. Lineage-restricted neoplasia driven by Myc defaults to small cell lung cancer when combined with loss of p53 and Rb in the airway epithelium. Oncogene. 2022;41:138–145. doi: 10.1038/s41388-021-02070-3.
- 64. Mitchell, M. An introduction to genetic algorithms. (MIT Press, Cambridge, MA, 1996).
- 65. Bergroth L, Hakonen H, Raita T. A survey of longest common subsequence algorithms. Proc. 7th Int. Symp. String Process. Inf. Retr. SPIRE. 2000;2000:39–48.
- 66. Barabasi, A-L. Network Science, Cambridge University Press, United Kingdom (2016).
- 67. Balakrishnan, V. K. Graph Theory (1st ed.). McGraw-Hill (1997).