PLOS One. 2024 Nov 19;19(11):e0314022. doi: 10.1371/journal.pone.0314022

Computational investigation of missense somatic mutations in cancer and potential links to pH-dependence and proteostasis

Shalaw Sallah 1, Jim Warwicker 1,*
Editor: Rajesh Kumar Pathak 2
PMCID: PMC11575792  PMID: 39561123

Abstract

Metabolic changes during tumour development lead to acidification of the extracellular environment and a smaller increase of intracellular pH. Searches for somatic missense mutations that could reveal adaptation to altered pH have focussed on arginine to histidine changes, part of a general arginine depletion that originates from DNA mutational mechanisms. Analysis of mutations to histidine, potentially a simple route to the introduction of pH-sensing, shows no clear overall biophysical separation between subsets that are more and less frequently mutated in cancer genomes. Within the more frequently mutated subset, individual sites predicted to mediate pH-dependence upon mutation include NDST1 (a Golgi-resident heparan sulphate modifying enzyme), the HLA-C chain of MHCI complex, and the water channel AQP-7. Arginine depletion is a general feature that persists in the more frequently mutated subset, and is complemented by over-representation of mutations to lysine. Arginine to lysine balance is a known factor in determining protein solubility, with higher lysine content being more favourable. Proteins with greater change in arginine to lysine balance are enriched for cell periphery location, where proteostasis is likely to be challenged in tumour cells. Somatic missense mutations in a cancer genome typically number only in the tens, although the number can be much higher. Whether the altered arginine to lysine balance is of sufficient scale to play a role in tumour development is unknown.

1. Introduction

The tumour microenvironment is affected by metabolic changes in cancer cells. Both hypoxia and acidosis have been characterised, with hypoxia a therapeutic target [1, 2]. An increasing interest in acidosis associated with cancer metabolism has the perspective that pH changes are not only the result of altered metabolism, but also regulate tumour progression, and therefore offer therapeutic opportunities [1]. In response to a shift towards glycolytic metabolism and other acid producing processes, acid extruding transporters are up-regulated at multiple levels of expression control, with additional modulation from the pH-sensitivity of glycolytic enzymes and transporters [3]. A role for somatic mutations altering net acid extrusion from cells and oncogenic processes has been discussed in the context of acid-resistant cells being associated with aggressive phenotypes [4]. A survey of tumour somatic mutations in acid-base transporters argued that interpretation would be improved with additional data, for example transporter flux measurement and environmental pH reporters [5]. Mechanisms of pH-sensing in cancer have been placed into the context of a wider role in cellular physiology and pathophysiology [6].

Cancer cells tend to have a higher intracellular pH (pHi) than normal cells (pH ~ 7.2), with a reversed pH gradient across the cancer cell membrane, and an extracellular pH (pHe) less than normal (pH ~ 7.4) [1]. A review of measured pH values shows decreases of 0.3–0.7 in pHe, dependent upon tissue, with the minimum pHe of 6.4 for lung tissue [7]. Changes are smaller for pHi, typically only 0.1 [7]. It is appreciated that within these average values, there are localised regions of acidification in the tumour microenvironment [8].

The relevance of ionisation at the histidine sidechain, and its role in protein structure and function, to pH changes in cancer has been recognised [9]. Databases such as the Catalogue of Somatic Mutations in Cancer (COSMIC) collect and annotate somatic mutations from cancer genome sequencing projects [10]. Using these data, enrichments for mutation from arginine and (to a lesser degree) for mutation from glutamic acid to lysine have been seen [11–13]. This observation has been termed arginine depletion in human cancers, and related to C > T transitions in nucleotide mutational signatures, with subsequent selection for function at the amino acid (AA) level suggested for some sites [14]. Mutational signatures in cancers have been widely studied since the advent of gene and subsequently genome sequencing methods, and in general terms result from the superposition of specific cancer mutational mechanisms with weaker and constant endogenous mutation processes [15]. An example of a specific cancer mutational signature is the prevalence of C > T and CC > TT mutations at dipyrimidines in skin cancer, which correlates with the effect of UV light [16]. Bioinformatics analysis of cancer genomes has revealed > 100 mutational signatures, which are also being coupled to genome topographical properties in the ENCODE database [17] (for example nucleosome occupancy of base-pairs) [18]. All cancer types include some C > T mutations (at CpG dinucleotides), arising from the normal endogenous process of 5-methylcytosine deamination [15, 16], contributing to Arg depletion at the amino acid level [14].

Proposed functional significance of arginine mutation includes a potential role for cysteine in neutralising reactive oxygen species [13, 19], and pH-sensing for the introduction of histidine [20], with variation of AA mutation frequencies between cancer types noted [11, 14]. Arginine to histidine mutations that mediate altered pH-dependence of activity, and could be related to fitness advantage in cancer cells include EGFR-R776H and p53-R273H [20], and IDH1-R132H where production of the oncometabolite D-2-hydroxyglutarate is rendered pH-dependent if mutant and wild type IDH1 form a heterodimer [21]. Mutation away from histidine can also modulate pH-dependent function, as seen with β-catenin-H36R [22].

Computational analysis of missense somatic mutation, from the COSMIC database, is facilitated by the availability of AlphaFold2 models for protomers [23], for which coverage of the entire proteome comes at the expense of excluding homo- and hetero-oligomeric structures, other than in specific cases. Methods for prediction of pKas and protein pH-dependence such as PROPKA [24] and pkcalc [25] are sufficiently fast to allow computation for large datasets [26]. Constant pH molecular dynamics techniques explicitly link conformational change to protonation events [27], but are not currently of sufficient speed for application to the human proteome. Both PROPKA3 [28] and pkcalc have been benchmarked against experimental data [25, 29].

This study looks at missense somatic mutations in COSMIC, recapitulating the prominence of mutation from arginine, followed by a focus on solvent accessible surface area (SASA) calculation and pKa predictions. Mutations involving histidine are the focus, given the proximity of its normal sidechain pKa to pHi and pHe, noting that other AAs can be involved in pH-sensing [30]. Mutations with higher occurrence in COSMIC (≥ 10) were assumed to be more likely driver mutations, and they show no systematic deviation of predicted pKa from presumed passenger (1 occurrence) mutations. Specific sites that are of potential interest for pH-sensing are described. Arginine depletion, and a smaller lysine supplementation effect, are discussed in the context of protein solubility, for which arginine/lysine balance is known to contribute. Consideration is given to the numbers of mutations in tumour cells, and the location of the most mutated proteins.

2. Materials and methods

Somatic mutations in cancer were obtained from version 97 of the COSMIC database [10], and filtered for missense mutations, to analyse single AA changes. Ensembl transcript identifiers in the COSMIC data were mapped to UniProt [31] identifiers. Redundancy due to multiple entries for the same mutation in a unique combination of COSMIC tumour and sample identifiers was removed. The resulting list was ranked by the number of COSMIC instances (occurrences) for each mutation, where these instances are comparable to the numbers displayed in the Gene View page of the COSMIC database web site. Conversion of identifiers allowed direct matching with AlphaFold2 models for protomers in the human proteome [23], and used the Retrieve/ID mapping facility at UniProt. Subsets of mutations were made according to ranges of instances recorded in COSMIC. The total number of mutations recorded is 3,289,443, the majority of which (2,625,973) have a single instance. The subset of mutations with ≥ 10 instances (instGE10), used in this study, numbers 9,919. When reduced to being unique by wild type AA, the number of sites is 2,769,929, being 24.3% of the AlphaFold2 structurally annotated human proteome.
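The redundancy removal and ranking described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the record layout (protein, mutation, tumour identifier, sample identifier) and the function names are hypothetical stand-ins for the COSMIC export columns.

```python
from collections import Counter


def count_instances(records):
    """Count instances per mutation after removing duplicate entries.

    Each record is a (protein, mutation, tumour_id, sample_id) tuple.
    These field names are hypothetical, not the real COSMIC column names.
    """
    # A mutation recorded more than once for the same tumour/sample
    # combination is counted a single time.
    unique = set(records)
    return Counter((protein, mutation) for protein, mutation, _, _ in unique)


def split_by_instances(counts, threshold=10):
    """Return the single-instance subset and the >= threshold subset."""
    inst_eq1 = sorted(key for key, n in counts.items() if n == 1)
    inst_ge = sorted(key for key, n in counts.items() if n >= threshold)
    return inst_eq1, inst_ge


# Toy records (not real COSMIC sample identifiers).
records = [
    ("IDH1", "R132H", "t1", "s1"),
    ("IDH1", "R132H", "t1", "s1"),  # duplicate entry, counted once
    ("IDH1", "R132H", "t2", "s2"),
    ("TP53", "R175H", "t3", "s3"),
]
counts = count_instances(records)
ranked = counts.most_common()  # mutations ranked by number of instances
```

With the default threshold of 10, `split_by_instances` returns subsets analogous to instEQ1 and instGE10.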

Mutation matrices were made for combinations of wild type and target AAs, and presented as heat maps. For sets of COSMIC missense mutations, over all cancer types, the heat map is calculated as percentages of the (20x19) combinations within each set. Distinction by cancer type was introduced using Human Protein Atlas [32] classifications, cross referenced to those in the COSMIC database. The Mutation Assessor (MA) tool was used to retrieve a set of Functional Impact Score data that reflects amino acid conservation [33]. Enrichment of protein subsets in Gene Ontology (GO) terms engaged the Princeton GO Finder tool [34]. Subcellular location of proteins was added from UniProt. Three tools that have been constructed to assess benign versus deleterious mutation effects on protein function were used to analyse a subset of mutations to histidine: PolyPhen-2 [35], SIFT [36], and AlphaMissense [37].
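As an illustration of the heat map calculation, the sketch below builds a 20 x 19 percentage matrix from (wild type, target) pairs and differences two such matrices. The function names and the toy input are assumptions made for this example, and the input is assumed to contain missense changes only.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def percentage_matrix(mutations):
    """Percentage of each (wild type, target) pair over the whole set.

    `mutations` is an iterable of one-letter (wild_type, target) pairs,
    assumed to be missense only. Unobserved pairs are reported as 0.0,
    so all 20 x 19 cells are present and the cells sum to 100%.
    """
    counts = Counter(mutations)
    total = sum(counts.values())
    return {
        (wt, tg): 100.0 * counts[(wt, tg)] / total
        for wt in AMINO_ACIDS
        for tg in AMINO_ACIDS
        if wt != tg
    }


def difference_matrix(set_a, set_b):
    """Cell-by-cell difference of two percentage matrices (sums to 0%)."""
    a = percentage_matrix(set_a)
    b = percentage_matrix(set_b)
    return {pair: a[pair] - b[pair] for pair in a}


# Toy data: a "higher instance" set and a "single instance" set.
high = [("R", "C"), ("R", "C"), ("R", "H"), ("E", "K")]
low = [("R", "C"), ("A", "T"), ("A", "V"), ("E", "K")]
diff = difference_matrix(high, low)
```

A positive cell in `diff` marks a change that is over-represented in the first set relative to the second.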

In order to estimate the difference between observed instances of mutation from wild type AAs in COSMIC, and that expected without adaptation, a simple model was constructed. Wild type AA sites with a single mutation in COSMIC were assumed to not be drivers, and thus to represent a background probability for mutation, when combined with the total number of each AA type in proteins that have mutations in COSMIC. This background probability was then propagated successively through increasing mutation instances, differencing between the current number of instances and the next, to give a predicted number of mutations (assuming no growth advantage) for each instance number and wild type AA. Instances were gathered into ranges to examine the results as the instances increase. These data were processed as both absolute numbers, and as the percentage distributions over AA types, within each range of instances. Finally, the predicted background distributions were differenced with the observed distributions.

Structure based calculations were made with locally installed code for SASA (sacalc), and for pKa (pkcalc, PROPKA) [24, 29]. For some cases, molecular graphics representation of predicted pKas are shown using the protein-sol server [25]. Models for single site mutation in an AlphaFold2 protomer were made with SwissModel [38]. In order to make a distinction between relatively buried and accessible AAs, a threshold of 20 Å² was used.

3. Results and discussion

3.1 Mutation from arginine is a dominant feature in somatic mutation matrices

A set of 9,919 mutations from the February 2023 (version 97) COSMIC database, with ≥ 10 instances (instGE10) was selected to represent a set with sufficient numbers for analysis of the 20 x 19 mutation matrix, but also enriched for driver mutations. The most obvious features in the distribution of instGE10 mutations, other than being sparse due to the limited codon changes associated with single nucleotide mutation, are the preponderance of mutations from Arg (to Cys, Gln, His, Trp in particular), from Ala to Thr and Val, and also from Glu to Lys (Fig 1A). When compared (by differencing) with the set of all mutations that occur just once in the February 2023 COSMIC database (presumed passenger, instEQ1), most prominent are from Arg to Cys/Gln mutations (Fig 1B). Mutations from Arg to His for instGE10 are enriched relative to instEQ1, as are Glu to Lys, but not as much as from Arg to Cys/Gln mutations. As reported by others [11–13], mutation from Arg is a major feature in the landscape of somatic mutations in cancer, but it is also apparent that rather than a specific emphasis on mutation to His in a set that is enriched in driver mutations, target amino acids of Cys and Gln feature more prominently [11, 13]. Mutation to His is a potential route to gain of function pH-sensing at close to neutral pH [39]. Here, the high incidence of mutations from Arg, due to underlying DNA mutational signatures [40], is if anything depleted for those to His in the instGE10 set.
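The link between C > T transitions and the prominent target AAs can be checked against the standard genetic code. The sketch below is an illustration added here, not part of the study's analysis: it enumerates the missense changes reachable from a set of codons by a single C > T change on the coding strand, or on the opposite strand (read as G > A on the coding strand). The helper name `missense_targets` is hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AA_STRING = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AA_STRING[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# C>T on the coding strand, or C>T on the opposite strand (seen as G>A).
TRANSITIONS = {"C": "T", "G": "A"}


def missense_targets(codons):
    """Amino acids reachable from `codons` by one C>T or G>A change."""
    targets = set()
    for codon in codons:
        wild_type = CODON_TABLE[codon]
        for pos, base in enumerate(codon):
            if base not in TRANSITIONS:
                continue
            mutant = codon[:pos] + TRANSITIONS[base] + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            # Skip synonymous changes and stop codons (not missense).
            if aa != wild_type and aa != "*":
                targets.add(aa)
    return targets


arg_cpg_codons = ["CGT", "CGC", "CGA", "CGG"]  # Arg codons containing CpG
glu_codons = ["GAA", "GAG"]
ala_codons = ["GCT", "GCC", "GCA", "GCG"]

print(sorted(missense_targets(arg_cpg_codons)))  # ['C', 'H', 'Q', 'W']
print(sorted(missense_targets(glu_codons)))      # ['K']
print(sorted(missense_targets(ala_codons)))      # ['T', 'V']
```

The three outputs correspond to the Arg to Cys/Gln/His/Trp, Glu to Lys, and Ala to Thr/Val features noted in the text.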

Fig 1. Mutation matrices across cancer types in COSMIC.

(A) The percentage distribution of all 20 x 19 missense AA mutations in COSMIC, only for mutations with instances ≥ 10. Wild type AA is listed at the left, and target AA across the top. Summation is to 100% over the entire heat map, and the white to blue colour scale is indicated to the right. (B) Percentage distribution for the instances = 1 subset is subtracted from that for instances ≥ 10, therefore with sum to 0% over the heat map, and with colour scale (red-white-blue) shown to the right.

When somatic mutations are separated for 15 cancer types, using cancer type names from the Human Protein Atlas [32], the prominence of Arg in the instGE10 set remains a common feature despite the variation in DNA mutational signatures (Fig 2A). Skin cancer has the most feature-rich set of mutated AAs (Fig 2A), related to the mutational mechanisms resulting from exposure to ultraviolet light [41], with Glu, Gly, Pro, Ser adding to Arg as heavily mutated. The prevalence of mutations from these 5 AAs, in the instGE10 set for skin cancer (Fig 2B), shows similarity with that across all cancers (Fig 1A), with Arg mutation to Cys/Gln and to a lesser degree, His, as well as Glu to Lys mutation. While the prevalence of mutations from Pro and Ser is a specific feature of skin cancer in the instGE10 set (Fig 2A), enriched destination AAs of Leu (from Pro and Ser), and Ser (from Pro) are also common to the overall cancer landscape for the instGE10 set (Fig 1A). Skin cancer has well-characterised mutational signatures [42] that underpin AA mutations. However, the detail of AA mutation profiles shifts for the instGE10 set, compared with instEQ1, likely reflecting the importance of adaptation to tumour conditions and growth (Fig 2C), and similar to the more general results over all cancers (Fig 1B). Of particular note is the persistence of mutations from Arg (to Cys, Gln, His), as the number of COSMIC instances increases. Driver missense mutations impact proteins through loss-of-function (LoF) or gain-of-function (GoF) effects, including mutations from Arg (e.g. the metabolic switch R132H in IDH1, [21]). To what extent the prominence of mutations from Arg incorporates Arg-specific functional effects remains an open question, and is the reason that potential pH-sensing for Arg to His mutations has been raised [20].

Fig 2. Mutation distributions by cancer type.

(A) Percentage distributions by cancer type, for instGE10 data, with the 20 wild type AAs listed across the top, and each row (cancer type) summing to 100%. Percentage scale (white to blue) is shown to the right. (B) Skin cancer, instGE10 data with the 5 most prominent wild type AAs in panel (A) listed on the left. Distributions of mutation to target AAs (listed across the top) are shown for instGE10 data, percentage scale (white to blue) shown to the right. (C) The distributions of panel (B) are differenced with instEQ1 data, with scale of percentages (red-white-blue) on the right.

3.2 Predicted ΔpKa and SASA values at Arg and His sites, compared between low and higher instance mutations

Using AlphaFold2 models [23], predictions of ΔpKa values for Arg and His were made with pkcalc [25], together with SASA calculations. Missense mutations in COSMIC for the most prevalent 4 target AAs for each of Arg (Cys, Gln, His, Trp) and His (Tyr, Pro, Gln, Arg) were included, with the data further divided into sets of mutations at just a single instance recorded, and those at ≥ 10 instances. Overall, sites of mutation from Arg have moderately positive predicted ΔpKas, indicative of structural stabilisation, with very little difference between target AAs, or between 1 and ≥ 10 instance subsets (Fig 3A). For SASA, there is also little difference between target AAs, but now a systematic reduction in average SASA for ≥ 10 instance subsets versus 1 instance subsets with the same target AA (Fig 3C). Notably, there is a fraction of all Arg mutation sites that have very low SASA. These do not map to very low predicted ΔpKas (which would result for uncompensated dehydration), rather they are cases of Arg involved in specific structural stabilisation, often through hydrogen bonding to main chain carbonyl groups. With regard to His target AA as compared with Cys, Gln, Trp, there is no clear difference of SASA or predicted ΔpKa values, and thus no support for a general separation of Arg to His from other target AA mutations, at least for these properties.

Fig 3. Structure-based SASA and ΔpKa calculations for mutations from Arg and His.

The box and whisker plots show quartiles, median (central line), and mean (cross). (A) Distributions of predicted Arg ΔpKas (pkcalc) are shown for instEQ1 and instGE10 mutation data, further divided into the 4 most common target AAs for Arg mutation. Limiting thresholds of +/- 3 are applied for ΔpKa values. (B) Equivalent format ΔpKa data to panel (A) are shown for mutation from His, to its most common target AAs. (C) The ΔpKa data for Arg mutations in panel (A) are replaced with SASA, for the same subsets. (D) His mutation SASA is shown, using the same datasets as for His ΔpKas in panel (B).

Focussing on mutations from His, the most evident features for predicted ΔpKas are low ranges and overall a small negative value, across all displayed target AAs, and in either the 1 instance or ≥ 10 instance mutation sets (Fig 3B). This reflects the majority of cases in which (at physiological cytoplasmic pH) His will be neutral and not involved in charge networks. Nevertheless, as for Arg, there is a fraction of His mutation sites with very low SASA (Fig 3D), which are likely to be contributing to the overall slightly negative ΔpKas. Such sites will be buried with little influence of a predicted negative ΔpKa on structural stability, so long as ambient pH is greater than normal His sidechain pKa (6.3 in pkcalc).
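To make the relationship between ambient pH and His ionisation concrete, the following sketch applies the Henderson-Hasselbalch relation to the normal His sidechain pKa of 6.3 used in pkcalc, at the pH values quoted in the Introduction. It is an illustration for an unperturbed sidechain only; the function name is an assumption of this example.

```python
def protonated_fraction(ph, pka):
    """Henderson-Hasselbalch fraction of a titratable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


HIS_MODEL_PKA = 6.3  # normal His sidechain pKa in pkcalc (value from the text)

# pH values quoted in the Introduction.
conditions = [
    ("normal pHi", 7.2),
    ("normal pHe", 7.4),
    ("minimum tumour pHe (lung)", 6.4),
]
for label, ph in conditions:
    fraction = protonated_fraction(ph, HIS_MODEL_PKA)
    print(f"{label}, pH {ph}: protonated fraction {fraction:.2f}")
```

Under these assumptions an unperturbed His is roughly 10% protonated at normal pHi or pHe, rising to a little under half at a pHe of 6.4, which is why a buried His with a depressed pKa matters mainly where ambient pH approaches or falls below 6.3.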

The overall picture is that no clear predicted differences of ΔpKa or SASA properties are evident in respect of mutations from Arg to His relative to other target AAs, or His to what may be considered a possible buried sterically similar AA (Tyr), relative to other target AAs. It is possible, as indicated in the low SASA fractions (Fig 3C and 3D), that certain subsets may underpin pH-dependence, requiring consideration of a variety of factors, including AA burial and ambient pH of subcellular location. For this study the focus is on His as a potential mediator of adaptation to the altered pH micro-environment of tumours [20], but other AAs can also mediate pH-dependence at neutral or mild acidic pH, in particular Asp and Glu [30]. In order to assess whether differences between lower and higher instance COSMIC missense mutation sites are apparent for other amino acids with ionisable sidechains, the same calculations made for Arg and His in Fig 3 were applied to all Asp, Glu, and Lys mutation sites in COSMIC, using AlphaFold2 models (S1 Fig). Predicted ΔpKas for Asp and Glu tend to be negative, and are largely positive for Lys (S1 Fig panel A), in both cases indicating overall stabilisation of the ionised state in the network of charge interactions. Solvent accessibilities are overall larger for Lys than for Asp and Glu, as expected (S1 Fig panel B). In a repeat of the overall result for Arg and His (Fig 3), there are no clear predicted differences of ΔpKa or SASA properties between the single instance category and the ≥ 10 instances category. The low SASA tail for Asp, Glu, and Lys residues, as for Arg and His, could be indicative of sites that are relevant for pH-dependence.

3.3 Filtering for somatic mutations to His that could mediate pH-dependence

Using AlphaFold2 structures to calculate physical properties, this study omits environments (in particular burial from solvent) that are only realised in homomeric protein or heteromeric (protein or other partner) interactions. Histidine is a prime AA for mediating pH-dependence, especially when buried and with ambient pH at or lower than the normal sidechain pKa. It is therefore used to filter for the potential introduction of pH-sensing through mutation. The procedure is not intended to be exhaustive, rather, given the lack of general findings across classes of mutations, it is an attempt to identify specific examples. An additional feature is whether GoF mutations, as anticipated for generating a pH-sensing His, can be distinguished from LoF mutations. The extent to which a mutation site has single or multiple target AAs in COSMIC is denoted as percentage specificity, with 100% indicating that all mutations are to a single target AA. To restrict analysis to a set of mutations most likely to be drivers, the top 1,000 COSMIC mutations, in terms of numbers of instances in COSMIC (down to 39), were filtered for His target AA mutations (57 for any SASA value in the protomer model). Literature analysis revealed 11 of these as known to be GoF, with 18 LoF mutations. Percentage specificity for the GoF mutations is on average higher than that for the LoF mutations (Fig 4, Mann-Whitney p = 0.0193), suggesting that this feature could contribute to a filter that distinguishes GoF from LoF mutations [43].
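Percentage specificity can be illustrated with a short calculation. The sketch below assumes that specificity is counted over COSMIC instances at a site (rather than over distinct mutations), which the text does not state explicitly; the function name and the counts in the example are hypothetical.

```python
def percentage_specificity(target_counts, target):
    """Share (in %) of the instances at one site that go to `target`.

    `target_counts` maps each target AA to its number of instances at the
    site. Counting by instances is an assumption of this sketch.
    """
    total = sum(target_counts.values())
    return 100.0 * target_counts.get(target, 0) / total


# Hypothetical site: 15 instances to His, 4 to Cys, 1 to Leu.
site = {"H": 15, "C": 4, "L": 1}
print(percentage_specificity(site, "H"))  # 75.0

# A site mutated only to His has 100% specificity.
print(percentage_specificity({"H": 11}, "H"))  # 100.0
```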

Fig 4. High instance GoF and LoF mutations compared for His target AA.

Percentage specificity (see text) for mutations to His within the top 1,000 of all mutations, ranked by instances, in COSMIC.

The set of mutations to His in COSMIC were studied for potential GoF pH-sensing, with filtering for burial (SASA ≤ 20 Å² in the protomer model), giving 55 sites down to ≥ 10 instances. Following the analysis of GoF versus LoF mutations, a filter of percentage specificity of mutation at a site (to His) of ≥ 50% was applied, with reduction to 41 sites. Subcellular location was used to further restrict the data, concentrating on cases where a low environmental pH would yield a destabilising influence of the buried His, when loss of hydration is unmatched by protein charge interactions. The resulting 9 cases (Table 1) include some where an alternate explanation to potential pH-dependence is already known or seems likely, and those where a role for introduced pH-dependence is hypothesised. Those 9 are supplemented by a tenth (potassium channel) case, where the site is accessible in the protomer model, but substantially buried in the functional tetramer. Not included in the 10 cases are the two most recurring mutations to His in COSMIC, both well-characterised, IDH1-R132H and p53-R175H. The IDH1 mutation leads to a metabolic switch that is proposed to promote tumour growth [44], also with the suggestion that the mutant enzyme activity could sense environmental pH through a pH-dependent heterodimerisation of IDH1-R132H and IDH1-WT enzymes [21]. R175H of p53 has been classified, within the large set of important p53 mutations, as reducing DNA binding affinity through destabilisation of structure around a zinc binding site [45].
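The numerical filters in this procedure (instances, burial, specificity) can be sketched as below, using four rows of values taken from Table 1. The `Site` type and `passes_filters` function are hypothetical names for this illustration, and the subcellular location step, which relies on annotation rather than a numeric cutoff, is not modelled.

```python
from typing import NamedTuple


class Site(NamedTuple):
    protein: str
    mutation: str
    instances: int
    specificity: float  # % of mutations at the site that are to His
    sasa: float         # wild type residue SASA in the protomer model, in square angstroms


def passes_filters(site, min_instances=10, max_sasa=20.0, min_specificity=50.0):
    """Apply the instance, burial, and specificity thresholds from the text."""
    return (
        site.instances >= min_instances
        and site.sasa <= max_sasa
        and site.specificity >= min_specificity
    )


# Values copied from Table 1.
sites = [
    Site("NDST1", "Q793H", 11, 100, 1.0),
    Site("TINAG", "R191H", 19, 54, 8.4),
    Site("TMEM168", "R350H", 16, 80, 19.0),
    Site("KCNJ12", "Q192H", 32, 100, 129.2),
]
selected = [s.mutation for s in sites if passes_filters(s)]
print(selected)  # ['Q793H', 'R191H', 'R350H']
```

KCNJ12-Q192H fails the burial filter in the protomer model, consistent with its inclusion in Table 1 as a separately justified tenth case.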

Table 1. Mutations to histidine: 10 examples with COSMIC instances ≥ 10.

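| Mutation | Instances | Protein | UniProt ID | mutn spec (%) | pKa (pkcalc) | pKa (PROPKA) | Location | SASA (Å²) | PP2 | SIFT | AM | Fig |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L702H | 85 | ANDR | P10275 | 100 | 2.32 | 3.77 | cyto-nucl | 1.9 | 1 | 0 | 1 | |
| Q638H | 11 | PCDHB16 | Q9NRJ7 | 100 | < 1 | 5.05 | TM-EC | 1.7 | 0 | 0.89 | 0.05 | |
| D268H | 14 | PCDHGB4 | Q9UN71 | 88 | 10.57 | 7.07 | TM-EC | 16.2 | 1 | 0 | 0.95 | 5A |
| R191H | 19 | TINAG | Q9UJW2 | 54 | 2.77 | 3.33 | secr-ECM | 8.4 | 0.84 | 0.09 | 0.27 | |
| R350H | 16 | TMEM168 | Q9H0V1 | 80 | 4.04 | 5.28 | TM-nucl | 19 | 1 | 0 | 0.94 | |
| Q793H | 11 | NDST1 | P52848 | 100 | < 1 | 4.57 | Golgi | 1 | 0.99 | 0.01 | 0.98 | 5B |
| R149H | 16 | MGAT4C | Q9UBM8 | 67 | < 1 | 4.95 | Golgi | 13.7 | 0.97 | 0.16 | 0.24 | |
| Y195H | 17 | HLAC | P10321 | 94 | < 1 | 3.94 | TM-ER-EC | 1.6 | 0.16 | 0.09 | 0.22 | 5C |
| Y115H | 41 | AQP7 | O14520 | 100 | < 1 | 4.73 | TM-EC | 6.6 | 1 | 0 | 0.68 | 5D |
| Q192H | 32 | KCNJ12 | Q14500 | 100 | 5.59 | 5.96 | TM-cyto | 129.2 | 0 | 0.01 | 0.59 | |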

mutn spec is mutation specificity (%); location includes cyto/cytoplasm, TM/trans-membrane, EC/extra-cellular, secr/secreted, ECM/extra-cellular matrix, nucl/nucleus, ER/endoplasmic reticulum; SASA is calculated for the unmutated amino acid (Å²); PP2 is PolyPhen-2 prediction of mutation effect from 0 (benign) to 1 (damaging); for SIFT predictions, values < 0.05 are deleterious and > 0.05 are tolerated; AlphaMissense (AM) predictions of mutation effects are either benign (< 0.5) or pathogenic (> 0.5). For PP2, SIFT, and AM methods, results for mutations predicted to be deleterious are shown in bold and underlined.

A feature of COSMIC and related databases is their expansion with available cancer genome sequence data. In order to assess the potential influence of such expansion on the current analysis, a comparison was made between the data available on the COSMIC web site in October 2024 and that used generally in the current study (from February 2023). Of the 10 mutations listed in Table 1, for 4 the number of recorded instances remains the same, for 2 mutations instances increase by 1, and increases for the other 4 mutations are 3, 5, 8, and 9 instances. There are 6 mutations in the February 2023 dataset that would pass the burial filter applied for Table 1, but lie just 1 instance below the applied threshold of 10. Of these, 5 remain at 9 instances and 1 increases to 16 instances in October 2024. This comparison of data snapshots 20 months apart shows that whilst numbers will necessarily change over time, there is also a high degree of consistency between the two datasets.

3.4 Prominent mutations to His in COSMIC, at buried sites

3.4.1 L702H of androgen receptor

The 10 mutations listed in Table 1 are contained within proteins that are either located in mild acidic pH environments (e.g. Golgi lumen), neutral pH environments (e.g. cytoplasm), or on the extracellular side of the plasma membrane (neutral or mildly acidic, depending on the tumour environment). Mutations were modelled within the AlphaFold2 protomer using SwissModel [38], and pKa predictions made with pkcalc and PROPKA. Predictions of benign or deleterious to protein function are mostly consistent between PolyPhen-2, SIFT, and AlphaMissense methods for these mutations, indicated by highlighting of deleterious predictions in Table 1. For 5 mutations within the group of 10, all methods predict deleterious, for 3 mutations all methods predict benign, and the remaining 2 mutations have mixed predictions. It is likely that the specific property of introduced pH-dependence is not fully accounted for in the standard mutation effect prediction methods. For example, one of the 4 mutations that is represented in Fig 5 (HLA-C Y195H) is consistently predicted as benign, but it is not clear how these methods take into account factors such as differential pH of subcellular/extracellular location, altered extracellular pH in cancer, or coupling between the mutation site and ligand binding.
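The agreement between the three predictors can be tallied directly from the scores in Table 1, as in the sketch below. The SIFT (< 0.05) and AlphaMissense (> 0.5) cutoffs are those given in the Table 1 footnote. The text gives no PolyPhen-2 cutoff, so the value of 0.85 used here is an assumption of this example; with it, the tally reproduces the 5 deleterious, 3 benign, and 2 mixed cases stated above.

```python
from collections import Counter

# (mutation, PolyPhen-2, SIFT, AlphaMissense) scores copied from Table 1.
TABLE_1_SCORES = [
    ("L702H", 1.00, 0.00, 1.00),
    ("Q638H", 0.00, 0.89, 0.05),
    ("D268H", 1.00, 0.00, 0.95),
    ("R191H", 0.84, 0.09, 0.27),
    ("R350H", 1.00, 0.00, 0.94),
    ("Q793H", 0.99, 0.01, 0.98),
    ("R149H", 0.97, 0.16, 0.24),
    ("Y195H", 0.16, 0.09, 0.22),
    ("Y115H", 1.00, 0.00, 0.68),
    ("Q192H", 0.00, 0.01, 0.59),
]

PP2_CUTOFF = 0.85   # assumption: the text does not state a PolyPhen-2 cutoff
SIFT_CUTOFF = 0.05  # from the Table 1 footnote
AM_CUTOFF = 0.5     # from the Table 1 footnote


def consensus(pp2, sift, am):
    """Classify a mutation as deleterious, benign, or mixed across methods."""
    calls = [pp2 > PP2_CUTOFF, sift < SIFT_CUTOFF, am > AM_CUTOFF]
    if all(calls):
        return "deleterious"
    if not any(calls):
        return "benign"
    return "mixed"


tally = Counter(consensus(pp2, sift, am) for _, pp2, sift, am in TABLE_1_SCORES)
by_mutation = {m: consensus(pp2, sift, am) for m, pp2, sift, am in TABLE_1_SCORES}
```

Under these cutoffs HLA-C Y195H falls in the consistently benign group, as noted in the text, while R149H and Q192H are the two mixed cases.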

Fig 5. Molecular environments of selected mutations to His.

(A) PCDHGB4-D268H: left is a protein-sol image of predicted pKas around the wild type calcium binding site (in the absence of calcium ion), with colour scale by destabilisation (red) through ΔpKa close to zero (white) and stabilisation (red). Right is the equivalent plot for the D268H mutant, with a switch from destabilising at 268 to stabilising. (B) NDST1-Q793H: left is the sulfotransferase domain from a structure of the bifunctional Golgi enzyme (8ccy) [50], with ADP (cyan) marking the active site. Q793 is indicated with a rectangle, and that region expanded on the right, with a protein-sol pKa prediction showing destabilised Q793H and neighbouring H627. (C) HLAC-Y195H: mutation sites (instances ≥ 17 in COSMIC, green) are displayed in the peptide binding domain of structure 6pag [51], with bound peptide shown in purple. Mutations other than Y195H are S48A, L171W, T187M/T187P, A97T, L180R, A176V, L119I, all adjacent to the peptide binding groove, and R155S, D114A in this domain but not adjacent to the peptide binding groove. (D) AQP7-Y115H: left shows wild type Y115 in the AlphaFold2 protomer model, and right the modelled Y115H mutation. In both panels the neighbouring residue H92 is shown.

The first example, L702H of androgen receptor (ANDR), is part of the neutral pH subset. This mutation is documented to alter the affinity for some natural ligands, and also for drugs that are targeted at the oncogene ANDR to down-regulate its transcriptional activity in tumour growth [46]. Mutations at relatively buried sites and adjacent to ligand binding pockets can affect processes that impact on tumorigenesis in both enzymes and non-enzymes.

3.4.2 Protocadherins

Two of the 10 sites (Table 1) are in protocadherins, a large group of proteins involved in cell-cell adhesion [47]. In PCDHGB4, D268 is part of a cluster of 3 Asp that contribute to a calcium binding site, referencing a mouse PCDHGB4 structure (6e6b), with mouse D238 equivalent to human D268 [48]. Using the pka tool on the protein-sol server [25], a web implementation of pkcalc, pKa predictions were made for PCDHGB4-WT and PCDHGB4-D268H, in both cases in the absence of calcium (Fig 5A). Interestingly, destabilisation for parts of the Asp cluster, anticipated in the absence of calcium ion (due to repulsion between the adjacent carboxylate groups), is replaced with stabilising predicted pKa changes for the D268H mutant (with calculated pKa greater than the normal His sidechain pKa value of 6.3). The histidine is predicted to be positively-charged at neutral pH (pkcalc), networking favourably with the remaining Asp sidechains in the calcium binding site. Whether this predicted electrostatic stabilisation, or perhaps an introduction of His-mediated pH-dependence, plays a role in tumour growth is unknown. The other protocadherin mutation in Table 1 is PCDHB16-Q638H. Again a mouse structure is available (5szq), with mouse Q612 equivalent to human Q638 [49]. In this case Q638 is not part of, or adjacent to, a calcium binding cluster of AAs. Since the site is buried and the predicted ΔpKa is negative, incorporation of His could lead to pH-dependence if the EC pH is depressed close to the normal His sidechain pKa (6.3).

3.4.3 R191H in TINAG

TINAG is a secreted protein that is a constituent of basement membranes. It has been implicated in cancer, with expression levels correlated with survival of patients with kidney renal clear cell carcinoma [52]. R191H is the most common TINAG missense mutation in COSMIC. Whereas the buried R191 is able to salt-bridge with D236, and as a result is a predicted stabilising influence on structure, R191H in the model is not able to make the same salt-bridge, losing some stability. Further destabilisation would occur should the environmental pH fall to around the normal His sidechain pKa. No specific information on the effect of TINAG-R191H is available.

3.4.4 R350H in TMEM168

A multi-pass TM protein in the nuclear membrane, TMEM168 has a reported association with glioblastoma multiforme (GBM) [53], with no indication of mechanism for R350H, its most numerous mutation in COSMIC. This mutation lies adjacent to a modelled TM segment. As for TINAG-R191H, it is predicted to mutate from a stabilising network of Arg interactions (in this case with backbone carbonyls), to a largely buried His. In this case though the loss of stabilising Arg is unlikely to be supplemented by His-mediated pH-dependence, since the environment pH is well above the free His sidechain pKa.

3.4.5 Q793H in NDST1

The single pass TM protein NDST1 is a Golgi luminal bifunctional enzyme involved in heparan sulphate modification. Q793 lies in a flanking domain adjacent to the active site carrying the sulfotransferase activity (8ccy) [50], buried and making hydrogen-bonds with two backbone carbonyl groups. A modelled Q793H is unable to make these interactions, with a predicted acidic pKa (Fig 5B). Destabilisation relative to wild type is predicted to occur both through the loss of interaction to neighbouring main-chain and from a reduced stability around the introduced His at the acidic Golgi pH of 6.0 to 6.7 [54]. Of interest is a neighbouring and buried His (H627), predicted to have a negative ΔpKa in wild type AlphaFold2 model (pkcalc and PROPKA) and mutated enzyme (pkcalc), as well as in the cryo-EM structure (8ccy, pkcalc), despite the adjacent Asp (D629). The Q793H mutation could be adding to existing pH-dependence mediated by H627, or the interaction of H627 with D629 could be more stabilising than predicted, so that H627 would not be a source of structural instability at Golgi pH. Whatever the case, any His effect will depend on the precise value of Golgi pH relative to the free His sidechain pKa (6.3) and could, in principle, act as a protein stability sensor of Golgi pH. It has been reported that a microRNA upregulated in GBM tissues targets and negatively regulates NDST1 [55], consistent with the hypothesis that complementary protein structure destabilisation could play a similar role.
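The sensitivity to the precise Golgi pH can be illustrated with the Henderson-Hasselbalch relation for a single, isolated titratable site. The following is a minimal sketch only: it is not the pkcalc or PROPKA method, the function name is hypothetical, and the pH value of 7.2 is an illustrative near-neutral comparison that does not come from the text.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a single titratable site that is protonated at a given pH
    (Henderson-Hasselbalch, assuming an isolated, non-interacting site)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


FREE_HIS_PKA = 6.3  # free His sidechain pKa quoted in the text

# Golgi pH range quoted in the text (6.0 to 6.7), plus an assumed
# near-neutral value (7.2) for comparison.
for ph in (6.0, 6.3, 6.7, 7.2):
    frac = protonated_fraction(ph, FREE_HIS_PKA)
    print(f"pH {ph:.1f}: protonated fraction = {frac:.2f}")
```

Across the quoted Golgi pH range the protonated fraction of a free His changes substantially, which is why a buried His with a shifted pKa could, in principle, couple stability to small changes in Golgi pH.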

3.4.6 R149H in MGAT4C

MGAT4C is also a single pass TM protein, here with glycosyltransferase activity and R149 located in the Golgi lumen. R149 and D150 interact in the AlphaFold2 model. R149H (with low predicted pKa, Table 1) could act as a pH sensor if Golgi pH is comparable to the free histidine sidechain pKa. If the mutation is detrimental to enzyme activity, this would be consistent with reported increased GBM tumour growth when MGAT4C is under-expressed [56].

3.4.7 Y195H in HLA-C

HLA-C Y195 is buried and located adjacent to the peptide binding groove (with reference to a reported complex structure, 6pag [51]). While His is sterically similar to Tyr, without specific protein charge solvation it has a low predicted pKa (Table 1), and could be a source of structural instability where environmental pH falls to around 6.3. This is perhaps attainable in the acidic extracellular environment of tumours, but is also possible in the exocytotic pathway [57] that transports peptide-loaded HLA-I complexes from the ER to the cell surface. Additionally, of the 9 mutations in COSMIC with equal or greater number of instances to Y195H, that are located in the peptide binding domain of HLA-C, 7 are adjacent to the peptide binding groove and (as suggested for Y195H) could therefore affect binding affinity (Fig 5C). This mechanism could contribute to the inhibition of immune surveillance that is a feature of tumorigenesis [58].

3.4.8 Y115H in AQP7

Aquaporin-7 (AQP-7) assembles in the cell membrane as a homotetramer. Each protomer contains a pore for water and glycerol. Structural data reveal that Y115 is adjacent to the pore, as is a neighbouring AA, H92 (6qzi, [59]). This pair of AAs is conserved in AQP-10 (Y103 and H80, [60]), where H80 is proposed to mediate pH-dependence with protonation at pH 5. Predicted pKas of H80 in the AlphaFold2 protomer model of AQP-10 are 3.4 (pkcalc) and 3.9 (PROPKA), noting again that His pH-dependence depends on pH falling below the normal His sidechain pKa (6.3), and thereby favouring a less buried environment. An argument is made that since H92 of AQP-7 replicates H80 of AQP-10, and that AQP-7 does not exhibit pH-dependence of pore activity (between pH 6 and 8), other conformational effects are involved [59]. Indeed, predicted pKas for AQP-7 H92 in the wild type AlphaFold2 monomer are still low at 3.3 (pkcalc) and 5.09 (PROPKA), so that predicted pH-dependent instability around normal His sidechain pKa is insufficient to alter conformation, given the lack of observed pH-dependent function. A factor that could change the conformational balance is introduction of a further buried His (Y115H mutation). It is possible that Y115H introduces a pH-dependent pore function to AQP-7, similar to that of AQP-10 (Fig 5D).
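The link between a depressed His pKa and pH-dependent instability can be sketched with the standard thermodynamic linkage relation for one ionisable group. This is an illustrative calculation rather than the method used for the predictions above. It assumes a single independent site, an unfolded (or solvent exposed) state pKa equal to the free His value of 6.3, and a temperature of 298.15 K; the function name is hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K)


def ddg_fold(ph, pka_folded, pka_unfolded=6.3, temp_k=298.15):
    """pH-dependent contribution (kJ/mol) of one ionisable site to folding
    free energy, relative to the fully deprotonated (high pH) reference.
    Positive values are destabilising."""
    ratio = (1.0 + 10.0 ** (pka_folded - ph)) / (1.0 + 10.0 ** (pka_unfolded - ph))
    return -R_KJ * temp_k * math.log(ratio)


# Predicted pKa values quoted in the text for AQP-7 H92 (pkcalc and PROPKA).
for label, pka in (("pkcalc", 3.3), ("PROPKA", 5.09)):
    for ph in (8.0, 6.3, 5.0):
        print(f"{label} pKa {pka}: pH {ph:.1f} -> {ddg_fold(ph, pka):.2f} kJ/mol")
```

Under these assumptions the destabilisation grows as pH falls through the free His pKa, and is larger for the more strongly depressed predicted pKa.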

3.4.9 Q192H in KCNJ12

In a further example, Q192H of KCNJ12, the Kir2.2 inwardly rectifying potassium channel, is accessible within the AlphaFold2 protomer model (unlike the other 9 sites in Table 1), with both pkcalc and PROPKA predicting a pKa slightly depressed to mildly acidic pH. Since this AA is located on the cytoplasmic side of the membrane, it is unlikely to encounter an environmental pH much below neutral. However, it also lies next to the inositol 4,5-bisphosphate group of channel regulator PIP2, which contributes to control of channel opening [61]. In the assembled channel, the Q192 site is much less accessible, due to the adjacent PIP2 and a neighbouring protomer in the channel tetramer. Interaction of the Q192H sidechain with neighbouring phosphate groups could elevate its pKa, and introduce a pH-dependent element to channel opening at a pH close to neutral. Notably, other sources of electrostatic modulation of Kir2.2 activity on the cytoplasmic face have been discovered [62].

3.5 Context on filtered His mutations

In a search for sites where somatic mutations may underpin tumour adaptation to pH, several filters have been applied (introduction of buried His in AlphaFold protomer models, instances in COSMIC of at least 10, consideration of the environmental pH). These are restrictive, noting for example the role of Asp and Glu in pH-dependent processes [30], yielding a relatively small number of sites. There are systems with histidine mutations, not revealed in the current work, where a pH-dependence of molecular function has been established and potential association to cancer suggested. These include IDH1-R132H ([63], already discussed); β-catenin H36P and other target AAs ([22], an accessible site involved in protein-protein interactions); and several systems where the summed instances in COSMIC is just one: RasGRP1-H212 [64], FAK1-H58 [65], and a 4 His cluster in NHE1 [66]. These characterised systems can be used to assess the effectiveness of predictions of pH-dependence, although they are not in Table 1. The indicated sites (Arg in IDH1, His in RasGRP1 and FAK1) are buried (< 20 Å² SASA) with large predicted pKa changes, and would therefore be flagged as of interest in respect of pH-dependence. The loop carrying the 4 His cluster of NHE1 is exposed to solvent in the AlphaFold model of NHE1, but largely buried at the interface with obligate binding partner CHP1 [67], demonstrating one way in which the current computational pipeline can fail, where pH-dependence is only manifested when burial occurs at an interface between molecules. The H36P β-catenin mutation falls into a similar category, although with the added complexity that H36 is in an intrinsically disordered region, in the absence of a binding partner. The advent of AlphaFold protomer models is in itself a very significant step forward for the type of analysis reported here, allowing complete human proteome coverage. Greater coverage of the human biomolecular interactome, through experimental and modelling [68] methods, will yield further insight.
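The filters described above can be sketched as a simple predicate over candidate sites. This is a hedged illustration and not the pipeline used in this work: the class, field names and the environmental pH cut-off (`max_env_ph`) are hypothetical, the 20 Å² burial threshold is taken from the value quoted for the characterised sites, and the example sites carry made-up values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    label: str               # hypothetical identifier, e.g. "siteA"
    mutant_aa: str           # one-letter code of the introduced AA
    sasa: float              # solvent accessible surface area (Å²) in the protomer model
    instances: int           # summed instances in COSMIC
    env_ph: Optional[float]  # assumed environmental pH, None if unknown


def passes_filters(site, max_sasa=20.0, min_instances=10, max_env_ph=7.0):
    """True if a site introduces a buried His, is frequently mutated, and sits
    in an environment where pH may approach the His pKa (assumed cut-off)."""
    return (
        site.mutant_aa == "H"
        and site.sasa < max_sasa
        and site.instances >= min_instances
        and site.env_ph is not None
        and site.env_ph <= max_env_ph
    )


# Made-up examples: a buried, frequent site; a site exposed in the protomer
# model (as for an interface that is only buried in a complex); a rare site.
sites = [
    Site("siteA", "H", 5.0, 12, 6.5),
    Site("siteB", "H", 80.0, 15, 6.5),
    Site("siteC", "H", 3.0, 1, 6.5),
]
print([s.label for s in sites if passes_filters(s)])
```

The second example shows the failure mode noted for NHE1: a site that is exposed in the protomer model is rejected, even if it would be buried at an interface with a binding partner.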

Referring to the experimental characterisations of pH-dependence, it appears that for some of these the mechanisms of pH-dependence are not sufficiently susceptible to modulation by somatic mutation, or sufficiently coupled to tumour growth, to appear at high incidence in the COSMIC database. Of note is that transcriptional and other changes that alter protein levels are common in tumorigenesis. For example, it may be simpler to enhance sodium / proton antiporting activity of NHE1, and response to acidosis [69], through transporter levels than through modulation of intrinsic activity.

3.6 Net mutation from arginine and to lysine is evident in COSMIC

The percentages of Arg sites and of Lys sites in a protein that are mutated in COSMIC both correlate with the overall percentage of sites mutated in that protein. Whereas the percentage of Arg sites mutated exceeds the overall percentage for the majority of proteins, the inverse is generally the case for mutation from Lys (Fig 6). This effect is evident for datasets covering all COSMIC mutations (Fig 6A), and including only those sites with at least 10 instances of a mutation (Fig 6B). Emphasis of mutations from Arg over those from Lys is complemented by an inverse effect for mutations to Arg or Lys, including the prevalent Glu to Lys mutation (not shown). It is apparent that whatever the DNA mutation mechanisms behind amino acid changes, they result in an overall shift from Arg to Lys.

Fig 6. Extent of missense mutation from Arg and Lys in COSMIC compared with overall mutation.

Underlying data are the overall percentages of locations in each gene that have mutations recorded in COSMIC, and also those percentages when considering just wild type Arg and Lys sites. The overall percentage data are binned (horizontal axis) and displayed with the Arg or Lys data on the vertical axis. (A) Analysis for all missense mutations in COSMIC. (B) Analysis for sites with ≥ 10 instances in COSMIC.
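The per-protein quantities underlying Fig 6 can be illustrated with a short sketch. It assumes that a protein is represented by its sequence and by the set of 1-based positions with at least one recorded missense mutation; the function name and the toy sequence are hypothetical and are not taken from the dataset.

```python
def percent_mutated(sequence, mutated_positions, residue=None):
    """Percentage of sites that are mutated, optionally restricted to sites
    with a given wild type residue. Positions are 1-based. Returns None if
    the protein has no sites of the requested type."""
    sites = [i for i, aa in enumerate(sequence, start=1)
             if residue is None or aa == residue]
    if not sites:
        return None
    hits = sum(1 for i in sites if i in mutated_positions)
    return 100.0 * hits / len(sites)


# Toy example: 8 residue sequence with mutations recorded at positions 2, 4, 5.
seq = "MRKRAKLR"
mutated = {2, 4, 5}
print(percent_mutated(seq, mutated))       # overall percentage
print(percent_mutated(seq, mutated, "R"))  # wild type Arg sites only
print(percent_mutated(seq, mutated, "K"))  # wild type Lys sites only
```

Binning proteins by the overall percentage and averaging the Arg or Lys percentages within each bin would then give curves of the kind described in the caption.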

3.7 Lysine to arginine balance is a feature in studies of protein hydrophobicity and solubility

Given that the incidence of Arg to His mutations does not appear to be systematically linked to modulation of pH-dependence, it is reasonable to ask what other consequences there may be for mutation from Arg. Although this is widespread in different cancer types, and results from specific mutational mechanisms to generate DNA changes, there remains a question of how advantage may accrue for tumour growth, as seen for skin cancer on the background of a particular mutational signature (Fig 2C). One suggestion has been that mutation to cysteine could counter production of reactive oxygen species during tumour growth [13]. Another possibility, discussed here, is that a rebalancing of Arg and Lys could be beneficial for maintaining cellular proteostasis in the rapid growth of tumours.

Several studies have pointed to the differentiated properties of Lys and Arg in scales of hydrophobicity and solubility. Locations of AAs in protein structures have been used to derive a stickiness scale, in which Lys is the least sticky AA with Arg lying close to the middle of the 20 AAs [70]. Combining this scale with abundance data, it was found that more abundant proteins have less sticky surfaces. Divergence between results from a large-scale study of AA contributions to protein stability, and evolutionary AA usage, was used to construct a scale representing the influence of non-stability factors on AA conservation [71]. It was suggested that solubility is a major contributor, and Lys (substantial divergence) is well separated from Arg (little divergence). Comparing Lys and Arg content with solubility data revealed that Lys is enriched relative to Arg in many of the more soluble proteins [72]. This observation contributes to the protein-sol solubility prediction tool [73]. A scale based on structural flexibility has also been used to make a model for protein solubility, with Lys again separated from Arg, grouping with Asp and Glu at the more soluble end of the spectrum [74].

The observation that Lys/Arg balance is a factor in determining protein solubility, and that it is modulated in the dataset of cancer somatic mutations, is of interest when considering protein homeostasis (proteostasis) in cancer. Links have been made between the role of proteostasis in ageing and in cancer [75], and the pursuit of cancer therapies that target proteostatic factors has been reviewed [76]. Examples of the latter include targeting of HSP90 [77], and HSF1 [78]. Further, it has been shown that a major part of the Caenorhabditis elegans proteome is close to intrinsic solubility limits [79]. If this fine solubility balance extends to tumour cells in humans, where altered protein expression is common, then it might be expected that somatic mutations that modulate some aspect of solubility would occur.

3.8 Arginine stands out in variation with COSMIC instances, and conservation

In order to catalogue the over-representation of arginine somatic mutation (summed over target amino acids), both the absolute numbers (Fig 7A), and distributions (Fig 7B), for wild type AAs are calculated, for various ranges of instances in COSMIC. Several amino acids are close in terms of highest numbers of single instance sites (Fig 7A). Notably, all the AAs with single instance numbers comparable to or larger than Arg (Ala, Glu, Gly, Leu, Pro, Ser) have higher AA composition values in the human proteome. That the proportion of mutations at arginine (Fig 7B) increases above one instance demonstrates the higher propensity for mutation of Arg. This proportion, for Arg, falls for the category of ≥ 10 instances, but remains larger than for other AAs. Single instance numbers are used, along with AA composition, to estimate an overall background probability of mutation for each amino acid (see Materials and methods). Calculated probabilities are then applied to estimate the expected number of mutations for each wild type AA, at different numbers of instances, in the non-driver case. Finally, the modelled distributions are subtracted from the observed distributions (Fig 7B) to yield the differences (Fig 7C). Many AAs show moderate increases in their share of the overall distribution (in actual versus modelled), as the instances range rises to ≥ 10, at the expense of a large drop for Arg. Despite this fall for Arg, relative to a model without adaptation, it remains the largest contributor to mutation sites for instances ≥ 10 (Fig 7B). Presumably the relative fall in Arg mutation occurs as adaptation plays a more important role for sites with higher instances. While this effect is conveniently displayed in terms of the percentage distributions across AAs, it is instructive to note the absolute numbers. For the ≥ 10 instance subset, there are 3,008 Arg sites in a total of 12,121 in COSMIC, compared with 560 Arg from a total of 611 (background model), noting that these calculations are made with instances summed over all target AAs at a site. Despite the simple model employed, the difference between these numbers (actual versus model) likely demonstrates the impact of adaptation as the number of instances increases. Notably, mutation from Arg is dominant in all subsets of instances.
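The absolute numbers quoted for the ≥ 10 instance subset can be converted into the percentage shares that are compared in Fig 7. The sketch below is simple arithmetic on the counts given in the text; it assumes that these counts map directly onto the observed and modelled percentage distributions, and it does not reproduce the background model itself (see Materials and methods).

```python
def share(count, total):
    """Percentage share of a category within a total."""
    return 100.0 * count / total


# Counts quoted in the text for sites with >= 10 instances.
observed_arg = share(3008, 12121)  # Arg sites in COSMIC
modelled_arg = share(560, 611)     # Arg sites in the background model

# Observed minus modelled, in percentage points (negative means Arg has a
# smaller share than expected from the background model).
difference = observed_arg - modelled_arg

print(f"observed Arg share: {observed_arg:.1f}%")
print(f"modelled Arg share: {modelled_arg:.1f}%")
print(f"difference: {difference:.1f} percentage points")
```

On these numbers Arg accounts for roughly a quarter of observed sites but the large majority of modelled sites, which corresponds to the large drop for Arg described above.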

Fig 7. The distribution of missense mutation sites in COSMIC.

(A) Numbers of missense mutation sites for the 20 AAs are shown for different ranges of instances recorded in COSMIC. (B) Percentage distributions of missense mutations for the 20 AAs. (C) Percentage distributions, modelled from the frequency of single instance mutations (see Materials and methods), are subtracted from those observed in COSMIC, to estimate the effect of adaptation at higher instances.

Another property for which Arg behaves differently as the number of instances increases is the Functional Impact Score (FIS), available as pre-calculated values for the human proteome from the Mutation Assessor tool [33]. The basis for FIS and MA scores is multiple sequence alignment, so that it is primarily recording evolutionary conservation. The method performs comparably with others in predicting non-synonymous single nucleotide variant pathogenicity [80, 81]. For most AAs, the MA score and therefore conservation is substantially lower for mutations with higher number of instances (Fig 8). Arg departs most from the overall trend, with relatively small changes in conservation, and no reduction for the ≥ 10 instances subset. The general shift to lower conservation in the highest instances subset (Fig 8) may reflect a balance shifted towards GoF mutations. Similarly the much smaller change for Arg could indicate a lesser role for GoF mutations. Examples of Arg LoF mutation include protein-DNA interaction sites, for example in p53 [45], and RNA splicing, for example SF3B1 [82]. Even so, conservation for Arg overall changes little across the instances subsets, rather than moving towards greater conservation at higher instances, as would be expected should LoF mutations be dominating in tumour adaptation.

Fig 8. Analysis of AA conservation with increasing COSMIC instances.

(A) Change in FIS (representing AA conservation), with the AA values averaged over COSMIC instance populations of missense mutations subtracted from the equivalent averages for AAs at sites that do not appear in COSMIC.
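The quantity plotted in Fig 8 can be sketched as a per-AA difference of means. The sign convention below (mean at sites absent from COSMIC minus mean in a COSMIC instance subset) is as read from the caption and should be treated as an assumption; the function name and the toy scores are hypothetical and are not Mutation Assessor values.

```python
from statistics import mean


def fis_change(fis_not_in_cosmic, fis_in_subset):
    """Per wild type AA: mean FIS at sites absent from COSMIC minus mean FIS
    for sites in one COSMIC instance subset. Inputs map AA -> list of scores."""
    return {
        aa: mean(fis_not_in_cosmic[aa]) - mean(fis_in_subset[aa])
        for aa in fis_in_subset
        if aa in fis_not_in_cosmic
    }


# Made-up scores: little change for Arg, lower conservation for Leu
# in the higher instance subset.
not_in_cosmic = {"R": [2.0, 2.2], "L": [1.8, 2.0]}
high_instance = {"R": [2.1, 2.1], "L": [1.0, 1.2]}
print(fis_change(not_in_cosmic, high_instance))
```

With this convention a positive value indicates lower average conservation in the COSMIC subset than at sites that are not mutated.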

3.9 Proteins with mutations from Arg and to Lys, at higher COSMIC instances, are enriched for cell periphery location

To examine the hypothesis that subsets of mutations from Arg, and to Lys, introduced broadly across cancer types, could be combatting proteostasis challenges in cancer cells, gene ontology was examined for sets of proteins with mutations at different numbers of instances in COSMIC (summed over all target mutations at a site when considering mutation from an AA). Using the Princeton GO Finder tool [34] for the from Arg and to Lys mutations, the cell periphery category in the GO component classification is consistently either the most significant category returned (for all instances of to Lys mutations, and instances 8, 9, ≥ 10 of from Arg mutations in Fig 9), or close to the most significant (instances 4, 5, 6, 7 of from Arg mutations in Fig 9). The percentage difference between each set tested and the background human proteome set was calculated for cell periphery (Fig 9). While there is a general increase of mutations for proteins at the cell periphery as number of instances increases (except for the ≥ 10 subsets), both from Arg (Fig 9A) and to Lys (Fig 9B) sets are further enriched. Viewed in the context of membrane proteins as key cancer hallmarks [83, 84], and the connection between cell surface protein aggregation, membrane protein homeostasis, and the endocytotic pathway [85, 86], it is reasonable to ask whether a subset of somatic mutations could be linked to proteostasis at the cell membrane.

Fig 9. Cell periphery enrichment of proteins with mutations from Arg and to Lys.

Percentage enrichments for the GO category of cell periphery are plotted for overall mutation populations and from Arg (A), and to Lys (B) mutations. Results are shown for different numbers of mutation instances in COSMIC.
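The enrichment plotted in Fig 9 can be illustrated as a difference between two percentages. The sketch assumes that "percentage difference" means the percentage of a tested protein set annotated to cell periphery minus the equivalent percentage for the background proteome (a difference in percentage points); a relative difference is an alternative reading. The function name and the gene identifiers are hypothetical.

```python
def periphery_enrichment(test_set, background, periphery):
    """Percentage of the test set annotated to a GO category minus the
    percentage of the background set with that annotation."""
    pct_test = 100.0 * len(test_set & periphery) / len(test_set)
    pct_background = 100.0 * len(background & periphery) / len(background)
    return pct_test - pct_background


# Toy example with made-up identifiers: 10 background proteins, 3 annotated
# to cell periphery; 4 proteins in the tested set, 2 of them annotated.
background = {f"g{i}" for i in range(1, 11)}
periphery = {"g1", "g2", "g3"}
tested = {"g1", "g2", "g7", "g8"}
print(periphery_enrichment(tested, background, periphery))
```

The statistical significance reported by GO term enrichment tools is a separate calculation and is not reproduced here.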

The number of mutations in a tumour varies considerably between cancer types, and within a cancer type [87]. For the missense somatic mutations of the current study, retrieved from the February 2023 COSMIC database, numbers of mutations reported for each unique tumour and sample identifier pair were calculated for a subset of cancer types. At the higher end, in 6,021 sequenced samples, stomach cancers have an average of 62 missense mutations per sample, with 684 samples ≥ 100 mutations, and 58 ≥ 1,000. For 24,756 skin cancer samples, the average is 34 missense mutations, with 1,119 ≥ 100 and 241 ≥ 1,000. At the lower end, 22,774 breast cancer samples have an average of 12, with 444 ≥ 100 and 14 ≥ 1,000. Percentages of all mutations that are either from Arg or to Lys, are (24, 25, 25), and for mutations with ≥ 10 instances are (32, 27, 34), for the stomach, skin, and breast cancer, respectively. Given the extent to which proteins are expressed at their solubility limit [79], mutation numbers in some tumour samples may be sufficient to influence proteostasis.
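The per-sample summary described above can be sketched as a count over tumour and sample identifier pairs. This is an illustration under stated assumptions: the input is taken to be one (tumour identifier, sample identifier) pair per recorded missense mutation, the function name is hypothetical, and samples with no recorded missense mutations would not appear in such input and are therefore not counted.

```python
from collections import Counter


def summarise_samples(records):
    """records: iterable of (tumour_id, sample_id) pairs, one per missense
    mutation. Returns the number of samples, the mean mutations per sample,
    and the numbers of samples with >= 100 and >= 1,000 mutations."""
    counts = list(Counter(records).values())
    return {
        "samples": len(counts),
        "mean": sum(counts) / len(counts),
        "ge_100": sum(1 for c in counts if c >= 100),
        "ge_1000": sum(1 for c in counts if c >= 1000),
    }


# Toy input: three samples with 2, 150 and 1,200 mutations respectively.
toy = [("t1", "s1")] * 2 + [("t2", "s2")] * 150 + [("t3", "s3")] * 1200
print(summarise_samples(toy))
```

Applied to each cancer type in turn, a summary of this form gives the averages and threshold counts quoted in the text.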

4. Conclusions

Mutation from Arg is a dominant feature of missense somatic mutations in cancer (Fig 1A). Arginine depletion [14] overall is related to underlying mutational signatures in DNA, but the balance of AA mutations varies for higher instances compared with lower instances (Fig 1B). The inventory of missense somatic mutations with greater occurrence in COSMIC is therefore the result of both underlying mutational signature and adaptation for tumour growth, with mutation from Arg prominent throughout (Fig 7). One of the targets of Arg missense mutation is His, which has led to the suggestion that Arg to His mutations may be generating pH-sensing functions that aid adaptation to altered pH in tumour growth [20]. This may be relevant for a subset of Arg to His mutations, but higher occurrence Arg mutations are enriched for mutation to Cys and Gln, relative to His (Fig 1). Investigation more generally for mutations involving His showed no clear signal overall for higher instance sites, compared with lower instance sites, that indicate pH-sensing (Fig 3). Adding further filters, including burial and lower environmental pH, revealed a number of sites at which pH-sensing may play a role in adaptation for tumour growth (Fig 5). Other avenues to study include looking beyond AlphaFold protomers, at AAs other than His [30], modelling the coupling between protonation and metal ion binding, and accounting for the pH-dependence of enzyme activity. Computational studies will supplement new experimental tools for measuring the relationships between cancer, pH, transcription and proteome stability [88].

In respect of arginine depletion, missense somatic mutations from Lys are under-represented, opposite to Arg, contributing to a rebalancing of Arg and Lys in cancer genomes (Fig 6). This result is intriguing in the context of Lys and Arg being well separated in a number of hydrophobicity and solubility scales, where Lys is favoured for solubility. It is unknown what level of missense somatic mutations would be necessary to influence proteostasis. Average numbers of missense mutations in cancer genomes are typically in the 10s, but with significant numbers of genomes in the 100s, and some in the 1,000s. Gene ontology analysis suggests that if Arg / Lys rebalancing is a significant factor in proteostasis, then the cell periphery is a candidate location (Fig 9). A recent study has used antibody-mediated cross-linking of cell surface proteins to reveal aggregation-dependent endocytosis and lysosomal degradation as a proteostasis response to altered membrane protein behaviour [85]. An experimental test of any contribution from Arg / Lys rebalancing in membrane proteins could be investigated in a similar way, but substituting altered Arg / Lys content, and perhaps altered membrane protein expression, for antibody-mediated cross-linking. Another possibility is to study the wild type and arginine depleted aqueous phase domains (from the membrane proteins) in solution, or at a surface [89]. A potentially related area is the decline of proteostasis mechanisms in ageing [90]. Since Arg depletion arises, at least in part, from underlying C > T DNA mutations [14], it may also contribute more generally to modulation of proteostasis, although any role of somatic mutations in ageing is not yet clear [91].

Finally, altered expression of proteins in cancer is an important factor, whether in considering proteostasis or in the balance between expression and somatic mutation for mediating adaptation to shifted pHi and pHe in tumours. It may be more effective to tune the level of a protein than to modulate its function through mutation, or introduce new (pH-sensing) functions. This is particularly the case with expression levels of acid-base transporters [5], and also modulation of degradation [92].

Supporting information

S1 Fig. Structure-based SASA and ΔpKa calculations for mutated Asp, Glu, and Lys.

The box and whisker plots show quartiles, median (central line), and mean (cross). (A) Distributions of predicted Asp (D), Glu (E), and Lys (K) ΔpKas (pkcalc) are shown for instEQ1 and instGE10 mutation data. Limiting thresholds of +/- 3 are applied for ΔpKa values. (B) The ΔpKa data for mutations in panel (A) are replaced with SASA, for the same subsets.

(PDF)

Acknowledgments

The authors thank Sifan Zhang for valuable discussions, and staff at the University of Manchester Computational Shared Facility for facilitating storage and processing of data.

Data Availability

All relevant data are within the manuscript.

Funding Statement

This work was supported by UK Biotechnology and Biological Sciences Research Council grant BB/V0065921/1 to JW. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Corbet C, Feron O. Tumour acidosis: from the passenger to the driver’s seat. Nat Rev Cancer. 2017;17(10):577–93. Epub 2017/09/16. doi: 10.1038/nrc.2017.77 . [DOI] [PubMed] [Google Scholar]
  • 2.Sennino B, McDonald DM. Controlling escape from angiogenesis inhibitors. Nat Rev Cancer. 2012;12(10):699–709. Epub 2012/09/25. doi: 10.1038/nrc3366 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Andersen AP, Moreira JM, Pedersen SF. Interactions of ion transporters and channels with cancer cell metabolism and the tumour microenvironment. Philosophical transactions of the Royal Society of London. 2014;369(1638):20130098. Epub 2014/02/05. doi: 10.1098/rstb.2013.0098 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Swietach P, Boedtkjer E, Pedersen SF. How protons pave the way to aggressive cancers. Nat Rev Cancer. 2023;23(12):825–41. Epub 2023/10/27. doi: 10.1038/s41568-023-00628-9 . [DOI] [PubMed] [Google Scholar]
  • 5.White B, Swietach P. What can we learn about acid-base transporters in cancer from studying somatic mutations in their genes? Pflugers Arch. 2024;476(4):673–88. Epub 2023/11/24. doi: 10.1007/s00424-023-02876-y . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Hajjar S, Zhou X. pH sensing at the intersection of tissue homeostasis and inflammation. Trends Immunol. 2023;44(10):807–25. Epub 2023/09/16. doi: 10.1016/j.it.2023.08.008 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Hao G, Xu ZP, Li L. Manipulating extracellular tumour pH: an effective target for cancer therapy. RSC Adv. 2018;8(39):22182–92. Epub 2018/06/19. doi: 10.1039/c8ra02095g . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Rohani N, Hao L, Alexis MS, Joughin BA, Krismer K, Moufarrej MN, et al. Acidification of Tumor at Stromal Boundaries Drives Transcriptome Alterations Associated with Aggressive Phenotypes. Cancer Res. 2019;79(8):1952–66. Epub 2019/02/14. doi: 10.1158/0008-5472.CAN-18-1604 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Schonichen A, Webb BA, Jacobson MP, Barber DL. Considering protonation as a posttranslational modification regulating protein structure and function. Annu Rev Biophys. 2013;42:289–314. Epub 2013/03/05. doi: 10.1146/annurev-biophys-050511-102349 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Sondka Z, Dhir NB, Carvalho-Silva D, Jupe S, Madhumita, McLaren K, et al. COSMIC: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 2024;52(D1):D1210–D7. Epub 2024/01/06. doi: 10.1093/nar/gkad986 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Anoosha P, Sakthivel R, Michael Gromiha M. Exploring preferred amino acid mutations in cancer genes: Applications to identify potential drug targets. Biochim Biophys Acta. 2016;1862(2):155–65. Epub 2015/11/20. doi: 10.1016/j.bbadis.2015.11.006 . [DOI] [PubMed] [Google Scholar]
  • 12.Szpiech ZA, Strauli NB, White KA, Ruiz DG, Jacobson MP, Barber DL, et al. Prominent features of the amino acid mutation landscape in cancer. PLoS One. 2017;12(8):e0183273. Epub 2017/08/25. doi: 10.1371/journal.pone.0183273 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Tsuber V, Kadamov Y, Brautigam L, Berglund UW, Helleday T. Mutations in Cancer Cause Gain of Cysteine, Histidine, and Tryptophan at the Expense of a Net Loss of Arginine on the Proteome Level. Biomolecules. 2017;7(3). Epub 2017/07/04. doi: 10.3390/biom7030049 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Nelakurti DD, Rossetti T, Husbands AY, Petreaca RC. Arginine Depletion in Human Cancers. Cancers (Basel). 2021;13(24). Epub 2021/12/25. doi: 10.3390/cancers13246274 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Alexandrov LB, Stratton MR. Mutational signatures: the patterns of somatic mutations hidden in cancer genomes. Curr Opin Genet Dev. 2014;24(100):52–60. doi: 10.1016/j.gde.2013.11.014 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Greenblatt MS, Bennett WP, Hollstein M, Harris CC. Mutations in the p53 tumor suppressor gene: clues to cancer etiology and molecular pathogenesis. Cancer Res. 1994;54(18):4855–78. . [PubMed] [Google Scholar]
  • 17.Consortium EP, Snyder MP, Gingeras TR, Moore JE, Weng Z, Gerstein MB, et al. Perspectives on ENCODE. Nature. 2020;583(7818):693–8. doi: 10.1038/s41586-020-2449-8 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Otlu B, Diaz-Gay M, Vermes I, Bergstrom EN, Zhivagui M, Barnes M, et al. Topography of mutational signatures in human cancer. Cell Rep. 2023;42(8):112930. doi: 10.1016/j.celrep.2023.112930 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Auboeuf D. Physicochemical Foundations of Life that Direct Evolution: Chance and Natural Selection are not Evolutionary Driving Forces. Life (Basel). 2020;10(2). Epub 2020/01/25. doi: 10.3390/life10020007 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.White KA, Ruiz DG, Szpiech ZA, Strauli NB, Hernandez RD, Jacobson MP, et al. Cancer-associated arginine-to-histidine mutations confer a gain in pH sensing to mutant proteins. Sci Signal. 2017;10(495). Epub 2017/09/07. doi: 10.1126/scisignal.aam9931 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Sesanto R, Kuehn JF, Barber DL, White KA. Low pH Facilitates Heterodimerization of Mutant Isocitrate Dehydrogenase IDH1-R132H and Promotes Production of 2-Hydroxyglutarate. Biochemistry. 2021;60(25):1983–94. Epub 2021/06/19. doi: 10.1021/acs.biochem.1c00059
  • 22. White KA, Grillo-Hill BK, Esquivel M, Peralta J, Bui VN, Chire I, et al. beta-Catenin is a pH sensor with decreased stability at higher intracellular pH. The Journal of cell biology. 2018;217(11):3965–76. Epub 2018/10/14. doi: 10.1083/jcb.201712041
  • 23. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. Epub 2021/07/16. doi: 10.1038/s41586-021-03819-2
  • 24. Olsson MH, Sondergaard CR, Rostkowski M, Jensen JH. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J Chem Theory Comput. 2011;7(2):525–37. Epub 2011/02/08. doi: 10.1021/ct100578z
  • 25. Hebditch M, Warwicker J. protein-sol pKa: prediction of electrostatic frustration, with application to coronaviruses. Bioinformatics. 2020. Epub 2020/07/20. doi: 10.1093/bioinformatics/btaa646
  • 26. Fowler NJ, Blanford CF, de Visser SP, Warwicker J. Features of reactive cysteines discovered through computation: from kinase inhibition to enrichment around protein degrons. Sci Rep. 2017;7(1):16338. doi: 10.1038/s41598-017-15997-z
  • 27. Martins de Oliveira V, Liu R, Shen J. Constant pH molecular dynamics simulations: Current status and recent applications. Curr Opin Struct Biol. 2022;77:102498. Epub 2022/11/22. doi: 10.1016/j.sbi.2022.102498
  • 28. Wei W, Hogues H, Sulea T. Comparative Performance of High-Throughput Methods for Protein pK(a) Predictions. J Chem Inf Model. 2023;63(16):5169–81. Epub 2023/08/07. doi: 10.1021/acs.jcim.3c00165
  • 29. Warwicker J. Improved pKa calculations through flexibility based sampling of a water-dominated interaction scheme. Protein Sci. 2004;13(10):2793–805. Epub 2004/09/25. doi: 10.1110/ps.04785604
  • 30. Warwicker J. The Physical Basis for pH Sensitivity in Biomolecular Structure and Function, With Application to the Spike Protein of SARS-CoV-2. Frontiers in molecular biosciences. 2022;9:834011. Epub 2022/03/08. doi: 10.3389/fmolb.2022.834011
  • 31. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47(D1):D506–D15. Epub 2018/11/06. doi: 10.1093/nar/gky1049
  • 32. Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science (New York, NY). 2015;347(6220):1260419. Epub 2015/01/24. doi: 10.1126/science.1260419
  • 33. Reva B, Antipin Y, Sander C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 2011;39(17):e118. Epub 2011/07/06. doi: 10.1093/nar/gkr407
  • 34. Boyle EI, Weng S, Gollub J, Jin H, Botstein D, Cherry JM, et al. GO::TermFinder—open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 2004;20(18):3710–5. doi: 10.1093/bioinformatics/bth456
  • 35. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–9. Epub 2010/04/01. doi: 10.1038/nmeth0410-248
  • 36. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res. 2001;11(5):863–74. Epub 2001/05/05. doi: 10.1101/gr.176601
  • 37. Cheng J, Novati G, Pan J, Bycroft C, Zemgulyte A, Applebaum T, et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science (New York, NY). 2023;381(6664):eadg7492. Epub 2023/09/21. doi: 10.1126/science.adg7492
  • 38. Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 1997;18(15):2714–23. Epub 1998/03/21. doi: 10.1002/elps.1150181505
  • 39. Goldbach N, Benna I, Wicky BIM, Croft JT, Carter L, Bera AK, et al. De novo design of monomeric helical bundles for pH-controlled membrane lysis. Protein Sci. 2023;32(11):e4769. Epub 2023/08/27. doi: 10.1002/pro.4769
  • 40. Steele CD, Pillay N, Alexandrov LB. An overview of mutational and copy number signatures in human cancer. J Pathol. 2022;257(4):454–65. Epub 2022/04/15. doi: 10.1002/path.5912
  • 41. Brash DE. UV signature mutations. Photochem Photobiol. 2015;91(1):15–26. Epub 2014/10/30. doi: 10.1111/php.12377
  • 42. Wangari-Talbot J, Chen S. Genetics of melanoma. Front Genet. 2012;3:330. Epub 2013/02/02. doi: 10.3389/fgene.2012.00330
  • 43. Stein D, Kars ME, Wu Y, Bayrak CS, Stenson PD, Cooper DN, et al. Genome-wide prediction of pathogenic gain- and loss-of-function variants from ensemble learning of a diverse feature set. Genome Med. 2023;15(1):103. Epub 2023/12/01. doi: 10.1186/s13073-023-01261-9
  • 44. Cadoux-Hudson T, Schofield CJ, McCullagh JSO. Isocitrate dehydrogenase gene variants in cancer and their clinical significance. Biochem Soc Trans. 2021;49(6):2561–72. Epub 2021/12/03. doi: 10.1042/BST20210277
  • 45. Joerger AC, Fersht AR. Structural biology of the tumor suppressor p53 and cancer-associated mutants. Adv Cancer Res. 2007;97:1–23. Epub 2007/04/11. doi: 10.1016/S0065-230X(06)97001-8
  • 46. Lallous N, Volik SV, Awrey S, Leblanc E, Tse R, Murillo J, et al. Functional analysis of androgen receptor mutations that confer anti-androgen resistance identified in circulating cell-free DNA from prostate cancer patients. Genome biology. 2016;17:10. Epub 2016/01/28. doi: 10.1186/s13059-015-0864-1
  • 47. McLeod CM, Garrett AM. Mouse models for the study of clustered protocadherins. Curr Top Dev Biol. 2022;148:115–37. Epub 2022/04/25. doi: 10.1016/bs.ctdb.2021.12.006
  • 48. Brasch J, Goodman KM, Noble AJ, Rapp M, Mannepalli S, Bahna F, et al. Visualization of clustered protocadherin neuronal self-recognition complexes. Nature. 2019;569(7755):280–3. Epub 2019/04/12. doi: 10.1038/s41586-019-1089-3
  • 49. Goodman KM, Rubinstein R, Thu CA, Mannepalli S, Bahna F, Ahlsen G, et al. gamma-Protocadherin structural diversity and functional implications. Elife. 2016;5. Epub 2016/10/27. doi: 10.7554/eLife.20930
  • 50. Mycroft-West CJ, Abdelkarim S, Duyvesteyn HME, Gandhi NS, Skidmore MA, Owens RJ, et al. Structural and mechanistic characterization of bifunctional heparan sulfate N-deacetylase-N-sulfotransferase 1. Nat Commun. 2024;15(1):1326. Epub 2024/02/14. doi: 10.1038/s41467-024-45419-4
  • 51. Moradi S, Stankovic S, O’Connor GM, Pymm P, MacLachlan BJ, Faoro C, et al. Structural plasticity of KIR2DL2 and KIR2DL3 enables altered docking geometries atop HLA-C. Nat Commun. 2021;12(1):2173. Epub 2021/04/14. doi: 10.1038/s41467-021-22359-x
  • 52. You Y, Ren Y, Liu J, Qu J. Promising Epigenetic Biomarkers Associated With Cancer-Associated-Fibroblasts for Progression of Kidney Renal Clear Cell Carcinoma. Front Genet. 2021;12:736156. Epub 2021/10/12. doi: 10.3389/fgene.2021.736156
  • 53. Xu J, Su Z, Ding Q, Shen L, Nie X, Pan X, et al. Inhibition of Proliferation by Knockdown of Transmembrane (TMEM) 168 in Glioblastoma Cells via Suppression of Wnt/beta-Catenin Pathway. Oncol Res. 2019;27(7):819–26. Epub 2019/04/04. doi: 10.3727/096504018X15478559215014
  • 54. Kellokumpu S. Golgi pH, Ion and Redox Homeostasis: How Much Do They Really Matter? Front Cell Dev Biol. 2019;7:93. Epub 2019/07/03. doi: 10.3389/fcell.2019.00093
  • 55. Xue J, Yang M, Hua LH, Wang ZP. MiRNA-191 functions as an oncogene in primary glioblastoma by directly targeting NDST1. Eur Rev Med Pharmacol Sci. 2019;23(14):6242–9. Epub 2019/08/01. doi: 10.26355/eurrev_201907_18443
  • 56. Lu CH, Wei ST, Liu JJ, Chang YJ, Lin YF, Yu CS, et al. Recognition of a Novel Gene Signature for Human Glioblastoma. Int J Mol Sci. 2022;23(8). Epub 2022/04/24. doi: 10.3390/ijms23084157
  • 57. Hernandez A, Serrano-Bueno G, Perez-Castineira JR, Serrano A. Intracellular proton pumps as targets in chemotherapy: V-ATPases and cancer. Curr Pharm Des. 2012;18(10):1383–94. Epub 2012/03/01. doi: 10.2174/138161212799504821
  • 58. Wang J, Liu T, Huang T, Shang M, Wang X. The mechanisms on evasion of anti-tumor immune responses in gastric cancer. Front Oncol. 2022;12:943806. Epub 2022/11/29. doi: 10.3389/fonc.2022.943806
  • 59. de Mare SW, Venskutonyte R, Eltschkner S, de Groot BL, Lindkvist-Petersson K. Structural Basis for Glycerol Efflux and Selectivity of Human Aquaporin 7. Structure. 2020;28(2):215–22 e3. Epub 2019/12/14. doi: 10.1016/j.str.2019.11.011
  • 60. Gotfryd K, Mosca AF, Missel JW, Truelsen SF, Wang K, Spulber M, et al. Human adipose glycerol flux is regulated by a pH gate in AQP10. Nat Commun. 2018;9(1):4749. Epub 2018/11/14. doi: 10.1038/s41467-018-07176-z
  • 61. Hansen SB, Tao X, MacKinnon R. Structural basis of PIP2 activation of the classical inward rectifier K+ channel Kir2.2. Nature. 2011;477(7365):495–8. Epub 2011/08/30. doi: 10.1038/nature10370
  • 62. Maksaev G, Brundl-Jirout M, Stary-Weinzinger A, Zangerl-Plessl EM, Lee SJ, Nichols CG. Subunit gating resulting from individual protonation events in Kir2 channels. Nat Commun. 2023;14(1):4538. Epub 2023/07/29. doi: 10.1038/s41467-023-40058-7
  • 63. Luna LA, Lesecq Z, White KA, Hoang A, Scott DA, Zagnitko O, et al. An acidic residue buried in the dimer interface of isocitrate dehydrogenase 1 (IDH1) helps regulate catalysis and pH sensitivity. Biochem J. 2020;477(16):2999–3018. Epub 2020/07/31. doi: 10.1042/BCJ20200311
  • 64. Vercoulen Y, Kondo Y, Iwig JS, Janssen AB, White KA, Amini M, et al. A Histidine pH sensor regulates activation of the Ras-specific guanine nucleotide exchange factor RasGRP1. Elife. 2017;6. Epub 2017/09/28. doi: 10.7554/eLife.29002
  • 65. Choi CH, Webb BA, Chimenti MS, Jacobson MP, Barber DL. pH sensing by FAK-His58 regulates focal adhesion remodeling. The Journal of cell biology. 2013;202(6):849–59. Epub 2013/09/18. doi: 10.1083/jcb.201302131
  • 66. Webb BA, White KA, Grillo-Hill BK, Schonichen A, Choi C, Barber DL. A Histidine Cluster in the Cytoplasmic Domain of the Na-H Exchanger NHE1 Confers pH-sensitive Phospholipid Binding and Regulates Transporter Activity. J Biol Chem. 2016;291(46):24096–104. Epub 2016/09/22. doi: 10.1074/jbc.M116.736215
  • 67. Dong Y, Gao Y, Ilie A, Kim D, Boucher A, Li B, et al. Structure and mechanism of the human NHE1-CHP1 complex. Nat Commun. 2021;12(1):3474. doi: 10.1038/s41467-021-23496-z
  • 68. Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630(8016):493–500. doi: 10.1038/s41586-024-07487-w
  • 69. Yang X, Wang D, Dong W, Song Z, Dou K. Over-expression of Na+/H+ exchanger 1 and its clinicopathologic significance in hepatocellular carcinoma. Med Oncol. 2010;27(4):1109–13. Epub 2009/10/31. doi: 10.1007/s12032-009-9343-4
  • 70. Levy ED, De S, Teichmann SA. Cellular crowding imposes global constraints on the chemistry and evolution of proteomes. Proc Natl Acad Sci U S A. 2012;109(50):20461–6. Epub 2012/11/28. doi: 10.1073/pnas.1209312109
  • 71. Tsuboyama K, Dauparas J, Chen J, Laine E, Mohseni Behbahani Y, Weinstein JJ, et al. Mega-scale experimental analysis of protein folding stability in biology and design. Nature. 2023;620(7973):434–44. Epub 2023/07/20. doi: 10.1038/s41586-023-06328-6
  • 72. Warwicker J, Charonis S, Curtis RA. Lysine and arginine content of proteins: computational analysis suggests a new tool for solubility design. Mol Pharm. 2014;11(1):294–303. doi: 10.1021/mp4004749
  • 73. Hebditch M, Carballo-Amador MA, Charonis S, Curtis R, Warwicker J. Protein-Sol: a web tool for predicting protein solubility from sequence. Bioinformatics. 2017;33(19):3098–100. doi: 10.1093/bioinformatics/btx345
  • 74. Bhandari BK, Gardner PP, Lim CS. Solubility-Weighted Index: fast and accurate prediction of protein solubility. Bioinformatics. 2020;36(18):4691–8. Epub 2020/06/20. doi: 10.1093/bioinformatics/btaa578
  • 75. Chen XQ, Shen T, Fang SJ, Sun XM, Li GY, Li YF. Protein homeostasis in aging and cancer. Front Cell Dev Biol. 2023;11:1143532. Epub 2023/03/07. doi: 10.3389/fcell.2023.1143532
  • 76. Carvalho AS, Rodriguez MS, Matthiesen R. Review and Literature Mining on Proteostasis Factors and Cancer. Methods in molecular biology (Clifton, NJ). 2016;1449:71–84. Epub 2016/09/11. doi: 10.1007/978-1-4939-3756-1_2
  • 77. Ren X, Li T, Zhang W, Yang X. Targeting Heat-Shock Protein 90 in Cancer: An Update on Combination Therapy. Cells. 2022;11(16). Epub 2022/08/27. doi: 10.3390/cells11162556
  • 78. Dai C, Sampson SB. HSF1: Guardian of Proteostasis in Cancer. Trends Cell Biol. 2016;26(1):17–28. Epub 2015/11/26. doi: 10.1016/j.tcb.2015.10.011
  • 79. Vecchi G, Sormanni P, Mannini B, Vandelli A, Tartaglia GG, Dobson CM, et al. Proteome-wide observation of the phenomenon of life on the edge of solubility. Proc Natl Acad Sci U S A. 2020;117(2):1015–20. Epub 2020/01/02. doi: 10.1073/pnas.1910444117
  • 80. Hassan MS, Shaalan AA, Dessouky MI, Abdelnaiem AE, ElHefnawi M. Evaluation of computational techniques for predicting non-synonymous single nucleotide variants pathogenicity. Genomics. 2019;111(4):869–82. Epub 2018/05/31. doi: 10.1016/j.ygeno.2018.05.013
  • 81. Montenegro LR, Lerario AM, Nishi MY, Jorge AAL, Mendonca BB. Performance of mutation pathogenicity prediction tools on missense variants associated with 46,XY differences of sex development. Clinics (Sao Paulo). 2021;76:e2052. Epub 2021/01/28. doi: 10.6061/clinics/2021/e2052
  • 82. Canbezdi C, Tarin M, Houy A, Bellanger D, Popova T, Stern MH, et al. Functional and conformational impact of cancer-associated SF3B1 mutations depends on the position and the charge of amino acid substitution. Comput Struct Biotechnol J. 2021;19:1361–70. Epub 2021/03/30. doi: 10.1016/j.csbj.2021.02.012
  • 83. Kampen KR. Membrane proteins: the key players of a cancer cell. J Membr Biol. 2011;242(2):69–74. Epub 2011/07/07. doi: 10.1007/s00232-011-9381-7
  • 84. Lin CY, Lee CH, Chuang YH, Lee JY, Chiu YY, Wu Lee YH, et al. Membrane protein-regulated networks across human cancers. Nat Commun. 2019;10(1):3131. Epub 2019/07/18. doi: 10.1038/s41467-019-10920-8
  • 85. Paul D, Stern O, Vallis Y, Dhillon J, Buchanan A, McMahon H. Cell surface protein aggregation triggers endocytosis to maintain plasma membrane proteostasis. Nat Commun. 2023;14(1):947. Epub 2023/03/01. doi: 10.1038/s41467-023-36496-y
  • 86. Tagliatti E, Cortese K. Imaging Endocytosis Dynamics in Health and Disease. Membranes (Basel). 2022;12(4). Epub 2022/04/22. doi: 10.3390/membranes12040393
  • 87. Martincorena I, Campbell PJ. Somatic mutation in cancer and normal cells. Science (New York, NY). 2015;349(6255):1483–9. Epub 2015/09/26. doi: 10.1126/science.aab4082
  • 88. Czowski BJ, Romero-Moreno R, Trull KJ, White KA. Cancer and pH Dynamics: Transcriptional Regulation, Proteostasis, and the Need for New Molecular Tools. Cancers (Basel). 2020;12(10). Epub 2020/10/01. doi: 10.3390/cancers12102760
  • 89. Kopp MRG, Grigolato F, Zurcher D, Das TK, Chou D, Wuchner K, et al. Surface-Induced Protein Aggregation and Particle Formation in Biologics: Current Understanding of Mechanisms, Detection and Mitigation Strategies. Journal of pharmaceutical sciences. 2023;112(2):377–85. Epub 2022/10/13. doi: 10.1016/j.xphs.2022.10.009
  • 90. Hipp MS, Kasturi P, Hartl FU. The proteostasis network and its decline in ageing. Nat Rev Mol Cell Biol. 2019;20(7):421–35. doi: 10.1038/s41580-019-0101-y
  • 91. Chatsirisupachai K, de Magalhaes JP. Somatic mutations in human ageing: New insights from DNA sequencing and inherited mutations. Ageing Res Rev. 2024;96:102268. doi: 10.1016/j.arr.2024.102268
  • 92. Michl J, Monterisi S, White B, Blaszczak W, Hulikova A, Abdullayeva G, et al. Acid-adapted cancer cells alkalinize their cytoplasm by degrading the acid-loading membrane transporter anion exchanger 2, SLC4A2. Cell Rep. 2023;42(6):112601. Epub 2023/06/04. doi: 10.1016/j.celrep.2023.112601

Decision Letter 0

Rajesh Kumar Pathak

27 Aug 2024

PONE-D-24-29679

Computational investigation of missense somatic mutations in cancer and potential links to pH-dependence and proteostasis

PLOS ONE

Dear Dr. Warwicker,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 11 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Rajesh Kumar Pathak, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. 

Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Thank you for stating the following financial disclosure: 

"This work was supported by UK Biotechnology and Biological Sciences Research Council grant BB/V0065921/1 to JW."

Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." 

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. Please note that funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript. 

Additional Editor Comments:

The reviewers have identified several key areas where the manuscript can be significantly strengthened, including broadening the analysis, enhancing the background information, and providing a more thorough discussion of the limitations of the computational methods used. Addressing these points will help the study make a meaningful contribution to the field.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Review Report:

The paper explores the role of somatic mutations, with a particular focus on arginine depletion and pH-sensing mutations, in cancer cell adaptation to acidic environments and their potential impact on tumor progression. The study employs advanced computational tools, including AlphaFold2 and pKa prediction methods, to analyze the structural and functional implications of these somatic mutations.

The paper highlights potential therapeutic opportunities by targeting pH-sensing mechanisms in cancer cells.

Major Comments:

1. The paper primarily focuses on histidine and arginine mutations, neglecting other potentially relevant mutations that could provide a more comprehensive understanding of cancer cell adaptation. Expanding the analysis to include other amino acid mutations that may play a role in pH-dependent processes would offer a more holistic understanding of the mechanisms involved.

2. The introduction and background sections do not provide enough context on the broader landscape of somatic mutations in cancer, limiting the reader's understanding of the study's significance. Enhancing these sections to include a more detailed discussion of the general landscape of somatic mutations in cancer will help readers better appreciate the study's relevance and significance.

3. The paper places too much emphasis on computational data without adequately discussing the limitations and potential inaccuracies of these methods. A more balanced discussion is needed, critically evaluating the limitations of the computational methods used and discussing potential inaccuracies and their impact on the study's conclusions.

Minor Comments:

1. The introduction should be expanded to provide more context on the broader landscape of somatic mutations in cancer, which is crucial for understanding the study's significance.

2. The results are presented clearly, but the discussion does not adequately address the computational methods' limitations. Additionally, the focus on histidine and arginine mutations is too narrow.

The paper provides valuable insights into the role of arginine depletion and pH-sensing mutations in cancer. However, major revisions are required to address the lack of experimental validation, narrow focus, and insufficient contextual background. With these revisions, the study could make a substantial contribution to the field.

Reviewer #2: Comments to the Authors:

The study titled "Computational Investigation of Missense Somatic Mutations in Cancer and Potential Links to pH-Dependence and Proteostasis" by Shalaw Sallah and Jim Warwicker provides a thorough analysis of somatic missense mutations, specifically focusing on the arginine-to-histidine substitution, which may contribute to pH-sensing functions. Within the frequently mutated subset, the authors identified mutations in NDST1, the HLA-C chain of the MHC I complex, and the water channel AQP-7 as potential mediators of pH-dependence in cancer cells. Furthermore, they emphasized that rebalancing the arginine-to-lysine ratio is crucial for maintaining proteostasis in peripheral cellular locations and controlling tumor development. However, I believe there are areas where further improvements could be made.

Major revisions:

1. In the methods section, the authors noted their use of COSMIC database version 97. However, the latest version, 100, has been released, including several new missense mutations. Did the authors consider analyzing these recently added mutations in the current study? If not, it would be beneficial to include this analysis.

2. In Table 1, ten mutations to histidine are listed along with mutation specificity and pKa values. However, the parameters used to analyze the stability of the protein structures are not clearly defined. Please provide the exact SASA (Solvent Accessible Surface Area) values for each mutation. Additionally, include the SIFT, PolyPhen, and CADD (Combined Annotation Dependent Depletion) scores for each mutation to enhance clarity.

3. In Section 3.7, the authors described how the lysine-to-arginine balance could be beneficial for maintaining cellular proteostasis in tumor cells. Have the authors analyzed Lysine/Arginine (Lys/Arg) mutations in previously published studies on centenarians or healthy aging? Does this Lys/Arg balance contribute to healthy aging, potentially acting oppositely to the mechanism observed in cancer cells?

Minor Revisions:

4. The authors discussed the significance of the arginine-to-histidine mutation in cancer cells, highlighting its role in gaining pH-sensing function. They also referenced previously published studies explaining the mechanism behind this mutation. However, are there any other significant functions gained by this arginine-to-histidine mutation specifically in cancer cells beyond pH-sensing? If so, please elaborate on these in the introduction.

5. In Section 3.4.2, the authors discussed the stabilization of the protein structure. Were any specific parameters used to measure this stabilization in the predictive analysis? Please add the SASA values or other relevant measures in brackets next to the sentences where stabilization is explained.

6. In Section 3.9, the authors explained that genes with subsets of mutations from arginine to lysine were analyzed using Gene Ontology (GO) pathways, and these mutations were found to be enriched in the "cell periphery" GO component category. However, the methodology for this analysis is unclear. The authors should provide more details on how the "cell periphery" pathway was identified. Was the most significant pathway selected?

7. In the conclusion, on line 623, the authors cited a paper suggesting that Arg/Lys rebalancing could be tested with wild-type and arginine-depleted membrane proteins in a cell-based assay. Please provide more experimental results supporting the role of the Arg/Lys rebalancing mechanism in maintaining proteostasis in cancer cells, along with relevant citations.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Tamil Iniyan Gunasekaran, Columbia University, United States

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: PONE-D-24-29679_review_report.pdf

pone.0314022.s002.pdf (31.7KB, pdf)
Attachment

Submitted filename: Reviewer comments.docx

pone.0314022.s003.docx (14.7KB, docx)
PLoS One. 2024 Nov 19;19(11):e0314022. doi: 10.1371/journal.pone.0314022.r002

Author response to Decision Letter 0


11 Oct 2024

PONE-D-24-29679

Computational investigation of missense somatic mutations in cancer and potential links to pH-dependence and proteostasis

PLOS ONE

Dear PLOS ONE,

We record our responses to the Reviewers’ comments. We believe that we have addressed all the comments, we hope satisfactorily. We thank the Reviewers for their time and constructive comments.

Jim Warwicker and Shalaw Sallah, Manchester, 11 Oct 2024

Reviewer #1: Review Report:

The paper explores the role of somatic mutations, with a particular focus on arginine depletion and pH-sensing mutations, in cancer cell adaptation to acidic environments and their potential impact on tumor progression. The study employs advanced computational tools, including AlphaFold2 and pKa prediction methods, to analyze the structural and functional implications of these somatic mutations.

The paper highlights potential therapeutic opportunities by targeting pH-sensing mechanisms in cancer cells.

Major Comments:

1. The paper primarily focuses on histidine and arginine mutations, neglecting other potentially relevant mutations that could provide a more comprehensive understanding of cancer cell adaptation. Expanding the analysis to include other amino acid mutations that may play a role in pH-dependent processes would offer a more holistic understanding of the mechanisms involved.

Response: After Fig 3, which presents the Arg and His calculations, results of a new set of calculations are given in S1 Fig. These capture Asp, Glu, and Lys sites of missense mutation in COSMIC and are again sampled at low (1) and higher (≥10) instances. As seen for Arg and His sites, there is very little difference in the distribution of calculated ΔpKas and SASA values between the lower and higher instance subsets. This analysis is discussed in new text at the end of Section 3.2, with the conclusion that (again) there are no clear systematic differences in the electrostatic environments of the different COSMIC instance sets, and thus no obvious link to alteration in pH-dependent properties. It is noted that smaller subsets may still be relevant with regard to adaptation to pH changes.
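As an illustration of the subset comparison described in this response, the following minimal sketch splits mutation sites into single-instance and ≥10-instance groups and summarises each group by quartiles, median and mean, after limiting ΔpKa to ±3 as in the S1 Fig caption. The record layout, the function names and all numeric values are hypothetical placeholders, not data or code from the study.

```python
from statistics import mean, quantiles

# Hypothetical records: (residue, COSMIC instance count, predicted delta-pKa).
# All values are invented for illustration only.
SITES = [
    ("D", 1, -0.4), ("D", 1, 0.2), ("D", 1, 3.8), ("D", 1, -0.1),
    ("D", 5, 1.0),  # between 1 and 10 instances: falls in neither subset
    ("D", 12, -0.3), ("D", 10, 0.1), ("D", 25, -4.2), ("D", 11, 0.4),
]

def clip(value, limit=3.0):
    """Limit a delta-pKa value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))

def split_by_instances(sites, high=10):
    """Return (single-instance values, >=high-instance values), clipped."""
    low_set = [clip(v) for _, n, v in sites if n == 1]
    high_set = [clip(v) for _, n, v in sites if n >= high]
    return low_set, high_set

def summarise(values):
    """Quartiles, median and mean: the quantities a box plot displays."""
    q1, q2, q3 = quantiles(values, n=4)
    return {"q1": q1, "median": q2, "q3": q3, "mean": mean(values)}

inst_eq1, inst_ge10 = split_by_instances(SITES)
print(summarise(inst_eq1))
print(summarise(inst_ge10))
```

Comparing the two summaries side by side is only a rough stand-in for inspecting the full distributions; a formal comparison would need a statistical test chosen for the data.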

2. The introduction and background sections do not provide enough context on the broader landscape of somatic mutations in cancer, limiting the reader's understanding of the study's significance. Enhancing these sections to include a more detailed discussion of the general landscape of somatic mutations in cancer will help readers better appreciate the study's relevance and significance.

Response: Text and references have been added to the Introduction, with a very short history of the study of mutational signatures and their links with mechanisms, including recent coupling to measured genome topological properties. The common underlying endogenous process of methylcytosine deamination, contributing to Arg depletion at the amino acid level is noted, and referenced.

3. The paper places too much emphasis on computational data without adequately discussing the limitations and potential inaccuracies of these methods. A more balanced discussion is needed, critically evaluating the limitations of the computational methods used and discussing potential inaccuracies and their impact on the study's conclusions.

Response: Asking for more discussion of the limitations involved with our computational methodology is reasonable. With regard to the calculations of solvent accessible surface area (SASA) and pKas, these allow us to assess possible sites of pH-dependence. We already note in the Introduction that both of the pKa calculation methods (PROPKA3 and pkcalc) have reports of benchmarking against experiment in the literature. To give context beyond this, we add text to section 3.5 that discusses 5 systems where biophysical analysis demonstrates a pH-dependence, which in turn may be related to cancer. These 5 mutations are omitted from our Table 1 list of potentially pH-dependent groups because they either do not have 10 instances in COSMIC, or they are already well-characterised. They are though used to test our computations. Three of the 5 are buried with large predicted pKa changes in the AlphaFold protomer models. Both of the other two are involved in functional protein-protein interactions that change the environment of the mutation sites, and are likely to mediate the pH-dependence. This demonstrates both the efficacy of our method for identifying pH-dependent sites and its limitations where these sites develop at interfaces. The advent of AlphaFold models for whole proteome modelling is itself revolutionary for the field, and it can be anticipated (as referenced in the new text) that proteome interactome analysis with vastly increased coverage (through experimental structures and modelling) will also be a great step forward, although beyond the scope of the current study.

Minor Comments:

1. The introduction should be expanded to provide more context on the broader landscape of somatic mutations in cancer, which is crucial for understanding the study's significance.

Response: See response to major point 2 from this Reviewer, outlining the additional Introduction text and references added.

2. The results are presented clearly, but the discussion does not adequately address the computational methods' limitations. Additionally, the focus on histidine and arginine mutations is too narrow.

Response: In respect of the computational methods’ limitations, please see the response to major point 3 from this Reviewer.

With regard to the focus on histidine and arginine, see response to major point 1 from this Reviewer. We have added Asp, Glu, Lys analysis, and a supplemental Figure to mirror that for Arg and His (Fig 3). Very similar results are obtained for Asp, Glu, Lys to those for Arg, His, specifically that we see no general electrostatic feature difference between sites at higher instances (≥10) in the COSMIC database, compared with those at a single instance, and therefore no indication of a general signal for pH-adaptation.

The paper provides valuable insights into the role of arginine depletion and pH-sensing mutations in cancer. However, major revisions are required to address the lack of experimental validation, narrow focus, and insufficient contextual background. With these revisions, the study could make a substantial contribution to the field.


Reviewer #2: Comments to the Authors:

The study titled "Computational Investigation of Missense Somatic Mutations in Cancer and Potential Links to pH-Dependence and Proteostasis" by Shalaw Sallah and Jim Warwicker provides a thorough analysis of somatic missense mutations, specifically focusing on the arginine-to-histidine substitution, which may contribute to pH-sensing functions. Within the frequently mutated subset, the authors identified mutations in NDST1, the HLA-C chain of the MHC I complex, and the water channel AQP-7 as potential mediators of pH-dependence in cancer cells. Furthermore, they emphasized that rebalancing the arginine-to-lysine ratio is crucial for maintaining proteostasis in peripheral cellular locations and controlling tumor development. However, I believe there are areas where further improvements could be made.

Major revisions:

1. In the methods section, the authors noted their use of COSMIC database version 97. However, the latest version, 100, has been released, including several new missense mutations. Did the authors consider analyzing these recently added mutations in the current study? If not, it would be beneficial to include this analysis.

Response: This is a reasonable observation, and applies to bioinformatics analysis in many areas that rely on developing genomics data resources. For a specific response to the Reviewer’s comments we have taken the data used in Table 1, and analysed how these data change for the current (October 2024) COSMIC set, as compared with that for our analysis throughout the work (February 2023 COSMIC dataset). We find, as expected, that there are increases in number of recorded instances for some, but not all, of the 10 mutations listed in Table 1. For 6 mutations at buried sites and with 9 instances in the earlier COSMIC dataset, just one is now at 10 instances (the threshold for Table 1), the remaining 5 staying at 9. It is concluded that analysis will depend in detail on the underlying dataset, but that there are not substantial differences between analyses on datasets compiled 20 months apart. The new text is at the end of section 3.3.
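The release-to-release check described in this response can be sketched as follows. The counts mirror the text (six buried sites at 9 instances in the earlier dataset, one of which reaches 10 in the later one), but the site identifiers are placeholders, the choice of which site changes is arbitrary, and the function name is hypothetical.

```python
THRESHOLD = 10  # instance count needed for inclusion in Table 1

# Placeholder identifiers; counts follow the response text.
earlier = {f"site_{i}": 9 for i in range(1, 7)}
# Which of the six sites rises to 10 is an arbitrary choice for illustration.
later = dict(earlier, site_1=10)

def newly_over_threshold(old, new, threshold=THRESHOLD):
    """Sites below the threshold in `old` that reach it in `new`."""
    return sorted(
        site for site, count in new.items()
        if count >= threshold and old.get(site, 0) < threshold
    )

promoted = newly_over_threshold(earlier, later)
unchanged = sorted(site for site in earlier if later[site] == earlier[site])
print(promoted, len(unchanged))
```

A fixed threshold makes membership of a list such as Table 1 sensitive to single additional instances, which is why the result depends in detail on the dataset release.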

2. In Table 1, ten mutations to histidine are listed along with mutation specificity and pKa values. However, the parameters used to analyze the stability of the protein structures are not clearly defined. Please provide the exact SASA (Solvent Accessible Surface Area) values for each mutation. Additionally, include the SIFT, PolyPhen, and CADD (Combined Annotation Dependent Depletion) scores for each mutation to enhance clarity.

Response: We have added SASA values to Table 1 (for the wild type residue), along with PolyPhen-2 and SIFT predictions of mutation effect on protein function. Rather than CADD we have also added predictions using the recent AlphaMissense method for predicting mutation effects, which are targeted at amino acid changes, and take into account sequence as well as the AlphaFold coverage in modelling structure. References for these are now included in the Methods section, with information on the thresholds used in each prediction scheme included in the Table 1 footnotes. Further additional text in sections 3.3 and 3.4 integrates discussion of the added data in Table 1.

3. In Section 3.7, the authors described how the lysine-to-arginine balance could be beneficial for maintaining cellular proteostasis in tumor cells. Have the authors analyzed Lysine/Arginine (Lys/Arg) mutations in previously published studies on centenarians or healthy aging? Does this Lys/Arg balance contribute to healthy aging, potentially acting oppositely to the mechanism observed in cancer cells?

Response: This is an interesting suggestion that we have researched. We did not find clear-cut data on the role of Arg depletion, or more generally that of somatic mutations, in ageing. We have though included a note towards the end of the Conclusions section that briefly discusses this potentially interesting link.

Minor Revisions:

4. The authors discussed the significance of the arginine-to-histidine mutation in cancer cells, highlighting its role in gaining pH-sensing function. They also referenced previously published studies explaining the mechanism behind this mutation. However, are there any other significant functions gained by this arginine-to-histidine mutation specifically in cancer cells beyond pH-sensing? If so, please elaborate on these in the introduction.

Response: We have researched the literature in this area and can find no other directly appropriate material. It is noted that we have expanded discussion generally of mutational signatures in the Introduction, in response to Reviewer 1 comments. In addition, just after that expanded discussion, we have added a further reference with respect to a potential role for Arg to Cys mutations in alleviating stress from reactive oxygen species.

5. In Section 3.4.2, the authors discussed the stabilization of the protein structure. Were any specific parameters used to measure this stabilization in the predictive analysis? Please add the SASA values or other relevant measures in brackets next to the sentences where stabilization is explained.

Response: We have added text in this section to clarify that we are discussing the electrostatic stabilisation that is predicted to occur for the D268H mutation of PCDHGB4 (in the absence of calcium binding at the site), and that this stabilisation is assessed from the predicted pKa for D268H. Calculated SASA changes little between wild type D268 (16.2 Å²) and mutated 268H (17.6 Å²).

6. In Section 3.9, the authors explained that genes with subsets of mutations from arginine to lysine were analyzed using Gene Ontology (GO) pathways, and these mutations were found to be enriched in the "cell periphery" GO component category. However, the methodology for this analysis is unclear. The authors should provide more details on how the "cell periphery" pathway was identified. Was the most significant pathway selected?

Response: We have added text to section 4.9 specifying that the GO component classification was used in the Princeton GO Finder tool, to identify the most significant categories. Then cell periphery was returned as the most significant (10 of 14 from Arg and to Lys categories in Fig 9), or close to the most significant (the remaining 4), as now stated in section 4.9.

7. In the conclusion, on line 623, the authors cited a paper suggesting that Arg/Lys rebalancing could be tested with wild-type and arginine-depleted membrane proteins in a cell-based assay. Please provide more experimental results supporting the role of the Arg/Lys rebalancing mechanism in maintaining proteostasis in cancer cells, along with relevant citations.

Response: We have provided some more text on the study the Reviewer mentions, in the Conclusions section. That study looked at proteostasis at the cell surface in response to membrane protein cross-linking with antibodies. We suggest that similar methodology could be employed to look at proteostasis effects when Arg / Lys balance is altered for cell surface proteins. We take the point that we have no clear-cut experimental validation of our hypothesised link between Arg / Lys balance and the behaviour of cell surface proteins in cancer. What we do have is a background experimental literature (referenced in the manuscript) on the different properties of Arg and Lys in respect of protein solubility, alongside our bioinformatics observations for cancer mutations. We hope that this report will stimulate experimental work in the area.

We thank both Reviewers for their helpful comments, and hope that our revisions are a suitable response to their suggestions.

Attachment

Submitted filename: ph-response-to-reviewers.docx

pone.0314022.s004.docx (20.9KB, docx)

Decision Letter 1

Rajesh Kumar Pathak

5 Nov 2024

Computational investigation of missense somatic mutations in cancer and potential links to pH-dependence and proteostasis

PONE-D-24-29679R1

Dear Dr. Warwicker,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Rajesh Kumar Pathak, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

The manuscript can be accepted for publication.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have addressed the major concerns in a comprehensive manner, expanding their analysis to include additional amino acid mutations such as Asp, Glu, and Lys. This additional analysis is well integrated into the manuscript, offering a more complete understanding of pH-dependent processes in cancer. They have also provided new background information on mutational signatures, including references to recent studies, enhancing the contextual understanding of somatic mutations in cancer. Furthermore, the limitations of the computational methods used in the study are now better discussed, and the addition of experimental validations or links to known biophysical systems offers a more balanced perspective. The revised manuscript now provides a clearer and more robust contribution to understanding the role of pH-dependence and somatic mutations in cancer, and I recommend the manuscript for acceptance.

Reviewer #2: Dear Authors,

I hope this message finds you well.

I have reviewed the revised version of your manuscript titled “Computational investigation of missense somatic mutations in cancer and potential links to pH-dependence and proteostasis”. I am pleased to confirm that you have successfully addressed all the revisions and comments I provided in my last review. The changes have significantly improved the quality and clarity of the manuscript.

I have no further comments or suggestions. Thank you for your efforts in thoroughly revising the manuscript.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

Acceptance letter

Rajesh Kumar Pathak

8 Nov 2024

PONE-D-24-29679R1

PLOS ONE

Dear Dr. Warwicker,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Rajesh Kumar Pathak

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Structure-based SASA and ΔpKa calculations for mutated Asp, Glu, and Lys.

    The box and whisker plots show quartiles, median (central line), and mean (cross). (A) Distributions of predicted Asp (D), Glu (E), and Lys (K) ΔpKas (pkcalc) are shown for instEQ1 and instGE10 mutation data. Limiting thresholds of +/- 3 are applied for ΔpKa values. (B) The ΔpKa data for mutations in panel (A) are replaced with SASA, for the same subsets.

    (PDF)

    pone.0314022.s001.pdf (105.7KB, pdf)

    Data Availability Statement

    All relevant data are within the manuscript.


    Articles from PLOS ONE are provided here courtesy of PLOS
