Abstract
The crystal structure of phycocyanin (pr-PC) isolated from Phormidium rubidum (P. rubidum) is described at a resolution of 1.17 Å. Electron density maps derived from the crystallographic data showed many clear differences in amino acid sequence when compared with the previously obtained gene-derived sequences. The differences were found in 57 positions (30 in the α-subunit and 27 in the β-subunit of pr-PC); of these residues, only one (β145Arg) interacts with any of the three phycocyanobilin chromophores. Highly purified pr-PC was then sequenced by mass spectrometry (MS) using LC-MS/MS. The MS data were analyzed using two independent proteomic search engines. As a result of this analysis, complete agreement between the polypeptide sequences and the electron density maps was obtained. We attribute the differences to the presence of multiple genes in the bacterium encoding the phycocyanin apoproteins, with the gene sequencing having targeted genes other than those encoding the crystallized protein. We do not imply that protein sequencing by mass spectrometry is more accurate than gene sequencing. The final 1.17 Å structure of pr-PC allows the chromophore interactions with the protein to be described with high accuracy.
Keywords: Phycocyanin, Phycocyanobilin, Photosynthesis, Light harvesting, Cyanobacteria, Atomic resolution crystal structure
INTRODUCTION
Phycobiliproteins are light-harvesting proteins that absorb light energy in the spectral range of 500–660 nm. They are either bound to the outer surface of thylakoid membranes, forming a protein complex called a phycobilisome in cyanobacteria and red algae, or located in the thylakoid lumen in cryptophyte algae (Watanabe and Ikeuchi 2013; Spear-Bernstein and Miller 1989). Phycocyanin (PC) is a red-light (600–640 nm) absorbing phycobiliprotein, which consists of two polypeptides, an α-subunit (~162 residues) and a β-subunit (~172 residues), each of which contains covalently bound linear tetrapyrrole phycocyanobilin (PCB) chromophores (Scheer and Zhao 2008; Singh et al. 2015). Of a total of ~334 residues in PC, almost 30% are strictly conserved among cyanobacterial species, and the remaining 70% can be divided into two types, partially conserved and non-conserved residues (Supplementary Material I). Structural studies of PCs have shown that the strictly conserved residues are crucial for their structure and function; for example, the conserved residues α84Cys, β82Cys, and β153Cys provide covalent binding sites for the chromophores (Adir et al. 2002; Contreras-Martel et al. 2007; David et al. 2011). Among the partially conserved residues, many contribute to the tight binding sites for these pigment molecules and control their conformations and, thus, their spectroscopic properties. Species-to-species variations in these partially conserved regions result in minor changes in the spectral properties of the chromophores. To understand the effects of varying PC sequences on the resulting structures and spectroscopic properties, high-resolution structures are required.
In our previous study, the structure of PC isolated from Phormidium rubidum (termed pr-PC) was described at 2.7 Å resolution (PDB ID: 4YJJ, Gupta et al. 2016). During refinement of this model, several ambiguities between the gene-derived PC sequences and the experimental electron density (ED) maps were noticed, but they were left unaddressed owing to the low resolution of the data and the assumption that the sequences were correct. In the present paper, the crystal structure of pr-PC is revisited using superior-quality crystals, which diffracted to a resolution of 1.17 Å. The new data clearly show that the previously observed ambiguities are not artifacts of the crystallographic analysis. We therefore sequenced the genes of this PC again but found no changes, making it clear that the sequences observed in the crystal structure differ from the gene-derived ones in approximately 50 positions. Amino acids in most of these positions could be assigned based on the greatly improved electron density maps, whereas the remaining uncertainties were resolved by mass spectrometry sequencing of the PC proteins used for crystallization. We describe here how these combined efforts have allowed us both to correct the sequence information and to describe the overall structure of pr-PC at an atomic resolution of 1.17 Å.
MATERIALS AND METHODS
Isolation and purification of pr-PC
The pr-PC was purified from Phormidium sp. A09DM as described in Gupta et al. (2016).
Gene-based sequencing of pr-PC
The amino acid sequences of the α- and β-subunits of pr-PC were originally deduced from the cpcA and cpcB gene sequences as described earlier (Gupta et al., 2016).
Crystallographic analysis
Crystallization and data collection
Crystallization trials of the pure pr-PC (10 mg mL−1 in 20 mM Tris-HCl pH 8.0) were set up using a range of commercially available pre-formulated screens, PACT premier, JCSG+, Morpheus, MIDAS and Structure, procured from Molecular Dimensions. These trials were set up by the sitting-drop vapor diffusion method in 96-well, 2-drop MRC crystallization plates (Molecular Dimensions) using a crystallization robot, Cartesian Honeybee 8+1 (Genomic Solutions Ltd). Initially, crystals were obtained under several screening conditions. Suitably sized crystals were tested on our in-house X-ray system (Rigaku MicroMax-007 generator and mar345dtb detector). The best crystals were obtained from Morpheus screen condition E4 [0.12 M ethylene glycol, 0.1 M buffer system 1 (pH 6.5) and 50% v/v Precipitant Mix 4] (details in Supplementary Material II). This condition was further optimized using larger-volume 24-well plates (Cryschem, NBS Biologicals), which yielded superior-quality crystals of 300 × 100 × 100 microns in 20 days that diffracted in-house to 1.6 Å. The best PC crystals were flash cooled in nitrogen gas at 100 K and stored in liquid nitrogen before shipment to the synchrotron. The Morpheus crystallization condition E4 contained sufficient amounts of cryoprotectants like ethylene glycols, MPD and PEG-1K, so no extra cryoprotection was needed. The final data collection was carried out at 100 K at beamline I03 of Diamond Light Source (DLS) near Oxford, UK, using a Pilatus 6M-F detector (Dectris AG).
Data processing and building of 3-D structure
Two independent 1.17 Å data sets collected from a single crystal were scaled and averaged (data collection statistics are in Table 1). The data were indexed in the P63 space group, integrated, scaled, and evaluated by using programs XDS (Kabsch 2010), POINTLESS (Evans 2006), AIMLESS (Evans and Murshudov 2013), Xia2 (Winter 2010) and the CCP4 suite (Winn et al. 2011). Initial phases were obtained by using the molecular replacement (MR) program PHASER (McCoy et al. 2007) with a single αβ-monomer from the low-resolution PC structure, PDB ID: 4YJJ (Gupta et al. 2016), as a search model. The MR maps showed clearly that in several positions, the protein sequences in the model were incorrect. As described below, the pr-PC protein used in crystallization was resequenced by mass spectrometry, and the sequences of the α- and β-subunits were corrected so that they became consistent with the features of the 1.17 Å electron density (ED) maps. The new model, with the correct amino acid sequences, was then refined by using REFMAC5 (Murshudov et al. 2011) and manual modeling in COOT (Emsley et al. 2010) until reasonable stereochemistry and R-factors were achieved. Final cycles of refinement were performed by using anisotropic atomic B factors. The structure factors and refined coordinates were submitted to the Protein Data Bank (PDB) with ID 6XWK. Non-classical interactions present in the pr-PC crystal structure, namely low-energy CH-π interactions (weaker than H-bonds, ΔG = −0.17 to −2.01 kcal mol−1) (Nishio 2004; Chakrabarti and Bhattacharyya 2007) and cation-π interactions (ΔG in the same range as H-bonds and salt bridges) (Gallivan and Dougherty 1999), were analyzed with the Arpeggio web-server (Jubb et al. 2017; http://structure.bioc.cam.ac.uk/arpeggio). The figures were prepared using PyMOL (Version 2.0, Schrödinger, LLC) and CCP4MG (McNicholas et al. 2011).
Table 1:
Data collection, processing and refinement statistics for the PC complex of Phormidium rubidum
| Protein Data Bank accession code | 6XWK |
|---|---|
| Space group | P63 |
| Unit cell dimensions a, b, c (Å) | 106.32, 106.32, 58.67 |
| Unit cell angles α, β, γ (°) | 90.00, 90.00, 120.00 |
| Data collected at beamline | I03 at Diamond Light Source, UK |
| Detector used | Pilatus 6M-F |
| Wavelength (Å) | 0.84920 |
| Resolution range (outer shell) (Å) | 46.08–1.17 (1.20–1.17)a |
| Unique reflections | 127070 |
| Redundancy | 14.0 (8.0)a |
| Completeness (%) | 100.0 (99.97)a |
| Rmergeb | 0.062 (1.28)a |
| Rpimc | 0.024 (0.69)a |
| Mean I/σ | 21.3 (1.5)a |
| Half-set correlation coefficientd CC1/2 | 1.0 (0.52)a |
| Refinement Rwork/Rfree factorse (%) | 11.88/14.50 |
| Ramachandran plot features (%) Favored/Allowed/Outliers | 99.0/1.0/0.0 |
| Rms dev. bond lengths/angles (Å/°) | 0.012/1.747 |
| Coordinate errorf | 0.029/0.027 |
| No. of non-H atoms used in refinement | 3363 (3137)g |
| No. of water molecules | 463 |
| Mean atomic/Wilson plot B factors (Å2) | 15.3/13.2 |
a Values in parentheses are for the highest-resolution outer shell
d The experimental unmerged data are divided into two parts, each containing a random half of the measurements of each unique reflection. The correlation coefficient CC1/2 is then calculated between the average intensities of each subset (Karplus and Diederichs, 2012)
e Rwork and Rfree = Σ||Fo| − |Fc|| / Σ|Fo|; Rwork was calculated for all data except for the 5% that were used for the Rfree calculations
f Estimated standard uncertainty; first value calculated using the method of Cruickshank (Cruickshank 1999), second one based on maximum likelihood as implemented in REFMAC (Murshudov et al. 2011)
g Real number of atoms in protein excluding the atoms of alternative conformations
Protein sequencing by LC-MS/MS
Protein Sample Digestion and LC-MS/MS
Pure PC was precipitated by adding a 4X volume of acetone, the resulting protein pellets were dissolved in 8 M urea (30 μL), and the disulfide bonds were reduced using TCEP, tris(2-carboxyethyl)phosphine (C4706, Sigma, St. Louis, MO), at 5 mM total concentration for 30 min. The free thiols were then alkylated with freshly prepared iodoacetamide (I1149, Sigma), added to give 10 mM final concentration, for 30 min with shaking in the dark. Trypsin/Lys-C Mix (V5071, Promega, Madison, WI) was added to give a final concentration of 0.02 μg/μL, and enzymatic digestion took place at 37 °C with shaking overnight. Finally, the digestion was quenched by adding 5 μL of 1% aqueous formic acid to afford a total concentration of 0.1%.
Aliquots (5 μL, ~100 pmoles) of the peptide samples were separated on-line by using a Dionex UltiMate 3000 RSLCnano pump and autosampler (Thermo Fisher Scientific, Waltham, MA, USA) and a custom-packed column containing ProntoSIL C18AQ, 3 μm particle size, 120 Å pore size (Bischoff, Stuttgart, Germany), in a 75 μm × 15 cm capillary. The mobile phase consisted of A: 0.1% formic acid in water, and B: 0.1% formic acid in 80% acetonitrile/20% water (Thermo Fisher Scientific, Waltham, MA, USA). At a flow rate of 500 nL/min, the gradient was held for 5 min at 2% B and slowly ramped to 17% B over the next 30 min, increasing to 47% B over the next 30 min and then finally increasing to 90% B over 30 min and held at 90% for 10 min. The column was then allowed to re-equilibrate for 60 min with a flow of 2% B in preparation for the next injection.
The separated peptides were analyzed on-line by using a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) operated in standard data-dependent acquisition mode controlled by Xcalibur version 4.0.27.19. Precursor-ion activation was set with an isolation width of m/z 1.0 and with two collision energies toggled between 25 and 30%. The mass resolving power was 70 K for precursor ions and 17.5 K for the product ions (MS2).
LC-MS/MS data analysis
The raw data were analyzed using PEAKS Studio X (version 10.0, Bioinformatics Solution Inc., Waterloo, ON, Canada, www.bioinfor.com) and Protein Metrics Byonic and Byologic (Protein Metrics Inc., Cupertino, CA, www.proteinmetrics.com). The data were searched against a custom database that included decoys, and positive IDs were manually verified. PEAKS was used in the de novo mode followed by DB, PTM, and SPIDER modes. Search parameters included a precursor-ion mass tolerance of 10.0 ppm and a fragment-ion mass tolerance of 0.02 Da. Variable modifications included all built-in modifications as well as phycobiliprotein-specific PTMs, and the pigments selected were phycocyanobilin and phycoerythrin. A maximum of three modifications per peptide and two missed cleavages were allowed, and the false discovery rate was set to 0.1%. The SPIDER function was used to identify unknown spectra by considering homology searches, sequence errors, and residue substitutions to yield a more confident identification.
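To make the two kinds of tolerance quoted above concrete, the following minimal Python sketch converts between a relative (ppm) and an absolute (Da) mass window. The function names and the example m/z values are illustrative assumptions and are not taken from PEAKS or Byonic; the search engines may apply their tolerances differently in detail.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed m/z, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed m/z lies inside a symmetric ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def ppm_to_da(mz, tol_ppm):
    """Half-width (in m/z units) of a ppm window at a given m/z."""
    return mz * tol_ppm * 1e-6


# Hypothetical values: a 10 ppm precursor window at m/z 800
# is 0.008 m/z units wide on either side.
half_width = ppm_to_da(800.0, 10.0)
```

Because a ppm window scales with m/z whereas a fixed 0.02 Da window does not, the two settings are not interchangeable across the mass range.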
Byonic searches employed the same database but used a precursor ion mass tolerance of 20 ppm and a fragment ion mass tolerance of 60 ppm with a maximum of two missed cleavages. Wildcard searches of ± 200 Da were employed to look for modifications in addition to regular PTM analysis. Protein false discovery rate threshold was determined by the score of the highest ranked decoy protein identified. All the search results were combined in Byologic for validation. Some predicted peptide fragments derived from the originally proposed sequence or the raw X-ray crystallography data were not identified with this strategy. For these, a peptide sequence library was constructed by using Python, and this new database containing variable amino acids at conflicting sites was used for a new search. This strategy allowed all the remaining residues to be identified. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD017229 and 10.6019/PXD017229.
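The authors' script for building the peptide sequence library is not given in the text. As a rough sketch of the idea only, the Python below enumerates every combination of candidate residues at a set of conflicting sites and writes the result in FASTA form. The function names, the FASTA header prefix, and the second variable site are hypothetical; only the Leu/Met alternative at position 5 of the N-terminal α-subunit peptide reflects a conflict described in this paper.

```python
from itertools import product


def expand_variants(template, alternatives):
    """Return all sequences obtained by substituting each listed
    0-based position with each of its candidate residues."""
    positions = sorted(alternatives)
    variants = []
    for choice in product(*(alternatives[p] for p in positions)):
        seq = list(template)
        for pos, aa in zip(positions, choice):
            seq[pos] = aa
        variants.append("".join(seq))
    return variants


def to_fasta(variants, prefix="variant"):
    """Format the variants as FASTA records for a database search."""
    return "\n".join(f">{prefix}_{i}\n{s}" for i, s in enumerate(variants, 1))


# Index 4 (residue 5): Leu (gene-derived) versus Met (ED/MS-derived).
# Index 1: a purely illustrative second site, not taken from the paper.
library = expand_variants("MKTPMTEAVAAADSQGR", {4: "LM", 1: "KQ"})
fasta_text = to_fasta(library)
```

The number of entries grows as the product of the number of candidates per site, so in practice such a library would probably be built per peptide rather than for the whole subunit at once.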
RESULTS AND DISCUSSION
Crystallographic analysis
The best crystal of pr-PC diffracted X-rays to a resolution of 1.17 Å (the crystal and diffraction data and the structure refinement statistics are shown in Table 1). The final refinement of the pr-PC structure was carried out by using anisotropic atomic B factors and reached final Rwork/Rfree values of 11.88/14.50%. The crystal asymmetric unit is composed of one α-subunit with one covalently bound PCB chromophore (PDB ligand code CYC), and one β-subunit with two covalently bound PCB chromophores. It was apparent at the MR stage that the gene-derived protein sequences described in 4YJJ-PC (Gupta et al. 2016) did not fit the 1.17 Å electron density (ED) maps well in a significant number of places. Residues in some of these positions could be replaced, with high confidence, by new residues, which satisfied the ED maps very well. Subsequent refinement of the structure confirmed these choices (Figure 1 illustrates examples of this in both subunits). The high quality of the ED maps in the present case clearly shows a mismatch between the previously gene-derived sequences and these ED maps. In other poorly fitting positions, it was not possible to select the correct replacement amino acid residues unambiguously solely on the basis of the ED maps. To obtain fully consistent amino acid sequences for the α- and β-subunits that would better reflect the high-resolution ED maps, information from a mass spectrometry (MS) approach was used. In addition, a positive difference peak near β72Asn indicated that this residue is methylated (as was also noticed in other PC structures (Klotz et al. 1986)), and thus the structure incorporated γ-N-methyl asparagine (MeN).
Fig. 1:

Representative cases of discrepancies between the electron density (ED) maps and the gene-derived polypeptide sequences in positions α5 (A), α146 (B), α158 (C), α35 (D), α42 (E), α76 (F), β19 (G), β38 (H), β118 (I) and β139 (J). Fitting of the newly assigned residues (gold color) in the ED-maps is shown as compared to previously gene-derived assignments (green color).
Structural data and mass spectrometry aided sequencing
Figure 2 compares the previously gene-derived sequences of the α- and β-subunits with those obtained by MS. Interestingly, obtaining the correct sequences required combining the raw MS data with the information from the ED maps, which provided potential candidate amino acid residues. Several rounds of integration and feedback between these two sources of information were required to obtain a fully consistent picture. In total, the combination of data from the MS and ED maps corrected residues in 57 positions (30 in α- and 27 in β-subunits). All of the MS-derived sequences and the raw MS data can be accessed from the ProteomeXchange Consortium website (ebi.ac.uk/pride/archive) with the dataset identifier PXD017229 and 10.6019/PXD017229.
Fig. 2:

Alignment of α-subunit (A) and β-subunit (B) sequences of pr-PC, derived from the gene sequencing (first line) and assigned based on ED maps and/or the MS analysis (second line). Yellow shade: residues assigned based on both ED maps and MS; magenta shade: residues assigned based on only ED-maps; green shade: residues assigned based solely on MS results.
Out of the original 32 mismatches in the α-subunit, 23 are clearly confirmed based on combined inference from the ED maps and the MS analysis (Figure 2). For example, the α5Leu does not fit in relevant ED maps (Figure 1A). The ED clearly suggested that the correct residue in this position must have four side chain atoms with a 3rd atom larger in size than the others; this residue was tentatively assigned to be Met. The presence of Met at this position was unequivocally confirmed by MS analysis (Supplementary Material III, Fig. IIIA). First of all, the precursor ion of the peptide (MKTPMTEAVAAADSQGR) containing Met5 was identified to high accuracy with −1.3 ppm mass error. Second, the product-ion coverage is 59% for y and 88% for b. Specifically, the difference in mass of product ions y13 and y12, and of b5 and b4 are 131.0405 and 131.0404, respectively, consistent with the presence of a Met (monoisotopic residue mass of Met is 131.0405) at position 5 (the second M) of the peptide.
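The reasoning in the paragraph above (the mass difference between consecutive b- or y-ions equals the residue mass of the amino acid that separates them) can be sketched in a few lines of Python. The residue masses are standard monoisotopic values; the function name and the default 0.02 Da tolerance (borrowed from the fragment-ion setting quoted in the Methods) are assumptions made for illustration, not a description of how the search engines score spectra.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_delta(delta_mass, tol_da=0.02):
    """Residues whose mass matches the difference between two
    consecutive fragment ions of the same series, within tol_da."""
    return sorted(
        aa for aa, mass in RESIDUE_MASS.items()
        if abs(mass - delta_mass) <= tol_da
    )


# Differences reported in the text for y13 - y12 and b5 - b4.
met_from_y = residues_for_delta(131.0405)
met_from_b = residues_for_delta(131.0404)
# Leu and Ile are isomeric, so a mass difference alone cannot separate them.
leu_or_ile = residues_for_delta(113.0841)
```

Both reported differences map to Met alone, whereas a difference of about 113.084 Da returns both Ile and Leu, which is why such positions need the electron density to be resolved.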
At position α146, the Pro assigned by previous gene sequencing had no appropriate electron density, clashed with the main chain carbonyl group, and was corrected to Ala, based on the ED maps (Figure 1B) and the MS analysis (Supplementary Material III, Fig. IIIB). Again, in Fig. IIIB the mass difference between product ions b9 and b8 strongly indicates Ala at 146, consistent with the ED maps.
Likewise, the assigned Ile at position α158 was found to be incorrect owing to a lack of sufficient electron density for the fourth side chain carbon (CD) atom and was corrected to Val (Figure 1C). This new assignment was also supported by the MS results (Supplementary Material III, Fig. IIIB). Note that the mass difference between product ion y5, which contains Val, and y4, which does not, is consistent with a Val.
Similarly, a Gly at site α35 is not enough to fill the electron density, which suggests the presence of a residue with a five-atom side chain (Figure 1D). Based on the shape of the electron density, it could be assigned as either Glu or Gln. This choice was finally resolved as Gln by the MS analysis, which gave an accurate mass for the intact peptide (Supplementary Material III, Fig. IIIC).
In 9 of these 32 problematic positions, the correct sequence assignment could not be clarified completely through the MS analysis (Figure 2). Considering the clear features of the ED maps in these positions and the surrounding environment in 3-dimensional space, 7 of these 9 positions were assigned solely from the ED maps. For example, the residue at α42 may be Lys or Gln, as both are consistent with the MS data. However, since the ED features at and surrounding position α42 clearly suggest that this residue should have five side chain atoms with the end atoms available to make H-bonds with the neighboring residue β21Asn and water molecule 260, α42 was assigned to be Gln (Figure 1E).
Similarly, the ED at position α76 suggested that the residue should have a hydrophilic, two-atom side chain with an end atom available for H-bonding with the main chain carbonyl of α69Met and water molecule 135 (Figure 1F). This residue was, therefore, assigned as Ser, consistent with the ED map. At the two remaining positions, α46 and α70, the ED maps are not conclusive and, thus, these two residues were assigned as Glu and Gln, respectively, solely based on the MS data (Figure 2).
In the β-subunit, 26 out of 27 changes were clearly confirmed by both the ED maps and the MS analysis. For example, Figures 1G–1I clearly show that the electron density at positions β19, β38 and β118 is better fit by Val, Ile and Val instead of Leu, Met and Ile, respectively. These changes were also confirmed by the MS data. At the remaining position β139 (Leu or Ile), the ED features were utilized to make the assignment because the MS analysis cannot distinguish between isomeric Leu and Ile. This final position was, therefore, assigned to be Ile based on the clear features of the ED map (Figure 1J). All of these changes in the α- and β-subunits also agree with the electron density maps of the previous low-resolution structure of pr-PC (PDB ID: 4YJJ; Gupta et al. 2016).
The atomic resolution structure of Phormidium PC
The overall pattern of polypeptide folding of pr-PC is similar to those of other known PCs. The Cα carbons of pr-PC superimpose, for example, with those of PCs from Thermosynechococcus elongatus (PDB ID: 3L0F) and Acaryochloris marina (PDB ID: 5OOK, Bar-Zvi et al. 2018) with RMSDs of 1.22 and 0.81 Å, respectively. The α- and β-subunits interact through their N-terminal helices to form a stable αβ-heterodimer (often called a ‘monomer’ in the literature) with a buried interaction surface area of ~6660 Å2 and a gain of −71.0 kcal mol−1 of solvation free energy, as calculated with the PISA server (Krissinel and Henrick 2007) (Figure 3A). Several strong H-bonds between α- and β-subunit residues, α42Gln - β21Asn (2.83 Å), α3Thr - β3Asp (2.74 Å), α1Met - β1Met (2.70 Å), α17Arg - β95Tyr (2.65 Å), α13Asp - β91Arg (2.95 Å), α13Asp - β108Arg (2.93 Å), α97Tyr - β17Ala (2.60 Å) and α93Arg - β13Asp (2.83 Å), are involved in the formation of this stable heterodimer. Three such αβ-heterodimers interact in a head-to-tail manner to form a trimeric ring (αβ)3 that has a total buried interaction surface area of ~24870 Å2 and a gain of −249.6 kcal mol−1 of solvation free energy (Figure 3B). Unlike some other known PC crystal structures, pr-PC does not assemble into [(αβ)3]2 hexamers (face-to-face sandwich-like arrangements of two (αβ)3 trimers) in the crystal. Each α-subunit contains one PCB chromophore, α PCB 84 (molecule termed CYC A201 in the PDB coordinates file), and each β-subunit contains two of them, β PCB 82 and β PCB 153 (CYC B201 and CYC B202 in the PDB coordinates file, respectively), attached to their α84Cys, β82Cys and β153Cys residues, respectively.
Fig. 3:

Cartoon representation of the αβ-heterodimer (A) and the (αβ)3-trimer (B) of the pr-PC with the associated chromophores. Cyan and green colors represent α- and β-subunits, respectively. (C) Fit of the α PCB 84 chromophore of pr-PC in the 2Fo-Fc electron density map (drawn at a 2.5 σ contour level) illustrating the quality of this map and the scheme of naming of the four pyrrole rings.
Chromophore-protein interactions
The nomenclature of the four pyrrole rings of the phycocyanobilin (PCB) molecule used here follows Peng et al. (2014) (i.e., the pyrrole ring attached to the Cys residue is named the A-ring and the subsequent rings are named B, C and D (Figure 3C)). The structural details of each of the three chromophore binding pockets are shown in Figure 4 and in Table 2.
Fig. 4:


Details of the binding pockets of chromophores α PCB 84 (A), β PCB 82 (B) and β PCB 153 (C) in the pr-PC protein matrix. Yellow-dashed lines indicate H-bonds. (D) A view of the three chromophores, α PCB 84 (bright green), β PCB 82 (pale green) and β PCB 153 (tan), superimposed on each other by matching the planes of approximately coplanar B and C rings.
Table 2:
Details of chromophores interactions with surrounding apoprotein in P. rubidum phycocyanin structure
| Chromophore | Type of Interaction | CYC atom | Bonded with (Atom, Residue, Chain) | Distance* (Å) |
|---|---|---|---|---|
| α PCB 84 (molecule CYC A201 in the PDB coordinate file) | Covalent | Ring A, CAC | SG, Cys84, A | 1.80 |
| | H-bond | Ring A, NC | O, Asn73, A | 2.89 |
| | | Ring A, OC | N, Ala75, A | 3.06 |
| | | Ring B, ND | OD2, Asp87, A | 2.95 |
| | | Ring B, O1D | O, Phe122, A (water mediated) | 2.61, 2.70 |
| | | Ring B, O2D | O2C, α PCB 84, A (water mediated) | 2.84, 2.69 |
| | | Ring C, NA | OD2, Asp87, A | 2.78 |
| | | Ring C, NA | NH2, Arg86, A | 3.06 |
| | | Ring C, O2A | NZ, Lys83, A | 2.75 |
| | | Ring C, O1A | NH1, Arg86, A | 2.80 |
| | | Ring C, O1A | NH2, Arg86, A | 3.00 |
| | | Ring C, O1C | NB, Ring D, α PCB 84, A (water mediated) | 2.72, 2.93 |
| | | Ring D, OB | N, Thr75, B$ | 3.00 |
| | | Ring D, OB | O, Ala73, B$ (water mediated) | 2.74, 2.75 |
| | CH-π | Ring B, Center | CB, Lys83, A | 3.56, 3.52 |
| | | Ring B, Center | CD2, Leu124, A | 3.66 |
| | | Ring C, CAA | Phenyl ring, Phe122, A | 3.37 |
| | Cation-π | Ring C, Center | NH2, Arg86, A | 3.23 |
| β PCB 82 (molecule CYC B201 in the PDB coordinate file) | Covalent | Ring A, CAC | SG, Cys82, B | 1.80 |
| | H-bond | Ring A, NC | OD1, MeN72, B | 2.88 |
| | | Ring B, ND | OD2, Asp85, B | 2.86 |
| | | Ring B, O1D | O, Leu120, B (water mediated) | 2.76, 2.70 |
| | | Ring B, O2D | NH1, Arg77, B | 2.63 |
| | | Ring B, O2D | O2A, β PCB 82, B (water mediated) | 2.76, 2.41 |
| | | Ring C, NA | OD1, Asp85, B | 3.06 |
| | | Ring C, NA | OD2, Asp85, B | 2.78 |
| | | Ring C, O1A | NB, β PCB 82, B (water mediated) | 2.97, 2.78 |
| | | Ring C, O1A | NH1, Arg84, B | 2.86 |
| | | Ring C, O1A | NH2, Arg84, B | 2.32 |
| | | Ring C, O1A | NB, Ring D, β PCB 82, B (water mediated) | 2.90, 2.78 |
| | | Ring C, O2A | NH1, Arg77, B (water mediated) | 3.12, 3.15 |
| | CH-π | Ring B, Center | CB, Ala81, B | 3.79, 3.56 |
| | | Ring B, Center | CG2, Val122, B | 3.72 |
| β PCB 153 (molecule CYC B202 in the PDB coordinate file) | Covalent | Ring A, CAC | SG, Cys153, B | 1.80 |
| | H-bond | Ring A, OC | NH1, Arg145, B | 2.74 |
| | | Ring A, OC | N, Gly151, B | 3.12 |
| | | Ring A, NC | O, Thr149, B | 2.63 |
| | | Ring B, ND | OD2, Asp39, B | 2.80 |
| | | Ring B, O1D | ND2, Asn35, B | 3.26 |
| | | Ring B, O2D | O, Ser32, B (water mediated) | 2.35, 2.77 |
| | | Ring B, O2D | N, Asn35, B (water mediated) | 2.35, 2.91 |
| | | Ring C, NA | OD2, Asp39, B | 2.67 |
| | | Ring C, O1A | OG1, Thr149, B | 2.60 |
| | | Ring C, O2A | N, Thr149, B (water mediated) | 2.74, 2.94 |
| | | Ring C, O2A | NB, Ring D, β PCB 153, B (water mediated) | 2.64, 2.85 |
| | CH-π | Ring B, Center | CG, Lys36, B | 3.94, 3.66 |
| | | Ring C, Center | CB, Asn35, B | 3.54, 3.52 |
* Distances are defined as follows:
- for the covalent interaction, the length of the covalent CAC(chromophore)-SG(Cys) bond;
- for the H-bond interactions, the distance between the H-bond donor and acceptor atoms (two values if the interaction is water mediated);
- for the CH-π interaction, the distance between the CH-group C-atom and the aromatic ring center, and, if shorter, the distance between the CH-group C-atom and the closest C-atom of the ring;
- for the cation-π interaction, the distance between the cation and the aromatic ring center.
$ Symmetry-related chain B of the β-subunit within the (αβ)3-trimer
The chromophore α PCB 84 is positioned at the interface of two αβ-heterodimers in the pr-PC trimer (Figure 3B) (the binding pocket of α PCB 84 is shown in Fig. 4A). The position of the A-ring is fixed through its C-S covalent bond to the α84Cys residue and the H-bonds with the main chain peptide bond O/N atoms of α73Asn/α75Ala, respectively. The aromatic rings B and C are held in an almost co-planar arrangement, as found also in other known PC structures (Sonani et al. 2019). This coplanarity is achieved by three sets of interactions: (1) two H-bonds between the conserved α87Asp side chain and the N atoms of rings B and C; (2) multiple H-bonds between the B- and C-ring propionic acids and the side chains of the β57Arg, α83Lys and α86Arg residues; and (3) CH-π and cation-π interactions of the B- and C-rings or their propionic acid chains with α83Lys, α86Arg, α122Phe and α124Leu, as detailed in Table 2 (Jubb et al. 2017). Although these CH-π and cation-π interactions are non-classical and low-energy interactions, they must play an important role in the structure and function of this protein and others (Burley and Petsko 1986; Chakrabarti and Bhattacharyya 2007). It is of note that the residues involved in the protein-chromophore CH-π and cation-π interactions are strictly conserved. The plane of aromatic ring D deviates from the B-C plane by a dihedral angle of 28.1°. The orientation of ring D is controlled by three H-bonds, of which two are water-mediated, one with the C-ring propionic acid group and a second with the β73Ala backbone carbonyl group. The third one is a direct H-bond with the peptide bond N atom of β75Thr.
The chromophore β PCB 82 is located towards the interior cavity of the pr-PC trimer. Its overall conformation is almost identical to that of α PCB 84 (Figures 4B and 4D). The A-ring of β PCB 82 is fixed by a C-S covalent bond with the β82Cys and the H-bond with methylated β72Asn. Rings B and C are also co-planar in β PCB 82; their arrangement is fixed by three sets of interactions similar to those described above for α PCB 84 (for details see Table 2). Ring D in β PCB 82 is solvent-exposed and does not interact with any amino acid residues. Its deviation from the B-C plane is 40.1°. As the inner cavity of the PC trimer is a likely binding site for linker proteins, the orientation of the β PCB 82 D-ring can be further modified upon binding of such linker proteins.
Chromophore β PCB 153 is situated on the periphery of the pr-PC trimer and adopts a different conformation from the other two chromophores in pr-PC (Figures 4C and 4D). Its A-ring is fixed by a C-S covalent bond with β153Cys and H-bonds with residues β145Arg, β149Thr and β151Gly (details in Table 2). Residue β145Arg is the only residue interacting directly with the chromophores that is also among the 57 residues found to differ between the pr-PC gene-derived and crystal structure sequences. As can be seen in the sequence alignment presented in the Supplementary Material Fig. IB, an arginine is rarely found in position 145 of the PC β-subunit. There are only four known cases, with only a single determined crystal structure, PDB ID 4H0M (Marx and Adir 2013), besides the pr-PC presented here. The β145Arg in pr-PC contributes to fixing the chromophore β PCB 153 to the protein, but the Arg does not seem to be critical as there are already many other interactions involved in this process (see Table 2). This residue in the PC structure 4H0M does not make a similar H-bond contact with the β PCB 153 chromophore (Marx and Adir 2013).
Rings B and C of β PCB 153 are held co-planar by H-bonds between their N atoms and the β39Asp side chain, and by direct or water-mediated H-bond interactions between their propionic groups and residues β32Ser, β35Asn and β149Thr. Moreover, residues β36Lys and β35Asn are involved in CH-π interactions with rings B and C, respectively (Jubb et al., 2017). The orientation of ring D in β PCB 153 is also distinct as it is fixed above the B-C plane (see Figures 4C and 4D) by water-mediated H-bonding with the C-ring propionic group. As a result, ring D deviates by 42.0° from the B-C plane.
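The text does not state how the ring-plane deviations (28.1°, 40.1° and 42.0°) were measured. One common convention, sketched below in Python as an assumption rather than as the authors' procedure, is to take the normal of each ring from its atoms listed in ring order (Newell's method) and report the acute angle between the two normals. The function names and the synthetic test coordinates are illustrative; real input would be the ring-atom coordinates from the deposited model.

```python
import math


def ring_normal(coords):
    """Normal of a (near-)planar ring by Newell's method.
    coords: list of (x, y, z) tuples, atoms given in ring order."""
    nx = ny = nz = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1, z1 = coords[i]
        x2, y2, z2 = coords[(i + 1) % n]
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    return (nx, ny, nz)


def interplanar_angle(ring_a, ring_b):
    """Acute angle (degrees) between the mean planes of two rings."""
    na, nb = ring_normal(ring_a), ring_normal(ring_b)
    dot = sum(a * b for a, b in zip(na, nb))
    norm = math.sqrt(sum(a * a for a in na)) * math.sqrt(sum(b * b for b in nb))
    # abs() removes the dependence on atom ordering direction;
    # min() guards against rounding slightly above 1.0.
    return math.degrees(math.acos(min(1.0, abs(dot) / norm)))


# Synthetic check: a square in the xy-plane and the same square
# rotated by 30 degrees about the x-axis.
c, s = math.cos(math.radians(30.0)), math.sin(math.radians(30.0))
flat = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
tilted = [(0, 0, 0), (1, 0, 0), (1, c, s), (0, c, s)]
angle = interplanar_angle(flat, tilted)
```

For the chromophores, the B-C reference plane could be approximated by pooling the atoms of both rings or by averaging their normals; either choice is a further assumption and would change the reported angle slightly.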
The propionic side chains in β PCB 153 are oriented differently from those in the first two chromophores (i.e., the B-ring side chain protrudes down and the C-ring side chain protrudes up relative to the B-C plane). These differences in the disposition of the propionic acid side chains of the different chromophores with respect to the conjugated plane of rings B and C are shown in Figure 4D. Figure 4D also shows that the arrangement of rings A and D relative to the B-C plane differs between the three PCBs. In chromophores α PCB 84 and β PCB 82, rings A and D are on opposite sides of the B-C plane (anti-periplanar conformation), whereas in chromophore β PCB 153 they are on the same side of this plane (syn-periplanar conformation). It is worth pointing out that a similar difference in conformation between β PCB 153 and the other two chromophores was found in eight structurally characterized PCs that contain a phenylalanine (Phe) residue in position 28 of their α-subunits (PDB IDs: 1CPC, 1HA7, 4F0T, 4H0M, 5OOK, 5TOU, 6HRN and 6JPR). The binding pocket of chromophore β PCB 153 is located near the border between the α- and β-subunits, and the side chain of α-residue 28 is part of this pocket. The bulky side chain of Phe in this position does not allow the D-ring of β PCB 153 to adopt an anti-periplanar conformation relative to the B-C plane. Because Phe is present in this position in the majority of PCs (see the Supplementary Material Fig. IA), the syn-periplanar conformation of β PCB 153 should be considered the typical conformation. In three structurally characterized PCs with smaller residues such as aspartate or asparagine in position 28 of their α-subunit (PDB IDs: 2VML, 3L0F and 3O18), the conformation of β PCB 153 is anti-periplanar (i.e., the same as for chromophores α PCB 84 and β PCB 82).
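The syn-periplanar versus anti-periplanar distinction described above reduces to asking whether rings A and D lie on the same side of the B-C plane. A minimal sketch of that test is given below, assuming the plane is defined by three non-collinear points taken from rings B and C and that each outer ring is represented by the centroid of its atoms. All names and coordinates are hypothetical and chosen for illustration only.

```python
def centroid(points):
    """Arithmetic mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def plane_normal(p0, p1, p2):
    """Normal (not normalised) of the plane through three points."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def side_of_plane(point, origin, normal):
    """Signed offset of a point from the plane; only the sign is used."""
    return sum((point[i] - origin[i]) * normal[i] for i in range(3))


def classify_conformation(ring_a, ring_d, plane_points):
    """Return 'syn-periplanar' if rings A and D are on the same side
    of the reference plane, 'anti-periplanar' if on opposite sides."""
    p0, p1, p2 = plane_points
    normal = plane_normal(p0, p1, p2)
    side_a = side_of_plane(centroid(ring_a), p0, normal)
    side_d = side_of_plane(centroid(ring_d), p0, normal)
    product = side_a * side_d
    if product > 0:
        return "syn-periplanar"
    if product < 0:
        return "anti-periplanar"
    return "undetermined"  # at least one centroid lies in the plane


# Synthetic example: reference plane z = 0, ring A above, ring D below.
bc_plane = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
ring_above = [(-3.0, 0.0, 0.4), (-3.0, 1.0, 0.6)]
ring_below = [(3.0, 0.0, -0.6), (3.0, 1.0, -0.8)]
label = classify_conformation(ring_above, ring_below, bc_plane)
```

Using centroids keeps the test insensitive to small out-of-plane displacements of individual atoms, at the cost of ignoring rings that straddle the plane.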
The actual geometry of each chromophore is also influenced by several hydrophobic interactions, which are shown in Supplementary Material Fig. IV. With the exception of residue β145Arg, all other residues directly interacting with the three chromophores are fully conserved and are not among the 57 amino acids that were found different between the gene-derived and crystal structure sequences.
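The comparison summarised above (which positions differ between the gene-derived and crystal structure sequences, and whether any of them contact a chromophore) can be expressed in a few lines. The sketch below assumes the two sequences are already aligned, of equal length and free of gaps; the toy sequences and contact positions are invented for illustration and do not correspond to pr-PC numbering.

```python
def differing_positions(gene_seq, crystal_seq):
    """List (position, gene residue, crystal residue) for each mismatch.

    Positions are 1-based. Sequences must be pre-aligned and equal in length.
    """
    if len(gene_seq) != len(crystal_seq):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, g, c)
        for i, (g, c) in enumerate(zip(gene_seq, crystal_seq))
        if g != c
    ]


def changed_contact_positions(diffs, contact_positions):
    """Positions that both differ and are listed as chromophore contacts."""
    changed = {pos for pos, _, _ in diffs}
    return sorted(changed & set(contact_positions))


# Toy data (hypothetical): two 10-residue sequences and two contact sites.
gene = "MKTPLSEAVA"
crystal = "MKTALSERVA"
diffs = differing_positions(gene, crystal)
# diffs == [(4, "P", "A"), (8, "A", "R")]
overlap = changed_contact_positions(diffs, contact_positions=[2, 8])
# overlap == [8]
```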
There is a clear hierarchy of interactions of the PCB chromophores with their surrounding apo-proteins. The first level of interaction is the covalent attachment of ring A with the relevant Cys residue. This fixes the chromophore to the protein. A specific orientation of ring A is achieved by one or two hydrogen bonds. The next level of interaction sets the specific coplanar arrangement of rings B and C, and of their propionic side chains. This level involves the stronger non-covalent interactions (H-bonds) and some weaker interactions (CH-π and cation-π interactions) (Jubb et al. 2017). The final level of interaction controls the orientation of ring D relative to the B-C ring plane and dictates whether the overall conformation of the chromophore is anti-periplanar or syn-periplanar. At this level, a common factor in all three chromophores is the water-mediated H-bond interactions between the C-ring propionic acid group and the N atom of the D-ring. When the C-ring propionic group is ‘down’, the ring D is ‘down’ and vice-versa. Thus, the propionic acid position is strongly correlated with the corresponding orientation of the ring D. The exact angle of the D-ring deviation from the B-C plane appears to be controlled by other hydrogen bonds. In the case of chromophore α PCB 84 where that angle is 28.1°, there are two H-bond contacts between ring D and the surrounding protein (see details in Table 2). In the case of the other two chromophores, however, where that angle is ~40°, ring D only makes additional H-bond contacts with water molecules.
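The distinction drawn above between ring D contacts with protein residues and contacts with water molecules can be illustrated with a simple distance screen. This is a sketch only: it uses a heavy-atom distance cutoff of 3.5 Å, a commonly used but here assumed criterion, and ignores hydrogen-bond angles, so it is not a substitute for a dedicated interaction analysis. The atom tuples, the `polar_contacts` function and the coordinates are hypothetical.

```python
import math


def polar_contacts(target_xyz, atoms, cutoff=3.5):
    """Find N/O atoms within `cutoff` angstroms of a target atom.

    `atoms` is a list of (residue name, atom name, element, (x, y, z)).
    Returns (partner kind, residue name, atom name, distance) tuples,
    nearest first, where partner kind is 'water' for HOH, else 'protein'.
    """
    contacts = []
    for res_name, atom_name, element, xyz in atoms:
        if element not in ("N", "O"):
            continue  # only polar heavy atoms are considered
        distance = math.dist(target_xyz, xyz)
        if distance <= cutoff:
            kind = "water" if res_name == "HOH" else "protein"
            contacts.append((kind, res_name, atom_name, distance))
    return sorted(contacts, key=lambda c: c[3])


# Synthetic neighbourhood around a ring D nitrogen placed at the origin.
neighbours = [
    ("HOH", "O", "O", (2.8, 0.0, 0.0)),
    ("ASN", "OD1", "O", (0.0, 3.0, 0.0)),
    ("LEU", "CD1", "C", (0.0, 0.0, 3.0)),  # non-polar, skipped
    ("HOH", "O", "O", (5.0, 0.0, 0.0)),    # beyond the cutoff
]
found = polar_contacts((0.0, 0.0, 0.0), neighbours)
kinds = [c[0] for c in found]
```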
Final remarks
We have shown above that a combination of high-resolution X-ray crystallographic structural information and MS-based proteomics analysis can resolve the discrepancies previously found between the gene-derived protein sequences and the sequences found in the crystal structure. It is unlikely that these discrepancies result from mixed cultures, as the purity of our culture was re-confirmed by both morphological and 16S rRNA gene homology-based molecular assessments, as described previously (Parmar et al. 2011). Rather, we suspect that they reflect the presence of multiple isoforms of the cpcA and cpcB genes in the genome of Phormidium rubidum sp. A09DM, as was recently identified in the cyanobacterium Acaryochloris marina (Bar-Zvi et al. 2018). Efforts to sequence the genome of the cyanobacterium Phormidium rubidum sp. A09DM are now underway to test this hypothesis. We are also working with collaborators to use our atomic resolution PC structure to calculate the absorption properties of each individual PCB chromophore with state-of-the-art quantum mechanical methods.
Data availability
The coordinates associated with the present study are available at the Protein Data Bank with PDB ID 6XWK. The MS proteomics data are available from the ProteomeXchange Consortium with the dataset identifier PXD017229 (DOI 10.6019/PXD017229).
Acknowledgements
DM acknowledges the University Grants Commission (UGC) for UGC-BSR faculty fellowship. RJC, AWR, REB, MLG, HL were supported by the Photosynthetic Antenna Research Center (PARC), an Energy Frontier Research Center funded by the Department of Energy, Office of Science, Office of Basic Energy Sciences, under award number DE-SC0001035 (to REB). Mass spectrometry was supported by NIH P41GM103422. We thank PMI (Protein Metrics) for providing the PMI software suite for mass spectrometry sequencing analysis. H.L. owes thanks to Carl Hennicke for helping with the Python script. We thank the Diamond Light Source for access to beamlines I04 (MX11651-41) and I03 (MX11651-47) that contributed to the results presented here.
Abbreviations:
- PC: Phycocyanin
- pr-PC: Phycocyanin from Phormidium rubidum sp. A09DM
- PCB: Phycocyanobilin
- ED: Electron Density
- MS: Mass Spectrometry
- MPD: (4S)-2-methyl-2,4-pentanediol
- PEG-1K: Polyethylene glycol 1000
- MeN: Methylated Asn, γ-N-methylasparagine
- DLS: Diamond Light Source
Footnotes
Conflict of Interest
The authors declare that they have no conflict of interest.
REFERENCES
- Adir N, Vainer R, Lerner N (2002) Refined structure of C-phycocyanin from the cyanobacterium Synechococcus vulcanus at 1.6 Å: insights into the role of solvent molecules in thermal stability and co-factor structure. BBA-Bioenergetics 1556(2–3):168–174. 10.1016/S0005-2728(02)00359-6
- Bar-Zvi S, Lahav A, Harris D, Niedzwiedzki DM, Blankenship RE, Adir N (2018) Structural heterogeneity leads to functional homogeneity in A. marina phycocyanin. BBA-Bioenergetics 1859(7):544–553. 10.1016/j.bbabio.2018.04.007
- Burley SK, Petsko GA (1986) Amino-aromatic interactions in proteins. FEBS Lett 203(2):139–143. 10.1016/0014-5793(86)80730-X
- Chakrabarti P, Bhattacharyya R (2007) Geometry of nonbonded interactions involving planar groups in proteins. Prog Biophys Mol Bio 95(1–3):83–137. 10.1016/j.pbiomolbio.2007.03.016
- Contreras-Martel C, Matamala A, Bruna C, Poo-Caamaño G, Almonacid D, Figueroa M, Martínez-Oyanedel J, Bunster M (2007) The structure at 2 Å resolution of phycocyanin from Gracilaria chilensis and the energy transfer network in a PC–PC complex. Biophys Chem 125(2–3):388–396. 10.1016/j.bpc.2006.09.014
- Cruickshank DWJ (1999) Remarks about protein structure precision. Acta Cryst D 55(3):583–601. 10.1107/S0907444998012645
- David L, Marx A, Adir N (2011) High-resolution crystal structures of trimeric and rod phycocyanin. J Mol Biol 405(1):201–213. 10.1016/j.jmb.2010.10.036
- Emsley P, Lohkamp B, Scott WG, Cowtan K (2010) Features and development of Coot. Acta Cryst D 66(4):486–501. 10.1107/S0907444910007493
- Evans P (2006) Scaling and assessment of data quality. Acta Cryst D 62(1):72–82. 10.1107/S0907444905036693
- Evans PR, Murshudov GN (2013) How good are my data and what is the resolution? Acta Cryst D 69(7):1204–1214. 10.1107/S0907444913000061
- Gallivan JP, Dougherty DA (1999) Cation-π interactions in structural biology. Proc Natl Acad Sci USA 96(17):9459–9464. 10.1073/pnas.96.17.9459
- Gupta GD, Sonani RR, Sharma M, Patel K, Rastogi RP, Madamwar D, Kumar V (2016) Crystal structure analysis of phycocyanin from chromatically adapted Phormidium rubidum A09DM. RSC Adv 6(81):77898–77907. 10.1039/C6RA12493C
- Jubb HC, Higueruelo AP, Ochoa-Montaño B, Pitt WR, Ascher DB, Blundell TL (2017) Arpeggio: a web server for calculating and visualising interatomic interactions in protein structures. J Mol Biol 429(3):365–371. 10.1016/j.jmb.2016.12.004
- Kabsch W (2010) XDS. Acta Cryst D 66(2):125–132. 10.1107/S0907444909047337
- Karplus PA, Diederichs K (2012) Linking crystallographic model and data quality. Science 336(6084):1030–1033. 10.1126/science.1218231
- Klotz AV, Leary JA, Glazer AN (1986) Post-translational methylation of asparaginyl residues. Identification of beta-71 gamma-N-methylasparagine in allophycocyanin. J Biol Chem 261(34):15891–15894.
- Krissinel E, Henrick K (2007) Inference of macromolecular assemblies from crystalline state. J Mol Biol 372(3):774–797. 10.1016/j.jmb.2007.05.022
- Marx A, Adir N (2013) Allophycocyanin and phycocyanin crystal structures reveal facets of phycobilisome assembly. BBA-Bioenergetics 1827(3):311–318. 10.1016/j.bbabio.2012.11.006
- McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ (2007) Phaser crystallographic software. J Appl Cryst 40(4):658–674. 10.1107/S0021889807021206
- McNicholas S, Potterton E, Wilson KS, Noble MEM (2011) Presenting your structures: the CCP4mg molecular-graphics software. Acta Cryst D 67(4):386–394. 10.1107/S0907444911007281
- Murshudov GN, Skubák P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, Winn MD, Long F, Vagin AA (2011) REFMAC5 for the refinement of macromolecular crystal structures. Acta Cryst D 67(4):355–367. 10.1107/S0907444911001314
- Nishio M (2004) CH/π hydrogen bonds in crystals. Cryst Eng Comm 6(27):130–158. 10.1039/b313104a
- Parmar A, Singh NK, Kaushal A, Sonawala S, Madamwar D (2011) Purification, characterization and comparison of phycoerythrins from three different marine cyanobacterial cultures. Bioresour Technol 102(2):1795–1802. 10.1016/j.biortech.2010.09.025
- Peng PP, Dong LL, Sun YF, Zeng XL, Ding WL, Scheer H, Yang X, Zhao KH (2014) The structure of allophycocyanin B from Synechocystis PCC 6803 reveals the structural basis for the extreme redshift of the terminal emitter in phycobilisomes. Acta Cryst D 70(10):2558–2569. 10.1107/S1399004714015776
- Scheer H, Zhao KH (2008) Biliprotein maturation: the chromophore attachment. Mol Microbiol 68(2):263–276. 10.1111/j.1365-2958.2008.06160.x
- Singh NK, Sonani RR, Rastogi RP, Madamwar D (2015) The phycobilisomes: an early requisite for efficient photosynthesis in cyanobacteria. EXCLI J 14:268–289. 10.17179/excli2014-723
- Sonani RR, Rastogi RP, Patel SN, Chaubey MG, Singh NK, Gupta GD, Kumar V, Madamwar D (2019) Phylogenetic and crystallographic analysis of Nostoc phycocyanin having blue-shifted spectral properties. Sci Rep 9:9863. 10.1038/s41598-019-46288-4
- Spear-Bernstein L, Miller KR (1989) Unique location of the phycobiliprotein light-harvesting pigment in the cryptophyceae. J Phycol 25(3):412–419. 10.1111/j.1529-8817.1989.tb00245.x
- Watanabe M, Ikeuchi M (2013) Phycobilisome: architecture of a light-harvesting supercomplex. Photosynth Res 116(2–3):265–276. 10.1007/s11120-013-9905-3
- Weiss MS (2001) Global indicators of X-ray data quality. J Appl Cryst 34:130–135. 10.1107/S0021889800018227
- Winn MD, Ballard CC, Cowtan KD, Dodson EJ, Emsley P, Evans PR, Keegan RM, Krissinel EB, Leslie AGW, McCoy A, McNicholas SJ, Murshudov GN, Pannu NS, Potterton EA, Powell HR, Read RJ, Vagin A, Wilson KS (2011) Overview of the CCP4 suite and current developments. Acta Cryst D 67(4):235–242. 10.1107/S0907444910045749
- Winter G (2010) xia2: an expert system for macromolecular crystallography data reduction. J Appl Cryst 43(1):186–190. 10.1107/S0021889809045701
