Deep Convolutional Neural Networks for Detecting Secondary Structures in Protein Density Maps from Cryo-Electron Microscopy

Rongjian Li; Dong Si; Tao Zeng; Shuiwang Ji; Jing He

doi:10.1109/BIBM.2016.7822490

Author manuscript; available in PMC: 2018 May 14.

Published in final edited form as: Proceedings (IEEE Int Conf Bioinformatics Biomed). 2017 Jan 19;2016:41–46. doi: 10.1109/BIBM.2016.7822490


PMCID: PMC5952046 NIHMSID: NIHMS874389 PMID: 29770260

Abstract

Detecting the secondary structure of proteins from three-dimensional (3D) cryo-electron microscopy (cryo-EM) images remains challenging when the spatial resolution of the images is at the medium level (5–10Å). Prior research focused on local features, which may not capture the global information of image objects. In this study, we propose to use deep learning methods to extract highly representative global features and then automatically detect the secondary structures of proteins. In particular, we build a convolutional neural network (CNN) classifier that predicts, for every individual voxel in a 3D cryo-EM image, the probability of each secondary structure label: α-helix, β-sheet, or background. To effectively incorporate the 3D spatial information in protein structures, we propose to perform 3D convolutions in the convolutional layers of CNNs. We show that the proposed CNN classifier can outperform an existing SVM method at identifying the secondary structure elements of proteins from medium-resolution 3D cryo-EM images.

I. Introduction and related work

Proteins perform most of the work of living cells with unique and stable three-dimensional (3D) structures. Cryo-electron microscopy (cryo-EM) is an increasingly popular experimental technique for studying the structures of protein complexes. Through cryo-EM, a number of large molecular complexes, such as ribosomes and viruses, have been resolved to near-atomic resolutions [21]. For cryo-EM images at lower resolutions such as 5–10Å (referred to as medium resolution in this paper), detailed molecular features are not resolved. Deriving atomic structures from such density images is a challenging problem. Two types of approaches have been proposed. Fitting relies on a suitable atomic structure [9], [26], [32], and de novo modeling relies on matching the secondary structures in the density image with those in the protein sequence [1]–[4], [8], [20], [24].

The major difficulty in detecting secondary structures from images of medium resolution is that the spatial shape patterns of secondary structure elements (SSEs) at medium resolution are hard to distinguish from their closely located neighbours. The most common SSEs are α-helices, β-sheets, and turns/loops. An example showing their shapes is given in Figure 1. In general, long α-helices and large β-sheets can be detected fairly accurately. However, short α-helices appear similar to turns/loops in density images at medium resolution, and a β-sheet with two strands can be confused with an α-helix. The spacing between two neighboring β-strands is about 5Å, and therefore they are not resolved at medium resolution. We previously proposed different approaches to predict the location of β-strands using StrandTwister and StrandRoller [6], [7], [28]. More accurate methods are needed for automatically identifying SSEs from cryo-EM images at medium resolution.

Fig. 1 — Illustration of different SSEs in an example protein including α-helix, β-sheet and turns/loops. The backbone of a protein structure is shown as a ribbon, and the surface view of the corresponding density image is superimposed.

Most prior methods for detecting SSEs at medium resolution are based on image-processing techniques [5], [12], [16], [17], [25], [27], [29], [34]. These methods search for cylinder-like regions for α-helices and plane-like regions for β-sheets. In general, these existing methods have two common drawbacks. The first is that users often need to select parameters carefully for the method to work. The second is that they do not fully exploit existing data to assist in detecting the SSEs of new samples. Recently, learning-based methods that need little user intervention have attracted growing research attention for detecting protein SSEs. The study in [23] used nested K-nearest-neighbor classifiers to improve α-helix detection. The authors of [29] developed a machine learning framework based on the support vector machine (SVM) method to automatically identify α-helices and β-sheets in one density image using other existing volumetric images. However, the sample features they used are local features, which are not representative enough of the essential characteristics of protein structures.

Convolutional neural networks (CNNs) are a type of fully trainable model that learns a hierarchy of features through nonlinear mappings between multiple stacked layers. CNNs have been widely used in a number of image-related applications and have achieved state-of-the-art performance on tasks including large-scale image and video recognition [15], [18], [35], [36], digit recognition [11], and object recognition [19]. Recently, many attempts have been made to extend these models to image segmentation, leading to improved performance [14], [31], [37]. One appealing property of CNNs is that the features learned through trainable parameters can capture highly nonlinear relationships between inputs and outputs. Therefore, it is natural to employ CNNs to obtain highly representative features from cryo-EM images and so improve the detection of protein SSEs.

In this study, we apply CNNs to detect protein secondary structures in cryo-EM images. Specifically, we built a voxel-wise CNN classifier that predicts, for every individual voxel in a given cryo-EM image, the probabilities of the different kinds of SSEs. To effectively incorporate the 3D spatial information in protein structures, we propose to perform 3D convolutions in the convolutional layers of CNNs so that discriminative features along all three spatial dimensions are captured. The proposed CNN classifier accepts voxel cryo-EM densities as input and automatically learns highly discriminative features for producing intermediate label predictions. These intermediate label predictions are then integrated with post-processing steps for detecting the final secondary structures. In addition, conventional approaches used patch-based predictions to obtain the outputs of CNNs on test images, which is very time-consuming for large images. To reduce the computational complexity, we apply a stack of deconvolution layers in the CNN architecture to produce a dense voxel-wise prediction efficiently. We compare the performance of our approach with that of commonly used learning-based methods on a number of challenging 3D simulated density images. Results show that the proposed model significantly outperforms prior methods at detecting the secondary structures of proteins from volumetric images.

II. The proposed deep model

One major challenge of using CNNs for cryo-EM segmentation is the large image diversity in the database. The complicated structures of proteins require the network to learn features at multiple scales. For example, recognizing α-helices needs large filters, since an α-helix usually extends a long way in 3D space. On the other hand, detecting β-sheet structures needs small filters to capture local information in a short and flat neighborhood. To overcome this difficulty, we propose a novel 3D CNN with inception learning and residual learning (Figure 2). An inception network [30] usually uses multiple convolutional layers with different kernel sizes, together with max pooling layers, to form different paths between two hidden layers. Such a design allows us to increase the number of trainable parameters at each stage significantly without sharply increasing the computational complexity at later stages. Residual learning [13] is designed to approximate the desired nonlinear mapping between the input and output of a stage by adding a shortcut identity mapping connection. A residual network can achieve more accurate results in a very deep model without increasing computation costs.

Fig. 2 — Illustration of the inception (left) and residual (right) learning.

Another challenge of using CNNs for cryo-EM segmentation is that the sizes of volumetric images vary and some images are very large. This requires predictions on test images to be more efficient. In the traditional patch-based prediction mechanism for generating image segmentations using CNNs, we first extract patches from the images and then use those patches as inputs to the trained network. The output for each patch is a single label for the center pixel of that patch. Such patch-based prediction results in a huge amount of redundant computation. It is thus desirable to design a fast prediction algorithm that can segment the whole image directly without generating patches. To be specific, we applied deconvolutional layers to offset the size reduction caused by convolution and max-pooling operations. One significant advantage of using deconvolution operations is that the output feature map can have the same size as the input image if the deconvolution kernel and stride sizes are carefully selected. In this paper, we used multiple deconvolution operations at different intermediate layers to generate feature maps of the same size. The deconvolved feature maps were then summed to form a multi-scale representation of the model input. Through such a design, our network is able to generate an end-to-end mapping between inputs and outputs. This leads to dense voxel-wise prediction over images without any computational redundancy.
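
The size bookkeeping behind this design can be sketched with the standard output-length formulas for a strided convolution (or pooling) layer and for a transposed convolution. The kernel, stride, and padding values below are hypothetical and chosen only to illustrate the arithmetic; they are not the settings of the network in this paper.

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_out(n, kernel, stride=1, padding=0):
    """Output length of a transposed convolution (deconvolution) along one axis."""
    return (n - 1) * stride - 2 * padding + kernel


# Hypothetical layer settings, used only to illustrate the arithmetic.
n_in = 32                                                  # one side of an input patch
n_mid = conv_out(n_in, kernel=3, stride=2, padding=1)      # downsampling layer
n_out = deconv_out(n_mid, kernel=2, stride=2, padding=0)   # upsampling layer
print(n_in, n_mid, n_out)  # 32 16 32
```

With these values the deconvolution restores the input length exactly. For an odd length such as 7, the same pair of layers gives 7 → 4 → 8, which illustrates why the kernel and stride sizes have to be selected with care.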

A. Dilated convolution

Many image-related applications, such as semantic segmentation, require a model that preserves local pixel-level accuracy, such as the precise detection of edges, while also using knowledge from the wider global context. To this end, researchers have developed various deep learning techniques for acquiring a multi-scale representation of the input. Besides the inception learning and residual learning introduced above, convolution with a dilated filter has also been studied and has shown excellent performance in many computer vision applications. Convolution with a dilated filter is an extension of the original convolution. Its significant property is that dilated convolutions support exponentially expanding receptive fields without losing resolution or coverage. Therefore, a neural network that uses them can capture information at different scales without greatly increasing the number of parameters. In particular, the original convolution of a 1-D input signal f with a kernel k is defined as follows,

{(k * f)}_{t} = \sum_{\tau = -\infty}^{\infty} k_{\tau} f_{t - \tau}

where t indexes the entries of f. The convolution of f with k using a dilation factor l is instead defined as:

{(k *_{l} f)}_{t} = \sum_{\tau = -\infty}^{\infty} k_{\tau} f_{t - l\tau}

Setting l = 1 recovers the original convolution.

In the dilated convolution, the kernel only touches the signal at every l-th entry. This formula applies to a 1-D signal, but it can be straightforwardly extended to higher dimensional convolutions.
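
As a minimal sketch of this definition, the function below evaluates the sum of k_τ · f_{t − lτ} over the kernel entries for a finite signal. Treating entries outside the signal as zero and indexing the kernel from τ = 0 are assumptions made here for illustration, as are the function names.

```python
def dilated_conv1d(f, k, l=1):
    """Return sum_tau k[tau] * f[t - l*tau] for every t in range(len(f)).

    Entries of f outside the valid index range are treated as zero
    (an assumption for handling a finite signal).
    """
    out = []
    for t in range(len(f)):
        total = 0
        for tau, weight in enumerate(k):
            idx = t - l * tau          # the kernel touches every l-th entry
            if 0 <= idx < len(f):
                total += weight * f[idx]
        out.append(total)
    return out


def effective_kernel_size(kernel_size, l):
    """Number of signal entries spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (l - 1)


f = [1, 2, 3, 4, 5, 6]
k = [1, 1, 1]
print(dilated_conv1d(f, k, l=1))      # [1, 3, 6, 9, 12, 15]
print(dilated_conv1d(f, k, l=2))      # [1, 2, 4, 6, 9, 12]
print(effective_kernel_size(3, 2))    # 5
```

With l = 2, a kernel of size 3 spans 5 signal entries while still using only 3 weights, which is how dilation enlarges the receptive field without adding parameters.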

Recently, dilated convolutions have been employed for semantic segmentation on natural images. The authors of [22] analyzed filter dilation and performed preliminary experiments comparing it with other techniques. The authors of [10] used dilated convolutions to simplify the architecture of [22]. A new convolutional network architecture that systematically uses dilated convolutions was proposed in [33] for multi-scale context aggregation. In [33], the spatial pooling layers were replaced with convolutions with increased filter dilation. In this paper, we propose to integrate dilated convolution with inception and residual learning to build an efficient neural network for identifying protein SSEs from cryo-EM images. To be specific, we used dilated convolution in inception Module A and inception Module B of Figure 3. Before the residual learning was applied, we used a dilated convolution with filter size 3 to obtain larger receptive fields.

Fig. 3 — Detailed architecture of the 3D convolutional neural network with dense prediction. For each module in the architecture, convolutional layers are denoted by filter sizes, and the numbers in parentheses denote the numbers of feature maps used. Except for those layers with a stride size of 2, which are indicated by “s=2”, the stride sizes in all other layers are 1. The filter sizes in the third dimension are all 1 and are thus omitted in the figure. The orange arrows indicate the shortcuts in residual learning. The red blocks indicate dilated convolution layers with kernel size 3 × 3 and a dilation factor of 2, which is indicated by “L=2”.

B. The detailed architecture

We provide the detailed configuration of our proposed deep network in Figure 3. The network generates the final probability map for the α-helix and β-sheet voxels. The whole network contains 6 modules, each of which uses multiple paths to realize inception learning. Three modules (Inception A, B, C) mainly build the nonlinear relationship between input and output, and another three modules (Reduction A, B, C) mainly reduce the sizes of the feature maps. In order to extract more multi-scale information from the input images, we adopted dilated convolutions in the 3 ‘Inception’ modules. To be specific, we introduced one extra dilated convolution layer with a kernel size of 3 and a dilation factor of 2 on top of the concatenation layers, before the residual learning was applied.

III. Experimental evaluation

A. Experiments setup

In this work, we selected 25 simulated cryo-EM images for training and testing. In particular, we generated the training and test images at a resolution of 8Å using the “pdb2mrc” command of “EMAN” with a sampling size of 1Å/pixel. We chose 15 of these 25 subjects for training the proposed model. The model was then evaluated on the remaining 10 subjects to test the SSE detection performance.

To present a quantitative estimation about the size of the identified helices and β-sheets, we estimate the number of Cα atoms that are close to the identified helix voxels and β-sheet voxels. The detailed selections of parameters for closeness are similar to those given in [29].

We further use specificity and sensitivity to evaluate the accuracy of our model. Sensitivity is the ratio of correctly detected (true positive) voxels to all true (positive) voxels. Specificity is the percentage of true negative voxels among all negative voxels. We also use the “F1” score to measure the segmentation accuracy on detecting helix and sheet Cα atoms. The F1 score is defined as

F_{1} = \frac{2 |A \cap B|}{|A| + |B|}

where |A| denotes the number of α-helix (or β-sheet) voxels in the segmentation A produced by the CNN, |B| is the number of α-helix (or β-sheet) voxels in the ground truth segmentation B, and |A ∩ B| is the number of α-helix (or β-sheet) voxels shared by A and B. The F1 score lies in [0,1], and a higher value indicates better detection accuracy.
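
A minimal sketch of this score, assuming each segmentation is represented as a set of voxel coordinates for one class; the function name, the toy coordinates, and the convention for two empty segmentations are illustrative choices, not part of the original method.

```python
def f1_score(predicted, truth):
    """F1 = 2|A ∩ B| / (|A| + |B|) for two collections of voxel coordinates."""
    a, b = set(predicted), set(truth)
    if not a and not b:
        return 1.0  # convention chosen here for two empty segmentations (assumption)
    return 2 * len(a & b) / (len(a) + len(b))


# Toy example: 4 predicted helix voxels, 5 true helix voxels, 3 shared.
predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
truth = {(0, 0, 1), (0, 1, 1), (1, 1, 1), (2, 1, 1), (2, 2, 1)}
print(f1_score(predicted, truth))  # 2 * 3 / (4 + 5), about 0.667
```

The same quantity equals the harmonic mean of precision (|A ∩ B| / |A|) and recall (|A ∩ B| / |B|).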

During the training phase, we used small patches of size 32 × 32 × 7 extracted from the training images as inputs and outputs, a choice based on both computational cost and training image sizes. To save computational resources, we trimmed off the background margins of the training images according to the ground truth. To improve model performance, we used data augmentation to enlarge the training data set. The data augmentation includes transformations of the original images with rotation and flipping along different dimensions. During the test phase, we used the whole test images as inputs to the proposed model. The carefully selected kernel sizes in the convolutions and deconvolutions ensure that the outputs have the same size as the inputs, which significantly increased the prediction speed for obtaining the output probability values over SSEs.
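
The rotation and flipping augmentation can be sketched as follows for a patch stored as nested lists indexed vol[x][y][z]. The text does not specify the rotation angles or axes, so the 90° rotations in the plane of the first two axes, and all function names, are assumptions made for illustration.

```python
def flip(vol, axis):
    """Mirror a volume stored as nested lists vol[x][y][z] along one axis."""
    if axis == 0:
        return [[row[:] for row in plane] for plane in vol[::-1]]
    if axis == 1:
        return [[row[:] for row in plane[::-1]] for plane in vol]
    if axis == 2:
        return [[row[::-1] for row in plane] for plane in vol]
    raise ValueError("axis must be 0, 1 or 2")


def rot90_xy(vol):
    """Rotate the volume by 90 degrees in the plane of the first two axes."""
    nx, ny = len(vol), len(vol[0])
    return [[vol[j][ny - 1 - i][:] for j in range(nx)] for i in range(ny)]


def augment(vol):
    """Return rotated copies of a patch, each with its three axis flips."""
    variants = []
    current = vol
    for _ in range(4):                      # 0, 90, 180 and 270 degrees
        variants.append(current)
        variants.extend(flip(current, axis) for axis in range(3))
        current = rot90_xy(current)
    return variants


patch = [[[1], [2]], [[3], [4]]]            # toy 2 x 2 x 1 patch
print(len(augment(patch)))                  # 16
```

Some of the 16 variants coincide (for example, a flip of the 180° rotation equals a different flip of the original), so a practical pipeline might remove duplicates. The ground truth label patch would need the same transformation as its input patch.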

B. Performance on simulated cryo-EM density images

To demonstrate the effectiveness of the proposed method, we first report the specificity and sensitivity based on detected helix and sheet Cα atoms. In this study, we consider a Cα atom an identified helix Cα if it lies within 2.5Å of an identified helix voxel. Similarly, an identified sheet Cα should lie within 3Å of an identified sheet voxel. The detailed numbers of identified Cα atoms for all test cryo-EM density images are given in Table I. The average specificity and sensitivity of helix identification reach 71.52% and 97.86%, respectively. The average specificity and sensitivity of β-sheet identification are 76.04% and 91.87%, respectively. The high sensitivity shows the ability of our CNN method to correctly detect the SSEs.

TABLE I.

Accuracy of identified Cα atoms from the simulated images. Total: total number of Cα atoms in the protein; ‘tp’, ‘m’ and ‘fp’: ‘true positive’, ‘missed’ and ‘false positive’ atoms, respectively; ‘Hlx’ and ‘Sht’: helix and sheet, respectively; Spe, Sen: specificity and sensitivity, respectively.

ID	1ajw	1ajz	1al7	1cv1	1dai	1eny	1wab	2aw0	2itg	3lck	Average
Total	145	282	350	162	219	268	212	72	160	270
Hlx	5	124	159	123	84	126	96	22	66	107
tp.Hlx	5	120	155	119	82	124	96	22	66	98
m.Hlx	0	4	4	4	2	2	0	0	0	9
fp.Hlx	26	23	35	21	24	60	57	11	31	25
Sht	63	37	46	14	47	66	24	25	21	30
tp.Sht	60	32	41	12	45	46	24	25	21	29
m.Sht	3	5	5	2	2	20	0	0	0	1
fp.Sht	61	18	14	18	16	18	22	31	49	24
Spe.Hlx (%)	81.43	85.44	81.68	46.15	82.22	57.75	50.86	78	67.02	84.66	71.52
Sen.Hlx (%)	100	96.77	97.48	96.75	97.62	98.41	100	100	100	91.59	97.86
Spe.Sht (%)	25.61	92.65	95.39	87.84	90.70	91.09	88.30	34.04	64.75	90	76.04
Sen.Sht (%)	95.24	86.49	89.13	85.71	95.74	69.70	100	100	100	96.67	91.87
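
The percentage rows of Table I can be reproduced from the count rows. The sketch below assumes sensitivity = tp / (tp + m) and specificity = (negatives − fp) / negatives, where negatives is the total number of Cα atoms minus the number in the given SSE class. These formulas are inferred from the table, and the function names are illustrative.

```python
def sensitivity(tp, missed):
    """Percentage of true SSE atoms that were identified."""
    return 100.0 * tp / (tp + missed)


def specificity(total, positives, fp):
    """Percentage of non-SSE atoms that were not falsely identified."""
    negatives = total - positives
    return 100.0 * (negatives - fp) / negatives


# Protein 1ajw, helix rows: Total=145, Hlx=5, tp=5, m=0, fp=26
print(round(sensitivity(5, 0), 2), round(specificity(145, 5, 26), 2))     # 100.0 81.43
# Protein 1ajw, sheet rows: Total=145, Sht=63, tp=60, m=3, fp=61
print(round(sensitivity(60, 3), 2), round(specificity(145, 63, 61), 2))   # 95.24 25.61
```

Both results match the 1ajw column of Table I.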


To provide a comprehensive, quantitative evaluation of the proposed method for detecting protein SSEs, we also report the identification performance on all 10 test cryo-EM images. Without post-processing, the proposed method outperformed the existing SVM method for both α-helix and β-sheet detection (Table II). Specifically, the CNN achieved an average F1 score of 78.66% for α-helix and 67.50% for β-sheet over the 10 test subjects, for an overall value of 73.08%. In contrast, the SVM method achieved an F1 score of 60.04% for α-helix and 41.19% for β-sheet, for an overall value of 50.62%. The authors of [29] also proposed a post-processing step for improving detection performance, so we additionally report the results of SVM and CNN after applying that step. These results are listed in Table III. After post-processing, SVM is more accurate than CNN on average (78.67% versus 76.33% overall), mainly because of β-sheet detection (73.81% versus 68.43%); for α-helix the two are close (83.53% for SVM versus 84.22% for CNN). The main reason is that the post-processing step in [29] was designed specifically for SVM, to remove sparse false positive voxels, and may not be effective for the output of the proposed CNN model. Overall, the performance of CNN remains comparable with that of SVM even after a post-processing step tailored to SVM. Designing an efficient, customized post-processing step for deep learning is one of our future research directions. In addition to the quantitative comparison, we visually examined the identification results for α-helix and β-sheet without post-processing for two test samples in Figure 4. The ground truth three-dimensional structural morphologies are shown with purple ribbons. The CNN method generates fewer false positive voxels than the SVM method, which explains the high specificity of the CNN method.

TABLE III.

Comparison of accuracy (using F1 score) between deep learning model (DL) and support vector machine method (SVM) for α-helix and β-sheet respectively after post-processing step.

ID	F1 Helix		F1 Sheet		F1 average
ID	DL	SVM	DL	SVM	DL	SVM
1ajw	50.00	21.74	67.80	83.22	58.90	52.48
1ajz	92.43	93.28	70.89	81.32	81.66	87.30
1al7	89.87	92.72	76.40	88.24	83.14	90.48
1cv1	92.24	88.60	51.43	63.41	71.84	76.01
1dai	91.23	93.33	77.78	79.25	84.51	86.29
1eny	83.33	86.76	63.55	73.38	73.44	80.07
1wab	83.49	91.01	72.73	64.86	78.11	77.94
2aw0	89.80	93.62	68.57	72.73	79.19	83.18
2itg	81.16	80.00	66.67	62.69	73.92	71.35
3lck	88.69	94.23	68.49	68.97	78.59	81.60
Average	84.22	83.53	68.43	73.81	76.33	78.67


Fig. 4 — Comparison of identification methods for two test cryo-EM images. The first row is for the protein ‘1wab’, and the second row is for the protein ‘2itg’. The first column shows the results of the proposed CNN method. The second column shows the results of the SVM method. Note that the results for both methods are generated without any post-processing steps.

IV. Conclusion

Identifying the secondary structures of proteins is challenging because of their structural similarities in 3D space. We demonstrate the use of 3D CNNs to segment cryo-EM images. The comparison between CNN and SVM shows a significant advantage of CNN in accuracy before post-processing.

In this study, we used only a small number of simulated density images with a medium resolution of 8Å. Our initial results from applying the CNN to experimentally derived cryo-EM data (data not shown) lead to a similar conclusion. In addition, we used a single model for segmenting the cryo-EM image voxels. We will develop ensembles of novel CNN models with more efficient architectures to obtain better segmentation accuracy.

TABLE II.

Comparison of accuracy (using F1 score) between deep learning model (DL) and support vector machine method (SVM) for α-helix and β-sheet respectively without any post-processing step.

ID	F1 Helix		F1 Sheet		F1 average
ID	DL	SVM	DL	SVM	DL	SVM
1ajw	27.78	8.26	65.22	72.41	46.50	40.34
1ajz	89.89	66.67	73.56	34.26	81.73	50.47
1al7	88.83	66.67	81.19	34.85	85.01	50.76
1cv1	90.49	86.83	54.55	21.05	72.52	53.94
1dai	86.32	58.87	83.33	49.47	84.83	54.17
1eny	80.00	66.84	70.77	50.38	75.39	58.61
1wab	77.11	66.67	68.57	30.38	72.84	48.53
2aw0	80.00	47.83	61.73	56.18	70.87	52.01
2itg	80.98	67.35	46.15	27.63	63.57	47.49
3lck	85.22	64.42	69.88	35.29	77.55	49.86
Average	78.66	60.04	67.50	41.19	73.08	50.62


Acknowledgments

This work was supported by National Science Foundation grant DBI-1350258, NIH-R0-GM062968, and DBI-1356621. Part of this work was done while Rongjian Li was a visiting student at Washington State University.

References

1. Al Nasr K, Chen L, Si D, Ranjan D, Zubair M, He J. Building the initial chain of the proteins through de novo modeling of the cryo-electron microscopy volume data at the medium resolutions. Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine; 2012. pp. 490–497.
2. Al Nasr K, Ranjan D, Zubair M, Chen L, He J. Solving the secondary structure matching problem in cryo-em de novo modeling using a constrained-shortest path graph algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2014;11(2):419–430. doi: 10.1109/TCBB.2014.2302803.
3. Al Nasr K, Ranjan D, Zubair M, He J. Ranking valid topologies of the secondary structure elements using a constraint graph. Journal of Bioinformatics and Computational Biology. 2011;9(03):415–430. doi: 10.1142/s0219720011005604.
4. Baker ML, Abeysinghe SS, Schuh S, Coleman RA, Abrams A, Marsh MP, Hryc CF, Ruths T, Chiu W, Ju T. Modeling protein structure at near atomic resolutions with gorgon. Journal of structural biology. 2011;174(2):360–373. doi: 10.1016/j.jsb.2011.01.015.
5. Baker ML, Ju T, Chiu W. Identification of secondary structure elements in intermediate-resolution density maps. Structure. 2007;15(1):7–19. doi: 10.1016/j.str.2006.11.008.
6. Biswas A, Ranjan D, Zubair M, He J. A dynamic programming algorithm for finding the optimal placement of a secondary structure topology in cryo-em data. Journal of Computational Biology. 2015;22(9):837–843. doi: 10.1089/cmb.2015.0120.
7. Biswas A, Ranjan D, Zubair M, Zeil S, Al Nasr K, He J. An effective computational method incorporating multiple secondary structure predictions in topology determination for cryo-em images. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2016 doi: 10.1109/TCBB.2016.2543721. page in press.
8. Biswas A, Si D, Al Nasr K, Ranjan D, Zubair M, He J. Improved efficiency in cryo-em secondary structure topology determination from inaccurate data. Journal of bioinformatics and computational biology. 2012;10(03):1242006. doi: 10.1142/S0219720012420061.
9. Chan KY, Trabuco LG, Schreiner E, Schulten K. Cryo-electron microscopy modeling by the molecular dynamics flexible fitting method. Biopolymers. 2012;97(9):678–686. doi: 10.1002/bip.22042.
10. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL. Semantic image segmentation with deep convolutional nets and fully connected crfs. 2014 doi: 10.1109/TPAMI.2017.2699184. arXiv preprint arXiv:1412.7062.
11. Ciresan DC, Giusti A, Gambardella LM, Schmidhuber J. Mitosis detection in breast cancer histology images with deep neural networks. MICCAI. 2013;2:411–418. doi: 10.1007/978-3-642-40763-5_51.
12. Dal Palu A, He J, Pontelli E, Lu Y. Identification of a-helices from low resolution protein density maps. Computational Systems Bioinformatics Conference. 2006:89–98.
13. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. 2015 arXiv preprint arXiv:1512.03385.
14. Jain V, Seung S. Natural image denoising with convolutional networks. In: Koller D, Schuurmans D, Bengio Y, Bottou L, editors. Advances in Neural Information Processing Systems. Vol. 21. 2009. pp. 769–776.
15. Ji S, Xu W, Yang M, Yu K. 3D convolutional neural networks for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013;35(1):221–231. doi: 10.1109/TPAMI.2012.59.
16. Jiang W, Baker ML, Ludtke SJ, Chiu W. Bridging the information gap: computational tools for intermediate resolution structure interpretation. Journal of molecular biology. 2001;308(5):1033–1044. doi: 10.1006/jmbi.2001.4633.
17. Kong Y, Ma J. A structural-informatics approach for mining β-sheets: locating sheets in intermediate-resolution density maps. Journal of molecular biology. 2003;332(2):399–413. doi: 10.1016/s0022-2836(03)00859-3.
18. Krizhevsky A, Sutskever I, Hinton G. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. 2012;25:1106–1114.
19. LeCun Y, Huang FJ, Bottou L. Learning methods for generic object recognition with invariance to pose and lighting. Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on; IEEE; 2004. pp. II–97.
20. Lindert S, Alexander N, Wötzel N, Karakaş M, Stewart PL, Meiler J. Em-fold: de novo atomic-detail protein structure determination from medium-resolution density maps. Structure. 2012;20(3):464–478. doi: 10.1016/j.str.2012.01.023.
21. Liu Z, Guo F, Wang F, Li TC, Jiang W. 2.9 Å resolution cryo-em 3d reconstruction of close-packed virus particles. Structure. 2016;24(2):319–328. doi: 10.1016/j.str.2015.12.006.
22. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR); June 2015.
23. Ma L, Reisert M, Burkhardt H. Rennsh: a novel alpha-helix identification approach for intermediate resolution electron density maps. Computational Biology and Bioinformatics, IEEE/ACM Transactions on. 2012;9(1):228–239. doi: 10.1109/TCBB.2011.52.
24. Nasr KA, Chen L, Ranjan D, Zubair M, Si D, He J. A constrained k-shortest path algorithm to rank the topologies of the protein secondary structure elements detected in cryoem volume maps. Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics; ACM; 2013. p. 749.
25. Rusu M, Wriggers W. Evolutionary bidirectional expansion for the tracing of alpha helices in cryo-electron microscopy reconstructions. Journal of structural biology. 2012;177(2):410–419. doi: 10.1016/j.jsb.2011.11.029.
26. Schröder GF, Brunger AT, Levitt M. Combining efficient conformational sampling with a deformable elastic network model facilitates structure refinement at low resolution. Structure. 2007;15(12):1630–1641. doi: 10.1016/j.str.2007.09.021.
27. Si D, He J. Beta-sheet detection and representation from medium resolution cryo-em density maps. Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics; ACM; 2013. p. 764.
28. Si D, He J. Tracing beta strands using strandtwister from cryoem density maps at medium resolutions. Structure. 2014;22(11):1665–1676. doi: 10.1016/j.str.2014.08.017.
29. Si D, Ji S, Nasr KA, He J. A machine learning approach for the identification of protein secondary structure elements from electron cryo-microscopy density maps. Biopolymers. 2012;97(9):698–708. doi: 10.1002/bip.22063.
30. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2015. pp. 1–9.
31. Turaga SC, Murray JF, Jain V, Roth F, Helmstaedter M, Briggman K, Denk W, Seung HS. Convolutional networks can learn to generate affinity graphs for image segmentation. Neural Computation. 2010;22(2):511–538. doi: 10.1162/neco.2009.10-08-881.
32. Wriggers W, Birmanns S. Using situs for flexible and rigid-body fitting of multiresolution single-molecule data. Journal of structural biology. 2001;133(2):193–202. doi: 10.1006/jsbi.2000.4350.
33. Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions. 2015 arXiv preprint arXiv:1511.07122.
34. Yu Z, Bajaj C. Computational approaches for automatic structural analysis of large biomolecular complexes. Computational Biology and Bioinformatics, IEEE/ACM Transactions on. 2008;5(4):568–582. doi: 10.1109/TCBB.2007.70226.
35. Zeiler MD, Fergus R. Computer vision–ECCV 2014. Springer; 2014. Visualizing and understanding convolutional networks; pp. 818–833.
36. Zeng T, Li R, Mukkamala R, Ye J, Ji S. Deep convolutional neural networks for annotating gene expression patterns in the mouse brain. BMC bioinformatics. 2015;16(1):1. doi: 10.1186/s12859-015-0553-9.
37. Zhang W, Li R, Deng H, Wang L, Lin W, Ji S, Shen D. Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage. 2015;108:214–224. doi: 10.1016/j.neuroimage.2014.12.061.

