Overall scheme of DeepEC. DeepEC consists of 3 independent CNNs: CNN-1 classifies whether an input protein sequence is an enzyme, and CNN-2 and CNN-3 predict third- and fourth-level EC numbers, respectively. Homology analysis using DIAMOND (33) is also implemented for sequences that cannot be assigned EC numbers by the CNNs. The 3 CNNs share the same embedding, convolution, and 1-max pooling layers, but have different fully connected layers to perform their 3 different tasks. DeepEC generates final EC numbers as output only if the 3 CNNs generate consistent results: CNN-1 must classify the protein sequence as an enzyme, and the EC numbers generated by CNN-2 and CNN-3 must agree in class (first number), subclass (second number), and sub-subclass (third number). Because CNN-2 and CNN-3 are multilabel classification models, multiple EC numbers can be predicted for a given protein sequence. If a protein sequence is classified as an enzyme by CNN-1 but is not assigned specific EC numbers by CNN-2 and CNN-3, homology analysis is subsequently conducted. See SI Appendix, Materials and Methods for details on the operation of the CNNs and the implementation of homology analysis within DeepEC. ReLU, rectified linear unit.
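The decision logic described in this caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the DeepEC source code: the function names (consistent_ec_numbers, assign_ec_numbers), the use of string sets for predictions, and the homology_search callback standing in for the DIAMOND step are all hypothetical. The sketch also assumes that inconsistent CNN-2 and CNN-3 predictions are handled the same way as missing ones, which the caption does not state explicitly.

```python
from typing import Callable, List, Set


def consistent_ec_numbers(third_level: Set[str], fourth_level: Set[str]) -> List[str]:
    """Keep fourth-level EC numbers whose first three fields match a
    predicted third-level EC number (class, subclass, sub-subclass)."""
    return sorted(
        ec4
        for ec4 in fourth_level
        if ".".join(ec4.split(".")[:3]) in third_level
    )


def assign_ec_numbers(
    is_enzyme: bool,                        # hypothetical output of CNN-1
    third_level: Set[str],                  # hypothetical output of CNN-2, e.g. {"1.1.1"}
    fourth_level: Set[str],                 # hypothetical output of CNN-3, e.g. {"1.1.1.1"}
    homology_search: Callable[[], Set[str]],  # stand-in for the DIAMOND-based step
) -> List[str]:
    """Return final EC numbers for one protein sequence."""
    # A sequence that CNN-1 rejects gets no EC number at all.
    if not is_enzyme:
        return []

    # Multilabel outputs: several EC numbers may survive the consistency check.
    consistent = consistent_ec_numbers(third_level, fourth_level)
    if consistent:
        return consistent

    # Assumption: fall back to homology analysis whenever the CNNs yield
    # no consistent EC number for a predicted enzyme.
    return sorted(homology_search())
```

For example, under these assumptions a predicted enzyme with CNN-2 output {"1.1.1"} and CNN-3 output {"1.1.1.1", "2.7.1.1"} would be assigned only 1.1.1.1, because 2.7.1.1 has no matching third-level prediction.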