
2021 Apr 7;118(15):e2019053118. doi: 10.1073/pnas.2019053118

Copyright © 2021 the Author(s). Published by PNAS.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

Fig. 2. (A–E) Comparison of (A) sequence length (in amino acids, a.a.), (B) hydrophobicity, (C) Shannon entropy, and the fraction of the sequence that is part of (D) low-complexity regions (LCRs) and (E) intrinsically disordered regions (IDRs) for the three training datasets and Swiss-Prot. The comparison showed that the average construct in the $\mathrm{LLPS}^{+}$ dataset (cyan) was longer than those in the $\mathrm{LLPS}^{-}$ (orange) and $\mathrm{PDB}^{*}$ (magenta) datasets, and that it was less hydrophobic and had a higher LCR fraction than sequences in the $\mathrm{LLPS}^{-}$, $\mathrm{PDB}^{*}$, or Swiss-Prot (gray) datasets. It also had a lower Shannon entropy and a higher IDR fraction than sequences in the $\mathrm{PDB}^{*}$ or Swiss-Prot datasets. The boxes bound the data between the upper and lower quartiles, and the center lines indicate the mean value. The whiskers extend beyond the quartiles by 1.5 times the interquartile range or to the most extreme value. Significance was tested with a Mann–Whitney test; **** denotes a P value below $10^{-4}$, and ns denotes no significance at $P \leq 0.01$. Full distributions are shown in SI Appendix, Fig. S1. The dashed line in C corresponds to the case in which all amino acids are present at equal frequencies.
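
To make the Shannon entropy in panel C and its equal-frequency reference line concrete, the following is a minimal sketch of a per-sequence entropy calculation. The caption does not give the logarithm base or the exact procedure the authors used, so the base-2 logarithm, the function name `shannon_entropy`, and the 20-letter alphabet constant are assumptions made for illustration only. Under these assumptions, a sequence in which all 20 standard amino acids occur equally often reaches the maximum value of $\log_2 20 \approx 4.32$ bits.

```python
import math
from collections import Counter

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def shannon_entropy(sequence: str, base: float = 2.0) -> float:
    """Shannon entropy of the residue frequencies in one sequence.

    Hypothetical helper: the base (2, i.e. bits) is an assumption,
    not something stated in the figure caption.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = len(sequence)
    counts = Counter(sequence)
    # H = -sum_i p_i * log(p_i), with p_i the observed residue frequencies.
    return -sum((n / total) * math.log(n / total, base) for n in counts.values())


# Equal-frequency case: every amino acid appears exactly once.
uniform_limit = shannon_entropy(AMINO_ACIDS)

# A single repeated residue carries no compositional information.
homopolymer = shannon_entropy("QQQQQQQQ")
```

A sequence enriched in a few residue types, as expected for low-complexity regions, falls between these two extremes, which is consistent with the lower entropy the caption reports for the $\mathrm{LLPS}^{+}$ dataset.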