function |
(33,39,57,58) |
•entropic chains |
IDRs carrying out functions
that benefit directly from their conformational disorder, e.g., flexible
linkers and spacers |
MAP2 projection domain,
titin PEVK domain, RPA70, MDA5 |
•display sites |
flexibility of IDRs facilitates
exposure of motifs and easy access for proteins that introduce and
read PTMs |
p53, histone tails, p27,
CREB kinase-inducible domain |
•chaperones |
their binding properties
(many different partners, rapid association/dissociation, and folding
upon binding) make IDPs suitable for chaperone functions |
hnRNP A1, GroEL, α-crystallin,
Hsp33 |
•effectors |
folding-upon-binding mechanisms
allow effectors to modify the activity of their partner proteins |
p21, p27, calpastatin, WASP
GTPase-binding domain |
•assemblers |
assembling IDRs have large
binding interfaces that scaffold multiple binding partners and promote
the formation of higher-order protein complexes |
ribosomal proteins L5, L7,
L12, L20, Tcf 3/4, CREB transactivator domain, Axin |
•scavengers |
disordered scavengers
store and neutralize small ligands |
chromogranin
A, Pro-rich glycoproteins, caseins and other SCPPs |
functional features |
linear motifs (47,125)
|
•structural modification |
sites of conformational
alteration of a peptide backbone |
peptidylprolyl cis–trans
isomerase Pin1 sites |
|
•proteolytic cleavage |
sites of post-translational
processing or proteolytic scission |
Caspase-3/-7, separase,
taspase1 scission sites |
|
•PTM removal/addition |
specific binding sequences
that recruit enzymes catalyzing PTM moiety addition or removal |
cyclin-dependent kinase
phosphorylation site, SUMOylation site, N-glycosylation site |
|
•complex promoting |
motifs that mediate protein–protein
interactions important for complex formation; often associated with
signal transduction |
proline-rich SH3-binding
motif, cyclin box, pY SH2-binding motif, PDZ-binding motif, TRAF-binding
motifs in MAVS |
|
•docking |
motifs that increase the
specificity and efficiency of modification events by providing an
additional binding surface |
KEN box degron, MAPK docking
sites |
|
•targeting or trafficking |
signal sites that localize
proteins within particular subcellular organelles or act to traffic
proteins |
nuclear localization signal,
clathrin box motif, endocytosis adaptor trafficking motifs |
molecular recognition
features (MoRFs) (121)
|
•alpha |
disordered motifs that form
α-helices upon target binding |
p53 ∼ Mdm2, p53 ∼
RPA70, p53 ∼ S100B(ββ), RNase E ∼ enolase,
inhibitor IA3 ∼ proteinase A |
|
•beta |
disordered motifs that form
β-strands upon target binding |
RNase E ∼ polynucleotide
phosphorylase, Grim ∼ DIAP1, pVIc ∼ adenovirus 2 proteinase |
|
•iota |
disordered motifs that form
irregular secondary structure upon target binding |
p53 ∼ Cdk2-cyclin
A, amphiphysin ∼ α-adaptin C |
|
•complex |
disordered motifs that form
combinations of different types of secondary structure upon target
binding |
amyloid β A4 ∼ X11, WASP ∼ Cdc42 |
intrinsically disordered
domains (IDDs) (158,159)
|
|
some protein domains identified
using sequence-based approaches are fully or largely disordered |
WH2, RPEL, BH3, KID domains |
co-occurrence of protein domains
with disordered regions (161,162)
|
|
particular disordered
regions frequently co-occur in the same sequence with specific protein
domains |
|
structure |
structural continuum (37)
|
|
proteins function within
a continuum of differently disordered conformations, extending from
fully structured to completely disordered, with everything in between
and no strict boundaries between the states |
|
protein quartet (32,34,166)
|
•intrinsic coil |
flexible regions of extended
conformation with hardly any secondary structure; high net charge
differentiates these from disordered globules |
ribosomal proteins L22,
L27, 30S, S19, prothymosin α |
|
•pre-molten globule |
disordered protein regions
with residual secondary structure, often poised for folding upon binding
events; lower net charge makes them more compact than coils |
Max, ribosomal proteins
S12, S18, L23, L32, calsequestrin |
|
•molten globule |
globally collapsed conformation
with regions of fluctuating secondary structure |
nuclear coactivator binding
domain of CREB binding protein |
|
•folded |
structured proteins
with a defined three-dimensional structure |
most enzymes,
transmembrane domains, hemoglobin, actin |
sequence |
sequence–structural
ensemble relationships (166,204)
|
•polar tracts |
sequence stretches enriched
in polar amino acids often form globules that are generally devoid
of significant secondary structure preferences |
Asn- and Gly-rich sequences,
Gln-rich linkers in transcription factors and RNA-binding proteins |
|
•polyelectrolytes |
amino acid compositions
biased toward charged residues of one type; strong polyelectrolytes
(high net charge) form expanded coils |
Arg-rich protamines, Glu/Asp-rich
prothymosin α |
|
•polyampholytes |
sequences with roughly equal
numbers of positive and negative charges; conformations of polyampholytes
are governed by the linear distribution of oppositely charged residues,
with segregation of opposite charges leading to globules, while well-mixed
charged sequences adopt random-coil or globular conformations, depending
on the total charge |
RNA chaperones, splicing
factors, titin PEVK domain, yeast prion Sup35 |
prediction flavors (205)
|
•V |
predicted best by the VL-2V
predictor, for which the hydrophobic amino acids are the most influential
attributes |
E. coli ribosomal proteins |
|
•C |
VL-2C is the best predictor
for flavor C, which has more histidine, methionine, and alanine residues
than the other flavors |
poly- and oligosaccharide
binding domains |
|
•S |
flavor with less histidine
than the others, best predicted by predictor VL-2S, which has a measure
of sequence complexity as the most important attribute |
proteins that facilitate
binding and interaction |
disorder–sequence
complexity (206)
|
|
IDPs from different functional
classes show distinct disorder–sequence complexity distributions |
proteins with disordered
linkers between structured domains populate compact and disordered
disorder–complexity (DC) regions |
overall degree of disorder (35,51,68,161,208,209)
|
•fraction |
categorization of proteins
based on the fraction of residues predicted to be disordered |
0–10/10–30/30–100%
disorder |
|
•overall score |
overall disorder scores
for the whole protein |
minimum average disorder
score depending on the predictor |
|
•continuous stretches |
presence or absence of continuous
stretches of disordered residues |
typically >30 residues |
length of disordered regions (211)
|
•>500 residues |
proteins that contain disordered
regions of different lengths are enriched for different types of functions |
transcription |
|
•300–500 residues |
|
kinase and phosphatase functions |
|
•<50 residues |
|
(metal) ion binding, ion
channels, GTPase regulatory activity |
position of disordered regions (211)
|
•N-terminal |
proteins that contain disordered
regions at different locations in the sequence are enriched for different
types of functions |
DNA-binding, ion channel |
|
•internal |
|
transcription regulator,
DNA-binding |
|
•C-terminal |
|
transcription repressor/activator,
ion channel |
tandem repeats (217,218)
|
•Q/N |
glutamine- and asparagine-rich
protein regions are both important for normal cellular function and
prone to cause harmful aggregation |
huntingtin, Sup35p, Ure2p,
Ccr4, Pop2 |
|
•S/R |
tandem repeats composed
of arginine and serine residues are phosphorylated and disordered,
and play a role in spliceosome assembly |
ASF/SF2, SRp75, SRSF1 |
|
•K/A/P |
tandem repeats composed
of lysine, alanine, and proline function in binding nucleosome linker
DNA |
histone H1 |
|
•F/G |
disordered domains with
phenylalanine-glycine repeats influence nuclear pore complex (NPC)
gating behavior |
nucleoporins |
|
•P/T/S |
extensively glycosylated
regions rich in proline, threonine, and serine residues are involved
in mucus formation |
mucins |
|
|
•others |
|
|
protein interactions |
fuzzy complexes by topology (242)
|
•polymorphic |
a form of static disorder,
with alternative bound conformations serving distinct functions by
having different effects on the binding partner |
β-catenin ∼
Tcf4, NLS ∼ importin-α, actin ∼ WH2 domain |
|
•clamp |
complex formation through
folding upon binding of two disordered protein segments, connected
by a linker that remains disordered |
Ste5 ∼ Fus3, myosin
VI ∼ actin filament, Oct-1 ∼ DNA |
|
•flanking |
complex formation through
folding upon binding of a central disordered protein segment, flanked
by two regions that remain disordered |
SF1 splicing factor ∼
U2AF, proline-rich peptides ∼ SH3 domains, p27Kip1 ∼ cyclin-Cdk2 |
|
•random |
disordered regions that
remain highly dynamic even in the bound state |
elastin self-assembly, Sic1
∼ Cdc4 |
fuzzy complexes by mechanism (176,251)
|
•conformational selection |
the fuzzy region facilitates
the formation of the binding-competent form by shifting the conformational
equilibrium |
Max ∼ DNA, MeCP2 ∼ DNA |
|
•flexibility modulation |
the fuzzy region modulates
the flexibility of the binding interface and changes binding entropy |
Ets-1 ∼ DNA, SSB ∼ DNA |
|
•competitive binding |
the fuzzy region serves
as an intramolecular competitive partner for the binding surface |
HMGB1 ∼ DNA, RNase1 ∼ RNase inhibitor |
|
•tethering |
the fuzzy region increases
the local concentration of a weak-affinity binding domain near the
target, or anchors it via transient interactions |
RPA ∼ DNA, UPF1 ∼
UPF2, PC4 ∼ VP16 |
binding plasticity (257)
|
•static |
mono-/polyvalent complexes,
chameleons, penetrators, huggers |
for examples, see Figure 12
|
|
•coiled-coil based |
intertwined strings, long
cylindrical containers, connectors, armature, tweezers and forceps,
grabbers, tentacles, pullers, stackers |
|
|
•dynamic |
cloud contacts
and protein interaction ensembles |
|
evolution |
sequence conservation (54)
|
•flexible |
regions that require the
property of disorder for functionality regardless of the exact sequence |
signaling and regulatory
proteins (Sky1, Bur1) |
|
•constrained |
regions of conserved disorder
that also have highly conserved amino acid sequences |
ribosomal proteins (Rpl5),
protein chaperones (Hsp90) |
|
•nonconserved |
no conservation of the disorder,
nor of the underlying sequence; no clear functional hallmarks |
yeast Ty1 retrotransposon
domains A and B |
conservation of amino acid
composition (260)
|
•HR |
IDRs with high residue conservation |
transcription regulation
and DNA binding |
|
•LRHT |
IDRs with low residue conservation
but high conservation of the amino acid composition of the region |
ATPase and nuclease activities |
|
•LRLT |
IDRs with neither conservation
of sequence nor conservation of amino acid composition |
(metal) ion binding proteins |
lineage and species
specificity (159)
|
•prokaryotes |
species from different kingdoms
of life seem to use disorder for different types of functions |
longer lasting interactions
involved in complex formation |
|
•eukaryotes and viruses |
|
transient interactions in
signaling and regulation |
evolutionary history and
mechanism of repeat expansion (61)
|
•Type I |
repeats that showed no functional
diversification after expansion |
titin PEVK domain, salivary
proline-rich proteins |
|
•Type II |
repeats that acquired diverse
functions through mutation or differential location within the sequence |
RNA polymerase II (CTD) |
|
•Type III |
repeats that
gained new functions as a consequence of their expansion |
prion protein octarepeats |
regulation |
expression patterns (208)
|
•constitutive |
IDPs encoded by constitutively
highly expressed transcripts are almost entirely disordered and are often
ribosomal proteins |
ribosomal L proteins |
|
•high |
IDP-encoding transcripts
showing high expression levels in most tissues and little tissue specificity |
protease inhibitors, splicing
factors, complex assemblers |
|
•medium |
IDP-encoding transcripts
expressed at medium levels, with some tissue specificity |
DNA binding, transcription
regulation |
|
•tissue-specific |
IDP-encoding transcripts
with highly tissue-specific expression |
cell organization regulators,
complex disassemblers |
|
•low or transient |
IDP-encoding transcripts
that are present in undetectable amounts; more than one-half of analyzed
IDPs |
variety of functions |
alternative splicing (304,305,309,312,313)
|
|
regulation and evolutionary
patterns of inclusion and exclusion of IDR-encoding exons can provide
insights into whether the encoded IDR functions in protein regulation
and interactions |
a tissue-specific region
with a phosphosite in the TJP1 protein in mouse, a mammalian-specific
region in the PTB1 splicing regulator |
degradation kinetics (315,316,318,320,321)
|
•degradation accelerators |
IDRs that can influence
and accelerate proteasomal degradation of the protein containing them |
|
|
•others |
IDRs that have no influence
on protein half-life or increase it, e.g., because of sequence compositions
that impede proteasome processivity |
low complexity sequences
such as glycine-alanine repeats and polyglutamine repeats |
post-translational
processing and secretion (337,340)
|
|
secreted proteins
are depleted for IDPs, but structural disorder is important in, e.g.,
prohormones, the extracellular matrix, and biomineralization |
pre-pro-opiomelanocortin,
elastic fiber proteins, SIBLINGs, mucins |
biophysical
properties |
solubility (209)
|
|
the sequence features of
IDPs are generally associated with aqueous solubility, although some
IDPs are thermostable, while others are not; this is likely modulated
by sequence–structural ensemble relationships, such as the
degree of compaction |
4E-BP1, calpastatin, CREB,
p21, p27, Sp1, stathmin, WASP |
phase transition (137,353)
|
|
certain IDRs (such as those
that contain specific low-complexity regions or interaction motifs)
can undergo phase transitions like the formation of protein-based
droplets or hydrogels |
multivalent SH3-binding
motifs in phase separation, granule-like assemblies of RNA-binding
proteins containing low-complexity IDRs, mucins |
biomineralization (117,341)
|
|
structural disorder is common
in proteins with roles in biomineralization, such as the formation
of bone and teeth |
caseins, osteopontin, bone
sialoprotein 2, dentin sialophosphoprotein |
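
The "overall degree of disorder" rows of the table (fraction, overall score, continuous stretches) describe simple summaries of per-residue disorder predictions. The following is a minimal Python sketch of how such summaries might be computed. It is not taken from any of the cited studies. The function name `disorder_summary`, the 0.5 score threshold for calling a residue disordered, and the treatment of bin boundaries (10% falls in the 10–30% bin, 30% in the 30–100% bin) are assumptions made for illustration; only the 0–10/10–30/30–100% bins and the ">30 residues" stretch length come from the table.

```python
from typing import Dict, Sequence, Union


def disorder_summary(
    scores: Sequence[float],
    threshold: float = 0.5,   # assumed cutoff; real predictors define their own
    stretch_cutoff: int = 30,  # table: "typically >30 residues"
) -> Dict[str, Union[float, int, str, bool]]:
    """Summarize per-residue disorder scores for one protein (hypothetical helper)."""
    if len(scores) == 0:
        raise ValueError("scores must contain at least one residue")

    # A residue counts as disordered when its score reaches the threshold.
    flags = [s >= threshold for s in scores]
    n_total = len(flags)
    n_disordered = sum(flags)

    # Longest run of consecutive disordered residues.
    longest = 0
    run = 0
    for is_disordered in flags:
        run = run + 1 if is_disordered else 0
        longest = max(longest, run)

    # Bin by fraction of disordered residues; integer arithmetic avoids
    # floating-point surprises exactly at the 10% and 30% boundaries.
    if 10 * n_disordered < n_total:
        category = "0-10%"
    elif 10 * n_disordered < 3 * n_total:
        category = "10-30%"
    else:
        category = "30-100%"

    return {
        "fraction": n_disordered / n_total,
        "mean_score": sum(scores) / n_total,  # one possible "overall score"
        "category": category,
        "longest_stretch": longest,
        "has_long_stretch": longest > stretch_cutoff,
    }


if __name__ == "__main__":
    # Toy input: 40 disordered residues followed by 60 ordered ones.
    toy_scores = [0.9] * 40 + [0.1] * 60
    print(disorder_summary(toy_scores))
```

Because the table notes that the minimum average disorder score depends on the predictor, `threshold` and any cutoff applied to `mean_score` would need to be chosen per predictor rather than fixed at the values used here.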