The Eukaryotic Linear Motif resource for
Functional Sites in Proteins
Functional site class:
KLHDC2 C-terminal GG degrons
Functional site description:
Aberrant proteins can be deleterious to cells and are cleared by the ubiquitin-proteasome system. A group of C-end degrons that are recognized by specific cullin-RING ubiquitin E3 ligases (CRLs) has recently been found to be shared by some of these abnormal polypeptides. Among the different classes of C-end degradation signals, the diglycine degron is distinguished from others by the simplicity of its consensus sequence that lacks any side chains. The crystal structures of KLHDC2 in complex with three diglycine C-end degrons reveal a surprisingly condensed degron-E3 interface, which is cemented by a network of inter-molecular polar interactions.

Upon docking to the E3 enzyme, the degron adopts a compact conformation, which allows its carboxyl terminus and backbone carbonyl groups to make interactions with a cluster of conserved E3 side chains. This binding mode licenses the overall promiscuous diglycine degron with a minimum of five amino acids to achieve high affinity KLHDC2 binding (Rusnac,2018).
ELM Description:
Gly is one of the amino acids characteristically depleted at the C-terminal position of proteins (Hasenjager,2023). This is explained by the existence of ubiquitin ligases that recognize the C-terminus. While providing fast turnover to a limited set of cognate substrates, these E3 enzymes can also efficiently recognize out-of-the-ordinary polypeptide sequences resulting from aberrant translational readthrough, early termination, proteolytic cleavage, or simple mislocalization (Hasenjager,2023; Yeh,2021).

The terminal Gly-containing degrons are bound in vertebrates by a family of closely related, cullin 2-dependent E3 recognition subunits with Kelch repeat domains (the Cul2KLHDC1, Cul2KLHDC2, Cul2KLHDC3, and Cul2KLHDC10 ligases) (Koren,2018; Lin,2018). In all instances, the central depression formed at the junction of six β-blades is responsible for binding the last two C-terminal amino acids (positions -1 and -2). KLHDC2 is the best understood of these E3 ligases (Rusnac,2018). In the human KLHDC2 protein, Arg236 and Arg241 coordinate the carboxy terminus, most commonly Gly. Notably, the geometry of this pocket dictates that the amino acid at the -2 position must be glycine, while glycine and alanine are both tolerated at the -1 position. While KLHDC2 can also bind rare degrons terminating in Ala (GA degrons), it likely has further sequence preferences upstream of the carboxy terminus, as shown by the known crystal structures (6DO3; Rusnac,2018; 8EBL; Scott,2023). These preferences stem from the tight space at the centre of the Kelch domain and from the short subterminal α-helix (positions -6 to -3, if the carboxy terminus is position -1), which is only possible if the -5 position is a small amino acid (Ala, Gly, Pro, Cys, Ser), the -4 position has a moderately sized side chain, and neither the -3 nor the -4 position is Pro. However, not all known substrates satisfy these structural conditions, likely due to alternative geometries not yet explored by X-ray crystallography (Timms,2023).
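The structural conditions described above can be written down as a simple positional check. The following Python sketch is illustrative only: the function name `fits_helix_constraints` and the constant `SMALL_RESIDUES` are hypothetical, and the "moderately sized side chain" requirement at position -4 is deliberately not encoded because the text does not name a residue set for it. Since not all known substrates satisfy these conditions, a negative result should not be read as a prediction of non-binding.

```python
# Small residues named in the text as compatible with position -5.
SMALL_RESIDUES = set("AGPCS")


def fits_helix_constraints(sequence: str) -> bool:
    """Check the pocket rules (-1, -2) and the explicit upstream rules (-3 to -5).

    Positions are counted from the carboxy terminus, which is position -1.
    The size requirement at -4 is NOT checked (no residue set is given in the text).
    """
    seq = sequence.strip().upper()
    if len(seq) < 5:
        return False  # fewer than five residues cannot cover positions -5 to -1

    minus1, minus2, minus3, minus4, minus5 = (seq[-i] for i in range(1, 6))

    # Pocket: Gly is required at -2; Gly or Ala is tolerated at -1.
    pocket_ok = minus2 == "G" and minus1 in {"G", "A"}
    # Upstream: small residue at -5, no Pro at -3 or -4.
    upstream_ok = minus5 in SMALL_RESIDUES and "P" not in (minus3, minus4)
    return pocket_ok and upstream_ok


if __name__ == "__main__":
    # Made-up peptides used only to exercise the rules.
    for peptide in ("AACSCGG", "AAAAPGG", "AAWAAGG"):
        print(peptide, fits_helix_constraints(peptide))
```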
Pattern: ...G[GA]$
Pattern Probability: 0.0000196
Present in taxon: Metazoa
Interaction Domain:
Kelch-type beta propeller (IPR015915): this entry represents the 6-bladed Kelch beta-propeller, which consists of six 4-stranded beta-sheet motifs or six Kelch repeats (Stoichiometry: 1 : 1)
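As a minimal sketch of how the pattern ...G[GA]$ can be read as a regular expression, the following Python snippet tests whether a protein sequence ends in a candidate degron. The function name `has_klhdc2_cend_degron` is hypothetical and not part of ELM. The example strings are the subsequences listed for the two instances of this entry; truncating the first one before U (selenocysteine) is an assumption made here to mimic premature termination.

```python
import re

# ELM pattern for this class: any three residues, Gly at -2,
# Gly or Ala at -1, anchored to the C-terminus by "$".
KLHDC2_CEND_PATTERN = re.compile(r"...G[GA]$")


def has_klhdc2_cend_degron(sequence: str) -> bool:
    """Return True if the last five residues of the sequence fit ...G[GA]$."""
    # Strip whitespace so that "$" is tested against the true last residue.
    return KLHDC2_CEND_PATTERN.search(sequence.strip().upper()) is not None


if __name__ == "__main__":
    full_length = "GGGACSWRPGRRGPSSGGUG"            # subsequence as listed, with U
    truncated = full_length[: full_length.index("U")]  # assumed stop at the Sec codon
    native = "QGIFFKEDSHKESNDCSCGG"                 # subsequence listed for EPHB2

    for name, seq in (("full_length", full_length),
                      ("truncated", truncated),
                      ("native", native)):
        print(name, has_klhdc2_cend_degron(seq))
```

A match only means that the C-terminus fits the motif; it says nothing about whether the terminus is accessible or whether the protein is a genuine substrate.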
Abstract:
Protein quality control is a vital cellular process that ensures the proper folding, assembly, and function of proteins. It involves various surveillance mechanisms that detect misfolded, mislocalized or damaged proteins and either facilitate their refolding or target them for degradation via protein degradation pathways such as the ubiquitin-proteasome system, thereby maintaining cellular homeostasis (Yeh,2021).

C-terminal degrons (C-degrons, also known as destabilizing C-terminal ends, DesCEnds) are short amino acid sequences located at the C-terminus of proteins that play a crucial role in regulating protein stability and degradation. These degrons are recognized by specific E3 ubiquitin ligases, such as the Cul2KLHDC2 E3 ligase, which target the protein for ubiquitination and subsequent proteasomal degradation. C-terminal degrons can be present internally in full-length proteins, in which case they must be activated by proteolytic cleavage. Alternatively, they can be natively present at the C-termini of proteins or be introduced by premature translation termination.

KLHDC2 is the adaptor subunit of a cullin 2-based ubiquitin ligase complex that targets C-terminal diglycine motifs. Proteins natively terminating in Gly-Gly are rare in the proteome; such termini more frequently arise by proteolytic cleavage (notably by ubiquitin-specific proteases, USPs) or by improper translation of selenoproteins in which the stop codon is not overridden. Selenoproteins, in particular, are prone to early termination during synthesis if the selenocysteinyl-tRNAs that override the stop codon are not present at a suitable concentration. Thus, KLHDC2 also serves as part of a quality control mechanism that eliminates truncated Gly-rich selenoproteins during periods of selenium restriction. The substrate specificity of KLHDC2 overlaps to a certain extent with that of KLHDC3 (preferentially recognizing Arg/Lys/Gln-Gly C-terminal degrons) and KLHDC10 (principally recognizing Trp/Ala/Pro-Gly C-terminal degrons) (Koren,2018; Timms,2023), and some selenoproteins might be substrates of multiple ligases.

Thus, the known diglycine termini arise in three distinct ways: (1) proteolytic cleavage, so far seen only at ubiquitin-specific protease (USP) cleavage sites, which can generate terminal L.GG sequences, where L is required for USP recognition (see e.g. structure 7ZH3); (2) insufficient selenocysteine incorporation in the context of a GGU sequence (U = selenocysteine); or (3) natively C-terminal sequences (GG$), which are rare.
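The three origins listed above can be sketched as a simple sequence scan. The Python snippet below is a hedged illustration, not an ELM tool: the function name `diglycine_origins` and the origin labels are hypothetical, and the scan only reports sequence contexts that could expose a Gly-Gly terminus, without checking whether cleavage or truncation actually happens.

```python
import re
from typing import List, Tuple


def diglycine_origins(sequence: str) -> List[Tuple[str, int]]:
    """List possible sources of a Gly-Gly C-terminus in a sequence.

    Each hit is (origin_label, position), where position is the 1-based index
    of the Gly that would become the new carboxy terminus.
    """
    seq = sequence.strip().upper()
    hits: List[Tuple[str, int]] = []

    # (1) Internal L.GG followed by at least one residue: candidate USP cleavage site.
    for match in re.finditer(r"(?=L.GG.)", seq):
        hits.append(("USP cleavage (L.GG)", match.start() + 4))

    # (2) GG immediately before selenocysteine (U): truncation at the Sec codon.
    for match in re.finditer(r"(?=GGU)", seq):
        hits.append(("truncation before Sec (GGU)", match.start() + 2))

    # (3) Gly-Gly already at the C-terminus of the sequence.
    if seq.endswith("GG"):
        hits.append(("native C-terminus (GG$)", len(seq)))

    return hits


if __name__ == "__main__":
    # The first string is a made-up example; the other two are the
    # subsequences listed for the instances of this entry.
    for seq in ("MQIFVLRLRGGMQ", "GGGACSWRPGRRGPSSGGUG", "QGIFFKEDSHKESNDCSCGG"):
        print(seq, diglycine_origins(seq))
```

Lookahead assertions are used so that overlapping sites are all reported rather than only the first of an overlapping pair.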

o 2 Instances for DEG_Cend_KLHDC2_1
| Acc., Gene name | Start | End | Subsequence | Logic | #Ev. | Organism | Notes |
| | 184 | 188 | GGGACSWRPGRRGPSSGGUG | TP | 8 | Homo sapiens (Human) | |
| P29323 EPHB2 | 1051 | 1055 | QGIFFKEDSHKESNDCSCGG | U | 1 | Homo sapiens (Human) | |
Please cite: The Eukaryotic Linear Motif resource: 2022 release. (PMID:34718738)

ELM data can be downloaded & distributed for non-commercial use according to the ELM Software License Agreement