Collagen
Collagen is the most abundant protein in the human body, constituting ~30% of total protein mass and providing the structural scaffold for virtually every tissue. Its defining feature is the triple helix: three α-chains coil around each other, each requiring glycine at every third residue, and the assembly is stabilised by 4-hydroxyproline, which is formed by vitamin C-dependent prolyl hydroxylase. More than 28 distinct types serve specific mechanical and signalling roles: type I dominates bone, skin, and tendon; type IV scaffolds all basement membranes; type VII anchors dermal-epidermal junctions. The same protein that gives healthy tissue its tensile strength becomes pathological in excess: in fibrosis, TGF-β-driven myofibroblast activation deposits type I/III collagen in liver (cirrhosis), lung (IPF), and kidney (CKD).
Overview
The collagen superfamily contains more than 28 distinct types, encoded by at least 44 genes, ranging from large fibril-forming collagens (types I–III, V, XI) that provide tensile strength in bone, skin, and tendon, to the sheet-forming type IV that scaffolds all basement membranes, to transmembrane collagens (type XVII) that anchor cells. About 90% of body collagen is type I.
Collagen pathology is correspondingly broad: scurvy (vitamin C deficiency → prolyl hydroxylase failure → unstable helix), osteogenesis imperfecta (COL1A1/A2 Gly substitution — dominant-negative defect), Ehlers-Danlos syndrome (multiple types; COL5A1/A2, COL3A1, PLOD1), Alport syndrome (COL4A3-A5 mutations → GBM failure), and organ fibrosis (TGF-β → myofibroblast activation → excess type I/III deposition).
Structure
Triple-helix architecture
The fundamental structural unit of all collagens is the triple helix: three α-chains, each in a left-handed polyproline II (PPII) helix conformation, coil around each other into a right-handed superhelix (~300 nm long and ~1.5 nm in diameter for fibrillar collagens). The Gly-X-Y repeat is the structural imperative: every third residue must be glycine, the only residue small enough to fit the sterically crowded central axis of the triple helix. X is frequently proline and Y is frequently 4-hydroxyproline (Hyp), which stabilises the helix through ring-pucker (stereoelectronic) effects and water-mediated hydrogen bonding between chains. Loss of Hyp (scurvy) destabilises the triple helix → connective tissue fragility.
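The Gly-X-Y rule lends itself to a simple check. The sketch below is illustrative only: it assumes a one-letter sequence string in which `O` stands for 4-hydroxyproline (a common shorthand, not something defined in this article), and the function name `glycine_violations` is hypothetical.

```python
def glycine_violations(seq: str, frame: int = 0) -> list[int]:
    """Return 0-based positions that should hold Gly in a Gly-X-Y repeat but do not.

    `frame` is the index of the first expected glycine (assumed to be 0 here).
    """
    seq = seq.upper()
    return [i for i in range(frame, len(seq), 3) if seq[i] != "G"]


# Idealised collagen-like peptide: five Gly-Pro-Hyp triplets.
ideal = "GPO" * 5

# Same peptide with a single Gly -> Ser substitution at position 6,
# the kind of change that disrupts helix packing.
mutant = ideal[:6] + "S" + ideal[7:]

print(glycine_violations(ideal))   # []
print(glycine_violations(mutant))  # [6]
```

A real α-chain also has non-helical telopeptides and propeptides, so a check like this would only apply to the triple-helical domain.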
| Type | Genes | Distribution | Primary role |
|---|---|---|---|
| I | COL1A1/A2 | Bone, skin, tendon, cornea, dentin | Tensile strength; ~90% of body collagen |
| II | COL2A1 | Cartilage, vitreous | Compressive force resistance |
| III | COL3A1 | Fetal skin, blood vessels, GI tract | Often co-deposited with type I; vascular compliance |
| IV | COL4A1–A6 | All basement membranes (GBM, tubular BM, skin BMZ) | Sheet network; filtration scaffold |
| V | COL5A1/A2 | Cornea, interstitial tissues | Regulates type I fibril diameter (template function) |
| VII | COL7A1 | Skin dermal-epidermal junction | Anchoring fibrils for epidermis to dermis |
| XVII | COL17A1 | Skin hemidesmosomes | Transmembrane; BPAG2; epidermal adhesion |
Mechanism — Biosynthesis Pathway
RIBOSOME: Pre-pro-α-chain synthesis; signal peptide → ER
ER PROCESSING:
1. Signal peptide cleavage → pro-α-chain
2. Prolyl 4-hydroxylase (P4H):
Pro(Y) + O₂ + α-KG → 4-Hydroxyproline (Hyp) + succinate + CO₂ (Fe²⁺ at the active site; ascorbate keeps the iron reduced)
[~100 Hyp per α-chain; VITAMIN C required → SCURVY if deficient]
3. Prolyl 3-hydroxylase → 3-Hyp at Pro986 (1 per chain)
4. Lysyl hydroxylase (PLOD1/2/3):
Lys → Hydroxylysine (Hyl) → O-glycosylated (Gal-Hyl; Glc-Gal-Hyl)
5. C-propeptide disulfide bonding → nucleates α-chain registration
6. Triple helix propagates N-terminally (zipper mechanism)
GOLGI: Glycosylation; packaging into secretory vesicles
EXTRACELLULAR:
ADAMTS2/14 (N-proteinase) + BMP1/tolloid (C-proteinase)
→ cleave N- and C-propeptides → tropocollagen (300 nm × 1.5 nm)
→ self-assembles into D-staggered fibrils (67 nm D-period)
CROSS-LINKING (extracellular):
Lysyl oxidase (LOX; Cu²⁺-dependent):
Lys/Hyl → allysine → spontaneous condensation
→ pyridinoline + deoxypyridinoline cross-links (trivalent)
→ fibril tensile strength and resistance to proteolysis
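The fibril geometry implied by the figures above (300 nm molecules, 67 nm D-period) can be worked out directly. This is a rough sketch under the classic staggered-array picture, in which each molecule is offset from its neighbour by one D-period. The inputs are the rounded values quoted in this section, so the outputs are approximate, and the function name is illustrative.

```python
MOLECULE_NM = 300.0  # tropocollagen length, as quoted in the text
D_PERIOD_NM = 67.0   # axial stagger (D-period), as quoted in the text


def stagger_geometry(length_nm: float, d_nm: float) -> tuple[int, float, float]:
    """Split one D-period into overlap and gap zones for a staggered array."""
    whole_periods = int(length_nm // d_nm)           # full D-periods spanned by one molecule
    overlap_nm = length_nm - whole_periods * d_nm    # leftover length: the overlap zone
    gap_nm = d_nm - overlap_nm                       # remainder of the period: the gap zone
    return whole_periods, overlap_nm, gap_nm


n, overlap, gap = stagger_geometry(MOLECULE_NM, D_PERIOD_NM)
print(f"{MOLECULE_NM / D_PERIOD_NM:.2f} D per molecule")   # 4.48 D per molecule
print(f"overlap ~{overlap:.0f} nm, gap ~{gap:.0f} nm")     # overlap ~32 nm, gap ~35 nm
```

Because the molecule length is not a whole multiple of D, every period contains a denser overlap zone and a sparser gap zone, which is what produces the banding seen in fibrils by electron microscopy.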
Collagen degradation
Triple helices resist most proteases. Specific cleavage requires dedicated enzymes:
Collagenases (MMP-1, MMP-8, MMP-13): cleave the triple helix at a single Gly-Ile/Leu site ~75% of the way from the N-terminus, yielding ¾ and ¼ fragments that denature at 37°C.
Gelatinases (MMP-2, MMP-9): degrade the denatured collagen fragments.
MT-MMPs (MMP-14): pericellular collagen degradation during invasion.
Cathepsins (B, K, L): lysosomal degradation of internalised collagen; cathepsin K is also secreted by osteoclasts into the resorption lacuna, where it degrades bone collagen.
MMP activity is restrained by TIMPs. A balance tipped toward proteolysis drives ECM degradation (joint destruction, tumour invasion); a balance tipped toward synthesis and deposition drives fibrosis.
Physiological Roles
Bone: Type I collagen fibrils in bone are mineralized with hydroxyapatite, creating a composite material that resists both tension (collagen) and compression (mineral). The collagen scaffold determines bone quality independent of bone mineral density (BMD) — explaining why bone density measurements alone underpredict fracture risk.
Vascular wall: Types I and III collagen form the arterial media and adventitia, determining arterial stiffness and providing the mechanical foundation upon which smooth muscle cells generate tone. Degradation by MMPs in atherosclerotic plaques reduces tensile strength, predisposing to plaque rupture and acute coronary events.
Glomerular basement membrane: Type IV collagen (the α3α4α5 network encoded by COL4A3-A5) forms a covalently crosslinked network in the GBM that contributes to the charge and size selectivity of glomerular filtration. Unlike fibrillar collagens, whose cross-links derive from lysyl oxidase, type IV NC1 domains are joined by sulfilimine (S=N) bonds formed by peroxidasin, a cross-link not known in other collagen networks.
Cardiac ECM: Types I and III collagen form the myocardial collagen network (perimysium, endomysium) that couples cardiomyocyte shortening to chamber ejection and prevents over-distension. Excess type I deposition (fibrosis) in heart failure increases myocardial stiffness → diastolic dysfunction → HFpEF phenotype.
Pathology
| Disease | Gene/mechanism | Clinical features |
|---|---|---|
| Scurvy | Vitamin C deficiency → P4H failure → unstable triple helix → non-secreted procollagen degraded intracellularly | Perifollicular haemorrhages, bleeding gums, corkscrew hairs, wound dehiscence, costochondral "rosary"; reverses rapidly with vitamin C |
| Osteogenesis imperfecta (OI) | COL1A1/A2 Gly substitution (any Gly→X in a Gly-X-Y triplet) → dominant-negative disruption of the triple helix; a single mutant chain compromises the whole trimer | Brittle bones (fracture at minimal trauma), blue sclerae, dentinogenesis imperfecta, hearing loss; classically Sillence types I–IV by severity, with further numbered types added later for recessive and rarer genes; type I = mildest (heterozygous nonsense → haploinsufficiency); type II = lethal perinatal (Gly→large residue) |
| Ehlers-Danlos syndrome (EDS) | Classical: COL5A1/A2 → ↓type V → dysregulated fibril diameter; Vascular (EDS type IV): COL3A1 → type III deficiency; Kyphoscoliotic: PLOD1 (lysyl hydroxylase deficiency) | Skin hyperextensibility, joint hypermobility; Vascular EDS: arterial/bowel/uterine rupture risk — life-threatening; kyphoscoliotic: progressive scoliosis, ocular fragility |
| Alport syndrome | COL4A3/A4 (AR) or COL4A5 (X-linked) mutations → abnormal type IV collagen network in GBM → structural failure → GBM thinning then splitting | Haematuria (earliest sign, childhood), proteinuria, progressive CKD (ESRD by age 20–30 in X-linked males), sensorineural deafness (30–70%), anterior lenticonus (25%); ACE inhibitors delay ESRD; sparsentan, approved for IgA nephropathy (full FDA approval 2024), remains investigational in Alport syndrome |
| Pulmonary fibrosis (IPF) | Repetitive alveolar epithelial injury → dysregulated repair → TGF-β → myofibroblast activation → excess type I/III collagen deposited in interstitium → "honeycombing" | Progressive dyspnoea, ↓DLCO, usual interstitial pneumonia (UIP) pattern on CT; median survival 3 years from diagnosis; antifibrotics pirfenidone and nintedanib slow progression |
| Liver fibrosis / cirrhosis | Hepatic stellate cell (HSC) activation (alcohol, viral hepatitis, NAFLD, cholestasis) → myofibroblasts → excess type I/III collagen → bridging fibrosis → portal hypertension | Portal hypertension, ascites, varices, hepatic encephalopathy, ↑HCC risk; METAVIR F0-F4 staging; liver stiffness by FibroScan correlates with collagen content |
| Keloid | Unregulated type I/III collagen production from dermal fibroblasts post-injury, beyond wound margins (vs. hypertrophic scar which stays within margins) | Raised, firm, itchy/painful scar growing beyond original wound; predilection for chest, ear lobes, jaw; treatments: intralesional steroids, silicone sheets, laser, radiotherapy |
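The dominant-negative arithmetic behind the OI row above can be made concrete. The sketch below assumes that type I collagen is an [α1(I)]₂α2(I) heterotrimer (a standard fact not stated in this article), that a heterozygous missense allele supplies half of the chains of its kind, and that chains are incorporated at random. The function name is illustrative.

```python
def fraction_abnormal_trimers(mutant_fraction: float, chains_per_trimer: int) -> float:
    """Probability that a trimer contains at least one mutant chain.

    Assumes each chain of the affected kind is drawn independently,
    with probability `mutant_fraction` of being mutant.
    """
    return 1.0 - (1.0 - mutant_fraction) ** chains_per_trimer


# Heterozygous COL1A1 missense: two α1(I) chains per trimer, half of them mutant.
col1a1 = fraction_abnormal_trimers(0.5, 2)

# Heterozygous COL1A2 missense: one α2(I) chain per trimer.
col1a2 = fraction_abnormal_trimers(0.5, 1)

print(col1a1, col1a2)  # 0.75 0.5
```

Under these assumptions a heterozygous COL1A1 glycine substitution leaves only a quarter of trimers fully normal, whereas a null allele mainly reduces the amount of otherwise normal collagen, which is consistent with the milder haploinsufficiency phenotype listed for OI type I.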
The scurvy connection: Prolyl 4-hydroxylase uses Fe²⁺ at its active site, O₂ and α-ketoglutarate as co-substrates, and ascorbate (vitamin C) as a reducing cofactor. Ascorbate is not consumed in each productive hydroxylation; it is needed to return the iron to Fe²⁺ after uncoupled reaction cycles oxidise it to Fe³⁺, so the enzyme progressively inactivates when vitamin C is deficient. Without hydroxylation, collagen triple helices are not stable at 37°C, so the underhydroxylated chains are retained in the ER and degraded. Supplementation begins to reverse scurvy within days as new collagen is synthesised and secreted.
Pharmacology / Clinical Relevance
Antifibrotic therapies: Pirfenidone (pleiotropic; attenuates TGF-β signalling, ↓myofibroblast proliferation, ↓collagen synthesis — IPF); nintedanib (triple kinase inhibitor: PDGFR, VEGFR, FGFR — IPF, systemic sclerosis ILD, progressive pulmonary fibrosis); losartan/ACE inhibitors (↓TGF-β via RAS blockade — used in Alport syndrome, some hepatic fibrosis regimens).
Collagen as a therapeutic target in cardiovascular disease: Post-MI cardiac fibrosis is driven by type I/III collagen deposition by activated cardiac fibroblasts. Mineralocorticoid receptor antagonists (spironolactone, eplerenone) have lowered circulating markers of collagen turnover in heart failure trials and have been associated with improved diastolic function. Serum PICP (procollagen I C-terminal propeptide) and PIIINP (N-terminal propeptide of type III procollagen) are biomarkers of active fibrosis turnover used in research.
Collagen-based biomaterials: Purified type I collagen and gelatin (hydrolysed collagen) are used in wound dressings, haemostatic agents (Gelfoam, Surgiflo), tissue-engineering scaffolds, and injectables for soft-tissue augmentation (though largely replaced by hyaluronic acid fillers in cosmetic medicine).