Atlas One · Human · Molecular

Collagen

The most abundant protein in the human body (~30% of total protein mass): a family of more than 28 types, unified by the Gly-X-Y triple helix, that scaffolds bone, skin, cartilage, tendons, blood vessel walls, and every basement membrane.

The triple helix, proposed by Ramachandran and Kartha in 1955, gives type I collagen fibrils a tensile strength reported at up to roughly 1 GPa, comparable on a weight basis to some steel alloys. Mutations in collagen genes or their biosynthetic enzymes cause osteogenesis imperfecta, Ehlers-Danlos syndrome, and Alport syndrome; vitamin C deficiency, which blocks collagen hydroxylation, causes scurvy; dysregulated deposition drives fibrosis of liver, lung, and kidney.

>28 types · Collagen family
~30% · Total body protein
44+ genes · COL gene family
300 nm · Molecule length (type I tropocollagen)

Class: Structural protein  ·  Extracellular matrix scaffold  |  Key genes: COL1A1/A2, COL3A1, COL4A1-A6, COL5A1/A2  |  Produced by: Fibroblasts, osteoblasts, chondrocytes, myofibroblasts, endothelial cells

Collagen is the most abundant protein in the human body, constituting ~30% of total protein mass and providing the structural scaffold for virtually every tissue. Its defining feature is the triple helix: three α-chains coil around each other, each with glycine at every third residue, and the helix is stabilised by 4-hydroxyproline formed by the vitamin C-dependent enzyme prolyl 4-hydroxylase. More than 28 distinct types serve specific mechanical and signalling roles: type I dominates bone, skin, and tendon; type IV scaffolds all basement membranes; type VII anchors dermal-epidermal junctions. The same architecture that creates extraordinary tensile strength in healthy tissue becomes pathological excess in fibrosis, where TGF-β-driven myofibroblast activation deposits type I/III collagen in liver (cirrhosis), lung (idiopathic pulmonary fibrosis, IPF), and kidney (chronic kidney disease, CKD).

type I collagen · type IV collagen · procollagen · tropocollagen · collagen fibril

Overview

The collagen superfamily contains more than 28 distinct types, encoded by at least 44 genes, ranging from large fibril-forming collagens (types I–III, V, XI) that provide tensile strength in bone, skin, and tendon, to the sheet-forming type IV that scaffolds all basement membranes, to transmembrane collagens (type XVII) that anchor cells. About 90% of body collagen is type I.

Collagen pathology is correspondingly broad: scurvy (vitamin C deficiency → prolyl hydroxylase failure → unstable helix), osteogenesis imperfecta (COL1A1/A2 Gly substitution; dominant-negative defect), Ehlers-Danlos syndrome (multiple types; COL5A1/A2, COL3A1, PLOD1), Alport syndrome (COL4A3-A5 mutations → glomerular basement membrane (GBM) failure), and organ fibrosis (TGF-β → myofibroblast activation → excess type I/III deposition).

Structure

Triple-helix architecture

The fundamental structural unit of all collagens is the triple helix: three α-chains, each in a left-handed polyproline II (PPII) helix conformation, coil around each other into a right-handed superhelix (~300 nm long, 1.5 nm diameter for fibrillar collagens). The Gly-X-Y repeat is the structural imperative: every third residue must be glycine, the only residue small enough to occupy the sterically crowded central axis of the triple helix. X is frequently proline; Y is frequently 4-hydroxyproline (Hyp), which stabilises the helix through stereoelectronic pre-organisation of the chain and through water-mediated hydrogen bonding between chains. Loss of Hyp (scurvy) destabilises the triple helix → connective tissue fragility.
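
The Gly-X-Y rule lends itself to a simple sequence check. The following is a minimal sketch, not a validated bioinformatics tool: the function name `glycine_violations`, the `frame` parameter, and the use of the letter `O` for hydroxyproline are illustrative assumptions, and the test peptide is an idealised (Gly-Pro-Hyp) repeat rather than a real collagen sequence.

```python
def glycine_violations(sequence: str, frame: int = 0) -> list[int]:
    """Return 0-based positions where a Gly-X-Y repeat expects Gly but finds another residue.

    `sequence` is a one-letter amino-acid string; `frame` is the index of the
    first expected glycine. 'O' stands for hydroxyproline (an assumption of
    this sketch; it is a common but not universal shorthand).
    """
    seq = sequence.upper()
    return [i for i in range(frame, len(seq), 3) if seq[i] != "G"]


# Idealised model peptide (Gly-Pro-Hyp)4.
wild_type = "GPO" * 4
# Same peptide with the third glycine (index 6) replaced by serine.
mutant = wild_type[:6] + "S" + wild_type[7:]

print(glycine_violations(wild_type))  # []
print(glycine_violations(mutant))     # [6]
```

Any position reported here corresponds to a residue that would have to sit on the crowded central axis without being glycine, which is the kind of substitution described for osteogenesis imperfecta.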

Type | Genes | Distribution | Primary role
I | COL1A1/A2 | Bone, skin, tendon, cornea, dentin | Tensile strength; ~90% of body collagen
II | COL2A1 | Cartilage, vitreous | Compressive force resistance
III | COL3A1 | Fetal skin, blood vessels, GI tract | Often co-deposited with type I; vascular compliance
IV | COL4A1–A6 | All basement membranes (GBM, tubular BM, skin BMZ) | Sheet network; filtration scaffold
V | COL5A1/A2 | Cornea, interstitial tissues | Regulates type I fibril diameter (template function)
VII | COL7A1 | Skin dermal-epidermal junction | Anchoring fibrils for epidermis to dermis
XVII | COL17A1 | Skin hemidesmosomes | Transmembrane; BPAG2; epidermal adhesion
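
The gene symbols in the table follow a regular pattern: COL, then the collagen type as an Arabic number, then A and the α-chain number (so COL4A5 encodes the α5 chain of type IV). A short sketch of that mapping is shown below; the helper names `to_roman` and `parse_collagen_gene` are hypothetical, and the parser deliberately rejects any symbol that does not match the simple `COL<type>A<chain>` form.

```python
import re


def to_roman(n: int) -> str:
    """Convert a small positive integer (1-39) to a Roman numeral."""
    numerals = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def parse_collagen_gene(symbol: str) -> tuple[str, int]:
    """Split a symbol such as 'COL4A5' into (collagen type, alpha-chain number)."""
    match = re.fullmatch(r"COL(\d+)A(\d+)", symbol.upper())
    if match is None:
        raise ValueError(f"not a COL<type>A<chain> symbol: {symbol!r}")
    return to_roman(int(match.group(1))), int(match.group(2))


print(parse_collagen_gene("COL1A2"))   # ('I', 2)
print(parse_collagen_gene("COL4A5"))   # ('IV', 5)
print(parse_collagen_gene("COL17A1"))  # ('XVII', 1)
```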

Mechanism — Biosynthesis Pathway

  RIBOSOME: Pre-pro-α-chain synthesis; signal peptide → ER

  ER PROCESSING:
    1. Signal peptide cleavage → pro-α-chain
    2. Prolyl 4-hydroxylase (P4H):
       Pro(Y) + O₂ + α-KG → 4-Hydroxyproline (Hyp) + succinate + CO₂
       [cofactors: Fe²⁺, ascorbate; ~100 Hyp per α-chain; VITAMIN C deficiency → SCURVY]
    3. Prolyl 3-hydroxylase → 3-Hyp at Pro986 (1 per chain)
    4. Lysyl hydroxylase (PLOD1/2/3):
       Lys → Hydroxylysine (Hyl) → O-glycosylated (Gal-Hyl; Glc-Gal-Hyl)
    5. C-propeptide disulfide bonding → nucleates α-chain registration
    6. Triple helix propagates N-terminally (zipper mechanism)

  GOLGI: Glycosylation; packaging into secretory vesicles

  EXTRACELLULAR:
    ADAMTS2/14 (N-proteinase) + BMP1/tolloid (C-proteinase)
    → cleave N- and C-propeptides → tropocollagen (300 nm × 1.5 nm)
    → self-assembles into D-staggered fibrils (67 nm D-period)

  CROSS-LINKING (extracellular):
    Lysyl oxidase (LOX; Cu²⁺-dependent):
    Lys/Hyl → allysine → spontaneous condensation
    → pyridinoline + deoxypyridinoline cross-links (trivalent)
    → fibril tensile strength and resistance to proteolysis
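
The 300 nm molecule length and the 67 nm D-period quoted above fix the geometry of the staggered fibril. The sketch below works through the arithmetic, assuming the standard staggered-array picture (often called the Hodge-Petruska model, which is not named in the pathway above): neighbouring molecules are shifted by one D, and because the molecule is not a whole number of D-periods long, each period splits into an overlap zone and a gap zone. The function name and constants are illustrative.

```python
MOLECULE_LENGTH_NM = 300.0  # tropocollagen length, from the pathway above
D_PERIOD_NM = 67.0          # axial stagger between neighbouring molecules


def stagger_geometry(length_nm: float, d_nm: float) -> tuple[float, float, float]:
    """Return (length in D units, overlap zone in nm, gap zone in nm)."""
    ratio = length_nm / d_nm
    # The part of the molecule left over after whole D-periods forms the overlap zone.
    overlap_nm = length_nm % d_nm
    # The remainder of each D-period is the gap between molecule ends.
    gap_nm = d_nm - overlap_nm
    return ratio, overlap_nm, gap_nm


ratio, overlap, gap = stagger_geometry(MOLECULE_LENGTH_NM, D_PERIOD_NM)
print(f"molecule length = {ratio:.2f} D")   # molecule length = 4.48 D
print(f"overlap zone    = {overlap:.0f} nm")  # overlap zone    = 32 nm
print(f"gap zone        = {gap:.0f} nm")      # gap zone        = 35 nm
```

With these rounded inputs the overlap and gap come out at about 0.48 D and 0.52 D; published values differ slightly because they use more precise molecule lengths.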

Collagen degradation

Triple helices resist most proteases. Specific cleavage requires dedicated enzymes:

  • Collagenases (MMP-1, MMP-8, MMP-13): cleave the triple helix at a specific Gly-Ile/Leu site ~75% of the way from the N-terminus, giving ¾ and ¼ fragments that denature at 37°C.
  • Gelatinases (MMP-2, MMP-9): degrade the denatured collagen fragments.
  • MT-MMPs (MMP-14): pericellular collagen degradation during invasion.
  • Cathepsins (B, K, L): lysosomal cysteine proteases; cathepsin K is also secreted by osteoclasts into the resorption lacuna, where it degrades bone collagen.

MMP activity is restrained by TIMPs (tissue inhibitors of metalloproteinases). Imbalance toward proteolysis drives ECM degradation (joint destruction, tumour invasion); imbalance toward synthesis drives fibrosis.
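
As a worked example of the collagenase cut described above, the sketch below converts the ~75% cleavage position into fragment lengths for a 300 nm molecule. The function name and default values are illustrative assumptions; the 0.75 fraction is the approximate figure quoted in the text, not an exact residue position.

```python
def collagenase_fragments(length_nm: float = 300.0,
                          cut_fraction: float = 0.75) -> tuple[float, float]:
    """Return (N-terminal fragment, C-terminal fragment) lengths in nm."""
    if not 0.0 < cut_fraction < 1.0:
        raise ValueError("cut_fraction must lie strictly between 0 and 1")
    n_fragment = length_nm * cut_fraction
    c_fragment = length_nm - n_fragment
    return n_fragment, c_fragment


print(collagenase_fragments())  # (225.0, 75.0)
```

The two pieces, roughly 225 nm and 75 nm, are the ¾ and ¼ fragments that unfold at body temperature and become substrates for the gelatinases.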

Physiological Roles

Bone: Type I collagen fibrils in bone are mineralized with hydroxyapatite, creating a composite material that resists both tension (collagen) and compression (mineral). The collagen scaffold contributes to bone quality independently of bone mineral density (BMD), which helps explain why bone density measurements alone underpredict fracture risk.

Vascular wall: Types I and III collagen form the arterial media and adventitia, determining arterial stiffness and providing the mechanical foundation upon which smooth muscle cells generate tone. Degradation by MMPs in atherosclerotic plaques reduces tensile strength, predisposing to plaque rupture and acute coronary events.

Glomerular basement membrane: Type IV collagen (COL4A3-A5 isoforms) forms a covalently crosslinked network in the GBM contributing to charge and size selectivity of glomerular filtration. Whereas fibrillar collagens are cross-linked through lysyl oxidase-derived bonds, the NC1 domains of the type IV network are joined by a sulfilimine bond formed by peroxidasin, a cross-link type not known from other collagens.

Cardiac ECM: Types I and III collagen form the myocardial collagen network (perimysium, endomysium) that couples cardiomyocyte shortening to chamber ejection and prevents over-distension. Excess type I deposition (fibrosis) in heart failure increases myocardial stiffness → diastolic dysfunction → HFpEF phenotype.

Pathology

Disease | Gene/mechanism | Clinical features
Scurvy | Vitamin C deficiency → P4H failure → unstable triple helix → non-secreted procollagen degraded intracellularly | Perifollicular haemorrhages, bleeding gums, corkscrew hairs, wound dehiscence, costochondral "rosary"; reverses rapidly with vitamin C
Osteogenesis imperfecta (OI) | COL1A1/A2 Gly substitution (any Gly→X in a Gly-X-Y triplet) → dominant-negative disruption of the triple helix; the whole trimer is compromised even if only 1 chain is mutant | Brittle bones (fracture at minimal trauma), blue sclerae, dentinogenesis imperfecta, hearing loss; classical Sillence types I–IV by severity, with further genetically defined types; type I = mildest (heterozygous nonsense → haploinsufficiency); type II = lethal perinatal (Gly→large residue)
Ehlers-Danlos syndrome (EDS) | Classical: COL5A1/A2 → ↓type V → dysregulated fibril diameter; Vascular (EDS type IV): COL3A1 → type III deficiency; Kyphoscoliotic: PLOD1 (lysyl hydroxylase deficiency) | Skin hyperextensibility, joint hypermobility; vascular EDS: life-threatening risk of arterial/bowel/uterine rupture; kyphoscoliotic: progressive scoliosis, ocular fragility
Alport syndrome | COL4A3/A4 (AR) or COL4A5 (X-linked) mutations → abnormal type IV collagen network in GBM → structural failure → GBM thinning then splitting | Haematuria (earliest sign, childhood), proteinuria, progressive CKD (ESRD by age 20–30 in X-linked males), sensorineural deafness (30–70%), anterior lenticonus (25%); ACE inhibitors delay ESRD; sparsentan's regulatory approval is for IgA nephropathy, not Alport syndrome
Pulmonary fibrosis (IPF) | Repetitive alveolar epithelial injury → dysregulated repair → TGF-β → myofibroblast activation → excess type I/III collagen deposited in interstitium → "honeycombing" | Progressive dyspnoea, ↓DLCO, usual interstitial pneumonia (UIP) pattern on CT; median survival about 3–5 years from diagnosis; antifibrotics pirfenidone and nintedanib slow progression
Liver fibrosis / cirrhosis | Hepatic stellate cell (HSC) activation (alcohol, viral hepatitis, NAFLD, cholestasis) → myofibroblasts → excess type I/III collagen → bridging fibrosis → portal hypertension | Portal hypertension, ascites, varices, hepatic encephalopathy, ↑HCC risk; METAVIR F0-F4 staging; liver stiffness by FibroScan correlates with collagen content
Keloid | Unregulated type I/III collagen production from dermal fibroblasts post-injury, beyond wound margins (vs. hypertrophic scar, which stays within margins) | Raised, firm, itchy/painful scar growing beyond original wound; predilection for chest, ear lobes, jaw; treatments: intralesional steroids, silicone sheets, laser, radiotherapy
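
The dominant-negative effect in osteogenesis imperfecta can be made concrete with a small probability calculation. The sketch below assumes that type I collagen is a heterotrimer of two α1(I) chains (COL1A1) and one α2(I) chain (COL1A2), that both alleles are expressed equally, and that chains assemble at random; the function name is hypothetical and the model ignores any selective degradation of mutant chains.

```python
from fractions import Fraction


def abnormal_trimer_fraction(mutant_chain_fraction: Fraction,
                             chains_per_trimer: int) -> Fraction:
    """Fraction of trimers that contain at least one mutant chain.

    `chains_per_trimer` counts only the chains encoded by the mutated gene:
    2 for COL1A1 (two alpha-1 chains per molecule), 1 for COL1A2.
    """
    all_normal = (1 - mutant_chain_fraction) ** chains_per_trimer
    return 1 - all_normal


half = Fraction(1, 2)  # heterozygous: half of the chains from that gene are mutant

print(abnormal_trimer_fraction(half, 2))  # 3/4  (heterozygous COL1A1 Gly substitution)
print(abnormal_trimer_fraction(half, 1))  # 1/2  (heterozygous COL1A2 Gly substitution)
```

Under these assumptions a heterozygous COL1A1 glycine substitution spoils three quarters of all type I molecules, whereas a null allele (haploinsufficiency) simply halves the amount of normal collagen. This is one way to see why structural mutations tend to be more severe than nonsense mutations.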

The scurvy connection: Prolyl 4-hydroxylase requires Fe²⁺, O₂, α-ketoglutarate, and ascorbate (vitamin C). α-Ketoglutarate is the co-substrate consumed stoichiometrically in each hydroxylation (decarboxylated to succinate and CO₂). Ascorbate is not consumed in every cycle; it reduces the active-site iron back to Fe²⁺ when an occasional uncoupled cycle oxidises it to Fe³⁺, so the enzyme progressively loses activity when vitamin C is deficient. Under-hydroxylated chains cannot form a triple helix that is stable at 37°C, so the procollagen is retained in the ER and degraded. Supplementation begins to reverse scurvy within days as new collagen is synthesised and secreted.

Pharmacology / Clinical Relevance

Antifibrotic therapies: Pirfenidone (pleiotropic; attenuates TGF-β signalling, ↓myofibroblast proliferation, ↓collagen synthesis; used in IPF); nintedanib (triple kinase inhibitor: PDGFR, VEGFR, FGFR; used in IPF, systemic sclerosis ILD, progressive pulmonary fibrosis); losartan/ACE inhibitors (↓TGF-β via RAS blockade; used in Alport syndrome and explored in hepatic fibrosis).

Collagen as therapeutic target in cardiovascular disease: Post-MI cardiac fibrosis is driven by type I/III collagen deposition by activated cardiac fibroblasts. Mineralocorticoid receptor antagonists (spironolactone, eplerenone) have reduced markers of myocardial collagen turnover and, in some heart failure trials, improved diastolic function. Serum PICP (C-terminal propeptide of type I procollagen) and PIIINP (N-terminal propeptide of type III procollagen) are biomarkers of active fibrosis turnover used in research.

Collagen-based biomaterials: Purified type I collagen and gelatin (hydrolysed collagen) are used in wound dressings, haemostatic agents (Gelfoam, Surgiflo), tissue-engineering scaffolds, and injectables for soft-tissue augmentation (though largely replaced by hyaluronic acid fillers in cosmetic medicine).
