Hormonal Regulation of Breast Cancer Incidence Dynamics: A Mathematical Analysis Explaining the Clemmesen's Hook

Mirzaei, Yang
Clemmesen's hook refers to a commonly observed slowdown and rebound in breast cancer incidence around the age at menopause. It suggests a shift in the underlying carcinogenic dynamics, but the mechanistic basis remains poorly understood. Building on our previously developed Extended Multistage Clonal Expansion Tumor (MSCE-T) model, we perform a theoretical analysis to determine the conditions under which Clemmesen's hook would occur. Our results show that Clemmesen's hook can be quantitatively explained by time-specific changes in the proliferative and apoptotic balance of early-stage mutated cell populations, corresponding to the decline in progesterone levels and progesterone-driven proliferation due to reduced menstrual cycles preceding menopause, and changing dominant carcinogenic impact from alternative growth pathways post-menopause (e.g., adipose-derived growth signals). In contrast, variation in last-stage clonal dynamics cannot effectively reproduce the observed non-monotonic incidence pattern. Analytical results further demonstrate that midlife incidence dynamics corresponding to the hook are governed primarily by intrinsic proliferative processes rather than detection effects. Overall, this study provides a mechanistic and mathematical explanation for Clemmesen's hook and establishes a quantitative framework linking hormonal transitions during menopause to age-specific breast cancer incidence curve.

Basic Information

  • Paper ID: 2511.19964
  • Title: Hormonal Regulation of Breast Cancer Incidence Dynamics: A Mathematical Analysis Explaining the Clemmesen's Hook
  • Authors: Navid Mohammad Mirzaei, Wan Yang (Columbia University)
  • Classification: q-bio.PE (Quantitative Biology - Populations and Evolution)
  • Publication Date: November 26, 2025
  • Paper Link: https://arxiv.org/abs/2511.19964v1

Abstract

This study presents a mathematical modeling analysis of the "Clemmesen's hook" phenomenon in breast cancer incidence: a characteristic deceleration and rebound in incidence rates observed at ages 45-55, around menopause. Based on an extended multistage clonal expansion tumor model (MSCE-T), the research combines theoretical analysis and numerical experiments to show that this phenomenon can be quantitatively explained by time-specific changes in the proliferation-apoptosis balance of early-stage mutant cell populations. These changes correspond to the decline in progesterone levels around menopause and the subsequent transition to alternative growth pathways (such as adipose tissue-derived growth signals) in the postmenopausal period. The study further shows that changes in late-stage clonal dynamics and in detection effects cannot effectively reproduce this non-monotonic incidence pattern.

Research Background and Motivation

1. Core Problem

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer death among women globally. Its age-specific incidence curve exhibits a distinctive "Clemmesen's hook" during the menopausal transition (approximately ages 45-55): the growth in incidence first decelerates and then rebounds. This pattern appears across different populations and cohorts, suggesting that the factors driving incidence differ before and after menopause.

2. Problem Significance

Understanding the biological mechanism of Clemmesen's hook is crucial because:

  • It reveals how hormonal changes around menopause affect breast cancer risk
  • It can guide breast cancer prevention and intervention strategies
  • It helps clarify the causal role of hormones in cancer development

3. Limitations of Existing Approaches

  • Biological Evidence: Recent research points to progesterone (rather than estrogen) as the primary carcinogenic factor, but a quantitative explanation is lacking
  • Traditional MSCE Models: Cannot capture non-monotonic trends without introducing time-dependent parameters or detection dynamics
  • Unclear Mechanisms: The mechanistic basis of Clemmesen's hook remains unclear, lacking a mathematical framework connecting hormonal transitions to incidence curves

4. Research Motivation

Motivated by recent epidemiological evidence that the number of ovulatory cycles, rather than cumulative estrogen exposure, is the primary determinant of risk, the authors hypothesize that changes in progesterone levels drive Clemmesen's hook and test this hypothesis through mathematical modeling.

Core Contributions

  1. Theoretical Analysis Framework: Rigorous mathematical analysis of the MSCE-T model, deriving sufficient conditions for generating Clemmesen's hook
  2. Mechanism Identification: Demonstrates that time-specific changes in the proliferation/apoptosis ratio (α₁/β₁) of early-stage mutant cells (first stage) can explain the phenomenon, while late-stage dynamics and detection effects cannot produce similar effects
  3. Parameter Insensitivity Theorem: Through Volterra integral representation and Gâteaux derivative analysis, proves that changes in tumor detection threshold (Mt) and malignant cell proliferation rate (α₃) have weak effects on mid-age incidence
  4. Biological Verification: Connects mathematical results to progesterone-driven proliferation mechanisms, providing quantitative evidence supporting the progesterone hypothesis
  5. Numerical Verification: Uses three birth cohorts from the SEER database to validate the model, confirming that modest changes in α₁ and β₁ can reproduce Clemmesen's hook

Methodology Details

Task Definition

Determine under what parameter conditions the MSCE-T model can produce non-monotonic age-incidence curves similar to Clemmesen's hook:

  • Input: Model parameters (proliferation rates αᵢ, apoptosis rates βᵢ, mutation rates μᵢ) and their time dependencies
  • Output: Cancer risk function h(t) and its derivative h'(t)
  • Constraints: Parameters must be within biologically reasonable ranges

Model Architecture

1. MSCE-T Model Foundation

The model considers a three-stage breast cancer development process governed by rate-limiting driver gene mutations and is described by the following ODE system:

ẋ₁ = μ₀N₀x₁(x₃ - 1)
ẋ₂ = -μ₀N₀x₄
ẋ₃ = β₁ - (α₁ + β₁ + μ₁)x₃ + μ₁x₃x₅ + α₁x₃²
ẋ₄ = -(α₁ + β₁ + μ₁)x₄ + μ₁x₄x₅ + μ₁x₃x₆ + 2α₁x₃x₄
ẋ₅ = β₂ - (α₂ + β₂ + μ₂)x₅ + μ₂f(t)x₅ + α₂x₅²
ẋ₆ = -(α₂ + β₂ + μ₂)x₆ + μ₂f'(t)x₅ + μ₂f(t)x₆ + 2α₂x₆x₅

Where:

  • x₁, x₃, x₅ are survival probabilities at each stage
  • h(t) = x₂(t) is the cancer risk function
  • x₄ = ẋ₃, x₆ = ẋ₅
  • f(t) encodes detection effects (see the detection function in the next subsection)

2. Detection Function

f(t) = 1 - (1 - e^(-α₃t))^(Mt-1)

Where α₃ is the net proliferation rate of tumor cells and Mt is the number of malignant cells required for detection at time t.
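The detection function can be evaluated directly, but the naive formula loses precision when Mt is large and e^(-α₃t) is tiny. The following is a minimal sketch, assuming constant α₃ and a constant threshold M (the summary allows Mt to vary with time, which this sketch does not handle). The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def detection_f(t, alpha3, M):
    """f(t) = 1 - (1 - exp(-alpha3*t))**(M - 1) for constant alpha3 > 0 and M > 2."""
    e = math.exp(-alpha3 * t) if t > 0.0 else 1.0
    if e >= 1.0:
        return 1.0  # at t = 0 the bracket (1 - e) is zero, so f = 1
    # (M - 1) * log(1 - e), using log1p to keep precision when e is tiny
    log_power = (M - 1.0) * math.log1p(-e)
    return -math.expm1(log_power)  # 1 - exp(log_power) without cancellation

def detection_f_prime(t, alpha3, M):
    """f'(t) = -(M - 1) * alpha3 * e^(-alpha3 t) * (1 - e^(-alpha3 t))**(M - 2)."""
    e = math.exp(-alpha3 * t) if t > 0.0 else 1.0
    if e >= 1.0:
        return 0.0
    return -(M - 1.0) * alpha3 * e * math.exp((M - 2.0) * math.log1p(-e))

if __name__ == "__main__":
    alpha3, M = 1.0, 1e9  # hypothetical values, not taken from the paper
    for age in (10, 20, 30, 45, 55):
        print(age, detection_f(age, alpha3, M), detection_f_prime(age, alpha3, M))
```

With these illustrative values, f drops from 1 toward 0 around t ≈ ln(M)/α₃ and both f and f' are negligible by mid-age, which is the regime the later insensitivity results rely on.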

3. Key Parameters

Define key composite parameters:

c(t) = α₁(2x₃(t) - 1) - β₁ + μ₁(x₅(t) - 1)
q(t) = -(α₂ + β₂ + μ₂) + μ₂f(t) + 2α₂x₅(t)
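To make the system concrete, the sketch below integrates the six equations with a fixed-step classical Runge-Kutta scheme and evaluates c(t) and q(t) along the trajectory. Several things here are assumptions rather than facts from the paper: all parameter values are hypothetical, parameters are held constant in time, the initial state is taken as x₁ = x₃ = x₅ = 1 and x₂ = x₄ = x₆ = 0 at age 0, and RK4 is used purely for convenience (the authors' own solver is not described in this summary).

```python
import math

# Hypothetical constants for illustration only (not fitted values from the paper).
PARAMS = {
    "mu0N0": 1e-3, "mu1": 1e-3, "mu2": 1e-3,
    "alpha1": 0.20, "beta1": 0.10,
    "alpha2": 0.25, "beta2": 0.10,
    "alpha3": 1.0, "M": 1e6,
}

def f_and_fprime(t, alpha3, M):
    """Detection function f(t) and its derivative for constant alpha3 and M."""
    e = math.exp(-alpha3 * t) if t > 0.0 else 1.0
    if e >= 1.0:
        return 1.0, 0.0
    log_base = math.log1p(-e)
    f = -math.expm1((M - 1.0) * log_base)
    fp = -(M - 1.0) * alpha3 * e * math.exp((M - 2.0) * log_base)
    return f, fp

def rhs(t, x, p):
    """Right-hand side of the six MSCE-T equations as written in the summary."""
    x1, x2, x3, x4, x5, x6 = x
    f, fp = f_and_fprime(t, p["alpha3"], p["M"])
    s1 = p["alpha1"] + p["beta1"] + p["mu1"]
    s2 = p["alpha2"] + p["beta2"] + p["mu2"]
    return [
        p["mu0N0"] * x1 * (x3 - 1.0),
        -p["mu0N0"] * x4,
        p["beta1"] - s1 * x3 + p["mu1"] * x3 * x5 + p["alpha1"] * x3 ** 2,
        -s1 * x4 + p["mu1"] * x4 * x5 + p["mu1"] * x3 * x6
        + 2.0 * p["alpha1"] * x3 * x4,
        p["beta2"] - s2 * x5 + p["mu2"] * f * x5 + p["alpha2"] * x5 ** 2,
        -s2 * x6 + p["mu2"] * fp * x5 + p["mu2"] * f * x6
        + 2.0 * p["alpha2"] * x6 * x5,
    ]

def integrate(p, t_end=60.0, dt=0.01):
    """Classical fixed-step RK4 from age 0 to t_end; returns (t, state)."""
    x = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]  # assumed initial state
    t = 0.0
    for i in range(int(round(t_end / dt))):
        k1 = rhs(t, x, p)
        k2 = rhs(t + dt / 2, [xi + dt / 2 * k for xi, k in zip(x, k1)], p)
        k3 = rhs(t + dt / 2, [xi + dt / 2 * k for xi, k in zip(x, k2)], p)
        k4 = rhs(t + dt, [xi + dt * k for xi, k in zip(x, k3)], p)
        x = [xi + dt / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t = (i + 1) * dt
    return t, x

def composite_rates(t, x, p):
    """c(t) and q(t) evaluated at the current state."""
    f, _ = f_and_fprime(t, p["alpha3"], p["M"])
    c = p["alpha1"] * (2.0 * x[2] - 1.0) - p["beta1"] + p["mu1"] * (x[4] - 1.0)
    q = -(p["alpha2"] + p["beta2"] + p["mu2"]) + p["mu2"] * f + 2.0 * p["alpha2"] * x[4]
    return c, q

if __name__ == "__main__":
    t, x = integrate(PARAMS, t_end=50.0)
    print("h(50) =", x[1], " c, q =", composite_rates(t, x, PARAMS))
```

Two built-in consistency checks follow from the definitions x₄ = ẋ₃ and x₆ = ẋ₅: at any time the integrated x₄ should match the third right-hand side, and x₂ should equal μ₀N₀(1 - x₃) under the assumed initial state.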

Technical Innovations

1. Volterra Integral Representation (Section 4)

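The equations for x₆ and x₄ are linear in x₆ and x₄ respectively, so they can be solved in turn with integrating factors (taking x₄(0) = x₆(0) = 0), which gives:

x₄(t) = ∫₀ᵗ K(t,r)f'(r)dr

Where the kernel function is:

K(t,r) = μ₁μ₂x₅(r)∫ᵣᵗ x₃(s)exp(∫ᵣˢ q(u)du + ∫ₛᵗ c(u)du)ds

Here q carries the detection signal from r to s through the x₆ equation, and c carries it from s to t through the x₄ equation. This expresses h'(t) = -μ₀N₀x₄(t) as a Volterra transform of f'(r), laying the foundation for subsequent sensitivity analysis.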

2. Detection Effect Insensitivity Theorem (Section 5)

Theorem 5.3 (Mt Scaling Insensitivity): For the mid-age window [ta, tb], if Mt is scaled to cMt (where c is a constant scale factor, distinct from the composite rate c(t)), then:

sup_{t∈[ta,tb]} |h_c(t) - h(t)| ≤ |c-1|C(ε₁ + ε₂ + 1/p_min)

Where ε₁, ε₂ ≪ 1 in mid-age, proving that detection threshold changes have weak effects.

Theorem 5.4 (α₃ Change Insensitivity):

sup_{t∈[ta,tb]} |h̃(t) - h(t)| ≤ C(ε₁ + ε₂)(||α̃₃ - α₃||∞ + ||α̃'₃ - α'₃||∞)

Even with order-of-magnitude changes in α₃, the impact remains weak when ε₁, ε₂ are small.

3. Decline and Rebound Conditions (Section 6)

Proposition 6.1 (Necessary Condition for Decline via μ₁): With α₁ and β₁ unchanged, achieving c(t) < 0 (corresponding to incidence deceleration) requires, by rearranging the definition of c(t) with x₅(t) < 1:

μ₁(t) > (α₁(2x₃(t) - 1) - β₁)/(1 - x₅(t))

However, this requires extremely large increases in μ₁, which is biologically unreasonable.

Proposition 6.5 (Achieving Decline via α₁): If α₁ is reduced to α₁ - Δα on [ta, tb], then:

cnew(t) - cbase(t) = -Δα(2x₃(t) - 1)

When x₃(t) > 1/2, modest reduction in α₁ can make c(t) < 0.

Proposition 6.7 (Achieving Rebound via β₁): If β₁ is reduced to β₁ - Δβ after tb, there exists δ > 0 such that:

cnew(t) > 0, ∀t ∈ (tb, tb + δ]

Key insight: The α₁/β₁ ratio needs to decrease during mid-age (corresponding to progesterone decline), then increase after menopause (corresponding to adipose-derived growth signals).
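The α₁ and β₁ shifts in Propositions 6.5 and 6.7 can be illustrated with a few lines of arithmetic. This is a minimal sketch under a frozen-state assumption: x₃ and x₅ are held at fixed values while α₁ or β₁ is changed, so it shows only the instantaneous effect on c(t), not the full trajectory. The function names and all numbers are hypothetical.

```python
def c_value(alpha1, beta1, mu1, x3, x5):
    """Composite rate c = alpha1*(2*x3 - 1) - beta1 + mu1*(x5 - 1)."""
    return alpha1 * (2.0 * x3 - 1.0) - beta1 + mu1 * (x5 - 1.0)

def alpha1_cut_for_decline(alpha1, beta1, mu1, x3, x5):
    """Threshold reduction in alpha1 at which c reaches 0 (x3, x5 frozen).

    Any larger reduction makes c negative. Requires x3 > 1/2, as in
    Proposition 6.5.
    """
    if x3 <= 0.5:
        raise ValueError("requires x3 > 1/2")
    c_base = c_value(alpha1, beta1, mu1, x3, x5)
    return max(0.0, c_base / (2.0 * x3 - 1.0))

def beta1_cut_for_rebound(alpha1, beta1, mu1, x3, x5):
    """Threshold reduction in beta1 at which a negative c returns to 0."""
    return max(0.0, -c_value(alpha1, beta1, mu1, x3, x5))

if __name__ == "__main__":
    # Hypothetical mid-age state and rates, for illustration only.
    a1, b1, mu1, x3, x5 = 0.20, 0.10, 1e-3, 0.95, 0.90
    d_alpha = 1.1 * alpha1_cut_for_decline(a1, b1, mu1, x3, x5)
    c_mid = c_value(a1 - d_alpha, b1, mu1, x3, x5)  # decline phase
    d_beta = 1.1 * beta1_cut_for_rebound(a1 - d_alpha, b1, mu1, x3, x5)
    c_post = c_value(a1 - d_alpha, b1 - d_beta, mu1, x3, x5)  # rebound phase
    print(d_alpha, c_mid, d_beta, c_post)
```

Because c is linear in α₁ and β₁ at a fixed state, the required changes scale with the baseline value of c, which is why modest changes can suffice when c is already small.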

4. Proof that c(t) Dominates q(t) (Section 7)

Theorem 7.5: When ∫_{W₊} |f'(r)|dr ≪ 1 (f changes little over the mid-age window W₊):

sup_{t∈W₊} |δh'_q(t)|/|δh'_c(t)| = o(1)

Proof strategy:

  • For perturbations δq, responses are limited by f' integral (Proposition 7.2)
  • For perturbations δc, there exists a "homogeneous x₄ path" not limited by f' (Lemma 7.3)
  • Therefore, c(t) changes dominate mid-age dynamics

Experimental Setup

Datasets

  1. NORDCAN Database: Breast cancer incidence data from Denmark, Sweden, Finland, and Norway (5-year age groups)
  2. SEER Database: U.S. breast cancer incidence data (1-year age intervals)
  3. Selected Cohorts: Birth cohorts 1935-1939, 1940-1944, 1945-1949 (screening rate <29%, reducing detection bias)

Evaluation Metrics

  • Goodness of fit between model risk function h(t) and observed incidence rates
  • Biological reasonableness of parameter change magnitudes
  • Ability to reproduce Clemmesen's hook morphological characteristics (decline and rebound)

Comparison Methods

  • Comparative effects of changing different parameters (μ₀, μ₁, μ₂, α₁, α₂, β₁, β₂)
  • Verification of which parameter changes can produce hooks within reasonable ranges

Implementation Details

  1. Three-Stage Fitting:
    • Stage I (0 to ta): Estimate all parameters
    • Stage II (ta to tb): Fix other parameters, vary only one parameter θᵢ
    • Stage III (tb to age 60): Fix other parameters, vary only θⱼ
  2. Parameter Search Range: [0, 2θᵢ] (allowing each parameter to decrease to 0 or increase up to 2-fold)
  3. Optimization Algorithm: MATLAB's hybrid genetic algorithm (global optimization)
  4. Window Selection [ta, tb]:
    • 1935-1939 cohort: [49, 54] years
    • 1940-1944 cohort: [46, 51] years
    • 1945-1949 cohort: [48, 54] years
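The three-stage fitting setup amounts to a piecewise-constant schedule for one parameter plus a box constraint on its candidate values. The sketch below shows one way to represent that. The class and function names are hypothetical, the numeric rates are placeholders, and treating the window as closed on both ends is an assumption. It does not reproduce the authors' MATLAB optimization.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PiecewiseParam:
    """One parameter with separate values before, inside, and after [ta, tb]."""
    base: float  # Stage I value, ages below ta
    mid: float   # Stage II value, ages in [ta, tb]
    post: float  # Stage III value, ages above tb
    ta: float
    tb: float

    def __call__(self, age: float) -> float:
        if age < self.ta:
            return self.base
        if age <= self.tb:
            return self.mid
        return self.post

def search_bounds(theta: float) -> tuple:
    """Search interval [0, 2*theta] for a parameter with baseline theta."""
    return (0.0, 2.0 * theta)

def clip_to_bounds(value: float, theta: float) -> float:
    """Project a candidate value onto the search interval."""
    lo, hi = search_bounds(theta)
    return min(max(value, lo), hi)

if __name__ == "__main__":
    # Window for the 1940-1944 cohort from the list above; rates are placeholders.
    alpha1 = PiecewiseParam(base=0.20, mid=0.15, post=0.20, ta=46.0, tb=51.0)
    print([alpha1(age) for age in (40, 46, 51, 55)])
```

A schedule object like this can be passed to an ODE right-hand side in place of a constant, so that only the stage-specific value is exposed to the optimizer at each fitting stage.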

Experimental Results

Main Results

1. Data Observations (Figure 1)

All five countries (Denmark, Sweden, Finland, Norway, USA) across all birth cohorts display pronounced Clemmesen's hooks:

  • Incidence rate slope suddenly decreases then increases within the 45-55 age window
  • Phenomenon is universal and consistent

2. Parameter Fitting Results (Figure 2)

Successfully reproduce Clemmesen's hooks for three cohorts through modest variations in α₁ and β₁:

Cohort       Window [ta, tb] (years)   Fit Quality
1935-1939    [49, 54]                  Good match of decline and rebound
1940-1944    [46, 51]                  Good match of decline and rebound
1945-1949    [48, 54]                  Good match of decline and rebound

Curve characteristics:

  • Orange section: Pre-hook growth period (baseline α₁ and β₁ values)
  • Blue section: Decline period (reduced α₁/β₁ ratio)
  • Green section: Rebound period (restored α₁/β₁ ratio)

3. Parameter Comparison Analysis

Numerical experiments confirm:

  • α₁ and β₁: Modest variations (within reasonable biological ranges) suffice to produce hooks
  • μ₁: Requires extreme variations (exceeding reasonable ranges) to produce effects
  • α₂, β₂, μ₂ (involving q(t)): Cannot produce hooks within reasonable ranges
  • μ₀: Effect is linear, inconsistent with the hook's nonlinear characteristics
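The linear role of μ₀ can be read off the ODE system directly. The short derivation below is a sketch assuming constant μ₀N₀ and the initial values x₂(0) = 0 and x₃(0) = 1, neither of which is stated explicitly in this summary.

```latex
\dot{x}_2 = -\mu_0 N_0\, x_4 = -\mu_0 N_0\, \dot{x}_3
\quad\Longrightarrow\quad
h(t) = x_2(t) = \mu_0 N_0 \bigl(1 - x_3(t)\bigr),
\qquad
h'(t) = -\mu_0 N_0\, \dot{x}_3(t),
\qquad
h''(t) = -\mu_0 N_0\, \ddot{x}_3(t).
```

Since the equations for x₃ through x₆ do not contain μ₀, rescaling a constant μ₀ multiplies h, h', and h'' by the same positive factor. The ages at which the slope of h decelerates or rebounds are therefore unchanged, which is consistent with μ₀ being unable to generate the hook on its own.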

Ablation Studies

Theoretical analysis (Sections 5-7) and numerical experiments are used to assess the contribution of each model component:

  1. Detection Dynamics (f(t)):
    • Theorems 5.3 and 5.4 prove weak effects of Mt and α₃ changes
    • In mid-age, ε₁, ε₂ = O(e⁻⁴⁵) ≪ 1
    • Conclusion: Detection effects are not the primary cause of the hook
  2. Late-Stage Clonal Dynamics (q(t)):
    • Theorem 7.5 proves q perturbation responses are limited by ∫|f'|
    • Numerical experiments show α₂, β₂, μ₂ changes cannot produce hooks
    • Conclusion: Late-stage dynamics are insufficient to explain the hook
  3. Early-Stage Clonal Dynamics (c(t)):
    • Propositions 6.5 and 6.7 provide sufficient conditions
    • Numerical experiments confirm modest α₁ and β₁ variations are effective
    • Conclusion: Early proliferation/apoptosis balance is key

Experimental Findings

  1. Necessary Number of Mutation Stages: A model with at least three stages is required to reproduce observed age-incidence curves (consistent with Tomasetti et al.'s findings)
  2. Parameter Change Timing:
    • Mid-age (45-55): α₁/β₁ needs to decrease
    • Postmenopause: α₁/β₁ needs to recover or increase
    • Time window highly coincides with menopausal transition period
  3. Biological Interpretation:
    • α₁/β₁ decrease corresponds to declining progesterone levels and reduced menstrual cycles
    • α₁/β₁ recovery corresponds to dominance of adipose tissue-derived growth signals
  4. Robustness: Phenomenon is consistent across different countries and cohorts, indicating universality of underlying mechanisms

Related Work

1. Epidemiological Studies

  • Clemmesen (1965): Foundational work first describing the hook phenomenon
  • Anderson et al. (2010): Similar patterns observed in male breast cancer
  • Gleason et al. (2012): Stratified analysis by ER/PR status
  • Collaborative Group (2012): Large-scale meta-analysis confirming association between menstrual cycle number and risk

2. Hormone and Breast Cancer Mechanisms

  • Coelingh Bennink et al. (2023): Proposes progesterone (not estrogen) as primary carcinogenic factor
  • Kim & Munster (2025): Review of estrogen's permissive role
  • An et al. (2022): Molecular mechanisms of progesterone promoting breast cancer via GPR126
  • Antoine et al. (2016): Relationship between hormone replacement therapy and incidence

3. Mathematical Modeling

  • Armitage & Doll (1954): Foundational multistage carcinogenesis theory
  • Moolgavkar et al. (1979, 1981): MSCE model framework
  • Brouwer et al. (2018): MSCE extension with time-dependent parameters
  • Meza et al. (2008): Application to lung cancer analysis
  • Mirzaei & Yang (2025): MSCE-T model (foundation of this work)

Advantages of This Work

  1. Rigorous Mathematical Analysis: First to derive explicit conditions for generating the hook (a necessary condition in Proposition 6.1, sufficient conditions in Propositions 6.5 and 6.7)
  2. Mechanism Identification: Clearly distinguishes detection effects from true biological changes
  3. Quantitative Framework: Establishes quantitative connection between hormonal transitions and incidence curves
  4. Multi-Data Validation: Cross-country and cross-cohort verification

Conclusions and Discussion

Main Conclusions

  1. Mechanism of Clemmesen's Hook: Driven by time-specific changes in the proliferation/apoptosis balance of early-stage mutant cells (first stage) around menopause
  2. Biological Correspondence:
    • Decline period: Progesterone level decrease → α₁/β₁ decrease → weakened promotion
    • Rebound period: Adipose-derived growth signals → α₁/β₁ recovery → new promotion pathway
  3. Exclusion of Alternative Mechanisms:
    • Detection effects have weak impact (Theorems 5.3, 5.4)
    • Late-stage clonal dynamics cannot produce hooks (Theorem 7.5)
    • Mutation rate changes require unreasonable magnitudes (Proposition 6.1)
  4. Minimum Number of Stages: At least three mutation stages required to reproduce observed curves

Limitations

  1. Model Simplifications:
    • Parameters treated as deterministic and piecewise continuous, ignoring stochasticity in hormones or genetics
    • Microenvironment effects not considered (oxidative stress, immune activity, matrix stiffness)
    • Tissue-level heterogeneity not incorporated
  2. Data Limitations:
    • Selection of low-screening cohorts (<29%) to reduce detection bias, but residual confounding may persist
    • The 5-year age grouping in the NORDCAN data may obscure fine-scale dynamics
  3. Parameter Estimation:
    • Piecewise fitting strategy may introduce boundary effects
    • Biological interpretation of parameters requires further experimental validation
  4. Generalizability:
    • Model targets female breast cancer; applicability to male or other hormone-related cancers unknown
    • Different subtypes (ER+/ER-) may have different dynamics

Future Directions

  1. Model Extensions:
    • Integrate individual differences in hormone trajectories
    • Incorporate microenvironment and immune factors
    • Consider genetic susceptibility
  2. Experimental Validation:
    • In vitro experiments validating progesterone effects on proliferation/apoptosis of different-stage cells
    • Measurement of relevant biomarkers before and after menopause
  3. Clinical Applications:
    • Optimize screening strategies based on model
    • Evaluate preventive effects of hormonal interventions
    • Personalized risk prediction
  4. Cross-Cancer Analysis:
    • Apply to other hormone-related cancers (prostate, ovarian)
    • Explore common hormonal regulatory mechanisms

In-Depth Evaluation

Strengths

  1. Methodological Rigor (★★★★★):
    • Complete mathematical derivations (positivity, invariance, well-posedness)
    • Rigorous sensitivity analysis (Gâteaux derivatives, Volterra representation)
    • Clear characterization of the necessary and sufficient conditions for decline and rebound
  2. Theoretical Contributions (★★★★★):
    • First quantitative explanation of Clemmesen's hook
    • Establishes mathematical bridge between hormones and cancer incidence
    • Proves counterintuitive insensitivity of detection effects
  3. Biological Insights (★★★★☆):
    • Quantitative evidence supporting progesterone hypothesis
    • Identifies critical action stage (early vs. late)
    • Consistent with latest epidemiological evidence
  4. Experimental Validation (★★★★☆):
    • Multi-country, multi-cohort data
    • High consistency between theoretical predictions and numerical experiments
    • Parameter changes within biologically reasonable ranges
  5. Writing Clarity (★★★★☆):
    • Clear structure and logical flow
    • Detailed mathematical derivations
    • Clear biological motivation

Weaknesses

  1. Model Assumptions (★★★☆☆):
    • Piecewise continuous parameter assumption may be overly idealized
    • Individual heterogeneity and stochasticity ignored
    • Microenvironment factors not incorporated
  2. Experimental Design (★★★☆☆):
    • Piecewise fitting strategy may introduce bias
    • Lack of independent validation set
    • No comparison with alternative modeling approaches
  3. Biological Verification (★★☆☆☆):
    • Lacks direct experimental evidence (cell experiments, animal models)
    • Biological interpretation of parameters relies on indirect inference
    • Does not distinguish ER+ and ER- subtypes
  4. Clinical Operationalizability (★★★☆☆):
    • Unclear pathway from model to clinical application
    • Personalized prediction requires further work
    • Limited evaluation of intervention strategies

Impact

  1. Academic Impact (★★★★★):
    • Fills long-standing knowledge gap
    • Provides new paradigm for hormone-cancer research
    • May inspire subsequent modeling and experimental studies
  2. Practical Value (★★★★☆):
    • Provides theoretical basis for optimizing screening strategies
    • Supports hormonal intervention prevention strategies
    • Aids risk stratification
  3. Reproducibility (★★★★☆):
    • Complete mathematical derivations verifiable
    • Data publicly accessible
    • Code partially open (reference 17)
    • Full reproduction requires substantial mathematical and programming expertise

Applicable Scenarios

  1. Direct Applications:
    • Breast cancer epidemiology research
    • Menopausal-related cancer risk assessment
    • Risk-benefit analysis of hormone replacement therapy
  2. Extensible Applications:
    • Other hormone-related cancers (prostate, ovarian, endometrial)
    • Other diseases exhibiting age-related non-monotonic patterns
    • Life course epidemiology studies
  3. Methodological Reference:
    • Multistage disease process modeling
    • Separation of detection effects from true risk
    • Application of Volterra integral methods in biology

Key References

  1. Clemmesen J (1965): Foundational work first describing the hook phenomenon
  2. Coelingh Bennink HJT et al. (2023): Latest review of progesterone carcinogenesis hypothesis
  3. Moolgavkar SH & Knudson AG (1981): Classical paper on MSCE model
  4. Tomasetti C et al. (2015): Estimates of number of driver gene mutations
  5. Mirzaei NM & Yang W (2025): Prior work on MSCE-T model

Overall Assessment: This is a high-quality interdisciplinary research paper combining rigorous mathematical analysis with an important biomedical problem. The theoretical derivations are solid, the biological insights are substantive, and the paper provides a new quantitative framework for understanding age patterns in breast cancer incidence. While there remains room for improvement in experimental validation and clinical translation, its academic value and potential impact are significant. Particularly noteworthy is how the paper uses mathematical analysis to show that detection effects and late-stage dynamics have only weak influence on mid-age incidence and cannot account for the hook, lending quantitative support to the progesterone hypothesis.