2025-11-16T06:16:12.477685

Approximation theory for 1-Lipschitz ResNets

Murari, Furuya, Schönlieb

1-Lipschitz neural networks are fundamental for generative modelling, inverse problems, and robust classifiers. In this paper, we focus on 1-Lipschitz residual networks (ResNets) based on explicit Euler steps of negative gradient flows and study their approximation capabilities. Leveraging the Restricted Stone-Weierstrass Theorem, we first show that these 1-Lipschitz ResNets are dense in the set of scalar 1-Lipschitz functions on any compact domain when width and depth are allowed to grow. We also show that these networks can exactly represent scalar piecewise affine 1-Lipschitz functions. We then prove a stronger statement: by inserting norm-constrained linear maps between the residual blocks, the same density holds when the hidden width is fixed. Because every layer obeys simple norm constraints, the resulting models can be trained with off-the-shelf optimisers. This paper provides the first universal approximation guarantees for 1-Lipschitz ResNets, laying a rigorous foundation for their practical use.


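Approximation theory for 1-Lipschitz ResNets

Basic information

Paper ID: 2505.12003
Title: Approximation theory for 1-Lipschitz ResNets
Authors: Davide Murari (University of Cambridge), Takashi Furuya (Doshisha University, RIKEN AIP), Carola-Bibiane Schönlieb (University of Cambridge)
Categories: cs.LG cs.NA math.NA
Venue: 39th Conference on Neural Information Processing Systems (NeurIPS 2025)
Paper link: https://arxiv.org/abs/2505.12003v2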

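Summary

This paper studies the approximation capabilities of 1-Lipschitz residual networks (ResNets) based on explicit Euler steps of negative gradient flows. Using the Restricted Stone-Weierstrass Theorem, it proves that these 1-Lipschitz ResNets are dense in the set of scalar 1-Lipschitz functions on any compact domain when width and depth are allowed to grow. It further shows that these networks can exactly represent scalar piecewise affine 1-Lipschitz functions. It also proves a stronger statement: by inserting norm-constrained linear maps between the residual blocks, the same density holds even when the hidden width is fixed. Because every layer obeys simple norm constraints, the resulting models can be trained with off-the-shelf optimisers.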

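Research background and motivation

Importance of the problem

1-Lipschitz neural networks play a fundamental role in several important areas:

Generative modelling: the discriminator of a Wasserstein GAN must satisfy a 1-Lipschitz constraint, which yields a valid estimate of the 1-Wasserstein distance through Kantorovich-Rubinstein duality
Inverse problems: in Plug-and-Play algorithms, a 1-Lipschitz constraint guarantees convergence of the iterative scheme
Robust classifiers: controlling the Lipschitz constant of a network improves its robustness to adversarial attacks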

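Limitations of existing approaches

Reduced expressivity: constraining the Lipschitz constant of a network usually reduces its expressive power and causes a marked drop in performance
Theoretical gap: the approximation properties of constrained networks are poorly understood, and different constraint strategies can lead to very different expressive power
Implementation difficulty: existing 1-Lipschitz ResNets lack rigorous theoretical guarantees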

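Research motivation

This paper aims to fill the gap in the theoretical analysis of 1-Lipschitz ResNets, to provide a rigorous mathematical foundation for understanding the approximation capabilities of this class of networks, and to give theoretical support to practical applications.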

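Main contributions

First universal approximation theorem: provides the first universal approximation guarantees for 1-Lipschitz ResNets, proving that ResNets based on negative gradient flows are dense in the set of scalar 1-Lipschitz functions
Fixed-width approximation result: proves that, by introducing norm-constrained linear maps, the universal approximation property is retained even when the network width is fixed
Constructive proof methods: provides two proof strategies, one based on the Restricted Stone-Weierstrass Theorem and one based on an explicit construction with piecewise affine functions
Practical architecture design: proposes network architectures with explicit constraints that can be trained with standard optimisers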

Method details

Problem definition

The object of study is the space of 1-Lipschitz functions on a compact set $X \subset \mathbb{R}^d$: $C_1(X,\mathbb{R}) = \{g : X \to \mathbb{R} \mid \|g(y) - g(x)\|_2 \leq \|y - x\|_2, \forall x,y \in X\}$

The goal is to construct a set of neural networks that is dense in $C_1(X,\mathbb{R})$.
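To make the defining inequality concrete, the following minimal sketch estimates the largest ratio $|g(y) - g(x)| / \|y - x\|_2$ over a finite sample of a compact set. The helper name `max_lipschitz_ratio`, the choice of the cube $[-1,1]^3$, and the two test functions are illustrative assumptions, not taken from the paper. A sampled ratio can only give a lower bound on the true Lipschitz constant, so this is a sanity check rather than a certificate.

```python
import numpy as np

def max_lipschitz_ratio(g, points):
    """Largest |g(y) - g(x)| / ||y - x||_2 over all distinct pairs of sample points."""
    values = np.array([g(p) for p in points])
    diffs = np.abs(values[:, None] - values[None, :])
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    mask = dists > 0  # skip the diagonal (x == y)
    return float(np.max(diffs[mask] / dists[mask]))

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))  # samples from the compact set [-1, 1]^3

# g(x) = ||x||_2 is 1-Lipschitz by the reverse triangle inequality.
ratio_norm = max_lipschitz_ratio(np.linalg.norm, X)
# g(x) = 2 * x_1 has Lipschitz constant 2, so it lies outside C_1(X, R).
ratio_scaled = max_lipschitz_ratio(lambda x: 2.0 * x[0], X)

print(ratio_norm, ratio_scaled)
```

The first function stays at or below 1 on every sampled pair, while the second exceeds 1 on some pairs, which is enough to exclude it from $C_1(X,\mathbb{R})$.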

Core components

1-Lipschitz residual layers

Each layer is an explicit Euler step of a negative gradient flow: $\Phi_{\theta_\ell}(x) = x - \tau_\ell W_\ell^T \sigma(W_\ell x + b_\ell)$

where $\sigma = \text{ReLU}$, subject to the constraints $0 \leq \tau_\ell \leq 2/\|W_\ell\|_2^2$ and $\|W_\ell\|_2 \leq 1$
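The layer formula and its constraints can be sketched in a few lines of NumPy. This is an illustrative check under stated assumptions, not the authors' implementation: the function name `residual_layer`, the dimensions, and the random weights are hypothetical. The step size is set to the upper bound $2/\|W\|_2^2$, the extreme case allowed by the constraint, and the sketch then measures how much the layer stretches distances between random pairs of inputs.

```python
import numpy as np

def residual_layer(x, W, b, tau):
    """One explicit Euler step: x - tau * W^T ReLU(W x + b)."""
    return x - tau * W.T @ np.maximum(W @ x + b, 0.0)

rng = np.random.default_rng(0)
d, h = 4, 6  # input dimension and number of hidden units (illustrative)

W = rng.normal(size=(h, d))
W /= max(1.0, np.linalg.norm(W, 2))     # enforce the spectral-norm constraint ||W||_2 <= 1
b = rng.normal(size=h)
tau = 2.0 / np.linalg.norm(W, 2) ** 2   # largest step size allowed by the constraint

# Empirically measure how much the layer stretches distances.
worst = 0.0
for _ in range(1000):
    x, y = rng.normal(size=d), rng.normal(size=d)
    out_dist = np.linalg.norm(residual_layer(y, W, b, tau) - residual_layer(x, W, b, tau))
    worst = max(worst, out_dist / np.linalg.norm(y - x))

print(worst)
```

The reason the ratio never exceeds 1 is that, on each linear region, the Jacobian is $I - \tau W^T D W$ with $D$ a diagonal 0/1 matrix, which is symmetric with eigenvalues in $[-1, 1]$ whenever $0 \leq \tau \leq 2/\|W\|_2^2$.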

Definition of the network architecture

Set of networks with unbounded width and depth: $\mathcal{G}_{d,\sigma}(X,\mathbb{R}) = C_1(X,\mathbb{R}) \cap \{v^T \circ \Phi_{\theta_L} \circ \cdots \circ \Phi_{\theta_1} \circ Q : X \to \mathbb{R}\}$

Set of fixed-width networks: $\tilde{\mathcal{G}}_{d,\sigma,h}(X,\mathbb{R}) = \{v^T \circ \Phi_{\theta_L} \circ A_{L-1} \circ \cdots \circ A_1 \circ \Phi_{\theta_1} \circ Q : X \to \mathbb{R}\}$

where the $A_i$ are norm-constrained affine maps.
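A rough sketch of the fixed-width composition is given below. Several details are assumptions made for illustration only, since this summary does not spell them out: $Q$ is taken to be a linear lifting with spectral norm at most 1, each $A_i$ is an affine map whose linear part has spectral norm at most 1, and $v$ has Euclidean norm at most 1. The helper names `spectral_clip` and `make_network` are hypothetical. Under these assumptions every factor is 1-Lipschitz, so the composition is too.

```python
import numpy as np

def spectral_clip(M):
    """Rescale M so that its spectral norm is at most 1."""
    return M / max(1.0, np.linalg.norm(M, 2))

def residual_layer(x, W, b, tau):
    """One explicit Euler step: x - tau * W^T ReLU(W x + b)."""
    return x - tau * W.T @ np.maximum(W @ x + b, 0.0)

def make_network(d, h, depth, rng):
    """Build v^T o Phi_L o A_{L-1} o ... o A_1 o Phi_1 o Q with random parameters."""
    Q = spectral_clip(rng.normal(size=(h, d)))  # lifting map (assumed linear, norm <= 1)
    blocks = []
    for _ in range(depth):
        W = spectral_clip(rng.normal(size=(h, h)))
        b = rng.normal(size=h)
        tau = rng.uniform(0.0, 2.0)  # admissible: ||W||_2 <= 1 implies 2 / ||W||_2^2 >= 2
        A = spectral_clip(rng.normal(size=(h, h)))  # linear part of the affine map
        a = rng.normal(size=h)                      # offset of the affine map
        blocks.append((W, b, tau, A, a))
    v = rng.normal(size=h)
    v /= max(1.0, np.linalg.norm(v))  # assumed constraint ||v||_2 <= 1

    def f(x):
        z = Q @ x
        for i, (W, b, tau, A, a) in enumerate(blocks):
            z = residual_layer(z, W, b, tau)
            if i < depth - 1:  # affine maps sit only between residual blocks
                z = A @ z + a
        return float(v @ z)

    return f

rng = np.random.default_rng(1)
f = make_network(d=3, h=8, depth=4, rng=rng)

worst = 0.0
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    worst = max(worst, abs(f(y) - f(x)) / np.linalg.norm(y - x))

print(worst)
```

Dropping the `A @ z + a` step and letting `h` grow gives a sketch of the unbounded-width family, where membership in $C_1(X,\mathbb{R})$ is imposed by the intersection in the definition rather than by these per-factor constraints.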

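Technical innovations

1. Dual proof strategy

Stone-Weierstrass approach: verifies that the set of networks is a lattice that separates points, and therefore satisfies the hypotheses of the Restricted Stone-Weierstrass Theorem
Constructive approach: proves that the networks can exactly represent every piecewise affine 1-Lipschitz function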
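Both strategies rest on the fact that pointwise maxima and minima of 1-Lipschitz functions are again 1-Lipschitz, which is what makes a lattice structure possible. It is also a standard fact that continuous piecewise affine functions admit a max-min representation over their affine pieces. The sketch below illustrates only this function class, not the paper's network construction: it builds a max-of-mins of affine pieces whose slopes have norm at most 1 and checks the Lipschitz ratio on random pairs. The sizes and random parameters are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, groups, pieces = 2, 3, 4  # illustrative sizes

# Affine pieces x -> a . x + c with ||a||_2 <= 1, so each piece is 1-Lipschitz.
A = rng.normal(size=(groups, pieces, d))
A /= np.maximum(1.0, np.linalg.norm(A, axis=-1, keepdims=True))
c = rng.normal(size=(groups, pieces))

def f(x):
    """Max over groups of the min over pieces of affine functions."""
    return float(np.max(np.min(A @ x + c, axis=1)))

# Taking min and max preserves the 1-Lipschitz property; check it on random pairs.
worst = 0.0
for _ in range(2000):
    x, y = rng.normal(size=d), rng.normal(size=d)
    worst = max(worst, abs(f(y) - f(x)) / np.linalg.norm(y - x))

print(worst)
```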

2. Innovative fixed-width design

By introducing a special residual layer structure: $\tilde{\mathcal{E}}_{h,\sigma} = \left\{\Phi_\theta : \mathbb{R}^{h+3} \to \mathbb{R}^{h+3} \mid \Phi_\theta(x) = \begin{bmatrix} \max\{x_1, x_2\} \\ \min\{x_1, x_2\} \\ x_3 \\ \tilde{\Phi}_\theta(x_{4:}) \end{bmatrix}\right\}$
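One way to see that the max/min action on the first two coordinates is compatible with the residual form $x - \tau W^T \sigma(Wx + b)$ is the small check below. The specific choice $W = \tfrac{1}{\sqrt{2}}[-1, 1]$, $b = 0$, $\tau = 2$ is an assumption worked out for this illustration and is not quoted from the paper. It satisfies $\|W\|_2 = 1$ and $\tau = 2/\|W\|_2^2$, and the step adds $\max\{x_2 - x_1, 0\}$ to the first coordinate and subtracts it from the second.

```python
import numpy as np

def residual_layer(x, W, b, tau):
    """One explicit Euler step: x - tau * W^T ReLU(W x + b)."""
    return x - tau * W.T @ np.maximum(W @ x + b, 0.0)

# Single hidden unit measuring (x_2 - x_1) / sqrt(2); spectral norm is exactly 1.
W = np.array([[-1.0, 1.0]]) / np.sqrt(2.0)
b = np.zeros(1)
tau = 2.0  # equals 2 / ||W||_2^2, the largest admissible step

rng = np.random.default_rng(0)
samples = rng.normal(size=(100, 2))

outputs = np.array([residual_layer(x, W, b, tau) for x in samples])
expected = np.stack([samples.max(axis=1), samples.min(axis=1)], axis=1)

print(np.allclose(outputs, expected))
```

The remaining coordinates in the block structure ($x_3$ passed through unchanged and $\tilde{\Phi}_\theta$ acting on $x_{4:}$) are not modelled in this sketch.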