2025-11-13T11:46:11.189224

Representation Theorem for Matrix Product States

Guo, Draper

In this work, we investigate the universal representation capacity of the Matrix Product States (MPS) from the perspective of boolean functions and continuous functions. We show that MPS can accurately realize arbitrary boolean functions by providing a construction method of the corresponding MPS structure for an arbitrarily given boolean gate. Moreover, we prove that the function space of MPS with the scale-invariant sigmoidal activation is dense in the space of continuous functions defined on a compact subspace of the $n$-dimensional real coordinate space $\mathbb{R^{n}}$. We study the relation between MPS and neural networks and show that the MPS with a scale-invariant sigmoidal function is equivalent to a one-hidden-layer neural network equipped with a kernel function. We construct the equivalent neural networks for several specific MPS models and show that non-linear kernels such as the polynomial kernel which introduces the couplings between different components of the input into the model appear naturally in the equivalent neural networks. At last, we discuss the realization of the Gaussian Process (GP) with infinitely wide MPS by studying their equivalent neural networks.

academic

Representation Theorem for Matrix Product States

Basic Information

Paper ID: 2103.08277
Title: Representation Theorem for Matrix Product States
Authors: Erdong Guo, David Draper (University of California, Santa Cruz)
Classification: stat.ML cs.LG cs.NE quant-ph
Publication Date: March 15, 2021 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2103.08277

Abstract

This paper investigates the universal representation capacity of Matrix Product States (MPS) from the perspectives of Boolean functions and continuous functions. The authors prove that MPS can exactly implement arbitrary Boolean functions and provide constructive methods for building corresponding MPS structures for any given Boolean gate. Furthermore, they demonstrate that the function space of MPS with scale-invariant sigmoid activation functions is dense in the space of continuous functions defined on compact subsets of n-dimensional real coordinate space. The relationship between MPS and neural networks is studied, showing that MPS with scale-invariant sigmoid functions is equivalent to single-hidden-layer neural networks equipped with kernel functions. Finally, the realization of Gaussian processes (GP) by infinite-width MPS is discussed through the study of equivalent neural networks.

Research Background and Motivation

Problem Background

Rise of Tensor Networks: Tensor networks serve as a powerful graphical language for representing many-body quantum systems and have found widespread applications across quantum information, condensed matter physics, applied mathematics, and computer science.
Representation Capacity of MPS: While MPS possesses important physical significance in quantum physics, a natural question arises when treating it as an algebraic tool in machine learning: how powerful is MPS as an algebraic machine in terms of representation capacity?
Need for Universal Approximation Theory: Similar to the universal approximation theorem for neural networks, theoretical proof of MPS's representational boundaries is needed.

Research Motivation

Filling Theoretical Gaps: Existing research primarily focuses on the physical properties of MPS, lacking theoretical analysis of its role as a function approximator.
Establishing Connections Between MPS and Neural Networks: Exploring the equivalence relationships between MPS and classical machine learning models, particularly neural networks.
Practical Considerations: In practical applications, complete bases are typically infinite-dimensional, necessitating investigation of how large a function space MPS can "span" under mild assumptions.

Core Contributions

Exact Representation of Boolean Functions: Proves that MPS can exactly implement arbitrary Boolean functions and provides constructive proof methods.
Universal Approximation of Continuous Functions: Demonstrates that the function space of MPS with scale-invariant sigmoid activation is dense in the continuous function space (with respect to the supremum norm).
Equivalence Between MPS and Neural Networks: Establishes the equivalence relationship between MPS and single-hidden-layer neural networks, revealing the natural emergence of kernel functions in MPS.
Realization of Gaussian Processes: Discusses the implementation of Gaussian processes through infinite-width MPS.

Detailed Methods

MPS Model Definition

Standard MPS Structure

The original MPS model is defined as: $\Psi_l(x|w,B) = \sum_{\{\alpha,s\}} A^{s_1}_{\alpha_1\alpha_2} \cdots A^{s_i}_{l\alpha_i\alpha_{i+1}} \cdots A^{s_n}_{\alpha_n\alpha_1} \Phi^{s_1\cdots s_n}(x)$

where the kernel function is defined as: $\Phi^{s_1\cdots s_n}(x) = \phi^{s_1}(x_1) \otimes \cdots \otimes \phi^{s_i}(x_i) \cdots \otimes \phi^{s_n}(x_n)$

Modified MPS Structure

To achieve universal approximation, the authors propose a modified MPS structure: $\Psi(x|w,B) = \sum_l \sigma\left(\sum_{\{\alpha,s\}} A^{s_1}_{\alpha_1\alpha_2} \cdots A^{s_i}_{l\alpha_i\alpha_{i+1}} \cdots A^{s_n}_{\alpha_n\alpha_1} \Phi^{s_1\cdots s_n}(x)\right)$

where $\sigma(\cdot)$ is a scale-invariant sigmoid function: $\sigma(x) \to \begin{cases} 0 & x \to -\infty \\ C & x \to +\infty \end{cases}$

Boolean Function Representation Methods

Implementation of Basic Boolean Gates

AND Gate Implementation (Theorem 2.1):

Kernel function: $\phi_i(X_i) = [X_i, 1-X_i]$
Tensor node: $A^{s_i} = [1, 0]$ , bond dimension $|\alpha| = 1$

OR Gate Implementation (Theorem 2.2):

Kernel function: $\phi_i(X_i) = [X_i, 1-X_i]$
Tensor node bond dimension: $|\alpha| = 3$
Specific tensor structure: $A^{s_1}_{\alpha_1\alpha_2} = [[1, 0, 1], [0, 1, 0]]$ $A^{s_2}_{\alpha_2\alpha_1} = [[0, 1, 1], [1, 0, 0]]$

NOT Gate Implementation (Theorem 2.3):

Kernel function: $\phi_1(X_1) = 1-X_1$
Tensor node: $A^{s_1} = 1$

Universal AND Gate and Arbitrary Boolean Functions

Universal AND Gate (Theorem 2.4): For $n$ input variables, the following can be implemented: $\Psi(X_1, \cdots, X_n) = (\bigwedge_{i=1}^l X_i) \bigwedge (\bigwedge_{j=l+1}^n \overline{X_j})$

Arbitrary Boolean Functions (Theorem 2.5): By expressing arbitrary Boolean functions as disjunctive normal forms corresponding to truth tables, the corresponding MPS can be constructed. Construction rules:

Write the Boolean function in disjunctive normal form from its truth table
Set the bond dimension equal to the number of disjunctive terms $m$
Fill tensor elements according to specific rules

Continuous Function Approximation Theory

Main Theorem (Theorem 3.1)

The MPS function space is dense in $C_0(I^n)$ (the space of continuous functions on the unit hypercube), meaning for any $f(x) \in C_0(I^n)$ and any $\varepsilon > 0$ , there exists an MPS function $\Psi(x)$ such that: $\sup_x |\Psi(x) - f(x)| < \varepsilon$

Proof Techniques

Linearity Proof (Lemma 3.2): Proves that the MPS function family $\mathcal{M}$ is a linear subspace of $C_0(I^n)$ :

Closure under scalar multiplication: utilizing scale-invariance
Closure under addition: constructing new MPS representing the sum of two MPS

Discriminating Property (Lemma 3.4): Proves that the scale-invariant sigmoid function possesses the discriminating property: if a finite signed measure $\mu$ makes the integral of all MPS functions zero, then $\mu = 0$ .

Main Theorem Proof: Uses proof by contradiction with Hahn-Banach theorem and Riesz representation theorem:

Assume $\overline{\mathcal{M}}$ is a proper subset of $C_0(I^n)$
By Hahn-Banach theorem, there exists a non-trivial functional annihilating $\overline{\mathcal{M}}$
By Riesz representation theorem, this corresponds to a non-trivial measure
By the discriminating property, this measure must be zero, yielding a contradiction

Equivalence Between MPS and Neural Networks

Equivalence Theorem (Theorem 3.5)

MPS with scale-invariant sigmoid activation is equivalent to single-hidden-layer neural networks equipped with kernel functions.

Conversion Method

By contracting internal indices $\{\alpha_i\}$ , MPS can be written as: $\Psi(x) = \sum_l \sigma\left(\sum_s W^l_s \Phi^s(x)\right)$

This is precisely the form of a single-hidden-layer neural network, where:

$W^l_s$ are weight parameters
$\Phi^s(x)$ are kernel functions, naturally introducing coupling between input components

Natural Emergence of Kernel Functions

Concrete examples demonstrate how nonlinear kernels such as polynomial kernels naturally emerge in equivalent neural networks, for instance: $(\Phi^s)^T = [x_1x_2x_3, x_2x_3, x_1x_3, x_1x_2, x_1, x_2, x_3, 1]$

Experimental Results and Case Studies

Boolean Function Implementation Cases

3-Input OR Gate: Boolean expression: $f(X_1,X_2,X_3) = X_1 \vee X_2 \vee X_3$ The corresponding MPS tensor structure is detailed in the methods section.

3-Input Parity Gate: Boolean expression: $f(X_1,X_2,X_3) = X_1 \oplus X_2 \oplus X_3$ Equivalent neural network weights: $W^s = [1, 0, 0, 1, 0, 1, 1, 0]$

Threshold Gate Th₃²: A threshold function that outputs 1 when at least 2 inputs are 1.

Complexity Analysis

For $n$ -input Boolean gates, the bond dimension requires $O(2^n)$ in extreme cases, but can be reduced to $O(2^{n-1})$ through Karnaugh map simplification, with total parameters $O(n2^{n-1})$ , comparable to single-hidden-layer neural network efficiency.

Tensor Network Theory Foundations

Penrose (1971) graphical notation system for tensor calculus
Vidal (2003, 2007) Schmidt decomposition and DMRG methods
Perez-Garcia et al. (2006) systematic study of MPS properties

Tensor Networks in Machine Learning

Stoudenmire & Schwab (2016) supervised learning applications
Various tensor network applications in regression, classification, and density estimation

Universal Approximation Theory

Classical work by Cybenko (1989), Hornik (1991) on neural network universal approximation
This paper employs similar functional analysis techniques

Conclusions and Discussion

Main Conclusions

Theoretical Completeness: MPS possesses the capacity to represent arbitrary Boolean functions and approximate arbitrary continuous functions
Equivalence Revelation: MPS is essentially equivalent to single-hidden-layer neural networks with kernel functions
Kernel Function Importance: The kernel function $\Phi^{s_1\cdots s_n}$ is key to MPS's representational capacity, rather than the bond indices $\{\alpha_i\}$

Limitations

Practical Concerns: MPS implementation of Boolean functions requires exponential bond dimensions
Loss of Physical Meaning: As a pure algebraic tool, MPS loses important quantum physical properties such as entanglement
Kernel Function Design: Careful design of kernel functions is needed to achieve sufficient representational capacity

Future Directions

Efficient Construction Methods: Seeking more efficient MPS construction methods to reduce complexity
Deep Structures: Exploring multi-layer MPS structures, analogous to deep neural networks
Quantum Advantage: Exploring unique advantages of MPS in quantum computing environments

In-Depth Evaluation

Strengths

Significant Theoretical Contribution: First systematic analysis of MPS representational capacity from a function approximation perspective
Rigorous Proofs: Uses classical functional analysis tools with rigorous proof procedures
Insightful Connections: Reveals deep connections between MPS and neural networks, providing bridges for cross-disciplinary understanding
Constructive Proofs: Not only proves existence but also provides concrete construction methods

Weaknesses

Limited Practical Applicability: Theoretical results may face the curse of dimensionality in practical applications
Insufficient Experimental Validation: Lacks large-scale numerical experiments to verify theoretical results
Missing Optimization Algorithms: Does not discuss how to efficiently train such MPS models
Inadequate Comparative Analysis: Lacks detailed comparative analysis with other universal approximators

Impact

High Theoretical Value: Provides solid theoretical foundation for tensor network applications in machine learning
Cross-Disciplinary Significance: Connects quantum physics and machine learning domains
Strong Inspirational Value: Provides important reference for subsequent research on tensor network representation capacity and optimization methods

Applicable Scenarios

Theoretical Research: Suitable as foundational literature for tensor network representation theory
Educational Purposes: Can be used to explain the relationship between MPS and neural networks
Algorithm Design: Provides theoretical guidance for designing MPS-based machine learning algorithms
Quantum Machine Learning: Provides theoretical support for quantum machine learning algorithm design

References

This paper cites important literature from multiple fields including tensor networks, quantum information, machine learning, and functional analysis, including:

Tensor network foundational theory (Penrose, 1971; Vidal, 2007; Perez-Garcia et al., 2006)
Neural network universal approximation theory (Cybenko, 1989; Hornik, 1991)
Tensor network applications in machine learning (Stoudenmire & Schwab, 2016; et al.)
Functional analysis theoretical foundations (Folland, 2013)