The Splendors and Miseries of Heavisidisation

Dolotin, Morozov
Machine Learning (ML) is applicable to scientific problems, i.e. to those which have a well defined answer, only if this answer can be brought to a peculiar form ${\cal G}: X\longrightarrow Z$ with ${\cal G}(\vec x)$ expressed as a combination of iterated Heaviside functions. At present it is far from obvious, if and when such representations exist, what are the obstacles and, if they are absent, what are the ways to convert the known formulas into this form. This gives rise to a program of reformulation of ordinary science in such terms -- which sounds like a strong enhancement of the constructive mathematics approach, only this time it concerns all natural sciences. We describe the first steps on this long way.

Basic Information

  • Paper ID: 2205.07377
  • Title: The Splendors and Miseries of Heavisidisation
  • Authors: V. Dolotin, A. Morozov
  • Institution: MIPT, ITEP & IITP, Moscow, Russia
  • Classification: hep-th (High Energy Physics Theory), cs.LG (Machine Learning)
  • Publication Date: May 15, 2022
  • Paper Link: https://arxiv.org/abs/2205.07377

Abstract

Machine learning (ML) can only be applied to scientific problems when the problem has a definite answer that can be expressed as a mapping $G: X \rightarrow Z$, where $G(\vec{x})$ can be expressed as a combination of iterated Heaviside functions. It remains unclear when such representations exist, what obstacles arise, and how to convert known formulas into this form when no obstacles are present. This raises the question of reformulating ordinary science using this terminology, a procedure that sounds like an enhanced version of constructive mathematics, but this time involving all natural sciences. This paper describes the first steps along this long road.

Research Background and Motivation

Problem Statement

The core problem addressed in this paper is: How can machine learning methods be effectively applied to scientific problems with definite answers? The authors point out that traditional machine learning is primarily used for classification problems (such as image recognition and decision problems), but extending it to genuine scientific problems faces fundamental obstacles.

Problem Significance

The importance of this problem lies in:

  1. Revolutionary needs in scientific computing: Extending machine learning from big data analysis and computational experiments to genuine scientific discovery
  2. Enhancement of constructive mathematics: Providing a framework for reformulating all natural sciences in a constructive manner
  3. Bridge between artificial intelligence and science: Exploring whether machines can discover and understand scientific laws

Limitations of Existing Approaches

  1. Limitations of gradient descent methods: Gradient descent, the core of current ML methodology, applies only to specific function representations
  2. Specificity of scientific problems: Scientific problems have "objective" answers, unlike general pattern recognition problems
  3. Constraints on representation form: Scientific formulas must be converted into the form of iterated Heaviside functions

Core Contributions

  1. Introduced the concept of "Heavisidisation": A systematic method for representing scientific problem answers as combinations of iterated Heaviside functions
  2. Established Heaviside representations of basic operations: Including logical operations, arithmetic operations, zero-point detection, and other fundamental building blocks
  3. Explored Heavisidisation of algebraic numbers: Attempted to convert problems such as solving quadratic equations into Heaviside function representations
  4. Analyzed applicability of gradient descent methods: Investigated convergence properties of machine learning algorithms under Heaviside representations
  5. Revealed gauge invariance issues: Discovered and analyzed the gauge freedom problem in the Heavisidisation process

Detailed Methodology

Task Definition

Input: Scientific problems with definite answers, expressed as a mapping $G: X \rightarrow Z$
Output: A Heaviside function iterated representation of this mapping
Constraints: Must use a parameterized form optimizable by gradient descent methods

Basic Properties of the Heaviside Function

The authors define the Heaviside function as: $\theta(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases}$

Key properties:

  • Idempotence: $\theta(\theta(x)) = \theta(x)$
  • Logical operation implementation:
    • AND: $\wedge(a,b) := \theta(\theta(a) + \theta(b) - 1)$
    • OR: $\vee(a,b) := \theta(\theta(a) + \theta(b))$
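
The properties above can be checked directly. The following is a minimal sketch, assuming the convention $\theta(0) = 0$ stated in the definition; the function names `theta`, `AND`, and `OR` are chosen here for illustration and are not taken from the paper.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def AND(a, b):
    # theta(a) + theta(b) - 1 is positive only when both inputs are positive
    return theta(theta(a) + theta(b) - 1)

def OR(a, b):
    # theta(a) + theta(b) is positive when at least one input is positive
    return theta(theta(a) + theta(b))

# Idempotence on a few sample points (including the boundary x = 0)
samples = [-2.5, -1, 0, 0.5, 3]
idempotent = all(theta(theta(x)) == theta(x) for x in samples)

# Truth table rows: (a, b, AND, OR) with -1 standing for "false", 1 for "true"
truth_table = [(a, b, AND(a, b), OR(a, b)) for a in (-1, 1) for b in (-1, 1)]
for row in truth_table:
    print(row)
```

Inputs are treated as "true" when strictly positive, so the same gates work on raw real values as well as on 0/1 outputs of earlier Heaviside layers.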

Heavisidisation of Basic Operations

1. Identity Function

For integer $x$: $x = I(x) := \sum_{i=0}^{\infty} \theta(x-i) - \sum_{i=0}^{\infty} \theta(-x-i)$
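
In practice the infinite sums have to be truncated. The sketch below assumes a hypothetical `cutoff` parameter (not part of the paper's notation) and checks that the truncated $I(x)$ reproduces every integer with $|x| \leq$ `cutoff`.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def identity(x, cutoff=100):
    """Truncated I(x): exact for integers with |x| <= cutoff."""
    positive_part = sum(theta(x - i) for i in range(cutoff))
    negative_part = sum(theta(-x - i) for i in range(cutoff))
    return positive_part - negative_part

print([identity(x) for x in (-3, 0, 4)])
```

For positive $x$ the first sum counts the indices $i = 0, \dots, x-1$ and the second sum vanishes; for negative $x$ the roles are swapped.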

2. Addition

For non-negative integers $x, y$ (where the second sum in $I$ vanishes): $x + y = I(x) + I(y) = \sum_{i=0}^{\infty} \theta(x-i) + \sum_{j=0}^{\infty} \theta(y-j)$

3. Multiplication

For non-negative integers $x, y$, with $i, j \geq 0$: $x \cdot y = \sum_{i,j} \theta(\theta(x-i) + \theta(y-j) - 1) = \sum_{i,j} \wedge(x-i, y-j)$
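
The addition and multiplication formulas can be verified by brute force on small non-negative integers. This is a sketch under the assumption that all sums are cut off at a finite `cutoff` (a name introduced here), which must exceed the largest input.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def wedge(a, b):
    """Logical AND built from Heaviside functions."""
    return theta(theta(a) + theta(b) - 1)

def add(x, y, cutoff=50):
    # Each sum counts the indices below its argument
    return (sum(theta(x - i) for i in range(cutoff))
            + sum(theta(y - j) for j in range(cutoff)))

def mul(x, y, cutoff=50):
    # Counts lattice points (i, j) with i < x and j < y, i.e. an x-by-y rectangle
    return sum(wedge(x - i, y - j)
               for i in range(cutoff) for j in range(cutoff))

print(add(3, 4), mul(3, 4))
```

Multiplication is thus a two-layer construction: the inner layer tests each coordinate, and the outer layer applies AND to each pair.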

4. Root Extraction

$x^{1/n} = \sum_{i=0}^{\infty} \theta(x - i^n)$
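
A short numerical check of the root formula, again with an assumed finite `cutoff`. With $\theta(0) = 0$ the sum counts the integers $i \geq 0$ with $i^n < x$, so it is exact for perfect $n$-th powers and rounds up otherwise; the rounding behaviour is an observation about the formula as written here, not a claim from the paper.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def heaviside_root(x, n, cutoff=1000):
    """Truncated sum over i >= 0 of theta(x - i**n)."""
    return sum(theta(x - i ** n) for i in range(cutoff))

print(heaviside_root(27, 3))  # perfect cube
print(heaviside_root(16, 2))  # perfect square
print(heaviside_root(10, 2))  # not a perfect square: rounds up
```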

Zero-Point Detection Methods

One-Dimensional Case

For detecting zeros of a function $f(x)$ between grid points $i$ and $i+1$: $\delta_i(f) := \vee(\theta(f_{i+1}) - \theta(f_i), \theta(f_i) - \theta(f_{i+1}))$
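
The one-dimensional detector flags a grid cell whenever $\theta(f)$ differs at its two endpoints. The sketch below applies it to a test function chosen here for illustration, $f(x) = x^2 - 2$, on a grid of spacing 0.25; the helper name `flagged_cells` is hypothetical.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def vee(a, b):
    """Logical OR built from Heaviside functions."""
    return theta(theta(a) + theta(b))

def delta(f_i, f_next):
    # One of the two differences equals 1 exactly when the sign pattern changes
    return vee(theta(f_next) - theta(f_i), theta(f_i) - theta(f_next))

def flagged_cells(f, grid):
    """Indices i such that delta_i(f) = 1 on the given grid."""
    values = [f(x) for x in grid]
    return [i for i in range(len(values) - 1)
            if delta(values[i], values[i + 1]) == 1]

grid = [-2 + 0.25 * k for k in range(17)]      # -2.0, -1.75, ..., 2.0
cells = flagged_cells(lambda x: x * x - 2, grid)
print(cells, [(grid[i], grid[i + 1]) for i in cells])
```

The two flagged cells bracket the roots $\pm\sqrt{2} \approx \pm 1.414$. A zero lying exactly on a grid point is treated as a non-positive value because of the $\theta(0) = 0$ convention.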

Two-Dimensional Case

Detecting common zeros of functions $f, g$ within a square region: $\delta_{i,j}(f,g) = \wedge(\delta_{i,j}(f), \delta_{i,j}(g))$

Zero-point location approximation: $\left(\sum_{i,j} \frac{i}{N}\delta_{i,j}(f,g), \sum_{i,j} \frac{j}{N}\delta_{i,j}(f,g)\right)$
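
The two-dimensional construction can be sketched as follows. The summary does not spell out the single-function cell detector $\delta_{i,j}(f)$, so the code assumes one plausible choice: a cell is flagged when $\theta(f)$ is not the same at all four corners. The domain $[0,1]^2$, the test functions, and all helper names are likewise assumptions made for illustration. The location formula is meaningful when exactly one cell is flagged.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def wedge(a, b):
    """Logical AND built from Heaviside functions."""
    return theta(theta(a) + theta(b) - 1)

def cell_delta(f, i, j, N):
    # ASSUMPTION: flag the cell if theta(f) differs among its four corners,
    # i.e. the number s of positive corners satisfies 0 < s < 4.
    corners = [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]
    s = sum(theta(f(a / N, b / N)) for a, b in corners)
    return wedge(s, 4 - s)

def common_delta(f, g, i, j, N):
    return wedge(cell_delta(f, i, j, N), cell_delta(g, i, j, N))

def locate(f, g, N):
    """Weighted sums of cell indices over all flagged cells."""
    cells = [(i, j) for i in range(N) for j in range(N)]
    x = sum(i / N * common_delta(f, g, i, j, N) for i, j in cells)
    y = sum(j / N * common_delta(f, g, i, j, N) for i, j in cells)
    return x, y

# Illustrative pair with a single common zero at (0.33, 0.71)
f = lambda x, y: x - 0.33
g = lambda x, y: y - 0.71
print(locate(f, g, N=10))
```

The result is the lower-left corner of the flagged cell, so the error is bounded by the grid spacing $1/N$.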

Sector Functions and Classification Problems

One-Dimensional Sector

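Characteristic function of the interval $[2,3]$: $G(x) = \theta(x-2) - \theta(x-3)$ (with $\theta(0) = 0$ the endpoints give $G(2) = 0$ and $G(3) = 1$, so strictly this is the indicator of $(2,3]$)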

Two-Dimensional Sector

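Characteristic function of the first quadrant: $G(x_1,x_2) = -\theta(\theta(-x_1) + \theta(-x_2) - 1) + 1$. As written, this equals the first-quadrant indicator only under the convention $\theta(0) = 1$; with $\theta(0) = 0$ as defined above, it instead equals 1 unless both $x_1 < 0$ and $x_2 < 0$.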

General $(n+1)$-Dimensional Sector

$G(x) = \theta\left(\sum_{i=0}^n \theta(x_i) - n\right)$
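
The interval indicator and the general sector formula are easy to evaluate directly. This is a minimal sketch with the convention $\theta(0) = 0$; the function names are chosen here for illustration.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def interval_indicator(x):
    """theta(x - 2) - theta(x - 3): equals 1 for 2 < x <= 3, else 0."""
    return theta(x - 2) - theta(x - 3)

def sector(xs):
    """General sector for n + 1 coordinates: 1 iff every coordinate is positive."""
    n = len(xs) - 1
    return theta(sum(theta(x) for x in xs) - n)

print([interval_indicator(x) for x in (1.5, 2.5, 3.5)])
print(sector([1, 2, 3]), sector([1, -2, 3]))
```

With two coordinates (`n = 1`) the sector function reduces to the AND gate $\theta(\theta(x_1) + \theta(x_2) - 1)$.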

Experimental Setup

TensorFlow Implementation

The authors used TensorFlow for practical computation but noted the gap between theory and practice:

  1. Activation function selection: Used the sigmoid function $\frac{1}{1+\exp(-20x)}$ to approximate the Heaviside function
  2. Training strategy: Employed stochastic gradient descent with one training sample per step
  3. Network architecture: Tested single-layer and two-layer network structures
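
To see how closely the steep sigmoid tracks the step function, the sketch below compares the two at a few points. It uses only the standard library rather than TensorFlow, and the sample points are chosen here for illustration.

```python
import math

def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def steep_sigmoid(x, k=20.0):
    """Smooth surrogate 1 / (1 + exp(-k x)); k = 20 as quoted above.
    Note: math.exp overflows for very large negative k * x, so this
    simple form is only meant for moderate inputs."""
    return 1.0 / (1.0 + math.exp(-k * x))

for x in (-1.0, -0.1, 0.0, 0.1, 1.0):
    print(f"x={x:+.1f}  theta={theta(x)}  sigmoid={steep_sigmoid(x):.6f}")
```

Away from the origin the two agree to many digits, while within a distance of roughly $1/k$ of the threshold the sigmoid takes intermediate values, with exactly 0.5 at the threshold itself.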

Experimental Configuration

  • Network nodes: 10-node single-layer network
  • Training epochs: 2000 epochs
  • Optimizer: Adam optimizer
  • Loss function: Mean absolute percentage error

Experimental Results

Identity Function Learning

Experiments verified that the network can learn the Heaviside representation of the identity function. Figure 1 shows bias values converging from initial state (blue dots) to the expected linear arrangement (orange dots).

Quadratic Function Mapping

In learning the mapping $f(b,c) = b^2 + c$:

  • Two-layer network (3 and 30 nodes)
  • 40 training samples with domain $[0,2] \times [0,2]$
  • Good matching achieved after 4000 training epochs

Differences Between Heaviside and Smooth Functions

Experiments revealed that when parameters trained with the smooth sigmoid are substituted into a network of true Heaviside functions, the outputs differ significantly, particularly in the second layer.
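
One elementary source of such a mismatch can be illustrated on the single-layer identity representation. The following is not a reproduction of the paper's experiment: it simply evaluates the same set of biases (assumed here to be the ideal values $0, 1, \dots, 9$) once with the steep sigmoid and once with the hard step.

```python
import math

def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def steep_sigmoid(x, k=20.0):
    return 1.0 / (1.0 + math.exp(-k * x))

BIASES = list(range(10))  # ASSUMPTION: ideal biases for a 10-node layer

def hard_output(x):
    return sum(theta(x - b) for b in BIASES)

def soft_output(x):
    return sum(steep_sigmoid(x - b) for b in BIASES)

for x in (3, 3.5):
    print(x, hard_output(x), round(soft_output(x), 4))
```

At an input that coincides with a bias, the sigmoid node contributes 0.5 while the hard step contributes 0, so the two networks disagree by half a unit. Midway between biases they agree closely. In a deeper network such discrepancies are fed into further thresholds, which is one plausible reason the second layer is more sensitive.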

Related Work

The paper draws on research from the following related fields:

  1. Constructive mathematics: Viewing Heavisidisation as an enhancement of constructive mathematical methods
  2. Computational physics: Distinction from big data analysis and computational experiments
  3. Resultant theory: Connection with algebraic numbers and discriminant calculations
  4. Machine learning theory: Mathematical foundations of gradient descent methods

Conclusions and Discussion

Main Conclusions

  1. Feasibility of Heavisidisation: Demonstrated that many fundamental mathematical operations can be expressed as iterations of Heaviside functions
  2. Three classes of core problems:
    • A) Heavisidisation of various problems (constructive)
    • B) Discovery of algebraic formulas (conceptual)
    • C) Distinguishing reasonable from unreasonable answers (conceptual)

Limitations

  1. Gauge invariance problem: Multiple equivalent Heaviside representations exist, requiring selection of appropriate gauge
  2. Convergence issues: Gradient descent may not find the correct answer even when a Heaviside representation exists
  3. Need for human intervention: Practical applications still require substantial human expertise and techniques
  4. Smoothing effects: Function smoothing in numerical computation affects result accuracy
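
As a simple illustration of how distinct parameter sets can yield the same Heaviside representation, consider positive rescaling of a node's weight and bias. This example is chosen here for illustration and is not necessarily the gauge freedom the authors analyze; the names `neuron` and `rescaled_neuron` are hypothetical.

```python
def theta(x):
    """Heaviside step with theta(0) = 0."""
    return 1 if x > 0 else 0

def neuron(x, w, b):
    """Single Heaviside node theta(w * x + b)."""
    return theta(w * x + b)

def rescaled_neuron(x, w, b, lam):
    """Same node with weight and bias multiplied by lam > 0."""
    return theta(lam * w * x + lam * b)

points = [-3, -1.5, 0, 0.25, 2, 10]
same = all(
    neuron(x, 2.0, -1.0) == rescaled_neuron(x, 2.0, -1.0, lam)
    for x in points
    for lam in (0.1, 1.0, 7.5)
)
print(same)
```

Because the output depends only on the sign of the argument, the loss is flat along such rescaling directions when true step functions are used, so a convention has to be chosen to pin the parameters down.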

Future Directions

  1. Heavisidisation of higher-degree equations: Extension to cubic, quartic, and higher-order equations
  2. More complex algebraic structures: Exploration of Heaviside representations of discriminants, resultants, etc.
  3. Mechanization of scientific taste: Investigation of whether machines can develop scientific aesthetics similar to humans

In-Depth Evaluation

Strengths

  1. Conceptual innovation: Introduces the novel concept of "Heavisidisation," opening new perspectives for applying machine learning to science
  2. Theoretical depth: Systematically constructs an operational system of Heaviside functions from mathematical foundations
  3. Interdisciplinary perspective: Organically combines machine learning, mathematical physics, and constructive mathematics
  4. Practical verification: TensorFlow experiments validate theoretical feasibility

Weaknesses

  1. Limited application scope: Currently handles only relatively simple mathematical problems, far from genuine scientific discovery
  2. Computational complexity: Heaviside representations often require infinite series, necessitating truncation in practical computation
  3. Lack of convergence guarantees: No theoretical guarantees provided for gradient descent convergence to correct solutions
  4. Blurred human-machine boundaries: Experiments still require substantial human intervention, failing to achieve true automation

Impact

  1. Theoretical contribution: Provides new perspectives on mathematical foundations of machine learning
  2. Methodological value: Heavisidisation methods may inspire solutions to other scientific computing problems
  3. Philosophical significance: Touches on the deep question of whether artificial intelligence can possess scientific creativity

Applicable Scenarios

  1. Symbolic computation: Suitable for mathematical problems requiring precise symbolic representation
  2. Constructive proofs: Applicable to mathematical proofs requiring constructive methods
  3. Science education: Can serve as a teaching tool for understanding mathematical foundations of machine learning

Technical Innovation Points

Key Innovations

  1. Iterated Heaviside representation: Decomposing complex functions into combinations of simple step functions
  2. Operationalization of networks: Converting traditional mathematical operations into forms processable by neural networks
  3. Zero-point detection algorithm: Providing systematic methods for detecting function zeros on discrete grids
  4. Application of gauge theory: Introducing the concept of gauge invariance from physics into machine learning

Mathematical Framework

The paper establishes a complete hierarchical structure from basic Heaviside functions to complex mathematical operations: $\text{Heaviside} \rightarrow \text{Logical Operations} \rightarrow \text{Arithmetic Operations} \rightarrow \text{Algebraic Operations} \rightarrow \text{Scientific Problems}$

This layered construction provides a systematic mathematical foundation for machine learning to address scientific problems.

References

The paper cites the following important literature:

  1. Gelfand, Kapranov, Zelevinsky: "Discriminants, Resultants, and Multidimensional Determinants"
  2. Dolotin, Morozov: "Introduction to Non-Linear Algebra"
  3. Morozov, Shakirov: "New and Old Results in Resultant Theory"
  4. Ruelle: "Post-human Mathematics"

Overall Assessment: This is a highly original paper that attempts to establish new mathematical foundations for applying machine learning to science. The results are preliminary, but the Heavisidisation concept and methodology have clear theoretical value and may stimulate further work. Its interdisciplinary scope and its attention to philosophical questions about artificial intelligence give it value beyond the purely technical level.