The Splendors and Miseries of Heavisidisation

Dolotin, Morozov

Machine Learning (ML) is applicable to scientific problems, i.e. to those which have a well defined answer, only if this answer can be brought to a peculiar form ${\cal G}: X\longrightarrow Z$ with ${\cal G}(\vec x)$ expressed as a combination of iterated Heaviside functions. At present it is far from obvious, if and when such representations exist, what are the obstacles and, if they are absent, what are the ways to convert the known formulas into this form. This gives rise to a program of reformulation of ordinary science in such terms -- which sounds like a strong enhancement of the constructive mathematics approach, only this time it concerns all natural sciences. We describe the first steps on this long way.

Basic Information

Paper ID: 2205.07377
Title: The Splendors and Miseries of Heavisidisation
Authors: V. Dolotin, A. Morozov
Institution: MIPT, ITEP & IITP, Moscow, Russia
Classification: hep-th (High Energy Physics Theory), cs.LG (Machine Learning)
Publication Date: May 15, 2022
Paper Link: https://arxiv.org/abs/2205.07377

Abstract

Machine learning (ML) is applicable to scientific problems, i.e. problems with a well-defined answer, only if that answer can be brought to the form of a mapping $G: X \rightarrow Z$ in which $G(\vec{x})$ is expressed as a combination of iterated Heaviside functions. It remains unclear when such representations exist, what the obstacles are, and, when there are none, how to convert known formulas into this form. This motivates a program of reformulating ordinary science in these terms, which resembles a strengthened version of the constructive mathematics approach, this time extended to all natural sciences. The paper describes the first steps along this long road.

Research Background and Motivation

Problem Statement

The core problem addressed in this paper is: How can machine learning methods be effectively applied to scientific problems with definite answers? The authors point out that traditional machine learning is primarily used for classification problems (such as image recognition and decision problems), but extending it to genuine scientific problems faces fundamental obstacles.

Problem Significance

The importance of this problem lies in:

Revolutionary needs in scientific computing: Extending machine learning from big data analysis and computational experiments to genuine scientific discovery
Enhancement of constructive mathematics: Providing a framework for reformulating all natural sciences in a constructive manner
Bridge between artificial intelligence and science: Exploring whether machines can discover and understand scientific laws

Limitations of Existing Approaches

Limitations of gradient descent methods: Gradient descent, the workhorse of current ML, applies only to functions written in specific parameterized representations
Specificity of scientific problems: Scientific problems have "objective" answers, unlike general pattern recognition problems
Constraints on representation form: Scientific formulas must be converted into the form of iterated Heaviside functions

Core Contributions

Introduced the concept of "Heavisidisation": A systematic method for representing scientific problem answers as combinations of iterated Heaviside functions
Established Heaviside representations of basic operations: Including logical operations, arithmetic operations, zero-point detection, and other fundamental building blocks
Explored Heavisidisation of algebraic numbers: Attempted to convert problems such as solving quadratic equations into Heaviside function representations
Analyzed applicability of gradient descent methods: Investigated convergence properties of machine learning algorithms under Heaviside representations
Revealed gauge invariance issues: Discovered and analyzed the gauge freedom problem in the Heavisidisation process

Detailed Methodology

Task Definition

Input: Scientific problems with definite answers, expressed as a mapping $G: X \rightarrow Z$
Output: A representation of this mapping as a combination of iterated Heaviside functions
Constraints: Must use a parameterized form optimizable by gradient descent methods

Basic Properties of the Heaviside Function

The authors define the Heaviside function as: $\theta(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases}$

Key properties:

Idempotence: $\theta(\theta(x)) = \theta(x)$
Logical operation implementation:
- AND: $\wedge(a,b) := \theta(\theta(a) + \theta(b) - 1)$
- OR: $\vee(a,b) := \theta(\theta(a) + \theta(b))$
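
The gates above can be checked directly. The following is a minimal sketch in plain Python, assuming the convention $\theta(0) = 0$ from the definition; the names `theta`, `and_`, and `or_` are illustrative and not taken from the paper.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def and_(a, b):
    # Fires only when theta(a) and theta(b) are both 1: the inner sum is then 2,
    # and 2 - 1 = 1 > 0. With a single 1 the argument is exactly 0, so theta gives 0.
    return theta(theta(a) + theta(b) - 1)

def or_(a, b):
    # Fires when at least one of theta(a), theta(b) is 1.
    return theta(theta(a) + theta(b))

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
idempotent = all(theta(theta(x)) == theta(x) for x in samples)

# Truth tables, using a negative input for "false" and a positive one for "true".
and_table = {(a, b): and_(a, b) for a in (-1, 1) for b in (-1, 1)}
or_table = {(a, b): or_(a, b) for a in (-1, 1) for b in (-1, 1)}
```

Note that the AND gate depends on the choice $\theta(0) = 0$: with $\theta(0) = 1$ the same expression would behave like OR.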

Heavisidisation of Basic Operations

1. Identity Function

For integer $x$ : $x = I(x) := \sum_{i=0}^{\infty} \theta(x-i) - \sum_{i=0}^{\infty} \theta(-x-i)$
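
In practice the infinite sums have to be truncated. The sketch below assumes a hypothetical `cutoff` parameter (not from the paper) and shows that the truncated $I(x)$ reproduces integers only up to that cutoff, saturating beyond it.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def identity(x, cutoff=100):
    """Truncated I(x): exact for integers with |x| <= cutoff (cutoff is an assumption)."""
    positive = sum(theta(x - i) for i in range(cutoff))   # counts i = 0..x-1 when x > 0
    negative = sum(theta(-x - i) for i in range(cutoff))  # counts i = 0..|x|-1 when x < 0
    return positive - negative

exact_inside = all(identity(x) == x for x in range(-100, 101))
saturated = identity(150)  # beyond the cutoff the sum saturates at cutoff
```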

2. Addition

For non-negative integers $x, y$, where the subtracted sums in $I$ vanish: $x + y = I(x) + I(y) = \sum_{i=0}^{\infty} \theta(x-i) + \sum_{j=0}^{\infty} \theta(y-j)$

3. Multiplication

For non-negative integers $x, y$: $x \cdot y = \sum_{i,j \geq 0} \theta(\theta(x-i) + \theta(y-j) - 1) = \sum_{i,j \geq 0} \wedge(x-i, y-j)$
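
The addition and multiplication formulas can be verified numerically for non-negative integers. This is a minimal sketch; the truncation parameter `cutoff` and the function names are assumptions made for illustration.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def step_count(x, cutoff):
    # Sum of theta(x - i) for i = 0..cutoff-1; equals x for integers 0 <= x <= cutoff.
    return sum(theta(x - i) for i in range(cutoff))

def add(x, y, cutoff=50):
    return step_count(x, cutoff) + step_count(y, cutoff)

def multiply(x, y, cutoff=50):
    # Each term is AND(x - i, y - j): it fires for the x * y pairs with i < x and j < y.
    return sum(
        theta(theta(x - i) + theta(y - j) - 1)
        for i in range(cutoff)
        for j in range(cutoff)
    )

pairs = [(x, y) for x in range(8) for y in range(8)]
sums_ok = all(add(x, y) == x + y for x, y in pairs)
products_ok = all(multiply(x, y) == x * y for x, y in pairs)
```

The double sum makes the cost of multiplication grow with the square of the cutoff, which is one concrete face of the truncation issue.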

4. Root Extraction

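$x^{1/n} = \sum_{i=0}^{\infty} \theta(x - i^n)$ for $x \geq 0$. With the convention $\theta(0) = 0$, the sum counts the integers $i \geq 0$ with $i^n < x$, so it is exact when $x$ is a perfect $n$-th power and rounds up to the next integer otherwise.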
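
A short numerical check of the root formula, again with a hypothetical `cutoff` for the truncated sum. With $\theta(0) = 0$ the sum counts the integers $i \geq 0$ satisfying $i^n < x$, so the result is exact on perfect powers and is the ceiling of the root elsewhere.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def root(x, n, cutoff=1000):
    """Truncated sum of theta(x - i**n); cutoff is an illustrative assumption."""
    return sum(theta(x - i**n) for i in range(cutoff))

perfect_square = root(9, 2)    # i = 0, 1, 2 contribute; i = 3 gives theta(0) = 0
perfect_cube = root(27, 3)
non_perfect = root(10, 2)      # i = 0, 1, 2, 3 contribute, so the result rounds up
```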

Zero-Point Detection Methods

One-Dimensional Case

For detecting zeros of function $f(x)$ between grid points $i$ and $i+1$ : $\delta_i(f) := \vee(\theta(f_{i+1}) - \theta(f_i), \theta(f_i) - \theta(f_{i+1}))$
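
The detector $\delta_i(f)$ is an exclusive-or on the signs of neighbouring grid values. The sketch below applies it to a sample function on a uniform grid; the function, the grid, and the helper names are illustrative choices, not taken from the paper.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def or_(a, b):
    return theta(theta(a) + theta(b))

def sign_change(f_left, f_right):
    # Exactly one of the two differences is +1 when theta changes between the points.
    return or_(theta(f_right) - theta(f_left), theta(f_left) - theta(f_right))

def detect_zeros(f, grid):
    values = [f(x) for x in grid]
    return [sign_change(values[i], values[i + 1]) for i in range(len(values) - 1)]

grid = [i / 4 for i in range(13)]  # 0, 0.25, ..., 3
flags = detect_zeros(lambda x: (x - 0.6) * (x - 2.1), grid)
cells = [(grid[i], grid[i + 1]) for i, flag in enumerate(flags) if flag]
```

Because only sign changes are seen, a cell containing an even number of zeros (or a tangential zero) goes undetected, so the grid must be fine enough to separate the roots.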

Two-Dimensional Case

Detecting common zeros of functions $f,g$ within a square region: $\delta_{ij}(f,g) = \wedge(\delta_{ij}(f), \delta_{ij}(g))$

Zero-point location approximation: $\left(\sum_{ij} \frac{i}{N}\delta_{ij}(f,g), \sum_{ij} \frac{j}{N}\delta_{ij}(f,g)\right)$
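
A possible reading of the two-dimensional detector is sketched below. The summary does not spell out $\delta_{ij}(f)$ for a square cell, so the code assumes it equals 1 when $f$ takes both signs on the four corners of cell $(i,j)$; that definition, the unit-square grid with $N$ cells per side, and all function names are assumptions.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def and_(a, b):
    return theta(theta(a) + theta(b) - 1)

def cell_sign_change(h, i, j, n):
    # Assumed definition: 1 when h is positive on some, but not all, corners of the cell.
    positives = sum(theta(h(a / n, b / n)) for a in (i, i + 1) for b in (j, j + 1))
    return and_(positives, 4 - positives)

def common_zero_estimate(f, g, n):
    x_est = y_est = 0.0
    for i in range(n):
        for j in range(n):
            delta = and_(cell_sign_change(f, i, j, n), cell_sign_change(g, i, j, n))
            x_est += (i / n) * delta
            y_est += (j / n) * delta
    return x_est, y_est

# Common zero at (0.33, 0.71); the estimate is the lower-left corner of its cell.
estimate = common_zero_estimate(lambda x, y: x - 0.33, lambda x, y: y - 0.71, 10)
```

The sums give a meaningful location only when exactly one cell fires. A cell crossed by both zero sets can fire even if the curves do not meet inside it, so the test is a necessary condition rather than a sufficient one.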

Sector Functions and Classification Problems

One-Dimensional Sector

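Characteristic function of the interval from 2 to 3: $G(x) = \theta(x-2) - \theta(x-3)$. With the convention $\theta(0) = 0$ this equals 1 exactly on the half-open interval $(2,3]$.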

Two-Dimensional Sector

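Characteristic function of the open first quadrant ($x_1 > 0$, $x_2 > 0$): $G(x_1,x_2) = \theta(\theta(x_1) + \theta(x_2) - 1) = \wedge(x_1, x_2)$, which is the $n = 1$ case of the general sector formula. The complemented expression $1 - \theta(\theta(-x_1) + \theta(-x_2) - 1)$ is a different function: it equals 1 everywhere except where both $x_1 < 0$ and $x_2 < 0$.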

General $(n+1)$ -Dimensional Sector

$G(x) = \theta\left(\sum_{i=0}^n \theta(x_i) - n\right)$
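
The sector functions are easy to evaluate directly. This minimal sketch uses illustrative helper names and the convention $\theta(0) = 0$, under which the interval indicator is 1 for $2 < x \leq 3$ and the sector indicator requires every coordinate to be strictly positive.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def interval_indicator(x, lo, hi):
    # theta(x - lo) - theta(x - hi): 1 for lo < x <= hi, 0 elsewhere.
    return theta(x - lo) - theta(x - hi)

def positive_sector(xs):
    # For n + 1 coordinates the inner sum exceeds n only when all thetas equal 1.
    n = len(xs) - 1
    return theta(sum(theta(x) for x in xs) - n)

inside = positive_sector([1.0, 2.0, 3.0])
outside = positive_sector([1.0, -2.0, 3.0])
on_boundary = positive_sector([0.0, 1.0])  # a zero coordinate is excluded
```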

Experimental Setup

TensorFlow Implementation

The authors used TensorFlow for practical computation but noted the gap between theory and practice:

Activation function selection: Used sigmoid function $\frac{1}{1+\exp(-20x)}$ to approximate the Heaviside function
Training strategy: Employed stochastic gradient descent with one training sample per step
Network architecture: Tested single-layer and two-layer network structures
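
To get a feel for the quality of the sigmoid approximation, the sketch below compares $\frac{1}{1+\exp(-20x)}$ with the Heaviside function at a few points. It uses only the Python standard library rather than TensorFlow, and the sample points are arbitrary.

```python
import math

def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def steep_sigmoid(x, k=20.0):
    """Smooth stand-in for theta; k = 20 matches the slope quoted above."""
    return 1.0 / (1.0 + math.exp(-k * x))

# Absolute gap between the smooth and the sharp step at sample points.
gaps = {x: abs(steep_sigmoid(x) - theta(x)) for x in (-1.0, -0.1, 0.0, 0.1, 1.0)}
```

The gap is negligible far from the threshold, about 0.12 at $|x| = 0.1$, and exactly 0.5 at $x = 0$, so the approximation is weakest precisely where the step happens.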

Experimental Configuration

Network nodes: 10-node single-layer network
Training epochs: 2000 epochs
Optimizer: Adam optimizer
Loss function: Mean absolute percentage error

Experimental Results

Identity Function Learning

Experiments verified that the network can learn the Heaviside representation of the identity function. Figure 1 shows the bias values converging from their initial state (blue dots) to the expected linear arrangement (orange dots).

Quadratic Function Mapping

In learning the mapping $f(b,c) = b^2 + c$ :

Two-layer network (3 and 30 nodes)
40 training samples with domain $[0,2] \times [0,2]$
Good matching achieved after 4000 training epochs

Differences Between Heaviside and Smooth Functions

Experiments revealed that parameters trained with the smooth sigmoid give significantly different results when substituted into true Heaviside functions, particularly in the second layer of the network.
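
One mechanism that could produce such a discrepancy can be seen in a two-level expression as simple as the AND gate. The sketch below is an illustration constructed for this summary, not a reproduction of the paper's experiment: it evaluates $\theta(\theta(a) + \theta(b) - 1)$ once with the true step and once with the steep sigmoid.

```python
import math

def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def steep_sigmoid(x, k=20.0):
    return 1.0 / (1.0 + math.exp(-k * x))

def and_gate(a, b, activation):
    # Two nested activations: an inner layer on a and b, an outer layer on their sum.
    return activation(activation(a) + activation(b) - 1)

hard = and_gate(1.0, -1.0, theta)          # outer argument is exactly 0, so theta gives 0
soft = and_gate(1.0, -1.0, steep_sigmoid)  # outer argument is ~0, so the sigmoid gives ~0.5
both_true_soft = and_gate(1.0, 1.0, steep_sigmoid)
```

Inner-layer outputs that are almost exactly 0 or 1 place the outer argument right at the threshold, where the sigmoid returns about 0.5 while the Heaviside function returns 0.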

Related Work

The paper cites research from the following related fields:

Constructive mathematics: Viewing Heavisidisation as an enhancement of constructive mathematical methods
Computational physics: Distinction from big data analysis and computational experiments
Resultant theory: Connection with algebraic numbers and discriminant calculations
Machine learning theory: Mathematical foundations of gradient descent methods

Conclusions and Discussion

Main Conclusions

Feasibility of Heavisidisation: Demonstrated that many fundamental mathematical operations can be expressed as iterations of Heaviside functions
Three classes of core problems:
- A) Heavisidisation of various problems (constructive)
- B) Discovery of algebraic formulas (conceptual)
- C) Distinguishing reasonable from unreasonable answers (conceptual)

Limitations

Gauge invariance problem: Multiple equivalent Heaviside representations exist, so an appropriate gauge must be selected
Convergence issues: Gradient descent may not find the correct answer even when a Heaviside representation exists
Need for human intervention: Practical applications still require substantial human expertise and techniques
Smoothing effects: Function smoothing in numerical computation affects result accuracy
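
A simple source of such redundancy is that $\theta(\lambda x) = \theta(x)$ for any $\lambda > 0$, so distinct parameter values can define the same function. The sketch below illustrates this with a single Heaviside neuron and with two thresholds for the AND gate; it is offered as an example of the phenomenon, and the paper's own notion of gauge freedom may be broader.

```python
def theta(x):
    """Heaviside step with the convention theta(0) = 0."""
    return 1 if x > 0 else 0

def neuron(x, weight, bias):
    return theta(weight * x + bias)

inputs = [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]
reference = [neuron(x, 1.0, -0.5) for x in inputs]
rescaled = [neuron(x, 7.0, -3.5) for x in inputs]  # weight and bias both scaled by 7

def and_with_threshold(a, b, threshold):
    return theta(theta(a) + theta(b) - threshold)

# Any threshold in [1, 2) separates "both inputs fire" from the other cases.
signs = [(a, b) for a in (-1, 1) for b in (-1, 1)]
same_gate = all(
    and_with_threshold(a, b, 1.0) == and_with_threshold(a, b, 1.5) for a, b in signs
)
```

Since the loss is flat along these directions, gradient descent has no preference among equivalent parameter sets, which is one reason a gauge choice is needed.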

Future Directions

Heavisidisation of higher-degree equations: Extension to cubic, quartic, and higher-order equations
More complex algebraic structures: Exploration of Heaviside representations of discriminants, resultants, etc.
Mechanization of scientific taste: Investigation of whether machines can develop scientific aesthetics similar to humans

In-Depth Evaluation

Strengths

Conceptual innovation: Introduces the novel concept of "Heavisidisation," opening new perspectives for applying machine learning to science
Theoretical depth: Systematically constructs an operational system of Heaviside functions from mathematical foundations
Interdisciplinary perspective: Organically combines machine learning, mathematical physics, and constructive mathematics
Practical verification: TensorFlow experiments validate theoretical feasibility

Weaknesses

Limited application scope: Currently handles only relatively simple mathematical problems, far from genuine scientific discovery
Computational complexity: Heaviside representations often require infinite series, necessitating truncation in practical computation
Lack of convergence guarantees: No theoretical guarantees provided for gradient descent convergence to correct solutions
Blurred human-machine boundaries: Experiments still require substantial human intervention, failing to achieve true automation

Impact

Theoretical contribution: Provides new perspectives on mathematical foundations of machine learning
Methodological value: Heavisidisation methods may inspire solutions to other scientific computing problems
Philosophical significance: Touches on the deep question of whether artificial intelligence can possess scientific creativity

Applicable Scenarios

Symbolic computation: Suitable for mathematical problems requiring precise symbolic representation
Constructive proofs: Applicable to mathematical proofs requiring constructive methods
Science education: Can serve as a teaching tool for understanding mathematical foundations of machine learning

Technical Innovation Points

Key Innovations

Iterated Heaviside representation: Decomposing complex functions into combinations of simple step functions
Operationalization of networks: Converting traditional mathematical operations into forms processable by neural networks
Zero-point detection algorithm: Providing systematic methods for detecting function zeros on discrete grids
Application of gauge theory: Introducing the concept of gauge invariance from physics into machine learning

Mathematical Framework

The paper establishes a complete hierarchical structure from basic Heaviside functions to complex mathematical operations: $\text{Heaviside} \rightarrow \text{Logical Operations} \rightarrow \text{Arithmetic Operations} \rightarrow \text{Algebraic Operations} \rightarrow \text{Scientific Problems}$

This layered construction provides a systematic mathematical foundation for machine learning to address scientific problems.

References

The paper cites the following important literature:

Gelfand, Kapranov, Zelevinsky: "Discriminants, Resultants, and Multidimensional Determinants"
Dolotin, Morozov: "Introduction to Non-Linear Algebra"
Morozov, Shakirov: "New and Old Results in Resultant Theory"
Ruelle: "Post-human Mathematics"

Overall Assessment: This is a highly original and theoretically profound paper that attempts to establish new mathematical foundations for applying machine learning to science. While current results are preliminary, the proposed Heavisidisation concept and methodology possess significant theoretical value and inspirational significance. The paper's interdisciplinary nature and consideration of philosophical questions about artificial intelligence grant it academic value transcending the technical level.