Thought Flow Nets: From Single Predictions to Trains of Model Thought
Schuff, Adel, Vu
When humans solve complex problems, they typically create a sequence of ideas (involving an intuitive decision, reflection, error correction, etc.) in order to reach a conclusive decision. Contrary to this, today's models are mostly trained to map an input to one single and fixed output. In this paper, we investigate how we can give models the opportunity of a second, third and $k$-th thought. Taking inspiration from Hegel's dialectics, we propose the concept of a thought flow which creates a sequence of predictions. We present a self-correction mechanism that is trained to estimate the model's correctness and performs iterative prediction updates based on the correctness prediction's gradient. We introduce our method using the example of question answering and conduct extensive experiments that demonstrate (i) our method's ability to correct its own predictions and (ii) its potential to notably improve model performance. In addition, we conduct a qualitative analysis of thought flow correction patterns and explore how thought flow predictions affect human users within a crowdsourcing study. We find that (iii) thought flows enable improved user performance and are perceived as more natural, correct, and intelligent than single and/or top-3 predictions.
When humans solve complex problems, they typically work through a series of thoughts (intuitive decisions, reflections, error corrections, and so on) before reaching a final decision. In contrast, contemporary models are mostly trained to map each input to a single, fixed output. This paper investigates how to give models a second, third, or k-th opportunity to think. Inspired by Hegelian dialectics, the authors propose the concept of a "thought flow," which produces a sequence of predictions rather than one. The paper presents a self-correction mechanism that is trained to estimate the model's correctness and that iteratively updates the prediction along the gradient of that correctness estimate.
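The update loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the correctness estimator here is a hypothetical sigmoid-of-linear scorer (`correctness`, with made-up weights `w` and bias `b`) standing in for the trained estimator, the updates are assumed to be applied to the logits, and the step size and number of steps are arbitrary choices for illustration.

```python
import math

def softmax(z):
    """Turn logits into a probability distribution."""
    peak = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def correctness(probs, w, b):
    """Hypothetical stand-in estimator: sigmoid of a linear score of the prediction."""
    score = sum(wi * pi for wi, pi in zip(w, probs)) + b
    return 1.0 / (1.0 + math.exp(-score))

def correctness_grad(z, w, b):
    """Analytic gradient of the correctness estimate with respect to the logits z."""
    p = softmax(z)
    c = correctness(p, w, b)
    mean_w = sum(wi * pi for wi, pi in zip(w, p))
    # Chain rule through the sigmoid and the softmax Jacobian.
    return [c * (1.0 - c) * pj * (wj - mean_w) for wj, pj in zip(w, p)]

def thought_flow(z, w, b, step=5.0, k=3):
    """Return k + 1 predictions: the initial one followed by k gradient-ascent updates."""
    flow = [softmax(z)]
    for _ in range(k):
        g = correctness_grad(z, w, b)
        z = [zj + step * gj for zj, gj in zip(z, g)]
        flow.append(softmax(z))
    return flow

# Hypothetical numbers: the initial prediction favors position 0,
# while the stand-in estimator rates position 2 as most likely correct.
logits = [1.0, 0.0, 0.8]
w = [-2.0, 0.0, 2.0]
flow = thought_flow(logits, w, b=0.0)
scores = [correctness(p, w, 0.0) for p in flow]
```

Each element of `flow` is one "thought": the first is the model's original prediction, and each later one is the prediction after another step toward higher estimated correctness. In this toy setup the most probable position moves from 0 to 2 over the flow.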
Traditional machine learning models follow a single-step prediction paradigm (x → ŷ): they map each input directly to one fixed output and lack the reflection and self-correction found in human cognition. This limits them on complex tasks such as question answering and multi-step reasoning.
Human Cognition Inspiration: When solving problems, humans go through complex thought processes that include initial judgment, reflection, hypothesis comparison, and contradiction resolution
Philosophical Theoretical Foundation: The three stages of Hegelian dialectics provide a theoretical framework for iterative improvement in machine learning
Practical Necessity: As task complexity increases, learning iterative self-correction may be easier than learning to directly hit correct predictions
Using extractive question answering as an example, given a question and a context of L tokens, the model must predict the start and end positions of the answer. Traditional methods output two probability distributions: ŷ_start ∈ [0,1]^L and ŷ_end ∈ [0,1]^L.
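As a rough illustration of this output format, the sketch below turns per-token logits into the two distributions and reads off an answer span. The logits, the `max_len` limit, and the span-scoring rule (product of start and end probabilities) are assumptions chosen for illustration, following common practice in extractive question answering rather than anything stated in the paper.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over token positions."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def best_span(start_probs, end_probs, max_len=10):
    """Return (start, end) maximizing start_probs[s] * end_probs[e] with s <= e."""
    best, best_score = (0, 0), -1.0
    for s, p_s in enumerate(start_probs):
        for e in range(s, min(s + max_len, len(end_probs))):
            score = p_s * end_probs[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

# Hypothetical logits for a context of L = 5 tokens.
start_logits = [0.1, 2.0, 0.3, -1.0, 0.0]
end_logits = [0.0, 0.2, 0.1, 2.5, -0.5]

y_start = softmax(start_logits)  # each entry in [0, 1], entries sum to 1
y_end = softmax(end_logits)
span = best_span(y_start, y_end)  # token indices of the predicted answer
```

With these numbers the start distribution peaks at token 1 and the end distribution at token 3, so the predicted answer covers tokens 1 through 3.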
The paper cites important works across multiple domains, including:
Philosophical literature on Hegelian dialectics
Cognitive science and neuroscience research
Machine learning methods for confidence estimation and model correction
Related work on sequential prediction and iterative optimization
Overall Assessment: This is a highly innovative paper that combines philosophical theory with modern machine learning techniques and proposes the practically useful concept of a thought flow. Although the stopping mechanism still needs improvement, the pioneering approach and convincing experimental results make it an important contribution to the field.