Dyck Words, Pattern Avoidance, and Automatic Sequences
Mol, Rampersad, Shallit
We study various aspects of Dyck words appearing in binary sequences, where $0$ is treated as a left parenthesis and $1$ as a right parenthesis. We show that binary words that are $7/3$-power-free have bounded nesting level, but this no longer holds for larger repetition exponents. We give an explicit characterization of the factors of the Thue-Morse word that are Dyck, and show how to count them. We also prove tight upper and lower bounds on $f(n)$, the number of Dyck factors of Thue-Morse of length $2n$.
academic
Dyck Words, Pattern Avoidance, and Automatic Sequences
This paper investigates various properties of Dyck words in binary sequences, where 0 is interpreted as a left parenthesis and 1 as a right parenthesis. The research demonstrates that 7/3-power-free binary words possess bounded nesting depth, though this property no longer holds for larger repetition exponents. The paper provides an explicit characterization of Dyck factors in the Thue-Morse word and demonstrates how to compute their quantities. Furthermore, tight upper and lower bounds are established for the number f(n) of Dyck factors of length 2n in the Thue-Morse word.
The core problem under investigation is understanding the structure and properties of Dyck word factors in infinite binary sequences. Dyck words are fundamental concepts in formal language theory, representing balanced parenthesis strings with important applications in computer science and mathematics.
Theoretical Importance: The Dyck language is a canonical example of context-free languages. Studying its distribution in automatic sequences helps illuminate the deep connections between formal language theory and automata theory.
Combinatorial Value: Pattern avoidance and power avoidance are central research directions in combinatorics on words. This research combines these concepts with Dyck words.
Computational Applications: Automatic sequences have broad applications in algorithms and computational complexity theory. Understanding the properties of their Dyck factors has practical significance.
Relationship Between Power Avoidance and Nesting Depth: Proves that 7/3-power-free Dyck words have nesting depth at most 3, while there exist 7/3⁺-power-free Dyck words with arbitrarily large nesting depth.
Characterization of Dyck Factors in Thue-Morse Words: Provides a complete characterization of all Dyck factors in the Thue-Morse sequence: factors of the form h(x), where x is a factor of a certain ternary sequence s.
General Theory for Automatic Sequences: Establishes a decidability framework for Dyck factors in run-length and synchronized automatic sequences.
Precise Counting Results: Proves tight upper and lower bounds for the number d(n) of Dyck factors of length 2n in the Thue-Morse sequence: d(n) ≤ n and d(n) ≥ n/2.
Given a binary word w = w1..n, if interpreting 0 as a left parenthesis and 1 as a right parenthesis yields a balanced parenthesis string, then w is called a Dyck word. Formally, w is a Dyck word if and only if:
B(w) = |w|₀ - |w|₁ = 0 (balance condition)
For all prefixes w', B(w') ≥ 0 (non-negativity condition)
The nesting depth N(w) is defined as the maximum value of B(w') over all prefixes.
Utilizes the Walnut theorem prover for computational verification:
morphism f "0->00100110100110010110010011001011001101
1->00101100110100110110011010010110011011
2->00101101001101001011001101001011010011"
morphism g "0->022012 1->022112 2->202101"
Morphism Construction Technique: Designs special 6-uniform morphism g and 38-uniform morphism f to achieve precise control of nesting depth.
Synchronized Sequence Theory: Extends run-length and synchronization concepts to Dyck language analysis, establishing a decidability framework.
Linear Representation Minimization: Uses the Schützenberger algorithm to reduce the linear representation rank for Thue-Morse Dyck factor counting from 29 to 7.
Critical Exponent Phenomenon: 7/3 is the critical exponent for the boundedness of nesting depth in Dyck words, reflecting the profound connection between power avoidance and structural complexity.
Universality of Automatic Sequences: Run-length and synchronization properties provide a unified framework for studying Dyck factors in automatic sequences.
Precise Counting Theory: The counting of Dyck factors in the Thue-Morse sequence reveals the rich structure of k-regular sequences.
Theoretical Depth: Organically combines power avoidance, automatic sequences, and formal language theory, demonstrating profound theoretical expertise.
Methodological Innovation: Clever application of morphism construction and linear representation techniques, particularly precise control of nesting depth.
Computational Rigor: Extensive use of computer-assisted verification enhances the reliability of results.
Result Completeness: Provides a complete theoretical picture from existence to counting.
This paper cites important literature from formal language theory, combinatorics, and automata theory, including:
Chomsky & Schützenberger's theory of context-free languages
Thue's pioneering work on overlap-free words
Allouche & Shallit's theory of k-regular sequences
Berstel & Reutenauer's noncommutative rational series
Literature on modern computational tools such as Walnut
Overall Assessment: This is an excellent paper demonstrating both theoretical depth and technical innovation. It successfully combines concepts and methods from multiple mathematical branches to provide important contributions to understanding structural patterns in automatic sequences. While there are certain limitations in computational complexity and generalizability, its theoretical value and methodological significance are substantial.