2025-11-14T01:22:11.048448

Symmetry-Aware GFlowNets

Kim, Lee, Oh

Generative Flow Networks (GFlowNets) offer a powerful framework for sampling graphs in proportion to their rewards. However, existing approaches suffer from systematic biases due to inaccuracies in state transition probability computations. These biases, rooted in the inherent symmetries of graphs, impact both atom-based and fragment-based generation schemes. To address this challenge, we introduce Symmetry-Aware GFlowNets (SA-GFN), a method that incorporates symmetry corrections into the learning process through reward scaling. By integrating bias correction directly into the reward structure, SA-GFN eliminates the need for explicit state transition computations. Empirical results show that SA-GFN enables unbiased sampling while enhancing diversity and consistently generating high-reward graphs that closely match the target distribution.

academic

Symmetry-Aware GFlowNets

Basic Information

Paper ID: 2506.02685
Title: Symmetry-Aware GFlowNets
Authors: Hohyun Kim, Seunggeun Lee, Min-hwan Oh (Seoul National University)
Classification: stat.ML cs.LG
Conference: ICML 2025 (42nd International Conference on Machine Learning)
Paper Link: https://arxiv.org/abs/2506.02685

Abstract

Generative Flow Networks (GFlowNets) provide a powerful framework for sampling graphs proportionally to rewards. However, existing methods suffer from systematic biases due to inaccuracies in computing state transition probabilities. These biases are rooted in the inherent symmetries of graphs and affect both atom-based and fragment-based generation schemes. To address this challenge, this paper introduces Symmetry-Aware GFlowNets (SA-GFN), which incorporates symmetry corrections into the learning process through reward scaling. By directly integrating bias correction into the reward structure, SA-GFN eliminates the need for explicit state transition calculations. Experimental results demonstrate that SA-GFN achieves unbiased sampling while enhancing diversity and consistently generating high-reward graphs that closely match the target distribution.

Research Background and Motivation

Core Problem

GFlowNets face the equivalent action problem in graph generation tasks: different actions may lead to structurally identical graphs. For example, when adding a new node to a graph, actions connecting to two symmetric nodes, though distinct, produce isomorphic graphs. In such cases, state transition probabilities must account for all equivalent actions, but computing them is expensive.

Problem Significance

Bias in Molecular Generation: In molecular discovery, over 50% of molecules possess multiple symmetries, with 18% containing four or more symmetries. Ignoring symmetries leads to incorrect modeling and reduced accuracy in molecular structure generation.
Systematic Bias: The bias is systematic, biasing toward graphs with fewer symmetries in node generation and toward symmetric components in fragment generation.
Computational Complexity: Accurately computing state transition probabilities requires expensive graph isomorphism tests.

Limitations of Existing Methods

Ma et al. (2024) proposed using positional encodings to approximate detection of equivalent actions, but this requires application at each transition step, incurring large computational overhead and providing only approximate solutions.
Traditional GFlowNet objective functions (TB, DB, etc.) cannot avoid the equivalent action problem because they are based on state transition formalization.

Core Contributions

Theoretical Contribution: Provides rigorous formalization of autoregressive graph generation under the GFlowNet framework, explicitly addressing the equivalent action problem
Simple and Effective Solution: Proposes a reward scaling method based on the size of automorphism groups, requiring only minimal modifications to existing training algorithms
Unbiased Estimator: Derives an unbiased estimator for model likelihood
Experimental Validation: Verifies theoretical results through experiments, demonstrating the method's effectiveness in generating diverse high-reward samples

Methodology Details

Task Definition

Given a reward function R(x), the objective of GFlowNets is to train a policy p_A such that the sampling probability of terminal states is proportional to their rewards: p̄_A(x) = R(x)/Z, where Z is the normalization constant.

Core Theoretical Framework

1. Graph Isomorphism and Equivalence Relations

Graph Isomorphism: Two graphs G and G' are isomorphic (G ≅ G') if there exists a permutation π such that π(E) = E'
Automorphism Group: The automorphism group Aut(G) of graph G is the set of all permutations that preserve the graph structure
Orbit: The orbit of node u is Orb(G,u) = {v ∈ V : ∃π ∈ Aut(G), π(u) = v}

2. Formalization of Equivalent Actions

Definition 4.1 (Transition Equivalence): Graph transitions (G₁,G'₁) and (G₂,G'₂) are transition equivalent if G₁ ≅ G₂ and G'₁ ≅ G'₂.

Definition 4.2 (Orbit Equivalence): Graph actions (G₁,t₁,u₁) and (G₂,t₂,u₂) are orbit equivalent if they have the same action type and there exists a permutation π such that π(G₁) = G₂ and π(u₁) = u₂.

Theorem 4.3: Orbit equivalent actions lead to transition equivalent transitions.

3. Key Theoretical Results

Lemma 4.5: For AddEdge actions, $\frac{|\text{Orb}(G,u,v)|}{|\text{Orb}(G',u,v)|} = \frac{|\text{Aut}(G)|}{|\text{Aut}(G')|}$

Theorem 4.6 (Automorphism Correction): If permutation equivariant functions are used, then $\frac{p_{\bar{A}}(a|s)}{q_{\bar{A}}(a|s')} = \frac{|\text{Aut}(G)|}{|\text{Aut}(G')|} \cdot \frac{p_E(G'|G)}{q_E(G|G')}$

Symmetry-Aware Correction Method

1. Reward Scaling for Node Generation

Corollary 5.1 (TB Correction): The trajectory balance loss should be $L_{TB}(\tau) = \left(\log \frac{Z\prod_{t=0}^{n-1} p_E(G_{t+1}|G_t)}{|\text{Aut}(G_n)|R(G_n)\prod_{t=0}^{n-1} q_E(G_t|G_{t+1})}\right)^2$

Solution: Scale the reward as $\tilde{R}(G) = |\text{Aut}(G)|R(G)$

2. Correction for Fragment Generation

Theorem 5.3 (Fragment Correction): For a terminal state G generated by connecting k fragments {C₁,...,Cₖ}: $\tilde{R}(G) = \frac{|\text{Aut}(G)|R(G)}{\prod_{i=1}^k |\text{Aut}(C_i)|}$

3. Unbiased Estimator for Model Likelihood

$\bar{p}_A(x) = \mathbb{E}_{\tau \sim q_E(\tau|G_n)}\left[\frac{p_E(\tau)}{|\text{Aut}(G_n)|q_E(\tau|G_n)}\right]$

Technical Innovations

Theoretical Elegance: Simplifies complex transition-level corrections to reward scaling at terminal states
Computational Efficiency: Avoids expensive graph isomorphism tests at each step, requiring only one-time automorphism group size computation
Generality: Applicable to multiple GFlowNet objective functions including TB, DB, and FM
Exactness: Provides exact solutions rather than approximations

Experimental Setup

Datasets

Illustrative Examples: Initial state with 6 disconnected nodes, 112 terminal states
Synthetic Graphs: Heterogeneous graphs with up to 7 nodes, 72,296 terminal states
Molecular Generation:
- Atom-level: HOMO-LUMO gap prediction task
- Fragment-level: sEH target binding affinity prediction task

Evaluation Metrics

L1 Error: L1 error between target probability and model terminal probability
Diversity: Average Tanimoto distance between molecules
Top K Metrics: Diversity and rewards of top K high-reward molecules
Uniqueness: Proportion of unique molecules in generated samples

Comparison Methods

Vanilla GFlowNets: Without considering graph symmetry
Transition Correction: Identifying transition equivalent actions through multiple isomorphism tests
PE (Ma et al., 2024): Using positional encodings to approximate orbit equivalent action identification
Reward Scaling (This Work): Implementing correction through modified reward signals
Flow Scaling (This Work): Multiplying by symmetry ratios at each transition

Experimental Results

Main Results

1. Illustrative Experiments

Vanilla model's terminal probabilities cluster by |x|, showing clear bias
Reward Scaling achieves the same effect as Transition Correction
Estimated normalization constant Z: Reward Scaling yields 112 (ground truth), Vanilla yields 26,706

2. Synthetic Graph Experiments

TB Objective: Reward Scaling significantly reduces L1 error, performing comparably to Transition Correction
DB Objective: Reward Scaling converges more slowly but ultimately achieves the same precision
PE method, as an approximate solution, performs between Vanilla and exact methods

3. Molecular Generation Experiments

Atom-level Generation Results:

Diversity: 0.929→0.959 (Vanilla→Reward Scaling)
Uniqueness: 0.93→1.0

Fragment-level Generation Results:

Top K Rewards: 0.941→0.952 (Vanilla→Reward Scaling Exact)
Cyclohexane fragment usage: 5220→1042 (significant reduction in symmetric fragment overuse)

Ablation Studies

Approximate vs. Exact Correction: Approximate methods already show significant performance improvement
Different Objective Functions: Both TB and DB can be effectively corrected through reward scaling

Computational Efficiency Analysis

Automorphism computation time: 0.010ms on QM9 dataset, 0.022ms on ZINC250k dataset
Compared to neural network forward propagation during trajectory sampling, computational overhead is negligible
Training time comparison: Reward Scaling is approximately 15% faster than Transition Correction

Autoregressive Graph Generation

Adjacency Matrix Methods: Preserve node ordering information, less prone to equivalent action problems
Graph Sequence Methods: Easily produce equivalent actions, with problems becoming prominent when state transition probabilities are needed

GFlowNets

Existing objective functions (flow matching, detailed balance, trajectory balance, etc.) cannot avoid the equivalent action problem
Ma et al. (2024) first identified the problem but provided only approximate solutions

Graph Neural Network Expressiveness

Permutation equivariance causes nodes in the same orbit to have identical representations
Limited expressiveness may cause action representations from different orbits to overlap

Conclusions and Discussion

Main Conclusions

Theoretical Contribution: First rigorous analysis of the equivalent action problem in GFlowNets, proving it causes systematic bias
Practical Solution: Reward scaling provides a simple, exact, and efficient correction method
Broad Applicability: The method applies to atom-level and fragment-level generation, as well as multiple training objectives

Limitations

Action Design Dependency: Theoretical guarantees depend on specific predefined graph action sets
Task Specificity: Primarily validated on molecular discovery-related datasets
GNN Expressiveness: Limited GNN expressiveness may introduce additional bias

Future Directions

Explore tasks with different symmetry patterns and reward structures
Design GNN architectures with stronger expressiveness
Extend to more complex graph generation scenarios

In-Depth Evaluation

Strengths

Theoretical Rigor: Provides a complete mathematical framework and rigorous theoretical analysis
Method Simplicity: The solution is extremely simple, easy to implement and integrate
Practical Value: Demonstrates real-world effectiveness in important applications like molecular generation
Computational Efficiency: Avoids expensive online graph isomorphism tests
Strong Generality: Applicable to multiple GFlowNet training objectives

Weaknesses

Limited Experimental Scope: Primarily focused on molecular generation tasks, with limited validation on other graph generation tasks
Theoretical Assumptions: Depends on specific action design and GNN architecture assumptions
Approximate Methods: Approximate corrections for fragment generation lack theoretical guarantees
Scalability: Automorphism computation may become a bottleneck for very large graphs

Impact

Academic Value: Provides important supplement to GFlowNet theory, addressing fundamental problems
Practical Value: Direct contribution to application domains like drug discovery
Reproducibility: Simple method, easy to reproduce and apply
Inspirational Value: Provides insights for symmetry handling in other generative models

Applicable Scenarios

Molecular Design: Drug discovery, materials design, and other cheminformatics applications
Graph Generation: Graph generation tasks requiring consideration of structural symmetry
Combinatorial Optimization: Combinatorial optimization problems with symmetry constraints
Reinforcement Learning: RL tasks with symmetric state spaces

References

Bengio et al. (2021) - GFlowNet foundations
Ma et al. (2024) - First identification of equivalent action problem
Malkin et al. (2022) - Trajectory balance objective
Jain et al. (2023) - Multi-objective GFlowNet applications

Overall Assessment: This is an excellent paper that combines theory and practice, addressing an important but previously overlooked fundamental problem in GFlowNets. The method is elegant and simple, the theoretical analysis is rigorous, and the experimental validation is comprehensive. It makes significant contributions to both GFlowNet theory development and practical applications, and is expected to have sustained impact on related fields.