Generative Flow Networks (GFlowNets) offer a powerful framework for sampling graphs in proportion to their rewards. However, existing approaches suffer from systematic biases due to inaccuracies in state transition probability computations. These biases, rooted in the inherent symmetries of graphs, impact both atom-based and fragment-based generation schemes. To address this challenge, we introduce Symmetry-Aware GFlowNets (SA-GFN), a method that incorporates symmetry corrections into the learning process through reward scaling. By integrating bias correction directly into the reward structure, SA-GFN eliminates the need for explicit state transition computations. Empirical results show that SA-GFN enables unbiased sampling while enhancing diversity and consistently generating high-reward graphs that closely match the target distribution.
- Paper ID: 2506.02685
- Title: Symmetry-Aware GFlowNets
- Authors: Hohyun Kim, Seunggeun Lee, Min-hwan Oh (Seoul National University)
- Classification: stat.ML cs.LG
- Conference: ICML 2025 (42nd International Conference on Machine Learning)
- Paper Link: https://arxiv.org/abs/2506.02685
Generative Flow Networks (GFlowNets) provide a powerful framework for sampling graphs proportionally to rewards. However, existing methods suffer from systematic biases due to inaccuracies in computing state transition probabilities. These biases are rooted in the inherent symmetries of graphs and affect both atom-based and fragment-based generation schemes. To address this challenge, this paper introduces Symmetry-Aware GFlowNets (SA-GFN), which incorporates symmetry corrections into the learning process through reward scaling. By directly integrating bias correction into the reward structure, SA-GFN eliminates the need for explicit state transition calculations. Experimental results demonstrate that SA-GFN achieves unbiased sampling while enhancing diversity and consistently generating high-reward graphs that closely match the target distribution.
GFlowNets face the equivalent action problem in graph generation tasks: different actions may lead to structurally identical graphs. For example, when adding a new node to a graph, actions connecting to two symmetric nodes, though distinct, produce isomorphic graphs. In such cases, state transition probabilities must account for all equivalent actions, but computing them is expensive.
- Bias in Molecular Generation: In molecular discovery, over 50% of molecules possess multiple symmetries, with 18% containing four or more symmetries. Ignoring symmetries leads to incorrect modeling and reduced accuracy in molecular structure generation.
- Systematic Bias: The bias is systematic, biasing toward graphs with fewer symmetries in node generation and toward symmetric components in fragment generation.
- Computational Complexity: Accurately computing state transition probabilities requires expensive graph isomorphism tests.
- Ma et al. (2024) proposed using positional encodings to approximate detection of equivalent actions, but this requires application at each transition step, incurring large computational overhead and providing only approximate solutions.
- Traditional GFlowNet objective functions (TB, DB, etc.) cannot avoid the equivalent action problem because they are based on state transition formalization.
- Theoretical Contribution: Provides rigorous formalization of autoregressive graph generation under the GFlowNet framework, explicitly addressing the equivalent action problem
- Simple and Effective Solution: Proposes a reward scaling method based on the size of automorphism groups, requiring only minimal modifications to existing training algorithms
- Unbiased Estimator: Derives an unbiased estimator for model likelihood
- Experimental Validation: Verifies theoretical results through experiments, demonstrating the method's effectiveness in generating diverse high-reward samples
Given a reward function R(x), the objective of GFlowNets is to train a policy p_A such that the sampling probability of terminal states is proportional to their rewards: p̄_A(x) = R(x)/Z, where Z is the normalization constant.
- Graph Isomorphism: Two graphs G and G' are isomorphic (G ≅ G') if there exists a permutation π such that π(E) = E'
- Automorphism Group: The automorphism group Aut(G) of graph G is the set of all permutations that preserve the graph structure
- Orbit: The orbit of node u is Orb(G,u) = {v ∈ V : ∃π ∈ Aut(G), π(u) = v}
Definition 4.1 (Transition Equivalence): Graph transitions (G₁,G'₁) and (G₂,G'₂) are transition equivalent if G₁ ≅ G₂ and G'₁ ≅ G'₂.
Definition 4.2 (Orbit Equivalence): Graph actions (G₁,t₁,u₁) and (G₂,t₂,u₂) are orbit equivalent if they have the same action type and there exists a permutation π such that π(G₁) = G₂ and π(u₁) = u₂.
Theorem 4.3: Orbit equivalent actions lead to transition equivalent transitions.
Lemma 4.5: For AddEdge actions,
∣Orb(G′,u,v)∣∣Orb(G,u,v)∣=∣Aut(G′)∣∣Aut(G)∣
Theorem 4.6 (Automorphism Correction): If permutation equivariant functions are used, then
qAˉ(a∣s′)pAˉ(a∣s)=∣Aut(G′)∣∣Aut(G)∣⋅qE(G∣G′)pE(G′∣G)
Corollary 5.1 (TB Correction): The trajectory balance loss should be
LTB(τ)=(log∣Aut(Gn)∣R(Gn)∏t=0n−1qE(Gt∣Gt+1)Z∏t=0n−1pE(Gt+1∣Gt))2
Solution: Scale the reward as R~(G)=∣Aut(G)∣R(G)
Theorem 5.3 (Fragment Correction): For a terminal state G generated by connecting k fragments {C₁,...,Cₖ}:
R~(G)=∏i=1k∣Aut(Ci)∣∣Aut(G)∣R(G)
pˉA(x)=Eτ∼qE(τ∣Gn)[∣Aut(Gn)∣qE(τ∣Gn)pE(τ)]
- Theoretical Elegance: Simplifies complex transition-level corrections to reward scaling at terminal states
- Computational Efficiency: Avoids expensive graph isomorphism tests at each step, requiring only one-time automorphism group size computation
- Generality: Applicable to multiple GFlowNet objective functions including TB, DB, and FM
- Exactness: Provides exact solutions rather than approximations
- Illustrative Examples: Initial state with 6 disconnected nodes, 112 terminal states
- Synthetic Graphs: Heterogeneous graphs with up to 7 nodes, 72,296 terminal states
- Molecular Generation:
- Atom-level: HOMO-LUMO gap prediction task
- Fragment-level: sEH target binding affinity prediction task
- L1 Error: L1 error between target probability and model terminal probability
- Diversity: Average Tanimoto distance between molecules
- Top K Metrics: Diversity and rewards of top K high-reward molecules
- Uniqueness: Proportion of unique molecules in generated samples
- Vanilla GFlowNets: Without considering graph symmetry
- Transition Correction: Identifying transition equivalent actions through multiple isomorphism tests
- PE (Ma et al., 2024): Using positional encodings to approximate orbit equivalent action identification
- Reward Scaling (This Work): Implementing correction through modified reward signals
- Flow Scaling (This Work): Multiplying by symmetry ratios at each transition
- Vanilla model's terminal probabilities cluster by |x|, showing clear bias
- Reward Scaling achieves the same effect as Transition Correction
- Estimated normalization constant Z: Reward Scaling yields 112 (ground truth), Vanilla yields 26,706
- TB Objective: Reward Scaling significantly reduces L1 error, performing comparably to Transition Correction
- DB Objective: Reward Scaling converges more slowly but ultimately achieves the same precision
- PE method, as an approximate solution, performs between Vanilla and exact methods
Atom-level Generation Results:
- Diversity: 0.929→0.959 (Vanilla→Reward Scaling)
- Uniqueness: 0.93→1.0
Fragment-level Generation Results:
- Top K Rewards: 0.941→0.952 (Vanilla→Reward Scaling Exact)
- Cyclohexane fragment usage: 5220→1042 (significant reduction in symmetric fragment overuse)
- Approximate vs. Exact Correction: Approximate methods already show significant performance improvement
- Different Objective Functions: Both TB and DB can be effectively corrected through reward scaling
- Automorphism computation time: 0.010ms on QM9 dataset, 0.022ms on ZINC250k dataset
- Compared to neural network forward propagation during trajectory sampling, computational overhead is negligible
- Training time comparison: Reward Scaling is approximately 15% faster than Transition Correction
- Adjacency Matrix Methods: Preserve node ordering information, less prone to equivalent action problems
- Graph Sequence Methods: Easily produce equivalent actions, with problems becoming prominent when state transition probabilities are needed
- Existing objective functions (flow matching, detailed balance, trajectory balance, etc.) cannot avoid the equivalent action problem
- Ma et al. (2024) first identified the problem but provided only approximate solutions
- Permutation equivariance causes nodes in the same orbit to have identical representations
- Limited expressiveness may cause action representations from different orbits to overlap
- Theoretical Contribution: First rigorous analysis of the equivalent action problem in GFlowNets, proving it causes systematic bias
- Practical Solution: Reward scaling provides a simple, exact, and efficient correction method
- Broad Applicability: The method applies to atom-level and fragment-level generation, as well as multiple training objectives
- Action Design Dependency: Theoretical guarantees depend on specific predefined graph action sets
- Task Specificity: Primarily validated on molecular discovery-related datasets
- GNN Expressiveness: Limited GNN expressiveness may introduce additional bias
- Explore tasks with different symmetry patterns and reward structures
- Design GNN architectures with stronger expressiveness
- Extend to more complex graph generation scenarios
- Theoretical Rigor: Provides a complete mathematical framework and rigorous theoretical analysis
- Method Simplicity: The solution is extremely simple, easy to implement and integrate
- Practical Value: Demonstrates real-world effectiveness in important applications like molecular generation
- Computational Efficiency: Avoids expensive online graph isomorphism tests
- Strong Generality: Applicable to multiple GFlowNet training objectives
- Limited Experimental Scope: Primarily focused on molecular generation tasks, with limited validation on other graph generation tasks
- Theoretical Assumptions: Depends on specific action design and GNN architecture assumptions
- Approximate Methods: Approximate corrections for fragment generation lack theoretical guarantees
- Scalability: Automorphism computation may become a bottleneck for very large graphs
- Academic Value: Provides important supplement to GFlowNet theory, addressing fundamental problems
- Practical Value: Direct contribution to application domains like drug discovery
- Reproducibility: Simple method, easy to reproduce and apply
- Inspirational Value: Provides insights for symmetry handling in other generative models
- Molecular Design: Drug discovery, materials design, and other cheminformatics applications
- Graph Generation: Graph generation tasks requiring consideration of structural symmetry
- Combinatorial Optimization: Combinatorial optimization problems with symmetry constraints
- Reinforcement Learning: RL tasks with symmetric state spaces
- Bengio et al. (2021) - GFlowNet foundations
- Ma et al. (2024) - First identification of equivalent action problem
- Malkin et al. (2022) - Trajectory balance objective
- Jain et al. (2023) - Multi-objective GFlowNet applications
Overall Assessment: This is an excellent paper that combines theory and practice, addressing an important but previously overlooked fundamental problem in GFlowNets. The method is elegant and simple, the theoretical analysis is rigorous, and the experimental validation is comprehensive. It makes significant contributions to both GFlowNet theory development and practical applications, and is expected to have sustained impact on related fields.