Hierarchical Bayesian Flow Networks for Molecular Graph Generation
Xiong, Chen, Li et al.
Molecular graph generation is essentially a categorical generation problem, aimed at predicting the categories of atoms and bonds. Currently, prevailing paradigms such as continuous diffusion models are trained to predict continuous numerical values, treating training as a regression task. However, the final generation requires a rounding step to convert these predictions back into discrete categories, which is intrinsically a classification operation. Because the rounding operation is not incorporated during training, there is a significant discrepancy between the model's training objective and its inference procedure. As a consequence, an excessive emphasis on point-wise precision can lead to overfitting and inefficient learning, because considerable effort is devoted to capturing intra-bin variations that are ultimately irrelevant to the discrete nature of the task. This flaw diminishes molecular diversity and constrains the model's generalization capabilities. To address this fundamental limitation, we propose GraphBFN, a novel hierarchical coarse-to-fine framework based on Bayesian Flow Networks that operates on the parameters of distributions. By introducing the cumulative distribution function (CDF), GraphBFN can calculate the probability of selecting the correct category, thereby unifying the training objective with the rounding operation used at sampling time. We demonstrate that our method achieves superior performance and faster generation, setting new state-of-the-art results on the QM9 and ZINC250k molecular graph generation benchmarks.
Molecular graph generation is inherently a categorical generation problem: the model must predict the categories of atoms and chemical bonds. Current mainstream continuous diffusion models treat training as a regression task and predict continuous values, then rely on a rounding operation to convert those values to discrete categories during final generation. Because rounding is not part of training, the training objective and the inference process diverge significantly, which leads to overfitting, low learning efficiency, and reduced molecular diversity. To address this fundamental limitation, the authors propose GraphBFN, a hierarchical coarse-to-fine framework based on Bayesian Flow Networks. It uses cumulative distribution functions to calculate the probability of selecting the correct category, thereby unifying the training objective with the rounding operation applied at sampling time.
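The CDF idea can be pictured with a minimal sketch. It assumes, purely for illustration, that the model outputs a Gaussian with mean `mu` and standard deviation `sigma` over a scalar whose integer values index the categories, and that category k owns the interval [k - 0.5, k + 0.5), with the two outer intervals extended to infinity. The Gaussian form, the interval edges, and every function name here are hypothetical and are not taken from the paper.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Standard closed form of the normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def class_probabilities(mu, sigma, num_classes):
    """Probability mass that falls inside each category's rounding interval.

    Assumption: category k owns [k - 0.5, k + 0.5); the first and last
    intervals are open-ended so the masses sum to one.
    """
    probs = []
    for k in range(num_classes):
        lower = 0.0 if k == 0 else gaussian_cdf(k - 0.5, mu, sigma)
        upper = 1.0 if k == num_classes - 1 else gaussian_cdf(k + 0.5, mu, sigma)
        probs.append(upper - lower)
    return probs


def negative_log_likelihood(mu, sigma, target, num_classes):
    """Classification-style loss: -log of the mass in the target interval."""
    p = class_probabilities(mu, sigma, num_classes)[target]
    return -math.log(max(p, 1e-12))  # clamp to avoid log(0)


NUM_CLASSES = 4  # hypothetical number of atom types
probs = class_probabilities(mu=1.6, sigma=0.1, num_classes=NUM_CLASSES)
predicted_class = max(range(NUM_CLASSES), key=lambda k: probs[k])
```

Under these assumptions the loss depends only on how much probability mass lands in the correct interval. For target category 1, a mean of 1.0 and a mean of 1.2 both cost almost nothing, while a mean of 1.6 is penalized because rounding would select category 2.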
Molecular graph generation suffers from a fundamental train-inference inconsistency:
Training Phase: Continuous diffusion models map discrete atom/bond categories to a continuous space and optimize continuous-valued predictions with a regression loss
Inference Phase: Hard rounding is required to convert the continuous predictions back to discrete categories
Inconsistency: Training does not account for the rounding rule, so models focus excessively on intra-class variations while neglecting the discrete nature of the task
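The mismatch described above can be made concrete with a small sketch. The category indices, the two sets of predictions, and the helper names are invented for illustration; the only point is that a regression loss and post-rounding accuracy can rank two predictions in opposite order.

```python
import math


def mse(predictions, targets):
    """Mean squared error, the regression objective used during training."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def round_to_class(value, num_classes):
    """Hard rounding applied only at inference: nearest index, clipped to range."""
    return min(max(math.floor(value + 0.5), 0), num_classes - 1)


def accuracy_after_rounding(predictions, targets, num_classes):
    """Fraction of predictions that land on the correct category after rounding."""
    hits = sum(round_to_class(p, num_classes) == t for p, t in zip(predictions, targets))
    return hits / len(targets)


NUM_CLASSES = 3      # hypothetical number of bond types
targets = [1, 2, 0]  # hypothetical ground-truth category indices

# Every value is 0.45 away from its target, yet each still rounds correctly.
uniformly_off = [1.45, 2.45, 0.45]
# Two values are exact, one is 0.6 away and crosses a rounding boundary.
one_large_miss = [1.0, 2.0, 0.6]

mse_uniform = mse(uniformly_off, targets)
mse_miss = mse(one_large_miss, targets)
acc_uniform = accuracy_after_rounding(uniformly_off, targets, NUM_CLASSES)
acc_miss = accuracy_after_rounding(one_large_miss, targets, NUM_CLASSES)
```

Here `one_large_miss` has the lower regression loss but the worse accuracy after rounding, while `uniformly_off` is penalized by the loss for intra-interval error that rounding discards.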
Molecular graph generation is a key technology in drug discovery, impacting molecular optimization, drug-target binding affinity prediction, and other downstream tasks
The inconsistency in existing methods leads to reduced molecular diversity and limited generalization capability
Even minor regression errors can result in completely incorrect classification outcomes
Discrete Diffusion Models: While suitable for discrete graph structures, they sacrifice the smoothness and dynamic generation characteristics of continuous representations
Continuous Diffusion Models: The training objective is decoupled from the inference procedure, which makes the models prone to overfitting to irrelevant intra-class variations
Traditional Bayesian Flow Networks: Assume all categories are equidistant in the probability simplex, leading to slow convergence and high noise
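The equidistance property mentioned for categorical representations on the probability simplex is easy to check numerically. The sketch below compares the vertices of the simplex (one-hot vectors) with a scalar index encoding, which is included only as an assumed point of contrast; the helper names and the choice of four categories are hypothetical.

```python
import math


def one_hot(index, num_classes):
    """Vertex of the probability simplex that represents a single category."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


NUM_CLASSES = 4  # hypothetical number of categories
vertices = [one_hot(k, NUM_CLASSES) for k in range(NUM_CLASSES)]
pairs = [(i, j) for i in range(NUM_CLASSES) for j in range(i + 1, NUM_CLASSES)]

# On the simplex, every pair of distinct categories is sqrt(2) apart.
simplex_distances = {pair: euclidean(vertices[pair[0]], vertices[pair[1]]) for pair in pairs}

# With a scalar index encoding, distances depend on which pair is chosen.
scalar_distances = {pair: float(abs(pair[0] - pair[1])) for pair in pairs}
```

Because all simplex distances are identical, that representation carries no notion of some categories being closer than others, whereas the scalar encoding imposes an ordering.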
Applies Bayesian Flow Networks to molecular graph generation for the first time, improving generation quality through supervision on hierarchical molecular representations
Introduces cumulative distribution functions (CDFs) to compute class probabilities rather than fit specific values, unifying the training objective with the rounding operation used at sampling time
Proposes a hierarchical coarse-to-fine framework that captures both local atomic connectivity and global molecular topology through multi-scale graph representations
Achieves faster training and sampling, reaching state-of-the-art performance on the QM9 and ZINC250k benchmarks with significantly fewer sampling steps
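The multi-scale representation in the contributions above can be illustrated with a generic graph-coarsening sketch. This summary does not state how GraphBFN builds its coarse levels, so the code below uses the pooling rule from DiffPool (Ying et al., 2018), coarse adjacency S^T A S and coarse features S^T X, with a fixed hard assignment matrix in place of DiffPool's learned soft assignment and raw node features in place of learned embeddings. The toy graph and all names are hypothetical.

```python
def transpose(matrix):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def coarsen(adjacency, features, assignment):
    """One pooling step: A_coarse = S^T A S and X_coarse = S^T X."""
    s_t = transpose(assignment)
    coarse_adjacency = matmul(matmul(s_t, adjacency), assignment)
    coarse_features = matmul(s_t, features)
    return coarse_adjacency, coarse_features


# Toy 4-node path graph 0-1-2-3 standing in for a molecule.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# One-hot node features over two hypothetical atom types.
features = [[1, 0], [1, 0], [0, 1], [1, 0]]
# Hard assignment: nodes 0 and 1 go to cluster 0, nodes 2 and 3 to cluster 1.
assignment = [[1, 0], [1, 0], [0, 1], [0, 1]]

coarse_adjacency, coarse_features = coarsen(adjacency, features, assignment)
```

In the coarse graph, off-diagonal entries count the edges between clusters and diagonal entries count each within-cluster edge twice, so the fine level keeps local connectivity while the coarse level summarizes global topology.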
The paper cites important works in the field, including:
Graves et al. (2023): Original work on Bayesian Flow Networks
Vignac et al. (2023): DiGress discrete diffusion method
Jo, Lee, and Hwang (2022): GDSS score-based diffusion model
Ying et al. (2018): DiffPool hierarchical graph pooling method
Overall Assessment: This is a high-quality research paper that successfully identifies and addresses core problems in molecular graph generation. Through innovative CDF mechanisms and hierarchical frameworks, it significantly improves practical performance while maintaining theoretical rigor. Although there is room for improvement in theoretical analysis depth and experimental scale, its contributions are sufficient to advance the field.