Semantic-Condition Tuning: Fusing Graph Context with Large Language Models for Knowledge Graph Completion

Liu, Wen, Sun et al.
Fusing Knowledge Graphs with Large Language Models is crucial for knowledge-intensive tasks like knowledge graph completion. The prevailing paradigm, prefix-tuning, simply concatenates knowledge embeddings with text inputs. However, this shallow fusion overlooks the rich relational semantics within KGs and imposes a significant implicit reasoning burden on the LLM to correlate the prefix with the text. To address these, we propose Semantic-condition Tuning (SCT), a new knowledge injection paradigm comprising two key modules. First, a Semantic Graph Module employs a Graph Neural Network to extract a context-aware semantic condition from the local graph neighborhood, guided by knowledge-enhanced relations. Subsequently, this condition is passed to a Condition-Adaptive Fusion Module, which, in turn, adaptively modulates the textual embedding via two parameterized projectors, enabling a deep, feature-wise, and knowledge-aware interaction. The resulting pre-fused embedding is then fed into the LLM for fine-tuning. Extensive experiments on knowledge graph benchmarks demonstrate that SCT significantly outperforms prefix-tuning and other strong baselines. Our analysis confirms that by modulating the input representation with semantic graph context before LLM inference, SCT provides a more direct and potent signal, enabling more accurate and robust knowledge reasoning.

Basic Information

  • Paper ID: 2510.08966
  • Title: Semantic-Condition Tuning: Fusing Graph Context with Large Language Models for Knowledge Graph Completion
  • Authors: Ruitong Liu, Yan Wen, Te Sun, Yunjia Wu, Pingyang Huang, Zihang Yu, Siyuan Li
  • Classification: cs.AI cs.CL
  • Publication Time/Conference: The ACM Web Conference, April 13-17, 2026, Dubai, UAE
  • Paper Link: https://arxiv.org/abs/2510.08966

Abstract

This paper proposes Semantic-Condition Tuning (SCT), a novel knowledge injection paradigm to address the fusion of knowledge graphs with large language models in knowledge graph completion tasks. Traditional prefix tuning methods simply concatenate knowledge embeddings with text inputs, and this shallow fusion ignores the rich relational semantics in knowledge graphs while imposing a heavy implicit reasoning burden on LLMs. SCT comprises two key modules: a semantic graph module that uses graph neural networks to extract context-aware semantic conditions from local graph neighborhoods; and a condition-adaptive fusion module that adaptively modulates text embeddings through two parameterized projectors, enabling deep, feature-level, and knowledge-aware interactions.

Research Background and Motivation

Core Problems

  1. Knowledge Graph Incompleteness: Real-world knowledge graphs are inherently incomplete, limiting their utility in downstream applications
  2. Limitations of Shallow Fusion: Existing prefix tuning methods only perform simple concatenation operations, failing to fully exploit the structural information of knowledge graphs
  3. Dynamicity of Relational Semantics: The meaning of a relation changes with its surrounding semantic context. Figure 1 of the paper illustrates this with the "treats" relation, which denotes different treatment mechanisms in different contexts

Research Significance

  • Knowledge graph completion is crucial for recommendation systems, information extraction, question-answering systems, and other applications
  • LLMs lack deep and precise factual knowledge and are prone to hallucination problems
  • There is a need to effectively fuse the explicit structured knowledge of knowledge graphs with the implicit parameterized knowledge of LLMs

Limitations of Existing Methods

  1. Shallowness of Prefix Tuning: Simple concatenation operations cannot achieve deep integration
  2. Neglect of Relational Semantics: Fails to capture the rich relational semantics in knowledge graphs
  3. Reasoning Burden: Imposes heavy implicit reasoning burden on LLMs to associate prefixes with text

Core Contributions

  1. Proposes SCT Framework: The first semantic condition tuning framework integrating context-aware and adaptive embedding fusion, overcoming the limitations of simple prefix tuning concatenation
  2. Semantic Graph Module: Introduces a novel relation-centric message passing mechanism where neighbor selection is guided by explicit semantic similarity scores from knowledge-enhanced relation descriptions
  3. Condition-Adaptive Fusion Module: Introduces a fusion mechanism that uses semantic conditions to learn direct feature-level affine transformations of input text embeddings, enabling deep synergistic integration of graph context
  4. Performance Validation: Demonstrates state-of-the-art performance and high parameter efficiency of SCT across multiple benchmarks

Method Details

Task Definition

A knowledge graph G is defined as a set of triples T = {(h, r, t) | h, t ∈ E, r ∈ R}, where E and R represent entity and relation sets respectively. The knowledge graph completion task is to infer missing elements in given triples, such as predicting the tail entity t for query (h, r, ?). In LLM-based KGC, this task is formalized as a text generation problem.
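
The task definition above can be made concrete with a small sketch. The entity names, the `Triple` type, and the prompt template in `query_to_prompt` are all hypothetical and chosen only for illustration. The paper's actual prompt format is not given in this summary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# Toy triple set T; the entities are invented for illustration only.
TRIPLES = {
    Triple("aspirin", "treats", "headache"),
    Triple("aspirin", "is_a", "drug"),
    Triple("ibuprofen", "treats", "headache"),
}


def known_tails(triples, head, relation):
    """Return the tails already recorded for the query (head, relation, ?)."""
    return sorted(t.tail for t in triples if t.head == head and t.relation == relation)


def query_to_prompt(head, relation):
    """Render a tail-prediction query as text (hypothetical template)."""
    return f"Complete the triple: ({head}, {relation}, ?). Answer with the tail entity."


print(known_tails(TRIPLES, "aspirin", "treats"))
print(query_to_prompt("ibuprofen", "is_a"))
```

A completion model would be asked to generate the missing tail for queries such as `(ibuprofen, is_a, ?)`, whose answer is absent from the stored triples.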

Model Architecture

1. Semantic Graph Module

Knowledge Enhancement:

  • Uses a powerful LLM (GPT-4o) to generate canonical text descriptions for each relation type
  • Encodes descriptions as semantic vectors using a pre-trained sentence embedding model (Sentence-BERT)

Relation-Centric Message Passing:

  • Treats the relation structure of KG as the primary computation graph
  • Edges (relations) update their states by aggregating information from neighboring edges
  • Uses Top-K selection mechanism to filter the most semantically relevant neighbors:
$$\mathrm{Score}(e_c, e_n) = \frac{s_c \cdot s_n}{\lVert s_c \rVert_2 \, \lVert s_n \rVert_2}$$
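
The score is a cosine similarity, and Top-K selection keeps the K highest-scoring neighbors. A minimal sketch in plain Python follows. The function names, the toy 2-dimensional vectors, and the assumption that all vectors are non-zero are illustrative choices, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_neighbors(center, neighbors, k):
    """Return the names of the k neighbors most similar to the center vector.

    neighbors maps a (hypothetical) edge name to its semantic vector.
    """
    ranked = sorted(
        neighbors,
        key=lambda name: cosine(center, neighbors[name]),
        reverse=True,
    )
    return ranked[:k]


center = [1.0, 0.0]
neighbors = {
    "a": [2.0, 0.0],   # same direction as center
    "b": [1.0, 1.0],   # 45 degrees away
    "c": [0.0, 3.0],   # orthogonal
    "d": [-1.0, 0.0],  # opposite direction
}
selected = top_k_neighbors(center, neighbors, k=2)
print(selected)
```

Because cosine similarity ignores vector length, neighbor `a` ranks first even though its norm differs from the center's.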

Transformer Layer Update:

$$s_c^{(l+1)} = \mathrm{TransformerLayer}\left(s_c^{(l)}, \bar{s}_{\mathcal{N}_K(e_c)}\right)$$

Semantic Condition Generation:

$$c_S = \mathrm{MeanPool}\left(\{s_{h,i}^{(L)}\}_i \cup \{s_{t,j}^{(L)}\}_j\right)$$
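
The pooling step can be sketched as an element-wise mean over the final-layer states of the edges around the head and the tail. The vectors below are toy values. Treating the union as a plain list concatenation is an assumption of this sketch, so an edge incident to both entities would be counted twice here.

```python
def mean_pool(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("mean_pool needs at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical final-layer edge states around the head and tail entities.
head_edge_states = [[1.0, 2.0], [3.0, 4.0]]
tail_edge_states = [[5.0, 6.0]]

# Semantic condition: one vector summarizing the local relational context.
c_s = mean_pool(head_edge_states + tail_edge_states)
print(c_s)
```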

2. Condition-Adaptive Fusion Module

Uses Feature-wise Linear Modulation (FiLM) mechanism:

$$X' = X \odot \gamma + \beta$$
$$\gamma = \sigma(\mathrm{MLP}_1(c_S))$$
$$\beta = \mathrm{MLP}_2(c_S)$$

where γ is a scaling vector and β is an offset vector, implementing feature-level affine transformations of text embeddings.
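
A minimal sketch of this modulation in plain Python is shown below. It makes three assumptions that this summary does not confirm: σ is taken to be the logistic sigmoid, each MLP is reduced to a single linear layer, and one γ and β pair is shared across all token positions. The weights are toy values picked so the result is easy to check by hand.

```python
import math


def linear(weights, bias, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def film(token_embeddings, condition, gamma_proj, beta_proj):
    """Apply X' = X * gamma + beta feature-wise to every token embedding.

    gamma_proj and beta_proj are (weights, bias) pairs standing in for the
    two projectors; a real model would use learned multi-layer networks.
    """
    gamma = [sigmoid(z) for z in linear(gamma_proj[0], gamma_proj[1], condition)]
    beta = linear(beta_proj[0], beta_proj[1], condition)
    return [
        [x * g + b for x, g, b in zip(token, gamma, beta)]
        for token in token_embeddings
    ]


# Zero weights give gamma = sigmoid(0) = 0.5 and beta = bias.
gamma_proj = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
beta_proj = ([[0.0, 0.0], [0.0, 0.0]], [1.0, -1.0])

X = [[2.0, 4.0], [6.0, 8.0]]  # two tokens, embedding size 2
X_mod = film(X, [0.3, 0.7], gamma_proj, beta_proj)
print(X_mod)
```

Unlike a prefix, the output has the same shape as the input: the graph context changes every feature of every token rather than adding extra positions to the sequence.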

Technical Innovations

  1. Deep Fusion vs. Shallow Concatenation: Unlike simple prefix concatenation, SCT achieves feature-level deep interaction
  2. Semantically-Driven Neighbor Selection: Uses LLM-enhanced relation descriptions for semantic similarity computation rather than task-specific learned representations
  3. Relation-Centric Graph Processing: Focuses on relations rather than entities, being more efficient and semantically indicative

Experimental Setup

Datasets

Link Prediction:

  • WN18RR: 40,943 entities, 11 relations, 86,835 training triples
  • FB15k-237: 14,541 entities, 237 relations, 272,115 training triples

Triple Classification:

  • UMLS: 135 entities, 46 relations
  • CoDeX-S: 2,034 entities, 42 relations
  • FB15k-237N: 13,104 entities, 93 relations

Evaluation Metrics

  • Link Prediction: Mean Reciprocal Rank (MRR) and Hits@N
  • Triple Classification: Accuracy (Acc), Precision (P), Recall (R), F1-Score
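
These metrics follow their standard definitions, sketched below. The ranks and confusion counts are invented toy values, not results from the paper.

```python
def mrr(ranks):
    """Mean Reciprocal Rank over the 1-based ranks of the correct answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at(ranks, n):
    """Fraction of queries whose correct answer is ranked within the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


ranks = [1, 2, 4, 10]  # toy ranks of the correct entity for four queries
print(mrr(ranks), hits_at(ranks, 1), hits_at(ranks, 3), hits_at(ranks, 10))
print(precision_recall_f1(tp=8, fp=2, fn=2))
```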

Comparison Methods

  • Embedding Methods: TransE, CompGCN, AdaProp, MA-GNN, etc.
  • LLM Methods: KICGPT, KG-FIT, MKGL, SSQR-LLaMA2, KoPA, etc.

Implementation Details

  • Implementation based on Alpaca-7B
  • Semantic Graph Module: 2-layer Transformer, Top-K=10
  • Fine-tuning LLM using LoRA (rank=64)
  • AdamW optimizer, batch size 12
  • Two-stage training strategy
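
To give a rough sense of what rank 64 means for parameter count, the arithmetic below compares one LoRA adapter with the weight matrix it sits beside. It assumes a square 4096 x 4096 projection, which matches the hidden size of LLaMA-7B, the base model of Alpaca-7B. Which matrices the authors actually adapt is not stated in this summary, so this is a per-matrix illustration only.

```python
def lora_extra_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds beside a frozen d_out x d_in weight.

    LoRA learns two low-rank factors: A (rank x d_in) and B (d_out x rank).
    """
    return rank * d_in + d_out * rank


hidden = 4096  # hidden size of LLaMA-7B
rank = 64      # LoRA rank reported above

full = hidden * hidden
extra = lora_extra_params(hidden, hidden, rank)
ratio = extra / full
print(full, extra, ratio)
```

Under these assumptions the adapter trains 524,288 parameters per matrix, about 3.1% of the 16,777,216 in the frozen weight.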

Experimental Results

Main Results

Link Prediction Performance:

  • WN18RR Dataset: Compared to the strongest baseline SSQR-LLaMA2, MRR improvement of 2.2%, Hits@1 improvement of 2.4%, Hits@3 improvement of 2.6%
  • FB15k-237 Dataset: Significant MRR improvement of 4.9%, Hits@1 improvement of 1.6%, Hits@10 improvement of 4.4%

Triple Classification Performance:

  • UMLS Dataset: Accuracy 93.15%, F1 score 93.18%, achieving best performance
  • FB15k-237N Dataset: Accuracy 78.02%, Precision 71.10%, F1 score 80.93%, all best-in-class
  • CoDeX-S Dataset: Precision 78.52% is highest, other metrics comparable to strong baselines

Ablation Studies

Component Effectiveness Validation:

  1. w/o Semantics: Removes semantic graph module, replaces with traditional KGE
    • MRR on FB15k-237 drops from 0.471 to 0.433, Hits@1 drops from 0.380 to 0.327
  2. w/o Fusion: Removes condition-adaptive fusion module, uses prefix tuning instead
    • Most severe performance degradation, MRR and Hits@1 drop by 0.062 and 0.081 respectively
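
The ablation numbers can be put on a common footing with a little arithmetic. The sketch below assumes the "w/o Fusion" drops are measured on FB15k-237 against the same full-model scores (MRR 0.471, Hits@1 0.380). The summary does not state the dataset for that row explicitly.

```python
full_mrr, full_hits1 = 0.471, 0.380

# w/o Semantics: scores reported above.
no_sem_mrr, no_sem_hits1 = 0.433, 0.327

# w/o Fusion: only absolute drops are reported, so the scores are implied.
no_fusion_mrr = round(full_mrr - 0.062, 3)
no_fusion_hits1 = round(full_hits1 - 0.081, 3)


def relative_drop(full, ablated):
    """Drop as a fraction of the full model's score."""
    return (full - ablated) / full


for name, mrr_value, hits_value in [
    ("w/o Semantics", no_sem_mrr, no_sem_hits1),
    ("w/o Fusion", no_fusion_mrr, no_fusion_hits1),
]:
    print(
        f"{name}: MRR {mrr_value:.3f} "
        f"({relative_drop(full_mrr, mrr_value):.1%} lower), "
        f"Hits@1 {hits_value:.3f} "
        f"({relative_drop(full_hits1, hits_value):.1%} lower)"
    )
```

Under that assumption, removing the fusion module implies an MRR of 0.409 and a Hits@1 of 0.299, a larger loss on both metrics than removing the semantic graph module.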

Scoring Function Comparison:

  • RotatE-style function performs best with MRR of 0.471
  • Simple DistMult and MLP result in obvious performance degradation
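
For readers unfamiliar with the two named scoring functions, the sketch below gives their generic textbook forms. It does not show how SCT plugs a scoring function into its pipeline, which this summary leaves unspecified. The RotatE margin term is omitted, and all vectors are toy values.

```python
import cmath


def distmult_score(h, r, t):
    """DistMult: trilinear product of real-valued head, relation, and tail vectors."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def rotate_score(h, phases, t):
    """RotatE-style score: rotate each complex head coordinate by the relation's
    phase, then take the negative distance to the tail. Higher is better."""
    rotated = [hi * cmath.exp(1j * p) for hi, p in zip(h, phases)]
    return -sum(abs(ri - ti) for ri, ti in zip(rotated, t))


# DistMult is symmetric in head and tail, so it cannot tell (h, r, t) from (t, r, h).
h, r, t = [1.0, 2.0], [0.5, -1.0], [3.0, 1.0]
print(distmult_score(h, r, t), distmult_score(t, r, h))

# RotatE models a relation as a rotation: a quarter turn maps 1 onto i.
print(rotate_score([1 + 0j], [cmath.pi / 2], [1j]))
print(rotate_score([1 + 0j], [0.0], [1j]))
```

The rotation view lets a RotatE-style function represent asymmetric relations, which is one commonly cited reason it can outperform DistMult.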

Case Analysis

Semantic Enhancement Effects: For query (Barack Obama, /government/politician/government_positions_held..., ?):

  • Without Knowledge Enhancement: Based on lexical overlap, Gov Position (Title) ranks high
  • With Knowledge Enhancement: Semantically related concepts like Person (Nationality) rank higher, reflecting the transition from shallow text matching to true semantic relevance

Hyperparameter Sensitivity: The Top-K parameter achieves optimal performance at K=10 (MRR=0.471, Hits@1=0.380), with K=4 being insufficient and K=32 introducing noise.

Related Work

Knowledge Graph Completion

  1. Embedding Methods: Evolution from geometric models like TransE and ComplEx to more complex geometric space methods like RotatE and HAKE
  2. GNN Methods: PathCon, CBLiP, etc. aggregate multi-hop path information, but still rely on static representations
  3. LLM Methods: KG-BERT, SimKGC, etc. convert triples to text sequences, but interactions remain at surface level

Fusion of LLMs and Knowledge Graphs

Two main directions:

  1. Using KGs to provide factual grounding for LLMs, reducing hallucinations
  2. Leveraging LLMs' generation and reasoning capabilities to solve KG-related tasks

Common limitations of existing methods: Interactions with knowledge graphs often remain at textual or surface levels.

Conclusions and Discussion

Main Conclusions

  1. SCT significantly outperforms shallow prefix tuning methods through deep feature-level fusion
  2. The semantic graph module effectively captures context-aware relational semantics
  3. The condition-adaptive fusion module achieves deep synergistic integration of knowledge and text
  4. Achieves state-of-the-art or highly competitive performance across multiple benchmarks

Limitations

  1. Limited Reasoning Depth: The semantic condition is drawn from the local graph neighborhood, which limits how deep the framework can reason
  2. Insufficient Adaptability to Dynamic Knowledge Graphs: Adaptability to dynamically changing knowledge graphs needs improvement
  3. Computational Complexity: Two-stage training and complex fusion mechanisms increase computational costs

Future Directions

  1. Hierarchical Semantic Condition Generation: Introduce hierarchical mechanisms to enhance reasoning depth
  2. Temporal Awareness: Incorporate temporal awareness to handle dynamic knowledge
  3. Extended Application Scenarios: Explore applications in more complex scenarios such as temporal knowledge graphs

In-Depth Evaluation

Strengths

  1. Strong Method Novelty: The paper presents SCT as the first framework for feature-level deep fusion of graph context, moving beyond traditional prefix tuning
  2. Reasonable Technical Design: Relation-centric message passing and semantically-driven neighbor selection are ingeniously designed
  3. Comprehensive Experiments: Covers both link prediction and triple classification tasks, validated across multiple datasets
  4. Detailed Ablation Studies: Systematically validates the contribution of each component
  5. In-Depth Case Analysis: Demonstrates semantic enhancement effects through concrete examples

Weaknesses

  1. Insufficient Computational Complexity Analysis: Lacks detailed analysis of computational overhead for two-stage training
  2. Limited Scalability Discussion: Insufficient analysis of applicability to large-scale knowledge graphs
  3. Missing Error Analysis: Lacks in-depth analysis of failure cases
  4. Baseline Selection: Some baseline methods may not be the latest strongest methods

Impact

  1. Theoretical Contribution: Provides a new paradigm for knowledge graph and LLM fusion
  2. Practical Value: Superior performance across multiple benchmarks demonstrates practical utility
  3. Reproducibility: Provides detailed implementation details facilitating reproduction
  4. Inspirational Value: Feature-level fusion approach may inspire related research

Applicable Scenarios

  1. Knowledge-Intensive Tasks: Particularly suitable for reasoning tasks requiring structured knowledge
  2. Medium-Scale Knowledge Graphs: Current experimental scale suggests suitability for medium-scale KG applications
  3. High-Accuracy Requirement Scenarios: Outstanding performance in applications where accuracy is more important than efficiency
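  4. Multi-Hop Reasoning Needs: May help with complex queries that draw on neighborhood context, though the limited reasoning depth listed under Limitations suggests caution for long multi-hop chains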

References

The paper cites 80 related references covering multiple domains including knowledge graph embeddings, graph neural networks, and large language models, providing a solid theoretical foundation. Key references include classical KG embedding methods like TransE and RotatE, as well as representative LLM-KG fusion works like KG-BERT and KoPA.