FrameEOL: Semantic Frame Induction using Causal Language Models

Yano, Yamada, Tsukagoshi et al.
Semantic frame induction is the task of clustering frame-evoking words according to the semantic frames they evoke. In recent years, leveraging embeddings of frame-evoking words that are obtained using masked language models (MLMs) such as BERT has led to high-performance semantic frame induction. Although causal language models (CLMs) such as the GPT and Llama series succeed in a wide range of language comprehension tasks and can engage in dialogue as if they understood frames, they have not yet been applied to semantic frame induction. We propose a new method for semantic frame induction based on CLMs. Specifically, we introduce FrameEOL, a prompt-based method for obtaining Frame Embeddings that outputs One frame-name as a Label representing the given situation. To obtain embeddings more suitable for frame induction, we leverage in-context learning (ICL) and deep metric learning (DML). Frame induction is then performed by clustering the resulting embeddings. Experimental results on the English and Japanese FrameNet datasets demonstrate that the proposed methods outperform existing frame induction methods. In particular, for Japanese, which lacks extensive frame resources, the CLM-based method using only 5 ICL examples achieved comparable performance to the MLM-based method fine-tuned with DML.

FrameEOL: Semantic Frame Induction using Causal Language Models

Basic Information

  • Paper ID: 2510.09097
  • Title: FrameEOL: Semantic Frame Induction using Causal Language Models
  • Authors: Chihiro Yano¹, Kosuke Yamada¹,², Hayato Tsukagoshi¹, Ryohei Sasano¹, Koichi Takeda³
  • Affiliations: ¹Nagoya University, ²CyberAgent, ³National Institute of Informatics
  • Classification: cs.CL (Computational Linguistics)
  • Publication Date: October 10, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.09097

Abstract

Semantic frame induction is the task of clustering frame-evoking words according to the semantic frames they evoke. Recent work has achieved high performance on this task using embeddings of frame-evoking words obtained from masked language models (MLMs) such as BERT. Although causal language models (CLMs) such as the GPT and Llama series succeed in a wide range of language understanding tasks and can engage in dialogue as if they understood frames, they have not yet been applied to semantic frame induction. This paper proposes FrameEOL, a CLM-based, prompt-based method that obtains a frame embedding by prompting the model to output one frame name as a label for the given situation. To obtain embeddings more suitable for frame induction, the method leverages in-context learning (ICL) and deep metric learning (DML); frame induction is then performed by clustering the resulting embeddings. Experiments on the English and Japanese FrameNet datasets show that the method outperforms existing approaches. Notably, for Japanese, which lacks extensive frame resources, the CLM-based method using only 5 ICL examples achieves performance comparable to the MLM-based method fine-tuned with DML.

Research Background and Motivation

Problem Definition

Semantic frame induction aims to automatically identify and cluster verb instances that evoke the same semantic frame. For example, the verb "lost" may evoke different semantic frames in different contexts:

  • "He lost the gold medal by just .02 points" → FINISH_COMPETITION frame
  • "He lost his gold medal at the restaurant" → LOSING frame

Research Significance

  1. Resource Scarcity: Manual construction of semantic frame resources is prohibitively expensive; automatic construction is urgently needed
  2. Multilingual Demand: Frame resources for languages other than English are extremely limited
  3. Domain Adaptation: Specific domains may require frame representations at different granularities

Limitations of Existing Methods

  1. MLM Dependency: Existing methods are primarily based on masked language models such as BERT
  2. Resource Dependency: Requires substantial annotated data for effective training
  3. Language Limitations: Poor performance on low-resource languages

Research Motivation

Although modern CLMs such as GPT-4o demonstrate the ability to understand semantic frames (as shown in the ChatGPT example in Figure 1), they have not been systematically applied to semantic frame induction tasks. This paper aims to fill this gap.

Core Contributions

  1. First Application of CLMs to Semantic Frame Induction: Proposes the FrameEOL method, which extends PromptEOL to obtain frame embeddings
  2. Multi-Strategy Optimization: Combines in-context learning (ICL) and deep metric learning (DML) to enhance embedding quality
  3. Surpassing Existing Methods: Achieves the best performance on English FrameNet, with a B-cubed F-score (BCF) of 71.9
  4. Low-Resource Language Breakthrough: On Japanese FrameNet, achieves performance comparable to DML fine-tuned MLM using only 5 ICL examples
  5. Bilingual Validation: Validates method effectiveness on both English and Japanese datasets

Methodology Details

Task Definition

  • Input: A set of sentences containing frame-evoking verbs
  • Output: Clustering of verb instances according to the semantic frames they evoke
  • Constraint: No predefined set of frame labels required

Model Architecture

3.1 FrameEOL Core Method

FrameEOL is inspired by PromptEOL and obtains frame embeddings through specially designed prompt templates:

Prompt Template:

The FrameNet frame evoked by "[verb]" in "[sentence]" is

Key Design:

  • [verb]: Frame-evoking verb placeholder
  • [sentence]: Placeholder for sentence containing the verb
  • Uses the final layer embedding of the last token "is" as the frame embedding
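
The template and the last-token rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names `build_frameeol_prompt` and `last_token_embedding` are hypothetical, and the hidden states are toy numbers standing in for the final-layer output of a real CLM.

```python
from typing import List, Sequence

# Template wording follows the summary above; all names here are illustrative.
TEMPLATE = 'The FrameNet frame evoked by "{verb}" in "{sentence}" is'


def build_frameeol_prompt(verb: str, sentence: str) -> str:
    """Fill the FrameEOL template for one frame-evoking verb instance."""
    return TEMPLATE.format(verb=verb, sentence=sentence)


def last_token_embedding(
    hidden_states: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Return the final-layer vector of the last non-padding token.

    hidden_states: one vector per token position (final layer only).
    attention_mask: 1 for real tokens, 0 for padding.
    """
    real_positions = [i for i, m in enumerate(attention_mask) if m == 1]
    if not real_positions:
        raise ValueError("attention_mask marks no real tokens")
    return list(hidden_states[real_positions[-1]])


prompt = build_frameeol_prompt("lost", "He lost his gold medal at the restaurant.")

# Toy 3-position sequence with one padding position at the end (assumed layout).
toy_states = [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0]]
frame_embedding = last_token_embedding(toy_states, [1, 1, 0])
```

Because the prompt ends in "is", the selected position corresponds to that token as long as padding is on the right; with left padding the same function still picks the last real token.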

3.2 In-Context Learning Optimization (ICL)

To address low-resource language challenges, ICL is introduced:

Example Construction:

The FrameNet frame evoked by "wear" in "On his head he wore a white nightcap..." is Wearing.
The FrameNet frame evoked by "type" in "I typed it out for Diana Morrison." is Text_creation.
The FrameNet frame evoked by "kneel" in "He knelt up and leaned towards Lucien." is Change_posture.

The FrameNet frame evoked by "lost" in "He lost his gold medal at the restaurant." is

Advantages: Significant performance improvement with only a few examples (5-20), particularly suitable for scenarios with scarce training data.
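
As a rough sketch of how such an ICL prompt might be assembled, the snippet below concatenates completed demonstrations and leaves the query open. The helper name `build_icl_prompt` and the blank line before the query are assumptions based on the example layout above, not details confirmed by the paper.

```python
from typing import List, Tuple

TEMPLATE = 'The FrameNet frame evoked by "{verb}" in "{sentence}" is'

# (verb, sentence, frame name); a hypothetical container for demonstrations.
Demo = Tuple[str, str, str]


def build_icl_prompt(demos: List[Demo], verb: str, sentence: str) -> str:
    """Prepend completed demonstrations, then leave the query unanswered."""
    query = TEMPLATE.format(verb=verb, sentence=sentence)
    if not demos:
        return query
    lines = [
        TEMPLATE.format(verb=v, sentence=s) + f" {frame}."
        for v, s, frame in demos
    ]
    # Assumption: one demonstration per line, blank line before the query.
    return "\n".join(lines) + "\n\n" + query


demos = [
    ("wear", "On his head he wore a white nightcap...", "Wearing"),
    ("type", "I typed it out for Diana Morrison.", "Text_creation"),
    ("kneel", "He knelt up and leaned towards Lucien.", "Change_posture"),
]
icl_prompt = build_icl_prompt(
    demos, "lost", "He lost his gold medal at the restaurant."
)
```

The frame embedding is still read from the final "is" of the query, so adding demonstrations changes only the context the model conditions on.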

3.3 Deep Metric Learning Optimization (DML)

Employs triplet loss function to optimize the embedding space:

$$L_{tri} = \max(D(x_a, x_p) - D(x_a, x_n) + m, 0)$$

Where:

  • $x_a, x_p, x_n$: Frame embeddings of anchor, positive, and negative samples
  • $D(\cdot, \cdot)$: Euclidean distance of normalized embeddings
  • $m$: Margin parameter
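
To make the loss concrete, here is a small plain-Python sketch for a single triplet. It assumes "normalized" means L2 normalization; the function names are illustrative, and a real training loop would use a tensor library with batching.

```python
import math
from typing import List, Sequence


def l2_normalize(x: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (assumed meaning of 'normalized')."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in x]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float,
) -> float:
    """max(D(a, p) - D(a, n) + m, 0) on L2-normalized embeddings."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(euclidean(a, p) - euclidean(a, n) + margin, 0.0)


# Easy triplet: the positive points the same way as the anchor, so loss is 0.
easy = triplet_loss([1.0, 0.0], [2.0, 0.0], [0.0, 3.0], margin=0.5)
# Hard triplet: roles swapped, so the loss is sqrt(2) + 0.5.
hard = triplet_loss([1.0, 0.0], [0.0, 3.0], [2.0, 0.0], margin=0.5)
```

The loss is zero once the negative is at least the margin farther from the anchor than the positive, which is why the margin appears among the tuned hyperparameters.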

Implementation Details:

  • Parameter-efficient fine-tuning using LoRA
  • LoRA rank r=8, α=32
  • Training for 20 epochs with batch size 32

Technical Innovations

  1. Prompt Design Innovation: Specializes PromptEOL's generic sentence embedding method for frame embedding tasks
  2. Dual Optimization Strategy: ICL for low-resource scenarios, DML for supervised scenarios
  3. Parameter-Efficient Training: Uses LoRA to reduce computational resource requirements
  4. Cross-Lingual Adaptation: Enables multilingual support through simple prompt translation

Experimental Setup

Datasets

English FrameNet 1.7

  • Scale: 82,610 instances, 642 frames, 2,492 verbs
  • Split: Three-fold cross-validation, average 27,537 training instances
  • Characteristics: Test set includes frames unseen during training (on average 135.3 of 434.3 test-set frames)

Japanese FrameNet

  • Scale: 3,130 instances, 344 frames, 766 verbs
  • Split: Three-fold cross-validation, average 1,043 training instances
  • Challenge: Only about 3.8% of the English dataset size (3,130 vs. 82,610 instances)

Evaluation Metrics

Uses B-cubed precision (BCP), recall (BCR), and F-score (BCF) as primary evaluation metrics, with BCF as the main evaluation standard.
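
For readers unfamiliar with B-cubed scoring, the sketch below computes the three values from gold frame labels and predicted cluster ids. It follows the standard per-item definition and takes the F-score as the harmonic mean of the averaged precision and recall; whether the paper aggregates F in exactly this way is an assumption.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def bcubed(
    gold: Sequence[Hashable], pred: Sequence[Hashable]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F) under the B-cubed definition.

    gold[i] is the gold frame of instance i, pred[i] its predicted cluster.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    gold_sizes = Counter(gold)
    pred_sizes = Counter(pred)
    # Number of instances sharing both the gold frame and the cluster.
    overlap = Counter(zip(gold, pred))
    n = len(gold)
    precision = sum(overlap[(g, p)] / pred_sizes[p] for g, p in zip(gold, pred)) / n
    recall = sum(overlap[(g, p)] / gold_sizes[g] for g, p in zip(gold, pred)) / n
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


gold = ["Losing", "Losing", "Finish_competition", "Finish_competition"]
pred = [0, 0, 0, 1]  # one Finish_competition instance is merged into cluster 0
bcp, bcr, bcf = bcubed(gold, pred)
```

In this toy case precision is 2/3, recall is 3/4, and the F-score is 12/17 (about 0.706): merging one instance into the wrong cluster lowers precision, and splitting its frame across two clusters lowers recall.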

Baseline Methods

  • MLM Baselines: BERT-base/large, ModernBERT-base/large, RoBERTa-large
  • Clustering Methods: One-step clustering (average linkage) and two-step clustering (X-means + average linkage)
  • Training Settings: Both no fine-tuning and DML fine-tuning configurations
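
One-step clustering with average linkage can be illustrated with a deliberately naive pure-Python implementation. The distance-threshold stopping rule used here is an assumption for the sake of a runnable example; the summary does not state how the number of clusters is chosen, and a practical setup would rely on an optimized library routine.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def average_linkage(points: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Agglomerative clustering with group-average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance and stops once that distance exceeds `threshold` (assumed rule).
    Returns one cluster id per input point.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(c1: List[int], c2: List[int]) -> float:
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = avg_dist(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Toy 2-D "embeddings": two tight groups far apart.
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels = average_linkage(toy, threshold=2.0)
```

In the two-step variant, a per-verb clustering (X-means in the baselines listed above) would run first and average linkage would then merge the resulting clusters across verbs.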

Implementation Details

  • Models: Gemma 3-12B, Llama 3.1-8B, etc.
  • ICL Settings: 5/10/20 examples, maximum sequence length 2048
  • Hyperparameters: Learning rate {3e-5, 5e-5, 1e-4}, margin {0.1, 0.2, 0.5, 1.0}
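
If the listed values are treated as a search grid, the candidate DML configurations could be enumerated as follows. This is only a sketch: the dictionary keys are hypothetical names, and whether the authors searched the full Cartesian product or how they selected the best setting is not stated in this summary.

```python
from itertools import product

# Candidate values as listed above.
LEARNING_RATES = [3e-5, 5e-5, 1e-4]
MARGINS = [0.1, 0.2, 0.5, 1.0]

# Fixed settings reported for DML fine-tuning; key names are illustrative.
FIXED = {"lora_rank": 8, "lora_alpha": 32, "epochs": 20, "batch_size": 32}

grid = [
    {"learning_rate": lr, "margin": m, **FIXED}
    for lr, m in product(LEARNING_RATES, MARGINS)
]

for config in grid:
    # A real run would fine-tune with `config` here and record a
    # validation score; that step is omitted from this sketch.
    pass
```

Under this reading there are 3 × 4 = 12 candidate configurations per model and fold.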

Experimental Results

Main Results

English FrameNet Performance

| Model | Training Method | One-Step Clustering BCF | Two-Step Clustering BCF |
|---|---|---|---|
| RoBERTa-large + DML | DML | 67.9 | 69.6 |
| Gemma 3 + DML | DML | 71.9 | 70.6 |
| Llama 3.1 + DML | DML | 70.8 | 70.9 |

Key Findings:

  • CLM+DML outperforms the best MLM baseline (RoBERTa-large + DML): 71.9 vs. 67.9 BCF in one-step clustering and 70.9 vs. 69.6 in two-step clustering
  • Gemma 3 achieves highest 71.9 BCF in one-step clustering
  • ICL method shows performance improvement with increasing number of examples

Japanese FrameNet Performance

| Model | Training Method | One-Step Clustering BCF | Two-Step Clustering BCF |
|---|---|---|---|
| Japanese ModernBERT-base + DML | DML | 60.0 | 58.4 |
| LLM-jp-3 + DML | DML | 61.3 | 59.2 |
| Llama 3.1 + ICL (5-shot) | ICL | 59.9 | 57.4 |

Important Findings:

  • Llama 3.1 with only 5 ICL examples reaches 59.9 BCF in one-step clustering, comparable to the 60.0 of the DML-fine-tuned Japanese ModernBERT-base
  • Demonstrates CLM advantages on low-resource languages

Ablation Studies

Impact Analysis of "FrameNet" Terminology

Removing "FrameNet" terminology from prompts has limited performance impact:

  • Performance degradation less than 1% in ICL and DML settings
  • Suggests that the model does not simply rely on FrameNet knowledge from pretraining

Experimental Findings

  1. CLM Advantages: CLM+DML significantly outperforms MLM methods with sufficient training data
  2. ICL Potential: Few examples yield competitive performance, particularly suitable for low-resource scenarios
  3. Clustering Strategy: One-step clustering is sufficiently effective after DML/ICL optimization
  4. Cross-Lingual Capability: CLMs demonstrate strong multilingual frame understanding ability

Related Work

Semantic Frame Induction Research

  • Unsupervised Methods: Clustering using contextualized embeddings from MLMs such as BERT
  • Supervised Methods: Optimizing embedding space through deep metric learning
  • Two-Step Clustering: Addressing over-fragmentation issues of traditional methods

Prompt-Based Text Embedding

  • PromptBERT: Obtaining sentence embeddings through masked prediction
  • PromptEOL: Using CLM's next-word prediction capability for embeddings
  • This Work's Contribution: Specializing generic embedding methods for frame embedding tasks

Conclusions and Discussion

Main Conclusions

  1. First Successful Application: CLMs can effectively be applied to semantic frame induction, outperforming traditional MLM methods
  2. Low-Resource Advantages: ICL method demonstrates significant potential in data-scarce scenarios
  3. Cross-Lingual Effectiveness: Method achieves excellent performance on both English and Japanese

Limitations

  1. Computational Resources: Large-scale CLMs require significant computational resources
  2. Language Coverage: Validation only on English and Japanese; generalization to other languages unknown
  3. Domain Adaptation: Applicability in specific domains requires further verification

Future Directions

  1. Multilingual Extension: Validate method effectiveness on more languages
  2. Domain Adaptation: Explore application effects in specific domains
  3. Efficiency Optimization: Develop more efficient training and inference methods

In-Depth Evaluation

Strengths

  1. Strong Innovation: First systematic application of CLMs to semantic frame induction
  2. Complete Methodology: Provides both ICL and DML optimization strategies for different resource conditions
  3. Comprehensive Experiments: Full evaluation across two languages and multiple models
  4. Practical Value: Provides feasible solutions for frame construction in low-resource languages

Weaknesses

  1. Theoretical Analysis: Lacks in-depth theoretical explanation for why CLMs perform better on this task
  2. Computational Cost: Insufficient discussion of computational cost comparison with MLM methods
  3. Error Analysis: Lacks detailed analysis of failure cases
  4. Generalization: Validation only on FrameNet data; applicability to other frame resources unknown

Impact

  1. Academic Contribution: Opens new technical pathways for semantic frame research
  2. Practical Value: Provides practical tools for multilingual frame resource construction
  3. Reproducibility: Provides detailed experimental settings and hyperparameter configurations

Applicable Scenarios

  1. Low-Resource Languages: Languages with scarce frame resources
  2. Domain Adaptation: Scenarios requiring domain-specific frame construction
  3. Rapid Prototyping: Applications requiring quick frame system development

References

This paper cites important works from multiple domains including semantic frames, deep metric learning, and prompt-based learning, providing solid theoretical foundations for method design. Particularly noteworthy are foundational works by Yamada et al. (2021, 2023) in MLM-based frame induction and the PromptEOL method proposed by Jiang et al. (2024).


Overall Assessment: This is a high-quality research paper that successfully introduces causal language models to semantic frame induction tasks, with significant contributions in method innovation, experimental validation, and practical value. The breakthrough performance in low-resource language scenarios is particularly noteworthy and provides important reference for related field development.