FrameEOL: Semantic Frame Induction using Causal Language Models

Yano, Yamada, Tsukagoshi et al.

Semantic frame induction is the task of clustering frame-evoking words according to the semantic frames they evoke. In recent years, leveraging embeddings of frame-evoking words that are obtained using masked language models (MLMs) such as BERT has led to high-performance semantic frame induction. Although causal language models (CLMs) such as the GPT and Llama series succeed in a wide range of language comprehension tasks and can engage in dialogue as if they understood frames, they have not yet been applied to semantic frame induction. We propose a new method for semantic frame induction based on CLMs. Specifically, we introduce FrameEOL, a prompt-based method for obtaining Frame Embeddings that outputs One frame-name as a Label representing the given situation. To obtain embeddings more suitable for frame induction, we leverage in-context learning (ICL) and deep metric learning (DML). Frame induction is then performed by clustering the resulting embeddings. Experimental results on the English and Japanese FrameNet datasets demonstrate that the proposed methods outperform existing frame induction methods. In particular, for Japanese, which lacks extensive frame resources, the CLM-based method using only 5 ICL examples achieved comparable performance to the MLM-based method fine-tuned with DML.

Basic Information

Paper ID: 2510.09097
Title: FrameEOL: Semantic Frame Induction using Causal Language Models
Authors: Chihiro Yano¹, Kosuke Yamada¹,², Hayato Tsukagoshi¹, Ryohei Sasano¹, Koichi Takeda³
Affiliations: ¹Nagoya University, ²CyberAgent, ³National Institute of Informatics
Classification: cs.CL (Computational Linguistics)
Publication Date: October 10, 2025 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2510.09097

Abstract

Semantic frame induction is the task of clustering frame-evoking words according to the semantic frames they evoke. Recent work has achieved high performance by clustering embeddings of frame-evoking words obtained from masked language models (MLMs) such as BERT. Although causal language models (CLMs) such as the GPT and Llama series succeed in a wide range of language understanding tasks and can engage in dialogue as if they understood frames, they have not yet been applied to semantic frame induction. This paper proposes FrameEOL, a prompt-based method for CLMs that obtains a frame embedding by prompting the model to output one frame name as a label for the given situation. To obtain embeddings more suitable for frame induction, the method leverages in-context learning (ICL) and deep metric learning (DML); frame induction is then performed by clustering the resulting embeddings. Experiments on the English and Japanese FrameNet datasets show that the proposed methods outperform existing frame induction methods. Notably, for Japanese, which lacks extensive frame resources, the CLM-based method using only 5 ICL examples achieves performance comparable to the MLM-based method fine-tuned with DML.

Research Background and Motivation

Problem Definition

Semantic frame induction aims to automatically identify and cluster verb instances that evoke the same semantic frame. For example, the verb "lost" may evoke different semantic frames in different contexts:

"He lost the gold medal by just .02 points" → FINISH_COMPETITION frame
"He lost his gold medal at the restaurant" → LOSING frame

Research Significance

Resource Scarcity: Manual construction of semantic frame resources is prohibitively expensive; automatic construction is urgently needed
Multilingual Demand: Frame resources for languages other than English are extremely limited
Domain Adaptation: Specific domains may require frame representations at different granularities

Limitations of Existing Methods

MLM Dependency: Existing methods are primarily based on masked language models such as BERT
Resource Dependency: Requires substantial annotated data for effective training
Language Limitations: Poor performance on low-resource languages

Research Motivation

Although modern CLMs such as GPT-4o demonstrate the ability to understand semantic frames (as shown in the ChatGPT example in Figure 1), they have not been systematically applied to semantic frame induction tasks. This paper aims to fill this gap.

Core Contributions

First Application of CLMs to Semantic Frame Induction: Proposes FrameEOL method, extending PromptEOL for frame embedding acquisition
Multi-Strategy Optimization: Combines in-context learning (ICL) and deep metric learning (DML) to enhance embedding quality
Surpassing Existing Methods: Achieves the best performance on English FrameNet, with a BCF score of 71.9 (Gemma 3 + DML, one-step clustering)
Low-Resource Language Breakthrough: On Japanese FrameNet, achieves performance comparable to DML fine-tuned MLM using only 5 ICL examples
Bilingual Validation: Validates method effectiveness on both English and Japanese datasets

Methodology Details

Task Definition

Input: A set of sentences containing frame-evoking verbs
Output: Clustering of verb instances according to the semantic frames they evoke
Constraint: No predefined set of frame labels required

Model Architecture

3.1 FrameEOL Core Method

FrameEOL is inspired by PromptEOL and obtains frame embeddings through specially designed prompt templates:

Prompt Template:

The FrameNet frame evoked by "[verb]" in "[sentence]" is

Key Design:

[verb]: Frame-evoking verb placeholder
[sentence]: Placeholder for sentence containing the verb
Uses the final layer embedding of the last token "is" as the frame embedding
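
As a minimal sketch of how the template above can be filled in, the snippet below builds the prompt string for a single instance. The function name `build_frameeol_prompt` is hypothetical and does not come from the paper, and the step of running a CLM and reading the final-layer hidden state of the last token is not shown.

```python
# Template copied from the summary above; the helper name is an illustrative assumption.
TEMPLATE = 'The FrameNet frame evoked by "{verb}" in "{sentence}" is'


def build_frameeol_prompt(verb: str, sentence: str) -> str:
    """Fill the FrameEOL template for one frame-evoking verb instance."""
    return TEMPLATE.format(verb=verb, sentence=sentence)


prompt = build_frameeol_prompt("lost", "He lost his gold medal at the restaurant.")
print(prompt)
```

Because the prompt ends with "is", the position whose embedding is used is the one at which the model would predict the frame name next.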

3.2 In-Context Learning Optimization (ICL)

To address low-resource language challenges, ICL is introduced:

Example Construction:

The FrameNet frame evoked by "wear" in "On his head he wore a white nightcap..." is Wearing.
The FrameNet frame evoked by "type" in "I typed it out for Diana Morrison." is Text_creation.
The FrameNet frame evoked by "kneel" in "He knelt up and leaned towards Lucien." is Change_posture.

The FrameNet frame evoked by "lost" in "He lost his gold medal at the restaurant." is

Advantages: Significant performance improvement with only a few examples (5-20), particularly suitable for scenarios with scarce training data.
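
The example construction above can be sketched as follows: labeled demonstrations are completed with their frame names and placed before the unlabeled query. The helper name `build_icl_prompt`, the tuple layout, and the use of a single newline as separator are assumptions made for illustration, not details confirmed by the paper.

```python
# Template copied from the summary; everything else here is an illustrative assumption.
TEMPLATE = 'The FrameNet frame evoked by "{verb}" in "{sentence}" is'


def build_icl_prompt(demonstrations, verb, sentence):
    """Prepend labeled demonstrations to the unlabeled query.

    demonstrations: list of (verb, sentence, frame_name) tuples.
    """
    lines = [
        TEMPLATE.format(verb=v, sentence=s) + f" {frame}."
        for v, s, frame in demonstrations
    ]
    # The query is left open so that the last token is still "is".
    lines.append(TEMPLATE.format(verb=verb, sentence=sentence))
    return "\n".join(lines)  # newline separator is an assumption


demos = [
    ("type", "I typed it out for Diana Morrison.", "Text_creation"),
    ("kneel", "He knelt up and leaned towards Lucien.", "Change_posture"),
]
icl_prompt = build_icl_prompt(demos, "lost", "He lost his gold medal at the restaurant.")
print(icl_prompt)
```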

3.3 Deep Metric Learning Optimization (DML)

Employs triplet loss function to optimize the embedding space:

$L_{tri} = \max(D(x_a, x_p) - D(x_a, x_n) + m, 0)$

Where:

$x_a, x_p, x_n$ : Frame embeddings of anchor, positive, and negative samples
$D(\cdot, \cdot)$ : Euclidean distance of normalized embeddings
$m$ : Margin parameter
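
To make the loss concrete, here is a small pure-Python sketch that evaluates $L_{tri}$ for one triplet, with $D$ computed as the Euclidean distance between L2-normalized vectors. The function names and the toy two-dimensional vectors are assumptions for illustration; an actual training setup would compute this on batches of CLM embeddings with an autodiff framework.

```python
import math


def l2_normalize(x):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def distance(x, y):
    """Euclidean distance between the L2-normalized versions of x and y."""
    x, y = l2_normalize(x), l2_normalize(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def triplet_loss(anchor, positive, negative, margin):
    """max(D(anchor, positive) - D(anchor, negative) + margin, 0)."""
    return max(distance(anchor, positive) - distance(anchor, negative) + margin, 0.0)


# Positive points the same way as the anchor, negative is orthogonal: no loss.
easy = triplet_loss([2.0, 0.0], [1.0, 0.0], [0.0, 3.0], margin=0.5)
# Roles swapped: the "positive" is far away and the "negative" is close, so the loss is large.
hard = triplet_loss([2.0, 0.0], [0.0, 3.0], [1.0, 0.0], margin=0.5)
print(easy, hard)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that violate this condition contribute to training.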

Implementation Details:

Parameter-efficient fine-tuning using LoRA
LoRA rank r=8, α=32
Training for 20 epochs with batch size 32
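
For orientation, the reported rank and α can be read against the standard LoRA formulation, in which a frozen weight matrix $W$ receives a low-rank update scaled by $\alpha / r$. The symbols $W$, $B$, and $A$ are assumptions taken from that standard formulation; the summary does not state which weight matrices are adapted or how the implementation scales the update.

$W' = W + \frac{\alpha}{r} B A, \qquad \frac{\alpha}{r} = \frac{32}{8} = 4$

Under this reading, $B$ and $A$ have inner dimension $r = 8$, and their product is multiplied by 4 before being added to the frozen weights.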

Technical Innovations

Prompt Design Innovation: Specializes PromptEOL's generic sentence embedding method for frame embedding tasks
Dual Optimization Strategy: ICL for low-resource scenarios, DML for supervised scenarios
Parameter-Efficient Training: Uses LoRA to reduce computational resource requirements
Cross-Lingual Adaptation: Enables multilingual support through simple prompt translation

Experimental Setup

Datasets

English FrameNet 1.7

Scale: 82,610 instances, 642 frames, 2,492 verbs
Split: Three-fold cross-validation, average 27,537 training instances
Characteristics: Test set includes frames unseen during training (on average, 135.3 of the 434.3 frames in the test set)

Japanese FrameNet

Scale: 3,130 instances, 344 frames, 766 verbs
Split: Three-fold cross-validation, average 1,043 training instances
Challenge: Only about 3.8% of the English dataset's size (3,130 vs. 82,610 instances)

Evaluation Metrics

Uses B-cubed precision (BCP), recall (BCR), and F-score (BCF) as primary evaluation metrics, with BCF as the main evaluation standard.
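
The following is a minimal sketch of B-cubed scoring for a clustering, using the common per-instance definition: precision is the share of an instance's predicted cluster that has the same gold frame, and recall is the share of its gold frame that falls in the same predicted cluster. Taking the F-score as the harmonic mean of the averaged precision and recall is an assumption about the exact variant used in the paper, and the function name `bcubed` is hypothetical.

```python
from collections import Counter


def bcubed(gold, pred):
    """Return (precision, recall, f_score) for parallel lists of gold and predicted labels."""
    n = len(gold)
    cluster_sizes = Counter(pred)       # size of each predicted cluster
    class_sizes = Counter(gold)         # size of each gold frame
    overlap = Counter(zip(gold, pred))  # instances sharing both gold frame and cluster
    precision = sum(overlap[(g, p)] / cluster_sizes[p] for g, p in zip(gold, pred)) / n
    recall = sum(overlap[(g, p)] / class_sizes[g] for g, p in zip(gold, pred)) / n
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Three instances of "lost" merged into one cluster although two frames are involved.
gold = ["LOSING", "LOSING", "FINISH_COMPETITION"]
pred = [0, 0, 0]
bcp, bcr, bcf = bcubed(gold, pred)
print(round(bcp, 3), round(bcr, 3), round(bcf, 3))
```

Merging everything into one cluster keeps recall at 1.0 but lowers precision, which is why the F-score is used as the main criterion.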

Baseline Methods

MLM Baselines: BERT-base/large, ModernBERT-base/large, RoBERTa-large
Clustering Methods: One-step clustering (average linkage) and two-step clustering (X-means + average linkage)
Training Settings: Both no fine-tuning and DML fine-tuning configurations
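
One-step clustering with average linkage can be illustrated with the naive pure-Python sketch below. It repeatedly merges the two clusters with the smallest mean pairwise distance. Stopping at a fixed distance threshold, using Euclidean distance, and the function names are assumptions made for illustration; the summary does not say how the number of clusters is chosen, and the two-step variant with X-means is not shown.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(points, threshold):
    """Agglomerative clustering; stops when the closest pair of clusters exceeds threshold."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Mean distance over all cross-cluster pairs (average linkage).
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:  # assumed stopping criterion
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


toy_embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels = average_linkage(toy_embeddings, threshold=2.0)
print(labels)
```

This version recomputes all pairwise linkages at every merge, so it is only suitable for toy inputs; a library implementation would be used for tens of thousands of instances.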

Implementation Details

Models: Gemma 3-12B, Llama 3.1-8B, etc.
ICL Settings: 5/10/20 examples, maximum sequence length 2048
Hyperparameters: Learning rate {3e-5, 5e-5, 1e-4}, margin {0.1, 0.2, 0.5, 1.0}
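
If the two hyperparameter sets above are searched exhaustively (an assumption; the summary lists the candidate values but not the search procedure), the resulting grid can be enumerated as in this short sketch. The dictionary keys are illustrative names.

```python
from itertools import product

# Candidate values copied from the summary above.
learning_rates = [3e-5, 5e-5, 1e-4]
margins = [0.1, 0.2, 0.5, 1.0]

# Full Cartesian product: every learning rate paired with every margin.
grid = [
    {"learning_rate": lr, "margin": m}
    for lr, m in product(learning_rates, margins)
]
print(len(grid))  # 3 learning rates x 4 margins = 12 configurations
```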

Experimental Results

Main Results

English FrameNet Performance

Model	Training Method	One-Step Clustering BCF	Two-Step Clustering BCF
RoBERTa-large + DML	DML	67.9	69.6
Gemma 3 + DML	DML	71.9	70.6
Llama 3.1 + DML	DML	70.8	70.9

Key Findings:

CLM+DML outperforms the best MLM method (71.9 vs. 67.9 BCF with one-step clustering; 70.9 vs. 69.6 with two-step clustering)
Gemma 3 achieves highest 71.9 BCF in one-step clustering
ICL method shows performance improvement with increasing number of examples

Japanese FrameNet Performance

Model	Training Method	One-Step Clustering BCF	Two-Step Clustering BCF
Japanese ModernBERT-base + DML	DML	60.0	58.4
LLM-jp-3 + DML	DML	61.3	59.2
Llama 3.1 + ICL(5-shot)	ICL	59.9	57.4

Important Findings:

With only 5 ICL examples, Llama 3.1 reaches 59.9 BCF in one-step clustering, comparable to the 60.0 of the DML-fine-tuned Japanese ModernBERT-base
Demonstrates CLM advantages on low-resource languages

Ablation Studies

Impact Analysis of "FrameNet" Terminology

Removing "FrameNet" terminology from prompts has limited performance impact:

Performance degradation less than 1% in ICL and DML settings
Suggests that the model does not simply rely on FrameNet-specific knowledge acquired during pretraining

Experimental Findings

CLM Advantages: CLM+DML outperforms MLM methods when sufficient training data is available, as on English FrameNet
ICL Potential: A few examples yield competitive performance, which is particularly useful in low-resource scenarios
Clustering Strategy: One-step clustering is sufficiently effective after DML/ICL optimization
Cross-Lingual Capability: CLMs produce effective frame embeddings for both English and Japanese

Related Work

Semantic Frame Induction Research

Unsupervised Methods: Clustering using contextualized embeddings from MLMs such as BERT
Supervised Methods: Optimizing embedding space through deep metric learning
Two-Step Clustering: Addressing over-fragmentation issues of traditional methods

Prompt-Based Text Embedding

PromptBERT: Obtaining sentence embeddings through masked prediction
PromptEOL: Using CLM's next-word prediction capability for embeddings
This Work's Contribution: Specializing generic embedding methods for frame embedding tasks

Conclusions and Discussion

Main Conclusions

First Successful Application: CLMs can effectively be applied to semantic frame induction, outperforming traditional MLM methods
Low-Resource Advantages: ICL method demonstrates significant potential in data-scarce scenarios
Cross-Lingual Effectiveness: Method outperforms the MLM baselines on both English and Japanese FrameNet

Limitations

Computational Resources: Large-scale CLMs require significant computational resources
Language Coverage: Validation only on English and Japanese; generalization to other languages unknown
Domain Adaptation: Applicability in specific domains requires further verification

Future Directions

Multilingual Extension: Validate method effectiveness on more languages
Domain Adaptation: Explore application effects in specific domains
Efficiency Optimization: Develop more efficient training and inference methods

In-Depth Evaluation

Strengths

Strong Innovation: First systematic application of CLMs to semantic frame induction
Complete Methodology: Provides both ICL and DML optimization strategies for different resource conditions
Comprehensive Experiments: Full evaluation across two languages and multiple models
Practical Value: Provides feasible solutions for frame construction in low-resource languages

Weaknesses

Theoretical Analysis: Lacks in-depth theoretical explanation for why CLMs perform better on this task
Computational Cost: Insufficient discussion of computational cost comparison with MLM methods
Error Analysis: Lacks detailed analysis of failure cases
Generalization: Validation only on FrameNet data; applicability to other frame resources unknown

Impact

Academic Contribution: Opens new technical pathways for semantic frame research
Practical Value: Provides practical tools for multilingual frame resource construction
Reproducibility: Provides detailed experimental settings and hyperparameter configurations

Applicable Scenarios

Low-Resource Languages: Languages with scarce frame resources
Domain Adaptation: Scenarios requiring domain-specific frame construction
Rapid Prototyping: Applications requiring quick frame system development

References

This paper cites important works from multiple domains including semantic frames, deep metric learning, and prompt-based learning, providing solid theoretical foundations for method design. Particularly noteworthy are foundational works by Yamada et al. (2021, 2023) in MLM-based frame induction and the PromptEOL method proposed by Jiang et al. (2024).

Overall Assessment: The paper introduces causal language models to semantic frame induction and supports the approach with experiments on English and Japanese FrameNet across multiple models. The most notable result is in the low-resource Japanese setting, where 5 ICL examples yield performance comparable to a DML-fine-tuned MLM.