2025-11-17T06:22:13.355563

Survey in Characterization of Semantic Change

de Sá, Da Silveira, Pruski

Living languages continuously evolve to integrate the cultural change of human societies. This evolution manifests through neologisms (new words) or semantic changes of words (new meanings for existing words). Understanding the meaning of words is vital for interpreting texts coming from different cultures (regionalism or slang), domains (e.g., technical terms), or periods. In computer science, these words are relevant to computational linguistics algorithms such as translation, information retrieval, and question answering. Semantic changes can potentially impact the quality of the outcomes of these algorithms. Therefore, it is important to understand and characterize these changes formally. The study of this impact is a recent problem that has attracted the attention of the computational linguistics community. Several approaches propose methods to detect semantic changes with good precision, but more effort is needed to characterize how the meaning of words changes and to reason about how to reduce the impact of semantic change. This survey provides an understandable overview of existing approaches to the characterization of semantic changes and also formally defines three classes of characterizations: whether the meaning of a word becomes more general or narrower (change in dimension), whether the word is used in a more pejorative or a more positive/ameliorated sense (change in orientation), and whether there is a trend to use the word in, for instance, a metaphoric or metonymic context (change in relation). We summarize the main aspects of the selected publications in a table and discuss the needs and trends in research on semantic change characterization.

academic

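A Survey on the Characterization of Semantic Change

Basic Information

Paper ID: 2402.19088
Title: Survey in Characterization of Semantic Change
Authors: Jader Martins Camboim de Sá, Marcos Da Silveira, Cédric Pruski (Luxembourg Institute of Science and Technology and University of Luxembourg)
Categories: cs.CL (Computation and Language), cs.AI
Publication: preprint, November 17, 2025 (arXiv v4)
Paper link: https://arxiv.org/abs/2402.19088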

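Summary

Language is a dynamically evolving system that reflects sociocultural change through neologisms and through semantic change of existing words. Understanding word meaning is important for interpreting texts from different cultures, domains, and periods, and it directly affects the performance of NLP applications such as machine translation, information retrieval, and question answering. Existing methods detect semantic change with good precision, but methods that characterize the type of change have received little systematic study. This survey gives the first comprehensive overview of existing methods for semantic change characterization and formally defines three types of change: change in dimension (broadening or narrowing of meaning), change in orientation (meaning becoming more pejorative or more positive), and change in relation (meaning shifting through figures such as metaphor and metonymy). The paper summarizes the main research results, analyzes current limitations, and points out directions for future research.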

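Research Background and Motivation

1. Core Problem

Lexical Semantic Change (LSC) is a core phenomenon of natural language evolution. Existing work focuses mainly on whether a semantic change occurred (detection), while work that characterizes how the meaning changed is very scarce. For example:

"gay" changed from "cheerful" to "homosexual" (narrowing in dimension plus neutralization in orientation)
"heart" extended from "the organ" to metaphorical senses such as "courage" and "core" (change in relation)
"awful" changed from "awe-inspiring" to "terrible" (pejoration in orientation)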

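2. Importance

Linguistic value: understanding the regularities of language evolution and revealing how culture, society, and technology influence language
NLP applications:
- Understanding historical texts (e.g., digital humanities research)
- Knowledge graph maintenance (e.g., temporal consistency of Wikidata)
- Cross-period information retrieval (e.g., semantic drift of "cloud" in technical literature)
- Sentiment analysis (e.g., amelioration of "sick" in slang)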

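3. Limitations of Existing Methods

No unified formal framework: each study uses its own terminology and definitions, which makes comparison difficult
Inconsistent evaluation: standard datasets and evaluation metrics are lacking
Detection favored over characterization: 90% of studies focus on "did it change", and only 10% study "how did it change"
Data scarcity: historical corpora are far smaller than what modern NLP requires (millions of words vs. trillions of words)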

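4. Research Motivation

This paper is the first to systematically survey semantic change characterization, with the following aims:

Identify the limitations of existing representation and classification methods
Assess the strengths of the different methods
Provide formal definitions based on first-order logic
Give a conceptual demonstration of the LSC characterization task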

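Core Contributions

First characterization-oriented LSC survey: unlike existing surveys (Tahmasebi et al. 2018; Kutuzov et al. 2018), which focus on detection, this paper focuses on characterization
Three-Pole Taxonomy:
- Dimension: broadening/narrowing (change in the number of senses)
- Orientation: amelioration/pejoration (change in sentiment polarity)
- Relation: metaphorization/metonymization (change in rhetorical relation)
Formal framework: provides set-theoretic mathematical definitions (Section 5) and distinguishes identification from characterization
Systematic classification of methods: builds a two-dimensional classification matrix of representation method (frequency / topic / graph / embedding) × pole of change (D/R/O) (Table 3)
Empirical demonstration: validates the feasibility of the framework using the SEMCOR and MASC datasets
Research gaps identified: points out that the relation pole (R) and joint multi-pole characterization are under-studied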

Basic Definitions

Semantic universe: $S_T$ is the set of all possible senses
Sense function: $S: V \times T \rightarrow \wp(S_T)$ maps a word $w$ to the set of its senses in corpus $t$: $S(w, t) = \{s_1, s_2, \ldots, s_k\}$
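
To make the notation concrete, the sketch below stores a toy sense inventory as a Python dictionary and implements $S(w, t)$ as a lookup. The `SENSES` data, the corpus labels `t1` and `t2`, the placeholder word `word_x`, and the `dimension_change` helper are all hypothetical. In particular, reading broadening and narrowing as strict superset and strict subset relations between sense sets is an assumption made for illustration, not the paper's definition.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical toy sense inventory: (word, corpus label) -> set of sense labels.
# Corpus labels, sense labels, and "word_x" are invented for illustration only.
SENSES: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("heart", "t1"): frozenset({"organ"}),
    ("heart", "t2"): frozenset({"organ", "courage", "core"}),
    ("word_x", "t1"): frozenset({"s1", "s2"}),
    ("word_x", "t2"): frozenset({"s1"}),
}


def S(word: str, corpus: str) -> FrozenSet[str]:
    """Sense function S(w, t): senses of `word` in `corpus` (empty set if unknown)."""
    return SENSES.get((word, corpus), frozenset())


def dimension_change(word: str, t1: str, t2: str) -> str:
    """Compare S(w, t1) and S(w, t2) as sets (assumed reading of the dimension pole)."""
    before, after = S(word, t1), S(word, t2)
    if before == after:
        return "unchanged"
    if before < after:   # strict subset: every old sense kept, new ones added
        return "broadening"
    if after < before:   # strict superset: some old senses lost, none added
        return "narrowing"
    return "other"       # senses both gained and lost


if __name__ == "__main__":
    for w in ("heart", "word_x"):
        print(w, sorted(S(w, "t1")), "->", sorted(S(w, "t2")), dimension_change(w, "t1", "t2"))
```

Using `frozenset` keeps each sense set immutable and lets the subset operators express the comparison directly. The orientation and relation poles would need information beyond set membership (for example, polarity scores or links between senses), so they are left out of this sketch.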

Determining Semantic Change

A word $w$ undergoes a semantic change between $t_1$ and $t_2$ if and only if:
