
Interpreting the Latent Structure of Operator Precedence in Language Models

Yugeswardeenoo, Nukala, Blondin et al.

Large Language Models (LLMs) have demonstrated impressive reasoning capabilities but continue to struggle with arithmetic tasks. Prior work largely focuses on outputs or prompting strategies, leaving open the question of what internal structure models use to carry out arithmetic computation. In this work, we investigate whether LLMs encode operator precedence in their internal representations, using the open-source instruction-tuned LLaMA 3.2-3B model. We construct a dataset of arithmetic expressions with three operands and two operators, varying the order of operations and the placement of parentheses. Using this dataset, we trace whether intermediate results appear in the model's residual stream. We apply interpretability techniques such as logit lens, linear classification probes, and UMAP geometric visualization. Our results show that intermediate computations are present in the residual stream, particularly after MLP blocks. We also find that the model linearly encodes precedence in each operator's embedding after the attention layer. We introduce partial embedding swap, a technique that modifies operator precedence by exchanging high-impact embedding dimensions between operators.


Interpreting the Latent Structure of Operator Precedence in Language Models

Basic Information

Paper ID: 2510.13908
Title: Interpreting the Latent Structure of Operator Precedence in Language Models
Authors: Dharunish Yugeswardeenoo, Harshil Nukala, Cole Blondin, Sean O'Brien, Vasu Sharma, Kevin Zhu
Category: cs.CL (Computation and Language)
Publication date/venue: COLM 2025
Paper link: https://arxiv.org/abs/2510.13908

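Abstract

Large language models (LLMs) show strong reasoning performance but still struggle with arithmetic tasks. Prior work has focused mainly on outputs or prompting strategies and has overlooked the internal structure through which models carry out arithmetic computation. This study uses the open-source instruction-tuned LLaMA 3.2-3B model to investigate whether LLMs encode operator precedence in their internal representations. The authors construct a dataset of arithmetic expressions with three operands and two operators, varying the order of operations and the placement of parentheses. Using this dataset, they trace whether intermediate results appear in the model's residual stream, applying interpretability techniques such as logit lens, linear classification probes, and UMAP geometric visualization. The results show that intermediate computations are present in the residual stream, particularly after MLP blocks. The study also finds that the model linearly encodes precedence information in the operator embeddings after the attention layer. The paper introduces partial embedding swap, a technique that modifies operator precedence by exchanging high-impact embedding dimensions between operators.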
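To make the dataset description concrete, here is a minimal sketch of how expressions with three operands and two operators could be enumerated under different parenthesizations. The operator set, the operand range, the three templates, and the name `build_expressions` are all assumptions for illustration; the summary does not specify the paper's actual choices.

```python
import itertools

# Assumed operator set and parenthesization templates (not from the paper).
OPERATORS = ["+", "-", "*"]
TEMPLATES = [
    "{a} {op1} {b} {op2} {c}",    # no parentheses: precedence decides the order
    "({a} {op1} {b}) {op2} {c}",  # left pair grouped first
    "{a} {op1} ({b} {op2} {c})",  # right pair grouped first
]


def build_expressions(operands=range(1, 4)):
    """Enumerate every expression over the assumed operand range.

    Each row holds the expression text and its ground-truth value.
    eval() is used only on strings this function generates itself.
    """
    rows = []
    for a, b, c in itertools.product(operands, repeat=3):
        for op1, op2 in itertools.product(OPERATORS, repeat=2):
            for template in TEMPLATES:
                text = template.format(a=a, op1=op1, b=b, op2=op2, c=c)
                rows.append({"text": text, "answer": eval(text)})
    return rows


if __name__ == "__main__":
    dataset = build_expressions()
    print(len(dataset))
    print(dataset[0])
```

Pairing each unparenthesized expression with both grouped variants gives examples where the operands and operators are identical and only the evaluation order differs, which is the contrast a precedence analysis needs.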

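Research Background and Motivation

Problem Definition

The core question this study addresses is how large language models encode operator precedence rules in their internal representations when processing arithmetic expressions. Specifically, when a model encounters an expression such as "1 + 1 × 2", does it compute the multiplication first, as mathematical precedence rules require, or does it simply process the expression from left to right?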
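The two hypotheses can be stated as two evaluation strategies that give different answers for the same expression. The following is a small illustrative sketch, not code from the paper; the function names and the `PRECEDENCE` table are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
# Hypothetical precedence table: a larger number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2}


def eval_left_to_right(a, op1, b, op2, c):
    """Ignore precedence and compute (a op1 b) op2 c."""
    return OPS[op2](OPS[op1](a, b), c)


def eval_with_precedence(a, op1, b, op2, c):
    """Apply the higher-precedence operator first; ties go left to right."""
    if PRECEDENCE[op2] > PRECEDENCE[op1]:
        return OPS[op1](a, OPS[op2](b, c))
    return OPS[op2](OPS[op1](a, b), c)


if __name__ == "__main__":
    print(eval_with_precedence(1, "+", 1, "*", 2))  # 3: 1 + (1 * 2)
    print(eval_left_to_right(1, "+", 1, "*", 2))    # 4: (1 + 1) * 2
```

For "1 + 1 × 2" the final answers differ (3 versus 4), but both strategies happen to produce the same intermediate value, 2. Expressions whose intermediate values also differ, such as "2 + 3 × 4", distinguish the two strategies more cleanly when one looks for intermediate results inside a model.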