
Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation

Huang, Datla, Zhu et al.
We propose a method for confidence estimation in retrieval-augmented generation (RAG) systems that aligns closely with the correctness of large language model (LLM) outputs. Confidence estimation is especially critical in high-stakes domains such as finance and healthcare, where the cost of an incorrect answer outweighs that of not answering the question. Our approach extends prior uncertainty quantification methods by leveraging raw feed-forward network (FFN) activations as auto-regressive signals, avoiding the information loss inherent in token logits and probabilities after projection and softmax normalization. We model confidence prediction as a sequence classification task, and regularize training with a Huber loss term to improve robustness against noisy supervision. Applied in a real-world financial industry customer-support setting with complex knowledge bases, our method outperforms strong baselines and maintains high accuracy under strict latency constraints. Experiments on Llama 3.1 8B model show that using activations from only the 16th layer preserves accuracy while reducing response latency. Our results demonstrate that activation-based confidence modeling offers a scalable, architecture-aware path toward trustworthy RAG deployment.

Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation

Basic Information

  • Paper ID: 2510.13750
  • Title: Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation
  • Authors: Zhiqi Huang, Vivek Datla, Chenyang Zhu, Alfy Samuel, Daben Liu, Anoop Kumar, Ritesh Soni (Capital One)
  • Category: cs.CL (Computation and Language)
  • Published: October 16, 2025 (arXiv v2)
  • Paper link: https://arxiv.org/abs/2510.13750v2

Abstract

This paper proposes a confidence estimation method for retrieval-augmented generation (RAG) systems whose scores align closely with the correctness of large language model (LLM) outputs. Confidence estimation matters most in high-stakes domains such as finance and healthcare, where the cost of a wrong answer far exceeds the cost of not answering. The method extends prior uncertainty quantification approaches by using raw feed-forward network (FFN) activations as auto-regressive signals, avoiding the information loss that token logits and probabilities suffer after projection and softmax normalization. The authors model confidence prediction as a sequence classification task and regularize training with a Huber loss term to improve robustness against noisy supervision. In a real-world financial customer-support scenario with a complex knowledge base, the method outperforms strong baselines while maintaining high accuracy under strict latency constraints.

Research Background and Motivation

Problem Definition

In high-stakes applications, a RAG system is better off declining to answer than giving a wrong answer. This requires a confidence measure that correlates strongly with response correctness, so that a response can be blocked whenever its confidence score is at or below a threshold.

Importance of the Problem

  1. High-stakes domain requirements: in strictly regulated domains such as finance and healthcare, the reputational and financial cost of a wrong answer is far higher than the cost of not answering
  2. Real-time deployment challenges: existing methods perform poorly on long free-form answers and under production latency requirements
  3. Source of uncertainty: the uncertainty stems mainly from epistemic uncertainty (missing model knowledge) rather than aleatoric uncertainty (randomness inherent in the data)

Limitations of Existing Methods

  1. Sampling-based methods: they require multiple generations, which causes excessive compute cost and latency in production
  2. Token-probability methods: they perform poorly on long answers, where a single low-probability token can unfairly drag down the score of the whole sequence
  3. Information loss: token probabilities lose much of the rich internal representation after the linear projection and softmax transformation
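
The second limitation can be made concrete with a small sketch. The aggregation rules (joint probability, mean log-probability, minimum probability) and the per-token probabilities below are illustrative assumptions, not values or baselines taken from the paper.

```python
import math

def sequence_probability(token_probs):
    """Joint probability of the sequence: product of per-token probabilities."""
    return math.exp(sum(math.log(p) for p in token_probs))

def mean_log_prob(token_probs):
    """Length-normalized log-probability of the sequence."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

# Hypothetical per-token probabilities for a 20-token answer.
fluent = [0.9] * 20
one_rare_token = [0.9] * 19 + [0.01]  # same length, one rare but possibly valid token

for name, probs in [("fluent", fluent), ("one rare token", one_rare_token)]:
    print(f"{name:>15}: p(seq)={sequence_probability(probs):.5f}  "
          f"mean log p={mean_log_prob(probs):.3f}  min p={min(probs):.2f}")
```

A single rare token divides the joint probability by 90 in this toy case, even though the other 19 tokens are unchanged, which is the kind of sensitivity that a learned sequence-level classifier is meant to avoid.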

Core Contributions

  1. Activation-based confidence estimation: uses raw FFN activations as auto-regressive signals, avoiding the information loss of token logits
  2. Sequence classification framework: models confidence prediction as a sequence classification task and processes the activation sequence with an LSTM
  3. Huber loss regularization: introduces a Huber loss term to improve robustness against noisy supervision from the retrieval stage
  4. Production validation: verifies the effectiveness and scalability of the method in a real financial customer-support scenario
  5. Efficiency optimization: shows that using only layer-16 activations preserves accuracy while substantially reducing latency

Method Details

Task Definition

Given an input x and a generated sequence s, the goal is to estimate a confidence score c that correlates strongly with response correctness. When c is at or below a threshold, the system declines to display the response.
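
As a minimal sketch of this gating rule, the snippet below shows a response being displayed only when its confidence is strictly above the threshold. The `GatedResponse` type, the `gate_response` helper and the default threshold of 0.5 are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GatedResponse:
    shown: bool
    text: Optional[str]
    confidence: float

def gate_response(answer: str, confidence: float, threshold: float = 0.5) -> GatedResponse:
    """Show the answer only when confidence is strictly above the threshold."""
    if confidence <= threshold:
        # At or below the threshold: abstain instead of risking a wrong answer.
        return GatedResponse(shown=False, text=None, confidence=confidence)
    return GatedResponse(shown=True, text=answer, confidence=confidence)

print(gate_response("Your card has no annual fee.", confidence=0.93))
print(gate_response("Your card has no annual fee.", confidence=0.41))
```

In a deployment, the abstaining branch would typically route the user to a fallback such as a human agent; that routing is outside what the summary describes.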

Model Architecture

Overall Framework

The input sequence is constructed as follows:

$$x = x_I \oplus x_Q \oplus x_C \oplus s \oplus x_{\mathrm{EOS}}$$

where $x_I$ is the instruction, $x_Q$ the question, $x_C$ the context, $s$ the answer, and $x_{\mathrm{EOS}}$ the end-of-sequence token.

Activation Extraction

Hidden-state activations are extracted from layer $\ell$ of the Transformer:

$$H_\ell = \left(h_\ell^{1}, \ldots, h_\ell^{T+L+1}\right)$$

Only the activations corresponding to the answer portion are kept:

$$S_{\mathrm{in}} = \left(h_\ell^{T+1}, h_\ell^{T+2}, \ldots, h_\ell^{T+L+1}\right)$$
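
The index ranges can be illustrated with a toy sequence. This sketch assumes that T is the number of prompt tokens (instruction, question and context) and L the number of answer tokens, so that position T+L+1 is the end-of-sequence token; the summary does not define T and L explicitly, and the tokens and stand-in activations are made up.

```python
def answer_activations(hidden_states, prompt_len):
    """Keep positions T+1 .. T+L+1 (1-indexed): the answer tokens plus EOS."""
    return hidden_states[prompt_len:]

# Hypothetical token sequence: instruction + question + context, then answer + EOS.
prompt_tokens = ["<inst>", "Q:", "fee?", "CTX:", "no", "fee"]  # T = 6
answer_tokens = ["There", "is", "no", "fee"]                    # L = 4
tokens = prompt_tokens + answer_tokens + ["<eos>"]              # T + L + 1 = 11

# Stand-in for the layer activations: one small vector per token position,
# filled with the 1-indexed position so the slice is easy to inspect.
hidden_dim = 3
H = [[float(pos)] * hidden_dim for pos in range(1, len(tokens) + 1)]

S_in = answer_activations(H, prompt_len=len(prompt_tokens))
print(len(S_in), S_in[0], S_in[-1])
```

With a real model, `H` would be the per-position hidden states of the chosen layer; how they are obtained depends on the inference framework and is not shown here.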

Sequence Classifier

An LSTM serves as the sequence classifier $g(S_{\mathrm{in}})$. It outputs a two-dimensional logit vector $z$, and the confidence score is:

$$c = \mathrm{softmax}(z)_1 = \frac{e^{z_1}}{e^{z_0} + e^{z_1}}$$
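
The following is a dependency-free sketch of such a classifier: a single-layer LSTM run over the answer activations, a linear head producing two logits, and the softmax score for class 1. The hidden size, random weights and toy inputs are assumptions made purely for illustration; the paper's actual classifier configuration is not given in this summary.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_logits(seq, p):
    """Run a single-layer LSTM over seq and map the last hidden state to 2 logits."""
    n = len(p["W_out"][0])  # hidden size
    h, cell = [0.0] * n, [0.0] * n
    for x in seq:
        pre = [a + b + k for a, b, k in
               zip(matvec(p["W_x"], x), matvec(p["W_h"], h), p["b"])]
        i = [sigmoid(v) for v in pre[0:n]]            # input gate
        f = [sigmoid(v) for v in pre[n:2 * n]]        # forget gate
        g = [math.tanh(v) for v in pre[2 * n:3 * n]]  # candidate cell state
        o = [sigmoid(v) for v in pre[3 * n:4 * n]]    # output gate
        cell = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, cell, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, cell)]
    return [a + b for a, b in zip(matvec(p["W_out"], h), p["b_out"])]

def confidence(z):
    """c = softmax(z)_1, computed stably by subtracting max(z)."""
    m = max(z)
    e0, e1 = math.exp(z[0] - m), math.exp(z[1] - m)
    return e1 / (e0 + e1)

def random_params(input_dim, hidden, rng):
    """Untrained, randomly initialized weights (illustration only)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W_x": mat(4 * hidden, input_dim), "W_h": mat(4 * hidden, hidden),
            "b": [0.0] * (4 * hidden), "W_out": mat(2, hidden), "b_out": [0.0, 0.0]}

rng = random.Random(0)
params = random_params(input_dim=3, hidden=4, rng=rng)
S_in = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]  # 5 answer positions
z = lstm_logits(S_in, params)
c = confidence(z)
print(f"z = {z}, c = {c:.3f}")
```

Because the weights are untrained, the printed score carries no meaning; the point is only the data flow from the activation sequence to a scalar in (0, 1).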

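Training Strategy

Loss Function

The cross-entropy loss is combined with a Huber loss regularizer:

$$\mathcal{L}_{\mathrm{Total}} = \mathcal{L}_{\mathrm{CE}} + \lambda \mathcal{L}_{\mathrm{Huber}}$$

The Huber loss is defined as:

$$H_\delta(x) = \begin{cases} \tfrac{1}{2}x^2 & \text{for } |x| \le \delta \\ \delta\left(|x| - \tfrac{1}{2}\delta\right) & \text{otherwise} \end{cases}$$

Batch-level Huber loss:

$$\mathcal{L}_{\mathrm{Huber}} = H_\delta\!\left(\frac{1}{|B|}\sum_{i \in B} c_i - \frac{1}{|B|}\sum_{i \in B} \mathbb{I}(\hat{y}_i = y_i)\right)$$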
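
The loss above can be sketched numerically as follows. Several details are assumptions rather than facts from the paper: the prediction ŷ is taken to be the label obtained by thresholding c at 0.5, the cross-entropy is the standard binary form with c as the probability of label 1, and the values λ = 0.1 and δ = 1.0 are placeholders.

```python
import math

def huber(x, delta=1.0):
    """Huber function: quadratic near zero, linear in the tails."""
    ax = abs(x)
    if ax <= delta:
        return 0.5 * x * x
    return delta * (ax - 0.5 * delta)

def cross_entropy(confidences, labels, eps=1e-12):
    """Mean binary cross-entropy, treating c_i as the probability of label 1."""
    total = 0.0
    for c, y in zip(confidences, labels):
        p = c if y == 1 else 1.0 - c
        total -= math.log(max(p, eps))
    return total / len(labels)

def batch_huber(confidences, labels, delta=1.0, decision_threshold=0.5):
    """H_delta(mean confidence - batch accuracy of the thresholded predictions)."""
    preds = [1 if c > decision_threshold else 0 for c in confidences]
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
    return huber(mean_conf - accuracy, delta)

def total_loss(confidences, labels, lam=0.1, delta=1.0):
    """L_Total = L_CE + lambda * L_Huber."""
    return cross_entropy(confidences, labels) + lam * batch_huber(confidences, labels, delta)

# Hypothetical batch: mean confidence 0.75, but only half the predictions are correct.
confidences = [0.9, 0.8, 0.7, 0.6]
labels = [1, 1, 0, 0]
print(batch_huber(confidences, labels), total_loss(confidences, labels))
```

The batch-level term penalizes the gap between average confidence and observed accuracy, so an over-confident batch like this one is pushed toward lower scores. In gradient-based training the indicator term would act as a constant target, which is another assumption of this sketch.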

Technical Innovations

  1. Raw activations vs. token probabilities: avoids the information compression and distortion introduced by the linear projection and softmax
  2. Auto-regressive sequence modeling: uses an LSTM to capture temporal dependencies in the generation process
  3. Robustness regularization: the Huber loss is more robust to noisy labels caused by retrieval errors
  4. Layer-level optimization: determines the best layer for activation extraction experimentally

Experimental Setup

Dataset

  • Source: Capital One's internal financial customer-support knowledge base
  • Size: 8.5k documents, about 45k chunks
  • Characteristics: semi-structured documents with complex hierarchies, tables, lists and similar elements
  • Annotation: a two-stage verification process combining real-time feedback with subject-matter expert (SME) review

Evaluation Metrics

  • AUROC: how well the confidence score separates correct from incorrect responses
  • Precision (P): the fraction of displayed responses that are correct
  • Recall (R): the fraction of correct responses that are displayed
  • ROUGE-L: response quality
  • Block rate: the fraction of responses that are blocked
  • Latency: mean and P99 response time
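
For readers unfamiliar with AUROC in this setting, the sketch below computes it from its pairwise definition: the probability that a randomly chosen correct response receives a higher confidence than a randomly chosen incorrect one. The scores and labels are made-up toy values.

```python
def auroc(scores, labels):
    """Pairwise AUROC: P(score of a correct item > score of an incorrect item), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one correct and one incorrect example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical confidence scores with correctness labels (1 = correct).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auroc(scores, labels))  # 3 of the 4 (correct, incorrect) pairs are ordered correctly
```

The quadratic pairwise loop is fine for illustration; a rank-based implementation would be preferable for large evaluation sets.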

Compared Methods

  • Vectara (HHEM2.1): an entailment-based semantic consistency model
  • VectaraFT: a fine-tuned version of Vectara
  • Logits-based: an uncertainty model built on token logits

Implementation Details

  • Model: Llama 3.1 8B
  • Activation layers: layer 16 and layer 32
  • Context size: Top-1, Top-3, Top-5, Full (Top-7)
  • Inference frameworks: Hugging Face, vLLM

Experimental Results

Main Results

| Method | AUROC |
| --- | --- |
| Vectara | 0.590 |
| VectaraFT | 0.634 |
| Logits-based | 0.663 |
| Ours (without calibration) | 0.741 |
| Ours (with calibration) | 0.772 |

Confidence Threshold Analysis

| Threshold | Precision | Recall | ROUGE-L (shown/blocked) | Block rate |
| --- | --- | --- | --- | --- |
| 0.5 | 0.95 | 0.73 | 0.65/0.57 | 29.9% |
| 0.7 | 0.96 | 0.65 | 0.66/0.57 | 38.6% |
| 0.9 | 0.97 | 0.52 | 0.67/0.58 | 52.0% |
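
The trade-off in this analysis (higher thresholds raise precision but lower recall and block more responses) can be reproduced on toy data. The scores and labels below are invented for illustration and are unrelated to the paper's dataset; the metric definitions follow the ones listed under the evaluation metrics.

```python
def abstention_metrics(scores, labels, threshold):
    """Return (precision, recall, block_rate) when only scores above threshold are shown."""
    shown = [y for s, y in zip(scores, labels) if s > threshold]
    n_correct = sum(labels)
    precision = sum(shown) / len(shown) if shown else float("nan")
    recall = sum(shown) / n_correct if n_correct else float("nan")
    block_rate = 1.0 - len(shown) / len(labels)
    return precision, recall, block_rate

# Hypothetical confidence scores and correctness labels (1 = correct).
scores = [0.95, 0.92, 0.85, 0.75, 0.65, 0.55, 0.45, 0.30]
labels = [1, 1, 1, 0, 1, 1, 0, 0]

for threshold in (0.5, 0.7, 0.9):
    p, r, b = abstention_metrics(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f} block rate={b:.1%}")
```

Choosing the operating threshold is therefore a product decision about how many answers one is willing to withhold in exchange for fewer wrong ones.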

Layer and Context Optimization

Layer 16 vs. layer 32:

  • Layer 16 maintains comparable performance while substantially reducing latency (by about 42.5%)
  • In the Full context setting, layer 16 achieves 0.97 precision with a 31.3% block rate

Latency analysis:

| Framework | Layer | Context | Mean latency (ms) | P99 latency (ms) |
| --- | --- | --- | --- | --- |
| vLLM | 16 | Full | 127 | 267 |
| vLLM | 32 | Full | 206 | 354 |
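
As a worked check on the two vLLM rows, the relative latency reduction from layer 32 to layer 16 can be computed directly. The helper name `relative_reduction` is hypothetical; the four latency values are the ones in the table.

```python
def relative_reduction(baseline_ms, new_ms):
    """Fractional reduction of new_ms relative to baseline_ms."""
    return (baseline_ms - new_ms) / baseline_ms

mean_cut = relative_reduction(206, 127)  # mean latency, layer 32 -> layer 16
p99_cut = relative_reduction(354, 267)   # P99 latency, layer 32 -> layer 16
print(f"mean: {mean_cut:.1%}, P99: {p99_cut:.1%}")  # mean: 38.3%, P99: 24.6%
```

For this vLLM, Full-context configuration the reductions are about 38.3% (mean) and 24.6% (P99). These are lower than the roughly 42.5% quoted above, which may refer to a different framework or context setting that this summary does not tabulate.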

Ablation Study

  1. Role of the Huber loss: AUROC improves from 0.741 to 0.772
  2. Choice of activation layer: layer 16 performs close to layer 32 with lower latency
  3. Effect of context size: larger contexts improve accuracy but increase latency

Related Work

Taxonomy of Uncertainty Quantification Methods

  1. Sampling-based methods: measure consistency across multiple generations, at high compute cost
  2. Probability-based methods: use token probabilities and semantic entropy, with limited effectiveness on long text
  3. Classification-based methods: models such as HHEM avoid multiple generations but rely on black-box access
  4. Activation-based methods: exploit internal representations; this is the main direction of the paper's contribution

Advantages of This Paper

  • Compared with sampling methods: a single forward pass and lower latency
  • Compared with probability-based methods: preserves the full internal representation, with less information loss
  • Compared with black-box methods: uses white-box access to obtain richer signals

Conclusions and Discussion

Main Conclusions

  1. Effectiveness: the activation-based method clearly outperforms existing baselines, reaching an AUROC of 0.772
  2. Practicality: in production it achieves a good balance of 0.95 precision and a 29.9% block rate
  3. Efficiency: layer-16 activations maintain performance while substantially reducing latency
  4. Robustness: the Huber loss effectively improves robustness against noisy supervision

Limitations

  1. White-box dependence: the method needs access to the model's internal activations, which limits its generality
  2. Architecture specificity: the method is tailored to a specific model architecture and must be reconfigured to transfer to others
  3. Two-stage processing: computing the confidence score requires an additional forward pass
  4. Data restrictions: the experimental data cannot be released, which affects reproducibility

Future Directions

  1. End-to-end integration: integrate confidence estimation directly into the generation process
  2. Architecture independence: develop a general method applicable to a range of LLM architectures
  3. Compute optimization: further reduce the computational overhead of confidence estimation
  4. Theoretical analysis: develop a deeper understanding of the theoretical relationship between activation patterns and confidence

In-Depth Evaluation

Strengths

  1. Technical innovation: the first systematic use of FFN activations for RAG confidence estimation, avoiding the information loss of token probabilities
  2. Practical value: validated in a real financial scenario, with a strong practical orientation
  3. Comprehensive experiments: thorough ablations across several dimensions (layer, context, latency)
  4. Engineering considerations: takes the latency constraints and scalability requirements of production environments fully into account

Weaknesses

  1. Limited generality: the method depends on white-box access and a specific architecture, which limits its adoption
  2. Theoretical foundation: lacks an in-depth theoretical analysis of why FFN activations can predict confidence
  3. Data transparency: the proprietary dataset limits how far the results can be verified
  4. Limited comparisons: lacks comparisons with more of the latest uncertainty quantification methods

Impact

  1. Academic contribution: offers a new technical path for research on the trustworthiness of RAG systems
  2. Industrial value: provides a practical solution for deploying LLMs in high-stakes domains
  3. Methodological inspiration: the activation-based approach may inspire further research that exploits internal representations

Applicable Scenarios

  1. High-stakes domains: finance, healthcare, law and other settings with extremely high accuracy requirements
  2. White-box deployments: enterprise applications with access to model internals
  3. Real-time systems: settings that must deliver trustworthy responses under strict latency constraints
  4. Specialized knowledge bases: RAG applications built on structured, domain-specific knowledge bases

References

The paper cites important work from several related areas, including uncertainty quantification, RAG systems and activation analysis, among them:

  • Azaria and Mitchell (2023): LLM internal states and detecting "lying"
  • Bakman et al. (2024): meaning-based response scoring
  • Bao et al. (2024): the HHEM entailment model
  • Dai et al. (2022): knowledge neurons in pretrained Transformers

Overall assessment: this is a technically solid paper with high practical value that offers an innovative solution to the important problem of confidence estimation in RAG systems. Despite some limitations in generality and theoretical depth, its successful real-world application and thorough experimental validation give it significant academic and industrial value.