2025-11-15T08:58:11.885290

Efficient support ticket resolution using Knowledge Graphs

Varghese, Tian

A review of over 160,000 customer cases indicates that product support spends about 90% of its time on roughly 10% of tickets, those for which no trivial solution exists. Many of these challenging cases require several engineers working together in a "swarm", and some must also go to development support as bugs. These cases represent a major opportunity for machine learning and knowledge graphs to identify the engineer or group of engineers (swarm) best placed to resolve the issue, reducing wait times for the customer. The concrete ML task we consider is a learning-to-rank (LTR) task: given an incident and the set of engineers currently assigned to it (possibly the empty set in the non-swarming context), produce a ranked list of the engineers best fit to help resolve that incident. To calculate the rankings, we may consider a wide variety of input features, including the incident description provided by the customer, the affected component(s), engineers' ratings of their expertise, knowledge base article text written by engineers, responses to customers written by engineers, and historic swarming data. The central hypothesis is that by including a holistic set of contextual data about which cases an engineer has solved, we can significantly improve the LTR algorithm over benchmark models. The article proposes a novel approach of modelling knowledge graph embeddings from multiple data sources, including the swarm information. The results show that incorporating this additional context improves the recommendations significantly over traditional machine learning methods such as TF-IDF.

Basic information

Paper ID: 2501.00461
Title: Efficient support ticket resolution using Knowledge Graphs
Authors: Sherwin Varghese (SAP Labs India), James Tian (SAP Labs US)
Categories: cs.AI cs.LG cs.MA
Institution: SAP Labs
Paper link: https://arxiv.org/abs/2501.00461

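Summary

An analysis of more than 160,000 customer cases shows that product support teams spend about 90% of their time on roughly 10% of tickets: the complex ones with no obvious solution. Many of these difficult cases require several engineers to collaborate as a "swarm", and some must be escalated to the development team as bugs. The paper frames the problem as a learning-to-rank (LTR) task: given an incident and the set of engineers currently assigned to it, produce a ranked list of the engineers best suited to resolve it. It proposes a novel approach that builds knowledge graph embeddings from multiple data sources, including swarm information, and reports significant improvement over traditional machine learning methods such as TF-IDF.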

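Background and motivation

Problem definition

Core problem: Ticket assignment in customer support is inefficient; about 90% of support time goes to roughly 10% of tickets, the complex ones.
Business impact: Long turnaround times hurt customer satisfaction and business outcomes.
Technical challenge: Identifying the ideal engineer or team of engineers for a given technical issue.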

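Limitations of existing methods

Traditional ML methods: TF-IDF, random forests and similar methods are simple to apply but have limited model capacity.
Insufficient relationship modelling: They cannot capture collaboration among engineers or the patterns by which teams solve problems.
Missing context: They lack a comprehensive view of the cases each engineer has resolved.
Production system constraints: The existing expert-matching system uses predefined weights and cannot learn.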
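
To make the TF-IDF benchmark concrete, the following is a minimal sketch of how such a baseline might rank engineers: each engineer is represented by the text of their past cases, and candidates are sorted by cosine similarity to the incident text. The function names, the whitespace tokenizer, the smoothed IDF formula and the toy data format are all assumptions for illustration; the paper's actual baseline configuration is not given here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized doc."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_engineers_tfidf(incident_text, engineer_history):
    """Rank engineers by similarity of their past case text to the incident.

    engineer_history is a hypothetical mapping: engineer name -> concatenated
    text of the cases that engineer resolved.
    """
    names = list(engineer_history)
    docs = [engineer_history[name].lower().split() for name in names]
    docs.append(incident_text.lower().split())  # the incident is the query
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A baseline of this kind scores each engineer in isolation from their own text, which is why it cannot reflect swarm relationships or shared components.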

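Research motivation

The work is driven by a practical business need grounded in more than 160,000 internal SAP customer cases: use machine learning and knowledge graphs to improve engineer-to-ticket matching, reduce customer wait times, and resolve issues more efficiently.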

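Core contributions

Novel knowledge graph modelling: A knowledge graph embedding method built from multiple data sources that integrates swarm collaboration information.
Learning-to-rank framework: Expert matching is modelled as an LTR task, so the ranking objective is optimized directly.
Multimodal data fusion: Structured data (engineer information, components) is combined with unstructured data (incident descriptions, KBA text).
Significant performance gains: Large improvements over traditional methods on several evaluation metrics.
Practical business application: An end-to-end solution based on real SAP customer support data.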

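Method in detail

Task definition

Input:

Incident description (provided by the customer)
Affected components
Set of engineers currently assigned (may be empty)
Engineer expertise ratings
Historical swarm data

Output: A ranked list of the engineers best suited to resolve the incident

Constraints: Engineer availability, expertise match, historical collaboration, and similar factors are taken into account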
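
The task definition above can be read as a small interface: an incident plus a scoring function goes in, and a ranked list of not-yet-assigned engineers comes out. The sketch below is one possible way to express that; the `Incident` class, the `rank_candidates` function and the pluggable `score` callable are hypothetical names, and excluding already-assigned engineers from the output is an assumption rather than something the summary states.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Incident:
    """Inputs of the ranking task (field names are illustrative)."""
    description: str                          # text provided by the customer
    components: FrozenSet[str]                # affected components
    assigned: FrozenSet[str] = frozenset()    # current swarm, possibly empty


def rank_candidates(
    incident: Incident,
    engineers: Iterable[str],
    score: Callable[[Incident, str], float],
) -> List[str]:
    """Return engineers not yet on the incident, best score first."""
    candidates = [e for e in engineers if e not in incident.assigned]
    return sorted(candidates, key=lambda e: score(incident, e), reverse=True)
```

Any model discussed in this summary, from a TF-IDF baseline to graph embeddings, would plug in as the `score` argument, which keeps the comparison between models on equal footing.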

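Model architecture

1. Knowledge graph construction

Node types:

Engineers
Knowledge base articles (KBAs)
Incidents
Components

Edge relations:

Engineer-Incident: resolved
Engineer-KBA: authored
Engineer-Engineer: swarm collaboration
Incident-Component: affects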
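
As an illustration of the schema above, here is a minimal sketch of a typed, undirected graph holding the four node types and four relations. It uses a plain-Python adjacency structure to stay dependency-free; a graph library would serve equally well. The class, the relation labels (`resolved`, `authored`, `swarmed_with`, `affects`) and the sample identifiers are assumptions for illustration only.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny typed graph: nodes carry a type, edges carry a relation label."""

    def __init__(self):
        self.node_type = {}            # node id -> "engineer" | "kba" | ...
        self.adj = defaultdict(set)    # node id -> {(relation, neighbor id)}

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, src, relation, dst):
        for node in (src, dst):
            if node not in self.node_type:
                raise KeyError(f"unknown node: {node}")
        # Edges are stored in both directions so walks can traverse them freely.
        self.adj[src].add((relation, dst))
        self.adj[dst].add((relation, src))

    def neighbors(self, node_id, relation=None):
        """Sorted neighbor ids, optionally restricted to one relation."""
        edges = self.adj.get(node_id, ())
        return sorted(n for r, n in edges if relation is None or r == relation)


def build_example_graph():
    """Build a toy graph with one node of each type plus a second engineer."""
    kg = KnowledgeGraph()
    kg.add_node("alice", "engineer")
    kg.add_node("bob", "engineer")
    kg.add_node("inc1", "incident")
    kg.add_node("kba1", "kba")
    kg.add_node("HANA", "component")
    kg.add_edge("alice", "resolved", "inc1")
    kg.add_edge("bob", "resolved", "inc1")
    kg.add_edge("alice", "authored", "kba1")
    kg.add_edge("alice", "swarmed_with", "bob")
    kg.add_edge("inc1", "affects", "HANA")
    return kg
```

Keeping the relation label on each edge lets later stages treat swarm edges differently from authorship or resolution edges if needed.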

2. Data processing pipeline

Data extraction → Cleaning and preprocessing → NLU embedding generation → Graph structure conversion → GNN training

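3. Core technical components

Natural language understanding (NLU):

Transformer models such as BERT process the text data
Contextual embeddings are generated for incident descriptions and KBA text
Lightweight NLP models handle preprocessing to keep computational cost under control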

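Graph neural network (GNN):

Implemented with the PinSage algorithm
Generates engineer node embeddings dynamically
Regularizes the loss function using the graph structure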
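
PinSage, as published, defines a node's neighborhood through short random walks and pools neighbor features weighted by how often each neighbor is visited. The sketch below shows only that core idea, without learned weights or training; the function names, the adjacency-list format and the parameter defaults are assumptions, and it is not the paper's implementation.

```python
import random
from collections import Counter


def important_neighbors(adj, start, num_walks=200, walk_len=3, top_t=2, seed=0):
    """Estimate the top_t most important neighbors of start via random walks.

    adj maps a node id to a list of neighbor ids. Returns a dict mapping each
    selected neighbor to its normalized visit count (weights sum to 1).
    """
    rng = random.Random(seed)  # seeded for reproducible walks
    visits = Counter()
    for _ in range(num_walks):
        node = start
        for _ in range(walk_len):
            nbrs = adj.get(node)
            if not nbrs:
                break  # dead end: stop this walk early
            node = rng.choice(nbrs)
            if node != start:
                visits[node] += 1
    top = visits.most_common(top_t)
    total = sum(count for _, count in top)
    return {node: count / total for node, count in top} if total else {}


def aggregate(embeddings, weights):
    """Importance-weighted mean of neighbor embeddings (lists of floats)."""
    dim = len(next(iter(embeddings.values())))
    pooled = [0.0] * dim
    for node, weight in weights.items():
        for i, value in enumerate(embeddings[node]):
            pooled[i] += weight * value
    return pooled
```

In a full model the pooled vector would be combined with the node's own features and passed through trainable layers; here it only illustrates how graph structure can shape an engineer's embedding.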

Ranking module:

Uses a triplet loss
Computes the similarity between the incident vector and engineer vectors
Produces the final ranked list
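
The ranking module can be sketched as two small pieces: a triplet loss that pushes an incident closer to an engineer who resolved it than to one who did not, and a similarity sort at inference time. The version below uses cosine similarity and a margin of 0.2; both choices, along with the function names, are assumptions, since the summary does not specify the paper's exact formulation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two dense vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Similarity-based triplet loss: max(0, margin - sim(a, p) + sim(a, n)).

    anchor is the incident vector, positive an engineer who resolved it,
    negative an engineer who did not.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - sim_pos + sim_neg)


def rank_by_similarity(incident_vec, engineer_vecs):
    """Return engineer names sorted by similarity to the incident, best first."""
    return sorted(
        engineer_vecs,
        key=lambda name: cosine_similarity(incident_vec, engineer_vecs[name]),
        reverse=True,
    )
```

The loss is zero once the correct engineer is already more similar than the incorrect one by at least the margin, so training effort concentrates on triplets that are still ordered wrongly or too closely.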

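4. Algorithm flow

def generateGNN(incident_vector):
    # Pseudocode: KBA, Communication, Component, User and Swarm name the
    # source datasets, and each helper stands for one pipeline stage.

    # 1. ETL over the source data
    ETL_process(KBA, Communication, Component, User, Swarm)

    # 2. NLU transformation of the text sources
    embeddings = NLU_transform(KBA, Communication, Component)

    # 3. Normalize the embedding vectors
    vectors = normalize_embeddings(embeddings)

    # 4. Build the knowledge graph
    KG = build_networkx_graph(vectors)

    # 5. Score engineers for the incident with PinSage
    rankings = PinSage_ranking(incident_vector, KG)

    # 6. Produce the final ranking using the triplet loss
    return rank_engineers(rankings, triplet_loss)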