2025-11-19T21:10:14.255447

Embodied AI-Enhanced Vehicular Networks: An Integrated Large Language Models and Reinforcement Learning Method

Zhang, Zhao, Du et al.

This paper investigates adaptive transmission strategies in embodied AI-enhanced vehicular networks by integrating large language models (LLMs) for semantic information extraction and deep reinforcement learning (DRL) for decision-making. The proposed framework aims to optimize both data transmission efficiency and decision accuracy by formulating an optimization problem that incorporates the Weber-Fechner law, which serves as a metric for balancing bandwidth utilization and quality of experience (QoE). Specifically, we employ the large language and vision assistant (LLAVA) model to extract critical semantic information from raw image data captured by embodied AI agents (i.e., vehicles), reducing transmission data size by more than 90% while retaining essential content for vehicular communication and decision-making. In the dynamic vehicular environment, we employ a generalized advantage estimation-based proximal policy optimization (GAE-PPO) method to stabilize decision-making under uncertainty. Simulation results show that attention maps from LLAVA highlight the model's focus on relevant image regions, enhancing semantic representation accuracy. Additionally, our proposed transmission strategy improves QoE by up to 36% compared to DDPG and accelerates convergence by reducing the required steps by up to 47% compared to pure PPO. Further analysis indicates that adapting the semantic symbol length provides an effective trade-off between transmission quality and bandwidth, achieving up to a 61.4% improvement in QoE when scaling from 4 to 8 vehicles.

academic

Basic Information

Paper ID: 2501.01141
Title: Embodied AI-Enhanced Vehicular Networks: An Integrated Large Language Models and Reinforcement Learning Method
Authors: Ruichen Zhang, Changyuan Zhao, Hongyang Du, Dusit Niyato, Jiacheng Wang, Suttinee Sawadsitang, Xuemin Shen, Dong In Kim
Category: cs.NI (Networking and Internet Architecture)
Published: January 2, 2025 (arXiv preprint)
Paper link: https://arxiv.org/abs/2501.01141

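Abstract

This paper studies adaptive transmission strategies in embodied AI-enhanced vehicular networks, integrating large language models (LLMs) for semantic information extraction with deep reinforcement learning (DRL) for decision-making. The framework formulates an optimization problem that incorporates the Weber-Fechner law to balance bandwidth utilization and quality of experience (QoE), thereby optimizing both data transmission efficiency and decision accuracy. Specifically, the large language and vision assistant (LLAVA) model extracts critical semantic information from raw image data captured by embodied AI agents (i.e., vehicles), reducing the transmitted data size by more than 90% while retaining the essential content needed for vehicular communication and decision-making. In the dynamic vehicular environment, a generalized advantage estimation-based proximal policy optimization (GAE-PPO) method is used to stabilize decision-making under uncertainty.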

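Research Background and Motivation

Problem Definition

With the arrival of the 6G era, the Internet of Vehicles (IoV) is expected to see unprecedented advances, with traffic densities of 0.1 to 10 Gbps/m² or more and connection densities of up to 10 million devices per square kilometer. These improvements will significantly raise data rates, connectivity, and network capacity, fundamentally changing IoV services such as real-time navigation, environmental perception, and autonomous decision-making.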

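Research Motivation

Data processing challenges: As the number of connected vehicles grows, large numbers of sensors must be deployed to collect and process massive volumes of real-time data, and traditional discriminative AI models struggle to maintain high performance under dynamic conditions.
Transmission efficiency: Transmitting raw sensor data requires substantial bandwidth, so reducing the transmitted data volume while preserving information quality becomes a key challenge.
Decision-making complexity: The vehicular network environment is highly dynamic and requires an intelligent decision-making system that adapts to environmental changes in real time.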

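Limitations of Existing Methods

Traditional methods focus mainly on conventional performance metrics such as spectral efficiency, latency, and security.
They give little consideration to the efficiency of semantic data transmission and decision-making.
The integrated use of LLMs and DRL for resource optimization in vehicular networks has not been fully explored.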

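Core Contributions

Data transmission modeling: Formulates an optimization problem that balances data transmission efficiency and decision-making accuracy, introducing the Weber-Fechner law as a metric for quantifying quality of experience (QoE).
LLM-based semantic data processing: Uses LLAVA to extract semantic information from raw image data, substantially reducing transmission bandwidth while preserving the essential contextual details needed for vehicular communication and decision-making.
DRL-based enhanced decision-making: Proposes a GAE-PPO method to improve decision-making in dynamic vehicular environments; generalized advantage estimation reduces the variance of policy gradient updates and stabilizes training.
Novelty: To the best of the authors' knowledge, this is the first work to explore the joint application of LLM-based data processing and DRL-based decision-making in embodied AI-enhanced vehicular networks.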
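
To make the QoE metric named above more concrete, here is a minimal sketch of the generic Weber-Fechner law, in which perceived intensity grows with the logarithm of the stimulus. The function name, the scale factor `k`, the `threshold` parameter, and the choice of what counts as the "stimulus" (for example, the number of semantic symbols received) are assumptions for illustration; the paper's exact QoE expression is not reproduced in this summary.

```python
import math


def weber_fechner_qoe(stimulus: float, threshold: float = 1.0, k: float = 1.0) -> float:
    """Generic Weber-Fechner perception model: k * ln(stimulus / threshold).

    `stimulus`, `threshold`, and `k` are illustrative placeholders, not the
    paper's variables.
    """
    if stimulus <= 0 or threshold <= 0:
        raise ValueError("stimulus and threshold must be positive")
    return k * math.log(stimulus / threshold)


# Doubling the stimulus adds the same perceived increment, k * ln(2),
# whether we start from a low or a high level (diminishing returns).
gain_low = weber_fechner_qoe(8.0) - weber_fechner_qoe(4.0)
gain_high = weber_fechner_qoe(64.0) - weber_fechner_qoe(32.0)
print(round(gain_low, 4), round(gain_high, 4))
```

The logarithmic shape is what makes such a metric useful for trading bandwidth against quality: each additional unit of transmitted content yields a smaller perceived improvement than the previous one.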

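Method Details

Task Definition

Consider a cellular-based vehicular communication network in an urban environment, in which I vehicles equipped with embodied AI systems travel within the communication range of a base station (BS). The network comprises W vehicle-to-infrastructure (V2I) links and Q vehicle-to-vehicle (V2V) links.

Objective: Optimize transmission power, semantic symbol allocation, and channel usage to maximize QoE while ensuring efficient resource utilization.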

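Model Architecture

1. LLAVA Semantic Information Extraction

Architecture design:

Visual encoder: A contrastive language-image pre-training (CLIP) visual encoder g converts the image I_i into feature vectors:
```
Z_i = g(I_i)
```
Projection matrix: A trainable linear projection matrix W maps the features into the word embedding space of the language model:
```
E_i = W · Z_i
```
Semantic extraction: The LLAVA model, with parameters θ_i, generates the semantic information:
```
M_i = LLAVA(I_i; θ_i)
```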
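
The projection step can be illustrated with a small shape-level sketch. It does not call CLIP or LLAVA; the feature vectors `Z` and the matrix `W` below are hand-written toy values, and the function name `project_features` is hypothetical. It only shows how E_i = W · Z_i maps each visual feature vector into an embedding space of a different dimension.

```python
def project_features(W, Z):
    """Apply E = W · z to every visual feature vector z in Z.

    W: projection matrix as a list of rows (d_lm rows, d_vis columns).
    Z: list of feature vectors, each of length d_vis.
    Returns a list of embedding vectors, each of length d_lm.
    """
    d_vis = len(W[0])
    embeddings = []
    for z in Z:
        if len(z) != d_vis:
            raise ValueError("feature dimension does not match projection matrix")
        embeddings.append([sum(w * x for w, x in zip(row, z)) for row in W])
    return embeddings


# Toy sizes (assumed): 2 visual tokens of dimension 3, projected to dimension 2.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
Z = [[0.5, 1.0, 2.0],
     [1.0, 0.0, -1.0]]
E = project_features(W, Z)
print(E)  # [[0.5, 3.0], [1.0, -1.0]]
```

In a real model the projected vectors would be concatenated with the text token embeddings before being fed to the language model; here the point is only the dimension change performed by W.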

Model fine-tuning:

Loss function: $L = \sum_i \| M_i - \hat{M}_i \|^2$
Cross-entropy loss: $L_{CE} = -\sum_i \sum_l q(v_{i,l}) \log p(v_{i,l})$
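
The two fine-tuning losses can be sketched in plain Python as follows. This is a minimal illustration under assumptions: outputs and references are represented as lists of numeric vectors, `q` and `p` are per-token target and predicted distributions over a toy vocabulary, and the small `eps` floor is added here only to avoid taking the logarithm of zero. The function names are hypothetical.

```python
import math


def squared_error_loss(outputs, references):
    """Sum over samples of the squared Euclidean distance ||M_i - M_hat_i||^2."""
    return sum(
        sum((m - r) ** 2 for m, r in zip(out, ref))
        for out, ref in zip(outputs, references)
    )


def cross_entropy_loss(target_dists, pred_dists, eps=1e-12):
    """Cross-entropy -sum q(v) * log p(v), accumulated over all tokens."""
    total = 0.0
    for q, p in zip(target_dists, pred_dists):
        total -= sum(q_v * math.log(max(p_v, eps)) for q_v, p_v in zip(q, p))
    return total


# One token with a one-hot target on the second vocabulary entry.
ce = cross_entropy_loss([[0.0, 1.0, 0.0]], [[0.1, 0.8, 0.1]])
se = squared_error_loss([[1.0, 2.0]], [[0.0, 0.0]])
print(round(ce, 4), se)
```

With a one-hot target, the cross-entropy reduces to the negative log-probability that the model assigns to the correct token, which is why it falls toward zero as that probability approaches 1.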

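2. GAE-PPO Transmission Strategy Optimization

MDP design:

Action space: $a_t = [\{b_q[w]\}, \{P_q^{V2V}[w]\}, \{u_q\}]$ (dimension: 3Q)
State space: $s_t = [\{H_i^{(w)}\}, \{\gamma_q^{V2V}(t)\}, \{\gamma_w^{V2I}(t)\}]$ (dimension: 2W+Q)
Reward function: A QoE-based reward that includes penalty terms for constraint violations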
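
The MDP dimensions and the penalized reward can be sketched as below. The dimension formulas (3Q and 2W+Q) come from the text above; the specific reward form, the `penalty_weight` parameter, and the convention that a positive entry in `violations` means a constraint is exceeded are assumptions for illustration, since the summary does not give the exact penalty terms.

```python
def action_dim(num_v2v_links: int) -> int:
    """Three decision variables per V2V link: channel, power, symbol count."""
    return 3 * num_v2v_links


def state_dim(num_v2i_links: int, num_v2v_links: int) -> int:
    """State size 2W + Q, as stated in the MDP design."""
    return 2 * num_v2i_links + num_v2v_links


def penalized_reward(qoe: float, violations, penalty_weight: float = 1.0) -> float:
    """QoE minus a weighted sum of constraint violations.

    Each entry of `violations` is (used amount - allowed amount); only
    positive entries, i.e. actual violations, are penalized. This form
    is an assumed example, not the paper's reward.
    """
    return qoe - penalty_weight * sum(max(0.0, v) for v in violations)


# Example with W = 4 V2I links and Q = 4 V2V links (values chosen arbitrarily).
print(action_dim(4), state_dim(4, 4))
reward = penalized_reward(2.5, [0.3, -0.1], penalty_weight=2.0)
```

Subtracting a penalty rather than terminating the episode lets the agent keep receiving a learning signal from infeasible actions while still being steered back toward the feasible region.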

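GAE-PPO algorithm:

Surrogate objective: $J(\theta_A) = \mathbb{E}_t\left[\rho_t(\theta_A)\, A_t^{\pi_{\theta_A^{\mathrm{old}}}}\right]$
Clipped objective: $J^{\mathrm{clip}}(\theta_A) = \mathbb{E}_t\left[\min\left(\rho_t(\theta_A)\, A_t^{\pi_{\theta_A^{\mathrm{old}}}},\ \mathrm{clip}(\rho_t(\theta_A), 1-\epsilon, 1+\epsilon)\, A_t^{\pi_{\theta_A^{\mathrm{old}}}}\right)\right]$
Generalized advantage estimation: $A_t^{\pi_{\theta_A^{\mathrm{old}}}} = \sum_{l} (\gamma\lambda)^l \delta_{t+l}$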
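
The advantage estimate and the clipped objective above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the standard temporal-difference residual δ_t = r_t + γV(s_{t+1}) − V(s_t) (not spelled out in this summary), a finite trajectory with a bootstrap value appended to `values`, and placeholder defaults for γ, λ, and ε.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one finite trajectory.

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # TD residual delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Recursive form of sum_l (gamma * lam)^l * delta_{t+l}
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean of min(rho * A, clip(rho, 1 - eps, 1 + eps) * A) over time steps."""
    terms = []
    for rho, adv in zip(ratios, advantages):
        clipped = min(max(rho, 1.0 - eps), 1.0 + eps)
        terms.append(min(rho * adv, clipped * adv))
    return sum(terms) / len(terms)


adv = gae_advantages([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0)
obj = ppo_clip_objective([1.5, 0.5], [1.0, -1.0], eps=0.2)
print(adv, round(obj, 4))
```

Setting λ closer to 0 relies more on the value estimate (lower variance, more bias), while λ closer to 1 relies more on observed returns; the clipping then limits how far a single update can move the policy away from the one that collected the data.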