密码学与安全
查看 最近的 文章
显示 2025年08月06日, 星期三 新的列表
- [1] arXiv:2508.02805 [中文pdf, pdf, 其他]
-
标题: 基于C-V2X的前向碰撞预警系统的协议合规拒绝服务攻击的真实世界评估标题: Real-World Evaluation of Protocol-Compliant Denial-of-Service Attacks on C-V2X-based Forward Collision Warning Systems评论: 本文已提交至交通运输研究委员会(TRB)2026年会议,目前正在审阅中。主题: 密码学与安全 (cs.CR)
蜂窝车联网(C-V2X)技术实现了低延迟、可靠的通信,这对于诸如前向碰撞预警(FCW)系统等安全应用至关重要。 C-V2X的部署严格遵循第三代合作伙伴计划(3GPP)和汽车工程师学会标准(SAE)J2735规范,以确保互操作性。 本文介绍了一种使用用户数据报协议(UDP)洪泛和超大基本安全消息(BSM)攻击的真实世界测试床评估,这些攻击利用了C-V2X中的传输层和应用层漏洞。 本研究中提出的攻击通过标准PC5侧链发送有效消息,完全遵守3GPP和SAE J2735规范,但以异常高的速率和超大负载发送,从而过度消耗接收方资源,而没有违反任何协议规则,例如IEEE 1609。 使用具有商用车载单元(OBUs)的真实世界联网车辆测试床,我们证明高率UDP洪泛和BSM洪泛的超大负载可以严重降低FCW性能。 结果表明,仅UDP洪泛就可将包投递比降低高达87%,并将延迟增加到400ms以上,而超大BSM洪泛则会耗尽接收方处理资源,导致FCW警报延迟或完全被抑制。 当同时执行UDP和BSM攻击时,它们会导致几乎完全的通信失败,从而完全阻止FCW警告。 这些发现表明,符合协议的通信并不一定保证基于C-V2X的安全应用的安全或可靠运行。
Cellular Vehicle-to-Everything (C-V2X) technology enables low-latency, reliable communications essential for safety applications such as a Forward Collision Warning (FCW) system. C-V2X deployments operate under strict protocol compliance with the 3rd Generation Partnership Project (3GPP) and the Society of Automotive Engineers Standard (SAE) J2735 specifications to ensure interoperability. This paper presents a real-world testbed evaluation of protocol-compliant Denial-of-Service (DoS) attacks using User Datagram Protocol (UDP) flooding and oversized Basic Safety Message (BSM) attacks that 7 exploit transport- and application-layer vulnerabilities in C-V2X. The attacks presented in this study transmit valid messages over standard PC5 sidelinks, fully adhering to 3GPP and SAE J2735 specifications, but at abnormally high rates and with oversized payloads that overload the receiver resources without breaching any protocol rules such as IEEE 1609. Using a real-world connected vehicle 11 testbed with commercially available On-Board Units (OBUs), we demonstrate that high-rate UDP flooding and oversized payload of BSM flooding can severely degrade FCW performance. Results show that UDP flooding alone reduces packet delivery ratio by up to 87% and increases latency to over 400ms, while oversized BSM floods overload receiver processing resources, delaying or completely suppressing FCW alerts. When UDP and BSM attacks are executed simultaneously, they cause near-total communication failure, preventing FCW warnings entirely. These findings reveal that protocol-compliant communications do not necessarily guarantee safe or reliable operation of C-V2X-based safety applications.
- [2] arXiv:2508.02816 [中文pdf, pdf, 其他]
-
标题: 热感知的3D设计用于侧信道信息泄露标题: Thermal-Aware 3D Design for Side-Channel Information Leakage期刊参考: IEEE 第34届国际计算机设计会议(ICCD),520-527,2016主题: 密码学与安全 (cs.CR) ; 新兴技术 (cs.ET)
侧信道攻击是重要的安全挑战,因为它们揭示了关于芯片内活动的敏感信息。 在这些攻击中,热侧信道已被证明可以披露关键功能模块的活动,甚至加密密钥。 本文提出了一种新方法,在最小化功耗的同时主动隐藏功能层中的关键活动,具体包括:(i) 利用3D集成的固有特性来防止侧信道攻击,以及(ii) 动态生成自定义活动模式,以匹配功能层中要隐藏的活动。 实验分析表明,3D技术结合所提出的运行时算法能有效将侧信道脆弱性因子(SVF)降低到0.05以下,空间热侧信道因子(STSF)降低到0.59以下。
Side-channel attacks are important security challenges as they reveal sensitive information about on-chip activities. Among such attacks, the thermal side-channel has been shown to disclose the activities of key functional blocks and even encryption keys. This paper proposes a novel approach to proactively conceal critical activities in the functional layers while minimizing the power dissipation by (i) leveraging inherent characteristics of 3D integration to protect from side-channel attacks and (ii) dynamically generating custom activity patterns to match the activity to be concealed in the functional layers. Experimental analysis shows that 3D technology combined with the proposed run-time algorithm effectively reduces the Side channel vulnerability Factor (SVF) below 0.05 and the Spatial Thermal Side-channel Factor (STSF) below 0.59.
- [3] arXiv:2508.02836 [中文pdf, pdf, html, 其他]
-
标题: 代理隐私保护机器学习标题: Agentic Privacy-Preserving Machine Learning主题: 密码学与安全 (cs.CR) ; 机器学习 (cs.LG)
隐私保护机器学习(PPML)对于确保人工智能中的数据隐私至关重要。 在过去几年中,社区提出了各种可证明安全的PPML方案,这些方案依赖于各种密码学原语。 然而,当涉及到具有数十亿参数的大语言模型(LLMs)时,PPML的效率远不能令人接受。 例如,目前最先进的保密LLM推理解决方案的性能至少比明文推理慢10,000倍。 当上下文长度增加时,性能差距甚至更大。 在本文中,我们提出了一种名为Agentic-PPML的新框架,使LLM中的PPML变得实用。 我们的关键见解是使用通用大语言模型进行意图理解,并将密码学安全的推理委托给在垂直领域上训练的专用模型。 通过模块化地将语言意图解析(通常涉及很少或没有敏感信息)与隐私关键计算分离,Agentic-PPML完全消除了LLM处理加密提示的需求,从而实现了隐私保护LLM中心服务的实际部署。
Privacy-preserving machine learning (PPML) is critical to ensure data privacy in AI. Over the past few years, the community has proposed a wide range of provably secure PPML schemes that rely on various cryptography primitives. However, when it comes to large language models (LLMs) with billions of parameters, the efficiency of PPML is everything but acceptable. For instance, the state-of-the-art solution for confidential LLM inference represents at least 10,000-fold slower performance compared to plaintext inference. The performance gap is even larger when the context length increases. In this position paper, we propose a novel framework named Agentic-PPML to make PPML in LLMs practical. Our key insight is to employ a general-purpose LLM for intent understanding and delegate cryptographically secure inference to specialized models trained on vertical domains. By modularly separating language intent parsing - which typically involves little or no sensitive information - from privacy-critical computation, Agentic-PPML completely eliminates the need for the LLMs to process the encrypted prompts, enabling practical deployment of privacy-preserving LLM-centric services.
- [4] arXiv:2508.02942 [中文pdf, pdf, html, 其他]
-
标题: LMDG:通过高保真数据集生成推进横向移动检测标题: LMDG: Advancing Lateral Movement Detection Through High-Fidelity Dataset Generation主题: 密码学与安全 (cs.CR)
横向移动(LM)攻击继续对企业安全构成重大威胁,使对手能够隐秘地破坏关键资产。 然而,LM检测系统的开发和评估受到缺乏现实且良好标注数据集的阻碍。 为解决这一问题,我们提出了LMDG,这是一个可重复且可扩展的框架,用于生成高保真LM数据集。 LMDG自动化生成良性活动、多阶段攻击执行以及系统和网络日志的全面标注,大大减少了人工工作量,并实现了可扩展的数据集创建。 LMDG的一个核心贡献是进程树标注,这是一种新颖的基于代理的技术,能够以高精度将所有恶意活动追溯到其源头。 与之前的方法如注入时间或行为分析不同,进程树标注能够对恶意日志条目进行准确的分步标注,并将其与特定的攻击步骤和MITRE ATT&CK TTPs相关联。 据我们所知,这是第一个支持多步骤攻击细粒度标注的方法,为检测模型(如攻击路径重建)提供了关键上下文。 我们使用LMDG在一个包含25台虚拟机的企业环境中生成了一个25天的数据集,其中包含22个用户账户。 该数据集包括944 GB的主机和网络日志,并嵌入了35个多阶段LM攻击,恶意事件占总活动的不到1%,反映了评估检测系统的现实良性与恶意比例。 LMDG生成的数据集通过提供多样化的LM攻击、最新的攻击模式、更长的攻击时间段、全面的数据源、现实的网络架构和更准确的标注,优于现有的数据集。
Lateral Movement (LM) attacks continue to pose a significant threat to enterprise security, enabling adversaries to stealthily compromise critical assets. However, the development and evaluation of LM detection systems are impeded by the absence of realistic, well-labeled datasets. To address this gap, we propose LMDG, a reproducible and extensible framework for generating high-fidelity LM datasets. LMDG automates benign activity generation, multi-stage attack execution, and comprehensive labeling of system and network logs, dramatically reducing manual effort and enabling scalable dataset creation. A central contribution of LMDG is Process Tree Labeling, a novel agent-based technique that traces all malicious activity back to its origin with high precision. Unlike prior methods such as Injection Timing or Behavioral Profiling, Process Tree Labeling enables accurate, step-wise labeling of malicious log entries, correlating each with a specific attack step and MITRE ATT\&CK TTPs. To our knowledge, this is the first approach to support fine-grained labeling of multi-step attacks, providing critical context for detection models such as attack path reconstruction. We used LMDG to generate a 25-day dataset within a 25-VM enterprise environment containing 22 user accounts. The dataset includes 944 GB of host and network logs and embeds 35 multi-stage LM attacks, with malicious events comprising less than 1% of total activity, reflecting a realistic benign-to-malicious ratio for evaluating detection systems. LMDG-generated datasets improve upon existing ones by offering diverse LM attacks, up-to-date attack patterns, longer attack timeframes, comprehensive data sources, realistic network architectures, and more accurate labeling.
- [5] arXiv:2508.02943 [中文pdf, pdf, 其他]
-
标题: 通过二进制多项式环的非分层可靠近似FHE框架标题: A Non-leveled and Reliable Approximate FHE Framework through Binarized Polynomial Rings主题: 密码学与安全 (cs.CR)
同态加密(HE)允许在加密数据上进行安全计算,在云计算、医疗保健和金融等领域保护用户隐私。 在完全同态加密(FHE)方案中,CKKS因其支持复数上的近似算术而著称,这是机器学习和数值工作负载的关键需求。 然而,CKKS会产生快速的噪声增长、复杂的参数调整,并依赖于成本高昂的模数切换。 我们提出了一种二进制变体的CKKS,它在整个二进制系数多项式环上运行,并用轻量级的引导机制替代了重新缩放。 为了缓解二进制编码引入的额外位翻转错误,我们集成了BCH纠错码以实现稳健的解密。 我们的开源实现基于HElib库,在保留CKKS核心代数结构的同时引入了二进制系数编码,使得在小环维度中能够高效计算并实现无限制深度的计算。 实证评估表明该框架在各种设置中具有实用性和可扩展性。
Homomorphic encryption (HE) enables secure computation on encrypted data, safeguarding user privacy in domains such as cloud computing, healthcare, and finance. Among fully homomorphic encryption (FHE) schemes, CKKS is notable for supporting approximate arithmetic over complex numbers, a key requirement for machine-learning and numerical workloads. However, CKKS incurs rapid noise growth, complex parameter tuning, and relies on costly modulus switching. We propose a binary variant of CKKS that operates entirely over binary-coefficient polynomial rings and replaces rescaling with a lightweight bootstrapping mechanism. To mitigate additional bit-flip errors introduced by binary encoding, we integrate BCH error-correcting codes for robust decryption. Our open-source implementation, built on the HElib library, preserves the core algebraic structure of CKKS while introducing binary-coefficient encoding, enabling efficient evaluation in small ring dimensions and unbounded-depth computation. Empirical evaluations demonstrate the framework's practicality and scalability across a range of settings.
- [6] arXiv:2508.03062 [中文pdf, pdf, html, 其他]
-
标题: 用于FPGA上的NTT的轻量级故障检测架构标题: Lightweight Fault Detection Architecture for NTT on FPGA主题: 密码学与安全 (cs.CR)
后量子密码(PQC)算法在数学上是安全的,并且能够抵抗量子攻击,但由于自然故障或有意的故障注入,在硬件实现中仍可能泄露敏感信息。侧信道攻击中的有意故障注入会降低未来一代网络安全处理器的加密实现的可靠性。在这方面,本研究提出了一种轻量级、高效的基于重新计算的故障检测模块,该模块在可编程门阵列(FPGA)上实现,用于数论变换(NTT)。NTT主要由存储单元和Cooley-Tukey蝴蝶单元(CT-BU)组成,CT-BU是一个关键且计算密集的硬件组件,对于多项式乘法至关重要。NTT和多项式乘法是许多PQC算法的基本构建块,包括Kyber、NTRU、Ring-LWE等。在本文中,我们提出了一种称为: 带模偏移的重新计算(REMO)的故障检测方法,用于CT-BU的逻辑块,使用Montgomery约简,以及另一种称为内存规则检查器的方法,用于NTT内部使用的内存组件。所提出的故障检测框架通过以显著低的实现成本实现高效率而树立了新的基准。它仅占用16个切片和一个DSP块,在Artix-7 FPGA上的功耗仅为3mW。基于REMO的检测机制实现了87.2%到100%的故障覆盖率,适用于各种字长、故障位数和故障注入模式。同样,内存规则检查器表现出稳健的性能,根据注入故障的性质,故障检测率在50.7%到100%之间。
Post-Quantum Cryptographic (PQC) algorithms are mathematically secure and resistant to quantum attacks but can still leak sensitive information in hardware implementations due to natural faults or intentional fault injections. The intent fault injection in side-channel attacks reduces the reliability of crypto implementation in future generation network security procesors. In this regard, this research proposes a lightweight, efficient, recomputation-based fault detection module implemented on a Field Programmable Gate Array (FPGA) for Number Theoretic Transform (NTT). The NTT is primarily composed of memory units and the Cooley-Tukey Butterfly Unit (CT-BU), a critical and computationally intensive hardware component essential for polynomial multiplication. NTT and polynomial multiplication are fundamental building blocks in many PQC algorithms, including Kyber, NTRU, Ring-LWE, and others. In this paper, we present a fault detection method called : Recomputation with a Modular Offset (REMO) for the logic blocks of the CT-BU using Montgomery Reduction and another method called Memory Rule Checkers for the memory components used within the NTT. The proposed fault detection framework sets a new benchmark by achieving high efficiency with significant low implementation cost. It occupies only 16 slices and a single DSP block, with a power consumption of just 3mW in Artix-7 FPGA. The REMO-based detection mechanism achieves a fault coverage of 87.2% to 100%, adaptable across various word sizes, fault bit counts, and fault injection modes. Similarly, the Memory Rule Checkers demonstrate robust performance, achieving 50.7% to 100% fault detection depending on and the nature of injected faults.
- [7] arXiv:2508.03067 [中文pdf, pdf, html, 其他]
-
标题: 通过可追踪指纹消除实现不可追踪的深度伪造标题: Untraceable DeepFakes via Traceable Fingerprint Elimination主题: 密码学与安全 (cs.CR) ; 人工智能 (cs.AI)
深度伪造归属技术的最新进展显著增强了法医能力,使能够提取生成模型(GMs)在图像中留下的痕迹,从而使深度伪造可以追溯到其源生成模型。 同时,一些攻击尝试逃避归属模型(AMs)以探索其局限性,这要求更强大的AMs。 然而,现有的攻击无法消除GMs的痕迹,因此可以通过防御措施进行缓解。 在本文中,我们发现可以通过乘法攻击实现不可追踪的深度伪造,该攻击可以从根本上消除GMs的痕迹,从而即使在使用了防御措施的AMs下也能逃避检测。 我们设计了一种通用且黑盒的攻击方法,仅使用真实数据训练对抗模型,适用于各种GMs,并且与AMs无关。 实验结果表明了我们方法出色的攻击能力和普遍适用性,在由9个GM生成的深度伪造上,对6个先进的AMs平均攻击成功率(ASR)达到97.08%。 即使在存在防御机制的情况下,我们的方法仍能保持超过72.39%的ASR。 我们的工作强调了乘法攻击可能带来的潜在挑战,并突显了需要更强大的AMs。
Recent advancements in DeepFakes attribution technologies have significantly enhanced forensic capabilities, enabling the extraction of traces left by generative models (GMs) in images, making DeepFakes traceable back to their source GMs. Meanwhile, several attacks have attempted to evade attribution models (AMs) for exploring their limitations, calling for more robust AMs. However, existing attacks fail to eliminate GMs' traces, thus can be mitigated by defensive measures. In this paper, we identify that untraceable DeepFakes can be achieved through a multiplicative attack, which can fundamentally eliminate GMs' traces, thereby evading AMs even enhanced with defensive measures. We design a universal and black-box attack method that trains an adversarial model solely using real data, applicable for various GMs and agnostic to AMs. Experimental results demonstrate the outstanding attack capability and universal applicability of our method, achieving an average attack success rate (ASR) of 97.08\% against 6 advanced AMs on DeepFakes generated by 9 GMs. Even in the presence of defensive mechanisms, our method maintains an ASR exceeding 72.39\%. Our work underscores the potential challenges posed by multiplicative attacks and highlights the need for more robust AMs.
- [8] arXiv:2508.03097 [中文pdf, pdf, 其他]
-
标题: VFLAIR-LLM:一种用于大模型分割学习的综合框架和基准标题: VFLAIR-LLM: A Comprehensive Framework and Benchmark for Split Learning of LLMs评论: 12页,10张图,发表于KDD2025期刊参考: 在第31届ACM SIGKDD知识发现与数据挖掘会议论文集第2卷(KDD'25)中,2025年8月3日至7日,加拿大安大略省多伦多。ACM,纽约,纽约,美国,12页主题: 密码学与安全 (cs.CR) ; 人工智能 (cs.AI)
随着大型语言模型(LLMs)的发展,LLM应用已经扩展到越来越多的领域。 然而,有数据隐私担忧的用户在直接使用LLM API方面受到限制,而私有部署则需要巨大的计算需求。 这在有限的本地资源下实现安全的LLM适应带来了重大挑战。 为了解决这个问题,协同学习方法,如分割学习(SL),为将LLM适应到私有领域提供了一种资源高效且隐私保护的解决方案。 在本研究中,我们引入了VFLAIR-LLM(可在https://github.com/FLAIR-THU/VFLAIR-LLM获取),这是一个可扩展且轻量级的LLM分割学习框架,使在资源受限环境中进行隐私保护的LLM推理和微调成为可能。 我们的库提供了两种LLM分割设置,支持三种任务类型和18个数据集。 此外,我们提供了用于实施和评估攻击与防御的标准模块。 我们在各种LLM分割学习(SL-LLM)设置下对5种攻击和9种防御进行了基准测试,为实际应用中的模型分割配置、防御策略及相关超参数的选择提供了具体的见解和建议。
With the advancement of Large Language Models (LLMs), LLM applications have expanded into a growing number of fields. However, users with data privacy concerns face limitations in directly utilizing LLM APIs, while private deployments incur significant computational demands. This creates a substantial challenge in achieving secure LLM adaptation under constrained local resources. To address this issue, collaborative learning methods, such as Split Learning (SL), offer a resource-efficient and privacy-preserving solution for adapting LLMs to private domains. In this study, we introduce VFLAIR-LLM (available at https://github.com/FLAIR-THU/VFLAIR-LLM), an extensible and lightweight split learning framework for LLMs, enabling privacy-preserving LLM inference and fine-tuning in resource-constrained environments. Our library provides two LLM partition settings, supporting three task types and 18 datasets. In addition, we provide standard modules for implementing and evaluating attacks and defenses. We benchmark 5 attacks and 9 defenses under various Split Learning for LLM(SL-LLM) settings, offering concrete insights and recommendations on the choice of model partition configurations, defense strategies, and relevant hyperparameters for real-world applications.
- [9] arXiv:2508.03125 [中文pdf, pdf, html, 其他]
-
标题: 攻击消息,而非代理:一种针对LLM-MAS的多轮自适应隐蔽篡改框架标题: Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MASBingyu Yan, Ziyi Zhou, Xiaoming Zhang, Chaozhuo Li, Ruilin Zeng, Yirui Qi, Tianbo Wang, Litian Zhang主题: 密码学与安全 (cs.CR) ; 人工智能 (cs.AI) ; 多智能体系统 (cs.MA)
基于大型语言模型的多智能体系统(LLM-MAS)通过智能体间通信有效完成复杂和动态任务,但这种依赖性引入了重大的安全漏洞。 现有的针对LLM-MAS的攻击方法要么破坏智能体内部,要么依赖直接且明显的说服,这限制了它们的有效性、适应性和隐蔽性。 在本文中,我们提出了MAST,一种多轮自适应隐蔽篡改框架,旨在利用系统内的通信漏洞。 MAST将蒙特卡洛树搜索与直接偏好优化相结合,训练一个攻击策略模型,该模型自适应地生成有效的多轮篡改策略。 此外,为了保持隐蔽性,在篡改过程中我们施加了双重语义和嵌入相似性约束。 在多种任务、通信架构和LLM上的全面实验表明,与基线方法相比,MAST始终能够实现高攻击成功率,同时显著提高隐蔽性。 这些发现突显了MAST的有效性、隐蔽性和适应性,强调了在LLM-MAS中建立强大通信防护措施的必要性。
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
- [10] arXiv:2508.03130 [中文pdf, pdf, html, 其他]
-
标题: 使用 Logrip 保护小型组织免受 AI 机器人攻击:分层 IP 哈希标题: Protecting Small Organizations from AI Bots with Logrip: Hierarchical IP Hashing评论: 11页,4图主题: 密码学与安全 (cs.CR)
小型组织、初创公司和自托管服务器正面临来自自动化网络爬虫和AI机器人的日益增大的压力,这些机器人在线上的存在在过去几年中大幅增加。 现代机器人能够规避传统限速措施,即使它们行为良好,也能通过庞大的数量降低服务器性能。 我们引入了一种新的安全方法,利用数据可视化和分层IP哈希来分析服务器事件日志,根据访问模式区分人类用户和自动化实体。 通过聚合子网类别的IP活动并应用统计措施,我们的方法能够检测到传统工具无法识别的协调机器人活动和分布式爬取攻击。 通过一个现实世界的例子,我们估计80%到95%的流量源自AI爬虫,这突显了改进过滤机制的必要性。 我们的方法使小型组织能够有效管理自动化流量,在保持公共访问的同时减轻性能下降。
Small organizations, start ups, and self-hosted servers face increasing strain from automated web crawlers and AI bots, whose online presence has increased dramatically in the past few years. Modern bots evade traditional throttling and can degrade server performance through sheer volume even when they are well-behaved. We introduce a novel security approach that leverages data visualization and hierarchical IP hashing to analyze server event logs, distinguishing human users from automated entities based on access patterns. By aggregating IP activity across subnet classes and applying statistical measures, our method detects coordinated bot activity and distributed crawling attacks that conventional tools fail to identify. Using a real world example we estimate that 80 to 95 percent of traffic originates from AI crawlers, underscoring the need for improved filtering mechanisms. Our approach enables small organizations to regulate automated traffic effectively, preserving public access while mitigating performance degradation.
- [11] arXiv:2508.03151 [中文pdf, pdf, html, 其他]
-
标题: WiFinger:使用报文级序列匹配对噪声物联网事件流量进行指纹识别标题: WiFinger: Fingerprinting Noisy IoT Event Traffic Using Packet-level Sequence Matching主题: 密码学与安全 (cs.CR)
物联网环境如智能家居容易受到隐私推断攻击,攻击者可以分析加密网络流量的模式来推断设备的状态甚至人员的活动。 尽管大多数现有攻击利用机器学习技术来发现此类流量模式,但由于无线流量(尤其是Wi-Fi)的噪声大且数据包丢失严重,它们在无线流量上的表现不佳。 此外,这些方法通常针对区分分块的物联网事件流量样本,无法有效同时跟踪多个事件。 在本工作中,我们提出了 WiFinger,一种针对嘈杂流量的细粒度多物联网事件指纹识别方法。 WiFinger将流量模式分类任务转化为子序列匹配问题,并引入了新技术以应对高时间复杂度的同时保持高准确性。 实验表明,我们的方法在Wi-Fi流量上优于现有方法,对于各种物联网事件平均召回率达到85%(相比0.49%和0.46%),并且大多数情况下几乎零误报。
IoT environments such as smart homes are susceptible to privacy inference attacks, where attackers can analyze patterns of encrypted network traffic to infer the state of devices and even the activities of people. While most existing attacks exploit ML techniques for discovering such traffic patterns, they underperform on wireless traffic, especially Wi-Fi, due to its heavy noise and packet losses of wireless sniffing. In addition, these approaches commonly target at distinguishing chunked IoT event traffic samples, and they failed at effectively tracking multiple events simultaneously. In this work, we propose WiFinger, a fine-grained multi-IoT event fingerprinting approach against noisy traffic. WiFinger turns the traffic pattern classification task into a subsequence matching problem and introduces novel techniques to account for the high time complexity while maintaining high accuracy. Experiments demonstrate that our method outperforms existing approaches on Wi-Fi traffic, achieving an average recall of 85% (vs. 0.49% and 0.46%) for various IoT events while maintaining almost zero false positives for most of them.
- [12] arXiv:2508.03221 [中文pdf, pdf, html, 其他]
-
标题: BadBlocks:针对文本到图像扩散模型的低成本且隐蔽的后门攻击标题: BadBlocks: Low-Cost and Stealthy Backdoor Attacks Tailored for Text-to-Image Diffusion Models主题: 密码学与安全 (cs.CR) ; 计算机视觉与模式识别 (cs.CV)
近年来,扩散模型在图像生成领域取得了显著进展。然而,最近的研究表明,扩散模型容易受到后门攻击,在这种攻击中,攻击者可以通过在训练数据集中注入隐蔽的触发器(如特定的视觉模式或文本短语)来操纵输出。幸运的是,随着防御技术的不断进步,防御者已经能够使用视觉检查和基于神经网络的检测方法来识别和缓解大多数后门攻击。然而,在本文中,我们发现了一种新型的后门威胁,与现有方法相比更加轻量且隐蔽,我们将其命名为BadBlocks,它仅需要约30%的计算资源和20%的GPU时间,通常由之前的后门攻击所需,却能成功地注入后门并逃避最先进的防御框架。BadBlocks使攻击者能够在保持其余组件正常功能的同时,选择性地污染扩散模型UNet架构中的特定块。实验结果表明,即使在计算资源和GPU时间极度受限的情况下,BadBlocks也能实现高攻击成功率(ASR)和低感知质量损失(以FID分数衡量)。此外,BadBlocks能够绕过现有的防御框架,特别是基于注意力的后门检测方法,突显了其作为新型且值得关注的威胁。消融研究进一步表明,有效的后门注入不需要微调整个网络,并强调了某些神经网络层在后门映射中的关键作用。总体而言,BadBlocks在各个方面显著降低了进行后门攻击的门槛。它使攻击者即使使用消费级GPU也能将后门注入大规模扩散模型。
In recent years,Diffusion models have achieved remarkable progress in the field of image generation.However,recent studies have shown that diffusion models are susceptible to backdoor attacks,in which attackers can manipulate the output by injecting covert triggers such as specific visual patterns or textual phrases into the training dataset.Fortunately,with the continuous advancement of defense techniques,defenders have become increasingly capable of identifying and mitigating most backdoor attacks using visual inspection and neural network-based detection methods.However,in this paper,we identify a novel type of backdoor threat that is more lightweight and covert than existing approaches,which we name BadBlocks,requires only about 30\% of the computational resources and 20\% GPU time typically needed by previous backdoor attacks,yet it successfully injects backdoors and evades the most advanced defense frameworks.BadBlocks enables attackers to selectively contaminate specific blocks within the UNet architecture of diffusion models while maintaining normal functionality in the remaining components.Experimental results demonstrate that BadBlocks achieves a high attack success rate (ASR) and low perceptual quality loss (as measured by FID Score),even under extremely constrained computational resources and GPU time.Moreover,BadBlocks is able to bypass existing defense frameworks,especially the attention-based backdoor detection method, highlighting it as a novel and noteworthy threat.Ablation studies further demonstrate that effective backdoor injection does not require fine-tuning the entire network and highlight the pivotal role of certain neural network layers in backdoor mapping.Overall,BadBlocks significantly reduces the barrier to conducting backdoor attacks in all aspects.It enables attackers to inject backdoors into large-scale diffusion models even using consumer-grade GPUs.
- [13] arXiv:2508.03307 [中文pdf, pdf, html, 其他]
-
标题: BDFirewall:面向MLaaS中有效且快速的黑盒后门防御标题: BDFirewall: Towards Effective and Expeditiously Black-Box Backdoor Defense in MLaaS评论: 18页主题: 密码学与安全 (cs.CR)
在本文中,我们努力解决黑盒场景下的后门攻击防御挑战,从而加强MLaaS下的推理安全性。 我们首先从一个新的角度对后门触发器进行分类,即它们对修补区域的影响,并将其分为:高可见性触发器(HVT)、半可见性触发器(SVT)和低可见性触发器(LVT)。 基于这种分类,我们提出了一种渐进式防御框架BDFirewall,该框架从最明显到最细微地移除这些触发器,而无需访问模型。 首先,对于产生最显著局部语义扭曲的HVT,我们通过检测这些显著差异来识别并消除它们。 然后,我们恢复修补区域以减轻这种移除过程的不利影响。 然而,为HVT设计的局部净化对全局干扰良性特征的SVT无效。 因此,我们将受SVT污染的输入建模为触发器和良性特征的混合体,在这里我们非常规地将良性特征视为“噪声”。 这种表述使我们能够通过应用去除这些良性“噪声”特征的去噪过程来重建SVT。 然后通过减去重建的触发器获得无SVT的输入。 最后,为了中和几乎不可察觉但脆弱的LVT,我们引入轻量级噪声以破坏触发器模式,然后应用DDPM来恢复对干净特征的任何附带影响。 全面的实验表明,我们的方法优于最先进的防御措施。 与基线相比,BDFirewall平均将攻击成功率(ASR)降低了33.25%,将中毒样本准确率(PA)提高了29.64%,并在推理时间上实现了高达111倍的速度提升。 代码将在接受后公开提供。
In this paper, we endeavor to address the challenges of backdoor attacks countermeasures in black-box scenarios, thereby fortifying the security of inference under MLaaS. We first categorize backdoor triggers from a new perspective, i.e., their impact on the patched area, and divide them into: high-visibility triggers (HVT), semi-visibility triggers (SVT), and low-visibility triggers (LVT). Based on this classification, we propose a progressive defense framework, BDFirewall, that removes these triggers from the most conspicuous to the most subtle, without requiring model access. First, for HVTs, which create the most significant local semantic distortions, we identify and eliminate them by detecting these salient differences. We then restore the patched area to mitigate the adverse impact of such removal process. The localized purification designed for HVTs is, however, ineffective against SVTs, which globally perturb benign features. We therefore model an SVT-poisoned input as a mixture of a trigger and benign features, where we unconventionally treat the benign features as "noise". This formulation allows us to reconstruct SVTs by applying a denoising process that removes these benign "noise" features. The SVT-free input is then obtained by subtracting the reconstructed trigger. Finally, to neutralize the nearly imperceptible but fragile LVTs, we introduce lightweight noise to disrupt the trigger pattern and then apply DDPM to restore any collateral impact on clean features. Comprehensive experiments demonstrate that our method outperforms state-of-the-art defenses. Compared with baselines, BDFirewall reduces the Attack Success Rate (ASR) by an average of 33.25%, improving poisoned sample accuracy (PA) by 29.64%, and achieving up to a 111x speedup in inference time. Code will be made publicly available upon acceptance.
- [14] arXiv:2508.03342 [中文pdf, pdf, html, 其他]
-
标题: 从传统到标准:LLM辅助将网络安全剧本转换为CACAO格式标题: From Legacy to Standard: LLM-Assisted Transformation of Cybersecurity Playbooks into CACAO FormatMehdi Akbari Gurabi, Lasse Nitz, Radu-Mihai Castravet, Roman Matzutt, Avikarsha Mandal, Stefan Decker评论: 20页,包括附录,32篇参考文献,4张表格,7幅主要图表(其中一些包含子图)主题: 密码学与安全 (cs.CR) ; 人工智能 (cs.AI)
现有的网络安全操作手册通常以异构的、不可机器读取的格式编写,这限制了它们在安全编排、自动化和响应平台之间的自动化和互操作性。 本文探讨了结合提示工程的大语言模型在将遗留事件响应操作手册自动转换为标准化的机器可读的CACAO格式方面的适用性。 我们系统地检查了各种提示工程技术,并精心设计了旨在最大化语法准确性和语义保真度以保留控制流的提示。 我们的模块化转换流程集成了一个语法检查器以确保语法正确性,并具有迭代优化机制,逐步减少语法错误。 我们在一个自生成的数据集上评估了所提出的方法,该数据集包含多种遗留操作手册以及手动创建的CACAO参考。 结果表明,我们的方法显著提高了操作手册转换的准确性,有效捕捉了复杂的工作流结构,并大幅减少了错误。 它突显了在自动化网络安全操作手册转换任务中实际部署的潜力。
Existing cybersecurity playbooks are often written in heterogeneous, non-machine-readable formats, which limits their automation and interoperability across Security Orchestration, Automation, and Response platforms. This paper explores the suitability of Large Language Models, combined with Prompt Engineering, to automatically translate legacy incident response playbooks into the standardized, machine-readable CACAO format. We systematically examine various Prompt Engineering techniques and carefully design prompts aimed at maximizing syntactic accuracy and semantic fidelity for control flow preservation. Our modular transformation pipeline integrates a syntax checker to ensure syntactic correctness and features an iterative refinement mechanism that progressively reduces syntactic errors. We evaluate the proposed approach on a custom-generated dataset comprising diverse legacy playbooks paired with manually created CACAO references. The results demonstrate that our method significantly improves the accuracy of playbook transformation over baseline models, effectively captures complex workflow structures, and substantially reduces errors. It highlights the potential for practical deployment in automated cybersecurity playbook transformation tasks.
- [15] arXiv:2508.03413 [中文pdf, pdf, html, 其他]
-
标题: 智能汽车隐私:攻击与隐私问题综述标题: Smart Car Privacy: Survey of Attacks and Privacy Issues评论: 13页,16图主题: 密码学与安全 (cs.CR)
汽车在我们的日常生活中变得越来越重要。 现代汽车高度计算机化,因此可能容易受到攻击。 为车辆提供多种无线连接,可以在车辆和外部环境之间建立桥梁。 这种联网车辆解决方案预计将成为汽车革命的新前沿,也是向下一代智能交通系统演进的关键。 车载自组织网络(VANETs)是新兴的移动自组织网络技术,结合了移动路由协议用于车对车的数据通信,以支持智能交通系统。 因此,由于车辆的移动性,安全性和隐私是VANETs中的主要关注点。 因此,在VANETs中设计安全机制以将对手从网络中移除非常重要。 本文综述了各种车载网络架构。 现代车辆中安全性的演变。 VANETs中的各种安全和隐私攻击及其防御机制,包括示例并分类这些机制。 它还综述了车载网络所具有的各种隐私影响。
Automobiles are becoming increasingly important in our day to day life. Modern automobiles are highly computerized and hence potentially vulnerable to attack. Providing many wireless connectivity for vehicles enables a bridge between vehicles and their external environments. Such a connected vehicle solution is expected to be the next frontier for automotive revolution and the key to the evolution to next generation intelligent transportation systems. Vehicular Ad hoc Networks (VANETs) are emerging mobile ad hoc network technologies incorporating mobile routing protocols for inter-vehicle data communications to support intelligent transportation systems. Thus security and privacy are the major concerns in VANETs due to the mobility of the vehicles. Thus designing security mechanisms to remove adversaries from the network remarkably important in VANETs. This paper provides an overview of various vehicular network architectures. The evolution of security in modern vehicles. Various security and privacy attacks in VANETs with their defending mechanisms with examples and classify these mechanisms. It also provides an overview of various privacy implication that a vehicular network possess.
- [16] arXiv:2508.03474 [中文pdf, pdf, html, 其他]
-
标题: 解开概率森林:预测市场的套利标题: Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets主题: 密码学与安全 (cs.CR) ; 交易与市场微观结构 (q-fin.TR)
Polymarket是一个预测市场平台,用户可以通过交易与特定结果(称为条件)相关的股份来对未来的事件进行投机。每个市场都与一组一个或多个这样的条件相关联。为了确保市场正确结算,条件集必须是穷尽的——涵盖所有可能的结果——并且互斥的——只有一项条件可以被判定为真。因此,所有相关结果的总价格应为1美元,代表任何结果的联合概率为1。尽管如此,Polymarket表现出某些依赖性资产定价错误的情况,允许以低于(或高于)1美元的价格购买(或出售)某个结果,从而保证盈利。这种现象被称为套利,可能使熟练参与者利用这些不一致之处。在本文中,我们对Polymarket数据进行了实证套利分析,以回答三个关键问题:(Q1) 哪些条件导致了套利?(Q2) 套利是否真的发生在Polymarket上?(Q3) 是否有人利用了这些机会。分析相关市场之间的套利的主要挑战在于在大量市场和条件之间进行比较的可扩展性,一种简单的分析需要$O(2^{n+m})$次比较。为了解决这个问题,我们采用了一种基于及时性、主题相似性和组合关系的启发式减少策略,并通过专家输入进一步验证。我们的研究揭示了Polymarket上的两种不同的套利形式:市场再平衡套利,它发生在单个市场或条件内;以及组合套利,它跨越多个市场。我们使用链上历史订单簿数据来分析这些类型的套利机会何时存在,以及何时被用户执行。我们发现实际估算的利润为4000万美元。
Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive -- collectively accounting for all possible outcomes -- and mutually exclusive -- only one condition may resolve as true. Thus, the collective prices of all related outcomes should be \$1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) \$1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies. In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage (Q2) Does arbitrage actually occur on Polymarket and (Q3) Has anyone exploited these opportunities. A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input. Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans across multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted.
- [17] arXiv:2508.03517 [中文pdf, pdf, html, 其他]
-
标题: 基于领域自适应多模态学习的异构网络入侵检测标题: Intrusion Detection in Heterogeneous Networks with Domain-Adaptive Multi-Modal Learning主题: 密码学与安全 (cs.CR)
网络入侵检测系统(NIDS)在保护网络基础设施免受网络攻击方面起着关键作用。 随着这些攻击的普遍性和复杂性不断增加,机器学习和深度神经网络方法已成为增强NIDS检测恶意活动能力的有效工具。 然而,传统深度神经模型的效果通常受到需要大量标记数据集以及不同网络领域间数据和特征异质性带来的挑战的限制。 为解决这些限制,我们开发了一种结合多模态学习与领域自适应技术的深度神经模型进行分类。 我们的模型以顺序循环的方式处理来自不同来源的数据,使其能够从多个数据集中学习并适应不同的特征空间。 实验结果表明,我们的模型在分类网络入侵方面显著优于基线神经模型,特别是在样本可用性和概率分布变化的情况下。 该模型的表现凸显了其在异构数据集上泛化的能力,使其成为现实世界网络入侵检测的高效解决方案。
Network Intrusion Detection Systems (NIDS) play a crucial role in safeguarding network infrastructure against cyberattacks. As the prevalence and sophistication of these attacks increase, machine learning and deep neural network approaches have emerged as effective tools for enhancing NIDS capabilities in detecting malicious activities. However, the effectiveness of traditional deep neural models is often limited by the need for extensive labelled datasets and the challenges posed by data and feature heterogeneity across different network domains. To address these limitations, we developed a deep neural model that integrates multi-modal learning with domain adaptation techniques for classification. Our model processes data from diverse sources in a sequential cyclic manner, allowing it to learn from multiple datasets and adapt to varying feature spaces. Experimental results demonstrate that our proposed model significantly outperforms baseline neural models in classifying network intrusions, particularly under conditions of varying sample availability and probability distributions. The model's performance highlights its ability to generalize across heterogeneous datasets, making it an efficient solution for real-world network intrusion detection.
- [18] arXiv:2508.03588 [中文pdf, pdf, html, 其他]
-
标题: MalFlows:面向Android恶意软件检测的异构流语义上下文感知融合标题: MalFlows: Context-aware Fusion of Heterogeneous Flow Semantics for Android Malware Detection评论: 提交至TDSC主题: 密码学与安全 (cs.CR) ; 软件工程 (cs.SE)
静态分析是Android应用程序检查中的基本技术,它能够提取控制流、数据流和组件间通信(ICCs),这些都是恶意软件检测的关键。然而,现有方法难以利用不同类型流之间的语义互补性来表示程序行为,其上下文不敏感的特性进一步阻碍了跨流语义集成的准确性。我们提出了并实现了MalFlows,一种新颖的技术,能够实现针对Android恶意软件检测的异构流语义的上下文感知融合。我们的目标是利用三种流相关信息的互补优势进行精确的应用程序分析。我们采用异构信息网络(HIN)来建模这些程序流之间的丰富语义。我们进一步提出了flow2vec,这是一种上下文感知的HIN嵌入技术,可以根据不同流之间的上下文约束区分HIN实体的语义,并通过使用多种元路径的联合方式学习准确的应用程序表示。最终,这些表示被输入到基于通道注意力的深度神经网络中进行恶意软件分类。据我们所知,这是首次全面整合各种流相关信息的优势来评估应用程序中的恶意性。我们在一个包含超过2000万流实例的大规模数据集上评估了MalFlows,这些流实例是从31000多个现实世界的应用程序中提取的。实验结果表明,MalFlows在Android恶意软件检测中优于代表性基线方法,同时验证了flow2vec在从构建于异构流上的HIN中准确学习应用程序表示的有效性。
Static analysis, a fundamental technique in Android app examination, enables the extraction of control flows, data flows, and inter-component communications (ICCs), all of which are essential for malware detection. However, existing methods struggle to leverage the semantic complementarity across different types of flows for representing program behaviors, and their context-unaware nature further hinders the accuracy of cross-flow semantic integration. We propose and implement MalFlows, a novel technique that achieves context-aware fusion of heterogeneous flow semantics for Android malware detection. Our goal is to leverage complementary strengths of the three types of flow-related information for precise app profiling. We adopt a heterogeneous information network (HIN) to model the rich semantics across these program flows. We further propose flow2vec, a context-aware HIN embedding technique that distinguishes the semantics of HIN entities as needed based on contextual constraints across different flows and learns accurate app representations through the joint use of multiple meta-paths. The representations are finally fed into a channel-attention-based deep neural network for malware classification. To the best of our knowledge, this is the first study to comprehensively aggregate the strengths of diverse flow-related information for assessing maliciousness within apps. We evaluate MalFlows on a large-scale dataset comprising over 20 million flow instances extracted from more than 31,000 real-world apps. Experimental results demonstrate that MalFlows outperforms representative baselines in Android malware detection, and meanwhile, validate the effectiveness of flow2vec in accurately learning app representations from the HIN constructed over the heterogeneous flows.
新提交 (展示 18 之 18 条目 )
- [19] arXiv:2508.02840 (交叉列表自 cs.LG) [中文pdf, pdf, html, 其他]
-
标题: 通过知识蒸馏和粒子群优化实现资源高效的自动软件漏洞评估标题: Resource-Efficient Automatic Software Vulnerability Assessment via Knowledge Distillation and Particle Swarm Optimization评论: 被工程应用的人工智能接受主题: 机器学习 (cs.LG) ; 密码学与安全 (cs.CR)
软件系统的复杂性不断增加,导致网络安全漏洞激增,这需要高效且可扩展的漏洞评估解决方案。 然而,在现实场景中部署大型预训练模型受到其巨大的计算和存储需求的阻碍。 为了解决这一挑战,我们提出了一种新颖的资源高效的框架,该框架结合了知识蒸馏和粒子群优化,以实现自动化的漏洞评估。 我们的框架采用两阶段方法:首先,利用粒子群优化来优化紧凑学生模型的架构,平衡计算效率和模型容量。 其次,应用知识蒸馏将关键的漏洞评估知识从大型教师模型转移到优化后的学生模型。 这一过程显著减少了模型大小,同时保持了高性能。 在增强的MegaVul数据集上的实验结果表明了我们方法的有效性,该数据集包含12,071个CVSS(通用漏洞评分系统)v3标注的漏洞。 我们的方法在保留原始模型89.3%准确率的同时,实现了模型大小99.4%的减少。 此外,与最先进的基线方法相比,其准确率提高了1.7%,参数减少了60%。 与传统遗传算法相比,该框架还减少了72.1%的训练时间和34.88%的架构搜索时间。
The increasing complexity of software systems has led to a surge in cybersecurity vulnerabilities, necessitating efficient and scalable solutions for vulnerability assessment. However, the deployment of large pre-trained models in real-world scenarios is hindered by their substantial computational and storage demands. To address this challenge, we propose a novel resource-efficient framework that integrates knowledge distillation and particle swarm optimization to enable automated vulnerability assessment. Our framework employs a two-stage approach: First, particle swarm optimization is utilized to optimize the architecture of a compact student model, balancing computational efficiency and model capacity. Second, knowledge distillation is applied to transfer critical vulnerability assessment knowledge from a large teacher model to the optimized student model. This process significantly reduces the model size while maintaining high performance. Experimental results on an enhanced MegaVul dataset, comprising 12,071 CVSS (Common Vulnerability Scoring System) v3 annotated vulnerabilities, demonstrate the effectiveness of our approach. Our approach achieves a 99.4% reduction in model size while retaining 89.3% of the original model's accuracy. Furthermore, it outperforms state-of-the-art baselines by 1.7% in accuracy with 60% fewer parameters. The framework also reduces training time by 72.1% and architecture search time by 34.88% compared to traditional genetic algorithms.
- [20] arXiv:2508.02881 (交叉列表自 eess.SY) [中文pdf, pdf, html, 其他]
-
标题: 基于不确定传感器信号的预防性和反应性防御资源分配优化标题: Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals评论: 6页,6图。已被接受在第61届Allerton通信、控制与计算会议进行展示主题: 系统与控制 (eess.SY) ; 密码学与安全 (cs.CR) ; 计算机科学与博弈论 (cs.GT)
网络攻击在网络安全防御技术进步的情况下仍然是一个令人担忧的问题。 尽管无法完全防止网络攻击,但标准的决策框架通常关注如何防止攻击成功,而没有考虑成功攻击造成的损害清理成本。 这促使我们研究本文中提出的新的资源分配问题:防御者必须决定如何在其投资在预防性防御(旨在使节点免受攻击)和反应性防御(旨在快速清理被入侵的节点)之间进行分配。 这会遇到与观察结果或传感器信号相关的不确定性所带来的挑战,即是否一个节点确实被入侵;这种不确定性是真实的,因为攻击检测器并不完美。 我们研究了传感器信号的质量如何影响防御者在两种防御类型上的战略投资,以及最终可以实现的安全水平。 特别是,我们表明,随着传感器质量的提高,预防性资源的最佳投资增加,因此反应性资源的投资减少。 我们还表明,相对于不使用传感器的基准情况,防御者的性能提升在攻击者只能实现低攻击成功率时达到最大。
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
- [21] arXiv:2508.02921 (交叉列表自 cs.AI) [中文pdf, pdf, html, 其他]
-
标题: PentestJudge:根据操作要求判断代理行为标题: PentestJudge: Judging Agent Behavior Against Operational Requirements评论: 18页,5图,3表主题: 人工智能 (cs.AI) ; 密码学与安全 (cs.CR)
我们引入了PentestJudge,这是一个用于评估渗透测试代理操作的系统。 PentestJudge是一个大型语言模型(LLM)作为评判者,能够访问工具,使其可以消耗任意的代理状态轨迹和工具调用历史,以确定安全代理的行为是否符合某些操作标准,这些标准在程序上难以评估。 我们制定了评分标准,使用树状结构将特定环境中的渗透测试任务分层压缩为更小、更简单和更易管理的子任务和标准,直到每个叶节点代表PentestJudge可以评估的简单是/否标准。 任务节点被分解为与操作目标、操作安全性和战术相关的不同类别。 LLM作为评判者的评分与人类领域专家进行比较,作为基准参考,使我们能够使用标准二分类指标(如F1分数)比较它们的相对性能。 我们评估了几种前沿和开源模型作为评判代理,表现最好的模型达到了0.83的F1分数。 我们发现那些在工具使用方面表现更好的模型更接近人类专家。 通过按需求类型对F1分数进行分层,我们发现即使总体分数相似的模型在不同类型的问题上也存在困难,这表明某些模型可能在特定的操作标准上是更好的评判者。 我们发现较弱且成本较低的模型可以评判由更强且更昂贵的模型执行的渗透测试轨迹,这表明对于渗透测试任务来说,验证可能比生成更容易。 我们分享了这种方法,以促进未来的研究,理解评判者全面且可扩展地评估基于人工智能的信息安全代理的过程质量的能力,从而使它们能够在敏感的生产环境中被自信地使用。
We introduce PentestJudge, a system for evaluating the operations of penetration testing agents. PentestJudge is a large language model (LLM)-as-judge with access to tools that allow it to consume arbitrary trajectories of agent states and tool call history to determine whether a security agent's actions meet certain operating criteria that would be impractical to evaluate programmatically. We develop rubrics that use a tree structure to hierarchically collapse the penetration testing task for a particular environment into smaller, simpler, and more manageable sub-tasks and criteria until each leaf node represents simple yes-or-no criteria for PentestJudge to evaluate. Task nodes are broken down into different categories related to operational objectives, operational security, and tradecraft. LLM-as-judge scores are compared to human domain experts as a ground-truth reference, allowing us to compare their relative performance with standard binary classification metrics, such as F1 scores. We evaluate several frontier and open-source models acting as judge agents, with the best model reaching an F1 score of 0.83. We find models that are better at tool-use perform more closely to human experts. By stratifying the F1 scores by requirement type, we find even models with similar overall scores struggle with different types of questions, suggesting certain models may be better judges of particular operating criteria. We find that weaker and cheaper models can judge the trajectories of pentests performed by stronger and more expensive models, suggesting verification may be easier than generation for the penetration testing task. We share this methodology to facilitate future research in understanding the ability of judges to holistically and scalably evaluate the process quality of AI-based information security agents so that they may be confidently used in sensitive production environments.
- [22] arXiv:2508.02961 (交叉列表自 cs.AI) [中文pdf, pdf, html, 其他]
-
标题: 通过自我意识防御大语言模型标题: Defend LLMs Through Self-Consciousness评论: 发表于KDD伦理人工智能方法与应用研讨会(EAI)2025主题: 人工智能 (cs.AI) ; 计算与语言 (cs.CL) ; 密码学与安全 (cs.CR)
本文介绍了一种针对大型语言模型(LLMs)的新型自我意识防御机制,以应对提示注入攻击。与依赖外部分类器的传统方法不同,我们的方法利用LLM固有的推理能力进行自我保护。我们提出了一种框架,结合了元认知和仲裁模块,使LLMs能够自主评估和调节自身输出。我们的方法在七个最先进的LLMs上进行了评估,使用了两个数据集:AdvBench和Prompt-Injection-Mixed-Techniques-2024。实验结果表明,在模型和数据集上防御成功率有显著提高,一些模型在增强模式下实现了完美和接近完美的防御。我们还分析了防御成功率提升与计算开销之间的权衡。这种自我意识方法为增强LLM伦理提供了一种轻量级、成本效益高的解决方案,尤其对各种平台上的GenAI应用场景有益。
This paper introduces a novel self-consciousness defense mechanism for Large Language Models (LLMs) to combat prompt injection attacks. Unlike traditional approaches that rely on external classifiers, our method leverages the LLM's inherent reasoning capabilities to perform self-protection. We propose a framework that incorporates Meta-Cognitive and Arbitration Modules, enabling LLMs to evaluate and regulate their own outputs autonomously. Our approach is evaluated on seven state-of-the-art LLMs using two datasets: AdvBench and Prompt-Injection-Mixed-Techniques-2024. Experiment results demonstrate significant improvements in defense success rates across models and datasets, with some achieving perfect and near-perfect defense in Enhanced Mode. We also analyze the trade-off between defense success rate improvement and computational overhead. This self-consciousness method offers a lightweight, cost-effective solution for enhancing LLM ethics, particularly beneficial for GenAI use cases across various platforms.
- [23] arXiv:2508.03091 (交叉列表自 cs.AI) [中文pdf, pdf, html, 其他]
-
标题: T2UE:从文本描述生成不可学习的示例标题: T2UE: Generating Unlearnable Examples from Text Descriptions评论: 将出现在ACM MM 2025上主题: 人工智能 (cs.AI) ; 密码学与安全 (cs.CR) ; 计算机视觉与模式识别 (cs.CV)
大规模预训练框架如CLIP已经革新了多模态学习,但它们依赖于从网络上抓取的数据集,这些数据集经常包含私人用户数据,这引发了对滥用的严重担忧。 不可遗忘示例 (UEs) 作为一种有前途的对抗未经授权模型训练的对策,通过精心设计的不可遗忘噪声来干扰从受保护数据中学习有意义表示的过程。 当前的方法通常通过联合优化图像及其相关文本描述(或标签)的不可遗忘噪声来生成UEs。 然而,这种优化过程对于设备上的执行来说通常计算成本过高,迫使依赖外部第三方服务。 这创造了一个根本性的隐私悖论:用户必须最初向这些服务暴露他们的数据以实现保护,从而在过程中损害隐私。 这种矛盾严重阻碍了实用、可扩展的数据保护解决方案的发展。 为了解决这个悖论,我们引入了 \textbf{文本到不可学习的 示例(T2UE)},一种新的框架,使用户仅使用文本描述就能生成UEs。 T2UE通过使用文本到图像(T2I)模型将文本描述映射到图像 (噪声)空间,并结合一个误差最小化框架来生成有效的不可遗忘噪声,从而避免了原始图像数据的需求。 大量实验表明,T2UE保护的数据显著降低了最先进的模型在下游任务(例如跨模态检索)中的性能。 值得注意的是,保护效果在各种架构中都能泛化,甚至适用于监督学习设置。 我们的工作展示了“零接触数据保护”的可行性,其中个人数据可以仅基于其文本描述得到保护,而无需直接暴露数据。
Large-scale pre-training frameworks like CLIP have revolutionized multimodal learning, but their reliance on web-scraped datasets, frequently containing private user data, raises serious concerns about misuse. Unlearnable Examples (UEs) have emerged as a promising countermeasure against unauthorized model training, employing carefully crafted unlearnable noise to disrupt the learning of meaningful representations from protected data. Current approaches typically generate UEs by jointly optimizing unlearnable noise for both images and their associated text descriptions (or labels). However, this optimization process is often computationally prohibitive for on-device execution, forcing reliance on external third-party services. This creates a fundamental privacy paradox: users must initially expose their data to these very services to achieve protection, thereby compromising privacy in the process. Such a contradiction has severely hindered the development of practical, scalable data protection solutions. To resolve this paradox, we introduce \textbf{Text-to-Unlearnable Example (T2UE)}, a novel framework that enables users to generate UEs using only text descriptions. T2UE circumvents the need for original image data by employing a text-to-image (T2I) model to map text descriptions into the image (noise) space, combined with an error-minimization framework to produce effective unlearnable noise. Extensive experiments show that T2UE-protected data substantially degrades performance in downstream tasks (e.g., cross-modal retrieval) for state-of-the-art models. Notably, the protective effect generalizes across diverse architectures and even to supervised learning settings. Our work demonstrates the feasibility of "zero-contact data protection", where personal data can be safeguarded based solely on their textual descriptions, eliminating the need for direct data exposure.
- [24] arXiv:2508.03321 (交叉列表自 cs.NI) [中文pdf, pdf, html, 其他]
-
标题: 双向TLS握手缓存用于受限工业物联网场景标题: Bidirectional TLS Handshake Caching for Constrained Industrial IoT Scenarios评论: 已接受发表于2025年IEEE第50届局域计算机网络会议(LCN)论文集主题: 网络与互联网架构 (cs.NI) ; 密码学与安全 (cs.CR)
虽然TLS已经成为端到端安全的事实标准,但在不断发展的工业物联网场景中,由于设备和网络的普遍资源限制,其用于保护关键通信的能力受到严重限制。 最明显的是,建立安全连接的TLS握手会带来显著的带宽和处理开销,这在资源受限的环境中通常无法处理。 为了缓解这种情况,我们提出了BiTHaC,它通过利用重复TLS握手的大部分内容,尤其是证书,是静态的,从而实现了双向TLS握手缓存。 因此,不需要传输冗余信息,也不需要执行相应的计算,从而节省宝贵的带宽和处理资源。 通过为wolfSSL实现BiTHaC,我们证明可以将TLS握手的带宽消耗减少多达61.1%,计算开销减少多达8.5%,同时仅产生可管理的内存开销,并保持TLS的严格安全保证。
While TLS has become the de-facto standard for end-to-end security, its use to secure critical communication in evolving industrial IoT scenarios is severely limited by prevalent resource constraints of devices and networks. Most notably, the TLS handshake to establish secure connections incurs significant bandwidth and processing overhead that often cannot be handled in constrained environments. To alleviate this situation, we present BiTHaC which realizes bidirectional TLS handshake caching by exploiting that significant parts of repeated TLS handshakes, especially certificates, are static. Thus, redundant information neither needs to be transmitted nor corresponding computations performed, saving valuable bandwidth and processing resources. By implementing BiTHaC for wolfSSL, we show that we can reduce the bandwidth consumption of TLS handshakes by up to 61.1% and the computational overhead by up to 8.5%, while incurring only well-manageable memory overhead and preserving the strict security guarantees of TLS.
- [25] arXiv:2508.03365 (交叉列表自 cs.SD) [中文pdf, pdf, html, 其他]
-
标题: 当良好声音变得对抗性:使用良性输入破解音频-语言模型标题: When Good Sounds Go Adversarial: Jailbreaking Audio-Language Models with Benign InputsBodam Kim, Hiskias Dingeto, Taeyoun Kwon, Dasol Choi, DongGeon Lee, Haon Park, JaeHoon Lee, Jongho Shin主题: 声音 (cs.SD) ; 人工智能 (cs.AI) ; 密码学与安全 (cs.CR) ; 音频与语音处理 (eess.AS)
随着大型语言模型日益融入日常生活,音频已成为人机交互的关键接口。 然而,这种便利性也引入了新的漏洞,使音频成为对手的潜在攻击面。 我们的研究介绍了WhisperInject,这是一种两阶段的对抗性音频攻击框架,能够操纵最先进的音频语言模型生成有害内容。 我们的方法在音频输入中使用对人类听众无害的不可察觉扰动。 第一阶段使用一种新颖的基于奖励的优化方法,基于投影梯度下降的强化学习(RL-PGD),引导目标模型绕过自身的安全协议并生成有害的原生响应。 然后,这种原生有害响应成为第二阶段的目标,即载荷注入,我们使用投影梯度下降(PGD)来优化嵌入到良性音频载体中的细微扰动,例如天气查询或问候信息。 在严格的StrongREJECT、LlamaGuard以及人类评估安全评估框架下进行验证,我们的实验表明,在Qwen2.5-Omni-3B、Qwen2.5-Omni-7B和Phi-4-Multimodal上的成功率超过86%。 我们的工作展示了一类新的实用的、原生音频威胁,超越了理论上的漏洞,揭示了一种可行且隐蔽的操控AI行为的方法。
As large language models become increasingly integrated into daily life, audio has emerged as a key interface for human-AI interaction. However, this convenience also introduces new vulnerabilities, making audio a potential attack surface for adversaries. Our research introduces WhisperInject, a two-stage adversarial audio attack framework that can manipulate state-of-the-art audio language models to generate harmful content. Our method uses imperceptible perturbations in audio inputs that remain benign to human listeners. The first stage uses a novel reward-based optimization method, Reinforcement Learning with Projected Gradient Descent (RL-PGD), to guide the target model to circumvent its own safety protocols and generate harmful native responses. This native harmful response then serves as the target for Stage 2, Payload Injection, where we use Projected Gradient Descent (PGD) to optimize subtle perturbations that are embedded into benign audio carriers, such as weather queries or greeting messages. Validated under the rigorous StrongREJECT, LlamaGuard, as well as Human Evaluation safety evaluation framework, our experiments demonstrate a success rate exceeding 86% across Qwen2.5-Omni-3B, Qwen2.5-Omni-7B, and Phi-4-Multimodal. Our work demonstrates a new class of practical, audio-native threats, moving beyond theoretical exploits to reveal a feasible and covert method for manipulating AI behavior.
- [26] arXiv:2508.03681 (交叉列表自 cs.IT) [中文pdf, pdf, html, 其他]
-
标题: 如果,但私密地:私密反事实检索标题: What If, But Privately: Private Counterfactual Retrieval评论: arXiv管理员注释:与arXiv:2410.13812、arXiv:2411.10429文本重叠主题: 信息论 (cs.IT) ; 密码学与安全 (cs.CR) ; 机器学习 (cs.LG) ; 网络与互联网架构 (cs.NI) ; 信号处理 (eess.SP)
透明性和可解释性是在高风险应用中使用黑盒机器学习模型时需要考虑的两个重要方面。 提供反事实解释是满足这一需求的一种方式。 然而,这也对提供解释的机构以及请求解释的用户的隐私构成威胁。 在本工作中,我们主要关注用户隐私,用户希望检索一个反事实实例,而无需向机构透露其特征向量。 我们的框架从接受点的数据库中检索精确的最近邻反事实解释,同时实现用户的信息理论隐私。 首先,我们引入了私有反事实检索(PCR)问题,并提出了一种基线PCR方案,该方案从机构的角度保持用户特征向量的信息理论隐私。 在此基础上,我们提出了另外两种方案,与基线方案相比,这些方案减少了泄露给用户的机构数据库信息量。 其次,我们放松了所有特征可变的假设,考虑了不可变PCR(I-PCR)的设置。 在此设置中,用户检索最近的反事实,而不改变其特征的一个私有子集,该子集构成不可变集,同时保持其特征向量和不可变集的隐私。 为此,我们提出了两种方案,这些方案在信息理论层面保护用户的隐私,但确保不同程度的数据库隐私。 第三,我们将我们的PCR和I-PCR方案扩展以包含用户对其属性转换的偏好,以便获得更具行动性的解释。 最后,我们提供了数值结果来支持我们的理论发现,并比较了所提方案的数据库泄露情况。
Transparency and explainability are two important aspects to be considered when employing black-box machine learning models in high-stake applications. Providing counterfactual explanations is one way of catering this requirement. However, this also poses a threat to the privacy of the institution that is providing the explanation, as well as the user who is requesting it. In this work, we are primarily concerned with the user's privacy who wants to retrieve a counterfactual instance, without revealing their feature vector to the institution. Our framework retrieves the exact nearest neighbor counterfactual explanation from a database of accepted points while achieving perfect, information-theoretic, privacy for the user. First, we introduce the problem of private counterfactual retrieval (PCR) and propose a baseline PCR scheme that keeps the user's feature vector information-theoretically private from the institution. Building on this, we propose two other schemes that reduce the amount of information leaked about the institution database to the user, compared to the baseline scheme. Second, we relax the assumption of mutability of all features, and consider the setting of immutable PCR (I-PCR). Here, the user retrieves the nearest counterfactual without altering a private subset of their features, which constitutes the immutable set, while keeping their feature vector and immutable set private from the institution. For this, we propose two schemes that preserve the user's privacy information-theoretically, but ensure varying degrees of database privacy. Third, we extend our PCR and I-PCR schemes to incorporate user's preference on transforming their attributes, so that a more actionable explanation can be received. Finally, we present numerical results to support our theoretical findings, and compare the database leakage of the proposed schemes.
交叉提交 (展示 8 之 8 条目 )
- [27] arXiv:2409.03344 (替换) [中文pdf, pdf, html, 其他]
-
标题: 重新审视具有先验知识的DP训练的隐私-效用权衡标题: Revisiting Privacy-Utility Trade-off for DP Training with Pre-existing Knowledge评论: 16页主题: 密码学与安全 (cs.CR)
差分隐私(DP)通过在隐私敏感数据集上定制随机机制,提供了一个可证明的保护个体的框架。 深度学习模型在模型暴露中表现出隐私风险,因为作为已建立的学习模型,它无意中记录了成员级别的隐私泄露。 差分私有随机梯度下降(DP-SGD)被提出,通过在反向传播中的梯度更新中添加随机高斯噪声来保护训练个体。 研究人员发现,DP-SGD会导致效用损失,因为注入的同质噪声可以改变每次迭代中计算的梯度更新。 也就是说,所有梯度元素都会被污染,无论它们在更新模型参数中的重要性如何。 在本工作中,我们认为通过引入注入噪声的异质性可以优化效用。 因此,我们通过定义一个异质随机机制来抽象其特性,提出了一种通用的带有异质噪声的差分隐私框架(DP-Hero)。 DP-Hero的见解是利用先前训练模型中编码的知识来指导后续噪声异质性的分配,从而利用统计扰动并实现增强的效用。 在DP-Hero之上,我们实例化了一个异质版本的DP-SGD,并进一步将其扩展到联邦训练。 我们进行了全面的实验,以验证和解释所提出的DP-Hero的有效性,显示了与最先进工作相比改进的训练准确性。 广泛地说,我们通过从先前训练模型中编码的泄漏知识中学习噪声指导,阐明了改善隐私-效用空间的方法,展示了理解效用提升的差分隐私训练的不同视角。
Differential privacy (DP) provides a provable framework for protecting individuals by customizing a random mechanism over a privacy-sensitive dataset. Deep learning models have demonstrated privacy risks in model exposure as an established learning model unintentionally records membership-level privacy leakage. Differentially private stochastic gradient descent (DP-SGD) has been proposed to safeguard training individuals by adding random Gaussian noise to gradient updates in the backpropagation. Researchers identify that DP-SGD causes utility loss since the injected homogeneous noise can alter the gradient updates calculated at each iteration. Namely, all elements in the gradient are contaminated regardless of their importance in updating model parameters. In this work, we argue that the utility can be optimized by involving the heterogeneity of the the injected noise. Consequently, we propose a generic differential privacy framework with heterogeneous noise (DP-Hero) by defining a heterogeneous random mechanism to abstract its property. The insight of DP-Hero is to leverage the knowledge encoded in the previously trained model to guide the subsequent allocation of noise heterogeneity, thereby leveraging the statistical perturbation and achieving enhanced utility. Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, and further extend it to federated training. We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works. Broadly, we shed light on improving the privacy-utility space by learning the noise guidance from the pre-existing leaked knowledge encoded in the previously trained model, showing a different perspective of understanding the utility-improved DP training.
- [28] arXiv:2412.14499 (替换) [中文pdf, pdf, html, 其他]
-
标题: PrivDiffuser:用于传感器网络中数据模糊处理的隐私引导扩散模型标题: PrivDiffuser: Privacy-Guided Diffusion Model for Data Obfuscation in Sensor Networks期刊参考: 隐私增强技术会议录,2025(4), 40-55主题: 密码学与安全 (cs.CR) ; 机器学习 (cs.LG)
传感器数据由物联网(IoT)设备收集,可以揭示个人的敏感个人信息,当与半可信服务提供商共享时,会引发严重的隐私问题,因为它们可能使用机器学习模型提取这些信息。 由生成模型驱动的数据混淆是一种有前途的方法,可以生成合成数据,从而在保留原始数据中的有用信息的同时隐藏敏感信息。 然后将此新生成的数据与服务提供商共享,而不是原始传感器数据。 在本工作中,我们提出了 PrivDiffuser,一种基于去噪扩散模型的新颖数据混淆技术,通过结合有效的引导技术,在数据效用和隐私之间实现了优越的权衡。 具体而言,我们从传感器数据中提取包含公共和私人属性信息的潜在表示来引导扩散模型,并在学习潜在表示时引入基于互信息的正则化,以缓解公共和私人属性的纠缠,从而提高引导的有效性。 在包含不同传感模态的三个真实世界数据集上的评估表明,PrivDiffuser在数据混淆方面比最先进的方法具有更好的隐私-效用权衡,效用损失最多减少$1.81\%$,隐私损失最多减少$3.42\%$。 此外,与现有的混淆方法相比,PrivDiffuser提供了独特的优势,使具有不同隐私需求的用户能够在不重新训练生成模型的情况下保护其隐私。
Sensor data collected by Internet of Things (IoT) devices can reveal sensitive personal information about individuals, raising significant privacy concerns when shared with semi-trusted service providers, as they may extract this information using machine learning models. Data obfuscation empowered by generative models is a promising approach to generate synthetic data such that useful information contained in the original data is preserved while sensitive information is obscured. This newly generated data will then be shared with service providers instead of the original sensor data. In this work, we propose PrivDiffuser, a novel data obfuscation technique based on a denoising diffusion model that achieves a superior trade-off between data utility and privacy by incorporating effective guidance techniques. Specifically, we extract latent representations that contain information about public and private attributes from sensor data to guide the diffusion model, and impose mutual information-based regularization when learning the latent representations to alleviate the entanglement of public and private attributes, thereby increasing the effectiveness of guidance. Evaluation on three real-world datasets containing different sensing modalities reveals that PrivDiffuser yields a better privacy-utility trade-off than the state-of-the-art in data obfuscation, decreasing the utility loss by up to $1.81\%$ and the privacy loss by up to $3.42\%$. Moreover, compared with existing obfuscation approaches, PrivDiffuser offers the unique benefit of allowing users with diverse privacy needs to protect their privacy without having to retrain the generative model.
- [29] arXiv:2504.21054 (替换) [中文pdf, pdf, html, 其他]
-
标题: FFCBA:基于特征的全目标干净标签后门攻击标题: FFCBA: Feature-based Full-target Clean-label Backdoor Attacks主题: 密码学与安全 (cs.CR) ; 人工智能 (cs.AI)
后门攻击对深度神经网络构成重大威胁,因为带有后门的模型会将带有特定触发器的中毒样本错误分类到目标类别,同时在干净样本上保持正常性能。 其中,多目标后门攻击可以同时针对多个类别。 然而,现有的多目标后门攻击都遵循脏标签范式,其中中毒样本被错误标记,而且大多数需要极高的中毒率。 这使得它们容易通过人工检查被检测到。 相比之下,干净标签攻击更加隐蔽,因为它们避免修改中毒样本的标签。 然而,它们通常难以实现稳定且令人满意的攻击性能,并且往往无法有效扩展到多目标攻击。 为了解决这个问题,我们提出了基于特征的全目标干净标签后门攻击(FFCBA),它包括两种范式:特征扩展后门攻击(FSBA)和特征迁移后门攻击(FMBA)。 FSBA利用类条件自编码器生成与原类别特征对齐的噪声触发器,确保触发器的有效性、类内一致性、类间特异性和自然特征相关性。 虽然FSBA支持快速高效的攻击,但其跨模型攻击能力相对较弱。 FMBA采用两阶段的类条件自编码器训练过程,交替使用类外样本和类内样本。 这使得FMBA能够生成具有强大目标类别特征的触发器,使其在跨模型攻击中非常有效。 我们在多个数据集和模型上进行了实验,结果表明FFCBA实现了出色的攻击性能,并且对最先进的后门防御保持了良好的鲁棒性。
Backdoor attacks pose a significant threat to deep neural networks, as backdoored models would misclassify poisoned samples with specific triggers into target classes while maintaining normal performance on clean samples. Among these, multi-target backdoor attacks can simultaneously target multiple classes. However, existing multi-target backdoor attacks all follow the dirty-label paradigm, where poisoned samples are mislabeled, and most of them require an extremely high poisoning rate. This makes them easily detectable by manual inspection. In contrast, clean-label attacks are more stealthy, as they avoid modifying the labels of poisoned samples. However, they generally struggle to achieve stable and satisfactory attack performance and often fail to scale effectively to multi-target attacks. To address this issue, we propose the Feature-based Full-target Clean-label Backdoor Attacks (FFCBA) which consists of two paradigms: Feature-Spanning Backdoor Attacks (FSBA) and Feature-Migrating Backdoor Attacks (FMBA). FSBA leverages class-conditional autoencoders to generate noise triggers that align perturbed in-class samples with the original category's features, ensuring the effectiveness, intra-class consistency, inter-class specificity and natural-feature correlation of triggers. While FSBA supports swift and efficient attacks, its cross-model attack capability is relatively weak. FMBA employs a two-stage class-conditional autoencoder training process that alternates between using out-of-class samples and in-class samples. This allows FMBA to generate triggers with strong target-class features, making it highly effective for cross-model attacks. We conduct experiments on multiple datasets and models, the results show that FFCBA achieves outstanding attack performance and maintains desirable robustness against the state-of-the-art backdoor defenses.
- [30] arXiv:2506.06742 (替换) [中文pdf, pdf, html, 其他]
-
标题: LADSG:垂直联邦学习中标签隐私的标签匿名化蒸馏和相似梯度替换标题: LADSG: Label-Anonymized Distillation and Similar Gradient Substitution for Label Privacy in Vertical Federated Learning评论: 正在审核中主题: 密码学与安全 (cs.CR)
垂直联邦学习(VFL)已成为跨分布式特征空间协作模型训练的有前景的范式,它能够在不共享原始数据的情况下实现隐私保护学习。 然而,最近的研究已经证实了内部对手进行标签推断攻击的可行性。 通过战略性地利用梯度向量和语义嵌入,攻击者通过被动、主动或直接攻击可以准确重建私有标签,导致灾难性的数据泄露。 现有的防御措施通常只针对孤立的泄漏向量,或者为特定类型的攻击设计,对同时利用多个路径的新兴混合攻击仍然脆弱。 为了弥补这一差距,我们提出了 带有替换梯度的标签匿名化防御(LADSG),这是一种统一且轻量的VFL防御框架。 LADSG首先通过软蒸馏对真实标签进行匿名化以减少语义暴露,然后生成语义对齐的替代梯度以破坏基于梯度的泄漏,最后通过梯度范数检测过滤异常更新。 它具有可扩展性,并且与标准的VFL流水线兼容。 在六个现实数据集上的广泛实验表明,LADSG在计算开销最小的情况下将所有三种标签推断攻击的成功率降低了30-60%,证明了其实际有效性。
Vertical Federated Learning (VFL) has emerged as a promising paradigm for collaborative model training across distributed feature spaces, which enables privacy-preserving learning without sharing raw data. However, recent studies have confirmed the feasibility of label inference attacks by internal adversaries. By strategically exploiting gradient vectors and semantic embeddings, attackers-through passive, active, or direct attacks-can accurately reconstruct private labels, leading to catastrophic data leakage. Existing defenses, which typically address isolated leakage vectors or are designed for specific types of attacks, remain vulnerable to emerging hybrid attacks that exploit multiple pathways simultaneously. To bridge this gap, we propose Label-Anonymized Defense with Substitution Gradient (LADSG), a unified and lightweight defense framework for VFL. LADSG first anonymizes true labels via soft distillation to reduce semantic exposure, then generates semantically-aligned substitute gradients to disrupt gradient-based leakage, and finally filters anomalous updates through gradient norm detection. It is scalable and compatible with standard VFL pipelines. Extensive experiments on six real-world datasets show that LADSG reduces the success rates of all three types of label inference attacks by 30-60% with minimal computational overhead, demonstrating its practical effectiveness.
- [31] arXiv:2506.10323 (替换) [中文pdf, pdf, html, 其他]
-
标题: ELFuzz:通过模糊器空间上的LLM驱动合成进行高效输入生成标题: ELFuzz: Efficient Input Generation via LLM-driven Synthesis Over Fuzzer Space评论: 被USENIX Security'25 第二轮接收期刊参考: 第34届USENIX安全研讨会,2025年主题: 密码学与安全 (cs.CR) ; 软件工程 (cs.SE)
基于生成的模糊测试根据输入语法和语义约束的规范生成适当的测试用例,以测试系统和软件。 然而,这些规范需要大量的手动努力来构建。 本文提出了一种新方法,ELFuzz(通过大型语言模型进行进化模糊测试),该方法通过在模糊器空间上的LLM驱动合成,自动合成针对被测系统(SUT)的生成式模糊器。 总体而言,它从最小的种子模糊器开始,并通过完全自动化的LLM驱动进化(带有覆盖率指导)推动合成。 与之前的方法相比,ELFuzz可以1)无缝扩展到实际大小的SUT——在我们的评估中达到1,791,104行代码——以及2)合成高效的模糊器,以人类可理解的方式捕捉有趣的语法结构和语义约束。 我们的评估将ELFuzz与领域专家手动编写的规范以及最先进的方法合成的规范进行了比较。 结果显示,ELFuzz的覆盖率最多提高了434.8%,触发的人工注入错误最多增加了174.0%。 我们还使用ELFuzz对cvc5最新版本进行了14天的真实世界模糊测试,令人鼓舞的是,它发现了五个0-day漏洞(其中三个可被利用)。 此外,我们进行了消融研究,结果表明,模糊器空间模型是ELFuzz的关键组件,对ELFuzz的有效性贡献最大(高达62.5%)。 对ELFuzz合成的模糊器的进一步分析证实,它们以人类可理解的方式捕捉有趣的语法结构和语义约束。 结果展示了ELFuzz在更自动化、高效和可扩展的模糊测试输入生成方面的潜在前景。
Generation-based fuzzing produces appropriate testing cases according to specifications of input grammars and semantic constraints to test systems and software. However, these specifications require significant manual efforts to construct. This paper proposes a new approach, ELFuzz (Evolution Through Large Language Models for Fuzzing), that automatically synthesizes generation-based fuzzers tailored to a system under test (SUT) via LLM-driven synthesis over fuzzer space. At a high level, it starts with minimal seed fuzzers and propels the synthesis by fully automated LLM-driven evolution with coverage guidance. Compared to previous approaches, ELFuzz can 1) seamlessly scale to SUTs of real-world sizes -- up to 1,791,104 lines of code in our evaluation -- and 2) synthesize efficient fuzzers that catch interesting grammatical structures and semantic constraints in a human-understandable way. Our evaluation compared ELFuzz with specifications manually written by domain experts and synthesized by state-of-the-art approaches. It shows that ELFuzz achieves up to 434.8% more coverage and triggers up to 174.0% more artificially injected bugs. We also used ELFuzz to conduct a real-world fuzzing campaign on the newest version of cvc5 for 14 days, and encouragingly, it found five 0-day bugs (three are exploitable). Moreover, we conducted an ablation study, which shows that the fuzzer space model, the key component of ELFuzz, contributes the most (up to 62.5%) to the effectiveness of ELFuzz. Further analysis of the fuzzers synthesized by ELFuzz confirms that they catch interesting grammatical structures and semantic constraints in a human-understandable way. The results present the promising potential of ELFuzz for more automated, efficient, and extensible input generation for fuzzing.
- [32] arXiv:2506.21688 (替换) [中文pdf, pdf, html, 其他]
-
标题: CyGym:基于仿真的网络安全博弈论分析框架标题: CyGym: A Simulation-Based Game-Theoretic Analysis Framework for Cybersecurity主题: 密码学与安全 (cs.CR) ; 计算机科学与博弈论 (cs.GT)
我们引入了一种新的网络安全对抗模拟器,该模拟器在网络安全防御者和攻击者之间设计,旨在促进博弈论建模和分析,同时保持真实网络安全防御的许多重要特征。 我们的模拟器构建在OpenAI Gym框架内,包含了现实的网络拓扑结构、漏洞、利用工具(包括零日漏洞)和防御机制。 此外,我们使用该模拟器提供了一个基于仿真的博弈论模型,用于网络防御,该模型采用了一种新颖的方法来建模零日漏洞利用,并采用类似PSRO的方法近似计算该游戏中的均衡点。 我们使用我们的模拟器和相关的博弈论框架来分析Volt Typhoon高级持续性威胁(APT)。 Volt Typhoon代表由国家支持的行动者使用的复杂网络攻击策略,其特点是隐蔽、长期的渗透和对网络漏洞的利用。 我们的实验结果证明了博弈论策略在理解针对APT和零日漏洞(如Volt Typhoon)的网络弹性方面的有效性,为最佳防御姿态和主动威胁缓解提供了有价值的见解。
We introduce a novel cybersecurity encounter simulator between a network defender and an attacker designed to facilitate game-theoretic modeling and analysis while maintaining many significant features of real cyber defense. Our simulator, built within the OpenAI Gym framework, incorporates realistic network topologies, vulnerabilities, exploits (including-zero-days), and defensive mechanisms. Additionally, we provide a formal simulation-based game-theoretic model of cyberdefense using this simulator, which features a novel approach to modeling zero-days exploits, and a PSRO-style approach for approximately computing equilibria in this game. We use our simulator and associated game-theoretic framework to analyze the Volt Typhoon advanced persistent threat (APT). Volt Typhoon represents a sophisticated cyber attack strategy employed by state-sponsored actors, characterized by stealthy, prolonged infiltration and exploitation of network vulnerabilities. Our experimental results demonstrate the efficacy of game-theoretic strategies in understanding network resilience against APTs and zero-days, such as Volt Typhoon, providing valuable insight into optimal defensive posture and proactive threat mitigation.
- [33] arXiv:2508.00756 (替换) [中文pdf, pdf, html, 其他]
-
标题: LeakyCLIP:从CLIP中提取训练数据标题: LeakyCLIP: Extracting Training Data from CLIP主题: 密码学与安全 (cs.CR)
理解对比语言-图像预训练(CLIP)中的记忆化和隐私泄露风险对于确保多模态模型的安全性至关重要。最近的研究已经证明了从扩散模型中提取敏感训练示例的可行性,其中条件扩散模型表现出更强的记忆和信息泄露倾向。在本工作中,我们通过CLIP反演的视角来研究CLIP中的数据记忆和提取风险,这是一个旨在从文本提示中重建训练图像的过程。为此,我们引入了\textbf{泄漏CLIP},一种新的攻击框架,旨在实现从CLIP嵌入中高质量、语义准确的图像重建。我们识别出CLIP反演中的三个关键挑战:1)非鲁棒特征,2)文本嵌入中的有限视觉语义,3)低重建保真度。为了解决这些挑战,LeakyCLIP采用1)对抗微调以增强优化平滑性,2)基于线性变换的嵌入对齐,以及3)基于稳定扩散的细化以提高保真度。实证结果表明LeakyCLIP的优势,在LAION-2B子集上,与基线方法相比,ViT-B-16的结构相似性指数测量(SSIM)提高了超过358%。此外,我们发现了一种普遍的泄露风险,表明甚至可以从低保真度重建的指标中成功推断出训练数据成员身份。我们的工作引入了一种实用的CLIP反演方法,同时提供了关于多模态模型中隐私风险的本质和范围的新见解。
Understanding the memorization and privacy leakage risks in Contrastive Language--Image Pretraining (CLIP) is critical for ensuring the security of multimodal models. Recent studies have demonstrated the feasibility of extracting sensitive training examples from diffusion models, with conditional diffusion models exhibiting a stronger tendency to memorize and leak information. In this work, we investigate data memorization and extraction risks in CLIP through the lens of CLIP inversion, a process that aims to reconstruct training images from text prompts. To this end, we introduce \textbf{LeakyCLIP}, a novel attack framework designed to achieve high-quality, semantically accurate image reconstruction from CLIP embeddings. We identify three key challenges in CLIP inversion: 1) non-robust features, 2) limited visual semantics in text embeddings, and 3) low reconstruction fidelity. To address these challenges, LeakyCLIP employs 1) adversarial fine-tuning to enhance optimization smoothness, 2) linear transformation-based embedding alignment, and 3) Stable Diffusion-based refinement to improve fidelity. Empirical results demonstrate the superiority of LeakyCLIP, achieving over 358% improvement in Structural Similarity Index Measure (SSIM) for ViT-B-16 compared to baseline methods on LAION-2B subset. Furthermore, we uncover a pervasive leakage risk, showing that training data membership can even be successfully inferred from the metrics of low-fidelity reconstructions. Our work introduces a practical method for CLIP inversion while offering novel insights into the nature and scope of privacy risks in multimodal models.
- [34] arXiv:2508.00840 (替换) [中文pdf, pdf, html, 其他]
-
标题: 基于自适应瑞尼熵优化的抗量子RSA模分解标题: Quantum-Resistant RSA Modulus Decomposition via Adaptive Rényi Entropy Optimization评论: 11页,2表主题: 密码学与安全 (cs.CR) ; 数论 (math.NT) ; 量子物理 (quant-ph)
本文探讨了一种理论方法,通过利用Rényi熵约束优化素数选择,以增强RSA对量子攻击的抵抗能力。 我们构建了一个框架,其中素数以受控的接近性($|p-q| < \gamma\sqrt{pq}$)生成,以最小化量子周期查找算子的碰撞熵 $\mathscr{H}_2$。 主要贡献包括:(1) 通过Maynard的素数间隔定理建立素数分布特性与量子攻击复杂度之间的联系,(2) 提供在熵约束下素数存在的构造性证明,以及(3) 在量子随机预言模型下将安全性归约到理想格问题。 理论分析表明,对于具有$\gamma < k^{-1/2+\epsilon}$的$k$位模数,Shor算法需要$\Omega(\gamma^{-1}k^{3/2})$个量子操作,同时保持与标准 RSA 相当的经典安全性。 关键改进:(1) 通过Maynard定理证明素数存在性(定理3.1),(2) 用于SVP归约的理想格嵌入(定理5.3),(3) 用于信息论分析的量子Fano界(定理6.3),(4) 多素数RSA扩展(第7.3节)。
This paper explores a theoretical approach to enhance RSA's resistance against quantum attacks by optimizing prime selection through R\'enyi entropy constraints. We develop a framework where primes are generated with controlled proximity ($|p-q| < \gamma\sqrt{pq}$) to minimize the collision entropy $\mathscr{H}_2$ of the quantum period-finding operator. The main contributions include: (1) establishing a connection between prime distribution properties and quantum attack complexity via Maynard's prime gap theorem, (2) providing a constructive proof for prime existence under entropy constraints, and (3) demonstrating security reduction to ideal lattice problems under the quantum random oracle model. Theoretical analysis suggests that for $k$-bit moduli with $\gamma < k^{-1/2+\epsilon}$, Shor's algorithm requires $\Omega(\gamma^{-1}k^{3/2})$ quantum operations while maintaining classical security equivalent to standard RSA. Key Enhancements: (1) Prime existence proof via Maynard's theorem (Theorem 3.1), (2) Ideal lattice embedding for SVP reduction (Theorem 5.3), (3) Quantum Fano bound for information-theoretic analysis (Theorem 6.3), (4) Multi-prime RSA extension (Section 7.3).
- [35] arXiv:2508.01332 (替换) [中文pdf, pdf, html, 其他]
-
标题: BlockA2A:面向安全和可验证的代理间互操作性标题: BlockA2A: Towards Secure and Verifiable Agent-to-Agent Interoperability评论: 43页主题: 密码学与安全 (cs.CR) ; 人工智能 (cs.AI)
代理人工智能的快速采用,由大型语言模型(LLMs)推动,正在通过自主代理执行复杂的工作流程来改变企业生态系统。 然而,我们观察到LLM驱动的多代理系统(MASes)中存在几个关键的安全漏洞:碎片化的身份框架、不安全的通信通道以及对拜占庭代理或对抗性提示的防御不足。 在本文中,我们提出了对这些新兴多代理风险的首次系统分析,并解释了为什么传统的安全策略无法有效应对这些风险。 随后,我们提出了BlockA2A,这是第一个统一的多代理信任框架,能够实现安全且可验证的代理间互操作性。 总体而言,BlockA2A采用去中心化标识符(DIDs)以实现细粒度的跨域代理认证,区块链锚定的账本以实现不可变的审计性,并使用智能合约动态实施上下文感知的访问控制策略。 BlockA2A消除了集中式信任瓶颈,确保消息的真实性和执行的完整性,并保证代理交互中的责任追究。 此外,我们提出了一种防御编排引擎(DOE),通过实时机制主动中和攻击,包括拜占庭代理标记、反应性执行暂停和即时权限撤销。 实证评估证明了BlockA2A在中和基于提示、基于通信、行为和系统性的MAS攻击方面的有效性。 我们将它整合到现有的MAS中进行了形式化,并展示了Google的A2A协议的实际实现。 实验确认BlockA2A和DOE的操作开销在秒以下,使它们能够在生产环境的LLM驱动的MAS中进行可扩展部署。
The rapid adoption of agentic AI, powered by large language models (LLMs), is transforming enterprise ecosystems with autonomous agents that execute complex workflows. Yet we observe several key security vulnerabilities in LLM-driven multi-agent systems (MASes): fragmented identity frameworks, insecure communication channels, and inadequate defenses against Byzantine agents or adversarial prompts. In this paper, we present the first systematic analysis of these emerging multi-agent risks and explain why the legacy security strategies cannot effectively address these risks. Afterwards, we propose BlockA2A, the first unified multi-agent trust framework that enables secure and verifiable and agent-to-agent interoperability. At a high level, BlockA2A adopts decentralized identifiers (DIDs) to enable fine-grained cross-domain agent authentication, blockchain-anchored ledgers to enable immutable auditability, and smart contracts to dynamically enforce context-aware access control policies. BlockA2A eliminates centralized trust bottlenecks, ensures message authenticity and execution integrity, and guarantees accountability across agent interactions. Furthermore, we propose a Defense Orchestration Engine (DOE) that actively neutralizes attacks through real-time mechanisms, including Byzantine agent flagging, reactive execution halting, and instant permission revocation. Empirical evaluations demonstrate BlockA2A's effectiveness in neutralizing prompt-based, communication-based, behavioral and systemic MAS attacks. We formalize its integration into existing MAS and showcase a practical implementation for Google's A2A protocol. Experiments confirm that BlockA2A and DOE operate with sub-second overhead, enabling scalable deployment in production LLM-based MAS environments.
- [36] arXiv:2508.01365 (替换) [中文pdf, pdf, html, 其他]
-
标题: ConfGuard:一种简单有效的大型语言模型后门检测方法标题: ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models评论: 正在审核中主题: 密码学与安全 (cs.CR) ; 计算与语言 (cs.CL)
后门攻击对大型语言模型(LLMs)构成重大威胁,攻击者可以嵌入隐藏触发器以操纵LLM的输出。 大多数现有的防御方法主要针对分类任务设计,在LLM的自回归特性和庞大的输出空间面前效果不佳,因此表现出性能差和延迟高的问题。 为解决这些限制,我们研究了良性LLM和后门LLM在输出空间中的行为差异。 我们发现了一个关键现象,称为序列锁定:与良性生成相比,后门模型生成目标序列时表现出异常高且一致的置信度。 基于这一见解,我们提出了ConfGuard,这是一种轻量且有效的检测方法,通过监控标记置信度的滑动窗口来识别序列锁定。 大量实验表明,ConfGuard在绝大多数情况下实现了接近100%的真正例率(TPR)和可忽略的假正例率(FPR)。 至关重要的是,ConfGuard几乎不增加额外延迟即可实现实时检测,使其成为实际部署中LLM后门防御的实用方法。
Backdoor attacks pose a significant threat to Large Language Models (LLMs), where adversaries can embed hidden triggers to manipulate LLM's outputs. Most existing defense methods, primarily designed for classification tasks, are ineffective against the autoregressive nature and vast output space of LLMs, thereby suffering from poor performance and high latency. To address these limitations, we investigate the behavioral discrepancies between benign and backdoored LLMs in output space. We identify a critical phenomenon which we term sequence lock: a backdoored model generates the target sequence with abnormally high and consistent confidence compared to benign generation. Building on this insight, we propose ConfGuard, a lightweight and effective detection method that monitors a sliding window of token confidences to identify sequence lock. Extensive experiments demonstrate ConfGuard achieves a near 100\% true positive rate (TPR) and a negligible false positive rate (FPR) in the vast majority of cases. Crucially, the ConfGuard enables real-time detection almost without additional latency, making it a practical backdoor defense for real-world LLM deployments.
- [37] arXiv:2401.14961 (替换) [中文pdf, pdf, 其他]
-
标题: 基于集合的神经网络验证训练标题: Set-Based Training for Neural Network Verification评论: 发表于《机器学习研究汇刊》(TMLR)主题: 机器学习 (cs.LG) ; 密码学与安全 (cs.CR) ; 计算机科学中的逻辑 (cs.LO)
神经网络容易受到对抗性攻击,即小的输入扰动可以显著影响神经网络的输出。 因此,为了确保在安全关键环境中的神经网络的安全性,必须对输入扰动进行形式化验证。 为了提高神经网络的鲁棒性并简化形式化验证,我们提出了一种新的基于集合的训练过程,在该过程中,我们计算给定可能输入集合的可能输出集合,并首次计算一个梯度集合,即每个可能的输出具有不同的梯度。 因此,我们可以通过选择指向其中心的梯度直接减小输出包围的大小。 较小的输出包围增加了神经网络的鲁棒性,同时简化了其形式化验证。 后者的优点是由于传播集合的大小越大,大多数验证方法的保守性越高。 我们的大量评估表明,基于集合的训练能够产生具有竞争力性能的鲁棒神经网络,由于输出集合的减少,可以使用快速(多项式时间)验证算法进行验证。
Neural networks are vulnerable to adversarial attacks, i.e., small input perturbations can significantly affect the outputs of a neural network. Therefore, to ensure safety of neural networks in safety-critical environments, the robustness of a neural network must be formally verified against input perturbations, e.g., from noisy sensors. To improve the robustness of neural networks and thus simplify the formal verification, we present a novel set-based training procedure in which we compute the set of possible outputs given the set of possible inputs and compute for the first time a gradient set, i.e., each possible output has a different gradient. Therefore, we can directly reduce the size of the output enclosure by choosing gradients toward its center. Small output enclosures increase the robustness of a neural network and, at the same time, simplify its formal verification. The latter benefit is due to the fact that a larger size of propagated sets increases the conservatism of most verification methods. Our extensive evaluation demonstrates that set-based training produces robust neural networks with competitive performance, which can be verified using fast (polynomial-time) verification algorithms due to the reduced output set.
- [38] arXiv:2409.00029 (替换) [中文pdf, pdf, html, 其他]
-
标题: 通过通用背景对抗攻击盲目的DNNs标题: Attack Anything: Blind DNNs via Universal Background Adversarial Attack主题: 计算机视觉与模式识别 (cs.CV) ; 密码学与安全 (cs.CR) ; 机器学习 (cs.LG)
已被广泛证实,深度神经网络(DNNs)容易受到对抗性扰动的影响。 现有研究主要集中在通过破坏目标物体(物理攻击)或图像(数字攻击)来进行攻击,从攻击效果的角度来看,这在直觉上是可接受和可以理解的。 相比之下,我们的研究重点是在数字和物理领域中进行背景对抗攻击,而不会对目标物体本身造成任何干扰。 具体而言,提出了一种有效的背景对抗攻击框架,以攻击任何事物,从而使攻击效果在不同物体、模型和任务之间具有良好的泛化能力。 技术上,我们将背景对抗攻击视为一个迭代优化问题,类似于DNN学习的过程。 此外,我们提供了一个理论证明,在一组温和但充分的条件下,其收敛性得到了保证。 为了增强攻击效果和迁移性,我们提出了一种针对对抗性扰动的新集成策略,并引入了改进的平滑约束,以实现集成扰动的无缝连接。 我们在数字和物理领域中的各种物体、模型和任务上进行了全面且严格的实验,证明了所提方法攻击任何事物的有效性。 本研究的发现证实了人类视觉和机器视觉在背景变化价值上的显著差异,背景变化所起的作用远比之前认识到的更为关键,这需要重新评估DNN的鲁棒性和可靠性。 代码将在 https://github.com/JiaweiLian/Attack_Anything 公开可用
It has been widely substantiated that deep neural networks (DNNs) are susceptible and vulnerable to adversarial perturbations. Existing studies mainly focus on performing attacks by corrupting targeted objects (physical attack) or images (digital attack), which is intuitively acceptable and understandable in terms of the attack's effectiveness. In contrast, our focus lies in conducting background adversarial attacks in both digital and physical domains, without causing any disruptions to the targeted objects themselves. Specifically, an effective background adversarial attack framework is proposed to attack anything, by which the attack efficacy generalizes well between diverse objects, models, and tasks. Technically, we approach the background adversarial attack as an iterative optimization problem, analogous to the process of DNN learning. Besides, we offer a theoretical demonstration of its convergence under a set of mild but sufficient conditions. To strengthen the attack efficacy and transferability, we propose a new ensemble strategy tailored for adversarial perturbations and introduce an improved smooth constraint for the seamless connection of integrated perturbations. We conduct comprehensive and rigorous experiments in both digital and physical domains across various objects, models, and tasks, demonstrating the effectiveness of attacking anything of the proposed method. The findings of this research substantiate the significant discrepancy between human and machine vision on the value of background variations, which play a far more critical role than previously recognized, necessitating a reevaluation of the robustness and reliability of DNNs. The code will be publicly available at https://github.com/JiaweiLian/Attack_Anything
- [39] arXiv:2411.14639 (替换) [中文pdf, pdf, html, 其他]
-
标题: 通过噪声聚合嵌入的扩散模型差分隐私适应标题: Differentially Private Adaptation of Diffusion Models via Noisy Aggregated Embeddings主题: 计算机视觉与模式识别 (cs.CV) ; 密码学与安全 (cs.CR) ; 机器学习 (cs.LG)
个性化大规模扩散模型会带来严重的隐私风险,尤其是在适应小规模敏感数据集时。 一种常见方法是使用差分隐私随机梯度下降(DP-SGD)对模型进行微调,但由于隐私所需的高噪声,这会导致效用严重下降,特别是在小数据情况下。 我们提出了一种替代方法,利用文本反转(TI),它为图像或一组图像学习一个嵌入向量,以在差分隐私(DP)约束下实现适应。 我们的方法,通过文本反转进行差分隐私聚合(DPAgg-TI),在每个图像嵌入的聚合中添加校准噪声,以确保正式的DP保证,同时保持高输出保真度。 我们证明,在相同的隐私预算下,DPAgg-TI在效用和鲁棒性方面都优于DP-SGD微调,在使用单个艺术家的私有艺术品和巴黎2024奥运会图标进行风格适应任务时,结果接近非私有基线。 相比之下,DP-SGD在此设置中无法生成有意义的输出。
Personalizing large-scale diffusion models poses serious privacy risks, especially when adapting to small, sensitive datasets. A common approach is to fine-tune the model using differentially private stochastic gradient descent (DP-SGD), but this suffers from severe utility degradation due to the high noise needed for privacy, particularly in the small data regime. We propose an alternative that leverages Textual Inversion (TI), which learns an embedding vector for an image or set of images, to enable adaptation under differential privacy (DP) constraints. Our approach, Differentially Private Aggregation via Textual Inversion (DPAgg-TI), adds calibrated noise to the aggregation of per-image embeddings to ensure formal DP guarantees while preserving high output fidelity. We show that DPAgg-TI outperforms DP-SGD finetuning in both utility and robustness under the same privacy budget, achieving results closely matching the non-private baseline on style adaptation tasks using private artwork from a single artist and Paris 2024 Olympic pictograms. In contrast, DP-SGD fails to generate meaningful outputs in this setting.
- [40] arXiv:2411.17471 (替换) [中文pdf, pdf, html, 其他]
-
标题: 学习新概念,记住旧概念:多模态概念瓶颈模型的持续学习标题: Learning New Concepts, Remembering the Old: Continual Learning for Multimodal Concept Bottleneck ModelsSongning Lai, Mingqian Liao, Zhangyi Hu, Jiayu Yang, Wenshuo Chen, Hongru Xiao, Jianheng Tang, Haicheng Liao, Yutao Yue期刊参考: ACM MM 2025主题: 机器学习 (cs.LG) ; 密码学与安全 (cs.CR) ; 计算机视觉与模式识别 (cs.CV)
概念瓶颈模型(CBMs)增强了人工智能系统的可解释性,特别是在将视觉输入与人类可理解的概念相连接方面,有效地充当了一种多模态可解释性模型。 然而,现有的CBMs通常假设数据集是静态的,这从根本上限制了它们对现实世界中持续演化的多模态数据流的适应能力。 为了解决这个问题,我们定义了一个新的持续学习任务用于CBMs:同时处理概念增量和类别增量学习。 该任务要求模型在稳健地保留之前所学知识的同时,持续获取新概念(通常代表跨模态属性)和新类别。 为了解决这个具有挑战性的问题,我们提出了 CONceptual Continual Incremental Learning(CONCIL),一种全新的框架,它从根本上重新设想了概念和决策层的更新作为线性回归问题。 这种重新表述消除了对基于梯度的优化的需要,从而有效防止了灾难性遗忘。 至关重要的是,CONCIL仅依赖于递归矩阵运算,使其计算效率极高,非常适合实时和大规模多模态数据应用。 实验结果有力地证明了CONCIL实现了“绝对知识记忆”,并在概念增量和类别增量设置中显著超越了传统CBM方法的性能,从而为CBMs中的持续学习建立了一个新范式,尤其对动态多模态理解具有重要价值。
Concept Bottleneck Models (CBMs) enhance the interpretability of AI systems, particularly by bridging visual input with human-understandable concepts, effectively acting as a form of multimodal interpretability model. However, existing CBMs typically assume static datasets, which fundamentally limits their adaptability to real-world, continuously evolving multimodal data streams. To address this, we define a novel continual learning task for CBMs: simultaneously handling concept-incremental and class-incremental learning. This task requires models to continuously acquire new concepts (often representing cross-modal attributes) and classes while robustly preserving previously learned knowledge. To tackle this challenging problem, we propose CONceptual Continual Incremental Learning (CONCIL), a novel framework that fundamentally re-imagines concept and decision layer updates as linear regression problems. This reformulation eliminates the need for gradient-based optimization, thereby effectively preventing catastrophic forgetting. Crucially, CONCIL relies solely on recursive matrix operations, rendering it highly computationally efficient and well-suited for real-time and large-scale multimodal data applications. Experimental results compellingly demonstrate that CONCIL achieves "absolute knowledge memory" and significantly surpasses the performance of traditional CBM methods in both concept- and class-incremental settings, thus establishing a new paradigm for continual learning in CBMs, particularly valuable for dynamic multimodal understanding.
- [41] arXiv:2506.02089 (替换) [中文pdf, pdf, html, 其他]
-
标题: SALAD:对LLM辅助硬件设计的机器遗忘系统评估标题: SALAD: Systematic Assessment of Machine Unlearning on LLM-Aided Hardware DesignZeng Wang, Minghao Shao, Rupesh Karn, Likhitha Mankali, Jitendra Bhandari, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel主题: 机器学习 (cs.LG) ; 人工智能 (cs.AI) ; 密码学与安全 (cs.CR)
大型语言模型(LLMs)为硬件设计自动化提供了变革性的能力,特别是在Verilog代码生成方面。 然而,它们也带来了重大的数据安全挑战,包括Verilog评估数据污染、知识产权(IP)设计泄露以及恶意Verilog生成的风险。 我们引入了SALAD,一种全面的评估方法,利用机器遗忘来缓解这些威胁。 我们的方法能够在不需要完全重新训练的情况下,从预训练的LLM中选择性地移除受污染的基准测试、敏感的IP和设计工件,或恶意代码模式。 通过详细的案例研究,我们展示了机器遗忘技术如何有效降低在LLM辅助的硬件设计中的数据安全风险。
Large Language Models (LLMs) offer transformative capabilities for hardware design automation, particularly in Verilog code generation. However, they also pose significant data security challenges, including Verilog evaluation data contamination, intellectual property (IP) design leakage, and the risk of malicious Verilog generation. We introduce SALAD, a comprehensive assessment that leverages machine unlearning to mitigate these threats. Our approach enables the selective removal of contaminated benchmarks, sensitive IP and design artifacts, or malicious code patterns from pre-trained LLMs, all without requiring full retraining. Through detailed case studies, we demonstrate how machine unlearning techniques effectively reduce data security risks in LLM-aided hardware design.
- [42] arXiv:2506.10960 (替换) [中文pdf, pdf, html, 其他]
-
标题: 中文有害内容检测基准测试集标题: ChineseHarm-Bench: A Chinese Harmful Content Detection BenchmarkKangwei Liu, Siyuan Cheng, Bozhong Tian, Xiaozhuan Liang, Yuyang Yin, Meng Han, Ningyu Zhang, Bryan Hooi, Xi Chen, Shumin Deng评论: 进行中主题: 计算与语言 (cs.CL) ; 人工智能 (cs.AI) ; 密码学与安全 (cs.CR) ; 信息检索 (cs.IR) ; 机器学习 (cs.LG)
大型语言模型(LLMs)已被越来越多地应用于自动化有害内容检测任务,帮助管理员识别政策违规内容,并提高内容审核的整体效率和准确性。 然而,现有的有害内容检测资源主要集中在英语上,中文数据集仍然稀缺且范围往往有限。 我们提出一个全面的、专业标注的中文内容危害检测基准,涵盖六个代表性类别,并完全从真实数据构建。 我们的标注过程进一步生成了一个知识规则库,提供明确的专家知识以帮助LLMs进行中文有害内容检测。 此外,我们提出了一种知识增强的基线模型,整合了人工标注的知识规则和大型语言模型的隐式知识,使小型模型能够达到与最先进LLMs相当的性能。 代码和数据可在 https://github.com/zjunlp/ChineseHarm-bench 获取。
Large language models (LLMs) have been increasingly applied to automated harmful content detection tasks, assisting moderators in identifying policy violations and improving the overall efficiency and accuracy of content review. However, existing resources for harmful content detection are predominantly focused on English, with Chinese datasets remaining scarce and often limited in scope. We present a comprehensive, professionally annotated benchmark for Chinese content harm detection, which covers six representative categories and is constructed entirely from real-world data. Our annotation process further yields a knowledge rule base that provides explicit expert knowledge to assist LLMs in Chinese harmful content detection. In addition, we propose a knowledge-augmented baseline that integrates both human-annotated knowledge rules and implicit knowledge from large language models, enabling smaller models to achieve performance comparable to state-of-the-art LLMs. Code and data are available at https://github.com/zjunlp/ChineseHarm-bench.
- [43] arXiv:2507.06402 (替换) [中文pdf, pdf, html, 其他]
-
标题: 基于混合机器学习的无线心电图信号智能篡改检测标题: Detection of Intelligent Tampering in Wireless Electrocardiogram Signals Using Hybrid Machine Learning主题: 机器学习 (cs.LG) ; 密码学与安全 (cs.CR) ; 信号处理 (eess.SP)
随着无线心电图(ECG)系统在健康监测和身份认证中的普及,保护信号完整性免受篡改变得越来越重要。 本文分析了CNN、ResNet以及混合Transformer-CNN模型在篡改检测中的性能。 还评估了基于ECG的身份验证的Siamese网络的性能。 模拟了六种篡改策略,包括结构化片段替换和随机插入,以模仿现实世界的攻击。 使用连续小波变换(CWT)将一维ECG信号转换为时频域的二维表示。 这些模型使用2019年至2025年在临床环境之外的四个会话中记录的54名受试者的ECG数据进行训练和评估,受试者执行了七种不同的日常活动。 实验结果表明,在高度碎片化的操作场景中,CNN、FeatCNN-TranCNN、FeatCNN-Tran和ResNet模型的准确率超过了99.5%。 同样,对于细微的篡改(例如,50%来自A和50%来自B的替换,以及75%来自A和25%来自B的替换),我们的FeatCNN-TranCNN模型表现出持续可靠的性能,平均准确率达到98%。 对于身份验证,纯Transformer-Siamese网络的平均准确率为98.30%。 相比之下,混合CNN-Transformer Siamese模型实现了完美的验证性能,准确率为100%。
With the proliferation of wireless electrocardiogram (ECG) systems for health monitoring and authentication, protecting signal integrity against tampering is becoming increasingly important. This paper analyzes the performance of CNN, ResNet, and hybrid Transformer-CNN models for tamper detection. It also evaluates the performance of a Siamese network for ECG based identity verification. Six tampering strategies, including structured segment substitutions and random insertions, are emulated to mimic real world attacks. The one-dimensional ECG signals are transformed into a two dimensional representation in the time frequency domain using the continuous wavelet transform (CWT). The models are trained and evaluated using ECG data from 54 subjects recorded in four sessions 2019 to 2025 outside of clinical settings while the subjects performed seven different daily activities. Experimental results show that in highly fragmented manipulation scenarios, CNN, FeatCNN-TranCNN, FeatCNN-Tran and ResNet models achieved an accuracy exceeding 99.5 percent . Similarly, for subtle manipulations (for example, 50 percent from A and 50 percent from B and, 75 percent from A and 25 percent from B substitutions) our FeatCNN-TranCNN model demonstrated consistently reliable performance, achieving an average accuracy of 98 percent . For identity verification, the pure Transformer-Siamese network achieved an average accuracy of 98.30 percent . In contrast, the hybrid CNN-Transformer Siamese model delivered perfect verification performance with 100 percent accuracy.