优化与控制
查看 最近的 文章
显示 2025年08月13日, 星期三 新的列表
- [1] arXiv:2508.08413 [中文pdf, pdf, html, 其他]
-
标题: 去中心化松弛平滑优化与梯度下降方法标题: Decentralized Relaxed Smooth Optimization with Gradient Descent Methods评论: 43页,31图主题: 优化与控制 (math.OC)
$L_0$-平滑性,这对于推动分布式优化理论至关重要,对于现代任务如深度学习来说通常相当严格。 最近出现的放松的$(L_0,L_1)$-平滑条件使得梯度方法的收敛速度得到改进。 尽管在集中式方面取得了进展,但其分布式扩展仍未被探索且具有挑战性。 在这项工作中,我们通过引入新的分析技术,提出了首个在$(L_0,L_1)$-平滑性下的分布式梯度下降(DGD)通用框架。 在确定性环境下,我们的方法结合自适应裁剪实现了最佳已知的收敛速度,适用于凸/非凸函数,无需事先了解$L_0$和$L_1$以及有界梯度假设。 在随机环境下,我们推导了复杂度界限,并确定了凸优化中改进复杂度界限的条件。 使用真实数据集的实证验证表明了梯度范数相关的平滑性,为$(L_0,L_1)$-分布式优化算法架起了理论与实践之间的桥梁。
$L_0$-smoothness, which has been pivotal to advancing decentralized optimization theory, is often fairly restrictive for modern tasks like deep learning. The recent advent of relaxed $(L_0,L_1)$-smoothness condition enables improved convergence rates for gradient methods. Despite centralized advances, its decentralized extension remains unexplored and challenging. In this work, we propose the first general framework for decentralized gradient descent (DGD) under $(L_0,L_1)$-smoothness by introducing novel analysis techniques. For deterministic settings, our method with adaptive clipping achieves the best-known convergence rates for convex/nonconvex functions without prior knowledge of $L_0$ and $L_1$ and bounded gradient assumption. In stochastic settings, we derive complexity bounds and identify conditions for improved complexity bound in convex optimization. The empirical validation with real datasets demonstrates gradient-norm-dependent smoothness, bridging theory and practice for $(L_0,L_1)$-decentralized optimization algorithms.
- [2] arXiv:2508.08485 [中文pdf, pdf, html, 其他]
-
标题: 基于梯度和牛顿的单位向量极值搜索控制标题: Gradient- and Newton-Based Unit Vector Extremum Seeking Control主题: 优化与控制 (math.OC) ; 系统与控制 (eess.SY)
本文提出了新颖的方法,通过滑模技术实现多变量极值搜索控制(ESC)的稳定和高效收敛。 受到经典滑模控制和有限时间和固定时间控制最新发展的启发,我们提出了一种新的框架,将这些概念整合到基于正弦扰动信号的梯度和牛顿法ESC方案中。 关键创新在于使用不连续的“继电器型”控制组件,以单位向量控制(UVC)估计未知二次非线性性能图的梯度,从而取代传统的比例反馈。 这是首次在经典极值搜索范式内,利用滑模技术进行实时、无模型优化的尝试。 在基于梯度的方法中,收敛速率受目标函数未知Hessian矩阵的影响。 相比之下,牛顿法通过采用Hessian逆矩阵的动态估计器来克服这一限制,该估计器通过Riccati方程滤波器实现。 我们通过利用基于Lyapunov的分析和针对具有不连续右端系统的平均理论,建立了两种方法的闭环平均系统在有限时间内收敛到极值点。 数值仿真验证了所提出的方法,表明其收敛速度显著加快且鲁棒性优于传统ESC策略,后者通常仅保证指数稳定性。 结果还表明,基于梯度的方法收敛较慢且瞬态较高,因为梯度轨迹沿着曲线和最速下降路径移动,而牛顿法则直接趋向极值,从而实现更快的收敛和更好的整体性能。
This paper presents novel methods for achieving stable and efficient convergence in multivariable extremum seeking control (ESC) using sliding mode techniques. Drawing inspiration from both classical sliding mode control and more recent developments in finite-time and fixed-time control, we propose a new framework that integrates these concepts into Gradient- and Newton-based ESC schemes based on sinusoidal perturbation signals. The key innovation lies in the use of discontinuous "relay-type" control components, replacing traditional proportional feedback to estimate the gradient of unknown quadratic nonlinear performance maps with Unit Vector Control (UVC). This represents the first attempt to address real-time, model-free optimization using sliding modes within the classical extremum seeking paradigm. In the Gradient-based approach, the convergence rate is influenced by the unknown Hessian of the objective function. In contrast, the Newton-based method overcomes this limitation by employing a dynamic estimator for the inverse of the Hessian, implemented via a Riccati equation filter. We establish finite-time convergence of the closed-loop average system to the extremum point for both methods by leveraging Lyapunov-based analysis and averaging theory tailored to systems with discontinuous right-hand sides. Numerical simulations validate the proposed method, illustrating significantly faster convergence and improved robustness compared to conventional ESC strategies, which typically guarantee only exponential stability. The results also demonstrate that the Gradient-based method exhibits slower convergence and higher transients since the gradient trajectory follows the curved and steepest-descent path, whereas the Newton-based method achieves faster convergence and improved overall performance going straightly to the extremum.
- [3] arXiv:2508.08511 [中文pdf, pdf, html, 其他]
-
标题: 控制仿射薛定谔桥和广义玻姆势标题: Control-affine Schrödinger Bridge and Generalized Bohm PotentialAlexis M.H. Teter, Abhishek Halder, Michael D. Schneider, Alexx S. Perloff, Jane Pratt, Conor M. Artman, Maria Demireva评论: 此工作是在美国能源部的授权下,由劳伦斯利弗莫尔国家实验室在合同DE-AC52-07NA27344下进行的。这项工作的部分资金由LLNL实验室定向研究与发展拨款GS 25-ERD-044提供。文件发布编号:LLNL-JRNL-2008865主题: 优化与控制 (math.OC) ; 系统与控制 (eess.SY) ; 数学物理 (math-ph) ; 概率 (math.PR)
控制仿射薛定谔桥涉及一个随机最优控制问题。 其解是满足控制仿射伊藤扩散的联合状态概率密度的受控演化,且给定截止时间连接给定的初始和终端密度。 在本工作中,我们将控制仿射薛定谔桥问题的最优性必要条件重新表述为带有复势的量子力学薛定谔偏微分方程的两点边值问题。 这个复值势是量子力学中实值玻姆势的推广。 我们推导出的势类似于核物理中的光学势,其中势的实部编码弹性散射(波函数的透射),虚部编码非弹性散射(波函数的吸收)。 关键结论是,驱动概率密度演化的过程噪声在波函数的演化中引入了吸收介质。 这些结果通过量子力学的视角,建立了控制理论与非平衡统计力学之间的新联系。
The control-affine Schr\"odinger bridge concerns with a stochastic optimal control problem. Its solution is a controlled evolution of joint state probability density subject to a control-affine It\^o diffusion with a given deadline connecting a given pair of initial and terminal densities. In this work, we recast the necessary conditions of optimality for the control-affine Schr\"odinger bridge problem as a two point boundary value problem for a quantum mechanical Schr\"odinger PDE with complex potential. This complex-valued potential is a generalization of the real-valued Bohm potential in quantum mechanics. Our derived potential is akin to the optical potential in nuclear physics where the real part of the potential encodes elastic scattering (transmission of wave function), and the imaginary part encodes inelastic scattering (absorption of wave function). The key takeaway is that the process noise that drives the evolution of probability densities induces an absorbing medium in the evolution of wave function. These results make new connections between control theory and non-equilibrium statistical mechanics through the lens of quantum mechanics.
- [4] arXiv:2508.08520 [中文pdf, pdf, html, 其他]
-
标题: 多时间尺度随机规划及其在电力系统中的应用标题: Multi-timescale Stochastic Programming with Applications in Power Systems主题: 优化与控制 (math.OC)
本文介绍了一种多时间尺度随机规划框架,旨在解决高可再生能源渗透率下的电力系统决策问题。 该框架通过聚合状态变量来建模不同时间尺度之间的相互作用,以协调决策。 除了通过多时段树建模的多时间尺度不确定性外,我们还引入了一种“同步状态近似”,定期对齐不同时间尺度的状态,以保持一致性和可处理性。 利用这种近似,我们提出了两种实例化方法:一种是基于场景的方法,另一种是专门为此设置设计的基于价值函数的方法。 我们的框架非常通用,涵盖了广泛的应用范围。
This paper introduces a multi-timescale stochastic programming framework designed to address decision-making challenges in power systems, particularly those with high renewable energy penetration. The framework models interactions across different timescales using aggregated state variables to coordinate decisions. In addition to Multi-timescale uncertainty modeled via multihorizon trees, we also introduce a "synchronized state approximation," which periodically aligns states across timescales to maintain consistency and tractability. Using this approximation, we propose two instantiation methods: a scenario-based approach and a value function-based approach specialized for this setup. Our framework is very generic, and covers a wide-spectrum of applications.
- [5] arXiv:2508.08617 [中文pdf, pdf, 其他]
-
标题: 一种多尺度边界控制和路径引导系统用于大规模道路网络标题: A Multi-scale Perimeter Control and Route Guidance System for Large-scale Road Networks主题: 优化与控制 (math.OC)
通过控制网络上的空间和时间交通分布,周界控制和路径引导是减少交通拥堵和提高交通效率的有效方法。 本文提出了一种多尺度联合周界控制和路径引导(MSJC)框架,用于控制大规模网络中的交通。 首先将网络划分为几个子网络(区域),每个区域的交通由其宏观基本图(MFD)进行管理,这形成了宏观尺度网络(上层)。 每个子网络由实际的道路链接和信号交叉口组成,形成了微观尺度网络(下层)。 在上层,联合周界控制和路径引导模型求解基于区域的流入率和超路径流量,以控制每个区域的累积量,从而最大化每个区域的吞吐量。 在下层,一种与反压策略集成的周界控制策略确定区域边界处交叉口的最佳信号相位。 同时,构建了一个车辆路径选择模型,以满足超路径流量并确保区域内交通密度的一致性。 案例研究结果表明,所提出的MSJC在调节区域累积方面优于其他基准,从而提高了吞吐量。
Perimeter control and route guidance are effective ways to reduce traffic congestion and improve traffic efficiency by controlling the spatial and temporal traffic distribution on the network. This paper presents a multi-scale joint perimeter control and route guidance (MSJC) framework for controlling traffic in large-scale networks. The network is first partitioned into several subnetworks (regions) with traffic in each region governed by its macroscopic fundamental diagram (MFD), which forms the macroscale network (upper level). Each subnetwork, comprised of actual road links and signalized intersections, forms the microscale network (lower level). At the upper level, a joint perimeter control and route guidance model solves the region-based inflow rate and hyper-path flows to control the accumulation of each region and thus maximize the throughput of each region. At the lower level, a perimeter control strategy integrated with a backpressure policy determines the optimal signal phases of the intersections at the regional boundary. At the same time, a route choice model for vehicles is constructed to meet hyper-path flows and ensure the intra-region homogeneity of traffic density. The case study results demonstrate that the proposed MSJC outperforms other benchmarks in regulating regional accumulation, thereby improving throughput.
- [6] arXiv:2508.08658 [中文pdf, pdf, html, 其他]
-
标题: 拜占庭容错的去中心化在线资源分配标题: Byzantine-Resilient Decentralized Online Resource Allocation主题: 优化与控制 (math.OC)
在本文中,我们研究了存在拜占庭攻击情况下的去中心化在线资源分配问题。 在这一问题设置中,由于外部操纵或内部故障,一些代理可能被破坏,导致它们表现出恶意行为,并通过向邻居发送错误信息来干扰资源分配过程。 鉴于资源分配问题的非共识性质,我们在原始-对偶优化框架下对其进行表述,其中对偶变量在代理之间进行聚合,从而能够引入稳健的聚合机制以缓解拜占庭攻击。 通过利用经典的拜占庭攻击模型,我们提出了一类具有拜占庭鲁棒性的去中心化在线资源分配算法,这些算法巧妙地将自适应稳健截断技术与现有的稳健聚合规则相结合,以过滤掉敌对信息。 我们建立了理论保证,表明所提出的算法实现了紧密的线性动态遗憾和累积约束违规界限,其中常数取决于稳健聚合规则的特性。 在去中心化在线经济调度上的数值实验验证了我们方法的有效性,并支持我们的理论结果。
In this paper, we investigate the problem of decentralized online resource allocation in the presence of Byzantine attacks. In this problem setting, some agents may be compromised due to external manipulations or internal failures, causing them to behave maliciously and disrupt the resource allocation process by sending incorrect messages to their neighbors. Given the non-consensual nature of the resource allocation problem, we formulate it under a primal-dual optimization framework, where the dual variables are aggregated among the agents, enabling the incorporation of robust aggregation mechanisms to mitigate Byzantine attacks. By leveraging the classical Byzantine attack model, we propose a class of Byzantine-resilient decentralized online resource allocation algorithms that judiciously integrate the adaptive robust clipping technique with the existing robust aggregation rules to filter out adversarial messages. We establish theoretical guarantees, showing that the proposed algorithms achieve tight linear dynamic regret and accumulative constraint violation bounds, where the constants depend on the properties of robust aggregation rules. Numerical experiments on decentralized online economic dispatch validate the effectiveness of our approach and support our theoretical results.
- [7] arXiv:2508.08702 [中文pdf, pdf, html, 其他]
-
标题: 用格枚举求解市场分割问题标题: Solving the Market Split Problem with Lattice Enumeration主题: 优化与控制 (math.OC)
市场分割问题由Cornuéjols和Dawande于1998年提出,作为求解具有二进制变量的线性系统算法的基准问题。最近(2025年)的量子优化基准库(QOBLIB)包含了一组可行的市场分割问题实例。市场分割问题似乎很难用整数线性规划软件的传统分支定界方法解决,据报道该方法可以处理QOBLIB中的实例最多到$m=7$。相反,一种新的Schroeppel-Shamir算法的GPU实现可以解决到$m=11$的实例。在这篇简短的注释中,我们报告了我们算法的实验,该算法将市场分割问题转化为格问题。作者最近的实现diophant应用于QOBLIB市场分割实例可以在标准计算机上解决到$m=14$的实例。
The market split problem was proposed by Cornu\'ejols and Dawande in 1998 as benchmark problem for algorithms solving linear systems with binary variables. The recent (2025) Quantum Optimization Benchmark Library (QOBLIB) contains a set of feasible instances of the market split problem. The market split problem seems to be difficult to solve with the conventional branch-and-bound approach of integer linear programming software which reportedly can handle QOBLIB instances up to $m=7$. In contrast, a new GPU implementation of the Schroeppel-Shamir algorithm solves instances up to $m=11$. In this short note we report about experiments with our algorithm that reduces the market split problem to a lattice problem. The author's most recent implementation solvediophant applied to the QOBLIB market split instances can solve instances up to $m=14$ on a standard computer.
- [8] arXiv:2508.08844 [中文pdf, pdf, html, 其他]
-
标题: 单调跟踪系统的根本限制标题: Fundamental limitations of monotonic tracking systems主题: 优化与控制 (math.OC) ; 系统与控制 (eess.SY)
我们在这篇论文中考虑使用输出反馈线性控制器的连续时间单输入单输出线性系统的单调跟踪控制问题。 我们提供了该问题可解的必要充分条件,并揭示了其基本限制:被控对象零点的精确可行位置、可能的最小控制器阶数以及闭环系统可实现的最大衰减率。 对于具有一对共轭复数零点的被控对象,这些界限之间的关系由一个简单的几何形状来解释。
We consider the monotonic tracking control problem for continuous-time single-input single-output linear systems using output-feedback linear controllers in this paper. We provide the necessary and sufficient conditions for this problem to be solvable and expose its fundamental limitations: the exact feasible locations of the plant zeros, the minimum controller order possible, and the maximum decay rate achievable for the closed-loop system. The relationship between these bounds is explained by a simple geometric shape for plants with a pair of complex-conjugate zeros.
- [9] arXiv:2508.08848 [中文pdf, pdf, html, 其他]
-
标题: 具有共享自动驾驶车辆的瓶颈模型:规模经济与价格管制标题: A bottleneck model with shared autonomous vehicles: Scale economies and price regulations评论: 48页,6图主题: 优化与控制 (math.OC)
本研究考察了共享自动驾驶车辆(SAVs)运营中的规模经济如何影响SAVs与普通车辆(NVs)共存的交通系统的效率。 我们开发了一个瓶颈模型,通勤者在SAVs和NVs之间选择出发时间和出行方式,并分析了三种SAV定价情景下的均衡:边际成本定价、平均成本定价和不受监管的垄断定价。 边际成本定价可以降低通勤成本,但会导致服务提供商出现财务赤字。 平均成本定价确保了财务可持续性,但由于存在多个均衡,其效果取决于实施时机:如果实施过早,会抑制SAV的采用并增加通勤成本;如果在SAV采用达到垄断均衡水平之后引入,则会促进高采用率并实现显著的成本降低而无赤字。 我们还表明,在平均成本定价下扩大道路容量可能会增加通勤成本,这展示了在具有SAVs的交通系统中出现的Downs--Thomson悖论。 接下来,我们研究了两种改善社会成本(包括运营商利润)的最优政策:第一最佳政策结合边际成本定价和拥堵收费,第二最佳政策仅依赖票价调控。 我们的分析表明,这些政策可以通过抑制SAV的过度使用来限制过度采用。 这表明,促进SAV的采用并不总能降低社会成本。
This study examines how scale economies in the operation of shared autonomous vehicles (SAVs) affect the efficiency of a transportation system where SAVs coexist with normal vehicles (NVs). We develop a bottleneck model where commuters choose their departure times and mode of travel between SAVs and NVs, and analyze equilibria under three SAV-fare scenarios: marginal-cost pricing, average-cost pricing, and unregulated monopoly pricing. Marginal-cost pricing reduces commuting costs but results in financial deficits for the service provider. Average-cost pricing ensures financial sustainability but has contrasting effects depending on the timing of implementation due to the existence of multiple equilibria: when implemented too early, it discourages adoption of SAVs and increases commuting costs; when introduced after SAV adoption reaches the monopoly equilibrium level, it promotes high adoption and achieves substantial cost reductions without a deficit. We also show that expanding road capacity may increase commuting costs under average-cost pricing, demonstrating the Downs--Thomson paradox in transportation systems with SAVs. We next examine two optimal policies that improve social cost, including the operator's profit: the first-best policy that combines marginal-cost pricing with congestion tolls, and the second-best policy that relies on fare regulation alone. Our analysis shows that these policies can limit excessive adoption by discouraging overuse of SAVs. This suggests that promoting SAV adoption does not always lower social cost.
- [10] arXiv:2508.08856 [中文pdf, pdf, html, 其他]
-
标题: 基于投影梯度下降的约束决策依赖优化标题: Projected Gradient Descent for Constrained Decision-Dependent Optimization主题: 优化与控制 (math.OC)
本文考虑了决策依赖的优化问题,其中数据分布会根据影响目标函数和线性约束的决策作出反应。 我们提出了一种称为重复投影梯度下降(RPGD)的新方法,该方法在优化过程中迭代地将点投影到不断变化的可行集上。 为了分析不同投影集的影响,我们展示了在具有显式给定Lipschitz常数的情况下,对不同集合进行投影的Lipschitz连续性性质。 利用这一性质,我们提供了RPGD收敛到约束平衡点的充分条件。 与现有的对偶上升方法相比,RPGD在整个优化过程中确保了连续可行性并减少了计算负担。 我们通过在市场问题和动态定价问题上的数值实验验证了我们的结果。
This paper considers the decision-dependent optimization problem, where the data distributions react in response to decisions affecting both the objective function and linear constraints. We propose a new method termed repeated projected gradient descent (RPGD), which iteratively projects points onto evolving feasible sets throughout the optimization process. To analyze the impact of varying projection sets, we show a Lipschitz continuity property of projections onto varying sets with an explicitly given Lipschitz constant. Leveraging this property, we provide sufficient conditions for the convergence of RPGD to the constrained equilibrium point. Compared to the existing dual ascent method, RPGD ensures continuous feasibility throughout the optimization process and reduces the computational burden. We validate our results through numerical experiments on a market problem and dynamic pricing problem.
- [11] arXiv:2508.09010 [中文pdf, pdf, html, 其他]
-
标题: 无优化快速最优控制:突变-滑行特性、单调性及在快速电池充电中的应用标题: Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging主题: 优化与控制 (math.OC) ; 系统与控制 (eess.SY)
单输入快速最优控制问题旨在尽可能快地实现最优目标,在各种现实应用中都会出现。 在快速电池充电的情况下,当使用详细的电池模型时,相关的最优控制问题变得计算上具有挑战性。 一种最近的无启发式优化算法可以显著降低计算成本,并提供一个近似解,与实际中的许多启发式输入轮廓一致。 这些启发式解有几个特殊性质:它们遵循一种“bang-ride”模式,始终激活一个约束并应用最大可行输入。 本研究探讨上述性质在最优输入中何时出现,以及最终启发式输入轮廓何时满足必要最优性条件。 通过利用庞特里亚金最大原理(PMP),我们证明在约束切换和系统局部可控性的正则条件下,最优控制是“bang-ride”形式。 此外,在系统、目标函数和约束的限制灵敏度单调的情况下,这种特殊的“bang-ride”行为,即应用最大可行输入,会出现。 这些结果为一类充电启发式方法和快速无优化算法提供了理论依据。
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
- [12] arXiv:2508.09029 [中文pdf, pdf, html, 其他]
-
标题: 随机时变网络上非光滑凸和凸凹问题的分布式优化标题: Stochastic Decentralized Optimization of Non-Smooth Convex and Convex-Concave Problems over Time-Varying Networks主题: 优化与控制 (math.OC)
我们研究随时间变化的网络上的非光滑随机分布式优化问题,其中目标函数分布在各个节点上,网络连接可能间歇性出现或中断。 具体而言,我们考虑两种情况:(i) 随机非光滑(强)凸优化,以及 (ii) 随机非光滑(强)凸-(强)凹鞍点优化。 此类凸问题常见于深度神经网络训练,而鞍点问题在生成对抗网络(GANs)的训练等机器学习任务中起着核心作用。 先前的工作主要集中在光滑情况下,或者时间不变的网络场景中。 我们将现有理论扩展到更一般的非光滑和随机设置,在随时间变化的网络和鞍点问题上进行分析。 我们的分析建立了随机预言机调用次数和通信轮数的上界,与凸优化和鞍点优化问题的下界相匹配。
We study non-smooth stochastic decentralized optimization problems over time-varying networks, where objective functions are distributed across nodes and network connections may intermittently appear or break. Specifically, we consider two settings: (i) stochastic non-smooth (strongly) convex optimization, and (ii) stochastic non-smooth (strongly) convex-(strongly) concave saddle point optimization. Convex problems of this type commonly arise in deep neural network training, while saddle point problems are central to machine learning tasks such as the training of generative adversarial networks (GANs). Prior works have primarily focused on the smooth setting, or time-invariant network scenarios. We extend the existing theory to the more general non-smooth and stochastic setting over time-varying networks and saddle point problems. Our analysis establishes upper bounds on both the number of stochastic oracle calls and communication rounds, matching lower bounds for both convex and saddle point optimization problems.
- [13] arXiv:2508.09111 [中文pdf, pdf, html, 其他]
-
标题: 一阶和零阶学习在异步博弈中的应用标题: First- and Zeroth-Order Learning in Asynchronous Games主题: 优化与控制 (math.OC)
本文研究了离散时间的异步博弈,其中非合作代理试图最小化其各自的成本函数。 在部分异步性的假设基础上,即每个代理在固定长度的时间间隔内至少更新一次,我们探讨了确保此类异步博弈收敛的条件。 分析从一个简单的二次博弈开始,我们通过线性控制理论的视角推导出紧致的收敛条件。 然后,我们提供了适用于一般凸博弈的准优势条件。 我们的结果表明,这个条件是严格的,因为当该条件不满足时,异步博弈可能无法收敛。 我们根据可用反馈的类型,提出了适用于异步博弈的一阶和零阶学习算法,并分析了它们的最终迭代收敛速率。 我们在经济市场问题上进行了数值实验以验证我们的结果。
This paper investigates the discrete-time asynchronous games in which noncooperative agents seek to minimize their individual cost functions. Building on the assumption of partial asynchronism, i.e., each agent updates at least once within a fixed-length time interval, we explore the conditions to ensure convergence of such asynchronous games. The analysis begins with a simple quadratic game from which we derive tight convergence conditions through the lens of linear control theory. Then, we provide a quasidominance condition for general convex games. Our results demonstrate that this condition is stringent since when this condition is not satisfied, the asynchronous games may fail to converge. We propose both first- and zeroth-order learning algorithms for asynchronous games, depending on the type of available feedback, and analyze their last-iterate convergence rates. Numerical experiments are presented on economic market problems to verify our results.
新提交 (展示 13 之 13 条目 )
- [14] arXiv:2508.08397 (交叉列表自 math.FA) [中文pdf, pdf, html, 其他]
-
标题: 带锚定蕴含和事件索引不动点的希尔伯特空间:唯一性和定量速率标题: Anchored Implication & Event-Indexed Fixed Points in Hilbert Spaces: Uniqueness and Quantitative Rates评论: 12页,2图,1表主题: 泛函分析 (math.FA) ; 逻辑 (math.LO) ; 优化与控制 (math.OC)
我们开发了正交模逻辑(投影作为命题)与希尔伯特空间中算子不动点理论的综合方法。 首先,我们引入了一个锚定的蕴含联结词$A \Rightarrow^{\mathrm{comm}}_{P} B$,从语义上定义,只有当$A$为假,或者$A$为真且$B$在由固定非零投影$P$指定的“对易”上下文中为真时,该联结词才为真。 这个联结词通过添加一个侧条件$[E_B,P]=0$($B$与锚点的对易性)来改进物质蕴含,并在布尔(对易)情况下退化为经典蕴含。 其次,我们研究了事件索引收缩下的不动点收敛性。 对于单个非扩张(不一定线性)映射$T$,我们证明事件索引条件等价于经典的断言,即某个幂$T^N$是严格压缩;因此在该设置中,“不规则事件”的表述并不增加一般性。 我们随后介绍真正更一般的变分算子(切换/随机)情况:如果演化的复合块具有有界的事件间间隙并且存在公共不动点,我们得到唯一性以及显式的包络速率。 最后,带有与$T$可交换的锚$P$,同样的推理确保在该子空间上事件索引压缩条件下在$PH$上的收敛性。 我们包含精确的作用域条件、示例和可视化解释。
We develop a synthesis of orthomodular logic (projections as propositions) with operator fixed-point theory in Hilbert spaces. First, we introduce an anchored implication connective $A \Rightarrow^{\mathrm{comm}}_{P} B$, defined semantically so that it is true only when either $A$ is false or else $A$ is true and $B$ is true in a ''commuting'' context specified by a fixed nonzero projection $P$. This connective refines material implication by adding a side condition $[E_B,P]=0$ (commutation of $B$ with the anchor) and reduces to classical implication in the Boolean (commuting) case. Second, we study fixed-point convergence under event-indexed contractions. For a single nonexpansive (not necessarily linear) map $T$, we prove that the event-indexed condition is equivalent to the classical assertion that some power $T^N$ is a strict contraction; thus the ''irregular events'' phrasing does not add generality in that setting. We then present the genuinely more general case of varying operators (switching/randomized): if blocks of the evolving composition are contractive with bounded inter-event gaps and a common fixed point exists, we obtain uniqueness and an explicit envelope rate. Finally, with an anchor $P$ that commutes with $T$, the same reasoning ensures convergence on $PH$ under event-indexed contraction on that subspace. We include precise scope conditions, examples, and visual explanations.
- [15] arXiv:2508.08412 (交叉列表自 stat.ME) [中文pdf, pdf, html, 其他]
-
标题: 未测量混杂因素的多重回归分析标题: Multiple Regression Analysis of Unmeasured Confounding评论: 19页,9图主题: 方法论 (stat.ME) ; 优化与控制 (math.OC)
尽管置信区间用于评估由于未测量个体带来的不确定性,混杂区间可用于评估由于未测量属性带来的不确定性。 之前,我们在一篇题为“未测量混杂的回归分析”的论文中介绍了一种在简单回归设置中计算混杂区间的的方法。 在此,我们扩展了该方法,以便在多重回归的背景下更广泛地应用。 我们的未测量混杂的多重回归分析利用关于决定系数的领域知识来限制遗漏变量偏差,同时考虑已测量的协变量数据。 我们的通用方法可用于部分识别因果效应。 该方法通过示例应用进行演示,以说明决定系数如何与随机性互补,并支持从观察数据中进行因果推断的敏感性分析。 当数据生成过程中存在并可识别的自然随机源时,该方法最为适用。 我们的主要贡献是一个支持我们方法的算法。 本文的目的是详细描述我们的算法。 在论文中,我们提供了一个链接到我们的GitHub页面,供希望访问和使用我们算法的读者使用。
Whereas confidence intervals are used to assess uncertainty due to unmeasured individuals, confounding intervals can be used to assess uncertainty due to unmeasured attributes. Previously, we have introduced a methodology for computing confounding intervals in a simple regression setting in a paper titled ``Regression Analysis of Unmeasured Confounding." Here we extend that methodology for more general application in the context of multiple regression. Our multiple regression analysis of unmeasured confounding utilizes subject matter knowledge about coefficients of determination to bound omitted variables bias, while taking into account measured covariate data. Our generalized methodology can be used to partially identify causal effects. The methodology is demonstrated with example applications, to show how coefficients of determination, being complementary to randomness, can support sensitivity analysis for causal inference from observational data. The methodology is best applied when natural sources of randomness are present and identifiable within the data generating process. Our main contribution is an algorithm that supports our methodology. The purpose of this article is to describe our algorithm in detail. In the paper we provide a link to our GitHub page for readers who would like to access and utilize our algorithm.
- [16] arXiv:2508.08586 (交叉列表自 math.PR) [中文pdf, pdf, html, 其他]
-
标题: 大规模偏差渐近性对于具有增长选择的超级市场模型标题: Large Deviation Asymptotics for the Supermarket Model with Growing Choices评论: 28页主题: 概率 (math.PR) ; 优化与控制 (math.OC)
我们考虑具有增长选择的马尔可夫超级市场模型,其中任务以速率$n\lambda_n$到达,每个$n$个并行服务器在其队列中以速率$1$处理任务。每个到达的任务加入随机选择的$d_n \in \{1,\dotsc,n\}$个队列中最短的那个。 在假设$d_n \to \infty$和$\lambda_n \to \lambda \in (0,\infty)$如$n\to \infty$所示的情况下,在合适的无限维路径空间中建立了占用过程的大偏差原理(LDP),并证明了速率函数相对于$d_n \to \infty$的方式是不变的。 LDP 提供了关于与系统相关的各种罕见事件概率衰减速率的信息。 我们通过建立系统中大量任务数的概率的显式指数衰减率来说明这一点。 作为推论,我们还表明某些罕见事件的概率确实取决于$d_n \to \infty$的速率。
We consider the Markovian supermarket model with growing choices, where jobs arrive at rate $n\lambda_n$ and each of $n$ parallel servers processes jobs in its queue at rate $1$. Each incoming job joins the shortest among $d_n \in \{1,\dotsc,n\}$ randomly selected queues. Under the assumption $d_n \to \infty$ and $\lambda_n \to \lambda \in (0,\infty)$ as $n\to \infty$, a large deviation principle (LDP) for the occupancy process is established in a suitable infinite-dimensional path space, and it is shown that the rate function is invariant with respect to the manner in which $d_n \to \infty$. The LDP gives information on the rate of decay of probabilities of various types of rare events associated with the system. We illustrate this by establishing explicit exponential decay rates for probabilities of large total number of jobs in the system. As a corollary, we also show that probabilities of certain rare events can indeed depend on the rate of $d_n \to \infty$.
- [17] arXiv:2508.08606 (交叉列表自 cs.LG) [中文pdf, pdf, html, 其他]
-
标题: 分布式优化:专为联邦学习设计标题: Distributed optimization: designed for federated learning评论: 16页,6图主题: 机器学习 (cs.LG) ; 优化与控制 (math.OC) ; 机器学习 (stat.ML)
联邦学习(FL)作为一种在隐私保护约束下的分布式协作机器学习(ML)框架,在跨组织数据协作场景中引起了越来越多的研究关注。 本文提出了一类基于增广拉格朗日技术的分布式优化算法,旨在适应集中式和去中心化FL设置中的多种通信拓扑结构。 此外,我们开发了多种终止准则和参数更新机制,以提高计算效率,并伴随着严格的收敛性理论保证。 通过将邻近松弛和二次逼近结合到增广拉格朗日松弛中,我们的框架系统地恢复了广泛的经典无约束优化方法,包括邻近算法、经典梯度下降和随机梯度下降等。 值得注意的是,这些方法的收敛性特性可以在所提出的理论框架内自然推导出来。 数值实验表明,所提出的算法在具有显著统计异质性的大规模设置中表现出强大的性能。
Federated Learning (FL), as a distributed collaborative Machine Learning (ML) framework under privacy-preserving constraints, has garnered increasing research attention in cross-organizational data collaboration scenarios. This paper proposes a class of distributed optimization algorithms based on the augmented Lagrangian technique, designed to accommodate diverse communication topologies in both centralized and decentralized FL settings. Furthermore, we develop multiple termination criteria and parameter update mechanisms to enhance computational efficiency, accompanied by rigorous theoretical guarantees of convergence. By generalizing the augmented Lagrangian relaxation through the incorporation of proximal relaxation and quadratic approximation, our framework systematically recovers a broad of classical unconstrained optimization methods, including proximal algorithm, classic gradient descent, and stochastic gradient descent, among others. Notably, the convergence properties of these methods can be naturally derived within the proposed theoretical framework. Numerical experiments demonstrate that the proposed algorithm exhibits strong performance in large-scale settings with significant statistical heterogeneity across clients.
- [18] arXiv:2508.08802 (交叉列表自 quant-ph) [中文pdf, pdf, 其他]
-
标题: 带有最小导数方差的参数化量子电路的扩展参数偏移规则标题: Extended Parameter Shift Rules with Minimal Derivative Variance for Parameterized Quantum Circuits评论: 22+9页,14图主题: 量子物理 (quant-ph) ; 优化与控制 (math.OC)
参数偏移规则(PSRs)是计算参数化量子电路中成本函数任意阶导数的有用方法。 PSRs的基本思想是在不同的参数偏移下评估成本函数,然后使用特定系数将其线性组合以获得精确的导数。 在本工作中,我们提出了一种扩展的参数偏移规则(EPSR),它推广了现有PSRs的广泛范围,并具有以下两个优势。 首先,EPSR提供了无限多个可能的参数偏移,允许选择最优的参数偏移以最小化最终导数方差,从而在有限的量子资源下获得更准确的导数估计。 其次,EPSR在意义上扩展了PSRs的适用范围,因为EPSR可以处理参数化量子电路中门$U(x) = \exp (iHx)$中的任意厄米特算子$H$,而现有的PSRs仅适用于简单的厄米特生成器$H$,例如简单的泡利词。 此外,我们表明广泛使用的“通用PSR”,由Wierichs等人(2022年)引入,是我们的EPSR的一个特例,并且我们证明在加权采样方案下,它能产生全局最优的偏移以最小化导数方差。 最后,通过数值模拟,我们展示了EPSR的有效性,并表明使用最优参数偏移确实会导致更准确的导数估计。
Parameter shift rules (PSRs) are useful methods for computing arbitrary-order derivatives of the cost function in parameterized quantum circuits. The basic idea of PSRs is to evaluate the cost function at different parameter shifts, then use specific coefficients to combine them linearly to obtain the exact derivatives. In this work, we propose an extended parameter shift rule (EPSR) which generalizes a broad range of existing PSRs and has the following two advantages. First, EPSR offers an infinite number of possible parameter shifts, allowing the selection of the optimal parameter shifts to minimize the final derivative variance and thereby obtaining the more accurate derivative estimates with limited quantum resources. Second, EPSR extends the scope of the PSRs in the sense that EPSR can handle arbitrary Hermitian operator $H$ in gate $U(x) = \exp (iHx)$ in the parameterized quantum circuits, while existing PSRs are valid only for simple Hermitian generators $H$ such as simple Pauli words. Additionally, we show that the widely used ``general PSR'', introduced by Wierichs et al. (2022), is a special case of our EPSR, and we prove that it yields globally optimal shifts for minimizing the derivative variance under the weighted-shot scheme. Finally, through numerical simulations, we demonstrate the effectiveness of EPSR and show that the usage of the optimal parameter shifts indeed leads to more accurate derivative estimates.
- [19] arXiv:2508.08874 (交叉列表自 math.AP) [中文pdf, pdf, html, 其他]
-
标题: 一种分数薄层的Bourgain-Brezis-Mironescu结果标题: A Bourgain-Brezis-Mironescu result for fractional thin films主题: 偏微分方程分析 (math.AP) ; 优化与控制 (math.OC)
We consider the limit of squared $H^s$-Gagliardo seminorms on thin domains of the form $\Omega_\varepsilon=\omega\times(0,\varepsilon)$ in $\mathbb R^d$. 当$\varepsilon$固定时,通过乘以$1-s$这样的半范数已被证明当$s\to 1^-$时收敛到一个与维度有关的常数$c_d$乘以$\Omega_\varepsilon$上的 Dirichlet 积分,这是由 Bourgain、Brezis 和 Mironescu 证明的。 在另一方面,这样的狄利克雷积分除以$\varepsilon$随着$\varepsilon\to 0$收敛到$\omega$上的一个维数降低的狄利克雷积分。 我们证明,如果我们同时让$\varepsilon\to 0$和$s\to 1$,则这些平方半范数在乘以$(1-s) \varepsilon^{2s-3}$时,仍然收敛到相同的维数约简极限,而与$s$和$\varepsilon$的相对收敛速度无关。 这个系数结合了几何缩放$\varepsilon^{-1}$和有关$H^s$-Gagliardo 无 norms 的相关相互作用是在尺度$\varepsilon$的事实。 我们还研究了通过乘以$(1-s)\varepsilon^{-1}$得到的常规膜尺度,这突出了{\em 临界标度}$1-s\sim|\log\varepsilon|^{-1}$,以及当$\varepsilon\to 0$固定时的极限$s$。
We consider the limit of squared $H^s$-Gagliardo seminorms on thin domains of the form $\Omega_\varepsilon=\omega\times(0,\varepsilon)$ in $\mathbb R^d$. When $\varepsilon$ is fixed, multiplying by $1-s$ such seminorms have been proved to converge as $s\to 1^-$ to a dimensional constant $c_d$ times the Dirichlet integral on $\Omega_\varepsilon$ by Bourgain, Brezis and Mironescu. In its turn such Dirichlet integrals divided by $\varepsilon$ converge as $\varepsilon\to 0$ to a dimensionally reduced Dirichlet integral on $\omega$. We prove that if we let simultaneously $\varepsilon\to 0$ and $s\to 1$ then these squared seminorms still converge to the same dimensionally reduced limit when multiplied by $(1-s) \varepsilon^{2s-3}$, independently of the relative converge speed of $s$ and $\varepsilon$. This coefficient combines the geometrical scaling $\varepsilon^{-1}$ and the fact that relevant interactions for the $H^s$-Gagliardo seminorms are those at scale $\varepsilon$. We also study the usual membrane scaling, obtained by multiplying by $(1-s)\varepsilon^{-1}$, which highlighs the {\em critical scaling} $1-s\sim|\log\varepsilon|^{-1}$, and the limit when $\varepsilon\to 0$ at fixed $s$.
- [20] arXiv:2508.09103 (交叉列表自 quant-ph) [中文pdf, pdf, html, 其他]
-
标题: 约束自由能最小化用于热态和稳定器热力学系统的设计标题: Constrained free energy minimization for the design of thermal states and stabilizer thermodynamic systemsMichele Minervini, Madison Chin, Jacob Kupperman, Nana Liu, Ivy Luo, Meghan Ly, Soorya Rethinasamy, Kathie Wang, Mark M. Wilde评论: 32页,8图主题: 量子物理 (quant-ph) ; 统计力学 (cond-mat.stat-mech) ; 机器学习 (cs.LG) ; 优化与控制 (math.OC)
一个量子热力学系统由一个哈密顿量和一组守恒的非对易电荷描述,一个基本目标是确定在电荷约束下的系统最小能量。最近,[Liu 等,arXiv:2505.04514] 提出了用于解决对偶化学势最大化问题的一阶和二阶经典和混合量子-经典算法,并且他们通过梯度上升方法证明了这些算法可以收敛到全局最优解。在本文中,我们在几个热力学感兴趣的问题上对这些算法进行了基准测试,包括具有最近邻和次近邻相互作用的一维和二维量子海森堡模型,其中电荷设置为总$x$,$y$和$z$磁化。我们还提供了这些算法作为设计可控哈密顿量的基态和热态的方法的另一种有说服力的解释,可能在分子和材料设计中有应用。此外,我们引入了稳定器热力学系统,即基于稳定器码的热力学系统,其哈密顿量由给定码的稳定器算子构造,电荷由码的逻辑算子构造。我们在几个稳定器热力学系统的例子上对上述算法进行了基准测试,包括由一到三个量子比特重复码、完美的一到五个量子比特码和两到四个量子比特的纠错码构造的系统。最后,我们观察到,当应用于稳定器热力学系统时,上述混合量子-经典算法可以作为在固定温度下将量子比特编码到稳定器码中的替代方法,并且我们提供了一种有效的方法,每当一个量子比特被编码到多个物理量子比特中时,可以为此类编码算法进行预热启动。
A quantum thermodynamic system is described by a Hamiltonian and a list of conserved, non-commuting charges, and a fundamental goal is to determine the minimum energy of the system subject to constraints on the charges. Recently, [Liu et al., arXiv:2505.04514] proposed first- and second-order classical and hybrid quantum-classical algorithms for solving a dual chemical potential maximization problem, and they proved that these algorithms converge to global optima by means of gradient-ascent approaches. In this paper, we benchmark these algorithms on several problems of interest in thermodynamics, including one- and two-dimensional quantum Heisenberg models with nearest and next-to-nearest neighbor interactions and with the charges set to the total $x$, $y$, and $z$ magnetizations. We also offer an alternative compelling interpretation of these algorithms as methods for designing ground and thermal states of controllable Hamiltonians, with potential applications in molecular and material design. Furthermore, we introduce stabilizer thermodynamic systems as thermodynamic systems based on stabilizer codes, with the Hamiltonian constructed from a given code's stabilizer operators and the charges constructed from the code's logical operators. We benchmark the aforementioned algorithms on several examples of stabilizer thermodynamic systems, including those constructed from the one-to-three-qubit repetition code, the perfect one-to-five-qubit code, and the two-to-four-qubit error-detecting code. Finally, we observe that the aforementioned hybrid quantum-classical algorithms, when applied to stabilizer thermodynamic systems, can serve as alternative methods for encoding qubits into stabilizer codes at a fixed temperature, and we provide an effective method for warm-starting these encoding algorithms whenever a single qubit is encoded into multiple physical qubits.
交叉提交 (展示 7 之 7 条目 )
- [21] arXiv:2207.10820 (替换) [中文pdf, pdf, html, 其他]
-
标题: 平均鲁棒优化标题: Mean Robust Optimization主题: 优化与控制 (math.OC)
鲁棒优化是在不确定性下进行决策的一种可行且表达能力强的技术,但当对不确定参数做出悲观假设时,可能导致过于保守的决策。 Wasserstein分布鲁棒优化可以通过数据驱动来减少保守性,但它通常会导致非常大的问题,求解时间昂贵。 我们引入了均值鲁棒优化,这是一个结合两者优点的通用框架,通过在计算努力和保守性之间提供权衡。 我们提出基于聚类数据而不是直接基于观测数据点构建的不确定性集,从而显著减少问题规模。 通过改变聚类数量,我们的方法在鲁棒优化和Wasserstein分布鲁棒优化之间建立了桥梁。 我们展示了有限样本性能保证,并明确控制任何聚类过程引入的潜在额外悲观性。 此外,我们证明了当不确定性在线性约束中出现时,聚类不会影响最优解的条件。 我们在多个数值示例中展示了我们方法的效率和性能保持,求解时间获得了多个数量级的加速,而对解的质量几乎没有影响。
Robust optimization is a tractable and expressive technique for decision-making under uncertainty, but it can lead to overly conservative decisions when pessimistic assumptions are made on the uncertain parameters. Wasserstein distributionally robust optimization can reduce conservatism by being data-driven, but it often leads to very large problems with prohibitive solution times. We introduce mean robust optimization, a general framework that combines the best of both worlds by providing a trade-off between computational effort and conservatism. We propose uncertainty sets constructed based on clustered data rather than on observed data points directly thereby significantly reducing problem size. By varying the number of clusters, our method bridges between robust and Wasserstein distributionally robust optimization. We show finite-sample performance guarantees and explicitly control the potential additional pessimism introduced by any clustering procedure. In addition, we prove conditions for which, when the uncertainty enters linearly in the constraints, clustering does not affect the optimal solution. We illustrate the efficiency and performance preservation of our method on several numerical examples, obtaining multiple orders of magnitude speedups in solution time with little-to-no effect on the solution quality.
- [22] arXiv:2305.15704 (替换) [中文pdf, pdf, html, 其他]
-
标题: 基于计算机辅助的加速复合优化方法设计:OptISTA标题: Computer-Assisted Design of Accelerated Composite Optimization Methods: OptISTA评论: 72页主题: 优化与控制 (math.OC)
加速的复合优化方法FISTA(Beck,Teboulle 2009)在常数因子上是次优的,我们提出了一种新的方法OptISTA,它在常数因子上优于FISTA两倍。 性能评估问题(PEP)最近被引入作为一种新的计算机辅助范式,用于设计最优的一阶方法。 在这项工作中,我们提出了一种双函数步长优化PEP方法,将固定步长一阶方法在复合优化中的优化问题表述为一个有限维非凸QCQP,可以通过空间分支定界算法实际求解,并用它来设计复合优化设置下的精确最优方法OptISTA。 然后,我们在大规模假设下建立了OptISTA的精确最优性,通过一个扩展了半插值零链构造(Drori,Taylor 2022)的下界构造,将其推广到复合优化的双函数设置中。 通过建立精确最优性,我们的工作结束了针对最差情况函数值次优性的性能度量,寻找最快的的一阶方法的研究,对于涉及光滑凸函数和闭合适当凸函数的近似、投影梯度和近似梯度设置。
The accelerated composite optimization method FISTA (Beck, Teboulle 2009) is suboptimal by a constant factor, and we present a new method OptISTA that improves FISTA by a constant factor of 2. The performance estimation problem (PEP) has recently been introduced as a new computer-assisted paradigm for designing optimal first-order methods. In this work, we present a double-function stepsize-optimization PEP methodology that poses the optimization over fixed-step first-order methods for composite optimization as a finite-dimensional nonconvex QCQP, which can be practically solved through spatial branch-and-bound algorithms, and use it to design the exact optimal method OptISTA for the composite optimization setup. We then establish the exact optimality of OptISTA under the large-scale assumption with a lower-bound construction that extends the semi-interpolated zero-chain construction (Drori, Taylor 2022) to the double-function setup of composite optimization. By establishing exact optimality, our work concludes the search for the fastest first-order methods, with respect to the performance measure of worst-case function value suboptimality, for the proximal, projected-gradient, and proximal-gradient setups involving a smooth convex function and a closed proper convex function.
- [23] arXiv:2311.13765 (替换) [中文pdf, pdf, html, 其他]
-
标题: 从部署中收集的数据中学习在线分配稀缺社会资源的最优和公平策略标题: Learning Optimal and Fair Policies for Online Allocation of Scarce Societal Resources from Data Collected in Deployment评论: 78页,17图,2表主题: 优化与控制 (math.OC) ; 机器学习 (cs.LG) ; 物理与社会 (physics.soc-ph)
我们研究了如何根据观察到的协变量,将不同类型的稀缺社会资源(例如永久性住房、移植用的捐献肾脏、呼吸机)分配给等待名单上的异质分配对象(例如无家可归的人、患有终末期肾病的个体、新冠患者)。 我们利用在部署中收集的行政数据,设计了一个在线策略,在长期内最大化预期结果的同时满足预算约束。 我们提出的策略为每个个体分配能使其估计的平均治疗结果与估计的资源对偶价格之间的差异最大化的资源,或者大致来说,使用该资源的机会成本。 然后,当资源到达时,按先到先得的方式进行分配。 我们证明,在温和的技术假设下,我们的数据驱动策略几乎必然渐近地达到最优样本外策略的预期结果。 我们将框架扩展以纳入各种公平性约束。 我们在基于无家可归者管理信息系统数据为洛杉矶无家可归者设计资源分配政策的问题上评估了我们方法的性能。 特别是,我们展示了使用我们的策略可以将无家可归的退出率提高5.16%,而无论是在分配还是结果方面在种族上公平的政策都以非常低的公平性成本实现。
We study the problem of allocating scarce societal resources of different types (e.g., permanent housing, deceased donor kidneys for transplantation, ventilators) to heterogeneous allocatees on a waitlist (e.g., people experiencing homelessness, individuals suffering from end-stage renal disease, Covid-19 patients) based on their observed covariates. We leverage administrative data collected in deployment to design an online policy that maximizes expected outcomes while satisfying budget constraints, in the long run. Our proposed policy waitlists each individual for the resource maximizing the difference between their estimated mean treatment outcome and the estimated resource dual-price or, roughly, the opportunity cost of using the resource. Resources are then allocated as they arrive, in a first-come first-serve fashion. We demonstrate that our data-driven policy almost surely asymptotically achieves the expected outcome of the optimal out-of-sample policy under mild technical assumptions. We extend our framework to incorporate various fairness constraints. We evaluate the performance of our approach on the problem of designing policies for allocating scarce housing resources to people experiencing homelessness in Los Angeles based on data from the homeless management information system. In particular, we show that using our policies improves rates of exit from homelessness by 5.16% and that policies that are fair in either allocation or outcomes by race come at a very low price of fairness.
- [24] arXiv:2404.00543 (替换) [中文pdf, pdf, html, 其他]
-
标题: 动态转移策略用于并行队列标题: Dynamic Transfer Policies for Parallel Queues评论: 57页,10图主题: 优化与控制 (math.OC) ; 系统与控制 (eess.SY)
我们考虑在离散时间点通过在并行队列之间转移顾客来平衡负载的问题。顾客在队列中等待会产生持有成本,而转移决策会产生固定(设置)成本和随转移数量和旅行距离增加的可变成本,并且根据转移方向有所不同。我们的工作主要受到医院需求激增期间为解决设施间患者转移引起的拥堵不平衡和护理获取不平等问题的动机。通过分析相关的流控制问题,我们表明在一般假设下,包括随时间变化的到达率和凸形持有成本,最优策略将状态空间划分为一个明确的$\textit{no-transfer region}$和其补集,这意味着只有当系统足够不平衡时,转移才是最优的。在没有固定转移成本的情况下,最优策略会将状态移动到无转移区域的边界;相反,在有固定成本的情况下,状态会被移动到其相对内部。利用我们的结构结果,我们提出了一种基于仿真的近似动态规划(ADP)算法,以找到随机系统的有效转移策略。我们在一项案例研究中评估了流策略和ADP策略的性能和鲁棒性,该案例研究使用了多伦多大都会区在新冠疫情期间的数据,结果表明在进行相对较少的转移情况下,在医院之间转移患者可能导致总成本最多减少27.7%。
We consider the problem of load balancing in parallel queues by transferring customers between them at discrete points in time. Holding costs accrue as customers wait in the queue, while transfer decisions incur both fixed (setup) costs and variable costs that increase with the number of transfers and travel distance, and vary by transfer direction. Our work is primarily motivated by inter-facility patient transfers to address imbalanced congestion and inequity in access to care during surges in hospital demand. Analyzing an associated fluid control problem, we show that under general assumptions, including time-varying arrivals and convex holding costs, the optimal policy partitions the state-space into a well-defined $\textit{no-transfer region}$ and its complement, implying that transferring is optimal if and only if the system is sufficiently imbalanced. In the absence of fixed transfer costs, an optimal policy moves the state to the no-transfer region's boundary; in contrast, with fixed costs, the state is moved to its relative interior. Leveraging our structural results, we propose a simulation-based approximate dynamic programming (ADP) algorithm to find effective transfer policies for the stochastic system. We investigate the performance and robustness of the fluid and ADP policies in a case study calibrated using data during the COVID-19 pandemic in the Greater Toronto Area, which demonstrates that transferring patients between hospitals could result in up to 27.7% reduction in total cost with relatively few transfers.
- [25] arXiv:2407.20128 (替换) [中文pdf, pdf, html, 其他]
-
标题: 零和多矩阵博弈中学习动态的有限样本保证标题: Finite-Sample Guarantees for Learning Dynamics in Zero-Sum Polymatrix Games评论: 44页;正在审稿中主题: 优化与控制 (math.OC) ; 计算机科学与博弈论 (cs.GT) ; 机器学习 (stat.ML)
我们研究在两种信息设置下零和多矩阵博弈中的最佳响应型学习动态。 这两种设置是根据玩家对游戏及其对手策略的了解程度来区分的。 第一种设置是完全信息情况,其中每个玩家知道自己的和对手的收益矩阵,并观察所有人的混合策略。 第二种设置是最小信息情况,其中玩家不观察对手的策略,也不了解任何收益矩阵(而是仅观察到其实际获得的收益)。 对于这种设置,也被称为博弈学习文献中的极端解耦情况,我们研究了一种两时间尺度的学习动态,该动态结合了策略估计的平滑最佳响应型更新与一种TD学习更新以估计局部收益函数。 对于这些动态,在没有额外探索的情况下,我们提供了收敛到$\epsilon$-纳什均衡的多项式时间有限样本保证。
We study best-response type learning dynamics for zero-sum polymatrix games under two information settings. The two settings are distinguished by the type of information that each player has about the game and their opponents' strategy. The first setting is the full information case, in which each player knows their own and their opponents' payoff matrices and observes everyone's mixed strategies. The second setting is the minimal information case, where players do not observe their opponents' strategies and are not aware of any payoff matrices (instead they only observe their realized payoffs). For this setting, also known as the radically uncoupled case in the learning in games literature, we study a two-timescale learning dynamics that combine smoothed best-response type updates for strategy estimates with a TD-learning update to estimate a local payoff function. For these dynamics, without additional exploration, we provide polynomial-time finite-sample guarantees for convergence to an $\epsilon$-Nash equilibrium.
- [26] arXiv:2408.13183 (替换) [中文pdf, pdf, html, 其他]
-
标题: 基于模拟的随机过程稳健置信带标题: Robust Confidence Bands for Stochastic Processes Using Simulation评论: 8页,3图主题: 优化与控制 (math.OC)
我们提出了一种稳健优化方法,用于使用有限数量的模拟样本路径构建随机过程的置信带。 我们的方法可用于量化随机过程实现中的不确定性,或通过检查实际系统的历史路径是否落在构建的置信带内来验证随机仿真模型。 与文献中现有的方法不同,我们的方法适用范围广泛,并直接处理约束内的优化偏差,生成具有准确覆盖概率的紧密置信带。 它易于处理,仅比最先进的基线方法稍复杂一些,并且易于使用,因为它采用了标准技术。 此外,在适当离散化时间后,我们的方法也适用于连续时间过程。 在第一个案例研究中,我们展示了与最先进的基线方法相比,我们的方法在样本路径数量少一个数量级的情况下实现了所需的覆盖概率。 在第二个案例研究中,我们说明了我们的方法如何用于验证随机仿真模型。
We propose a robust optimization approach for constructing confidence bands for stochastic processes using a finite number of simulated sample paths. Our approach can be used to quantify uncertainty in realizations of stochastic processes or validate stochastic simulation models by checking whether historical paths from the actual system fall within the constructed confidence band. Unlike existing approaches in the literature, our methodology is widely applicable and directly addresses optimization bias within the constraints, producing tight confidence bands with accurate coverage probabilities. It is tractable, being only slightly more complex than the state-of-the-art baseline approach, and easy to use, as it employs standard techniques. Additionally, our approach is also applicable to continuous-time processes after appropriately discretizing time. In our first case study, we show that our approach achieves the desired coverage probabilities with an order-of-magnitude fewer sample paths than the state-of-the-art baseline approach. In our second case study, we illustrate how our approach can be used to validate stochastic simulation models.
- [27] arXiv:2410.18246 (替换) [中文pdf, pdf, html, 其他]
-
标题: 资产网络的未知退化参数维护优化标题: Maintenance Optimization for Asset Networks with Unknown Degradation Parameters主题: 优化与控制 (math.OC)
我们考虑在退化参数异质且未知的环境下进行多资产维护优化这一关键实际挑战,这些参数必须从退化数据中推断出来。 为了解决这个问题,我们提出了适用于复杂资产网络的可扩展方法。 退化被建模为一个随机冲击过程,并通过贝叶斯框架将实时数据持续纳入对冲击率和冲击大小的估计中。 这构成了一个部分可观测的马尔可夫决策过程公式,我们从中分析推导出单调策略结构。 此外,我们提出了一种开环反馈方法,使在具有真实参数访问权限的仿真环境中通过深度强化学习(DRL)训练的策略,在部署时使用实时贝叶斯点估计仍能保持有效性。 作为补充,我们开发了一个贝叶斯马尔可夫决策过程(BMDP)框架,其中代理在部署期间保持并更新后验分布。 这种公式捕捉了参数不确定性的随时间演变,从而促进了可扩展的基于DRL的策略的训练,这些策略随着更多数据的可用而适应。 我们通过在合成资产网络和涉及介入式X射线系统灯丝的真实案例中的实验验证了我们的方法。 我们发现,所提出的DRL方法在各种场景中始终优于传统启发式方法。 针对BMDP训练的策略即使在先验必须从历史数据中估计的情况下也能表现良好,并且在具有高资产异质性的网络中仍有效。 了解真实的退化参数仅带来微小的成本收益,这突显了我们的方法在退化过程信息有限的情况下做出有效决策的能力。
We consider the key practical challenge of multi-asset maintenance optimization in settings where degradation parameters are heterogeneous and unknown, and must be inferred from degradation data. To address this, we propose scalable methods suitable for complex asset networks. Degradation is modeled as a stochastic shock process, and real-time data are continuously incorporated into estimation of shock rates and magnitudes via a Bayesian framework. This constitutes a partially observable Markov decision process formulation, from which we analytically derive monotonic policy structures. Moreover, we propose an open-loop feedback approach that enables policies trained via deep reinforcement learning (DRL) in a simulation environment with access to the true parameters to remain effective when deployed with real-time Bayesian point estimates instead. Complementing this, we develop a Bayesian Markov decision process (BMDP) framework wherein the agent maintains and updates posterior distributions during deployment. This formulation captures the evolution of parameter uncertainty over time, thereby facilitating the training of scalable DRL-based policies that adapt as additional data become available. We validate our approach through experiments on synthetic asset networks and a real-world case involving interventional X-ray system filaments. We find that the proposed DRL methods consistently outperform traditional heuristics across various scenarios. The policies trained for the BMDP perform well even when priors must be estimated from historical data, and remain effective in networks with high asset heterogeneity. Knowledge of true degradation parameters yields only marginal cost benefits, underscoring the ability of our approach to make effective decisions under limited information on degradation processes.
- [28] arXiv:2412.17384 (替换) [中文pdf, pdf, html, 其他]
-
标题: 二次障碍对多输入系统小时间局部能控性的阻碍标题: Quadratic obstructions to small-time local controllability for multi-input systemsThéo Gherdaoui (ENS Rennes, IRMAR)主题: 优化与控制 (math.OC)
我们提出了一种多输入控制仿射系统在$R^d$上的小时间局部能控性的必要条件。 该条件是基于在零点处计算向量场的李括号所得的$R^d$向量:它涉及它们的方向和幅度。 证明是对 Beauchard 和 Marbach 在单输入情况下引入的一般方法的适应。 它依赖于一种类似 Magnus 的表示公式:状态通过向量场在零点处的李括号的线性组合来近似,其系数是时间和控制的功能。 最后,小时间局部能控性的障碍来自于插值不等式。
We present a necessary condition for the small-time local controllability of multi-input control-affine systems on $R^d$ . This condition is formulated on the vectors of $R^d$ resulting from the evaluation at zero of the Lie brackets of the vector fields: it involves both their direction and their amplitude. The proof is an adaptation to the multi-input case of a general method introduced by Beauchard and Marbach in the single-input case. It relies on a Magnus-type representation formula: the state is approximated by a linear combination of the evaluation at zero of the Lie brackets of the vector fields, whose coefficients are functionals of the time and the controls. Finally, obstructions to small-time local controllability result from interpolation inequalities.
- [29] arXiv:2501.18454 (替换) [中文pdf, pdf, html, 其他]
-
标题: 高精度线性最小化不比投影慢标题: High-precision linear minimization is no slower than projection评论: 7页,1图主题: 优化与控制 (math.OC)
此注释证明,对于所有紧致凸集,可以通过一次投影评估和一个标量-向量乘法进行高精度线性最小化。 因此,如果 $\varepsilon$-近似线性最小化至少需要$L(\varepsilon)$个实向量算术操作,且投影需要$P$个操作,则$\mathcal{O}(P)\geq \mathcal{O}(L(\varepsilon))$是有保证的。 该概念通过示例、显式误差界以及多面体集的精确线性最小化结果进行了阐述。
This note demonstrates that, for all compact convex sets, high-precision linear minimization can be performed via a single evaluation of the projection and a scalar-vector multiplication. In consequence, if $\varepsilon$-approximate linear minimization takes at least $L(\varepsilon)$ real vector-arithmetic operations and projection requires $P$ operations, then $\mathcal{O}(P)\geq \mathcal{O}(L(\varepsilon))$ is guaranteed. This concept is expounded with examples, an explicit error bound, and an exact linear minimization result for polyhedral sets.
- [30] arXiv:2503.01559 (替换) [中文pdf, pdf, 其他]
-
标题: 求解决策依赖鲁棒问题作为双层优化问题标题: Solving Decision-Dependent Robust Problems as Bilevel Optimization Problems主题: 优化与控制 (math.OC)
双层优化和鲁棒优化都是数学优化和运筹学中的成熟领域。 然而,直到最近,它们在数学结构上的相似性既没有被理论研究,也没有被计算利用。 基于\textcite{goerigk2025}的最新结果,本文是第一篇将给定的严格鲁棒优化问题与决策相关的不确定性集重新表述为等价的双层优化问题,并然后使用该领域的求解技术来解决手头的鲁棒问题。 如果不确定性集可以对偶化,那么获得单层重述的相应双层技术与鲁棒优化中使用的经典对偶化技术非常相似,但会导致需要求解更大的单层问题。 我们的数值研究显示,这会导致更大的计算时间,但也可能略微改善对偶界限。 对于由混合整数线性模型表示的更具有挑战性的决策相关不确定性集的情况,我们不能应用标准的对偶化技术。 因此,我们将提出的双层方法与文献中唯一可用的方法进行比较,该方法基于量化混合整数线性规划。 我们的数值结果表明,对于决策相关鲁棒优化问题这一类问题,双层方法在计算时间方面表现更好。
Both bilevel and robust optimization are established fields of mathematical optimization and operations research. However, only until recently, the similarities in their mathematical structure has neither been studied theoretically nor exploited computationally. Based on the recent results by \textcite{goerigk2025}, this paper is the first one that reformulates a given strictly robust optimization problem with a decision-dependent uncertainty set as an equivalent bilevel optimization problem and then uses solution techniques from the latter field to solve the robust problem at hand. If the uncertainty set can be dualized, the respective bilevel techniques to obtain a single-level reformulation are very similar compared with the classic dualization techniques used in robust optimization but lead to larger single-level problems to be solved. Our numerical study shows that this leads to larger computation times but may also slightly improve the dual bound. For the more challenging case of decision-dependent uncertainty sets represented by mixed-integer linear models we cannot apply standard dualization techniques. Thus, we compare the presented bilevel approach with the only available method from the literature, which is based on quantified mixed-integer linear programs. Our numerical results indicate that, for the problem class of decision-dependent robust optimization problems, the bilevel approach performs better in terms of computation times.
- [31] arXiv:2504.02339 (替换) [中文pdf, pdf, html, 其他]
-
标题: 基于流形优化的稀疏张量CCA用于多视图学习标题: Sparse Tensor CCA via Manifold Optimization for Multi-View Learning主题: 优化与控制 (math.OC)
张量典型相关分析(TCCA)由于其在多视图学习中捕捉高阶相关性的有效性而受到广泛关注。 然而,现有的TCCA方法往往无法表征单个视图的结构,并且缺乏算法收敛性保证。 为了应对这些挑战,我们提出了一种称为STCCA-L的新稀疏TCCA模型,该模型将典型矩阵的稀疏正则化和多阶图的拉普拉斯正则化整合到TCCA框架中,从而有效利用单个视图的几何结构。 为了解决这个非凸模型,我们通过流形优化开发了一种高效的交替流形邻近梯度算法,该算法避免了计算昂贵的完整张量分解,并利用半光滑牛顿法解决子问题。 此外,我们严格证明了算法的收敛性并分析了其复杂度。 在八个基准数据集上的实验结果表明了所提方法优越的分类性能。 值得注意的是,在3sources数据集上,其准确率和F1分数分别比竞争对手至少提高了4.50%和6.77%。 我们的代码可在https://github.com/zhudafa/STCCA-L获取。
Tensor canonical correlation analysis (TCCA) has garnered significant attention due to its effectiveness in capturing high-order correlations in multi-view learning. However, existing TCCA methods often fail to characterize the structure of individual views and lack algorithmic convergence guarantees. In order to deal with these challenges, we propose a novel sparse TCCA model called STCCA-L, which integrates sparse regularization of canonical matrices and Laplacian regularization of multi-order graphs into the TCCA framework, thereby effectively exploiting the geometric structure of individual views. To solve this non-convex model, we develop an efficient alternating manifold proximal gradient algorithm via manifold optimization, which avoids computationally expensive full tensor decomposition and leverages a semi-smooth Newton method for subproblem resolution. Furthermore, we rigorously prove the convergence of the algorithm and analyze its complexity. Experimental results on eight benchmark datasets demonstrate the superior classification performance of the proposed method. Notably, on the 3sources dataset, it achieves improvements of at least 4.50\% in accuracy and 6.77\% in F1-score over competitors. Our code is available at https://github.com/zhudafa/STCCA-L.
- [32] arXiv:2504.19022 (替换) [中文pdf, pdf, 其他]
-
标题: 基于数据驱动的拦截与不对称成本不确定性:一种分布鲁棒优化方法标题: Data-driven interdiction with asymmetric cost uncertainty: a distributionally robust optimization approach评论: 补充材料附在主文档的末尾主题: 优化与控制 (math.OC)
我们考虑一个由上层决策者(称为领导者)和下层决策者(称为跟随者)之间的随机干扰博弈组成的类别,其中不确定性存在于跟随者的目标函数系数中。 具体而言,我们模型中跟随者的利润(或成本)包括一个随机向量,其概率分布由领导者和跟随者根据各自的数据独立估计。 为了解决分布不确定性,我们制定了一个分布鲁棒的干扰(DRI)模型,其中两个决策者都基于Wasserstein度量解决传统的分布鲁棒优化问题。 对于该模型,我们证明了渐近一致性,并推导出一个多项式规模的混合整数线性规划(MILP)重写形式。 此外,在我们的双层优化背景下,由于领导者对跟随者数据的不完全了解,可能会面临不确定性。 在这方面,我们提出了两个不同的真实DRI模型的近似方法,其中领导者对跟随者数据的信息不完整或没有信息。 第一种方法采用了一种悲观近似,这被证明在计算上具有挑战性,并需要设计一种专门的Benders分解算法。 第二种方法从领导者的角度利用了一种鲁棒优化方法。 为了解决由此产生的问题,我们提出了一种基于情景的外逼近方法,该方法可以接受潜在的大规模单层MILP重写形式,并满足渐近鲁棒性保证。 最后,对于一类随机生成的装箱干扰问题实例,我们数值评估了信息不对称和决策者风险偏好如何影响模型的样本外性能。
We consider a class of stochastic interdiction games between an upper-level decision-maker (referred to as a leader) and a lower-level decision-maker (referred to as a follower), where uncertainty lies in the follower's objective function coefficients. Specifically, the follower's profits (or costs) in our model comprise a random vector, whose probability distribution is estimated independently by the leader and the follower, based on their own data. To address the distributional uncertainty, we formulate a distributionally robust interdiction (DRI) model, where both decision-makers solve conventional distributionally robust optimization problems based on the Wasserstein metric. For this model, we prove asymptotic consistency and derive a polynomial-size mixed-integer linear programming (MILP) reformulation. Furthermore, in our bilevel optimization context, the leader may face uncertainty due to its incomplete knowledge of the follower's data. In this regard, we propose two distinct approximations of the true DRI model, where the leader has incomplete or no information about the follower's data. The first approach employs a pessimistic approximation, which turns out to be computationally challenging and requires the design of a specialized Benders decomposition algorithm. The second approach leverages a robust optimization approach from the leader's perspective. To address the resulting problem, we propose a scenario-based outer approximation that admits a potentially large single-level MILP reformulation and satisfies asymptotic robustness guarantees. Finally, for a class of randomly generated instances of the packing interdiction problem, we evaluate numerically how the information asymmetry and the decision-makers' risk preferences affect the models' out-of-sample performance.
- [33] arXiv:2504.20017 (替换) [中文pdf, pdf, html, 其他]
-
标题: 构造幻方:一个整数约束满足问题和一种快速启发式方法标题: Constructing Magic Squares: an integer constraint satisfaction problem and a fast heuristic主题: 优化与控制 (math.OC)
幻方是一个引人入胜的数学挑战,几个世纪以来一直吸引着数学家们。 给定一个正整数(可能是很大的)$n$,仍然存在的主要挑战之一是,在可靠计算时间内,找到一个阶数为$n$的幻方,即一个阶数为$n$的矩阵,其中包含从$a_{\min}$到$a_{\max}$的唯一整数,使得每行、每列和对角线的和等于一个常数$\mathcal{C}(A)$。 在本工作中,我们首先提出一个整数约束满足问题,用于构造一个阶数为$n$的幻方。 然而,该问题的求解时间随着阶数的增加呈指数增长。 为克服这一限制,我们还提出了一种启发式方法,该方法根据$n$是奇数、单偶数还是双偶数来构造幻方。 我们的数值结果表明,所提出的启发式方法可以在小于$140$秒的时间内构造出阶数高达$70000$的幻方,证明了其效率和可扩展性。
Magic squares are a fascinating mathematical challenge that has intrigued mathematicians for centuries. Given a positive (and possibly large) integer $n$, one of the main challenges that still remains is to find, within a reliable computational time, a magic square of order $n$, that is, a square matrix of order $n$ with unique integers from $a_{\min}$ to $a_{\max}$, such that the sum of each row, column, and diagonal equals a constant $\mathcal{C}(A)$. In this work, we first present an integer constraint satisfaction problem for constructing a magic square of order $n$. Nonetheless, the solution time of this problem grows exponentially as the order increases. To overcome this limitation, we also propose a heuristic that constructs magic squares depending on whether $n$ is odd, singly even, or double even. Our numerical results show that the proposed heuristic can construct magic squares of order up to $70000$ in less than $140$ seconds, demonstrating its efficiency and scalability.
- [34] arXiv:2506.07952 (替换) [中文pdf, pdf, html, 其他]
-
标题: 离散与连续的子模最小化标题: Discrete and Continuous Difference of Submodular Minimization期刊参考: 第42届国际机器学习大会论文集,加拿大温哥华。PMLR 267,2025主题: 优化与控制 (math.OC) ; 数据结构与算法 (cs.DS) ; 机器学习 (cs.LG)
子模函数在连续或离散域上定义,在许多应用中出现。我们研究在两个域上两个子模(DS)函数的差值的最小化,扩展了之前仅限于集合函数的工作。我们证明所有离散域上的函数和所有连续域上的光滑函数都是DS。对于离散域,我们观察到DS最小化等价于最小化两个凸(DC)函数的差值,如集合函数情况一样。我们提出了一种DC算法(DCA)的新变体,并将其应用于得到的DC程序,获得了与集合函数情况相当的理论保证。该算法可以通过离散化应用于连续域。实验表明,我们的方法在整数压缩感知和整数最小二乘方面优于基线方法。
Submodular functions, defined on continuous or discrete domains, arise in numerous applications. We study the minimization of the difference of two submodular (DS) functions, over both domains, extending prior work restricted to set functions. We show that all functions on discrete domains and all smooth functions on continuous domains are DS. For discrete domains, we observe that DS minimization is equivalent to minimizing the difference of two convex (DC) functions, as in the set function case. We propose a novel variant of the DC Algorithm (DCA) and apply it to the resulting DC Program, obtaining comparable theoretical guarantees as in the set function case. The algorithm can be applied to continuous domains via discretization. Experiments demonstrate that our method outperforms baselines in integer compressive sensing and integer least squares.
- [35] arXiv:2507.20611 (替换) [中文pdf, pdf, html, 其他]
-
标题: 使用最短路径计算计算具有序列依赖设置时间的最优单机调度标题: Computing an optimal single machine schedule with sequence dependent setup times using shortest path computations评论: 18页,11图主题: 优化与控制 (math.OC)
我们研究一个具有序列相关设置时间的单机调度问题,该问题源于制造业和服务行业中的应用——特别是橡胶地板生产中的压延阶段。 在此阶段,设置时间主要由连续作业之间的温度和颜色转换驱动,对吞吐量和能源效率有显著影响。 我们提出了一种新的解决方案框架,将调度问题转化为在特殊构造的分层图上的路径查找问题。 通过将序列相关的影响直接编码到图的结构中,我们能够使用经典的最短路径算法来计算最优作业序列。 该方法对于两色情况是多项式时间可解的,并揭示了最优调度的关键结构特性。 因此,我们的方法提供了一种理论上有依据且实际可应用的优化技术。
We study a single-machine scheduling problem with sequence dependent setup times, motivated by applications in manufacturing and service industries - in particular, the calendering stage in rubber flooring production. In this phase, setup times are primarily driven by temperature and color transitions between consecutive jobs, with significant impact on throughput and energy efficiency. We present a novel solution framework that transforms the scheduling problem into a path-finding problem on a specially constructed layered graph. By encoding sequence-dependent effects directly into the graph's structure, we enable the use of classical shortest-path algorithms to compute optimal job sequences. The resulting method is polynomial-time solvable for the two-color case and reveals key structural properties of optimal schedules. Our approach thus provides both a theoretically grounded and practically applicable optimization technique.
- [36] arXiv:2508.05378 (替换) [中文pdf, pdf, html, 其他]
-
标题: 输电网络中的电压支持采购:通过在线双层博弈进行激励设计标题: Voltage Support Procurement in Transmission Grids: Incentive Design via Online Bilevel Games主题: 优化与控制 (math.OC)
将分布式能源资源整合到输电网运行中带来了复杂的挑战,特别是在无功功率采购以支持电压方面。 本文通过将电压调节问题表述为一个斯塔克尔伯格博弈来解决这一挑战,其中输电系统运营商(TSO)设计激励措施以引导配电系统运营商(DSOs)的无功功率响应。 我们采用一种基于梯度的迭代算法,更新激励措施以确保DSO调整其无功功率注入以保持电压稳定。 我们结合在线反馈优化的原则,以实现实时实施,利用TSO和DSOs策略中的电压测量值。 这种方法不仅增强了对模型不确定性和变化运行条件的鲁棒性,还促进了激励措施和自动化的协同设计。 在5节点输电网上的数值实验证明了我们方法在实现电压调节的同时,能够适应自利DSO的战略互动的有效性。
The integration of distributed energy resources into transmission grid operations presents a complex challenge, particularly in the context of reactive power procurement for voltage support. This paper addresses this challenge by formulating the voltage regulation problem as a Stackelberg game, where the Transmission System Operator (TSO) designs incentives to guide the reactive power responses of Distribution System Operators (DSOs). We utilize a gradient-based iterative algorithm that updates the incentives to ensure that DSOs adjust their reactive power injections to maintain voltage stability. We incorporate principles from online feedback optimization to enable real-time implementation, utilizing voltage measurements in both TSO's and DSOs' policies. This approach not only enhances the robustness against model uncertainties and changing operating conditions but also facilitates the co-design of incentives and automation. Numerical experiments on a 5-bus transmission grid demonstrate the effectiveness of our approach in achieving voltage regulation while accommodating the strategic interactions of self-interested DSOs.
- [37] arXiv:2508.08135 (替换) [中文pdf, pdf, html, 其他]
-
标题: 一种针对部分二进制规则下顺序竞争设施定位问题的高效分支定界方法标题: An efficient branch-and-cut approach for the sequential competitive facility location problem under partially binary rule评论: 36页,5张图,已提交以供可能发表主题: 优化与控制 (math.OC)
我们研究在部分二进制规则下的顺序竞争设施定位问题(SCFLP),其中两家公司依次开设有限数量的设施以最大化其市场份额,要求客户为每家公司光顾具有最高效用的设施。 SCFLP是一个双层混合整数非线性规划(MINLP)问题,可以重写为单层MINLP问题,其中每个非线性约束对应于一个描述领导者市场份额的多重比函数的次图。 通过建立多重比函数的次模性,我们使用次模不等式表征由每个次图引起的混合0-1集合,并将最先进的分支切割(B&C)算法扩展到所考虑的SCFLP。 为了解决底层公式的线性规划(LP)松弛效果不佳的问题,我们为SCFLP开发了两种新的混合整数线性规划(MILP)公式以及基于它们的高效B&C算法。 第一个MILP公式基于一类改进的次模不等式,这些不等式包括经典的次模不等式作为特殊情况,并与平凡不等式一起表征混合0-1集合的凸包。 第二个是第一个的扩展公式,提供了相同的LP松弛界。 我们还开发了用于MILP公式中指数不等式族分离的高效算法。 大量的计算实验表明,所提出的B&C算法在适应的最先进的B&C算法和文献中的复杂启发式算法中显著优于它们。 此外,所提出的B&C算法可以在两小时的时间限制内找到最多1000名客户和设施的SCFLP实例的最优解。
We investigate the sequential competitive facility location problem (SCFLP) under partially binary rule where two companies sequentially open a limited number of facilities to maximize their market shares, requiring customers to patronize, for each company, the facility with the highest utility. The SCFLP is a bilevel mixed integer nonlinear programming (MINLP) problem and can be rewritten as a single-level MINLP problem, where each nonlinear constraint corresponds to a hypograph of a multiple ratio function characterizing the leader's market share for a fixed follower's location choice. By establishing the submodularity of the multiple ratio functions, we characterize the mixed 0-1 set induced by each hypograph using submodular inequalities and extend a state-of-the-art branch-and-cut (B&C) algorithm to the considered SCFLP. To address the challenge of poor linear programming (LP) relaxation of the underlying formulation, we develop two new mixed integer linear programming (MILP) formulations for the SCFLP as well as efficient B&C algorithms based on them. The first MILP formulation is based on a class of improved submodular inequalities, which include the classic submodular inequalities as special cases, and together with the trivial inequalities characterize the convex hull of the mixed 0-1 set. The second one is an extended formulation of the first one that provides the same LP relaxation bound. We also develop efficient algorithms for the separations of the exponential families of the inequalities in the MILP formulations. Extensive computational experiments show that the proposed B&C algorithms significantly outperform an adapted state-of-the-art B&C algorithm and a sophisticated heuristic algorithm in the literature. Moreover, the proposed B&C algorithms can find optimal solutions for SCFLP instances with up to 1000 customers and facilities within a two-hour time limit.
- [38] arXiv:2302.00709 (替换) [中文pdf, pdf, html, 其他]
-
标题: 温和黎曼随机逼近标题: Tame Riemannian Stochastic Approximation主题: 机器学习 (cs.LG) ; 优化与控制 (math.OC)
我们研究了在受黎曼流形定义的约束下,对一个可驯服的不可微函数应用随机逼近的性质。 在o-极小拓扑中出现的可驯服函数的目标景观,当推广到流形时扩展到几何范畴,表现出一些结构,这些结构使得对于一般的随机次梯度下降法,能够保证期望函数减少和渐近收敛的理论保证。 最近的工作表明,这类函数能够忠实模拟深度神经网络训练目标的损失景观,深度学习包中使用的自动微分操作实现了具有正确收敛性质的次梯度下降变体。 黎曼优化利用约束集的几何特性来执行最小化过程,同时确保优化变量位于黎曼流形上。 本文首次研究了黎曼流形上的可驯服优化,突出了该问题丰富的几何结构,并通过一个简单退化的SGD算法的分析和数值报告,确认了标准"SGD"对该类问题的适用性。
We study the properties of stochastic approximation applied to a tame nondifferentiable function subject to constraints defined by a Riemannian manifold. The objective landscape of tame functions, arising in o-minimal topology extended to a geometric category when generalized to manifolds, exhibits some structure that enables theoretical guarantees of expected function decrease and asymptotic convergence for generic stochastic sub-gradient descent. Recent work has shown that this class of functions faithfully model the loss landscape of deep neural network training objectives, and the autograd operation used in deep learning packages implements a variant of subgradient descent with the correct properties for convergence. Riemannian optimization uses geometric properties of a constraint set to perform a minimization procedure while enforcing adherence to the the optimization variable lying on a Riemannian manifold. This paper presents the first study of tame optimization on Riemannian manifolds, highlighting the rich geometric structure of the problem and confirming the appropriateness of the canonical "SGD" for such a problem with the analysis and numerical reports of a simple Retracted SGD algorithm.
- [39] arXiv:2307.13413 (替换) [中文pdf, pdf, html, 其他]
-
标题: 离散时间一般马尔可夫迪尼游戏的马尔可夫随机均衡标题: Markovian randomized equilibria for general Markovian Dynkin games in discrete time主题: 概率 (math.PR) ; 优化与控制 (math.OC)
我们研究了经典双人Dynkin博弈在离散时间马尔可夫设置中的通用形式。 我们确定了一个适当的混合策略类——\textit{马尔可夫随机停止时间}——其中玩家在任何给定状态下以与状态相关的概率停止。 一个主要结果是基于这种随机化概念的Wald-Bellman型纳什均衡的显式表征。 特别是,我们推导出零和Dynkin博弈中随机均衡的新表征,我们用它来(i)建立马尔可夫随机均衡的存在性和显式构造,(ii)提供纯策略均衡不存在的必要充分条件,以及(iii)构造一个仅存在唯一随机均衡而没有纯均衡的例子。 我们还在我们博弈的对称版本中提供了存在性和表征结果。 最后,我们在状态空间是可数的假设下,建立了通用博弈形式中可表征均衡在马尔可夫随机停止时间下的存在性。
We study a general formulation of the classical two-player Dynkin game in a discrete time Markovian setting. We identify an appropriate class of mixed strategies -- \textit{Markovian randomized stopping times} -- in which players stop at any given state with a state-dependent probability. One main result is an explicit characterization of Wald-Bellman-type for Nash equilibria based on this notion of randomization. In particular, we derive a novel characterization of randomized equilibria in zero-sum Dynkin games, which we use to (i) establish the existence and explicit construction of Markovian randomized equilibria, (ii) provide necessary and sufficient conditions for the non-existence of pure strategy equilibria, and (iii) construct an example that admits a unique randomized equilibrium but no pure one. We also provide existence and characterization results in the symmetric version of our game. Finally, we establish existence of a characterizable equilibrium in Markovian randomized stopping times for the general game formulation under the assumption that the state space is countable.
- [40] arXiv:2501.10740 (替换) [中文pdf, pdf, html, 其他]
-
标题: 通过最小权重扰动提高神经ODE的鲁棒性标题: Improving the robustness of neural ODEs with minimal weight perturbation评论: 31页,5图,4表主题: 数值分析 (math.NA) ; 优化与控制 (math.OC)
我们提出一种方法,通过减少初始值扰动后的最大误差增长来增强神经常微分方程(神经ODE)的稳定性。 由于稳定性取决于与神经ODE相关的雅可比矩阵的对数范数,我们通过以最小可能的扰动(在弗罗贝尼乌斯范数下)扰动神经ODE的权重矩阵来控制对数范数。 我们通过参与一个特征值优化问题来实现这一点,我们为此提出了一种嵌套的两层算法。 对于给定的权重矩阵扰动大小,内层计算该大小的最优扰动,而在外层,我们调整扰动幅度,直到达到所需的统一稳定性界限。 我们将所提出的算法嵌入神经ODE的训练中,以提高其对初始值扰动的鲁棒性,例如对抗攻击。 在经典图像数据集上的数值实验表明,根据我们的策略训练的包含神经ODE的图像分类器比以传统方式训练的相同分类器更稳定,因此它对对抗攻击更具鲁棒性,也更不易受到攻击。
We propose a method to enhance the stability of a neural ordinary differential equation (neural ODE) by reducing the maximum error growth subsequent to a perturbation of the initial value. Since the stability depends on the logarithmic norm of the Jacobian matrix associated with the neural ODE, we control the logarithmic norm by perturbing the weight matrices of the neural ODE by a smallest possible perturbation (in Frobenius norm). We do so by engaging an eigenvalue optimisation problem, for which we propose a nested two-level algorithm. For a given perturbation size of the weight matrix, the inner level computes optimal perturbations of that size, while - at the outer level - we tune the perturbation amplitude until we reach the desired uniform stability bound. We embed the proposed algorithm in the training of the neural ODE to improve its robustness to perturbations of the initial value, as adversarial attacks. Numerical experiments on classical image datasets show that an image classifier including a neural ODE in its architecture trained according to our strategy is more stable than the same classifier trained in the classical way, and therefore, it is more robust and less vulnerable to adversarial attacks.
- [41] arXiv:2502.04584 (替换) [中文pdf, pdf, html, 其他]
-
标题: 联合状态和噪声协方差估计标题: Joint State and Noise Covariance Estimation评论: 添加了缺失的相关工作[4]主题: 机器人技术 (cs.RO) ; 优化与控制 (math.OC)
本文解决了从受高斯噪声污染的测量中联合估计噪声协方差矩阵以及状态(如位姿和点等参数)的问题,并且如果可用的话,还包括先验信息。在这样的情况下,噪声协方差矩阵决定了最小二乘问题中各个测量值的权重。我们证明了联合问题表现出凸结构,并在联合最大后验和似然框架及其多种变体中提供了最优噪声协方差估计的完整表征(具有解析解)。利用这一理论结果,我们提出了两种新的算法,用于联合估计主要参数和噪声协方差矩阵。我们的BCD算法可以轻松集成到现有的非线性最小二乘求解器中,每次迭代的计算开销可以忽略不计。为了验证我们的方法,我们在各种场景中进行了广泛的实验,并提供了关于其在机器人技术和计算机视觉估计问题中的应用的实际见解,特别关注SLAM。
This paper tackles the problem of jointly estimating the noise covariance matrix alongside states (parameters such as poses and points) from measurements corrupted by Gaussian noise and, if available, prior information. In such settings, the noise covariance matrix determines the weights assigned to individual measurements in the least squares problem. We show that the joint problem exhibits a convex structure and provide a full characterization of the optimal noise covariance estimate (with analytical solutions) within joint maximum a posteriori and likelihood frameworks and several variants. Leveraging this theoretical result, we propose two novel algorithms that jointly estimate the primary parameters and the noise covariance matrix. Our BCD algorithm can be easily integrated into existing nonlinear least squares solvers, with negligible per-iteration computational overhead. To validate our approach, we conduct extensive experiments across diverse scenarios and offer practical insights into their application in robotics and computer vision estimation problems with a particular focus on SLAM.
- [42] arXiv:2502.05305 (替换) [中文pdf, pdf, html, 其他]
-
标题: 在线非光滑随机逼近中的协方差估计标题: Online Covariance Estimation in Nonsmooth Stochastic Approximation评论: 46页,1图;被第38届学习理论年度会议(COLT 2025)接收主题: 机器学习 (stat.ML) ; 机器学习 (cs.LG) ; 优化与控制 (math.OC)
我们考虑将随机逼近(SA)方法应用于解决非光滑变分包含问题。 现有研究表明,SA方法的平均迭代具有渐近正态性,在Hájek和Le Cam的局部最小最大意义下具有最优极限协方差矩阵。 然而,在非光滑且可能非单调(非凸)的设置中,尚未提出估计此协方差矩阵的方法。 在本文中,我们研究了Zhu等人(2023)中引入的一种在线批次均值协方差矩阵估计器。 该估计器适当分组SA迭代,并计算批次之间的样本协方差作为极限协方差的估计。 其构造不需要事先了解总样本量,并且可以在新数据到达时递归地进行更新。 我们证明,只要批次大小序列被正确指定(取决于步长序列),该估计器在任何$\varepsilon>0$的情况下达到$O(\sqrt{d}n^{-1/8+\varepsilon})$的收敛速度,其中$d$和$n$分别表示问题的维度和使用的迭代次数(或样本数)。 尽管该问题是非光滑且可能非单调(非凸)的,但我们的收敛速度与仅使用一阶信息的平滑且强凸设置中的最佳已知协方差估计方法的收敛速度相匹配。 该协方差估计器的一致性使得可以进行渐近有效的统计推断,包括构建置信区间和执行假设检验。
We consider applying stochastic approximation (SA) methods to solve nonsmooth variational inclusion problems. Existing studies have shown that the averaged iterates of SA methods exhibit asymptotic normality, with an optimal limiting covariance matrix in the local minimax sense of H\'ajek and Le Cam. However, no methods have been proposed to estimate this covariance matrix in a nonsmooth and potentially non-monotone (nonconvex) setting. In this paper, we study an online batch-means covariance matrix estimator introduced in Zhu et al.(2023). The estimator groups the SA iterates appropriately and computes the sample covariance among batches as an estimate of the limiting covariance. Its construction does not require prior knowledge of the total sample size, and updates can be performed recursively as new data arrives. We establish that, as long as the batch size sequence is properly specified (depending on the stepsize sequence), the estimator achieves a convergence rate of order $O(\sqrt{d}n^{-1/8+\varepsilon})$ for any $\varepsilon>0$, where $d$ and $n$ denote the problem dimensionality and the number of iterations (or samples) used. Although the problem is nonsmooth and potentially non-monotone (nonconvex), our convergence rate matches the best-known rate for covariance estimation methods using only first-order information in smooth and strongly-convex settings. The consistency of this covariance estimator enables asymptotically valid statistical inference, including constructing confidence intervals and performing hypothesis testing.
- [43] arXiv:2504.02766 (替换) [中文pdf, pdf, html, 其他]
-
标题: 关于系统协同设计中的可组合性和参数化不确定性标题: On Composable and Parametric Uncertainty in Systems Co-Design评论: 8页,被接受为IEEE控制决策会议(CDC)2025的特邀论文主题: 系统与控制 (eess.SY) ; 优化与控制 (math.OC)
优化复杂系统的设计需要在相互依赖的决策、异构组件和多个目标之间进行权衡。 我们的协同设计单调理论提供了一个组合框架来应对这一挑战,将系统建模为设计问题(DPs),在部分有序集中表示功能与资源之间的权衡。 虽然当前方法使用区间来建模不确定性,捕捉最坏和最好情况的边界,但它们无法表达风险和置信度等概率概念。 这些限制阻碍了协同设计在不确定性起关键作用的领域中的应用。 在本文中,我们引入了一个统一的框架,用于协同设计中的可组合不确定性,该框架涵盖了区间、分布和参数化模型。 这一扩展使得能够对风险-性能权衡进行推理,并支持实验设计、学习和多阶段决策等高级查询。 我们通过一个关于任务驱动的无人驾驶飞行器(UAVs)的不确定性感知协同设计的数值案例研究,展示了该框架的表达能力和实用性。
Optimizing the design of complex systems requires navigating interdependent decisions, heterogeneous components, and multiple objectives. Our monotone theory of co-design offers a compositional framework for addressing this challenge, modeling systems as Design Problems (DPs), representing trade-offs between functionalities and resources within partially ordered sets. While current approaches model uncertainty using intervals, capturing worst- and best-case bounds, they fail to express probabilistic notions such as risk and confidence. These limitations hinder the applicability of co-design in domains where uncertainty plays a critical role. In this paper, we introduce a unified framework for composable uncertainty in co-design, capturing intervals, distributions, and parametrized models. This extension enables reasoning about risk-performance trade-offs and supports advanced queries such as experiment design, learning, and multi-stage decision making. We demonstrate the expressiveness and utility of the framework via a numerical case study on the uncertainty-aware co-design of task-driven Unmanned Aerial Vehicles (UAVs).
- [44] arXiv:2507.04156 (替换) [中文pdf, pdf, html, 其他]
-
标题: 自适应双向组合优化:收益最大化标题: Adaptive Two-sided Assortment Optimization: Revenue Maximization主题: 计算机科学与博弈论 (cs.GT) ; 优化与控制 (math.OC)
我们研究在基于选择的匹配平台中,为了最大化收入而进行的自适应双向组合优化。 平台有两方的代理,一方是发起方,另一方是响应方。 决策者依次从发起方选择代理,为每个代理展示响应方的代理组合,并观察他们的选择。 在处理完所有发起方代理后,响应方代理将被展示组合并做出选择。 当两个代理相互选择对方时会发生匹配,产生依赖于对的收入。 选择遵循多项式逻辑(MNL)模型。 这种设置扩展了之前专注于在次模需求假设下最大化匹配数量的研究,这些假设在我们的收入最大化情境中并不成立。 我们的主要贡献是设计具有常数因子保证的多项式时间近似算法。 特别是,对于一般的成对收入,我们开发了一个随机算法,在任何$\epsilon > 0$的情况下,期望上达到$(\frac{1}{2} - \epsilon)$-近似。 该算法是静态的,并在各种代理到达设置下提供保证,包括固定顺序、同时处理和自适应选择。 当所有涉及任何给定响应方代理的成对收入是均匀的时候,保证提高到$(1 - \frac{1}{e} - \epsilon)$。 在结构设置中,如果响应方代理共享一个基于收入的排名,我们设计了一个更简单的自适应确定性算法,达到$\frac{1}{2}$-近似。 我们的方法利用了新颖的线性规划松弛、相关间隙论证以及收入函数的结构性质。
We study adaptive two-sided assortment optimization for revenue maximization in choice-based matching platforms. The platform has two sides of agents, an initiating side, and a responding side. The decision-maker sequentially selects agents from the initiating side, shows each an assortment of agents from the responding side, and observes their choices. After processing all initiating agents, the responding agents are shown assortments and make their selections. A match occurs when two agents mutually select each other, generating pair-dependent revenue. Choices follow Multinomial Logit (MNL) models. This setting generalizes prior work focused on maximizing the number of matches under submodular demand assumptions, which do not hold in our revenue-maximization context. Our main contribution is the design of polynomial-time approximation algorithms with constant-factor guarantees. In particular, for general pairwise revenues, we develop a randomized algorithm that achieves a $(\frac{1}{2} - \epsilon)$-approximation in expectation for any $\epsilon > 0$. The algorithm is static and provides guarantees under various agent arrival settings, including fixed order, simultaneous processing, and adaptive selection. When revenues are uniform across all pairs involving any given responding-side agent, the guarantee improves to $(1 - \frac{1}{e} - \epsilon)$. In structural settings where responding-side agents share a common revenue-based ranking, we design a simpler adaptive deterministic algorithm achieving a $\frac{1}{2}$-approximation. Our approach leverages novel linear programming relaxations, correlation gap arguments, and structural properties of the revenue functions.