Neural and Evolutionary Computing
Showing new listings for Wednesday, 13 August 2025
- [1] arXiv:2508.08526 [pdf, html, other]
  Title: Playing Atari Space Invaders with Sparse Cosine Optimized Policy Evolution
  Comments: 21st Conference on Artificial Intelligence and Interactive Digital Entertainment
  Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Evolutionary approaches have previously been shown to be effective learning methods for a diverse set of domains. However, the domain of game-playing poses a particular challenge for evolutionary methods due to the inherently large state space of video games. As the size of the input state expands, the size of the policy must also increase in order to effectively learn the temporal patterns in the game space. Consequently, a larger policy must contain more trainable parameters, exponentially increasing the size of the search space. Any increase in search space is highly problematic for evolutionary methods, as increasing the number of trainable parameters is inversely correlated with convergence speed. To reduce the size of the input space while maintaining a meaningful representation of the original space, we introduce Sparse Cosine Optimized Policy Evolution (SCOPE). SCOPE utilizes the Discrete Cosine Transform (DCT) as a pseudo-attention mechanism, transforming an input state into a coefficient matrix. By truncating and applying sparsification to this matrix, we reduce the dimensionality of the input space while retaining the highest energy features of the original input. We demonstrate the effectiveness of SCOPE as the policy for the Atari game Space Invaders. In this task, SCOPE with CMA-ES outperforms evolutionary methods that consider an unmodified input state, such as OpenAI-ES and HyperNEAT. SCOPE also outperforms simple reinforcement learning methods, such as DQN and A3C. SCOPE achieves this result by reducing the input size by 53%, from 33,600 to 15,625, and then using a bilinear affine mapping, learned by the CMA-ES algorithm, from sparse DCT coefficients to policy actions.
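To make the pipeline concrete, here is a minimal sketch of a SCOPE-style reduction: a 2D DCT of a 210x160 grayscale frame, truncation to the 125x125 low-frequency block (matching the 33,600 to 15,625 figures quoted in the abstract), thresholded sparsification, and a bilinear affine map to action logits. The block size, the quantile threshold, and the helper names (`scope_features`, `policy_logits`) are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch of a SCOPE-style input reduction, not the authors' exact code.
import numpy as np
from scipy.fft import dctn

KEEP = 125          # side length of the retained low-frequency block (assumed)
SPARSITY_Q = 0.75   # fraction of retained coefficients zeroed out (assumed)

def scope_features(frame: np.ndarray) -> np.ndarray:
    """Map a (210, 160) grayscale frame to a sparse (125, 125) DCT coefficient block."""
    coeffs = dctn(frame.astype(np.float64), norm="ortho")   # full 2D DCT
    block = coeffs[:KEEP, :KEEP]                            # truncate: keep low frequencies
    thresh = np.quantile(np.abs(block), SPARSITY_Q)         # sparsify small-magnitude entries
    return np.where(np.abs(block) >= thresh, block, 0.0)

def policy_logits(block: np.ndarray, W_left: np.ndarray, W_right: np.ndarray,
                  bias: np.ndarray) -> np.ndarray:
    """Bilinear affine map from the coefficient block to action logits.

    W_left: (n_actions, 125), W_right: (125, 1), bias: (n_actions,) -- the flat
    parameter vector an evolution strategy such as CMA-ES would optimize.
    """
    return (W_left @ block @ W_right).ravel() + bias

if __name__ == "__main__":
    frame = np.random.rand(210, 160)                 # stand-in for a Space Invaders frame
    feats = scope_features(frame)
    n_actions = 6                                    # Space Invaders action count in ALE
    rng = np.random.default_rng(0)
    logits = policy_logits(feats, rng.normal(size=(n_actions, KEEP)),
                           rng.normal(size=(KEEP, 1)), np.zeros(n_actions))
    print(int(np.argmax(logits)))                    # greedy action
```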
New submissions (showing 1 of 1 entries)
- [2] arXiv:2508.08292 (cross-listed from cs.CL) [pdf, html, other]
  Title: Putnam-AXIOM: A Functional and Static Benchmark
  Authors: Aryan Gulati, Brando Miranda, Eric Chen, Emily Xia, Kai Fronsdal, Bruno Dumont, Elyas Obbad, Sanmi Koyejo
  Comments: 27 pages total (10-page main paper + 17-page appendix), 12 figures, 6 tables. Submitted to ICML 2025 (under review)
  Journal reference: ICML 2025
  Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Logic in Computer Science (cs.LO); Neural and Evolutionary Computing (cs.NE)
Current mathematical reasoning benchmarks for large language models (LLMs) are approaching saturation, with some achieving > 90% accuracy, and are increasingly compromised by training-set contamination. We introduce Putnam-AXIOM, a benchmark of 522 university-level competition problems drawn from the prestigious William Lowell Putnam Mathematical Competition, and Putnam-AXIOM Variation, an unseen companion set of 100 functional variants generated by programmatically perturbing variables and constants. The variation protocol produces an unlimited stream of equally difficult, unseen instances -- yielding a contamination-resilient test bed. On the Original set, OpenAI's o1-preview -- the strongest evaluated model -- scores 41.9%, but its accuracy drops by 19.6% (46.8% relative decrease) on the paired Variations. The remaining eighteen models show the same downward trend, ten of them with non-overlapping 95% confidence intervals. These gaps suggest memorization and highlight the necessity of dynamic benchmarks. We complement "boxed" accuracy with Teacher-Forced Accuracy (TFA), a lightweight metric that directly scores reasoning traces and automates natural language proof evaluations. Putnam-AXIOM therefore provides a rigorous, contamination-resilient evaluation framework for assessing advanced mathematical reasoning of LLMs. Data and evaluation code are publicly available at https://github.com/brando90/putnam-axiom.
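The variation protocol lends itself to a small sketch: a parameterized problem template whose constants are perturbed and whose boxed answer is recomputed programmatically, giving an unbounded supply of equally difficult variants. The toy integral template and the helper names (`make_variant`, `grade_boxed`) below are invented for illustration and are not actual Putnam-AXIOM items.

```python
# Hedged sketch of a functional-variation generator in the spirit of Putnam-AXIOM.
import random
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class Variant:
    statement: str
    boxed_answer: str

def make_variant(rng: random.Random) -> Variant:
    # Perturb the constants of a parameterized problem template.
    k = rng.randint(2, 9)
    a = rng.randint(2, 12)
    statement = (f"Evaluate the integral of x^{k} dx from 0 to {a}, "
                 f"expressing the result as a reduced fraction.")
    answer = Fraction(a ** (k + 1), k + 1)           # ground truth recomputed per variant
    return Variant(statement, f"{answer.numerator}/{answer.denominator}")

def grade_boxed(prediction: str, variant: Variant) -> bool:
    # "Boxed" accuracy: exact match against the recomputed answer string.
    return prediction.strip() == variant.boxed_answer

if __name__ == "__main__":
    rng = random.Random(42)
    for v in (make_variant(rng) for _ in range(3)):
        print(v.statement, "->", v.boxed_answer)
```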
- [3] arXiv:2508.08877 (cross-listed from cs.LG) [pdf, html, other]
  Title: Towards Scalable Lottery Ticket Networks using Genetic Algorithms
  Comments: 27 pages, 11 figures, 7 tables; extended version of a paper submitted to IJCCI 2024 (DOI: 10.5220/0013010300003837), to appear in the journal Studies in Computational Intelligence
  Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Building modern deep learning systems that are not just effective but also efficient requires rethinking established paradigms for model training and neural architecture design. Instead of adapting highly overparameterized networks and subsequently applying model compression techniques to reduce resource consumption, a new class of high-performing networks skips the need for expensive parameter updates, while requiring only a fraction of the parameters, making them highly scalable. The Strong Lottery Ticket Hypothesis posits that within randomly initialized, sufficiently overparameterized neural networks, there exist subnetworks that can match the accuracy of the trained original model, without any training. This work explores the usage of genetic algorithms for identifying these strong lottery ticket subnetworks. We find that for instances of binary and multi-class classification tasks, our approach achieves better accuracies and sparsity levels than the current state-of-the-art without requiring any gradient information. In addition, we provide justification for the need for appropriate evaluation metrics when scaling to more complex network architectures and learning tasks.
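As a rough illustration of this kind of search, the sketch below evolves binary masks over a fixed, randomly initialized two-layer MLP: fitness is the accuracy of the masked network, with tournament selection, uniform crossover, and bit-flip mutation. The toy dataset, network size, and GA hyperparameters are assumptions, not the paper's experimental setup.

```python
# Hedged sketch of a genetic algorithm searching for strong-lottery-ticket masks.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data (two Gaussian blobs).
X = np.vstack([rng.normal(-1, 1, size=(200, 10)), rng.normal(1, 1, size=(200, 10))])
y = np.array([0] * 200 + [1] * 200)

# Fixed random weights; only the binary masks are evolved (no gradient updates).
W1, W2 = rng.normal(0, 1, size=(10, 32)), rng.normal(0, 1, size=(32, 2))

def accuracy(mask: np.ndarray) -> float:
    m1, m2 = mask[:W1.size].reshape(W1.shape), mask[W1.size:].reshape(W2.shape)
    h = np.maximum(X @ (W1 * m1), 0.0)            # ReLU hidden layer of the masked net
    logits = h @ (W2 * m2)
    return float((logits.argmax(axis=1) == y).mean())

def evolve(pop_size=64, generations=50, p_mut=0.02):
    n_genes = W1.size + W2.size
    pop = rng.random((pop_size, n_genes)) < 0.5    # random binary masks as genomes
    for _ in range(generations):
        fitness = np.array([accuracy(ind) for ind in pop])
        children = []
        for _ in range(pop_size):
            # Tournament selection of two parents.
            a, b = rng.choice(pop_size, size=2, replace=False)
            p1 = pop[a] if fitness[a] >= fitness[b] else pop[b]
            a, b = rng.choice(pop_size, size=2, replace=False)
            p2 = pop[a] if fitness[a] >= fitness[b] else pop[b]
            # Uniform crossover followed by bit-flip mutation.
            child = np.where(rng.random(n_genes) < 0.5, p1, p2)
            child ^= rng.random(n_genes) < p_mut
            children.append(child)
        pop = np.array(children)
    best = max(pop, key=accuracy)
    return best, accuracy(best), 1.0 - best.mean()  # mask, accuracy, sparsity

if __name__ == "__main__":
    _, acc, sparsity = evolve()
    print(f"masked-network accuracy={acc:.3f}, sparsity={sparsity:.3f}")
```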
Cross submissions (showing 2 of 2 entries)
- [4] arXiv:2503.02303 (replaced) [pdf, html, other]
  Title: Flexible Prefrontal Control over Hippocampal Episodic Memory for Goal-Directed Generalization
  Comments: Accepted to the 2025 Conference on Cognitive Computational Neuroscience (CCN 2025). Preprint available on OpenReview: https://openreview.net/forum?id=7hhz5ToJnM
  Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Many tasks require flexibly modifying perception and behavior based on current goals. Humans can retrieve episodic memories from days to years ago, using them to contextualize and generalize behaviors across novel but structurally related situations. The brain's ability to control episodic memories based on task demands is often attributed to interactions between the prefrontal cortex (PFC) and hippocampus (HPC). We propose a reinforcement learning model that incorporates a PFC-HPC interaction mechanism for goal-directed generalization. In our model, the PFC learns to generate query-key representations to encode and retrieve goal-relevant episodic memories, modulating HPC memories top-down based on current task demands. Moreover, the PFC adapts its encoding and retrieval strategies dynamically when faced with multiple goals presented in a blocked, rather than interleaved, manner. Our results show that: (1) combining working memory with selectively retrieved episodic memory allows transfer of decisions among similar environments or situations, (2) top-down control from PFC over HPC improves learning of arbitrary structural associations between events for generalization to novel environments compared to a bottom-up sensory-driven approach, and (3) the PFC encodes generalizable representations during both encoding and retrieval of goal-relevant memories, whereas the HPC exhibits event-specific representations. Together, these findings highlight the importance of goal-directed prefrontal control over hippocampal episodic memory for decision-making in novel situations and suggest a computational mechanism by which PFC-HPC interactions enable flexible behavior.
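A minimal conceptual sketch of the query-key mechanism described here, assuming a simple softmax read-out over stored keys: a PFC-like module turns the current observation and goal into a query, and an HPC-like store returns a goal-weighted mixture of encoded episodes that is concatenated with working memory for downstream decision-making. The class names, dimensions, and fixed random query projection are illustrative, not the paper's model.

```python
# Hedged conceptual sketch of PFC-style query-key retrieval over an episodic store.
import numpy as np

class EpisodicMemory:
    """HPC-like store: event-specific keys and values appended over time."""
    def __init__(self, dim: int):
        self.keys, self.values = np.empty((0, dim)), np.empty((0, dim))

    def encode(self, key: np.ndarray, value: np.ndarray) -> None:
        self.keys = np.vstack([self.keys, key])
        self.values = np.vstack([self.values, value])

    def retrieve(self, query: np.ndarray) -> np.ndarray:
        # Soft, top-down weighted read-out of stored events.
        scores = self.keys @ query
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ self.values

class PFC:
    """Generates goal-conditioned queries (here: a fixed random projection)."""
    def __init__(self, obs_dim: int, goal_dim: int, mem_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W_q = rng.normal(0, 0.1, size=(obs_dim + goal_dim, mem_dim))

    def query(self, obs: np.ndarray, goal: np.ndarray) -> np.ndarray:
        return np.concatenate([obs, goal]) @ self.W_q   # goal-relevant retrieval cue

if __name__ == "__main__":
    obs_dim, goal_dim, mem_dim = 8, 4, 16
    rng = np.random.default_rng(1)
    pfc, hpc = PFC(obs_dim, goal_dim, mem_dim), EpisodicMemory(mem_dim)
    for _ in range(20):                                  # encode past episodes
        hpc.encode(rng.normal(size=mem_dim), rng.normal(size=mem_dim))
    retrieved = hpc.retrieve(pfc.query(rng.normal(size=obs_dim), rng.normal(size=goal_dim)))
    working_memory = rng.normal(size=mem_dim)
    decision_input = np.concatenate([working_memory, retrieved])   # fed to an RL policy
    print(decision_input.shape)
```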
- [5] arXiv:2204.05138 (replaced) [pdf, other]
  Title: Artificial Intelligence Software Structured to Simulate Human Working Memory, Mental Imagery, and Mental Continuity
  Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Symbolic Computation (cs.SC)
This article presents an artificial intelligence (AI) architecture intended to simulate the iterative updating of the human working memory system. It features several interconnected neural networks designed to emulate the specialized modules of the cerebral cortex. These are structured hierarchically and integrated into a global workspace. They are capable of temporarily maintaining high-level representational patterns akin to the psychological items maintained in working memory. This maintenance is made possible by persistent neural activity in the form of two modalities: sustained neural firing (resulting in a focus of attention) and synaptic potentiation (resulting in a short-term store). Representations held in persistent activity are recursively replaced, resulting in incremental changes to the content of the working memory system. As this content gradually evolves, successive processing states overlap and are continuous with one another. The present article will explore how this architecture can lead to iterative shifts in the distribution of coactive representations, ultimately leading to mental continuity between processing states, and thus to human-like thought and cognition. Like the human brain, this AI working memory store will be linked to multiple imagery (topographic map) generation systems corresponding to various sensory modalities. As working memory is iteratively updated, the maps created in response will construct sequences of related mental imagery. Thus, neural networks emulating the prefrontal cortex and its reciprocal interactions with early sensory and motor cortex capture the imagery guidance functions of the human brain. This sensory and motor imagery creation, coupled with an iteratively updated working memory store, may provide an AI system with the cognitive assets needed to achieve synthetic consciousness or artificial sentience.
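The iterative-updating idea can be illustrated with a toy sketch, assuming a small "focus of attention" and a larger "short-term store" whose contents are only partially replaced on each step, so that consecutive states overlap. The store sizes, item names, and replacement scheme below are invented for illustration and do not come from the article.

```python
# Hedged toy illustration of iterative working-memory updating with overlapping states.
import random

class WorkingMemory:
    def __init__(self, focus_size=4, store_size=12, seed=0):
        self.rng = random.Random(seed)
        self.focus = [f"item{i}" for i in range(focus_size)]        # sustained firing
        self.store = [f"item{i}" for i in range(focus_size, focus_size + store_size)]  # synaptic potentiation

    def update(self, new_items):
        # Replace only a fraction of current contents so successive states overlap.
        for item in new_items:
            evicted = self.rng.randrange(len(self.focus))
            self.store[self.rng.randrange(len(self.store))] = self.focus[evicted]  # demote to short-term store
            self.focus[evicted] = item                                              # new item enters the focus

    def overlap(self, previous_state):
        return len(set(previous_state) & set(self.focus + self.store))

if __name__ == "__main__":
    wm = WorkingMemory()
    prev = list(wm.focus + wm.store)
    wm.update(["novel_percept_1", "novel_percept_2"])
    print("items shared with previous state:", wm.overlap(prev), "of", len(prev))
```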
- [6] arXiv:2504.13522 (replaced) [pdf, html, other]
  Title: Cross-Modal Temporal Fusion for Financial Market Forecasting
  Comments: 10 pages, 4 figures; accepted at PAIS at ECAI-2025, the European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy
  Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Computational Finance (q-fin.CP)
Accurate forecasting in financial markets requires integrating diverse data sources, from historical prices to macroeconomic indicators and financial news. However, existing models often fail to align these modalities effectively, limiting their practical use. In this paper, we introduce a transformer-based deep learning framework, Cross-Modal Temporal Fusion (CMTF), that fuses structured and unstructured financial data for improved market prediction. The model incorporates a tensor interpretation module for feature selection and an auto-training pipeline for efficient hyperparameter tuning. Experimental results using FTSE 100 stock data demonstrate that CMTF achieves superior performance in price direction classification compared to classical and deep learning baselines. These findings suggest that our framework is an effective and scalable solution for real-world cross-modal financial forecasting tasks.
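As a rough sketch of the kind of fusion CMTF describes, assuming precomputed news embeddings and a shared token space: structured price sequences and unstructured text embeddings are each projected to a common dimension, concatenated along the sequence axis, and passed through a transformer encoder with a direction-classification head. The dimensions, pooling, and head are assumptions, not the published architecture, which additionally includes a tensor interpretation module and an auto-training pipeline.

```python
# Hedged sketch of cross-modal fusion for price-direction classification.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, price_dim=8, news_dim=384, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.price_proj = nn.Linear(price_dim, d_model)   # structured branch
        self.news_proj = nn.Linear(news_dim, d_model)     # unstructured (text-embedding) branch
        encoder_layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128,
                                                   batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 2)                 # up / down direction logits

    def forward(self, prices, news):
        # prices: (batch, T, price_dim); news: (batch, N, news_dim)
        tokens = torch.cat([self.price_proj(prices), self.news_proj(news)], dim=1)
        fused = self.encoder(tokens)                       # cross-modal attention over all tokens
        return self.head(fused.mean(dim=1))                # mean-pooled classification

if __name__ == "__main__":
    model = CrossModalFusion()
    prices = torch.randn(4, 30, 8)    # 30 trading days of 8 structured features
    news = torch.randn(4, 5, 384)     # 5 news items as precomputed sentence embeddings
    print(model(prices, news).shape)  # torch.Size([4, 2])
```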