Exploring How Audio Effects Alter Emotion with Foundation Models

Katsis, Stelios; Lyberatos, Vassilis; Kantarelis, Spyridon; Dervakos, Edmund; Stamou, Giorgos

Computer Science > Sound

arXiv:2509.15151 (cs)

[Submitted on 18 Sep 2025 (v1), last revised 20 Sep 2025 (this version, v2)]

Authors: Stelios Katsis, Vassilis Lyberatos, Spyridon Kantarelis, Edmund Dervakos, Giorgos Stamou

Abstract: Audio effects (FX) such as reverberation, distortion, modulation, and dynamic range processing play a pivotal role in shaping emotional responses during music listening. While prior studies have examined links between low-level audio features and affective perception, the systematic impact of audio FX on emotion remains underexplored. This work investigates how foundation models - large-scale neural architectures pretrained on multimodal data - can be leveraged to analyze these effects. Such models encode rich associations between musical structure, timbre, and affective meaning, offering a powerful framework for probing the emotional consequences of sound design techniques. By applying various probing methods to embeddings from deep learning models, we examine the complex, nonlinear relationships between audio FX and estimated emotion, uncovering patterns tied to specific effects and evaluating the robustness of foundation audio models. Our findings aim to advance understanding of the perceptual impact of audio production practices, with implications for music cognition, performance, and affective computing.

Comments:	https://github.com/stelioskt/audioFX
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2509.15151 [cs.SD]
	(or arXiv:2509.15151v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2509.15151

Submission history

From: Vassilis Lyberatos
[v1] Thu, 18 Sep 2025 16:57:08 UTC (13,059 KB)
[v2] Sat, 20 Sep 2025 08:36:11 UTC (13,059 KB)
