PANDA: AdaPtive Noisy Data Augmentation for Regularization of Undirected Graphical Models

Li, Yinan; Liu, Xiao; Liu, Fang


arXiv:1810.04851 (stat)

[Submitted on 11 Oct 2018 (v1), last revised 21 May 2019 (this version, v2)]



Abstract: We propose an AdaPtive Noise Augmentation (PANDA) technique to regularize the estimation and construction of undirected graphical models. PANDA iteratively optimizes the objective function given the noise-augmented data until convergence to achieve regularization on model parameters. The augmented noise can be designed to achieve various regularization effects on graph estimation, such as the bridge (including lasso and ridge), elastic net, adaptive lasso, and SCAD penalization; it also realizes the group lasso and fused ridge. We examine the tail bound of the noise-augmented loss function and establish that the noise-augmented loss function and its minimizer converge almost surely to the expected penalized loss function and its minimizer, respectively. We derive the asymptotic distributions for the regularized parameters through PANDA in generalized linear models (GLMs), based on which inferences for the parameters can be obtained simultaneously with variable selection. We show the non-inferior performance of PANDA in constructing graphs of different types in simulation studies and apply PANDA to an autism spectrum disorder data set to construct a mixed-node graph. We also show that inferences based on the asymptotic distribution of regularized parameter estimates via PANDA achieve nominal or near-nominal coverage and are far more efficient than some existing post-selection procedures. Computationally, PANDA can be easily programmed in software that implements GLMs without resorting to complicated optimization techniques.
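
The iterative idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain linear regression rather than a graphical model, and the function name `noise_augmented_fit`, its parameters, and the specific noise variance `lam / (n_aug * |theta_j|**gamma)` are all hypothetical choices made for illustration. Under these assumptions, appending zero-response noise rows to the data and refitting ordinary least squares adds, in expectation, a penalty of `lam * theta_j**2 / |theta_prev_j|**gamma` per coefficient, which is ridge-type for `gamma=0` and produces lasso-type shrinkage toward zero for `gamma=1`.

```python
import numpy as np

def noise_augmented_fit(X, y, lam, n_aug=200, n_iter=50, gamma=1.0,
                        eps=1e-8, seed=0):
    """Hypothetical sketch: refit least squares on noise-augmented data.

    Each iteration appends n_aug rows of Gaussian noise (with response 0)
    whose per-column variance depends on the current coefficient estimate.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Start from the unpenalized least-squares estimate.
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        # Assumed variance rule: small coefficients receive large noise,
        # so they are shrunk harder on the next fit.
        var = lam / (n_aug * np.maximum(np.abs(theta), eps) ** gamma)
        E = rng.normal(size=(n_aug, p)) * np.sqrt(var)
        X_aug = np.vstack([X, E])
        y_aug = np.concatenate([y, np.zeros(n_aug)])
        # Ordinary least squares on the augmented data set.
        theta = np.linalg.lstsq(X_aug, y_aug, rcond=None)[0]
    return theta

if __name__ == "__main__":
    # Simulated data with a sparse coefficient vector (illustrative only).
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    truth = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
    y = X @ truth + 0.5 * rng.normal(size=200)
    print(noise_augmented_fit(X, y, lam=20.0))
```

The only fitting routine used is an unpenalized least-squares solver, which mirrors the abstract's remark that the approach can be programmed with standard model-fitting software. Because the noise is redrawn at every iteration, the estimates fluctuate slightly from run to run unless the seed is fixed.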

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
MSC classes:	62
Cite as:	arXiv:1810.04851 [stat.ML]
	(or arXiv:1810.04851v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1810.04851

Submission history

From: Fang Liu
[v1] Thu, 11 Oct 2018 05:54:44 UTC (936 KB)
[v2] Tue, 21 May 2019 22:52:51 UTC (1,281 KB)
