Stochastic EM for Shuffled Linear Regression

Abid, Abubakar; Zou, James


arXiv:1804.00681 (stat)

[Submitted on 2 Apr 2018]

Title: Stochastic EM for Shuffled Linear Regression

Authors: Abubakar Abid, James Zou

Abstract: We consider the problem of inference in a linear regression model in which the relative ordering of the input features and output labels is not known. Such datasets naturally arise from experiments in which the samples are shuffled or permuted during the protocol. In this work, we propose a framework that treats the unknown permutation as a latent variable. We maximize the likelihood of observations using a stochastic expectation-maximization (EM) approach. We compare this to the dominant approach in the literature, which corresponds to hard EM in our framework. We show on synthetic data that the stochastic EM algorithm we develop has several advantages, including lower parameter error, less sensitivity to the choice of initialization, and significantly better performance on datasets that are only partially shuffled. We conclude by performing two experiments on real datasets that have been partially shuffled, in which we show that the stochastic EM algorithm can recover the weights with modest error.
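
To make the contrast between hard EM and stochastic EM concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes scalar labels, Gaussian noise with a known standard deviation `sigma`, and a Metropolis-Hastings sampler that proposes swapping two labels. All function names and default parameters are hypothetical. With scalar labels the hard assignment reduces to pairing sorted predictions with sorted labels. The stochastic variant instead averages label orderings sampled from the posterior over permutations before refitting the weights.

```python
import numpy as np

def align_by_sorting(pred, y):
    """Hard assignment: pair the k-th smallest prediction with the k-th smallest label."""
    y_aligned = np.empty_like(y)
    y_aligned[np.argsort(pred)] = np.sort(y)
    return y_aligned

def hard_em(X, y, w0, n_iters=50):
    """Alternate between the single best permutation and a least-squares refit."""
    w = w0.copy()
    for _ in range(n_iters):
        y_aligned = align_by_sorting(X @ w, y)
        w, *_ = np.linalg.lstsq(X, y_aligned, rcond=None)
    return w

def stochastic_em(X, y, w0, sigma=1.0, n_iters=50, n_swaps=2000, burn_in=500, rng=None):
    """Sample label orderings with Metropolis-Hastings swaps, then refit on their mean."""
    rng = np.random.default_rng(rng)
    n = len(y)
    w = w0.copy()
    y_cur = y.copy()  # current ordering of the labels (always a permutation of y)
    for _ in range(n_iters):
        pred = X @ w
        y_sum = np.zeros(n)
        kept = 0
        for t in range(n_swaps):
            i, j = rng.integers(n, size=2)
            # Change in squared error if the labels at positions i and j are swapped.
            delta = ((y_cur[j] - pred[i]) ** 2 + (y_cur[i] - pred[j]) ** 2
                     - (y_cur[i] - pred[i]) ** 2 - (y_cur[j] - pred[j]) ** 2)
            # Accept with probability min(1, exp(-delta / (2 sigma^2))).
            if rng.random() < np.exp(min(0.0, -delta / (2 * sigma ** 2))):
                y_cur[i], y_cur[j] = y_cur[j], y_cur[i]
            if t >= burn_in:
                y_sum += y_cur
                kept += 1
        # M-step: the expected squared error is minimized by regressing on the mean ordering.
        w, *_ = np.linalg.lstsq(X, y_sum / kept, rcond=None)
    return w
```

Both functions take a design matrix `X` of shape `(n, d)`, a shuffled label vector `y` of length `n`, and an initial weight vector `w0`. A small `sigma` concentrates the sampler on low-error orderings, while a larger value lets it explore more permutations.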

Comments:	11 pages, 5 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1804.00681 [stat.ML]
	(or arXiv:1804.00681v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1804.00681

Submission history

From: Abubakar Abid
[v1] Mon, 2 Apr 2018 18:13:49 UTC (1,248 KB)
