"Read My Lips": Using Automatic Text Analysis to Classify Politicians by Party and Ideology

Sapiro-Gheiler, Eitan

arXiv:1809.00741 (econ)

[Submitted on 3 Sep 2018]

Authors: Eitan Sapiro-Gheiler

Abstract: The increasing digitization of political speech has opened the door to studying a new dimension of political behavior using text analysis. This work investigates the value of word-level statistical data from the US Congressional Record--which contains the full text of all speeches made in the US Congress--for studying the ideological positions and behavior of senators. Applying machine learning techniques, we use this data to automatically classify senators according to party, obtaining accuracy in the 70-95% range depending on the specific method used. We also show that using text to predict DW-NOMINATE scores, a common proxy for ideology, does not improve upon these already-successful results. This classification deteriorates when applied to text from sessions of Congress that are four or more years removed from the training set, pointing to a need on the part of voters to dynamically update the heuristics they use to evaluate party based on political speech. Text-based predictions are less accurate than those based on voting behavior, supporting the theory that roll-call votes represent greater commitment on the part of politicians and are thus a more accurate reflection of their ideological preferences. However, the overall success of the machine learning approaches studied here demonstrates that political speeches are highly predictive of partisan affiliation. In addition to these findings, this work also introduces the computational tools and methods relevant to the use of political speech data.

Subjects:	General Economics (econ.GN); Computation and Language (cs.CL)
Cite as:	arXiv:1809.00741 [econ.GN]
	(or arXiv:1809.00741v1 [econ.GN] for this version)
	https://doi.org/10.48550/arXiv.1809.00741

Submission history

From: Eitan Sapiro-Gheiler
[v1] Mon, 3 Sep 2018 23:13:00 UTC (592 KB)
