Title | “Big Data Trap”: Empirical Study of Big Data Bias on Social Media and Economic Consequence |
Original title | “大数据陷阱”:社交媒体股市大数据偏差及其后果的实证分析 |
Authors | Wang Jianxin (王建新), Wu Shinong (吴世农), and Peng Diefeng (彭叠峰) |
Affiliations | School of Business, Central South University (中南大学商学院); School of Management, Xiamen University (厦门大学管理学院) |
Author Email | jianxin.wang@csu.edu.cn;snwu@xmu.edu.cn;pdf198558@163.com |
Keywords | Social Media; Sample Bias of Big Data; Investor Sentiment; Market Manipulation; Return Prediction |
Abstract | The problems that big data faces in research and applications across social, economic, and management fields have quickly become a research focus; Nature and Science devoted special issues to the characteristics and prospects of big data in 2008 and 2011, respectively. With the development of big data and Web 2.0 technology, both the “predictive power” and the “bias” of social media data have drawn wide attention, and social media data has also been applied to investment strategies. Whether user-generated data on social networks is biased remains an open question. We test it by examining whether investor sentiment on social media predicts stock returns. Using 649,636 discussion posts from the “Snowball” (雪球) stock forum to construct investor sentiment indicators, we find: (1) aggregate investor sentiment fails to predict stock market returns, which indicates that stock-market big data from social media is systematically biased; (2) “market manipulators” are present on social media: they buy in advance, and a Granger causality test shows that manipulators' sentiment Granger-causes non-manipulators' sentiment; (3) manipulators' sentiment negatively predicts market returns in the short term, which indicates that manipulators profit from manipulating the data. These results indicate that investors with particular intentions (“pump and dumpers”) post slanted or even false information on social media to influence market prices, which systematically biases the aggregate data. Researchers, regulators, and investors should therefore be aware of the noise and bias that can arise while big data is being generated. |
Article No. | WP1211 |
Date posted | 2017-07-28 |
|