Working Papers
“Big Data Trap”: An Empirical Analysis of Bias in Social Media Stock Market Big Data and Its Consequences
Title: “Big Data Trap”: Empirical Study of Big Data Bias on Social Media and Economic Consequence
Authors: Wang Jianxin (王建新), Wu Shinong (吴世农), and Peng Diefeng (彭叠峰)
Organization: School of Business, Central South University (中南大学商学院); School of Management, Xiamen University (厦门大学管理学院)
Author Email: jianxin.wang@csu.edu.cn; snwu@xmu.edu.cn; pdf198558@163.com
Key Words: Social Media; Sample Bias of Big Data; Investor Sentiment; Market Manipulation; Return Prediction
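Summary: In recent years, the problems that big data faces in research and applications across society, economics, and management have quickly become a research focus. Nature and Science published special issues in 2008 and 2011, respectively, discussing the characteristics and application prospects of big data. With the development of big data and Web 2.0 technology, the “predictive power” and “bias” of social media data in various fields have drawn wide attention. This paper uses investor-generated data from social media to predict stock returns and thereby tests whether social media data are biased. We collect 649,636 discussion posts from the “Xueqiu” (雪球, “Snowball”) stock forum and construct investor sentiment indicators for an empirical study. The results show that: (1) aggregate investor sentiment has no predictive power for stock market returns, indicating that social media stock market big data are systematically biased; (2) “market manipulators” exist on social media, their sentiment leads the sentiment of other users, and they buy in advance; (3) the sentiment of “market manipulators” is negatively correlated with future stock market returns, indicating that manipulators achieve their profit objective through data manipulation. These empirical results indicate that investors with particular intentions post slanted or even false information on social media, which produces systematic bias in the aggregate data. The noise and bias that may arise while big data is being generated therefore deserve the attention of researchers, regulators, and investors.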
Abstract: Following the publication of special issues on big data in both Nature and Science, big data has attracted tremendous attention in almost every domain. With the rapid evolution of Web 2.0 technology, social media data has also been applied to investment strategies. Whether user-generated data on social networks is biased remains an open question. We test this question by examining whether investor sentiment predicts stock market returns. Measuring investor sentiment with a large sample (649,636) of discussion posts from the “Snowball” internet stock forum, we report three findings. First, aggregate investor sentiment fails to predict stock market returns. Second, market manipulators try to manipulate the data-generating process on social media: a Granger causality test shows that manipulators' sentiment Granger-causes non-manipulators' sentiment. Finally, manipulators' sentiment negatively predicts market returns in the short term, which indicates that manipulators do earn abnormal returns from their manipulation. The empirical results indicate that “pump and dumpers” try to manipulate market prices by posting purposely misleading information, which systematically biases the aggregate data. Big data researchers should be aware of this type of bias formed in the data-generating process.
Article No.: WP1211
Date Posted: 2017-07-28
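
The Granger causality test named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the lag order, the series names `leader` and `follower`, and the synthetic data are all assumptions made for illustration, and the paper's actual specification is not given on this page. The function compares a regression of one series on its own lags with a regression that also includes lags of the other series, and returns the usual F-statistic.

```python
import numpy as np


def _rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def granger_f_stat(cause, effect, lags=1):
    """F-statistic for H0: lags of `cause` do not help predict `effect`
    once the lags of `effect` itself are included."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    total = len(effect)
    n = total - lags
    y = effect[lags:]
    # Column k holds the series shifted back by k periods, aligned with y.
    own = np.column_stack(
        [effect[lags - k: total - k] for k in range(1, lags + 1)]
    )
    other = np.column_stack(
        [cause[lags - k: total - k] for k in range(1, lags + 1)]
    )
    const = np.ones((n, 1))
    X_restricted = np.hstack([const, own])
    X_full = np.hstack([const, own, other])
    rss_r = _rss(y, X_restricted)
    rss_f = _rss(y, X_full)
    df_resid = n - X_full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / df_resid)


# Synthetic illustration (hypothetical data, not from the paper):
# `follower` is built to react to yesterday's `leader` value.
rng = np.random.default_rng(0)
T = 500
leader = rng.normal(size=T)
noise = rng.normal(size=T)
follower = np.zeros(T)
for t in range(1, T):
    follower[t] = 0.8 * leader[t - 1] + 0.5 * noise[t]

f_forward = granger_f_stat(leader, follower)  # leader -> follower
f_reverse = granger_f_stat(follower, leader)  # follower -> leader
print(f_forward, f_reverse)
```

In this constructed example the forward statistic should be far larger than the reverse one, which mirrors the direction of influence the abstract reports between manipulators' and non-manipulators' sentiment. A real analysis would compare the statistic against an F distribution and choose the lag order from the data.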
  • Supervised by: Chinese Academy of Social Sciences     Sponsored by: Institute of Economics, Chinese Academy of Social Sciences
  • Copyright Economic Research Journal (经济研究杂志社); reproduction without permission is prohibited     京ICP备10211437号
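  • International Standard Serial Number: ISSN 0577-9154      Domestic Unified Serial Number: CN11-1081/F       Domestic postal distribution code: 2-251        Overseas distribution code: M16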