Research on big data has split into two sharply opposed schools, the radicals and the conservatives. Through an analysis of two classic big data cases, this article finds that the term "big data" actually refers to two kinds of objects that are distinct yet related: one is "studying science with the methods of data", and the other is "studying data with the methods of science". The existence of these two kinds of big data, and the significant differences between them, explains why the radical and conservative camps have formed. After summarizing the characteristics of each kind, the article proposes a path toward fundamentally resolving the current antagonism and confusion and toward carrying big data research into deeper territory.