Big data plays an indispensable role not only across a wide range of scientific fields and engineering applications, but also in the cognitive mechanisms of human sensation, quantification, qualification, estimation, measurement, memory, and reasoning. This paper reviews basic studies on the theoretical foundations of big data science and presents a coherent set of general principles and analytic methodologies for big data systems. The cognitive foundations of big data are explored in order to formally explain its origin and nature. A set of mathematical models is created to rigorously elicit the general essences and patterns of big data across pervasive domains in science, engineering, and society. A significant finding is that big data is no longer a collection of pure numbers over the traditional real domain, but an unprecedented mathematical structure called a recursively typed hyperstructure (RTHS). The fundamental topological properties of big data systems reveal the inherent complexity of big data engineering and the new cognitive and theoretical challenges of manipulating and processing big data, together with a set of denotational mathematical solutions for dealing with them.