Abstract: As the volume of data generated by human activity grows, the scale, variety, and demands of data visualization have expanded greatly, and data visualization faces many challenges in the big data era. Based on the characteristics and requirements of big data and the current state of data visualization research, this paper reviews the common data visualization techniques. It highlights eight important challenges that data visualization must address in big data applications. AutoVis, a data-aware interactive visualization design platform, is discussed in detail, together with its applications.