Exclusive: Development Trends of Web 3.0

Metaverse terminal: Key hard technology development trends in virtual (augmented) reality

  • WANG Lijun
  • LI Zhengping
  • LI Ying
  • HOU Yaohui
  • WANG Jingliang
  • WANG Chuang
  • XU Zhiping
  • JIA Kehao
  • LIU Yuning
  • MA Weizhi
  • School of Information Science and Technology, North China University of Technology, Beijing 100144, China

Received date: 2023-02-22

Revised date: 2023-04-26

  Online published: 2023-08-30

Abstract

With advances in information technology, the metaverse is increasingly described as the third internet revolution, and the metaverse terminal is one of its essential components. This article reviews the key technologies and development trends of metaverse terminal hardware in three areas: micro displays, optical systems, and perceptual interaction. For micro displays, it focuses on silicon-based OLED and MicroLED. For optical systems, it covers surface relief grating waveguides, volume holographic waveguides, and metasurface waveguides. For perceptual interaction, it focuses on gesture recognition and EEG recognition.

Cite this article

WANG Lijun, LI Zhengping, LI Ying, HOU Yaohui, WANG Jingliang, WANG Chuang, XU Zhiping, JIA Kehao, LIU Yuning, MA Weizhi. Metaverse terminal: Key hard technology development trends in virtual (augmented) reality[J]. Science & Technology Review, 2023, 41(15): 46-60. DOI: 10.3981/j.issn.1000-7857.2023.15.005
