Review

Research progress and trend of cross-domain face synthesis technology

  • LIU Qi,
  • WU Haozhan,
  • XIE Tianxin,
  • HAN Hu
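  • 1. Department of Network Security, Henan Police College, Zhengzhou 450000, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China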
LIU Qi, associate professor; research interests: information technology and its applications; e-mail: 569797767@qq.com

Received date: 2023-04-23

Revised date: 2023-07-13

Online published: 2023-09-08

Funding

Science and Technology Research Project of the Henan Provincial Department of Science and Technology (222102210089)



Cite this article

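LIU Qi, WU Haozhan, XIE Tianxin, HAN Hu. Research progress and trend of cross-domain face synthesis technology[J]. Science & Technology Review (科技导报), 2023, 41(16): 113-123. DOI: 10.3981/j.issn.1000-7857.2023.16.010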

Abstract

Advances in deep learning and the growth of the digital economy have driven the development of artificial intelligence-generated content (AIGC) technologies such as virtual humans. Cross-domain face synthesis is a key technology in virtual human production, with wide applications in social media, film and television production, and other fields. This paper summarizes the origin of cross-domain face synthesis, its typical task types and difficulties, its technical development and challenges, and its potential applications and problems. It then discusses future trends and challenges from three perspectives: self-supervised and weakly supervised cross-domain synthesis, cross-domain synthesis based on large pre-trained models, and privacy protection based on cross-domain synthesis.
