Research Paper

Knowledge graph representation learning method system in the era of artificial intelligence

  • ZHANG Hui ,
  • YANG Weijie ,
  • LIU Wenwen ,
  • ZHANG Xun ,
  • DUAN Dagao ,
  • HAN Zhongming
  • 1. School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China;
    2. School of Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China;
    3. Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing 100048, China;
    4. School of Economics and Management, Beijing Technology and Business University, Beijing 100048, China
ZHANG Hui, master's student; research interest: internet data mining. E-mail: 1930401027@st.btbu.edu.cn

Received date: 2021-03-13

  Revised date: 2021-07-29

  Online published: 2021-12-21

Funding

National Key Research and Development Program of China (2019YFC0507800); Beijing Natural Science Foundation (4172016); General Program of the Scientific Research Project of Beijing Municipal Education Commission (KM201710011006)



Cite this article

ZHANG Hui, YANG Weijie, LIU Wenwen, ZHANG Xun, DUAN Dagao, HAN Zhongming. Knowledge graph representation learning method system in the era of artificial intelligence[J]. Science & Technology Review, 2021, 39(22): 94-110. DOI: 10.3981/j.issn.1000-7857.2021.22.011

Abstract

In recent years, knowledge graph representation learning, which represents the components of a knowledge graph as low-dimensional vector embeddings, has become a mainstream way to combine artificial intelligence with knowledge graphs. This paper reviews the mainstream knowledge graph representation learning methods without auxiliary information, mainly the distance-based and the semantic-matching-based methods, as well as the methods that incorporate textual and category auxiliary information, along with the advantages and disadvantages of each class of methods. It is found that introducing auxiliary information can effectively represent new entities and relations in the knowledge graph, but the time and space costs increase significantly, so the methods without auxiliary information are more easily applied in practical scenarios at this stage. Finally, we show how knowledge graph embeddings can be applied to downstream tasks such as triple classification, link prediction and recommender systems. A collection of datasets and open-source libraries for different tasks is compiled, and a comprehensive outlook on promising research directions such as large-scale and dynamic knowledge graphs is given.
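To make the distance-based family mentioned in the abstract concrete, the following is a minimal illustrative sketch (not the paper's implementation) of a TransE-style scoring function: a triple (head, relation, tail) is considered plausible when head + relation ≈ tail in the embedding space, so a smaller distance means a better fit. The toy entities, relation, and random vectors below are purely hypothetical; in practice the embeddings are learned by minimizing this distance for observed triples.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # toy embedding dimension

# Hypothetical toy vocabulary; real models learn these vectors from data.
entities = {name: rng.normal(size=dim) for name in ["Paris", "France", "Berlin"]}
relations = {"capital_of": rng.normal(size=dim)}

def transe_score(head: str, relation: str, tail: str) -> float:
    """L2 distance ||h + r - t||; lower means a more plausible triple."""
    h, r, t = entities[head], relations[relation], entities[tail]
    return float(np.linalg.norm(h + r - t))

# Link prediction then amounts to ranking candidate tails by this score.
candidates = sorted(entities, key=lambda e: transe_score("Paris", "capital_of", e))
print(candidates[0])  # best-scoring candidate under these random toy vectors
```

The same ranking loop underlies the link-prediction evaluation protocol: corrupt one slot of a test triple, score all candidates, and report the rank of the true entity.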
