Special topic: Theory and applications of cyberspace geography

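A fast representation learning model for large-scale cybersecurity knowledge graphs

  • HAN Zhongming,
  • XIONG Zhibing,
  • CHEN Fuyu,
  • YANG Weijie,
  • ZHANG Xun
  • 1. School of Economics and Management, Beijing Technology and Business University, Beijing 100048, China
    2. Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing 100048, China
    3. School of Computer Science, Beijing Technology and Business University, Beijing 100048, China
    4. School of Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China
HAN Zhongming, professor; research interest: Internet data mining; e-mail: hanzm@th.btbu.edu.cn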

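Received date: 2023-02-23

Revised date: 2023-05-12

Online published: 2023-08-11

Funding

National Key Research and Development Program of China (2019YFC0507800); Beijing Natural Science Foundation (4172016)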



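Cite this article

HAN Zhongming, XIONG Zhibing, CHEN Fuyu, YANG Weijie, ZHANG Xun. A fast representation learning model for large-scale cybersecurity knowledge graphs[J]. 科技导报 (Science & Technology Review), 2023, 41(13): 23-31. DOI: 10.3981/j.issn.1000-7857.2023.13.003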

Abstract

To address the slow training speed of representation learning on large-scale cybersecurity knowledge graphs and the insufficient representation of relations between head and tail entities, this paper proposes a fast training model based on random walks. The model first obtains initial representations of the entities in the whole knowledge graph through random walks along relation paths. It then introduces a subject-object embedding, which combines a relation-specific subject embedding with a relation-specific object embedding to learn the syntactic meaning of the relations in the knowledge graph. Finally, random walks along relation paths are applied again to assist fast training of the knowledge graph. Extensive experiments on several datasets, with comparisons against several existing models, show that the proposed model shortens training time by one third and improves representation quality by about 3%, so that it accelerates knowledge graph representation learning while also improving its results.
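The abstract does not spell out how the random walks are generated. As a minimal sketch, assuming a walk simply follows outgoing edges chosen uniformly at random and records each relation it traverses, the idea of a random walk along relation paths might look like the following. The function names, the toy triples, and the uniform sampling rule are all hypothetical illustrations and are not taken from the paper.

```python
import random
from collections import defaultdict


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
    return adj


def relation_path_walk(adj, start, walk_length, rng):
    """Return an alternating entity/relation sequence starting at `start`.

    At most `walk_length` edges are followed; the walk stops early when it
    reaches an entity that has no outgoing edge.
    """
    walk = [start]
    current = start
    for _ in range(walk_length):
        edges = adj.get(current)
        if not edges:
            break
        rel, nxt = rng.choice(edges)  # assumption: uniform edge sampling
        walk.extend([rel, nxt])
        current = nxt
    return walk


# Toy cybersecurity triples (hypothetical, for illustration only).
TRIPLES = [
    ("malware_A", "exploits", "vuln_1"),
    ("vuln_1", "affects", "product_X"),
    ("product_X", "made_by", "vendor_Y"),
    ("malware_A", "uses", "technique_T"),
]

adj = build_adjacency(TRIPLES)
rng = random.Random(0)
# One walk of up to three edges per head entity occurrence.
walks = [relation_path_walk(adj, head, 3, rng) for head, _, _ in TRIPLES]
```

Sequences of this kind could then serve as training contexts for an initial entity representation, for instance with a skip-gram style objective; that pairing is an assumption here, since the abstract only states that random walks provide the preliminary representations and later assist fast training.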
