Browsing Interest Prediction Based on a Hidden Markov Model

孙秀娟;金民锁;陈孝国

Science & Technology Review (科技导报), 2009, Vol. 27, Issue (0918): 75-77.

Research Article

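Author information
1. Department of Mathematics and Mechanics, Heilongjiang Institute of Science and Technology; 2. Information Network Center, Heilongjiang Institute of Science and Technology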

Prediction Based on Hidden Markov Model

Abstract

The amount of information on the Web is growing at an astonishing rate, and there is an urgent need for tools that can automatically discover, extract, and filter information from it: how to find the required content among hundreds of millions of pages, and how to uncover inherent patterns and associations in large volumes of access data. Web browsing prediction based on the Markov model starts only from the pages the user has browsed and predicts the user's next link; it cannot capture the user's real interest. This paper proposes web browsing path prediction based on the hidden Markov model (HMM) and compares it with the Markov-model-based approach. The user's category is first determined from the known browsing sequence. When the browsing sequence is very short, the prediction accuracy of the proposed method is lower than that of the Markov model, because a short sequence gives the system little information to judge from and so increases the likelihood of misclassifying the user. As the browsing sequence grows longer, the system captures more and more of the user's browsing information, which reflects the user's interests, and the prediction accuracy rises steadily. When the sequence length is greater than or equal to 8, the prediction accuracy reaches 80%, an improvement in the accuracy of browsing interest prediction.
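The abstract does not give the model's parameters or training procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy HMM in which hidden states stand for user interest classes and observations are visited pages; the state names, page names, and all probabilities are made-up illustrative values. The forward algorithm estimates the current interest from the pages seen so far, and the next page is predicted from that estimate.

```python
# Toy HMM: every name and number below is a made-up illustrative value,
# not taken from the paper.
STATES = ["sports", "tech"]            # hypothetical hidden interest classes
PAGES = ["home", "scores", "gadgets"]  # hypothetical observable pages

START = [0.5, 0.5]                     # P(initial interest)
TRANS = [[0.9, 0.1],                   # P(next interest | current interest)
         [0.1, 0.9]]
EMIT = [[0.3, 0.6, 0.1],               # P(page | interest = sports)
        [0.3, 0.1, 0.6]]               # P(page | interest = tech)


def normalise(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def step(belief):
    """Push a belief over interests one step through the transition matrix."""
    n = len(STATES)
    return [sum(belief[i] * TRANS[i][j] for i in range(n)) for j in range(n)]


def filter_interest(sequence):
    """Normalised forward algorithm: P(current interest | pages seen so far)."""
    obs = [PAGES.index(page) for page in sequence]
    belief = normalise([START[s] * EMIT[s][obs[0]] for s in range(len(STATES))])
    for o in obs[1:]:
        predicted = step(belief)
        belief = normalise([predicted[s] * EMIT[s][o] for s in range(len(STATES))])
    return belief


def predict_next_page(sequence):
    """Return the most likely next page and the full next-page distribution."""
    next_state = step(filter_interest(sequence))
    page_probs = [sum(next_state[s] * EMIT[s][k] for s in range(len(STATES)))
                  for k in range(len(PAGES))]
    best = max(range(len(PAGES)), key=lambda k: page_probs[k])
    return PAGES[best], page_probs
```

In this toy setup a single observed page such as "home" leaves both interests equally likely, while a few more pages concentrate the belief on one interest, which mirrors the abstract's observation that short sequences give the system too little information to classify the user reliably.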

Key words

Markov model / browsing prediction / web usage mining / clustering

Cite this article

孙秀娟;金民锁;陈孝国. Prediction Based on Hidden Markov Model[J]. Science & Technology Review, 2009, 27(0918): 75-77