As more aspects of social life come to depend on the Internet, the data on it grows ever larger and more complex. Because this information is heterogeneous, dynamic, and distributed, traditional data mining methods can no longer meet practical requirements. This paper proposes an improved iterative algorithm for web data mining that combines the iterative method with a multi-server parallel algorithm. Using this algorithm, we build a web data mining model that supports parallel association rule mining and incorporates the idea of local computation on storage nodes. Experiments show that the model improves the efficiency of web data mining and that its execution efficiency rises as the data volume increases.
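The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea it describes, not the paper's method: frequent itemsets are mined iteratively in the Apriori style, while the counting step runs locally on each data partition and only the small count tables are merged. The partitions stand in for storage nodes, and a thread pool stands in for multiple servers; the function names `local_counts`, `next_candidates`, and `parallel_apriori`, as well as the sample transactions, are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def local_counts(partition, candidates):
    """On one (simulated) storage node, count the local transactions containing each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({item for itemset in frequent for item in itemset})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    }


def parallel_apriori(partitions, min_support):
    """Return {itemset: global count} for itemsets appearing in at least min_support transactions."""
    partitions = [[frozenset(t) for t in p] for p in partitions]
    candidates = {frozenset([item]) for p in partitions for t in p for item in t}
    result = {}
    k = 1
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        while candidates:
            # Each node counts against its own data; only count tables are merged centrally.
            total = Counter()
            for counts in pool.map(local_counts, partitions, [candidates] * len(partitions)):
                total.update(counts)
            frequent = {c: n for c, n in total.items() if n >= min_support}
            if not frequent:
                break
            result.update(frequent)
            k += 1
            candidates = next_candidates(set(frequent), k)
    return result


# Hypothetical sample data: two partitions, five transactions in total.
partitions = [
    [{"a", "b", "c"}, {"a", "b"}],
    [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
]
for itemset, count in sorted(parallel_apriori(partitions, 3).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

Because each round ships only candidate itemsets out and count tables back, the transactions themselves never leave their partition, which is the property the "local computation on storage nodes" idea relies on. Python threads do not give real CPU parallelism here; in a real deployment each partition would be counted by a separate server process.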