Abstract
XML (eXtensible Markup Language) is a standard, entirely user-driven language for storing and transferring information. XML frequent itemsets are usually mined from XML transactional databases in order to derive XML association rules. With the use of a threshold value, these frequent itemsets help researchers find interesting XML patterns in large databases. The Apriori algorithm is one of the leading solutions for discovering XML frequent itemsets based on a support value. XML frequent itemsets consist of similar items that show evidence of association. This relationship can be captured with a Bayesian network by learning the structure of the XML frequent itemsets, and the K2 algorithm is used to learn that structure. In this work, we propose a novel Apriori-K2 algorithm. It combines the Apriori and K2 algorithms in a novel way to find XML frequent itemsets and to learn a level-wise Bayesian network structure. To learn each level of this structure, the Apriori algorithm selects XML frequent itemsets from the XML candidate itemsets using the support measure. During the execution of Apriori, an updated binary table is prepared from the XML frequent itemsets. The K2 algorithm is used in conjunction with Apriori to learn the Bayesian network structure of the XML large frequent itemsets and to find their relationships at each level. We have extensively tested our solution on UCI machine learning datasets and measured its performance. The results show that our proposed solution performs better than the combined performance of the Apriori and K2 algorithms.
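As an illustration of the level-wise search described above, the following is a minimal Python sketch of standard Apriori frequent-itemset mining, together with a 0/1 table of the kind a structure learner such as K2 could take as input. It is not the proposed Apriori-K2 algorithm: the K2 step is omitted, the transactions are assumed to be already extracted from XML as non-empty lists of item labels, and the names `apriori_levels` and `binary_table` are hypothetical.

```python
from itertools import combinations


def apriori_levels(transactions, min_support):
    """Return one dict per level; level k maps each frequent k-itemset
    (a frozenset) to its support, i.e. the fraction of transactions containing it.
    Assumes `transactions` is a non-empty list of item collections."""
    tx_sets = [frozenset(t) for t in transactions]
    n = len(tx_sets)

    def support(itemset):
        # Fraction of transactions that contain every item of `itemset`.
        return sum(itemset <= t for t in tx_sets) / n

    # Level 1: frequent single items.
    items = sorted({item for t in tx_sets for item in t})
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    levels = []
    k = 1
    while current:
        levels.append(current)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Support counting: keep candidates that reach the threshold.
        nxt = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                nxt[c] = s
        current = nxt
    return levels


def binary_table(transactions, items):
    """One row per transaction and one 0/1 column per item in `items`
    (hypothetical layout; the paper's exact table format is not given here)."""
    return [[1 if item in t else 0 for item in items] for t in transactions]
```

Calling `apriori_levels(transactions, 0.6)` returns the frequent itemsets grouped by size, and `binary_table` can then be rebuilt at each level from the items that survive, which is one plausible reading of the "updated binary table" mentioned above.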