Frequent sets, sequences and taxonomies: new efficient algorithmic proposals
Document type: External research report
Rights access: Open Access
We describe efficient algorithmic proposals for three fundamental problems in data mining: association rules, episodes in sequences, and generalized association rules over hierarchical taxonomies. The association rule discovery problem consists of identifying the frequent itemsets in a database and then forming conditional implication rules among them. For this task, we introduce a new algorithmic proposal that substantially reduces the number of processed transactions. The resulting algorithm, called Ready-and-Go, is used to discover frequent sets efficiently. Then, for the discovery of patterns in sequences of events in ordered collections of data, we propose to apply the appropriate variant of that algorithm, and we additionally introduce a new framework for formalizing the concept of interesting episodes. Finally, we adapt our algorithm to the generalization of the frequent sets problem in which the data is organized in taxonomic hierarchies, and here we additionally contribute a new heuristic that, under certain natural conditions, improves performance.
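To make the two-step structure of association rule discovery concrete (first find frequent itemsets, then form implication rules among them), here is a minimal Python sketch. It is a plain level-wise, Apriori-style baseline and is not the Ready-and-Go algorithm, whose details the abstract does not give. The function names `frequent_itemsets` and `rules`, the absolute support threshold `min_support`, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search (illustrative baseline, NOT Ready-and-Go).

    Returns a dict mapping each frequent itemset (frozenset) to the
    number of transactions that contain it.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]  # size-1 candidates
    frequent = {}
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Build candidates one item larger, keeping only those whose
        # immediate subsets are all frequent at this level.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                frozenset(s) in level for s in combinations(union, len(a))
            ):
                candidates.add(union)
        current = list(candidates)
    return frequent


def rules(frequent, min_confidence):
    """Form rules lhs -> rhs from frequent itemsets.

    confidence(lhs -> rhs) = support(lhs ∪ rhs) / support(lhs)
    """
    found = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    fs = frequent_itemsets(db, min_support=3)
    for lhs, rhs, conf in rules(fs, min_confidence=0.75):
        print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Note that this baseline rescans every transaction at every level; the abstract states that reducing the number of processed transactions is precisely what the proposed algorithm targets.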
Citation: Baixeries, J.; Casas, G.; Balcazar, J. "Frequent sets, sequences and taxonomies: new efficient algorithmic proposals". 2000.
Is part of: LSI-00-78-R