On-line sampling methods for discovering association rules
Cite as:
hdl:2117/91378
Document type: Research report
Publication date: 1999-02
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication, or transformation without the authorization of the rights holder is prohibited.
Abstract
Association rule discovery is one of the prototypical problems in
data mining. In this problem, the input database is assumed to be very
large and most of the algorithms are designed to minimize the number
of scans of the database. Enumerating association rules is usually an
expensive task due to the size of the input database. A proposed
approach for reducing the running time of this process is random
sampling. Of course, any implementation of
an algorithm that uses sampling must solve the problem of determining
which sample size is appropriate. Previous research on sampling for
association rule mining has addressed this problem and concluded that,
in general, the theoretically obtained sample size bounds are far from
what is observed in practice. In this paper, we try to reduce this
gap between theory and practice. We propose two on-line sampling
algorithms for association rule mining. Our algorithms maintain the
same theoretical guarantees of previous approaches while using a much
smaller number of transactions in most cases. In the experiments
we report, this improvement is often by an order of magnitude.
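
To make the gap between a fixed theoretical sample size and an on-line stopping rule concrete, here is a minimal Python sketch. It is not the algorithm proposed in the report. It assumes a generic Hoeffding bound for the fixed sample size and a simple union-bound sequential test for deciding whether an itemset's support lies above or below a threshold. The names `hoeffding_sample_size`, `sequential_frequency_test`, `draw`, `min_support`, and `max_samples` are hypothetical and chosen only for illustration.

```python
import math


def hoeffding_sample_size(epsilon: float, delta: float) -> int:
    """Fixed sample size from the Hoeffding bound (illustrative assumption).

    With this many independent draws, an empirical frequency is within
    +/- epsilon of the true frequency with probability at least 1 - delta.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def sequential_frequency_test(draw, min_support, delta, max_samples):
    """Decide on-line whether an itemset's support is above min_support.

    `draw()` is a hypothetical callback that samples one random transaction
    and returns True if it contains the itemset. Sampling stops as soon as
    the running estimate is separated from the threshold by more than the
    current confidence radius. Returns (decision, samples_used), where
    decision is None if max_samples is reached without a conclusion.
    """
    hits = 0
    for t in range(1, max_samples + 1):
        if draw():
            hits += 1
        # Spend delta / (t * (t + 1)) of the error budget at step t;
        # these terms sum to at most delta over all steps.
        delta_t = delta / (t * (t + 1))
        radius = math.sqrt(math.log(2.0 / delta_t) / (2.0 * t))
        estimate = hits / t
        if abs(estimate - min_support) > radius:
            return estimate >= min_support, t
    return None, max_samples
```

Under these assumptions, the sequential test stops early when the true support is far from `min_support` and approaches the cost of the fixed bound only when the support is close to the threshold, which is the general intuition behind using fewer transactions than a worst-case bound prescribes.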
Citation: Domingo, C., Gavaldà, R., Watanabe, O. "On-line sampling methods for discovering association rules". 1999.
Part of: LSI-99-4-R
Files | Description | Size | Format | View |
---|---|---|---|---|
R99-4(1).pdf | | 226,5Kb | | View/Open |