One of the core applications of machine learning to knowledge
discovery consists of building a function (a hypothesis) from a given
set of data (for instance, a decision tree or a neural network) that
can afterwards be used to predict new instances of the data. In this
paper, we focus on a particular situation where we assume that the
hypothesis we want to use for prediction is very simple, and thus, the
hypothesis class is of feasible size. We study the problem of
determining which of the hypotheses in the class is almost the best
one. We present two on-line sampling algorithms for selecting
hypotheses, give theoretical bounds on the number of necessary
examples, and analyze them experimentally. We compare them with the
simple batch sampling approach commonly used and show that in most
situations our algorithms use far fewer examples.
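As a rough illustration of the setting only (not of the two algorithms analyzed in the paper), the following Python sketch contrasts the batch approach, which fixes the sample size in advance with a Hoeffding-style bound over the whole class, with a sequential rule that draws examples one at a time, discards hypotheses that are provably not the best, and stops adaptively. The function names, the particular confidence radius, and the elimination rule are assumptions made for this example.

    import math

    def batch_sample_size(n_hypotheses, epsilon, delta):
        # Hoeffding bound plus a union bound over the class: with this many
        # examples, every empirical error is within epsilon/2 of its true
        # error with probability at least 1 - delta.
        return math.ceil((2.0 / epsilon ** 2) * math.log(2.0 * n_hypotheses / delta))

    def select_batch(hypotheses, draw_example, epsilon, delta):
        # Batch approach: fix the sample size in advance, then return the
        # hypothesis with the smallest empirical error on that sample.
        m = batch_sample_size(len(hypotheses), epsilon, delta)
        sample = [draw_example() for _ in range(m)]
        errors = [sum(h(x) != y for x, y in sample) / m for h in hypotheses]
        best_i = min(range(len(hypotheses)), key=errors.__getitem__)
        return hypotheses[best_i], m

    def select_online(hypotheses, draw_example, epsilon, delta, max_examples=10 ** 6):
        # Sequential sketch: draw examples one at a time, drop hypotheses
        # that are provably worse than the current best, and stop as soon
        # as the empirically best survivor is guaranteed (up to the
        # confidence radius) to be within epsilon of the best in the class.
        k = len(hypotheses)
        active = list(range(k))
        mistakes = [0] * k
        best_i, t = 0, 0
        for t in range(1, max_examples + 1):
            x, y = draw_example()
            for i in active:
                mistakes[i] += (hypotheses[i](x) != y)
            # Confidence radius valid simultaneously for all hypotheses and
            # all stopping times t (union bound with a 1/t^2 allocation).
            radius = math.sqrt(math.log(4.0 * k * t * t / delta) / (2.0 * t))
            errs = {i: mistakes[i] / t for i in active}
            best_i = min(active, key=errs.get)
            best = errs[best_i]
            # Drop hypotheses whose lower confidence bound already exceeds
            # the best hypothesis's upper confidence bound.
            active = [i for i in active if errs[i] - radius <= best + radius]
            if len(active) == 1 or 2.0 * radius <= epsilon:
                break
        return hypotheses[best_i], t

Here the hypotheses are assumed to be callables mapping an instance x to a predicted label, and draw_example returns a labeled pair (x, y) from the underlying distribution. The batch routine always pays the worst-case sample size, while the sequential routine can stop much earlier when the error gaps between hypotheses are large, which is the kind of saving the abstract refers to.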
Citation: Domingo, C., Gavaldà, R., Watanabe, O. "Practical algorithms for on-line sampling". 1998.