dc.contributor.author: Balle Pigem, Borja de
dc.contributor.author: Castro Rabal, Jorge
dc.contributor.author: Gavaldà Mestre, Ricard
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
dc.date.accessioned: 2013-01-21T09:08:53Z
dc.date.available: 2013-01-21T09:08:53Z
dc.date.created: 2012
dc.date.issued: 2012
dc.identifier.citation: Balle, B.; Castro, J.; Gavaldà, R. Bootstrapping and learning PDFA in data streams. In: International Colloquium on Grammatical Inference. "Proceedings of the Eleventh International Conference on Grammatical Inference". Washington: 2012, p. 34-48.
dc.identifier.isbn: 1533-7928
dc.identifier.uri: http://hdl.handle.net/2117/17434
dc.description: Best Student Paper ICGI 2012
dc.description.abstract: Markovian models with hidden state are widely used formalisms for modeling sequential phenomena. Learnability of these models has been well studied when the sample is given in batch mode, and algorithms with PAC-like learning guarantees exist for specific classes of models such as Probabilistic Deterministic Finite Automata (PDFA). Here we focus on PDFA and give an algorithm for inferring models in this class under the stringent data stream scenario: unlike existing methods, our algorithm works incrementally and in one pass, uses memory sublinear in the stream length, and processes input items in amortized constant time. We provide rigorous PAC-like bounds for all of the above, as well as an evaluation on synthetic data showing that the algorithm performs well in practice. Our algorithm makes key use of several old and new sketching techniques. In particular, we develop a new sketch for implementing bootstrapping in a streaming setting, which may be of independent interest. In experiments, we have observed that this sketch yields important reductions in the examples required for performing some crucial statistical tests in our algorithm.
dc.format.extent: 15 p.
dc.language.iso: eng
dc.subject: UPC subject areas::Computer science::Information systems
dc.subject.lcsh: Probabilistic deterministic finite automata
dc.subject.lcsh: Data streams
dc.subject.lcsh: Bootstrapping
dc.title: Bootstrapping and learning PDFA in data streams
dc.type: Conference report
dc.subject.lemac: Finite automata
dc.contributor.group: Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
dc.description.peerreviewed: Peer Reviewed
dc.description.awardwinning: Award-winning
dc.relation.publisherversion: http://jmlr.csail.mit.edu/proceedings/papers/v21/balle12a/balle12a.pdf
dc.rights.access: Open Access
local.identifier.drac: 11055435
dc.description.version: Postprint (published version)
local.citation.author: Balle, B.; Castro, J.; Gavaldà, R.
local.citation.contributor: International Colloquium on Grammatical Inference
local.citation.pubplace: Washington
local.citation.publicationName: Proceedings of the Eleventh International Conference on Grammatical Inference
local.citation.startingPage: 34
local.citation.endingPage: 48