dc.contributor.author: Font Valverde, Martí
dc.contributor.author: Puig Oriol, Xavier
dc.contributor.author: Ginebra Molins, Josep
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa
dc.date.accessioned: 2012-01-25T12:11:32Z
dc.date.available: 2012-01-25T12:11:32Z
dc.date.created: 2011
dc.date.issued: 2011
dc.identifier.citation: Font, M.; Puig, X.; Ginebra, J. Bayesian analysis of frequency count data. "Journal of statistical computation and simulation", 2011, p. 1-18.
dc.identifier.issn: 0094-9655
dc.identifier.uri: http://hdl.handle.net/2117/14798
dc.description.abstract: The zero truncated inverse Gaussian–Poisson model, obtained by first mixing the Poisson model assuming its expected value has an inverse Gaussian distribution and then truncating the model at zero, is very useful when modelling frequency count data. A Bayesian analysis based on this statistical model is implemented on the word frequency counts of various texts, and its validity is checked by exploring the posterior distribution of the Pearson errors and by implementing posterior predictive consistency checks. The analysis based on this model is useful because it allows one to use the posterior distribution of the model mixing density as an approximation of the posterior distribution of the density of the word frequencies of the vocabulary of the author, which is useful to characterize the style of that author. The posterior distribution of the expectation and of measures of the variability of that mixing distribution can be used to assess the size and diversity of his vocabulary. An alternative analysis is proposed based on the inverse Gaussian-zero truncated Poisson mixture model, which is obtained by switching the order of the mixing and the truncation stages. Even though this second model fits some of the word frequency data sets more accurately than the first model, in practice the analysis based on it is not as useful because it does not allow one to estimate the word frequency distribution of the vocabulary.
dc.format.extent: 18 p.
dc.language.iso: eng
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: Àrees temàtiques de la UPC::Matemàtiques i estadística::Estadística aplicada
dc.subject.lcsh: Bayesian statistical decision theory
dc.title: Bayesian analysis of frequency count data
dc.type: Article
dc.subject.lemac: Anàlisi de dades
dc.subject.lemac: Vocabulari -- Models estadístics
dc.contributor.group: Universitat Politècnica de Catalunya. GRESA - Grup de recerca en estadística aplicada
dc.identifier.doi: 10.1080/00949655.2011.600311
dc.rights.access: Restricted access - publisher's policy
local.identifier.drac: 8957862
dc.description.version: Postprint (published version)
local.citation.author: Font, M.; Puig, X.; Ginebra, J.
local.citation.publicationName: Journal of statistical computation and simulation
local.citation.startingPage: 1
local.citation.endingPage: 18



Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain