dc.contributor.author: Delicado Useros, Pedro Francisco
dc.contributor.author: Peña Sanchez de Rivera, Daniel
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa
dc.date.accessioned: 2023-02-15T13:48:58Z
dc.date.available: 2023-02-15T13:48:58Z
dc.date.issued: 2022-08-24
dc.identifier.citation: Delicado, P.; Peña, D. Understanding complex predictive models with ghost variables. "Test", 24 August 2022, vol. 32; no. 1; p. 107–145
dc.identifier.issn: 1863-8260
dc.identifier.uri: http://hdl.handle.net/2117/383386
dc.description: The version of record of this article, first published in Test, is available online at Publisher’s website: http://dx.doi.org/10.1007/s11749-022-00826-x
dc.description.abstract: Framed in the literature on Interpretable Machine Learning, we propose a new procedure to assign a measure of relevance to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check its out-of-sample performance. We propose to measure the individual relevance of each variable by comparing the predictions of the model in the test set with those obtained when the variable of interest is substituted (in the test set) by its ghost variable, defined as the prediction of this variable by using the rest of the explanatory variables. In linear models it is shown that, on the one hand, the proposed measure gives similar results to leave-one-covariate-out (loco), at a lower computational cost, and outperforms random permutations, and on the other hand, it is strongly related to the usual F-statistic measuring the significance of a variable. In nonlinear predictive models (such as neural networks or random forests) the proposed measure shows the relevance of the variables in an efficient way, as shown by a simulation study comparing ghost variables with other alternative methods (including loco and random permutations, and also knockoff variables and estimated conditional distributions). Finally, we study the joint relevance of the variables by defining the relevance matrix as the covariance matrix of the vectors of effects on predictions when using every ghost variable. Our proposal is illustrated with simulated examples and the analysis of a large real data set.
dc.language.iso: eng
dc.publisher: Springer
dc.rights: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: UPC subject areas::Mathematics and statistics::Mathematical analysis
dc.subject.lcsh: Mathematical statistics
dc.subject.other: Explainable artificial intelligence
dc.subject.other: Estimated conditional distributions
dc.subject.other: Interpretable machine learning
dc.subject.other: Knockoffs
dc.subject.other: Leave-one-covariate-out
dc.subject.other: Out-of-sample prediction
dc.subject.other: Partial correlation matrix
dc.subject.other: Random permutations
dc.title: Understanding complex predictive models with ghost variables
dc.type: Article
dc.subject.lemac: Mathematical statistics
dc.contributor.group: Universitat Politècnica de Catalunya. ADBD - Anàlisi de Dades Complexes per a les Decisions Empresarials
dc.identifier.doi: 10.1007/s11749-022-00826-x
dc.description.peerreviewed: Peer Reviewed
dc.subject.ams: AMS classification::62 Statistics::62G Nonparametric inference
dc.subject.ams: AMS classification::68 Computer science::68T Artificial intelligence
dc.relation.publisherversion: https://link.springer.com/article/10.1007/s11749-022-00826-x
dc.rights.access: Open Access
local.identifier.drac: 34221675
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/MTM2017-88142-P/ES/ESTRECHANDO LA BRECHA ENTRE LA ESTADISTICA Y LA CIENCIA DE DATOS/
dc.relation.projectid: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-116294GB-I00/ES/ESTADISTICA AVANZADA Y CIENCIA DE DATOS: INTERPRETANDO MODELOS CAJA-NEGRA Y ANALIZANDO CONJUNTOS DE DATOS GRANDES Y COMPLEJOS/
local.citation.author: Delicado, P.; Peña, D.
local.citation.publicationName: Test
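
The procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a plain least-squares model both for the predictive model and for each ghost variable, fits the ghost regressions on the test set, and scores relevance as the unnormalized mean squared change in test-set predictions. All three choices, the helper names (`fit_linear`, `predict_linear`, `ghost_relevance`) and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predictions of a model returned by fit_linear."""
    return coef[0] + X @ coef[1:]

def ghost_relevance(coef, X_test):
    """Mean squared change in test predictions when each column is
    replaced by its ghost (its prediction from the other columns)."""
    base = predict_linear(coef, X_test)
    p = X_test.shape[1]
    rel = np.empty(p)
    for j in range(p):
        others = np.delete(X_test, j, axis=1)
        # Assumption: the ghost regression is linear and fitted on the test set.
        ghost_coef = fit_linear(others, X_test[:, j])
        X_ghost = X_test.copy()
        X_ghost[:, j] = predict_linear(ghost_coef, others)
        rel[j] = np.mean((base - predict_linear(coef, X_ghost)) ** 2)
    return rel

# Hypothetical simulated data: x2 is nearly a copy of x1 and has no
# direct effect on y; x3 is independent of the other two.
rng = np.random.default_rng(0)
n = 400
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 1.0 * x3 + 0.5 * rng.normal(size=n)

X_train, X_test = X[:300], X[300:]
coef = fit_linear(X_train, y[:300])
relevance = ghost_relevance(coef, X_test)
print(relevance)
```

In this toy setting the independent variable x3 should come out as the most relevant, because its ghost cannot reproduce it from the other columns, whereas x1 and x2 largely predict each other and so replacing either one changes the predictions very little.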

