Universitat Politècnica de Catalunya

UPCommons. Global access to UPC knowledge

Understanding complex predictive models with ghost variables

Cite as: hdl:2117/383386

Delicado Useros, Pedro Francisco
Peña Sanchez de Rivera, Daniel
Document type: Article
Defense date: 2022-08-24
Publisher: Springer
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution 4.0 International
Projects:
  • ESTRECHANDO LA BRECHA ENTRE LA ESTADISTICA Y LA CIENCIA DE DATOS (AEI-MTM2017-88142-P)
  • ESTADISTICA AVANZADA Y CIENCIA DE DATOS: INTERPRETANDO MODELOS CAJA-NEGRA Y ANALIZANDO CONJUNTOS DE DATOS GRANDES Y COMPLEJOS (AEI-PID2020-116294GB-I00)
Abstract
Framed in the literature on Interpretable Machine Learning, we propose a new procedure to assign a measure of relevance to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check its out-of-sample performance. We propose to measure the individual relevance of each variable by comparing the predictions of the model in the test set with those obtained when the variable of interest is replaced (in the test set) by its ghost variable, defined as the prediction of this variable from the remaining explanatory variables. In linear models we show that, on the one hand, the proposed measure gives results similar to leave-one-covariate-out (loco) at a lower computational cost and outperforms random permutations and, on the other hand, it is strongly related to the usual F-statistic measuring the significance of a variable. In nonlinear predictive models (such as neural networks or random forests) the proposed measure shows the relevance of the variables in an efficient way, as shown by a simulation study comparing ghost variables with alternative methods (including loco and random permutations, as well as knockoff variables and estimated conditional distributions). Finally, we study the joint relevance of the variables by defining the relevance matrix as the covariance matrix of the vectors of effects on predictions when using every ghost variable. Our proposal is illustrated with simulated examples and the analysis of a large real data set.
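The procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`fit_linear`, `ghost_effects`), the simulated data, the use of ordinary least squares both as the predictive model and as the ghost-variable estimator, and the unnormalized mean squared difference as the relevance score are all assumptions made for illustration. The paper may normalize the measure or define the relevance matrix with a different scaling.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept; returns a predict function (illustrative helper)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def ghost_effects(predict, X_test):
    """Column j holds the change in test predictions when variable j is
    replaced by its ghost variable (its prediction from the other columns)."""
    n, p = X_test.shape
    base = predict(X_test)
    effects = np.empty((n, p))
    for j in range(p):
        others = np.delete(X_test, j, axis=1)
        # Assumption: the ghost variable is estimated by a linear fit on the test set.
        ghost_j = fit_linear(others, X_test[:, j])(others)
        X_ghost = X_test.copy()
        X_ghost[:, j] = ghost_j
        effects[:, j] = base - predict(X_ghost)
    return effects

# Simulated data (hypothetical): x1 is strongly correlated with x0 but has no
# effect on y, while x0 and x2 both matter.
rng = np.random.default_rng(0)
n_train, n_test = 400, 200
X = rng.normal(size=(n_train + n_test, 3))
X[:, 1] = 0.9 * X[:, 0] + np.sqrt(0.19) * X[:, 1]
y = 2.0 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(scale=0.5, size=len(X))
X_train, y_train = X[:n_train], y[:n_train]
X_test = X[n_train:]

model = fit_linear(X_train, y_train)      # any fitted predict function would do
effects = ghost_effects(model, X_test)

# Individual relevance: mean squared change in prediction (unnormalized here).
relevance = (effects ** 2).mean(axis=0)
# Joint relevance: covariance matrix of the effect vectors.
relevance_matrix = np.cov(effects, rowvar=False)

print(relevance)
print(relevance_matrix)
```

In this sketch the model is fitted once; only the small ghost-variable regressions are repeated per variable, which is where the computational saving over refitting the full model for each left-out covariate would come from. Replacing `fit_linear` with a nonlinear model's predict function leaves `ghost_effects` unchanged.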
Description
The version of record of this article, first published in Test, is available online at Publisher’s website: http://dx.doi.org/10.1007/s11749-022-00826-x
Citation: Delicado, P.; Peña, D. Understanding complex predictive models with ghost variables. "Test", 24 August 2022, vol. 32, no. 1, pp. 107–145
URI: http://hdl.handle.net/2117/383386
DOI: 10.1007/s11749-022-00826-x
ISSN: 1863-8260
Publisher version: https://link.springer.com/article/10.1007/s11749-022-00826-x
Collections
  • ADBD - Anàlisi de Dades Complexes per a les Decisions Empresarials - Articles de revista [109]
  • Departament d'Estadística i Investigació Operativa - Articles de revista [670]

Files                                       Description                     Size      Format
Relevance_matrix_TEST_authors_version.pdf   Authors' version of the paper   841,5Kb   PDF
Relevance_matrix_TEST_Suppls.pdf            Supplements                     401,6Kb   PDF
