Comparative of resampling methods for predictive modeling in social networks
Document type: Master thesis (pre-Bologna period)
Rights access: Open Access
[ENGLISH] The aim of this project is to give some insight into the issue of applying resampling methods to correlated data sets for predictive modeling, specifically social networks. These resampling methods were built on the principle of independence between samples, a principle that is virtually never satisfied in relational data. This project constructs a probabilistic network model, referred to as the ground truth, and observes the behavior and performance of a simple prediction rule in conjunction with the cross-validation and bootstrap resampling methods. The project also addresses the question of whether or not to maintain the correlation in the attribute values of the nodes present in the original data when a specific resample, whether for training or testing, is drawn. We call the process of eliminating this correlation reconstruction: it essentially consists of rebuilding the network from the extracted resample and recomputing the nodes’ attributes, erasing the influence of the nodes that are not present in the set. The results show a thorough comparison of the different resampling methodologies and also a strong trade-off in the estimates depending on whether reconstruction is applied.
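The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual model: the relational attribute (the fraction of a node's neighbors with label 1), the function names, and the toy graph are all assumptions chosen for illustration. Without reconstruction, a sampled node keeps the attribute value computed on the full network; with reconstruction, the attribute is recomputed using only neighbors that are also in the resample.

```python
def neighbor_label_fraction(node, edges, labels, allowed=None):
    """Hypothetical relational attribute: fraction of neighbors with label 1.

    If `allowed` is given, only neighbors inside that set are counted,
    which removes the influence of nodes outside the resample.
    """
    neighbors = [v for v in edges[node] if allowed is None or v in allowed]
    if not neighbors:
        return 0.0  # assumption: isolated nodes get attribute 0
    return sum(labels[v] for v in neighbors) / len(neighbors)


def resample_attributes(edges, labels, sample, reconstruct):
    """Compute the attribute for each sampled node.

    reconstruct=False keeps the values derived from the full network;
    reconstruct=True rebuilds the network on the sample before computing.
    """
    allowed = set(sample) if reconstruct else None
    return {n: neighbor_label_fraction(n, edges, labels, allowed) for n in sample}


# Toy undirected network (adjacency lists) with binary node labels.
edges = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
labels = {0: 1, 1: 0, 2: 1, 3: 0}

sample = [0, 1]  # a resample that leaves out nodes 2 and 3
kept = resample_attributes(edges, labels, sample, reconstruct=False)
rebuilt = resample_attributes(edges, labels, sample, reconstruct=True)
print(kept)     # attributes still influenced by node 2
print(rebuilt)  # attributes computed from the sampled nodes only
```

In this toy case the attribute of node 0 changes once node 2 is excluded, which is the kind of difference between the two settings that the comparison in the abstract refers to.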
Project carried out within the framework of a mobility programme with the Illinois Institute of Technology in Chicago