## Graph representatives: two different approaches based on the median and the barycenter concepts

dc.contributor | Ferrer Sumsi, Miquel |

dc.contributor | Sanfeliu Cortés, Alberto |

dc.contributor.author | Bardaji Goikoetxea, Itziar |

dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial |

dc.date.accessioned | 2014-01-31T09:32:55Z |

dc.date.issued | 2009-09 |

dc.identifier.uri | http://hdl.handle.net/2099.1/20562 |

dc.description.abstract | The clear advantages of graphs for representation purposes can be lost in some applications because graph domains lack mathematical structure. An illustrative example is the problem of finding a representative of a set of graphs. While in vector spaces it is easy to compute representatives such as medians and means with respect to a wide range of distances, in the graph domain the analogous computation turns out to be a highly non-trivial task. In this work we introduce and compare two different approaches to computing a representative of a given set of graphs, namely the median graph and the barycenter graph. Computing these ideal concepts exactly is prohibitively expensive in practice. Therefore, approximate methods for their computation are presented. The median graph is defined as the graph minimizing the sum of distances (SOD) to all the graphs in a given set. A first option is to restrict the search for the graph with minimum SOD to the original given set, which leads to the so-called set median graph. To overcome this restriction, the minimum can be taken over a more general search space. The resulting graph is called the generalized median graph, which is expected to provide better representatives but is much harder to obtain. The strategy presented in this work for its computation consists of mapping graphs into a vector space and computing the median in this image space. The result is then mapped back to the graph domain to obtain the median graph. Several variations of this standard methodology are presented. In a second approach we present the new concept of the barycenter graph; more concretely, we present the set barycenter graph and the generalized barycenter graph. The barycenter graph is inspired by the concept of barycenter or centroid in vector spaces, which minimizes the sum of the squared distances (SOSD) to the vectors for which we want to find a representative. 
The strategy proposed to compute the generalized barycenter graph can be performed directly in the graph domain, in analogy to the barycenter computation in Euclidean spaces, by iteratively computing the weighted mean of pairs of graphs. Thus, in this case no auxiliary vector space is needed. We present a collection of variants of this algorithm. As in the case of the median graph, we also search for the graph in the given set with minimum SOSD (the set barycenter graph), mainly for evaluation purposes. In order to compare the different methods presented in the work, we perform clustering experiments on one semi-artificial and three real-world datasets. The evaluation is carried out using standard clustering performance indexes. It is worth mentioning that, to our knowledge, no existing algorithms for computing graph representatives can handle the sizes of both graphs and graph sets we work with. The results presented in this work lead to promising conclusions about the proposed techniques. The methods presented to compute the generalized median and barycenter graphs generally outperform the set median and set barycenter graphs, respectively. In the case of the median graph, some previous methods are also outperformed. Finally, we conclude that neither of the two approaches is clearly better than the other; which one performs better depends on the characteristics of the underlying data. In other words, the new concept of barycenter graph is as good a representative as the median graph. |

dc.language.iso | eng |

dc.publisher | Universitat Politècnica de Catalunya |

dc.subject | Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial |

dc.subject.lcsh | Graph theory |

dc.subject.lcsh | Knowledge representation (Information theory) |

dc.title | Graph representatives: two different approaches based on the median and the barycenter concepts |

dc.type | Master thesis |

dc.subject.lemac | Grafs, Teoria de |

dc.subject.lemac | Representació del coneixement (Teoria de la informació) |

dc.rights.access | Restricted access - author's decision |

dc.date.lift | 10000-01-01 |

dc.audience.educationlevel | Màster |

dc.audience.mediator | Escola Tècnica Superior d'Enginyeria Industrial de Barcelona |
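
The set median and set barycenter described in the abstract can be illustrated with a minimal Python sketch. It assumes a precomputed matrix of pairwise distances; the abstract does not say which graph distance the thesis uses, so the toy matrix below is built from points on a line purely as a stand-in. The function names are hypothetical and are not taken from the thesis. The last function shows only the Euclidean analogue of the iterative weighted mean of pairs, not the graph-domain algorithm itself.

```python
from typing import List, Sequence


def set_median_index(dist: Sequence[Sequence[float]]) -> int:
    """Index of the set element with minimum sum of distances (SOD)."""
    sod = [sum(row) for row in dist]
    return min(range(len(sod)), key=sod.__getitem__)


def set_barycenter_index(dist: Sequence[Sequence[float]]) -> int:
    """Index of the set element with minimum sum of squared distances (SOSD)."""
    sosd = [sum(d * d for d in row) for row in dist]
    return min(range(len(sosd)), key=sosd.__getitem__)


def incremental_barycenter(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Euclidean barycenter built by repeated weighted means of pairs.

    After k vectors, the running mean carries weight (k - 1) / k and the
    new vector carries weight 1 / k.
    """
    mean = [float(x) for x in vectors[0]]
    for k, vec in enumerate(vectors[1:], start=2):
        mean = [m + (x - m) / k for m, x in zip(mean, vec)]
    return mean


# Toy stand-in for graph distances (assumption): points on a line,
# with one outlier at 20.
points = [0, 1, 2, 3, 20]
toy_dist = [[abs(a - b) for b in points] for a in points]

median_idx = set_median_index(toy_dist)          # 2 (the point 2, SOD = 22)
barycenter_idx = set_barycenter_index(toy_dist)  # 3 (the point 3, SOSD = 303)
center = incremental_barycenter([[0, 0], [2, 0], [4, 6]])  # [2.0, 2.0]
```

In this toy set the two representatives differ: squaring the distances gives the outlier more influence, so the SOSD minimizer moves toward it while the SOD minimizer does not.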