Reports de recerca
http://hdl.handle.net/2117/3688
2024-03-29T10:33:24Z

Informàtica bàsica II, ETSEIB. Memòria del curs 1994-1995
http://hdl.handle.net/2117/369147
2022-09-11T11:13:40Z; 2022-06-27T14:52:19Z
Vila Marta, Sebastià; Pla García, Núria; Soto Riera, Antoni; Pérez Vidal, Lluís; Roura Ferret, Salvador; Franquesa Niubó, Marta; Cotrina Navau, Josep; Alquézar Mancho, René; Martínez Parra, Conrado
Course report for the subject Informàtica bàsica II (Basic Computer Science II), academic year 1994-95.

To be or nought to be: una qüestió irrellevant?
http://hdl.handle.net/2117/330206
2020-10-17T02:58:26Z; 2020-10-14T09:43:58Z
Belanche Muñoz, Luis Antonio

About the attribute relevance's nature
http://hdl.handle.net/2117/329527
2020-10-11T10:47:10Z; 2020-09-30T09:33:34Z
Núñez Esquer, Gustavo; Cortés García, Claudio Ulises; Belanche Muñoz, Luis Antonio; Alvarado Mentado, Matías
The notion of the relevance of an attribute is commonly used in machine learning when constructing classification rules in inductive learning processes. This work proposes a formal definition of relevance for a given set of attributes, which includes the special case of non-relevant, or nought, attributes. We establish the theoretical conditions that heuristics for selecting the potentially most useful attribute for a classification must satisfy, and show that some of the problems exhibited by classical heuristics based on information theory are due to their failure to fulfil those conditions. We propose a heuristic that does satisfy them and that does not favour attributes with many values over the rest.
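As a hedged illustration of the kind of problem the abstract alludes to (this is not the heuristic proposed in the report), the sketch below computes plain information gain, a classical information-theoretic heuristic, on a toy dataset invented for this example. An attribute with a distinct value in every row (`row_id`) obtains the maximal gain although it cannot generalize, while a constant attribute obtains zero gain. All names and data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on an attribute."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): 8 examples, 2 balanced classes.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
useful = ["a", "a", "b", "b", "a", "b", "a", "a"]  # informative, 2 values
row_id = list(range(8))                            # unique value per row
constant = ["x"] * 8                               # same value everywhere

gain_useful = information_gain(useful, labels)
gain_row_id = information_gain(row_id, labels)
gain_constant = information_gain(constant, labels)

# The many-valued identifier wins despite being useless for prediction.
print(gain_useful, gain_row_id, gain_constant)
```

Corrections such as the gain ratio were introduced to counter this bias; the conditions formalized in the report are a different, more general treatment.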

Enabling interpretation of the outcome of a human obesity prediction machine learning analysis from genomic data
http://hdl.handle.net/2117/179493
2020-07-23T22:19:25Z; 2020-03-10T11:09:01Z
Bilal, Ahsan; Vellido Alcacena, Alfredo; Ribas Ripoll, Vicent
In this brief paper, we address the medical problem of human obesity prediction from genomic data. Genomic datasets may contain a huge number of features and often have to be analyzed within the realm of Big Data technologies. As a medical problem, obesity prediction would welcome interpretable outcomes. The analyst would therefore benefit from approaches in which the problem of very high data dimensionality is eased as much as possible. Feature selection can be an essential part of such approaches. In this context, though, traditional machine learning methods may struggle. Here, we propose a pipeline that addresses this problem using partitioning strategies: both vertical, by dividing the data based on gender, and horizontal, by splitting each of the analyzed chromosomes into 5,000-instance subsets. For each subset, Minimum Redundancy and Maximum Relevance feature selection is used to rank the single nucleotide polymorphisms most relevant for classification in the medical dataset.
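The ranking step can be sketched as follows. This is a minimal greedy Minimum Redundancy Maximum Relevance loop using the common mutual-information difference criterion on discrete features; it is an assumption about one standard formulation, not the authors' pipeline, and the tiny "SNP" columns and the four-class target are invented purely so the arithmetic is easy to follow.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(features, target, k):
    """Greedily pick k column indices maximizing relevance minus redundancy."""
    relevance = [mutual_information(col, target) for col in features]
    selected = []
    candidates = list(range(len(features)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(features[j], features[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy columns: snp_a_copy duplicates snp_a, snp_b is independent.
snp_a = [0, 0, 1, 1]
snp_a_copy = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
target = [2 * a + b for a, b in zip(snp_a, snp_b)]

ranking = mrmr_rank([snp_a, snp_a_copy, snp_b], target, k=2)
print(ranking)  # the redundant copy is skipped in favour of snp_b
```

In a partitioned setting such as the one described, a function like `mrmr_rank` would be run once per data subset and the resulting rankings compared or merged afterwards.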

Similarity networks for classification: a case study in the Horse Colic problem
http://hdl.handle.net/2117/99450
2020-07-23T22:29:13Z; 2017-01-17T12:32:58Z
Belanche Muñoz, Luis Antonio; Hernández González, Jerónimo
This paper develops a two-layer neural network in which the neuron model computes a user-defined similarity function between inputs and weights. The neuron transfer function is formed by composition of an adapted logistic function with the mean of the partial input-weight similarities. The resulting neuron model is capable of dealing directly with variables of potentially different nature (continuous, fuzzy, ordinal, categorical). There is also provision for missing values. The network is trained using a two-stage procedure very similar to that used to train a radial basis function (RBF) neural network. The network is compared to two types of RBF networks in a non-trivial dataset: the Horse Colic problem, taken as a case study and analyzed in detail.
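A minimal sketch of such a neuron is given below, under several assumptions that the abstract does not confirm: the partial similarity for continuous variables is taken as one minus the range-normalized absolute difference, categorical variables use plain equality, missing inputs (encoded as `None`) are simply left out of the mean, and the "adapted logistic" is a standard logistic centred at 0.5 with a hypothetical `steepness` parameter.

```python
from math import exp


def partial_similarity(x, w, kind, value_range=None):
    """Similarity in [0, 1] between one input value and one weight."""
    if kind == "continuous":
        return max(0.0, 1.0 - abs(x - w) / value_range)
    if kind == "categorical":
        return 1.0 if x == w else 0.0
    raise ValueError(f"unsupported variable kind: {kind}")


def similarity_neuron(inputs, weights, kinds, ranges, steepness=10.0):
    """Logistic of the mean partial similarity, skipping missing inputs."""
    sims = [
        partial_similarity(x, w, kind, rng)
        for x, w, kind, rng in zip(inputs, weights, kinds, ranges)
        if x is not None
    ]
    # Assumption: with every input missing, fall back to a neutral mean.
    mean = sum(sims) / len(sims) if sims else 0.5
    return 1.0 / (1.0 + exp(-steepness * (mean - 0.5)))


kinds = ["continuous", "categorical"]
ranges = [1.0, None]
weights = [0.5, "red"]

full_match = similarity_neuron([0.5, "red"], weights, kinds, ranges)
half_match = similarity_neuron([0.5, "blue"], weights, kinds, ranges)
with_missing = similarity_neuron([0.5, None], weights, kinds, ranges)
print(full_match, half_match, with_missing)
```

Fuzzy and ordinal variables, which the paper also covers, would need their own partial similarity functions and are omitted here.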

Exploiting the accumulated evidence for gene selection in microarray gene expression data
http://hdl.handle.net/2117/99401
2020-07-23T20:25:46Z; 2017-01-17T09:31:03Z
Prat Masramon, Gabriel; Belanche Muñoz, Luis Antonio
Machine Learning methods have of late made significant efforts towards solving multidisciplinary problems in the field of cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in the modeling process, since these tasks are characterized by a large number of features and few observations, making the modeling a non-trivial undertaking. In this particular scenario, it is extremely important to select genes by taking into account the possible interactions with other gene subsets. This paper shows that, by accumulating the evidence in favour of (or against) each gene along the search process, the obtained gene subsets may constitute better solutions, in terms of predictive accuracy, number of genes, or both. The proposed technique is extremely simple and applicable at a negligible overhead in cost.

Similarity and dissimilarity concepts in machine learning
http://hdl.handle.net/2117/97975
2020-07-23T23:34:41Z; 2016-12-12T09:41:25Z
Orozco Luquero, Jorge
Similarity and dissimilarity are rarely formalized concepts in Artificial Intelligence (AI). They have a psychological origin and have been adapted to AI. In this field, however, the choice of a similarity or dissimilarity measure does not always depend on the problem to be solved. In this paper, a formalization of similarity and dissimilarity is presented. Its purpose is to contribute to the design and understanding of similarity and dissimilarity in AI, increasing their general utility. A formal definition and some basic properties are introduced, and some transformation functions and similarity and dissimilarity operators are presented.

Studying embedded human EEG dynamics using generative topographic mapping
http://hdl.handle.net/2117/97971
2020-07-23T23:34:47Z; 2016-12-12T09:34:58Z
Vellido Alcacena, Alfredo; El-Deredy, W.; Lisboa, Paulo J G
A method has recently been proposed [1] to extract multiple signal source information from single-channel electroencephalogram (EEG) recordings. A dynamical systems approach is used to analyze the resulting EEG time series, and its dynamics are captured by transforming the original data into an embedding matrix residing in a Euclidean embedding space. Measurements in [1] are taken to be of ongoing, unbounded EEG recordings. Many experiments concerning the study of cognitive tasks, though, are carried out in a multi-subject repetitive setting where time boundaries are defined in relation to the onset time of certain stimuli. Each repetition of an experiment is known as a trial and, although the experimental setting might lead one to expect little variability amongst responses, in practice inter-trial and inter-subject variability is usually high. Pooling all responses may therefore lead to misleading interpretations. In this paper we resort to the Generative Topographic Mapping (GTM, [2]), a neural-network inspired but statistically principled unsupervised model, to achieve the following goals: first, the definition of groups of trials with intra-group similarities and inter-group differences, in order to improve the interpretability of the results in the aforementioned experimental settings; second, the visualization of embedded EEG dynamics in a 2-dimensional latent space; finally, the study of the trajectories of these EEG dynamics over the GTM latent space representation, showing that transitions and stationary states in these trajectories correspond to special features in the time-power and time-frequency representations of the EEG data.
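To make the notion of an embedding matrix concrete, the sketch below builds one with a standard time-delay embedding of a single-channel series. That [1] uses exactly this construction is an assumption; the function name, the `dim` and `lag` parameters, and the toy signal are all hypothetical.

```python
def delay_embedding(series, dim, lag=1):
    """Return the rows [x[i], x[i+lag], ..., x[i+(dim-1)*lag]] of a
    time-delay embedding matrix built from a one-dimensional series."""
    span = (dim - 1) * lag
    if dim < 1 or lag < 1 or len(series) <= span:
        raise ValueError("series too short for the requested embedding")
    return [
        [series[i + k * lag] for k in range(dim)]
        for i in range(len(series) - span)
    ]


# Toy single-channel signal standing in for an EEG recording.
signal = [1, 2, 3, 4, 5]

embedded = delay_embedding(signal, dim=3)
embedded_lag2 = delay_embedding(signal, dim=2, lag=2)
print(embedded)       # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(embedded_lag2)  # [[1, 3], [2, 4], [3, 5]]
```

Each row is a point in a `dim`-dimensional Euclidean space; in a setting like the one described, such points would be the data that a latent variable model is fitted to.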

Exploring dopamine-mediated reward processing through the analysis of EEG-measured gamma-band brain oscillations
http://hdl.handle.net/2117/97970
2020-07-23T23:34:42Z; 2016-12-12T09:24:53Z
Vellido Alcacena, Alfredo; El-Deredy, W.
The central role of the dopamine system in the brain's reward processing is now quite well established. Its influence on other brain areas involved in learning and decision-making is still a matter of intense research. Most of this research is based on fMRI methods, which excel in terms of spatial resolution for source localization but lack the ability to trace the time course of the signals. Incipient efforts have been made to address this issue from the point of view of EEG-measured brain oscillation theories. We review recent advances in this area and propose a broad framework for EEG-based reward processing analysis.

Generative topographic mapping as a constrained mixture of student t-distributions: theoretical developments
http://hdl.handle.net/2117/97911
2020-07-23T23:34:14Z; 2016-12-09T08:46:55Z
Vellido Alcacena, Alfredo
The Generative Topographic Mapping (GTM: Bishop et al. 1998a), a non-linear latent variable model, was originally defined as a constrained mixture of Gaussians. Gaussian mixture models are known to lack robustness in the presence of outlier observations in the data sample, and multivariate Student t-distributions have recently been put forward as a more robust alternative to deal with continuous data in this context.
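The robustness argument can be illustrated numerically. The sketch below compares univariate log-densities only (the report concerns multivariate mixtures); the degrees of freedom and the outlier position are arbitrary choices made for this example.

```python
from math import lgamma, log, pi


def gaussian_logpdf(x, mean=0.0, std=1.0):
    """Log-density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * log(2 * pi) - log(std) - 0.5 * z * z


def student_t_logpdf(x, dof, mean=0.0, scale=1.0):
    """Log-density of a univariate Student t with `dof` degrees of freedom."""
    z = (x - mean) / scale
    return (
        lgamma((dof + 1) / 2)
        - lgamma(dof / 2)
        - 0.5 * log(dof * pi)
        - log(scale)
        - (dof + 1) / 2 * log(1 + z * z / dof)
    )


outlier = 6.0  # six scale units away from the centre
gauss_at_outlier = gaussian_logpdf(outlier)
t_at_outlier = student_t_logpdf(outlier, dof=3)

# The heavy-tailed t assigns the outlier a far less extreme log-density,
# so a single such point pulls a fitted component much less strongly.
print(gauss_at_outlier, t_at_outlier)
```

Near the centre the ordering is reversed: the Gaussian is slightly more peaked, which is the price the t-distribution pays for its heavier tails.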