Learning Probabilistic Finite State Automata For Opponent Modelling

Cebrián Chuliá, Toni

Tesis-Cebrian.pdf (960,1Kb)

Tutor / director: Alquézar Mancho, René; Sanfeliu Cortés, Alberto

Document type: Official Master's Final Project

Date: 2011-01

Access conditions: Open access

Attribution-NonCommercial-NoDerivs 3.0 Spain

Except where otherwise noted, the contents of this work are licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain

Abstract

Artificial Intelligence (AI) is the branch of Computer Science that tries to imbue software systems with intelligent behaviour. In the early years of the field, those systems were limited to big computing units on which researchers built expert systems that exhibited some kind of intelligence. But with the advent of different kinds of networks, the most prominent of which is the Internet, the field became interested in Distributed Artificial Intelligence (DAI) as the natural next step. The field thus moved from monolithic software architectures for its AI systems to architectures where several pieces of software tried to solve a problem or had interests of their own. Those pieces of software were called Agents, and the architectures that allowed multiple agents to interoperate were called Multi-Agent Systems (MAS). The agent acts as a metaphor for software systems that are embodied in a given environment and that behave or react intelligently to events in that environment. The AI mainstream was initially interested in systems that could be taught to behave depending on the inputs perceived. However, this quickly proved ineffective because the human expert acted as the knowledge bottleneck for distilling useful and efficient rules. That was the best case; in the worst case, enumerating the rules was difficult or simply not affordable. This sparked interest in another subfield, Machine Learning, and in its counterpart in a MAS, Distributed Machine Learning. If you cannot code all the scenario combinations, code within the agent the rules that allow it to learn from the environment and the actions performed. With this framework in mind, the applications are endless. Agents can be used to trade bonds or financial derivatives without human intervention, or they can be embedded in robotic hardware and learn unseen map configurations in remote locations such as distant planets.
Agents are not restricted to interacting with humans or the environment; they can also interact with other agents. For instance, agents can negotiate the quality of service of a channel before establishing communication, or they can share information about the environment in a cooperative setting such as a team of robot soccer players. But some shortcomings emerge in a MAS architecture. The one relevant to this thesis is that partitioning the task at hand among agents usually means that each agent has less memory or computing power. It is not economically feasible to replicate the big computing unit on each separate agent in the system. Thus we should think of our agents as computationally bounded, that is, they have a limited amount of computing power with which to learn from the environment. This has serious implications for the algorithms commonly used for learning in these settings. The classical approach to learning in a MAS is to use some variation of a Reinforcement Learning (RL) algorithm [BT96, SB98]. The main idea behind those algorithms is that the agent maintains a table with the perceived value of each action/state pair and, through multiple iterations, obtains a set of decision rules that allows it to take the best action for a given environment. This approach has several flaws when the current action depends on a single observation seen in the past (for instance, a warning sign perceived by a robot). Several techniques have been proposed to alleviate those shortcomings. For instance, to avoid the combinatorial explosion of states and actions, an approximating function such as a neural network can be used instead of a table storing the value of each pair. And for events in the past, we can extend the state definition of the environment by creating dummy states that correspond to the N-tuple (state_N, state_{N-1}, ..., state_{N-t}).
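The table of action/state values and the history-extended state described above can be illustrated with a small sketch. It is not taken from the thesis: it uses a plain tabular Q-learning update as a stand-in for "some variation" of RL, and the warning-sign task, the action names, the rewards, and the hyperparameters (alpha, gamma, the number of episodes) are all assumptions chosen for illustration. The agent sees a sign ("warn" or "clear"), then a "junction" observation where it must decide; the table key is the tuple of the last t + 1 observations.

```python
import random
from collections import defaultdict, deque

ACTIONS = ("go", "stop")  # hypothetical action set for the toy task


def augmented_state(history):
    """Build the table key (state_N, state_{N-1}, ..., state_{N-t}), newest first."""
    return tuple(reversed(history))


def q_update(q, state, action, reward, next_state=None, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; next_state=None marks the end of an episode."""
    best_next = 0.0 if next_state is None else max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def train(t, episodes=2000, seed=0):
    """Learn the value table when the state keeps the last t + 1 observations."""
    rng = random.Random(seed)
    q = defaultdict(float)  # value of each (state, action) pair, 0.0 by default
    for _ in range(episodes):
        sign = rng.choice(("warn", "clear"))  # seen one step before the decision
        history = deque([sign, "junction"], maxlen=t + 1)  # keeps only the newest t + 1
        state = augmented_state(history)
        action = rng.choice(ACTIONS)  # uniform exploration keeps the sketch short
        correct = "stop" if sign == "warn" else "go"
        reward = 1.0 if action == correct else -1.0
        q_update(q, state, action, reward)  # single-decision episode, so terminal
    return q


def greedy(q, state):
    """Pick the action with the highest table value in the given state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q = train(t=1)
    print(greedy(q, ("junction", "warn")), greedy(q, ("junction", "clear")))
```

With t = 0 both cases collapse into the single state ("junction",), so the table cannot separate them; with t = 1 the warning sign becomes part of the key and the two situations get distinct entries. The price is that the table grows with the number of distinct tuples, which is exactly the kind of cost that matters for a computationally bounded agent.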

Subjects: Multiagent systems (Computer science), Multi-Agent systems

Degree: University Master's Degree in Artificial Intelligence (2009 curriculum)

URI: http://hdl.handle.net/2099.1/11152

Collections

Official Master's degrees - Master in Artificial Intelligence - MAI (Pla 2006) [73]

Files	Description	Size	Format	View
Tesis-Cebrian.pdf		960,1Kb	PDF	View/Open

UPCommons. UPC open knowledge portal
