A searching agent on the web
Document type: Master thesis (pre-Bologna period)
Rights access: Restricted access - confidentiality agreement
The main goal of this project is to develop an agent that learns to search for information related to a query and presents a summary of what it has found. Specifically, the agent receives a domain and a query and builds the summary from pages in that domain. The idea is not new, but our approach has not been explored in depth. Current search engines rely on expensive indexes to guide the search, whereas this project uses an idea that makes it possible to shrink this index exponentially or even remove it. To do so, we use sequential modeling techniques and reinforcement learning to learn to search the Web, and combine them with automatic summarization techniques to present the information to the user. Because this approach is largely unexplored, this is a high-risk research project: it may lead to poor results, or may not be able to explore every possibility in depth, leaving many open questions. Once the agent has learned to search, the user gives it a query and a domain, and the agent starts searching at that point. The advantages of this agent are therefore that it does not need a huge index to work and that changes to the Web do not affect it. In fact, it only needs to store the search model, which is negligible compared with the index of a whole domain. The first task is to provide a theoretical framework in which Web search can be modeled as a reinforcement learning problem. Using this model, we define three different problems: Crawling, Search and Gather. With our main goal in mind, we focus on the last two problems, design solutions for them and evaluate them experimentally. To evaluate these solutions, another important task is to implement independent libraries for reinforcement learning and automatic summarization.
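As a rough illustration of what modeling Web search as a reinforcement learning problem can look like, the sketch below runs tabular Q-learning on a tiny hand-made link graph. It is not the method of the thesis: the pages, the reward (1 when the visited page contains the query term) and all names (`PAGES`, `reward`, `train`, `search`) are assumptions made for this example. The table is also learned for one fixed query, whereas the agent described above is meant to learn a general search model.

```python
import random

# Hypothetical toy domain: each page has some text and outgoing links.
PAGES = {
    "home":     {"text": "welcome", "links": ["about", "research"]},
    "about":    {"text": "contact and history", "links": ["home"]},
    "research": {"text": "projects overview", "links": ["home", "rl"]},
    "rl":       {"text": "reinforcement learning agents", "links": ["research"]},
}


def reward(page, query):
    """Assumed reward: 1 if the page text contains the query term, else 0."""
    return 1.0 if query in PAGES[page]["text"] else 0.0


def train(query, episodes=500, max_steps=6, alpha=0.5, gamma=0.9,
          epsilon=0.3, seed=0):
    """Tabular Q-learning: state = current page, action = link to follow."""
    rng = random.Random(seed)
    q = {(p, l): 0.0 for p, d in PAGES.items() for l in d["links"]}
    for _ in range(episodes):
        page = "home"
        for _ in range(max_steps):
            links = PAGES[page]["links"]
            if rng.random() < epsilon:
                nxt = rng.choice(links)                        # explore
            else:
                nxt = max(links, key=lambda l: q[(page, l)])   # exploit
            r = reward(nxt, query)
            done = r > 0  # the episode ends once a relevant page is reached
            best_next = 0.0 if done else max(
                q[(nxt, l)] for l in PAGES[nxt]["links"])
            # Standard Q-learning update.
            q[(page, nxt)] += alpha * (r + gamma * best_next - q[(page, nxt)])
            page = nxt
            if done:
                break
    return q


def search(q, query, start="home", max_steps=6):
    """Follow the greedy policy from `start`; return the visited pages."""
    path, page = [start], start
    for _ in range(max_steps):
        if reward(page, query) > 0:
            break
        page = max(PAGES[page]["links"], key=lambda l: q[(page, l)])
        path.append(page)
    return path


if __name__ == "__main__":
    q_table = train("reinforcement")
    print(search(q_table, "reinforcement"))
```

In this toy setting the only thing stored after training is the table `q`, with one value per link. That mirrors, on a very small scale, the idea of keeping a search model instead of an index of the domain.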
Project carried out within the framework of a mobility programme with the Université Pierre et Marie Curie in Paris