Personalizing web search and crawling from clickstream data
Tutor / director / evaluator: Gavaldà Mestre, Ricard
Document type: Master thesis
Rights access: Open Access
Our aim is to improve web search engines by approaching the search problem from the user's side, taking into account the user's topics of interest and navigation context. In a personalized search engine, two users can get different results for the same query, because the system considers the interests of each user separately. Many sources of information can be used to personalize search: the user's bookmarks, geographical location, navigation history, etc.

Broadly speaking, web search engines have three basic phases: crawling, indexing and searching. The information available about the user's interests can be considered in one or more of these phases, depending on its nature. Existing work on search personalization is reviewed in Chapter 3.

To address the search engine's lack of knowledge about the user and the user's interests, we have developed a system that keeps track of the web pages the user visits (the clickstream). The system analyzes the clickstream and focuses crawling on pages related to the user's topics of interest. Each time the user executes a query, the system considers the current navigation context, and pages related to that context receive better scores. The clickstream also contains navigation patterns. The system extracts these patterns, uses them to predict the pages the user is likely to visit next, and offers navigation tips based on the user's navigation context.
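
The abstract does not say how navigation patterns are extracted from the clickstream. As a rough illustration only, the sketch below assumes one of the simplest possible approaches: counting which page directly follows which (a first-order transition model) and suggesting the most frequent successors of the current page. The function names `build_transitions` and `predict_next` and the sample page names are hypothetical and do not come from the thesis.

```python
from collections import Counter, defaultdict


def build_transitions(clickstream):
    """Count how often each page is directly followed by each other page."""
    transitions = defaultdict(Counter)
    for current_page, next_page in zip(clickstream, clickstream[1:]):
        transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, current_page, k=2):
    """Return up to k pages that most often followed current_page."""
    if current_page not in transitions:
        return []  # no history for this page, so no suggestion
    return [page for page, _ in transitions[current_page].most_common(k)]


# Hypothetical clickstream: an ordered list of visited pages.
clickstream = [
    "home", "news", "sports",
    "home", "news", "weather",
    "home", "news", "sports",
]

transitions = build_transitions(clickstream)
suggestions = predict_next(transitions, "news")
print(suggestions)
```

A model like this only looks at the last visited page. A system that uses a richer navigation context, as the abstract describes, would likely need longer histories or topic information, which this sketch does not attempt.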