Analysis of the use of obfuscated web tracking
Tutor / director: Barlet Ros, Pere
Document type: Master thesis
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder
In recent years, web tracking has become a fast-growing phenomenon. Profiling users to deliver targeted advertising is a business that involves hundreds of companies and billions of dollars. At the same time, communities, researchers and other companies are building countermeasures against tracking practices, so tracking techniques are becoming more sophisticated and better hidden. The goal of this work is to uncover the obfuscation that is becoming common in web tracking methods, in particular in a popular tracking method called canvas fingerprinting. The proposed approach could also be applied to other tracking techniques in the future. Our tests also look for web tracking methods located not on home pages but on the pages they link to (sub links), in order to determine whether there is a substantial difference. We crawled more than 830K links found on the home pages of the 5K most visited websites according to Alexa's ranking. Our tool uncovered the real calls to the canvas fingerprinting method toDataURL(), so that web trackers cannot hide them. The results showed that 12% of the analyzed domains use plain-text canvas fingerprinting methods on the home page, 1.2% use obfuscation, and 86.8% are canvas-free. When we analyzed the sub links, however, the percentages rose to 30.5% for plain-text canvas fingerprinting and 10.5% for the obfuscated variant, while only 59% of the domains were canvas-free. In addition, we uncovered 2695 trackers, and the 3 most popular alone covered more than 20% of the visited domains. Finally, we analyzed the files from which the tracking method was called and found that the same tracking code is used across many different domains; the most widespread was tanxssp.js, present in 71 different domains.
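The observation that an obfuscated script must still invoke the real toDataURL() at run time can be illustrated with a minimal sketch. This is not the tool developed in the thesis, whose implementation is not described here; it is a Python analogy of run-time method instrumentation. The names FakeCanvas, instrument and recorded_calls are hypothetical, and FakeCanvas merely stands in for a browser canvas element.

```python
import functools


class FakeCanvas:
    """Hypothetical stand-in for a browser canvas element (assumption)."""

    def toDataURL(self, mime="image/png"):
        # Placeholder payload; a real canvas would return encoded pixel data.
        return f"data:{mime};base64,AAAA"


# Every intercepted call is appended here as (method name, positional args).
recorded_calls = []


def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that logs each call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        recorded_calls.append((method_name, args))
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)


instrument(FakeCanvas, "toDataURL")

# Plain-text call: the method name appears literally in the source.
plain_result = FakeCanvas().toDataURL()

# "Obfuscated" call: the name is assembled at run time, so a text search
# of the source for "toDataURL" would not find this call site.
hidden_name = "".join(["to", "Data", "URL"])
obfuscated_result = getattr(FakeCanvas(), hidden_name)("image/jpeg")
```

Both calls end up in recorded_calls, because the wrapper sits on the method itself rather than relying on how the caller spells its name. A static search for the string "toDataURL" would only find the first call.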
Degree: MÀSTER UNIVERSITARI EN INNOVACIÓ I RECERCA EN INFORMÀTICA (Pla 2012)