Análisis de datos de Twitter mediante servicios en la nube
memoria.pdf (2,285Mb) (Restricted access)
Tutor / director / evaluator: Royo Vallés, María Dolores
Document type: Bachelor thesis
Rights access: Restricted access - author's decision
Twitter Analytics through cloud computing services

This project aims to deliver a complete analytical process that transforms large volumes of Twitter data into valuable, ready-to-use information. Nowadays, many social networks allow their platform data to be collected for use by third-party services. I chose Twitter because I consider it one of the most popular networks worldwide and more informative in nature than its competitors. One of the main uses of Twitter is posting daily news, which is significant because it leads people to share their opinions and start discussions on any topic. As a result, there is considerable potential for analysing social trends and public opinion, and the results will be much stronger than if a more personal social network (Facebook, Instagram...) had been used.

This type of analysis may grow exponentially, and the technologies associated with these processes have to be flexible and scalable enough not to limit users in their analysis. For this reason, I used cloud services and Big Data technologies to address and prevent this type of issue.

To build this analytical process I split the project into different parts, each with its own goals and features. The following points introduce each of these parts and summarize their content:
Data Collection: In this part of the process I built the logic to collect all the desired Twitter information. The main objective of the data collection phase was to build a methodology that allows dynamic customization of the users, trends and keywords to be downloaded from Twitter. To collect the data I used the Twitter API and its documentation for Java to establish connectivity between my code and the different APIs offered by Twitter (users, streaming API...).
Data Processing and Load: In this part I used Apache Beam and Google Cloud Platform to build the pipelines needed to load data into Google BigQuery (a cloud database). The objective was to build both streaming and batch loads to ingest data from Twitter.
Data Wrangling & Transforming: In this stage I used BigQuery features to model the raw data and optimise its usage by third-party services.
Data Visualization: Finally, I built an example visualization tool using .NET. The goal of this part was to create a user-friendly and intuitive interface that allows end users to perform their analysis in an easy and understandable way.

The objective of this project was to build a Twitter data analysis process that could be integrated into any enterprise environment and that offers advantages over the tools currently on the market. The product has many good features, but one of its main strengths is its potential for upgrades towards a complete product capable of multiple kinds of analysis, such as forecasting, competitor analysis and audits of one's own Twitter accounts.