Utilització d'Open Source Software per l'analisi de big data (Use of Open Source Software for Big Data Analysis)

Author's e-mail: amartinezotal@gmail.com, jordi.eetac@gmail.com

Document type: Bachelor thesis
Date: 2016-09-14
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-ShareAlike 3.0 Spain
Abstract
In this project we aim to identify, analyze, and justify the contribution that data can make in large businesses or schools by creating added value from the data we collect. Because Big Data is a relatively new concept, we define it along with its distinguishing attributes, its purpose, and the various traditional data mining methodologies. We also want to highlight how Big Data can become a source of competitive advantage through technologies such as Hadoop and Spark, as an alternative for storing and processing high volumes of data, and through data preprocessing tools, from general-purpose programs such as Microsoft Excel to more specialized preprocessing tools such as WEKA. Since our work experience gives us the opportunity to use a tool closely related to this project, we will also exploit some of its functionality to learn as much as possible about the analysis, monitoring, and further processing of our large data volumes. This work aims to define the scenarios in which Hadoop and Spark can be used instead of the classic storage models, as well as the possible reuse of existing Business Intelligence infrastructure so that Hadoop and Spark serve as a data source beyond the existing EDW. With this research we intend to define the scenarios in which open source software tools for Big Data can be used instead of the classic data storage models, as well as how the infrastructure provided by existing Machine Learning algorithms can be used with those tools. One of our case studies is in the academic field: we gather the "logs" recorded for students during the semester so that we can handle and manage that information more usefully, always with the students in mind, in order to predict future outcomes from their trajectories and give feedback to the teacher or tutor before the end of the course.
Degree: GRAU EN ENGINYERIA TELEMÀTICA (Pla 2009)
Files | Description | Size | Format | View |
---|---|---|---|---|
memoria.pdf | | 5,776Mb | | View/Open |