Universitat Politècnica de Catalunya

UPCommons. Open-access portal to UPC knowledge

Data locality in Hadoop

116331.pdf (1,111Mb)
Cite as:
hdl:2117/104450

Kaluzka, Justyna
Tutor / director: Romero Moral, Óscar; Jovanovic, Petar
Document type: Master's thesis (Projecte Final de Màster Oficial)
Date: 2016-07
Access conditions: Open access
All rights reserved. This work is protected by intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorization of the rights holder.
Abstract
Current market trends show a need to store and process rapidly growing amounts of data, which in turn creates demand for distributed storage and data processing systems. Apache Hadoop is an open-source framework for managing such computing clusters in an effective, fault-tolerant way. When dealing with large volumes of data, Hadoop and its storage system HDFS (Hadoop Distributed File System) face the challenge of staying efficient and completing computations in a reasonable time. The typical Hadoop implementation transfers computation to the data rather than shipping data across the cluster, because moving large quantities of data through the network could significantly delay data processing tasks. Once a task is running, Hadoop favours local data access and chooses blocks from the nearest nodes; the necessary blocks are then moved only at the moment they are needed by the given task. To support Hadoop's data locality preferences, in this thesis we propose adding a new functionality to its distributed file system (HDFS) that enables moving data blocks on request. Shipping data in advance makes it possible to forcibly redistribute data between nodes so that it is easily adapted to the given processing tasks. The new functionality enables the instructed movement of data blocks within the cluster. Data can be moved either by a user running the appropriate HDFS shell command or programmatically by another module, such as a suitable scheduler. To develop this functionality, a detailed analysis of the Apache Hadoop source code and its components (specifically HDFS) was conducted. This research resulted in a deep understanding of the internal architecture, which made it possible to compare the possible approaches to the desired solution and to develop the chosen one.
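The locality preference described in the abstract can be illustrated with a small toy model. The sketch below is not Hadoop code and does not reproduce the thesis implementation: `Node`, `locality_level`, `pick_replica` and `move_block` are hypothetical names introduced only for illustration, and the three-tier ordering (same node, same rack, different rack) is an assumed simplification of how a reader might rank replicas by network distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical cluster node, identified by host name and rack."""
    name: str
    rack: str


def locality_level(task_node: Node, replica_node: Node) -> int:
    """Return 0 for node-local, 1 for rack-local, 2 for off-rack."""
    if task_node.name == replica_node.name:
        return 0
    if task_node.rack == replica_node.rack:
        return 1
    return 2


def pick_replica(task_node: Node, replicas: list) -> Node:
    """Choose the replica holder closest to the node running the task."""
    return min(replicas, key=lambda r: locality_level(task_node, r))


def move_block(placement: dict, block_id: str, source: Node, target: Node) -> None:
    """Toy on-request move: relocate one replica of a block from source to target."""
    holders = placement[block_id]
    if source not in holders:
        raise ValueError(f"{source.name} holds no replica of {block_id}")
    if target in holders:
        return  # target already has a replica; nothing to do in this toy model
    holders.remove(source)
    holders.append(target)


if __name__ == "__main__":
    n1, n2, n3 = Node("n1", "r1"), Node("n2", "r1"), Node("n3", "r2")
    placement = {"blk_1": [n2, n3]}  # illustrative block-to-node mapping

    # Before the move, a task on n1 can at best read from its own rack.
    print(pick_replica(n1, placement["blk_1"]).name)

    # Shipping the block in advance makes the later read node-local.
    move_block(placement, "blk_1", source=n3, target=n1)
    print(pick_replica(n1, placement["blk_1"]).name)
```

In this model, moving a replica onto the node that will run the task before the task starts turns a rack-local read into a node-local one, which is the effect the proposed on-request block movement aims for.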
Subjects: Management information systems (Sistemes d'informació per a la gestió)
Degree: MÀSTER UNIVERSITARI EN INNOVACIÓ I RECERCA EN INFORMÀTICA (Pla 2012)
URI: http://hdl.handle.net/2117/104450
Collections
  • Màsters oficials - Master in Innovation and Research in Informatics - MIRI [425]

© UPC. Servei de Biblioteques, Publicacions i Arxius

info.biblioteques@upc.edu
