dc.contributor.author: Brumar, Iulian
dc.contributor.author: Casas, Marc
dc.contributor.author: Moretó Planas, Miquel
dc.contributor.author: Valero Cortés, Mateo
dc.contributor.author: Sohi, Gurindar S.
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2017-09-15T07:21:17Z
dc.date.available: 2017-09-15T07:21:17Z
dc.date.issued: 2017
dc.identifier.citation: Brumar, I., Casas, M., Moreto, M., Valero, M., Sohi, G. ATM: approximate task memoization in the runtime system. In: IEEE International Parallel and Distributed Processing Symposium. "2017 IEEE 31st International Parallel and Distributed Processing Symposium: 29 May–2 June 2017, Orlando, Florida: proceedings". Orlando, Florida: Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 1140-1150.
dc.identifier.isbn: 978-1-5386-3914-6
dc.identifier.uri: http://hdl.handle.net/2117/107646
dc.description.abstract: Redundant computations appear during the execution of real programs. Multiple factors contribute to these unnecessary computations, such as repetitive inputs and patterns, calling functions with the same parameters, or bad programming habits. Compilers minimize non-useful code with static analysis. However, redundant execution might be dynamic and there are no current approaches to reduce these inefficiencies. Additionally, many algorithms can be computed with different levels of accuracy. Approximate computing exploits this fact to reduce execution time at the cost of slightly less accurate results. In this case, expert developers determine the desired tradeoff between performance and accuracy for each application. In this paper, we present Approximate Task Memoization (ATM), a novel approach in the runtime system that transparently exploits both dynamic redundancy and approximation at the task granularity of a parallel application. Memoization of previous task executions allows predicting the results of future tasks without having to execute them and without losing accuracy. To further increase performance improvements, the runtime system can memoize similar tasks, which leads to task approximate computing. By defining how to measure task similarity and correctness, we present an adaptive algorithm in the runtime system that automatically decides if task approximation is beneficial or not. When evaluated on a real 8-core processor with applications from different domains (financial analysis, stencil computation, machine learning, and linear algebra), ATM achieves a 1.4x average speedup when only applying memoization techniques. When adding task approximation, ATM achieves a 2.5x average speedup with an average 0.7% accuracy loss (maximum of 3.2%).
dc.description.sponsorship: This work has been supported by the RoMoL ERC Advanced Grant (GA 321253), by the Spanish Government (grant SEV2015-0493 of the Severo Ochoa Program), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272) and the European HiPEAC Network of Excellence. M. Moretó has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047. M. Casas is supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the 7th R&D Framework Programme of the European Union (Contract 2013 BP B 00243). I. Brumar has been partially supported by the Spanish Ministry of Education, Culture and Sports under grant FPU2015/12849.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh: Approximation theory
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: Runtime
dc.subject.other: Programming
dc.subject.other: Approximate computing
dc.subject.other: History
dc.subject.other: Redundancy
dc.subject.other: Data structures
dc.subject.other: Parallel processing
dc.title: ATM: approximate task memoization in the runtime system
dc.type: Conference report
dc.subject.lemac: Aproximació, Teoria de l'
dc.subject.lemac: Processament en paral·lel (Ordinadors)
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/IPDPS.2017.49
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/abstract/document/7967204/
dc.rights.access: Open Access
local.identifier.drac: 21185609
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
dc.relation.projectid: info:eu-repo/grantAgreement/EC/FP7/321253/EU/Riding on Moore's Law/ROMOL
local.citation.author: Brumar, I.; Casas, M.; Moreto, M.; Valero, M.; Sohi, G.
local.citation.contributor: IEEE International Parallel and Distributed Processing Symposium
local.citation.pubplace: Orlando, Florida
local.citation.publicationName: 2017 IEEE 31st International Parallel and Distributed Processing Symposium: 29 May–2 June 2017, Orlando, Florida: proceedings
local.citation.startingPage: 1140
local.citation.endingPage: 1150

