Architectural support for high-performing hardware transactional memory systems
Cite as:
hdl:2117/94562
Chair / Department / Institute
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Document type: Doctoral thesis
Defense date: 2011-12-23
Publisher: Universitat Politècnica de Catalunya
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Abstract
Parallel programming presents an efficient solution to exploit future multicore processors.
Unfortunately, traditional programming models depend on the programmer’s skill in synchronizing
concurrent threads, which makes the development of parallel software a hard and error-prone
task. In addition to this, current synchronization techniques serialize the execution of
those critical sections that conflict in shared memory and thus limit the scalability of multithreaded
applications.
Transactional Memory (TM) has emerged as a promising programming model that solves
the trade-off between high performance and ease of use. In TM, the system is in charge of
scheduling transactions (atomic blocks of instructions) and guaranteeing that they are executed
in isolation, which simplifies writing parallel code and, at the same time, enables high concurrency
when atomic regions access different data. Among all forms of TM environments,
Hardware TM (HTM) is the only one that offers fast execution, at the cost of adding
dedicated logic to the processor.
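The claim that atomic regions run concurrently when they access different data can be illustrated with a small sketch. It is not taken from the thesis: the `Transaction` record and the `conflicts` helper are hypothetical names, and the read/write-set intersection rule is the textbook definition of a transactional conflict, assumed here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Hypothetical record of the addresses one atomic block touches."""
    name: str
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def conflicts(a: Transaction, b: Transaction) -> bool:
    """Two transactions conflict if one writes data the other reads or writes."""
    return bool(
        a.write_set & (b.read_set | b.write_set)
        or b.write_set & a.read_set
    )


# Disjoint data: both atomic blocks may run concurrently.
t1 = Transaction("t1", read_set={"x"}, write_set={"x"})
t2 = Transaction("t2", read_set={"y"}, write_set={"y"})
# Shared data: t3 writes "x", which t1 also accesses.
t3 = Transaction("t3", read_set={"x"}, write_set={"x"})
```

Under this rule `conflicts(t1, t2)` is false and `conflicts(t1, t3)` is true, so only the second pair would need to be ordered by the TM system.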
Existing HTM systems suffer considerable delays when they execute complex transactional
workloads, especially when they deal with large and contending transactions, because they lack
adaptability. Furthermore, most HTM implementations are ad hoc and require cumbersome
hardware structures to be effective, which complicates the feasibility of the design. This thesis
makes several contributions in the design and analysis of low-cost HTM systems that yield good
performance for any kind of TM program.
Our first contribution, FASTM, introduces a novel mechanism to elegantly manage speculative
(and already validated) versions of transactional data by slightly modifying the on-chip memory
engine. This approach permits fast recovery when a transaction that fits in private caches is discarded.
At the same time, it keeps non-speculative values in software, which allows in-place
memory updates. Thus, FASTM neither suffers from capacity issues nor slows down when it has to
undo transactional modifications.
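The interplay between cache-resident speculative values and software-held old values can be pictured with a toy software model. This is only a sketch under stated assumptions, not the FASTM hardware design: `ToyVersionManager`, its methods, the dictionary standing in for a private cache, and the list standing in for a software log are all hypothetical.

```python
class ToyVersionManager:
    """Toy model: new values live in a bounded 'cache', old values in a log."""

    def __init__(self, memory, cache_lines):
        self.memory = memory            # committed (non-speculative) state
        self.cache_lines = cache_lines  # assumed capacity of the private cache
        self.cache = {}                 # speculative values that fit in the cache
        self.undo_log = []              # (address, old value) for overflowed writes

    def write(self, addr, value):
        if addr in self.cache or len(self.cache) < self.cache_lines:
            self.cache[addr] = value    # fits: memory stays untouched
        else:
            self.undo_log.append((addr, self.memory[addr]))
            self.memory[addr] = value   # overflow: update memory in place

    def read(self, addr):
        return self.cache.get(addr, self.memory[addr])

    def abort(self):
        self.cache.clear()              # fast path: drop speculative lines
        while self.undo_log:            # slow path: restore logged old values
            addr, old = self.undo_log.pop()
            self.memory[addr] = old

    def commit(self):
        self.memory.update(self.cache)  # publish the cached speculative values
        self.cache.clear()
        self.undo_log.clear()
```

In this model a transaction that fits in the cache aborts by clearing a dictionary, while one that overflows still completes by updating memory in place and unrolling the log only if it is discarded.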
Our second contribution includes two different HTM systems that integrate deferred resolution
of conflicts in a conventional multicore processor, which reduces the complexity of the
system with respect to previous proposals. The first one, FUSETM, combines different-mode
transactions under a unified infrastructure to gracefully handle resource overflow. As a result,
FUSETM brings fast transactional computation without requiring additional hardware or extra
communication at the end of speculative execution. The second one, SPECTM, introduces a
two-level data versioning mechanism to resolve conflicts in a speculative fashion even in the
case of overflow.
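Deferred resolution of conflicts means that conflicting accesses are tolerated during execution and settled only when a transaction tries to commit. The single-threaded sketch below shows that idea in software. It is an illustrative assumption, not the FUSETM or SPECTM mechanism: the `LazyTx` class, the per-address version counters, and the write buffer are hypothetical.

```python
class LazyTx:
    """Toy transaction: buffers writes and validates its reads at commit time."""

    def __init__(self, store, versions):
        self.store = store        # shared committed data
        self.versions = versions  # address -> number of committed writes so far
        self.reads = {}           # address -> version observed at first read
        self.writes = {}          # address -> buffered speculative value

    def read(self, addr):
        if addr in self.writes:
            return self.writes[addr]
        self.reads.setdefault(addr, self.versions.get(addr, 0))
        return self.store[addr]

    def write(self, addr, value):
        self.writes[addr] = value

    def commit(self):
        # Conflicts are resolved only now, not when the accesses happened.
        for addr, seen in self.reads.items():
            if self.versions.get(addr, 0) != seen:
                return False      # another writer committed first: abort
        for addr, value in self.writes.items():
            self.store[addr] = value
            self.versions[addr] = self.versions.get(addr, 0) + 1
        return True


store, versions = {"x": 0}, {}
a, b = LazyTx(store, versions), LazyTx(store, versions)
a.write("x", a.read("x") + 1)
b.write("x", b.read("x") + 1)
first, second = a.commit(), b.commit()
```

Both transactions run to completion without interfering. The first to commit wins, and the second detects at commit time that the value it read is stale, so it must be re-executed.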
Our third and last contribution presents a couple of truly flexible HTM systems that can
dynamically adapt their underlying mechanisms according to the characteristics of the program.
DYNTM records statistics of previously executed transactions to select the best-suited strategy
each time a new instance of a transaction starts. SWAPTM takes a different approach: it tracks
information about the current transactional instance to change its priority level at runtime. Both
alternatives outperform existing proposals that employ fixed transactional
policies, especially in applications with phase changes.
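A history-based selection of the kind described for DYNTM can be sketched as a small table of per-transaction statistics. Everything here is an assumption made for illustration: the abstract does not name the strategies or the selection rule, so the mode names `"eager"` and `"lazy"`, the `ToyModeSelector` class, and the lowest-abort-ratio heuristic are hypothetical.

```python
from collections import defaultdict

MODES = ("eager", "lazy")  # assumed strategy names, for illustration only


class ToyModeSelector:
    """Picks a mode per transaction site from past commit/abort counts."""

    def __init__(self):
        # site -> mode -> [commits, aborts]
        self.stats = defaultdict(lambda: {m: [0, 0] for m in MODES})

    def record(self, site, mode, committed):
        """Store the outcome of one finished transaction instance."""
        self.stats[site][mode][0 if committed else 1] += 1

    def choose(self, site):
        """Return the mode with the lowest observed abort ratio for this site."""
        def abort_ratio(mode):
            commits, aborts = self.stats[site][mode]
            total = commits + aborts
            return aborts / total if total else 0.0  # untried modes get a chance
        return min(MODES, key=abort_ratio)


selector = ToyModeSelector()
selector.record("tx_A", "eager", committed=False)
selector.record("tx_A", "eager", committed=False)
selector.record("tx_A", "eager", committed=True)
mode_for_next_instance = selector.choose("tx_A")
```

Because the choice is made each time a new instance starts, a site whose behavior changes between program phases can move to a different mode once its recent history shifts.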
Citation: Lupon Navazo, M. Architectural support for high-performing hardware transactional memory systems. Doctoral thesis, UPC, Departament d'Arquitectura de Computadors, 2011. DOI 10.5821/dissertation-2117-94562. Available at: <http://hdl.handle.net/2117/94562>
DLB. 15953-2012
Files | Description | Size | Format | View |
---|---|---|---|---|
TMLN1de1.pdf | | 5,357Mb | | View/Open |