This paper addresses the problem
of extracting the maximum synchronization-free parallelism
that may be present in loops.
In order to reduce communication and synchronization overheads, some
parallelizing compilers try to identify independent
computational partitions of a sequential program,
if there are any. We focus on the case of loops with
constant dependence distance vectors. We consider a
statement instance as a basic unit that can be allocated to a
processor, in contrast to other methods that use an iteration instance.
We show that a previously proposed family of scheduling heuristics
(Graph Traversal Scheduling) is optimal
in the sense that no more parallelism can be expressed
with synchronization-free code. Furthermore,
we give a quasi-linear time algorithm to find
such an optimal Graph Traversal Scheduling.
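
To make the notion of independent partitions concrete, the following is a minimal brute-force sketch and not the Graph Traversal Scheduling or the quasi-linear algorithm of the paper. It assumes a one-dimensional loop whose dependences are given as hypothetical triples (source statement, destination statement, constant distance), enumerates every statement instance, and groups instances that are connected by a dependence using union-find. Each resulting group can, under these assumptions, run on its own processor without synchronizing with the others. The function name and input encoding are illustrative only.

```python
from collections import defaultdict


def independent_partitions(num_statements, num_iterations, dependences):
    """Group statement instances (s, i) into dependence-connected sets.

    dependences: list of (src_stmt, dst_stmt, distance), meaning that
    instance (dst_stmt, i + distance) depends on instance (src_stmt, i).
    This encoding is an assumption made for the sketch.
    """
    # Every statement instance starts in its own set.
    parent = {(s, i): (s, i)
              for s in range(num_statements)
              for i in range(num_iterations)}

    def find(x):
        # Find the set representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Merge the two endpoints of every dependence inside the loop bounds.
    for src, dst, dist in dependences:
        for i in range(num_iterations):
            j = i + dist
            if 0 <= j < num_iterations:
                union((src, i), (dst, j))

    groups = defaultdict(list)
    for node in parent:
        groups[find(node)].append(node)
    return sorted(sorted(g) for g in groups.values())


# Two statements S0 and S1 over 6 iterations, with dependences
# S0 -> S0 (distance 2), S1 -> S1 (distance 2), S0 -> S1 (distance 1).
parts = independent_partitions(2, 6, [(0, 0, 2), (1, 1, 2), (0, 1, 1)])
print(len(parts))  # number of synchronization-free partitions
```

In this example the statement instances split into two independent sets, whereas treating a whole iteration as the unit would tie iteration i to iteration i + 1 through the distance-1 dependence and leave a single partition. This is one way to see why allocating statement instances rather than iteration instances can expose more parallelism.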
Citation: Gavaldà, R., Ayguade, E., Torres, J. "Obtaining synchronization-free code with maximum parallelism". 1996.