Contribution to conference proceedings
A Malleable and Fault-Tolerant Task Pool Framework for X10



Publication details
Authors:
Bungart, M.; Fohry, C.
Editor:
IEEE
Publisher:
Curran Associates
Place of publication:
Red Hook, New York
Year of publication:
2017
Page range:
749-757
Book title:
2017 IEEE International Conference on Cluster Computing (CLUSTER)
ISBN:
978-1-5386-2326-8

Abstract
Current HPC environments require parallel programs that are both malleable and fault-tolerant. Malleability denotes the ability to embrace system-initiated resource changes, and fault tolerance denotes the ability to cope with, e.g., permanent node failures. This paper considers the task pool pattern, specifically its lifeline-based variant. It builds on a previous fault-tolerant realization and integrates the ability to add resources. We suggest a growth protocol that is able to cope with failures of old and new resources during its execution. New resources replace failed ones in one data structure, while getting a new role in another. The algorithm was implemented in a framework for the programming language X10. Correctness tests and performance measurements used the Unbalanced Tree Search (UTS) benchmark. We compared the performance on a constant vs. a varying number of workers with the same average, and observed negligible differences in execution time and task throughput.



Last updated 2019-25-07 at 16:24