Contribution in conference proceedings
Towards an efficient fault-tolerance scheme for GLB



Publication details
Authors:
Fohry, C.; Bungart, M.; Posner, J.
Editors:
Tardieu, Olivier; Amaral, José Nelson
Publisher:
ACM
Place of publication:
New York, NY
Year of publication:
2015
Pages:
27-32
Book title:
Proceedings of the ACM SIGPLAN Workshop on X10 - X10 2015
ISBN:
978-1-4503-3586-7

Abstract


X10's Global Load Balancing framework GLB implements a user-level task pool for inter-place load balancing. It is based on work stealing and deploys the lifeline algorithm. A single worker per place alternates between processing tasks and answering steal requests. We have devised an efficient fault-tolerance scheme for this algorithm, improving on a simpler resilience scheme from our own previous work. Among the base ideas of the new scheme are incremental backups of "stable" tasks and an actor-like communication structure. The paper reports on our ongoing work to extend the GLB framework accordingly. While details of the scheme are left out, we discuss implementation issues and preliminary experimental results.
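
The worker loop described in the abstract (one worker per place that alternates between processing tasks and answering steal requests) can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the GLB implementation and not the paper's fault-tolerance scheme: the names `Place`, `run`, the batch size `n`, the "give away half" policy, and the ring-shaped choice of victim are all hypothetical, and tasks are modelled as plain integers that are summed.

```python
from collections import deque


class Place:
    """One place with a single worker and a local task pool (hypothetical model)."""

    def __init__(self, pid, tasks=()):
        self.pid = pid
        self.pool = deque(tasks)   # local tasks, modelled as plain integers
        self.requests = deque()    # ids of places waiting for work from this place
        self.waiting = False       # True while a steal request is outstanding
        self.result = 0            # running sum of the tasks processed here

    def process(self, n):
        """Process up to n local tasks."""
        for _ in range(min(n, len(self.pool))):
            self.result += self.pool.popleft()

    def answer(self, places):
        """Answer recorded steal requests while there is work to share."""
        while self.requests and len(self.pool) >= 2:
            thief = places[self.requests.popleft()]
            # Assumption: hand over half of the remaining tasks.
            for _ in range(len(self.pool) // 2):
                thief.pool.append(self.pool.pop())
            thief.waiting = False


def run(places, n=2):
    """Visit places round-robin until every pool is empty; return the total."""
    while any(p.pool for p in places):
        for p in places:
            p.process(n)        # phase 1: work on local tasks
            p.answer(places)    # phase 2: serve pending steal requests
            if not p.pool and not p.waiting:
                # Out of work: register once with the next place in a ring.
                # The request stays recorded until that place has work to give.
                victim = places[(p.pid + 1) % len(places)]
                victim.requests.append(p.pid)
                p.waiting = True
    return sum(p.result for p in places)
```

For example, `run([Place(0, range(1, 101)), Place(1), Place(2)])` returns 5050, the sum of 1 to 100, with part of the work done by the two initially empty places. Recording a request at the victim until work becomes available is loosely modelled on the lifeline idea; random steal attempts, termination detection, and any backup of tasks are deliberately left out.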





Last updated 2019-07-25 at 16:24