Contribution to conference proceedings
Programming hybrid systems with implicit memory based synchronization

Publication details
Breitbart, J.
IEEE International Conference on High Performance Computing
Proceedings of the IEEE International Conference on High Performance Computing (HiPC)

Abstract
In recent years, increases in CPU performance have come with an increase in software development complexity. One of the next big changes in CPU architecture may be so-called hybrid multicore chips, which combine multicore and manycore technologies on the same chip. Unfortunately, this increase in performance may again lead to an increase in development complexity. In this paper we suggest a dataflow-driven programming model for hybrid multicore CPUs that is able to decrease this complexity. We designed our model to work efficiently on non-cache-coherent systems and in NUMA scenarios, both of which may be exposed by the hardware in the near future. Our model is based on single-assignment memory combined with PGAS-like explicit placement of data. Reading uninitialized memory blocks the reading thread until the data is made available by another thread. That way, threads can easily work in parallel, and fine-grained synchronization is done implicitly. Single-assignment memory removes any cache coherency issues, whereas explicit data placement easily copes with NUMA effects. We implemented our model in a proof-of-concept C++ library for CUDA and x86 architectures, and use the scan algorithm to demonstrate that it is fairly easy to coordinate two GPUs and multiple CPU cores.
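
The blocking-read idea in the abstract can be illustrated with a minimal host-only sketch. It is not the paper's library or API: the class name SingleAssign, its set/get members, and the function two_part_scan are hypothetical names chosen for this illustration, and a mutex with a condition variable stands in for whatever mechanism the actual implementation uses. Data placement, CUDA, and NUMA handling are left out. The sketch assumes C++17 and a threads-enabled build.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <stdexcept>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical single-assignment cell (illustration only, not the paper's API).
// It may be written exactly once; readers block until the write has happened.
template <typename T>
class SingleAssign {
public:
    // Assign the value. A second assignment is rejected.
    void set(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            if (value_) throw std::logic_error("value already assigned");
            value_ = std::move(value);
        }
        cv_.notify_all();  // wake every reader waiting in get()
    }

    // Read the value, blocking the calling thread until it is available.
    T get() const {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return value_.has_value(); });
        return *value_;
    }

private:
    mutable std::mutex m_;
    mutable std::condition_variable cv_;
    std::optional<T> value_;
};

// Inclusive scan (prefix sum) split across two threads. The only
// synchronization between them is the blocking read of left_total.
std::vector<int> two_part_scan(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    const std::size_t mid = in.size() / 2;
    SingleAssign<int> left_total;

    std::thread left([&] {
        int sum = 0;
        for (std::size_t i = 0; i < mid; ++i) { sum += in[i]; out[i] = sum; }
        left_total.set(sum);  // publish the total of the left half
    });

    std::thread right([&] {
        int sum = 0;
        for (std::size_t i = mid; i < in.size(); ++i) { sum += in[i]; out[i] = sum; }
        const int offset = left_total.get();  // blocks until the left half is done
        for (std::size_t i = mid; i < in.size(); ++i) out[i] += offset;
    });

    left.join();
    right.join();
    return out;
}
```

In this sketch the right-hand thread scans its own half without waiting and only blocks at the point where it needs the left half's total, which is the kind of implicit, fine-grained synchronization the abstract describes. The two threads write to disjoint parts of the output, so no further locking is needed.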

Last updated 2019-07-24 at 08:55