Portable Vectorization and Parallelization of C++ Multi-Dimensional Array Computations
This paper presents Legolas++ Arrays, a multi-dimensional array library. Its parametrized types allow the data layout to be adapted to specific Single Instruction Multiple Data (SIMD) core architectures. Complex array-based kernels mapped over regular collections of data are vectorized automatically and efficiently. In addition, the Legolas++ Arrays implementation can combine multi-threaded parallelism with SIMD acceleration. As an example, a direct tridiagonal solver applied to a collection of equally sized problems achieves a speedup of more than 22× on an 8-core SIMD processor.
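To make the layout argument concrete, the following is a minimal sketch of a direct tridiagonal solve (the Thomas algorithm) applied to a batch of equally sized systems. It does not use the Legolas++ Arrays API: the `TridiagonalBatch` type, its `at` helper, and `solve_in_place` are hypothetical names introduced only for illustration. The sketch assumes an interleaved layout in which coefficient `i` of every system is stored contiguously, and it assumes that no pivoting is needed (for example, diagonally dominant systems). With this layout the recurrence over `i` stays sequential, while the inner loop over the systems has unit stride and no cross-iteration dependency, which is the form a compiler can typically map to SIMD instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container (not the Legolas++ API): `batch` independent
// tridiagonal systems of `n` unknowns each, stored interleaved so that
// row i of every system is contiguous: index(i, p) = i * batch + p.
struct TridiagonalBatch {
    std::size_t n;      // unknowns per system
    std::size_t batch;  // number of independent systems
    // lower[i]: sub-diagonal of row i (row 0 unused),
    // upper[i]: super-diagonal of row i (row n-1 unused).
    std::vector<double> lower, diag, upper, rhs;

    TridiagonalBatch(std::size_t n_, std::size_t batch_)
        : n(n_), batch(batch_),
          lower(n_ * batch_), diag(n_ * batch_),
          upper(n_ * batch_), rhs(n_ * batch_) {}

    std::size_t at(std::size_t i, std::size_t p) const { return i * batch + p; }
};

// Thomas algorithm on all systems at once. Overwrites `diag` and replaces
// `rhs` with the solution. Assumes no pivoting is required.
void solve_in_place(TridiagonalBatch& s) {
    const std::size_t n = s.n, b = s.batch;
    assert(s.lower.size() == n * b && s.diag.size() == n * b);
    assert(s.upper.size() == n * b && s.rhs.size() == n * b);
    if (n == 0 || b == 0) return;

    // Forward elimination: sequential in i, data-parallel in p.
    for (std::size_t i = 1; i < n; ++i) {
        for (std::size_t p = 0; p < b; ++p) {  // unit-stride inner loop
            const std::size_t k = i * b + p;
            const std::size_t km = k - b;      // same system, previous row
            const double w = s.lower[k] / s.diag[km];
            s.diag[k] -= w * s.upper[km];
            s.rhs[k] -= w * s.rhs[km];
        }
    }

    // Back substitution, starting from the last row of every system.
    for (std::size_t p = 0; p < b; ++p) {
        const std::size_t k = (n - 1) * b + p;
        s.rhs[k] /= s.diag[k];
    }
    for (std::size_t i = n - 1; i-- > 0;) {
        for (std::size_t p = 0; p < b; ++p) {
            const std::size_t k = i * b + p;
            s.rhs[k] = (s.rhs[k] - s.upper[k] * s.rhs[k + b]) / s.diag[k];
        }
    }
}
```

Whether the inner loops are actually vectorized depends on the compiler and its flags. Because the systems are independent, the range of `p` could also be split into chunks handled by different threads, which is one way to combine multi-threading with SIMD execution.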