Sun 18 Jun 2017 17:00 - 17:30 at Vertex WS218 - Afternoon talks 2 Chair(s): P. Sadayappan

Recently we presented TTC, a domain-specific compiler for tensor transpositions. Although the performance of the generated code is nearly optimal, TTC works offline and therefore cannot be used in application codes in which the tensor sizes and the required tensor permutations are determined only at runtime. To overcome this limitation, we introduce the open-source C++ library High-Performance Tensor Transposition (HPTT). Like TTC, HPTT incorporates optimizations such as blocking, multi-threading, and explicit vectorization; furthermore, it decomposes any transposition into multiple loops around a so-called micro-kernel. This modular design, inspired by BLIS, makes HPTT easy to port to different architectures: only the hand-vectorized micro-kernel (e.g., a 4x4 transpose) has to be replaced. HPTT also offers an optional autotuning framework, guided by a performance model, that explores a vast search space of implementations at runtime (similar to FFTW). Across a wide range of tensor transpositions and architectures (e.g., Intel Ivy Bridge, ARMv7, IBM Power7), HPTT attains a bandwidth comparable to that of SAXPY and yields remarkable speedups over Eigen’s tensor transposition implementation. Most importantly, integrating HPTT into the Cyclops Tensor Framework (CTF) improves the overall performance of tensor contractions by up to 3.1x.
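
The idea of loops around a micro-kernel can be illustrated for the simplest case, a 2D matrix transpose. The following is a minimal sketch under stated assumptions, not HPTT's actual code or API: the names `micro_kernel_4x4` and `transpose_blocked` are hypothetical, the kernel is plain scalar C++ rather than hand-vectorized, and multi-threading and autotuning are left out.

```cpp
#include <cstddef>
#include <vector>

// Tile edge length handled by the (hypothetical) micro-kernel.
constexpr std::size_t kTile = 4;

// Hypothetical micro-kernel: transposes one 4x4 tile.
// `a` points to the tile in the row-major source (leading dimension lda),
// `b` points to the tile in the row-major destination (leading dimension ldb).
// A port to another architecture would replace only this function,
// e.g. with a hand-vectorized version.
static void micro_kernel_4x4(const float* a, std::size_t lda,
                             float* b, std::size_t ldb) {
    for (std::size_t i = 0; i < kTile; ++i)
        for (std::size_t j = 0; j < kTile; ++j)
            b[j * ldb + i] = a[i * lda + j];
}

// Transposes the row-major rows x cols matrix `a` into the
// row-major cols x rows matrix `b` (b must hold rows * cols elements).
void transpose_blocked(const std::vector<float>& a, std::vector<float>& b,
                       std::size_t rows, std::size_t cols) {
    const std::size_t rows_full = rows - rows % kTile;  // rows covered by full tiles
    const std::size_t cols_full = cols - cols % kTile;  // cols covered by full tiles

    // Outer loops walk over full tiles and call the micro-kernel.
    for (std::size_t i = 0; i < rows_full; i += kTile)
        for (std::size_t j = 0; j < cols_full; j += kTile)
            micro_kernel_4x4(a.data() + i * cols + j, cols,
                             b.data() + j * rows + i, rows);

    // Remainder columns (right edge, including the bottom-right corner).
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = cols_full; j < cols; ++j)
            b[j * rows + i] = a[i * cols + j];

    // Remainder rows (bottom edge, corner already handled above).
    for (std::size_t i = rows_full; i < rows; ++i)
        for (std::size_t j = 0; j < cols_full; ++j)
            b[j * rows + i] = a[i * cols + j];
}
```

In this sketch the blocking loops are independent of the kernel, which is the property the abstract attributes to the modular design: the loop nest stays the same while the kernel is swapped per architecture.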

Slides (hptt_array17.pdf), 504 KiB

Sun 18 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:00 - 17:30
Afternoon talks 2, ARRAY at Vertex WS218
Chair(s): P. Sadayappan, Ohio State University
16:00
30m
Talk
Efficient Array Slicing on the Intel Xeon Phi Coprocessor
ARRAY
Benjamin Andreassen (Norwegian University of Science and Technology), Jan Christian (Norwegian University of Science and Technology), Lasse Natvig (Norwegian University of Science and Technology)
16:30
30m
Talk
Modular Array-based GPU Computing in a Dynamically-typed Language
ARRAY
Matthias Springer (Tokyo Institute of Technology), Peter Wauligmann (Tokyo Institute of Technology), Hidehiko Masuhara (Tokyo Institute of Technology)
17:00
30m
Talk
HPTT: A High-Performance Tensor Transposition C++ Library
ARRAY