Efficient 3D Transpositions in Graphics Processing Units
0101 mathematics
01 natural sciences
DOI: 10.1007/s10766-015-0366-5
Publication Date: 2015-04-03T14:00:31Z
AUTHORS (3)
ABSTRACT
Matrix transposition is a basic operation in many computing tasks, and transposing a matrix in a computer's main memory has long been well studied. More recently, out-of-place matrix transposition has been performed efficiently on graphics processing units (GPUs), which are now widely used for general-purpose computing. However, because of the particular architecture of GPUs, adapting the matrix transposition operation to 3D arrays is not straightforward. In this paper, we describe efficient GPU implementations of the five possible out-of-place 3D transpositions. We also include the most basic in-place 3D transpositions. The results show that the achieved bandwidth is close to that of a simple array copy and similar to that of the 2D transposition.
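The count of five out-of-place transpositions follows from the 3! = 6 possible orderings of three axes, minus the identity ordering. The following is a minimal CPU reference sketch in Python that enumerates those five orderings and applies each to a flat row-major array. It is not the paper's GPU implementation; the function name `transpose3d` and the axis convention (output axis k is input axis `perm[k]`) are assumptions made for illustration.

```python
from itertools import permutations


def transpose3d(src, shape, perm):
    """Out-of-place transposition of a flat, row-major 3D array.

    Assumed convention: output axis k is input axis perm[k].
    Returns the transposed flat list and its shape.
    """
    out_shape = tuple(shape[p] for p in perm)
    # Row-major strides of the input, measured in elements.
    in_strides = (shape[1] * shape[2], shape[2], 1)
    dst = [None] * len(src)
    pos = 0  # the output is written contiguously
    for i in range(out_shape[0]):
        for j in range(out_shape[1]):
            for k in range(out_shape[2]):
                idx = (i, j, k)
                # Map the output index back to an input offset.
                offset = sum(idx[a] * in_strides[perm[a]] for a in range(3))
                dst[pos] = src[offset]
                pos += 1
    return dst, out_shape


# Every axis order except the identity: 3! - 1 = 5 transpositions.
PERMS = [p for p in permutations(range(3)) if p != (0, 1, 2)]

shape = (2, 3, 4)
src = list(range(2 * 3 * 4))
results = {p: transpose3d(src, shape, p) for p in PERMS}
```

In this sketch the writes are contiguous while the reads are strided. On a GPU that access pattern is the difficulty, since strided global-memory accesses are not coalesced, which is why a naive port of these loops would not be expected to approach the bandwidth of an array copy.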