Parallel tiled QR factorization for multicore architectures
QR decomposition
Multi-core processor
Implementation
Linear algebra
DOI: 10.1002/cpe.1301
Publication Date: 2008-06-03T16:46:20Z
AUTHORS (4)
ABSTRACT
As multicore systems continue to gain ground in the high‐performance computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these new processors. Fine‐grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data (referred to as ‘tiles’). These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out‐of‐order execution of the tasks that will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithm for QR factorization, where parallelism can be exploited only at the level of the BLAS operations, and with vendor implementations. Copyright © 2008 John Wiley & Sons, Ltd.
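To make the idea of "small tasks that operate on square tiles" concrete, the following is a minimal NumPy sketch of one common way to organise a tiled QR factorization. It is an illustration under stated assumptions, not the paper's implementation: the function name `tiled_qr_r`, the task labels in the comments, and the use of explicit orthogonal matrices from `numpy.linalg.qr` are all choices made here for readability. The tasks run sequentially in program order, so the dynamic, dependency-driven scheduling described in the abstract is not shown; only the decomposition into tile-level tasks is.

```python
import numpy as np


def tiled_qr_r(a, b):
    """Return the R factor of a square matrix `a`, computed tile by tile.

    Hypothetical helper for illustration: `a` is n-by-n, `b` is the tile
    size, and n must be a multiple of b.
    """
    n = a.shape[0]
    assert a.shape == (n, n) and n % b == 0
    r = np.array(a, dtype=float)  # work on a copy
    nt = n // b  # number of tiles per row/column

    def tile(i, j):
        # Basic slicing returns a view, so writes update `r` in place.
        return r[i * b:(i + 1) * b, j * b:(j + 1) * b]

    for k in range(nt):
        # Task 1: factor the diagonal tile.
        q, rkk = np.linalg.qr(tile(k, k), mode="complete")
        tile(k, k)[:] = rkk

        # Task 2: apply the same transformation to the tiles on its right.
        for j in range(k + 1, nt):
            tile(k, j)[:] = q.T @ tile(k, j)

        for i in range(k + 1, nt):
            # Task 3: factor the diagonal tile stacked on a tile below it.
            stacked = np.vstack([tile(k, k), tile(i, k)])
            q2, r2 = np.linalg.qr(stacked, mode="complete")
            tile(k, k)[:] = r2[:b]
            tile(i, k)[:] = 0.0  # this tile has been eliminated

            # Task 4: update the matching pair of tiles in each column
            # to the right with the transformation from task 3.
            for j in range(k + 1, nt):
                upd = q2.T @ np.vstack([tile(k, j), tile(i, j)])
                tile(k, j)[:] = upd[:b]
                tile(i, j)[:] = upd[b:]

    return r


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((8, 8))
    r = tiled_qr_r(a, b=2)
    # R is upper triangular and satisfies R^T R = A^T A.
    print(np.allclose(r, np.triu(r)), np.allclose(r.T @ r, a.T @ a))
```

Each of the four task types reads and writes only one or two tiles, which is what would allow a runtime to start a task as soon as the tiles it depends on are ready. A production code would keep the orthogonal factors in a compact form instead of building them explicitly as this sketch does.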
REFERENCES (28)
CITATIONS (77)