Early Experience on Porting and Running a Lattice Boltzmann Code on the Xeon-phi Co-Processor

Many-core systems Lattice Boltzmann; Many-core systems; Performance optimization 0202 electrical engineering, electronic engineering, information engineering General Medicine 02 engineering and technology Lattice Boltzmann Performance optimization
DOI: 10.1016/j.procs.2013.05.219 Publication Date: 2013-06-01T15:12:04Z
ABSTRACT
AbstractIn this paper we report on our early experience on porting, optimizing and benchmarking a Lattice Boltzmann (LB) code on the Xeon-Phi co-processor, the first generally available version of the new Many Integrated Core (MIC) architecture, developed by Intel. We consider as a test-bed a state-of-the-art LB model, that accurately reproduces the thermo-hydrodynamics of a 2D- fluid obeying the equations of state of a perfect gas. The regular structure of LB algorithms makes it relatively easy to identify a large degree of available parallelism. However, mapping a large fraction of this parallelism onto this new class of processors is not straightforward. The D2Q37 LB algorithm considered in this paper is an appropriate test-bed for this architecture since the critical computing kernels require high performances both in terms of memory bandwidth for sparse memory access patterns and number crunching capability. We describe our implementation of the code, that builds on previous experience made on other (simpler) many-core processors and GPUs, present benchmark results and measure performances, and finally compare with the results obtained by previous implementations developed on state-of-the-art classic multi-core CPUs and GP-GPUs.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (11)
CITATIONS (41)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....