Seyong Lee

ORCID: 0000-0001-8872-4932
Research Areas
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Advanced Data Storage Technologies
  • Cloud Computing and Resource Management
  • Embedded Systems Design Techniques
  • 3D IC and TSV technologies
  • Electronic Packaging and Soldering Technologies
  • Interconnection Networks and Systems
  • Distributed systems and fault tolerance
  • Software System Performance and Reliability
  • Mercury impact and mitigation studies
  • Auditing, Earnings Management, Governance
  • Radiation Effects in Electronics
  • Lattice Boltzmann Simulation Studies
  • Scientific Computing and Data Management
  • Financial Reporting and Valuation Research
  • Real-Time Systems Scheduling
  • Additive Manufacturing and 3D Printing Technologies
  • Real-time simulation and control systems
  • Particle Detector Development and Performance
  • Insurance and Financial Risk Management
  • Impact of AI and Big Data on Business and Society
  • Heavy metals in environment
  • Mine drainage and remediation techniques
  • Advanced Database Systems and Queries

Oak Ridge National Laboratory
2015-2024

Oak Ridge Associated Universities
2024

Korea Advanced Institute of Science and Technology
2016-2020

Korea Environment Institute
2017-2018

Daejeon Institute of Science and Technology
2018

Korea Institute of Geoscience and Mineral Resources
2018

Gwangju Institute of Science and Technology
2010-2015

Purdue University West Lafayette
2006-2010

Dong-A University
2008

Kia Motors (South Korea)
2006

GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although the new Compute Unified Device Architecture (CUDA) programming model from NVIDIA offers improved programmability for general computing, programming GPGPUs is still complex and error-prone. This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications. The goal of this translation is to further improve programmability and make existing OpenMP applications amenable to execution on GPGPUs. In...

10.1145/1504176.1504194 article EN 2009-02-14

General-Purpose Graphics Processing Units (GPGPUs) are promising parallel platforms for high performance computing. The CUDA (Compute Unified Device Architecture) programming model provides improved programmability for general computing on GPGPUs. However, its unique execution model and memory model still pose significant challenges for developers of efficient GPGPU code. This paper proposes a new programming interface, called OpenMPC, which builds on OpenMP to provide an abstraction of the complex CUDA programming model and offers high-level controls...

10.1109/sc.2010.36 article EN 2010-11-01

The Cetus tool provides an infrastructure for research on multicore compiler optimizations that emphasizes automatic parallelization. The infrastructure, which targets C programs, supports source-to-source transformations, is user-oriented and easy to handle, and provides the most important parallelization passes as well as the underlying enabling techniques.

10.1109/mc.2009.385 article EN Computer 2009-12-01

GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although the new Compute Unified Device Architecture (CUDA) programming model from NVIDIA offers improved programmability for general computing, programming GPGPUs is still complex and error-prone. This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications. The goal of this translation is to further improve programmability and make existing OpenMP applications amenable to execution on GPGPUs. In...

10.1145/1594835.1504194 article EN ACM SIGPLAN Notices 2009-02-14

Flexible, accurate performance predictions offer numerous benefits, such as gaining insight into and optimizing applications and architectures. However, the development and evaluation of such performance models has been a major research challenge, due to architectural complexities. To address this challenge, we have designed and implemented a prototype system, named COMPASS, for automated performance model generation and prediction. COMPASS generates a structured performance model from the target application's source code using static analysis, and then it evaluates various...

10.1145/2751205.2751220 article EN 2015-06-02

This paper presents the Open Accelerator Research Compiler (OpenARC): an open-source framework that supports the full feature set of OpenACC V1.0 and performs source-to-source transformations, targeting heterogeneous devices, such as NVIDIA GPUs. Combined with its high-level, extensible Intermediate Representation (IR) and rich semantic annotations, OpenARC serves as a powerful research vehicle for prototyping optimization, instrumentation, debugging, performance analysis, and autotuning. In fact, it is...

10.1145/2600212.2600704 article EN 2014-06-20

10.1016/j.jpdc.2012.12.012 article EN Journal of Parallel and Distributed Computing 2012-12-31

Graphics Processing Unit (GPU)-based parallel computer architectures have shown increased popularity as a building block for high performance computing, and possibly for future Exascale computing. However, their programming complexity remains a major hurdle to their widespread adoption. To provide better abstractions for programming GPU architectures, researchers and vendors have proposed several directive-based programming models. These models provide different levels of abstraction and require different levels of programming effort to port and optimize applications. Understanding these...

10.5555/2388996.2389028 article EN 2012-11-10

This paper presents a directive-based, high-level programming framework for high-performance reconfigurable computing. It takes a standard, portable OpenACC C program as input and generates a hardware configuration file for execution on FPGAs. We implemented this prototype system using our open-source OpenARC compiler; it performs source-to-source translation and optimization of the input OpenACC program into an OpenCL code, which is further compiled into an FPGA program by the backend Altera Offline Compiler. Internally, the design uses...

10.1109/ipdps.2016.28 article EN 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2016-05-01

With the rise of general-purpose computing on graphics processing units (GPGPU), influence from consumer markets can now be seen across the spectrum of computer architectures. In fact, many of the highest-ranking Top500 HPC systems include these accelerators. Traditionally, GPUs have been connected to the CPU via the PCIe bus, which has proved to be a significant bottleneck for scalable scientific applications. Now, a trend toward tighter integration between CPU and GPU has removed this bottleneck and unified the memory hierarchy for both types of cores. We examine...

10.1145/2212908.2212924 article EN 2012-05-15

Heterogeneous computing with accelerators is growing in importance in high performance computing (HPC). Recently, application datasets have expanded beyond the memory capacity of these accelerators, and often beyond that of their hosts. Meanwhile, nonvolatile memory (NVM) storage has emerged as a pervasive component of HPC systems because NVM provides massive amounts of capacity at affordable cost. Currently, for accelerator applications to use NVM, they must manually orchestrate data movement across multiple memories, and this approach only...

10.1109/sc.2018.00035 article EN 2018-11-01

Across embedded, mobile, enterprise, and high performance computing systems, computer architectures are becoming more heterogeneous and complex. This complexity is causing a crisis in programming systems and performance portability. Several programming systems are working to address these challenges, but the increasing architectural diversity is forcing software stacks and applications to be specialized for each architecture. As we show, all of these approaches critically depend on their runtime system for discovery, execution, scheduling, and data...

10.1109/hpec49654.2021.9622873 article EN 2021-09-20

Graphics Processing Unit (GPU)-based parallel computer architectures have shown increased popularity as a building block for high performance computing, and possibly for future Exascale computing. However, their programming complexity remains a major hurdle to their widespread adoption. To provide better abstractions for programming GPU architectures, researchers and vendors have proposed several directive-based programming models. These models provide different levels of abstraction and require different levels of programming effort to port and optimize applications. Understanding these...

10.1109/sc.2012.51 article EN International Conference for High Performance Computing, Networking, Storage and Analysis 2012-11-01

This paper introduces PapyrusKV, a parallel embedded key-value store (KVS) for distributed high-performance computing (HPC) architectures that offer potentially massive pools of nonvolatile memory (NVM). PapyrusKV stores keys with their values in arbitrary byte arrays across multiple NVMs in the system. It provides standard KVS operations such as put, get, and delete. More importantly, it provides advanced features for HPC such as dynamic consistency control, zero-copy workflow, and asynchronous checkpoint/restart. Beyond...

10.1145/3126908.3126943 article EN 2017-11-08

OpenACC was launched in 2010 as a portable programming model for heterogeneous accelerators. Although various implementations already exist, no extensible, open-source, production-quality compiler support is available to the community. This deficiency poses a serious risk for HPC application developers targeting GPUs and other accelerators, and it limits experimentation and progress for the specification. To address this deficiency, Clacc is a recent effort, funded by the US Exascale Computing Project, to develop production...

10.1109/llvm-hpc.2018.8639349 article EN 2018-11-01

Computer architecture experts expect that non-volatile memory (NVM) hierarchies will play a more significant role in future systems, including mobile, enterprise, and HPC architectures. With this expectation in mind, we present NVL-C: a novel programming system that facilitates the efficient and correct programming of NVM main memory systems. The NVL-C programming abstraction extends C with a small set of intuitive language features that target NVM main memory and can be combined directly with the traditional C memory model for DRAM. We have designed these new features to enable...

10.1145/2907294.2907303 article EN 2016-05-31

GPU performance of the lattice Boltzmann method (LBM) depends heavily on memory access patterns. When LBM is advanced with GPUs on complex computational domains, geometric data is typically accessed indirectly, and lattice data is accessed lexicographically in a Structure of Arrays (SoA) layout. Although there are a variety of existing access patterns beyond these typical choices, no study has yet examined the relative efficacy between them. Here, we compare a suite of access schemes via empirical testing and performance modeling. We find strong evidence that semi-direct...

10.1109/ipdps.2018.00092 article EN 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018-05-01

Fine-grained cycle sharing (FGCS) systems aim at utilizing the large amount of computational resources available on the Internet. In FGCS, host computers allow guest jobs to utilize CPU cycles if the jobs do not significantly impact the local users of a host. A characteristic of such resources is that they are generally provided voluntarily and their availability fluctuates highly. Guest jobs may fail because of unexpected resource unavailability. To provide fault tolerance without adding significant overhead, it is necessary to predict...

10.1109/hpdc.2006.1652140 article EN 2006-07-21

Sparse matrix-vector (SpMV) multiplication is a widely used kernel in scientific applications. In these applications, the SpMV kernel is usually deeply nested within multiple loops and thus executed a large number of times. We have observed that there can be significant performance variability, due to irregular memory access patterns. Static optimizations are difficult because the patterns may be known only at runtime. In this paper, we propose adaptive runtime tuning mechanisms to improve the performance of parallel SpMV on distributed...

10.1145/1375527.1375558 article EN 2008-06-07