- Advanced Wireless Communication Techniques
- Error Correcting Code Techniques
- Parallel Computing and Optimization Techniques
- Advanced Memory and Neural Computing
- Coding theory and cryptography
- Advanced Data Compression Techniques
- Advanced MIMO Systems Optimization
- Embedded Systems Design Techniques
- CCD and CMOS Imaging Sensors
- Ferroelectric and Negative Capacitance Devices
- Low-power high-performance VLSI design
- Sparse and Compressive Sensing Techniques
- Analog and Mixed-Signal Circuit Design
- VLSI and FPGA Design Techniques
- Video Coding and Compression Technologies
- Advancements in PLL and VCO Technologies
- VLSI and Analog Circuit Testing
- Advanced Image and Video Retrieval Techniques
- Cellular Automata and Applications
- Quantum-Dot Cellular Automata
- Advanced Wireless Communication Technologies
- Digital Filter Design and Implementation
- 3D IC and TSV technologies
- Advanced biosensing and bioanalysis techniques
- Cooperative Communication and Network Coding
Pohang University of Science and Technology
2019-2025
Korea Post
2023
Texas Instruments (United States)
2002
Graph convolutional neural networks (GCNs) have emerged as a key technology in various application domains where the input data is relational. A unique property of GCNs is that their two primary execution stages, aggregation and combination, exhibit drastically different dataflows. Consequently, prior GCN accelerators tackle this research space by casting the aggregation and combination stages as a series of sparse-dense matrix multiplications. However, prior work frequently suffers from inefficient data movements, leaving significant...
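As a rough illustration of casting both GCN stages as sparse-dense matrix multiplications, the sketch below computes A·(X·W) with a plain CSR kernel. The function names (`dense_to_csr`, `csr_matmul`, `gcn_layer`) are hypothetical, the adjacency normalization and activation function are omitted, and nothing here reflects the accelerator dataflow that the abstract goes on to describe.

```python
def dense_to_csr(matrix):
    """Convert a dense matrix (list of rows) to CSR arrays."""
    indptr, indices, data = [0], [], []
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                indices.append(col)
                data.append(value)
        indptr.append(len(indices))
    return indptr, indices, data


def csr_matmul(csr, dense):
    """Multiply a CSR sparse matrix by a dense matrix (list of rows)."""
    indptr, indices, data = csr
    n_cols = len(dense[0])
    out = []
    for row in range(len(indptr) - 1):
        acc = [0.0] * n_cols
        # Only the nonzeros of this sparse row touch the dense operand.
        for k in range(indptr[row], indptr[row + 1]):
            src = dense[indices[k]]
            for j in range(n_cols):
                acc[j] += data[k] * src[j]
        out.append(acc)
    return out


def gcn_layer(adjacency, features, weights):
    """One GCN layer without normalization or activation (assumption)."""
    # Combination: sparse features X times dense weights W.
    combined = csr_matmul(dense_to_csr(features), weights)
    # Aggregation: sparse adjacency A times the dense result X*W.
    return csr_matmul(dense_to_csr(adjacency), combined)


A = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]  # toy graph with self-loops
X = [[1, 0], [0, 2], [3, 0]]           # sparse node features
W = [[1, 2], [0, 1]]                   # dense layer weights
result = gcn_layer(A, X, W)
```

Both stages reuse the same kernel; only the sparse operand changes (X for combination, A for aggregation).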
Offering superior error-correction performance even for short-length codewords, the successive-cancellation list (SCL) decoding algorithm has allowed the polar code to be adopted in the 5G New Radio standard control channel. However, existing SCL decoders still suffer from long processing latency caused by a number of serialized internal operations. In this work, to solve this problem, we present several parallel computing solutions for the internal operations, i.e., simplified data dependencies and two overlapped pruning...
This paper presents a cost-efficient large-list SCL polar decoder supporting ultra-reliable channel coding in 5G and beyond communications. To minimize the huge implementation costs, the proposed design utilizes fully-reusable LLR buffers associated with stage unfolding and overwriting schemes, significantly reducing the on-chip buffer overheads by 67% compared to the state-of-the-art decoder. Implemented in 28nm CMOS, the prototype list-8 decoder achieves 1.1μs and 1.56Gb/s/mm <sup...
To enable emerging mission-critical applications, e.g., healthcare monitoring, remote surgery, and autonomous driving, 5G/6G ultra-reliable low-latency communication (URLLC) devices demand the concurrent fulfillment of ultra-reliability, low-latency, and low-power communications, particularly in short data transmissions, as depicted in Fig. 2.8.1. However, existing short-length forward error-correction (FEC) solutions for URLLC cannot meet all of the challenging requirements at the same time. The recent...
In massive multiple-input multiple-output (MIMO) systems using a large number of antennas, it would be difficult to connect high-resolution analog-to-digital converters (ADCs) to each antenna component due to high cost and energy consumption problems. To resolve these issues, there has been much work on implementing symbol detectors and channel estimators using low-resolution ADCs for massive MIMO systems. Although it is intuitively true that using low-resolution ADCs makes it possible to save a large amount of cost and energy in massive MIMO systems, the relationship between detection...
Due to its iterative nature, a low-density parity-check (LDPC) decoder is associated with long latency, being a major bottleneck of the baseband processor in wireless communication systems. Based on the practical min-sum (MS) decoding method, in this paper, we present a cost-effective decoding algorithm for reducing the processing latency of LDPC decoders. By checking the number of short-length cycles in the code structure, the proposed method dynamically changes the reweighting factor at the operations, successfully reducing the average number of iterations. In...
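For readers unfamiliar with min-sum decoding, the sketch below shows a generic weighted min-sum check-node update. The function name `min_sum_check_update` and the fixed `weight` argument are assumptions made for illustration; the cycle-dependent rule for choosing the reweighting factor is not given in the excerpt and is not reproduced here.

```python
def min_sum_check_update(llrs, weight=0.75):
    """Weighted min-sum update for one check node.

    llrs   : incoming variable-to-check LLRs (at least two entries)
    weight : scaling (reweighting) factor; a fixed value is assumed here
    Returns the outgoing check-to-variable message for every edge.
    """
    messages = []
    for i in range(len(llrs)):
        # Extrinsic rule: exclude the edge the message is sent back on.
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for value in others:
            if value < 0:
                sign = -sign
        magnitude = min(abs(value) for value in others)
        messages.append(weight * sign * magnitude)
    return messages


updated = min_sum_check_update([2.0, -1.0, 4.0], weight=0.5)
```

A weight below 1 compensates for the min-sum approximation overestimating message magnitudes; a scheme that adapts the factor would replace the constant `weight` with a per-node or per-iteration value.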
Achieving attractive error-correcting capability with a simple decoder structure, the polar code using successive cancellation (SC) decoding is now expected to be installed in resource-limited IoT or embedded communications. However, existing SC decoders normally suffer from long processing latency caused by the serialized decoding steps, limiting the practical applications of polar codes. In this article, to solve this problem, we present a new low-complexity merging operation that can increase the number of parallel factors for...
In this paper, we present a novel parallel polar decoding architecture that significantly reduces the processing latency for 5G wireless communications. Based on the original decoding tree, the proposed scheme first constructs small trees to generate multiple soft-decision messages in parallel, potentially reducing the processing latency compared to the previous serialized schemes. The hard-decision estimates are then calculated at the following merging step to decide the decoded outputs and update the small trees. For each tree, pruning is utilized to further...
In this paper, we present a novel scheduling method that reduces the latency of polar decoders significantly. Unlike the prior pruning-based successive cancellation list (SCL) decoding that suffers from a number of idle cycles, the proposed overlapped SCL decoding scheme immediately begins the node operations without waiting for the list to be sorted, being exempt from such unfavorable cycles. All possible candidates for the next node are precomputed in parallel with the pruning operations, and readily selected to minimize the latency. For 5G New Radio...
In this paper, we introduce the design and verification frameworks for developing a fully-functional emerging ternary processor. Based on the existing compiling environments for binary processors, for the given ternary instructions, the software-level framework provides an efficient way to convert the given programs to the ternary assembly codes. We also present a hardware-level framework to rapidly evaluate the performance of a ternary processor implemented in arbitrary design technology. As a case study, the 9-trit advanced RISC-based ternary (ART-9) core is newly developed by using...
In this paper, we present a novel weighted pruning method that effectively reduces the processing latency of the hard-decision (HD) polar decoder for storage applications without compromising the error-correcting capability. Based on the previous Fast-SSCL-SPC decoding algorithm, we thoroughly analyze the cause of the performance degradation when using HD inputs. Introducing an operation on the least reliable internal value, the proposed method successfully avoids the faulty updates of codeword candidates. Furthermore, we demonstrate the architecture...
Low bit-precisions and their bit-slice sparsity have recently been studied to accelerate general matrix-multiplications (GEMM) during large-scale deep neural network (DNN) inferences. While the conventional symmetric quantization facilitates low-resolution processing with bit-slice sparsity for both weight and activation, its accuracy loss caused by the activation's asymmetric distributions cannot be acceptable, especially for large-scale DNNs. In efforts to mitigate this accuracy loss, recent studies have actively utilized asymmetric quantization for activations without...
This paper presents a novel baseband architecture that supports high-speed wireless VR solutions using 60 GHz RF circuits. Based on the experimental observations by our previous 60 GHz transceiver circuits, an efficient baseband architecture is proposed to enhance the quality of the wireless transmission. To achieve zero-latency transmission, we define a (106,920, 95,040) interleaved-BCH error-correction code (ECC), which removes the iterative processing steps in the LDPC ECC standardized for the near-field wireless communication. Introducing block-level...
The emerging digital audio compression technology brings both an opportunity and a new challenge to IC design. High quality multichannel audio is quickly becoming an indispensable part of the entertainment system. The compression algorithms used in the digital audio result in complex VLSI ICs. The work presented in this paper is about the design of a dedicated, high precision, low cost AC3/MPEG multi-standard audio decoder IC. The IC's hardware and software architecture, as well as the simulation/verification methodology, are discussed in detail.
Based on recent RISC-V designs, we present in this paper a low-power vector processor architecture for efficiently deploying vision transformer (ViT) models. To fairly measure the processing efficiency of different processor designs with instruction/data cache memories, we first develop an evaluation framework based on numerous design tools, jointly considering the algorithm, architecture, and circuit performances, numerically revealing that the previous CSR-based data compression cannot accelerate pruned...
The compressive sensing (CS) based sparse vector coding (SVC) method is one of the promising ways for the next-generation ultra-reliable and low-latency communications. In this paper, we present advanced algorithm-hardware co-optimization schemes for realizing a cost-effective SVC decoding architecture. The previous maximum a posteriori subspace pursuit (MAP-SP) algorithm is newly modified to relax the computational overheads by applying novel residual forwarding and LLR approximation schemes. A fully-pipelined...
The ordered statistic decoding (OSD) approach for short-length BCH codes has been continuously considered one of the promising error-correction approaches, achieving a block error rate (BLER) of less than $10^{-6}$, which is attractive to ultra-reliable and low-latency communication (URLLC) and industrial IoT (IIOT) solutions [1], [2]. However, it is hard to directly realize the conventional OSD algorithm because of the compute-intensive Gaussian elimination and iterative reprocessing steps. Based on the recent segmentation...
The ordered statistic decoding (OSD) algorithm for short-length linear block codes provides an attractive ML-approaching performance, expected to be used for the ultra-reliable low latency communication (URLLC) at the next-generation wireless solutions. To find the corrected codeword among numerous candidates, however, the decoding process requires a considerable amount of computational costs, which need to be simplified to achieve low-latency processing. In this letter, we present several schemes that relax the overall...