NFDI4DS | UHH-SEMS - Publication Details

Ching Chuen Jong

ORCID: 0000-0003-1178-9062

About

Contact & Profiles

OpenAlex author ID: A5018516888

Research Areas

Embedded Systems Design Techniques
Digital Filter Design and Implementation
Low-power high-performance VLSI design
Numerical Methods and Algorithms
Interconnection Networks and Systems
VLSI and FPGA Design Techniques
Parallel Computing and Optimization Techniques
Analog and Mixed-Signal Circuit Design
VLSI and Analog Circuit Testing
3D IC and TSV technologies
Advanced Data Compression Techniques
Electronic Packaging and Soldering Technologies
Nanofabrication and Lithography Techniques
Image and Signal Denoising Methods
Cryptography and Residue Arithmetic
Advanced Adaptive Filtering Techniques
Manufacturing Process and Optimization
Coding theory and cryptography
Model Reduction and Neural Networks
Advanced Image Fusion Techniques
Formal Methods in Verification
Image Enhancement Techniques
Advancements in Semiconductor Devices and Circuit Design
Advanced Vision and Imaging
Industrial Vision Systems and Defect Detection

Nanyang Technological University
2010-2022

Agency for Science, Technology and Research
2008-2021

Institute of Microelectronics
2006-2021

Singapore Science Park
2011

University of Southampton
1989

Design of Low-Complexity FIR Filters Based on Signed-Powers-of-Two Coefficients With Reusable Common Subexpressions

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

In this paper, a new efficient algorithm is proposed for the synthesis of low-complexity finite-impulse response (FIR) filters with resource sharing. The original problem statement based on minimization signed-power-of-two (SPT) terms has been reformulated to account sharable adders. common SPT (CSPT) that were considered in our addresses optimization reusability adders two major types subexpressions, together are needed spare terms. coefficient set synthesized stages. first stage, CSPT...

10.1109/tcad.2007.895615 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2007-09-24

Contention resolution algorithm for common subexpression elimination in digital filter design

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

In this paper, a new algorithm, called contention resolution algorithm for weight-two subexpressions (CRA-2), based on an ingenious graph synthesis approach has been developed the common subexpression elimination of multiplication block digital filter structures. CRA-2 provides leeway to break away from local minimum and flexibility varying optimization options through admissibility graph. It manages two-bit aims at achieving minimal logic depth as primary goal. The performances our proposed...

10.1109/tcsii.2005.851776 article EN IEEE Transactions on Circuits and Systems II Analog and Digital Signal Processing 2005-10-01

Three dimensional interconnects with high aspect ratio TSVs and fine pitch solder microbumps

OPENALEX - Publications

Aibin Yu, John H. Lau, Soon Wee Ho, Aditya Kumar, Hnin Wai Yin, and 9 more

High density three dimensional (3D) interconnects formed by high aspect ratio through silicon vias (TSVs) and fine pitch solder microbumps are presented in this paper. The of the TSV is larger than 10 filled with Cu without voids; there electrical nickel immersion gold (ENIG) pads on top as under bump metallurgy (UBM) layer. On Si chip, Cu/Sn 16µm diameter 25µm fabricated. After singulating chip carrier, joined together interconnection between them micro bumps TSV.

10.1109/ectc.2009.5074039 article EN 2009-05-01

A Memory-Efficient Tables-and-Additions Method for Accurate Computation of Elementary Functions

OPENALEX - Publications

Joshua Yung Lih Low, Ching Chuen Jong

The tables-and-additions methods for accurate computation of elementary functions are fast in speed but require large memory. A memory-efficient method named as the integrated Add-Table Lookup-Add (iATA) is proposed this paper. In iATA, mathematical formulation computing derived without using central difference to save Three additional techniques, specifically carry select technique, symmetry property exploitation and unequal partitioning input with aid error analysis, iATA further reduce...

10.1109/tc.2012.43 article EN IEEE Transactions on Computers 2012-02-07

A Memory-Efficient High-Throughput Architecture for Lifting-Based Multi-Level 2-D DWT

OPENALEX - Publications

Yusong Hu, Ching Chuen Jong

In this paper, we present a novel memory-efficient high-throughput scalable architecture for multi-level 2-D DWT. We studied the existing DWT architectures and observed that data scanning method has significant impact on memory efficiency of architecture. propose parallel stripe-based based analysis dependency graph lifting scheme. With new 2D DWT, high efficient pipelined is developed. The proposed requires no frame temporal size only 3N + 682 3-level decomposition with an image N×N pixels...

10.1109/tsp.2013.2274640 article EN IEEE Transactions on Signal Processing 2013-07-24

Unified Mitchell-based Approximation for Efficient Logarithmic Conversion Circuit

OPENALEX - Publications

Joshua Yung Lih Low, Ching Chuen Jong

This paper presents a novel method named the Unified Mitchell-based Approximation (UMA) to obtain an optimized logarithmic conversion circuit for any desired accuracy up 14 bits. UMA is first that able when specific required. In this work, we studied and analyzed five design parameters their impact on hardware merits. We formulate model of error correction in performance evaluation. Given requirement, proposed explores space parameters. As theoretically huge, propose constraints range...

10.1109/tc.2014.2329683 article EN IEEE Transactions on Computers 2014-01-01

A Memory-Efficient Scalable Architecture for Lifting-Based Discrete Wavelet Transform

OPENALEX - Publications

Yusong Hu, Ching Chuen Jong

In this brief, we propose a new parallel lifting-based 2-D DWT architecture with high memory efficiency and short critical path. The is achieved novel scanning method that enables tradeoff of external bandwidth on-chip memory. Based on the data flow graph flipped lifting algorithm, processing units (PUs) are developed for maximally utilizing inherent parallelism. With S number PUs, throughput can be scaled while keeping latency constant. Compared best existing architecture, proposed requires...

10.1109/tcsii.2013.2268335 article EN IEEE Transactions on Circuits & Systems II Express Briefs 2013-07-04

Development of Fine Pitch Solder Microbumps for 3D Chip Stacking

OPENALEX - Publications

Aibin Yu, Aditya Kumar, Soon Wee Ho, Hnin Wai Yin, John H. Lau, and 18 more

Developments of ultra fine pitch and high density solder microbumps for advanced 3D stacking technologies are discussed in this paper. CuSn with 25 μm fabricated at wafer level by electroplating method the total thicknesses plated Cu Sn 10 μm. After plating, micro bumps on Si chip reflowed 265°C variation bump height measured within a die is less than 5%. The under metallurgy (UBM) layer carrier used electroless plated nickel immersion gold (ENIG) thickness 5 Assembly conducted FC150 flip...

10.1109/eptc.2008.4763465 article EN 2008-12-01

Contention Resolution—A New Approach to Versatile Subexpressions Sharing in Multiple Constant Multiplications

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

Multiple constant multiplications (MCM) have been a core operation in many digital signal processing applications. In this paper, an efficient generalized contention resolution algorithm (CRA) is proposed to eliminate three broad categories of reusable common subexpressions MCM. The idea revert precedential decision suboptimal by localized cost function evaluation when there conflict between two competitive subexpressions. derivatives the basic CRA are versatile that they capable satisfying...

10.1109/tcsi.2007.913707 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2008-03-01

A High Bit Rate Serial-Serial Multiplier With On-the-Fly Accumulation by Asynchronous Counters

OPENALEX - Publications

Manas Ranjan Meher, Ching Chuen Jong, Chip-Hong Chang

A novel approach of designing serial-serial hybrid multiplier is proposed for applications with high data sampling rate ( ≥4 GHz). The conventional way partial product formation revamped. Our technique effectively forms the entire matrix in just n cycles an × multiplication instead at least 2 multipliers. It achieves a bit by replacing full adders and 5:3 counters asynchronous 1's so that...

10.1109/tvlsi.2010.2060374 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2010-09-07

A High-Throughput VLSI Architecture for Real-Time Full-HD Gradient Guided Image Filter

OPENALEX - Publications

Lei Wu, Ching Chuen Jong

Guided image filtering has been applied widely in recent years as a solution to the ever-increasing demand of high-performance filtering, especially for real-time image/video processing. The lately proposed gradient domain guided filter (GDGIF) is one typical works focusing on improving quality result original (GIF), dealing with halo-artifacts problem edge-preserving smoothing. However, due involvement global pixel values computation, high computation complexity, and additional complex...

10.1109/tcsvt.2018.2852336 article EN IEEE Transactions on Circuits and Systems for Video Technology 2018-07-02

Efficient algorithms for common subexpression elimination in digital filter design

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

A contention resolution algorithm (CRA) is proposed for the common subexpression elimination of multiplier block digital filter structure. CRA synthesizes subexpressions any Hamming weight to achieve an overall minimization with emphasis that every logic depth increment must be accompanied by a reduction in complexity. new data structure, called admissibility graph introduced represent succinctly set coefficients; admissible are progressively labeled on as either precedence or edges (or...

10.1109/icassp.2004.1327066 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2004-09-28

A curve fitting approach for non-iterative divider design with accuracy and performance trade-off

OPENALEX - Publications

Lei Wu, Ching Chuen Jong

This paper presents an approach based on the curve fitting method for design of non-iterative divider circuits with accuracy and area-delay product (ADP) trade-offs. The curved surfaces representing quotient are partitioned into several regions, each which is then approximated by a square/triangular plane. planes obtained using optimization. proposed architecture implementing contains only simple arithmetic operations look-up table. Several different accuracies ADPs obtained. achieved in...

10.1109/newcas.2015.7182097 article EN 2015-06-01

Non-iterative high speed division computation based on Mitchell logarithmic method

OPENALEX - Publications

Joshua Yung Lih Low, Ching Chuen Jong

A novel non-iterative circuit for computing division based on logarithm is proposed in the paper. Mitchell-based methods are used logarithmic and antilogarithmic conversions. Merging conversion stages implementation not possible if existing algorithms used. Thus, critical path has at least two carry propagate adders (CPAs). This work introduces a new algorithm to merge into single one remove of CPAs. Compared best computation method 3-D graphic system, design achieves improvements by 45.4%...

10.1109/iscas.2013.6572317 article EN 2013 IEEE International Symposium on Circuits and Systems (ISCAS) 2013-05-01
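
The abstract above relies on Mitchell's logarithm approximation, which treats the bits below the leading one as a linear estimate of the fractional part of log2. The following is a minimal software sketch of that general idea applied to division. It is not the merged-stage circuit the paper proposes, and the function names `mitchell_log2`, `mitchell_antilog2`, and `mitchell_divide` are hypothetical, chosen only for illustration.

```python
import math


def mitchell_log2(x: int) -> float:
    """Approximate log2(x) as k + m, where x = 2**k * (1 + m) and 0 <= m < 1."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1          # position of the leading one
    m = x / (1 << k) - 1.0          # remaining bits read as a fraction
    return k + m


def mitchell_antilog2(y: float) -> float:
    """Approximate 2**y as 2**k * (1 + f), where k = floor(y) and f = y - k."""
    k = math.floor(y)
    f = y - k
    return (1.0 + f) * 2.0 ** k


def mitchell_divide(a: int, b: int) -> float:
    """Approximate a / b by subtracting approximate logarithms."""
    return mitchell_antilog2(mitchell_log2(a) - mitchell_log2(b))


if __name__ == "__main__":
    for a, b in [(64, 8), (100, 7), (4, 3)]:
        print(a, b, mitchell_divide(a, b), a / b)
```

Without any error correction the result is exact when both operands are powers of two and otherwise overestimates the quotient; correction schemes such as those studied in the listed papers aim to shrink that error.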

Scalable linear array architectures for matrix inversion using Bi-z CORDIC

OPENALEX - Publications

Jianwen Luo, Ching Chuen Jong

In this paper, VLSI array architectures for matrix inversion are studied. A new binary-coded z-path (Bi-z) CORDIC is developed and implemented to compute the operations required in using Givens rotation (GR) based QR decomposition. The Bi-z allows both GR vectoring mode, as well division multiplication be executed a single unified processing element (PE). Hence, 2D (2 dimensional) consisting of PEs with different functionalities can folded into 1D reduce hardware complexity. also eliminates...

10.1016/j.mejo.2011.10.009 article EN Microelectronics Journal 2011-12-11

Development of 25-μm-Pitch Microbumps for 3-D Chip Stacking

OPENALEX - Publications

Aibin Yu, Aditya Kumar, Soon Wee Ho, Hnin Wai Yin, John H. Lau, and 12 more

The development of ultrafine-pitch microbumps and the thermal compression bonding (TCB) process for advanced 3-D stacking technology are discussed in this paper. Microbumps, consisting Cu pillars thin Sn caps with a pitch 25 μm, fabricated on an Si chip by electroplating method. Total thickness pillar cap is 10 μm. Electroless nickel immersion gold pads total 4 μm carrier. TCB carrier conducted FC150 flip-chip bonder, good joining higher than 10-MPa die shear strength achieved. After...

10.1109/tcpmt.2012.2203130 article EN IEEE Transactions on Components Packaging and Manufacturing Technology 2012-09-18

Modified reduced adder graph algorithm for multiplierless FIR filters

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

A modified reduced adder graph (MRAG) algorithm and its hybrid version are proposed for efficient digital filter implementation. Several improvements made to exploit fully the optimal part of n-dimensional (RAG-n) algorithm. Simulation results demonstrate that MRAG is capable generating lower cost solutions.

10.1049/el:20057392 article EN Electronics Letters 2005-01-01

Hamming weight pyramid – A new insight into canonical signed digit representation and its applications

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

10.1016/j.compeleceng.2006.09.001 article EN Computers & Electrical Engineering 2007-01-16

A look-ahead synthesis technique with backtracking for switching activity reduction in low power high-level synthesis

OPENALEX - Publications

Xianwu Xing, Ching Chuen Jong

Research work done has shown that power consumption in digital integrated circuits can be effectively reduced by reducing the switching activity occurring on functional modules. High-level synthesis of for low often optimizes during two main processes, operation scheduling and module binding, which are usually performed one control step at a time separated stages. As processes strongly interdependent, separate optimization step-by-step manner frequently leads to sub-optimal solutions. In...

10.1016/j.mejo.2007.03.001 article EN Microelectronics Journal 2007-04-01

HWP: a new insight into canonical signed digit

OPENALEX - Publications

Fei Xu, Chip-Hong Chang, Ching Chuen Jong

A new Hamming weight pyramid (HWP) that resembles the Pascal triangle is proposed to succinctly compress information about distribution of in canonical signed digit (CSD) represented numbers a visually appealing manner for analysis and synthesis. Many interesting properties are discovered this regularly structured HWP. These lead novel elegant way convert decimal their binary equivalence, which an ineluctable intermediate process conventional CSD conversion algorithms.

10.1109/iscas.2004.1329497 article EN 2004-11-30
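
For readers unfamiliar with the canonical signed digit (CSD) representation that the abstract above analyzes, the sketch below shows one textbook way to convert an integer to CSD and count its nonzero digits (its Hamming weight). It is not the HWP-based conversion described in the paper; the names `to_csd` and `csd_hamming_weight` are hypothetical.

```python
def to_csd(n: int) -> list[int]:
    """Return the CSD digits of n, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 gives digit +1, n mod 4 == 3 gives digit -1,
            # so the next digit is always zero (no adjacent nonzero digits).
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_hamming_weight(n: int) -> int:
    """Count the nonzero digits in the CSD form of n."""
    return sum(1 for d in to_csd(n) if d != 0)


if __name__ == "__main__":
    # 7 = 8 - 1, so its CSD form needs two nonzero digits instead of three.
    print(to_csd(7), csd_hamming_weight(7))
```

The Hamming weight matters in multiplierless filter design because each nonzero digit of a coefficient typically costs one adder or subtractor.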

Partially reconfigurable matrix multiplication for area and time efficiency on FPGAs

OPENALEX - Publications

Jianwen Luo, Ching Chuen Jong

This paper presents an architecture for matrix multiplication implemented on reconfigurable hardware with partially feature. The proposed design significantly reduces the size and achieves minimum computation cycles n × multiplication. Compared linear array (Jang et al., 2002) area of our is reduced by 72%-81% while AT metrics (product latency) 40%-58% between 3 48 48. versatility demonstrated in different parameterisable instantiation to cater implementations various time...

10.1109/dsd.2004.72 article EN Digital Systems Design 2004-08-31

A Low-Cost 256-Point FFT Processor for Portable Speech and Audio Applications

OPENALEX - Publications

Chao Wang, Woon‐Seng Gan, Ching Chuen Jong, Jianwen Luo

In this paper, a low-cost 256-point FFT processor design is presented for portable speech and audio applications. After an intensive review of existing architectures, single-butterfly architecture chosen to obtain low cost. architecture, two-multiplier three-adder pipelined butterfly unit proposed calculate the butterflies at different levels, recursively. Compared with other units, structure obtains best tradeoff between hardware cost processing throughput. The supply voltage scaling...

10.1109/isicir.2007.4441801 article EN 2007 International Symposium on Integrated Circuits 2007-09-01

High-speed and low-power serial accumulator for serial/parallel multiplier

OPENALEX - Publications

Manas Ranjan Meher, Ching Chuen Jong, Chip-Hong Chang

This paper presents a new approach to serial/parallel multiplier design by using parallel 1's counters accumulate the binary partial product bits. The in each column of matrix due serially input operands are accumulated serial T-flip flop (TFF) counter. Consequently, height is reduced from N to ⌊log₂ N⌋ + 1. logarithmic reduction results very small carry save adder (CSA) array or tree required...

10.1109/apccas.2008.4745989 article EN 2008-11-01

A low complexity modulo 2^n+1 squarer design

OPENALEX - Publications

Ramya Muralidharan, Chip-Hong Chang, Ching Chuen Jong

Modulo 2^n+1 squaring has been used in various applications like cryptography and Fermat number transform. Arithmetic modulo 2^n+1 is also known to be the most time critical among the three residue channels in the prevalent {2^n−1, 2^n, 2^n+1} based residue number system (RNS). In order to speed up the operation, diminished-1 representation is widely employed. However, the use of diminished-1 representation results in area overhead and increased execution delay. In this paper, we...

10.1109/apccas.2008.4746265 article EN 2008-11-01

Exploring module selection space for architectural synthesis of low power designs

OPENALEX - Publications

Zexiang Shen, Ching Chuen Jong

Architectural synthesis for low power design is a complex optimization problem due to the interdependence of power, delay and area. In order obtain optimal architecture where both area are efficient, full space module selection must be explored. this paper we formulate as multi-objective propose branch bound approach explore large selection. Experiments show that can produce far better results than traditional architectural synthesizers all globally optimized simultaneously.

10.1109/iscas.1997.621420 article EN 2002-11-22
