- VLSI and FPGA Design Techniques
- VLSI and Analog Circuit Testing
- Low-Power High-Performance VLSI Design
- Embedded Systems Design Techniques
- Algorithms and Data Compression
- Computational Geometry and Mesh Generation
- Evolutionary Algorithms and Applications
- Parallel Computing and Optimization Techniques
- Integrated Circuits and Semiconductor Failure Analysis
- Advanced Graph Theory Research
- Cloud Computing and Resource Management
- Molecular Communication and Nanonetworks
- Advanced Memory and Neural Computing
- Optimization and Packing Problems
- Advancements in Photolithography Techniques
- Graph Theory and CDMA Systems
- Image Processing Techniques and Applications
- Graph Theory and Algorithms
- Scheduling and Optimization Algorithms
- Numerical Methods and Algorithms
- Manufacturing Process and Optimization
- Advanced Surface Polishing Techniques
- Forensic Toxicology and Drug Analysis
- Modular Robots and Swarm Intelligence
- Advanced Neural Network Applications
National Taiwan Ocean University
2025
Feng Chia University
2025
Xilinx (United States)
2014-2016
Magma (Germany)
2008
Software and Engineering Associates (United States)
2006-2008
University of Illinois Urbana-Champaign
1992-2005
Clarkson University
1996-2005
University of California, Berkeley
2005
Intel (United States)
1998-1999
Urbana University
1995
With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual content understanding and classification. To improve the performance and energy efficiency of the computation-demanding CNN, FPGA-based acceleration emerges as one of the most attractive alternatives. In this paper, we design and implement Caffeine, a hardware/software co-designed library to efficiently accelerate the entire CNN on FPGAs. First, we propose a uniformed...
With the recent advancement of multilayer convolutional neural networks (CNNs) and fully connected networks (FCNs), deep learning has achieved amazing success in many areas, especially in visual content understanding and classification. To improve the performance and energy efficiency of the computation-demanding CNN, FPGA-based acceleration emerges as one of the most attractive alternatives. In this paper, we design and implement Caffeine, a hardware/software co-designed library to efficiently accelerate the entire CNN and FCN on FPGAs....
The year 2011 marked an important transition for FPGA high-level synthesis (HLS), as it went from prototyping to deployment. A decade later, in this article, we assess the progress of the deployment of HLS technology and highlight the successes in several application domains, including deep learning, video transcoding, graph processing, and genome sequencing. We also discuss the challenges faced by today’s HLS technology and the opportunities for further research and development, especially in the areas of achieving high clock frequency, coping with...
Design automation or computer-aided design (CAD) for field programmable gate arrays (FPGAs) has played a critical role in the rapid advancement and adoption of FPGA technology over the past two decades. The purpose of this paper is to meet the demand for an up-to-date comprehensive survey/tutorial for FPGA design automation, with an emphasis on the recent developments within the past 5-10 years. The paper focuses on the theory and techniques that have been, or most likely will be, reduced to practice. It covers all major steps in the FPGA design flow, which includes: routing...
In this paper, we present a new retiming-based technology mapping algorithm for look-up table-based field programmable gate arrays. The algorithm is based on a novel iterative procedure for computing all k-cuts of all nodes in a sequential circuit, in the presence of retiming. The algorithm completely avoids flow computation, which is the bottleneck of previous algorithms. Due to the fact that k is very small in practice, the procedure is very fast. Experimental results indicate the overall algorithm is very efficient in practice.
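As a rough illustration of what "computing all k-cuts" means, the sketch below enumerates k-feasible cuts bottom-up on a purely combinational DAG. It is not the paper's procedure, which handles sequential circuits in the presence of retiming; the function name `enumerate_cuts` and the `fanins` dictionary format are assumptions made for this example.

```python
from itertools import product


def enumerate_cuts(fanins, k):
    """Return {node: set of cuts}, each cut a frozenset of at most k leaves.

    `fanins` maps every node to the list of its fanin nodes (primary inputs
    map to an empty list) and must be given in topological order.
    Hypothetical helper: combinational logic only, no retiming.
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}  # the trivial cut {node}
        if ins:
            # Pick one cut per fanin and merge them; keep merges of size <= k.
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Small example: x = f(a, b), y = g(x, c)
example = {"a": [], "b": [], "c": [], "x": ["a", "b"], "y": ["x", "c"]}
cuts_k3 = enumerate_cuts(example, 3)
cuts_k2 = enumerate_cuts(example, 2)
```

With k = 3 the node `y` has the cuts {y}, {x, c} and {a, b, c}; with k = 2 the last one is rejected because it has three leaves. Because k is small for LUT-based FPGAs, the number of cuts per node stays manageable.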
In this paper we consider the problem of clustering sequential circuits subject to a bound on the area of each cluster, with the objective of minimizing the clock period. Current algorithms address combinational circuits only, and treat a sequential circuit as a special case, by removing all flip-flops (FF's) and clustering the combinational part of the circuit. This approach breaks the signal dependencies and assumes the positions of the FF's are fixed. The positions of the FF's are in fact dynamic, because of retiming. As a result, current algorithms can only consider a small portion of the whole solution space. In this paper, we present a clustering algorithm that does...
In this paper we study the area minimization problem in floorplanning (also known as the floorplan sizing problem). For a given floorplan, the problem is to select a layout alternative for each subcircuit on a chip so as to minimize the chip area. Two area minimization methods for general floorplans are proposed. Both methods can be viewed as generalizations of the classical algorithm for slicing floorplans of Otten (1982) and Stockmeyer (1983), in the sense that they reduce naturally to their algorithm for slicing floorplans. Compared with the branch-and-bound algorithm of Wimer et al. (1989), which does not have a nontrivial...
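The classical slicing-floorplan algorithm mentioned above rests on one step: merging the nonredundant (width, height) lists of two blocks across a cut in linear time. The following is a minimal sketch of that merge step only, not of the generalized methods the paper proposes; the function names and the list representation are assumptions made for illustration.

```python
def combine_side_by_side(a, b):
    """Merge two blocks placed left/right (vertical cut).

    `a` and `b` are nonredundant realization lists: (width, height) pairs
    sorted by strictly increasing width and strictly decreasing height.
    The result has the same form, with width = w1 + w2, height = max(h1, h2).
    """
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        (w1, h1), (w2, h2) = a[i], b[j]
        out.append((w1 + w2, max(h1, h2)))
        # Only advancing the taller block can lower the combined height.
        if h1 > h2:
            i += 1
        elif h1 < h2:
            j += 1
        else:
            i += 1
            j += 1
    return out


def combine_stacked(a, b):
    """Merge two blocks placed top/bottom (horizontal cut) by transposing."""
    def flip(lst):
        return [(h, w) for w, h in reversed(lst)]
    return flip(combine_side_by_side(flip(a), flip(b)))


def min_area(realizations):
    """Smallest area among the realizations of the root block."""
    return min(w * h for w, h in realizations)


left = [(1, 5), (2, 3)]
right = [(1, 4)]
side = combine_side_by_side(left, right)
stack = combine_stacked(left, right)
```

Applying the two merge functions bottom-up over the slicing tree yields the realization list of the whole chip, from which `min_area` picks the best shape.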
This paper focuses on the development of an infrastructure to enable FPGA-based acceleration in data centers. We present an initial version of an integrated solution that includes automated compilation for accelerator generation, runtime resource scheduling and management, and libraries for customized computing for big data applications. The solution can help overcome some of the main challenges with FPGA-accelerated computing. It has the potential to bring significant performance and energy efficiency improvement to data centers.
Optimal clock period FPGA technology mapping for sequential circuits. Authors: Peichen Pan (Dept. of Electrical & Computer Eng., Clarkson University, Potsdam, NY), C. L. Liu (Dept. of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL). DAC '96: Proceedings of the 33rd annual Design Automation Conference, June 1996, pages 720–725. https://doi.org/10.1145/240518.240655. Published: 01 June 1996.
We study the technology mapping problem for sequential circuits for look-up table (LUT) based field programmable gate arrays (FPGAs). Existing approaches to the problem simply remove the flip-flops (FFs), then map the remaining combinational logic, and finally put the FFs back. These approaches ignore the sequential nature of a circuit and assume the positions of the FFs are fixed. However, FFs in a sequential circuit can be repositioned by a functionality-preserving transformation called retiming. As a result, existing approaches can only consider a very small portion of the available solution space. We propose in this...
Till now most efforts in low power logic synthesis have concentrated on minimizing the total switching activity of a circuit under the zero delay model. This simplification ignores the effects of glitch transitions, which may contribute as much as 30% of the power consumption of a circuit. Hence, logic synthesis techniques that optimize power under the zero delay model are often not successful in attaining “real” power savings when measured under a more accurate general delay model. In practice, to accurately estimate power can be computationally expensive. repeatedly call but slow...
In this paper, we study the technology mapping problem for sequential circuits for LUT-based FPGAs. Existing approaches map the combinational logic between flip-flops (FFs) while assuming the positions of the FFs are fixed. We study in this paper a new approach to the problem, in which retiming is integrated into the technology mapping process. We present a polynomial time technology mapping algorithm that can produce a mapping solution with the minimum clock period while assuming FFs can be arbitrarily repositioned by retiming. The algorithm has been implemented. Experimental results on benchmark circuits clearly demonstrate...
No abstract available.
A new problem called monotone bipartitioning of a planar point set is identified, which is found to be useful in VLSI layout design. Let F denote a rectangular floor containing a set of n points. The portion of a straight line formed by two points from the set is called a line segment. A monotone increasing path (MP) is a connected and ordered sequence of line segments from the bottom-left corner of the floor to its top-right corner, such that the slope of each line segment is nonnegative, and each pair of consecutive line segments share a common point. An MP is said to be maximal (MMP) if no other point can be included in it preserving monotonicity....
Performance-driven integration of retiming and resynthesis. Author: Peichen Pan (Strategic CAD Labs, Intel Corporation, Hillsboro, OR). DAC '99: Proceedings of the 36th annual ACM/IEEE Design Automation Conference, June 1999, pages 243–246. https://doi.org/10.1145/309847.309921. Published: 01 June 1999.
Two results are presented in this paper. First, we settle the open problem on the complexity of the area minimization problem for hierarchical floorplans by showing it to be NP-complete. We then present a pseudo-polynomial area minimization algorithm for hierarchical floorplans of order-5. The algorithm is based on a new algorithm for determining the set of nonredundant realizations of a wheel. The new algorithm for wheels has time cost O(k^2 log k) and space cost O(k^2) if each of the (five) blocks in a wheel has at most k realizations, a reduction by a factor of k in both costs in comparison with previous algorithms. The area minimization algorithm was implemented. Our experimental results show...
Optimal graph constraint reduction for symbolic layout compaction. Authors: Peichen Pan, Sai-keung Dong, C. L. Liu. DAC '93: Proceedings of the 30th international Design Automation Conference, July 1993, pages 401–406. https://doi.org/10.1145/157485.164950. Published: 01 July 1993.
Retiming is a transformation that optimizes a sequential circuit by relocating the registers. When the circuit has an initial state, one must compute an equivalent initial state for the retimed circuit. In this paper we propose a new efficient retiming algorithm for performance optimization. The retiming determined by the algorithm is the best with respect to initial state computation. It is the easiest for finding an equivalent initial state, and if logic modification is required, it incurs the minimum amount of logic modification.
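For readers unfamiliar with the transformation itself, the sketch below applies a retiming in the standard Leiserson-Saxe formulation: each edge u -> v carrying w registers ends up with w + r(v) - r(u) registers, and the retiming is legal when no edge count goes negative. This illustrates only the basic transformation, not the paper's algorithm or its initial-state computation; the graph encoding, node delays, and function names are assumptions made for this example.

```python
def retime(edges, r):
    """Return the retimed edge list; edges are (u, v, register_count) triples."""
    return [(u, v, w + r[v] - r[u]) for u, v, w in edges]


def is_legal(edges):
    """A retiming is legal when every edge keeps a nonnegative register count."""
    return all(w >= 0 for _, _, w in edges)


def clock_period(edges, delay):
    """Longest combinational delay: the heaviest path using only 0-register edges.

    Assumes every cycle contains at least one register, so the
    zero-register subgraph is acyclic.
    """
    zero_succ = {v: [] for v in delay}
    for u, v, w in edges:
        if w == 0:
            zero_succ[u].append(v)
    memo = {}

    def longest(u):
        if u not in memo:
            memo[u] = delay[u] + max((longest(v) for v in zero_succ[u]), default=0)
        return memo[u]

    return max(longest(u) for u in delay)


# Three unit-delay gates in a loop with two registers on the edge c -> a.
delay = {"a": 1, "b": 1, "c": 1}
edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2)]
retimed = retime(edges, {"a": 0, "b": 0, "c": 1})
before = clock_period(edges, delay)
after = clock_period(retimed, delay)
```

In this toy loop the retiming moves one register from the edge c -> a to the edge b -> c, which shortens the longest register-free path from three gates to two.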
We consider the problem of clustering sequential circuits subject to a bound on the area of each cluster, with the objective of minimizing the clock period. Current algorithms address combinational circuits only, and treat a sequential circuit as a special case, by removing the flip-flops (FFs) and clustering the remaining combinational logic. This approach segments a circuit and assumes the positions of the FFs are fixed. The positions of the FFs are in fact dynamic, because of retiming. As a result, current algorithms can only consider a small portion of the available solution space. In this paper, we present a clustering algorithm that does not remove the FFs. It...