NFDI4DS | UHH-SEMS - Publication Details

FPGA HLS Today: Successes, Challenges, and Opportunities

OPENALEX - Publications

Jason Cong Jason Lau Gai Liu Stephen Neuendorffer Peichen Pan and 2 more

The year 2011 marked an important transition for FPGA high-level synthesis (HLS), as it went from prototyping to deployment. A decade later, in this article, we assess the progress of the deployment of HLS technology and highlight the successes in several application domains, including deep learning, video transcoding, graph processing, and genome sequencing. We also discuss the challenges faced by today’s HLS technology and the opportunities for further research and development, especially in the areas of achieving high clock frequency, coping with...

10.1145/3530775 article EN ACM Transactions on Reconfigurable Technology and Systems 2022-04-21

Mutual Antagonism between the Ebola Virus VP35 Protein and the RIG-I Activator PACT Determines Infection Outcome

Priya Luthra Parameshwaran Ramanan Chad E. Mire Carla Weisend Yoshimi Tsuda and 7 more

10.1016/j.chom.2013.06.010 article EN publisher-specific-oa Cell Host & Microbe 2013-07-01

An Intrinsically Disordered Peptide from Ebola Virus VP35 Controls Viral RNA Synthesis by Modulating Nucleoprotein-RNA Interactions

Daisy W. Leung Dominika Borek Priya Luthra Jennifer M. Binning Manu Anantpadma and 11 more

During viral RNA synthesis, Ebola virus (EBOV) nucleoprotein (NP) alternates between an RNA-template-bound form and a template-free form to provide the polymerase access to the template. In addition, newly synthesized NP must be prevented from indiscriminately binding to noncognate RNAs. Here, we investigate the molecular bases for these critical processes. We identify an intrinsically disordered peptide derived from EBOV VP35 (NPBP, residues 20-48) that binds with high affinity and specificity, inhibits oligomerization,...

10.1016/j.celrep.2015.03.034 article EN cc-by Cell Reports 2015-04-01

Rosetta

Yuan Zhou Udit Gupta Steve Dai Ritchie Zhao Nitish Srivastava and 7 more

Modern high-level synthesis (HLS) tools greatly reduce the turn-around time of designing and implementing complex FPGA-based accelerators. They also expose various optimization opportunities, which cannot be easily explored at the register-transfer level. With the increasing adoption of the HLS design methodology and continued advances in optimization, there is a growing need for realistic benchmarks to (1) facilitate comparisons between tools, (2) evaluate and stress-test new techniques, and (3) establish meaningful...

10.1145/3174243.3174255 article EN 2018-02-15

Structural basis for Marburg virus VP35–mediated immune evasion mechanisms

Parameshwaran Ramanan Megan R. Edwards Reed S. Shabman Daisy W. Leung Ariel Endlich-Frazier and 6 more

Filoviruses, marburgvirus (MARV) and ebolavirus (EBOV), are causative agents of highly lethal hemorrhagic fever in humans. MARV and EBOV share a common genome organization but show important differences in replication complex formation, cell entry, host tropism, transcriptional regulation, and immune evasion. Multifunctional filoviral viral protein (VP) 35 proteins inhibit innate immune responses. Recent studies suggest that double-stranded (ds)RNA sequestration is a potential mechanism that allows VP35 to...

10.1073/pnas.1213559109 article EN Proceedings of the National Academy of Sciences 2012-11-26

Development of RNA Aptamers Targeting Ebola Virus VP35

Jennifer M. Binning Tianjiao Wang Priya Luthra Reed S. Shabman Dominika Borek and 5 more

Viral protein 35 (VP35), encoded by filoviruses, is a multifunctional dsRNA binding protein that plays important roles in viral replication, innate immune evasion, and pathogenesis. The nature of these proteins also presents opportunities to develop countermeasures that target distinct functional regions. However, validation and the establishment of therapeutic approaches toward such proteins, particularly for nonenzymatic targets, are often challenging. Our previous work on filoviral VP35 defined conserved...

10.1021/bi400704d article EN Biochemistry 2013-09-26

Differential Regulation of Interferon Responses by Ebola and Marburg Virus VP35 Proteins

Megan R. Edwards Gai Liu Chad E. Mire Suhas Sureshchandra Priya Luthra and 7 more

Suppression of innate immune responses during filoviral infection contributes to disease severity. Ebola (EBOV) and Marburg (MARV) viruses each encode a VP35 protein that suppresses RIG-I-like receptor signaling and interferon-α/β (IFN-α/β) production by several mechanisms, including direct binding to double stranded RNA (dsRNA). Here, we demonstrate that in cell culture, MARV infection results in greater upregulation of IFN as compared to EBOV infection. This correlates with differences in the efficiencies by which VP35s...

10.1016/j.celrep.2016.01.049 article EN cc-by-nc-nd Cell Reports 2016-02-01

The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips

Scott Davidson Shaolin Xie Christopher Torng Khalid Al-Hawai Austin Rovinski and 15 more

Rapidly emerging workloads require rapidly developed chips. The Celerity 16-nm open-source SoC was implemented in nine months using an architectural trifecta to minimize development time: a general-purpose tier comprised of Linux-capable RISC-V cores, a massively parallel tiled manycore array that can be scaled to arbitrary sizes, and a specialization tier that uses high-level synthesis (HLS) to create an algorithmic neural-network accelerator. These tiers are tied together with an efficient heterogeneous remote...

10.1109/mm.2018.022071133 article EN IEEE Micro 2018-03-01

A Parallel Bandit-Based Approach for Autotuning FPGA Compilation

Chang Xu Gai Liu Ritchie Zhao Stephen Yang Guojie Luo and 1 more

Mainstream FPGA CAD tools provide an extensive collection of optimization options that have a significant impact on the quality of the final design. These options together create an enormous and complex design space that cannot effectively be explored by human effort alone. Instead, we propose to search this parameter space using autotuning, which is a popular approach in the compiler domain. Specifically, we study the effectiveness of applying the multi-armed bandit (MAB) technique to automatically tune the options for a complete FPGA compilation flow from RTL...

10.1145/3020078.3021747 article EN 2017-02-02
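
To illustrate the multi-armed bandit idea mentioned in the abstract above, here is a minimal sketch using the textbook UCB1 rule. It is not the paper's method: the parallel aspect is not modeled, and the option names (`config_a`, ...), the `mock_quality` function, and its reward values are all hypothetical stand-ins for running a real CAD flow and scoring the result.

```python
import math
import random

# Hypothetical mean quality scores in [0, 1] for three tool configurations.
# In a real autotuner these would be unknown and measured by running the tools.
MOCK_MEANS = {"config_a": 0.3, "config_b": 0.5, "config_c": 0.8}


def mock_quality(arm, rng):
    """Stand-in for one compilation run: returns a noisy quality score."""
    return MOCK_MEANS[arm] + rng.uniform(-0.05, 0.05)


def ucb1_autotune(arms, evaluate, rounds, seed=0):
    """Choose among `arms` with the UCB1 rule for a fixed number of rounds."""
    rng = random.Random(seed)
    counts = {a: 0 for a in arms}
    totals = {a: 0.0 for a in arms}
    for t in range(1, rounds + 1):
        untried = [a for a in arms if counts[a] == 0]
        if untried:
            # Try every arm once before using the confidence bound.
            arm = untried[0]
        else:
            # Average reward plus an exploration bonus that shrinks with use.
            arm = max(
                arms,
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        totals[arm] += evaluate(arm, rng)
        counts[arm] += 1
    best = max(arms, key=lambda a: totals[a] / counts[a])
    return best, counts


if __name__ == "__main__":
    best, counts = ucb1_autotune(list(MOCK_MEANS), mock_quality, rounds=200)
    print("best configuration:", best)
    print("evaluations per configuration:", counts)
```

The point of the bandit formulation is that the evaluation budget shifts toward configurations that have scored well so far, while still occasionally revisiting the others.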

Dynamic Hazard Resolution for Pipelining Irregular Loops in High-Level Synthesis

Steve Dai Ritchie Zhao Gai Liu S Srinath Udit Gupta and 2 more

The current pipelining approach in high-level synthesis (HLS) achieves high performance for applications with regular and statically analyzable memory access patterns. However, it cannot effectively handle infrequent data-dependent structural and data hazards because they are conservatively assumed to always occur in the synthesized pipeline. To enable high-throughput pipelining of irregular loops, we study the problem of augmenting HLS with application-specific dynamic hazard resolution, and examine its implications on...

10.1145/3020078.3021754 article EN 2017-02-02

ElasticFlow: A complexity-effective approach for pipelining irregular loop nests

Mingxing Tan Gai Liu Ritchie Zhao Steve Dai Zhiru Zhang

Modern high-level synthesis (HLS) tools commonly employ pipelining to achieve efficient loop acceleration by overlapping the execution of successive loop iterations. However, existing HLS techniques provide inadequate support for pipelining irregular loop nests that contain dynamic-bound inner loops, where unrolling is either very expensive or not even applicable. To overcome this major limitation, we propose ElasticFlow, a novel architectural approach capable of dynamically distributing loops to an array of processing...

10.1109/iccad.2015.7372553 article EN 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2015-11-01

ElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests

Mingxing Tan Gai Liu Ritchie Zhao Steve Dai Zhiru Zhang

Modern high-level synthesis (HLS) tools commonly employ pipelining to achieve efficient loop acceleration by overlapping the execution of successive loop iterations. However, existing HLS techniques provide inadequate support for pipelining irregular loop nests that contain dynamic-bound inner loops, where unrolling is either very expensive or not even applicable. To overcome this major limitation, we propose ElasticFlow, a novel architectural approach capable of dynamically distributing loops to an array of processing...

10.5555/2840819.2840831 article EN International Conference on Computer Aided Design 2015-11-02

Architectural Specialization for Inter-Iteration Loop Dependence Patterns

S Srinath Berkin Ilbeyi Mingxing Tan Gai Liu Zhiru Zhang and 1 more

Hardware specialization is an increasingly common technique to enable improved performance and energy efficiency in spite of the diminished benefits of technology scaling. This paper proposes a new approach called explicit loop specialization (XLOOPS) based on the idea of elegantly encoding inter-iteration loop dependence patterns in the instruction set. XLOOPS supports a variety of data- and control-dependence patterns for both single and nested loops. The hardware/software abstraction requires only lightweight changes to a general-purpose compiler...

10.1109/micro.2014.31 article EN 2014-12-01

A Parallelized Iterative Improvement Approach to Area Optimization for LUT-Based Technology Mapping

Gai Liu Zhiru Zhang

Modern FPGA synthesis tools typically apply a predetermined sequence of logic optimizations on the input logic network before carrying out technology mapping. While the "known recipes" of logic transformations often lead to improved mapping results, there remains a nontrivial gap between the quality metrics driving the pre-mapping logic optimizations and those targeted by the actual technology mapping. Needless to mention, such miscorrelations would eventually result in suboptimal results. In this paper we propose PIMap, which couples logic transformations and technology mapping under an iterative improvement...

10.1145/3020078.3021735 article EN 2017-02-02

Statistically certified approximate logic synthesis

Gai Liu Zhiru Zhang

Approximate logic synthesis generates inexact implementations of logic functions in exchange for better design qualities such as area, timing, and power consumption. However, the error behavior of approximate circuits (e.g., error rate or error magnitude) depends heavily on the specific technique as well as the input vectors, hindering end users from confidently adopting approximate designs. In this paper, we propose a statistically certified approximate logic synthesis framework using techniques from stochastic optimization, and integrate it into a state-of-the-art...

10.1109/iccad.2017.8203798 article EN 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2017-11-01

A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation

Steve Dai Gai Liu Zhiru Zhang

Despite increasing adoption of high-level synthesis (HLS) for its design productivity advantage, success in achieving high quality-of-results out-of-the-box is often hindered by the inexactness of common HLS optimizations. In particular, while scheduling forms the algorithmic core of HLS technology, current scheduling algorithms rely heavily on fundamentally inexact heuristics that make ad hoc local decisions and cannot accurately and globally optimize over a rich set of constraints. To tackle this challenge, we propose...

10.1145/3174243.3174268 article EN 2018-02-15

Enabling adaptive loop pipelining in high-level synthesis

Steve Dai Gai Liu Ritchie Zhao Zhiru Zhang

Loop pipelining is an important optimization in high-level synthesis (HLS) because it allows successive loop iterations to be overlapped during execution. While the current HLS pipelining approach achieves high performance for loops with regular and statically analyzable program patterns, it remains challenging to pipeline loops with irregular memory accesses, dependence, and unbalanced workload. The lack of support for dynamic behaviors results in conservatively synthesized pipelines that sacrifice performance for maintaining presumed regularity. In...

10.1109/acssc.2017.8335152 article EN 2017-10-01
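
The benefit of overlapping loop iterations, as described in the abstract above, can be made concrete with the standard cycle-count estimate for a pipelined loop. This is a minimal sketch under idealized assumptions (fixed pipeline depth, a constant initiation interval, no stalls). The function and parameter names are illustrative and do not come from the paper or any HLS tool.

```python
def sequential_cycles(trip_count, depth):
    """Cycles when each iteration must finish before the next one starts."""
    return trip_count * depth


def pipelined_cycles(trip_count, depth, ii=1):
    """Cycles when a new iteration starts every `ii` cycles.

    The first iteration takes `depth` cycles to complete; each remaining
    iteration completes `ii` cycles after the previous one.
    """
    if trip_count == 0:
        return 0
    return depth + ii * (trip_count - 1)


if __name__ == "__main__":
    n, depth = 100, 4
    print("sequential:", sequential_cycles(n, depth))
    print("pipelined, II=1:", pipelined_cycles(n, depth, ii=1))
    print("pipelined, II=2:", pipelined_cycles(n, depth, ii=2))
```

A conservatively synthesized pipeline corresponds to a larger `ii` than the loop actually needs on most iterations, which is why handling irregular behavior dynamically can recover throughput.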

Improving high-level synthesis with decoupled data structure optimization

Ritchie Zhao Gai Liu S Srinath Christopher Batten Zhiru Zhang

Existing high-level synthesis (HLS) tools are mostly effective on algorithm-dominated programs that only use primitive data structures such as fixed size arrays and queues. However, many widely used data structures such as priority queues, heaps, and trees feature complex member methods with data-dependent work and irregular memory access patterns. These methods can be inlined to their call sites, but this does not address the aforementioned issues and may further complicate conventional HLS optimizations, resulting in a...

10.1145/2897937.2898030 article EN 2016-05-25

Statistically certified approximate logic synthesis

Gai Liu Zhiru Zhang

Approximate logic synthesis generates inexact implementations of logic functions in exchange for better design qualities such as area, timing, and power consumption. However, the error behavior of approximate circuits (e.g., error rate or error magnitude) depends heavily on the specific technique as well as the input vectors, hindering end users from confidently adopting approximate designs. In this paper, we propose a statistically certified approximate logic synthesis framework using techniques from stochastic optimization, and integrate it into a state-of-the-art...

10.5555/3199700.3199746 article EN International Conference on Computer Aided Design 2017-11-13

Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests

Gai Liu Mingxing Tan Steve Dai Ritchie Zhao Zhiru Zhang

Modern high-level synthesis (HLS) tools commonly employ pipelining to achieve efficient loop acceleration by overlapping the execution of successive loop iterations. While existing HLS pipelining techniques obtain good performance with low complexity for regular loop nests, they provide inadequate support for effectively synthesizing irregular loop nests. For loop nests with dynamic-bound inner loops, current techniques require unrolling of the inner loops, which is either very expensive in resource or even inapplicable due to dynamic bounds. To address this major...

10.1109/tcad.2017.2664067 article EN publisher-specific-oa IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2017-02-07

CASA

Gai Liu Ye Tao Mingxing Tan Zhiru Zhang

Speculative adders divide addition into subgroups and execute them in parallel for higher execution speed and energy efficiency, but at the risk of generating incorrect results. In this paper, we propose a lightweight correlation-aware speculative addition (CASA) method, which exploits the correlation between input data and carry-in values observed in real-life benchmarks to improve the accuracy of speculative adders. Experimental results show that applying the CASA method leads to a significant reduction in error rate with only marginal...

10.1145/2627369.2627635 article EN Proceedings of the International Symposium on Low Power Electronics and Design 2014-08-01
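
To show why speculative adders can produce incorrect results, as the abstract above notes, here is a minimal software model of a generic block-based speculative adder. It is not the CASA method: each block simply assumes a carry-in of 0, whereas CASA predicts carry-in from input correlation. The bit widths and function names are assumptions chosen for illustration.

```python
def exact_add(a, b, width=16):
    """Reference result: ordinary addition truncated to `width` bits."""
    return (a + b) & ((1 << width) - 1)


def speculative_add(a, b, width=16, block=4):
    """Add `block`-bit slices independently, speculating carry-in = 0.

    Because no carry crosses a block boundary, all blocks could be computed
    in parallel in hardware; the result is wrong whenever a real carry occurs.
    """
    mask = (1 << block) - 1
    result = 0
    for lo in range(0, width, block):
        partial = ((a >> lo) & mask) + ((b >> lo) & mask)
        result |= (partial & mask) << lo  # drop the carry-out of this block
    return result


def mismatch_rate(width=8, block=4):
    """Fraction of all input pairs for which speculation gives a wrong sum."""
    total = 1 << (2 * width)
    wrong = sum(
        speculative_add(a, b, width, block) != exact_add(a, b, width)
        for a in range(1 << width)
        for b in range(1 << width)
    )
    return wrong / total


if __name__ == "__main__":
    print(hex(speculative_add(0x1111, 0x2222)), hex(exact_add(0x1111, 0x2222)))
    print(hex(speculative_add(0x000F, 0x0001)), hex(exact_add(0x000F, 0x0001)))
    print("mismatch rate, 8-bit, uniform inputs:", mismatch_rate())
```

Uniformly random inputs make a fixed carry-in guess wrong quite often, which suggests why a predictor tuned to the statistics of real input data could reduce the error rate.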

PIMap

Gai Liu Zhiru Zhang

Modern FPGA synthesis tools typically apply a predetermined sequence of logic optimizations on the input logic network before carrying out technology mapping. While the “known recipes” of logic transformations often lead to improved mapping results, there remains a nontrivial gap between the quality metrics driving the pre-mapping logic optimizations and those targeted by the actual technology mapping. Needless to mention, such miscorrelations would eventually result in suboptimal results. In this article, we propose PIMap, which couples logic transformations and technology mapping under an iterative...

10.1145/3268344 article EN ACM Transactions on Reconfigurable Technology and Systems 2018-12-31

Rapid Generation of High-Quality RISC-V Processors from Functional Instruction Set Specifications

Gai Liu Joseph Primmer Zhiru Zhang

The increasing popularity of compute acceleration for emerging domains such as artificial intelligence and computer vision has led to the growing need for domain-specific accelerators, often implemented as specialized processors that execute a set of domain-optimized instructions. The ability to rapidly explore (1) various possibilities of the customized instruction set, and (2) its corresponding micro-architectural features is critical to achieve the best quality-of-results (QoRs). However, this ability is frequently hindered by the manual...

10.1145/3316781.3317890 article EN 2019-05-23

A reconfigurable analog substrate for highly efficient maximum flow computation

Gai Liu Zhiru Zhang

We present the design and analysis of a novel analog reconfigurable substrate that enables fast and efficient computation of maximum flow on directed graphs. The substrate is composed of memristors and standard circuit components, where the on/off states of the crossbar switches encode the graph topology. We show that upon convergence, the steady-state voltages in the substrate capture the solution to the maximum flow problem. We also provide techniques to minimize the impacts of variability and non-ideal components on the solution quality, enabling practical implementation of the proposed substrate. Performance...

10.1145/2744769.2744781 article EN 2015-06-02