Shiyang Chen

ORCID: 0000-0003-2626-7865
Research Areas
  • Advanced Graph Neural Networks
  • Topic Modeling
  • Computational Drug Discovery Methods
  • Machine Learning in Materials Science
  • Multimodal Machine Learning Applications
  • Natural Language Processing Techniques
  • Advanced Neural Network Applications
  • Analytical Chemistry and Chromatography
  • Graph Theory and Algorithms
  • Neural Networks and Applications
  • Regional Economic and Spatial Analysis
  • Machine Learning in Bioinformatics
  • Topological and Geometric Data Analysis
  • Electrospun Nanofibers in Biomedical Applications
  • Drug Solubility and Delivery Systems
  • EEG and Brain-Computer Interfaces
  • Silk-based biomaterials and applications
  • Caching and Content Delivery
  • Functional Brain Connectivity Studies
  • Neural dynamics and brain function
  • Tensor decomposition and applications
  • Intelligent Tutoring Systems and Adaptive Learning
  • Carbon and Quantum Dots Applications
  • Signaling Pathways in Disease
  • Globalization, Economics, and Policies

Affiliations

Guangxi University
2024

Rutgers, The State University of New Jersey
2023-2024

China Institute of Finance and Capital Markets
2024

Tiangong University
2023-2024

Stevens Institute of Technology
2021-2023

China Earthquake Administration
2022

Tokyo University of Science
2020-2021

Georgia Institute of Technology
2017

Emory University
2017

Beihang University
2015-2016

Publications

AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (1) tackle new tasks, like protein–ligand complex structure prediction, (2) investigate the process by which the model learns, and (3) assess the model's capacity to generalize to unseen regions of fold space. Here we report OpenFold, a fast, memory-efficient and trainable implementation of AlphaFold2. We...

10.1038/s41592-024-02272-z article EN cc-by Nature Methods 2024-05-14

Transformers are considered one of the most important deep learning models since 2018, in part because they establish state-of-the-art (SOTA) records and could potentially replace existing Deep Neural Networks (DNNs). Despite the remarkable triumphs, the prolonged turnaround time of Transformer models is a widely recognized roadblock. The variety of sequence lengths imposes additional computing overhead, as inputs need to be zero-padded to the maximum sentence length in the batch to accommodate parallel computing platforms. This paper...

10.1145/3489517.3530585 article EN Proceedings of the 59th ACM/IEEE Design Automation Conference 2022-07-10
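
The zero-padding overhead the entry above refers to is easy to quantify. A minimal sketch, assuming padding to the longest sequence in the batch (the function name and example lengths are illustrative, not from the paper):

```python
import numpy as np

def padding_waste(seq_lens):
    """Fraction of batch compute spent on zero-padding when every
    sequence is padded to the longest one in the batch."""
    seq_lens = np.asarray(seq_lens)
    processed = seq_lens.size * seq_lens.max()  # tokens after padding
    useful = seq_lens.sum()                     # tokens with real data
    return 1.0 - useful / processed

# A batch with lengths 12, 48 and 128 wastes about half of its compute.
print(padding_waste([12, 48, 128]))  # ~0.51
```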

The expressive power of neural networks in modelling non-trivial distributions can in principle be exploited to bypass topological freezing and critical slowing down in simulations of lattice field theories. Some popular approaches are unable to sample the topology correctly, which may lead to some classes of configurations not being generated. In this contribution, we present a novel generative method inspired by a model previously introduced in the ML community (GFlowNets). We demonstrate its efficiency at exploring...

10.48550/arxiv.2502.02127 preprint EN arXiv (Cornell University) 2025-02-04

Large Multimodal Models (LMMs) exhibit major shortfalls when interpreting images and, by some measures, have poorer spatial cognition than small children or animals. Despite this, they attain high scores on many popular visual benchmarks, with headroom rapidly eroded by an ongoing surge of model progress. To address this, there is a pressing need for difficult benchmarks that remain relevant for longer. We take this idea to its limit by introducing ZeroBench, a lightweight visual reasoning benchmark that is entirely...

10.48550/arxiv.2502.09696 preprint EN arXiv (Cornell University) 2025-02-13

Six-bit quantization (FP6) can effectively reduce the size of large language models (LLMs) and preserve model quality consistently across varied applications. However, existing systems do not provide Tensor Core support for FP6 and struggle to achieve practical performance improvements during LLM inference. It is challenging to support FP6 on GPUs due to (1) unfriendly memory access of model weights with irregular bit-width and (2) the high runtime overhead of weight de-quantization. To address these problems, we propose TC-FPx,...

10.48550/arxiv.2401.14112 preprint EN other-oa arXiv (Cornell University) 2024-01-01
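
The "irregular bit-width" problem named in the abstract above shows up even in pure software: unlike 4-bit or 8-bit values, a 6-bit code regularly straddles byte boundaries. A hedged sketch of the packing/unpacking logic (plain Python, not the paper's GPU kernels; de-quantization to floats is omitted):

```python
def pack_fp6(values):
    """Pack 6-bit codes (0..63) into a contiguous byte stream."""
    buf, acc, nbits = bytearray(), 0, 0
    for v in values:
        acc = (acc << 6) | (v & 0x3F)
        nbits += 6
        while nbits >= 8:          # flush full bytes as they fill up
            nbits -= 8
            buf.append((acc >> nbits) & 0xFF)
    if nbits:                      # left-align any trailing bits
        buf.append((acc << (8 - nbits)) & 0xFF)
    return bytes(buf)

def unpack_fp6(buf, count):
    """Recover `count` 6-bit codes; note how reads cross byte
    boundaries, which is what makes the memory access unfriendly."""
    out, acc, nbits = [], 0, 0
    for byte in buf:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 6 and len(out) < count:
            nbits -= 6
            out.append((acc >> nbits) & 0x3F)
    return out

codes = [1, 63, 17, 42]
assert unpack_fp6(pack_fp6(codes), len(codes)) == codes
```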

Shaoyi Huang, Dongkuan Xu, Ian Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.16 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

The solubility of a drug is higher when it is in an amorphous form than in a crystalline form. To enhance the solubility of ibuprofen (IBU), a poorly water-soluble drug, we attempted to adsorb IBU onto spherical porous calcium silicate (Florite® PS300, PS300) in two ways: the evaporation (EV) and sealed heating (SH) methods. The crystallinity of the samples was evaluated using powder X-ray diffraction analysis (PXRD) and differential scanning calorimetry (DSC). The molecular interaction between IBU and PS300 was evaluated with FTIR. In addition, the dissolution...

10.3390/pharmaceutics13060767 article EN cc-by Pharmaceutics 2021-05-21

Graph learning is becoming increasingly popular due to its superior performance in tackling many grand challenges. While quantization is widely used to accelerate Graph Neural Network (GNN) computation, quantized training faces remarkable roadblocks. Current quantized GNN training systems often experience longer training time than their full-precision counterparts for two reasons: (i) addressing the accuracy challenge leads to excessive overhead, and (ii) the optimization potential exposed by quantization is not adequately leveraged. This paper...

10.1145/3581784.3607037 article EN 2023-11-11
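
For context on the accuracy/overhead tension described above: quantized GNN training typically rounds features or weights to low precision and de-quantizes around each aggregation step. A generic int8 sketch under that assumption (not the paper's system; names are illustrative):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization of node features, the
    kind of step a quantized GNN trainer applies before aggregation."""
    scale = max(float(np.abs(x).max()) / 127.0, 1e-12)
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to float32 before the next dense layer."""
    return q.astype(np.float32) * scale

feats = np.random.randn(5, 8).astype(np.float32)
q, scale = quantize_int8(feats)
print(np.abs(dequantize(q, scale) - feats).max())  # quantization error
```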

Many real-world networks are characterized by being temporal and dynamic, wherein the information signifies changes in connections, such as the addition or removal of links between nodes. Employing random walks on these networks is a crucial technique for understanding the structural evolution of graphs over time. However, existing state-of-the-art sampling methods are designed for traditional static graphs; as such, they struggle to efficiently handle the dynamic aspects of temporal networks. This deficiency can be attributed to several...

10.1145/3652604 article EN ACM Transactions on Architecture and Code Optimization 2024-03-14
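
As a point of reference for the sampling problem above: a temporal random walk differs from a static one in that it may only traverse edges in time order. A minimal sketch, assuming a time-respecting constraint on an edge list (this is not the paper's sampler):

```python
import random
from collections import defaultdict

def temporal_random_walk(edges, start, length, seed=0):
    """Walk over a temporal edge list [(u, v, t), ...], only following
    edges whose timestamps do not decrease along the walk."""
    rng = random.Random(seed)
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((v, t))
    walk, node, now = [start], start, float("-inf")
    for _ in range(length):
        candidates = [(v, t) for v, t in adj[node] if t >= now]
        if not candidates:        # dead end: no time-respecting edge
            break
        node, now = rng.choice(candidates)
        walk.append(node)
    return walk

edges = [(0, 1, 1), (1, 2, 2), (1, 3, 5), (2, 0, 3)]
print(temporal_random_walk(edges, start=0, length=4))
```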

Transformer-based deep learning models have become a ubiquitous vehicle to drive a variety of Natural Language Processing (NLP) related tasks beyond their accuracy ceiling. However, these models also suffer from two pronounced challenges, that is, gigantic model size and prolonged turnaround time. To this end, we introduce E.T., which rE-thinks self-attention computation for Transformer models on GPUs with the following contributions: First, we introduce a novel self-attention architecture, which encompasses tailored operators...

10.1145/3458817.3476138 article EN 2021-10-21
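
For readers unfamiliar with the computation E.T. targets, the baseline self-attention operator is compact. A NumPy reference of the textbook formulation (single head, no masking; not the paper's fused GPU operators):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence x of shape
    (n_tokens, d_model)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

n, d = 4, 8
x = np.random.randn(n, d)
out = self_attention(x, *(np.random.randn(d, d) for _ in range(3)))
print(out.shape)  # (4, 8)
```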

This work considers the task of representation learning on the attributed relational graph (ARG). Both nodes and edges in an ARG are associated with attributes/features, allowing ARGs to encode rich structural information widely observed in real applications. Existing graph neural networks offer limited ability to capture complex interactions within local structural contexts, which hinders them from taking advantage of the expression power of ARGs. We propose the motif convolution module (MCM), a new motif-based technique to better...

10.3390/informatics10010008 article EN cc-by Informatics 2023-01-11
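
The "local structural context" idea behind motif-based methods can be illustrated with the simplest motif, the triangle. A toy sketch that attaches a per-node triangle count as a structural feature (illustrative only, far simpler than MCM):

```python
import itertools
import numpy as np

def triangle_motif_counts(adj):
    """Count, per node, how many triangle motifs it participates in;
    such local motif statistics are one simple way to inject
    structural context into node features."""
    n = adj.shape[0]
    counts = np.zeros(n, dtype=int)
    for i, j, k in itertools.combinations(range(n), 3):
        if adj[i, j] and adj[j, k] and adj[i, k]:
            counts[[i, j, k]] += 1
    return counts

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 1],
                [1, 1, 0, 0],
                [0, 1, 0, 0]])
print(triangle_motif_counts(adj))  # [1 1 1 0]: nodes 0,1,2 form one triangle
```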

Summary: The traditional tracking classification algorithm has been widely applied to target tracking in wireless sensor networks. In this paper, focusing on the tracking accuracy of wireless sensor networks, we propose an improved threshold factor tracking classification algorithm. The algorithm extracts the motion model according to the intrinsic properties of the target. It updates the iterative center in real time according to the state of the moving target and timely filters out weakly correlated or uncorrelated data. In order to show that the proposed algorithm is more effective, we compare it with algorithms based on Euclidean distance and comprehensive...

10.1002/dac.3164 article EN International Journal of Communication Systems 2016-07-26

Molecular similarity search has been widely used in drug discovery to identify structurally similar compounds from large molecular databases rapidly. With the increasing size of chemical libraries, there is growing interest in the efficient acceleration of large-scale similarity search. Existing works mainly focus on CPU and GPU platforms to accelerate the computation of the Tanimoto coefficient for measuring the pairwise similarity between different molecular fingerprints. In this paper, we propose and optimize an FPGA-based accelerator design for exhaustive and approximate...

10.1109/iccad51958.2021.9643528 article EN 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2021-11-01
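
The Tanimoto coefficient at the heart of this work is a one-line set similarity over fingerprint bits: |A ∩ B| / |A ∪ B|. A plain-Python reference of the definition (the FPGA design in the paper would compute this with hardware popcounts over wide bit vectors):

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient between two bit-vector fingerprints,
    stored here as Python ints: |A & B| / |A | B|."""
    inter = bin(fp_a & fp_b).count("1")
    union = bin(fp_a | fp_b).count("1")
    return inter / union if union else 1.0

# Two toy fingerprints sharing 2 set bits: 2 / (4 + 4 - 2) = 1/3.
print(tanimoto(0b1010110000000000, 0b1010000000000011))
```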

Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit. However, under the trending pretrain-and-finetune paradigm, we postulate a counter-traditional hypothesis, that is: pruning increases the risk of overfitting when performed at the fine-tuning phase. In this paper, we aim to address the overfitting problem and improve pruning performance via progressive knowledge distillation with error-bound properties. We show for the first time that reducing the risk of overfitting can...

10.48550/arxiv.2110.08190 preprint EN other-oa arXiv (Cornell University) 2021-01-01
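
To ground the distillation component mentioned above: generic knowledge distillation matches the pruned student's temperature-softened predictions to the dense teacher's. A sketch of that standard objective (the paper's progressive, error-bounded variant adds machinery not shown here):

```python
import numpy as np

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened logits, scaled
    by T^2 as is conventional in knowledge distillation."""
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(teacher_logits / T)  # soft teacher targets
    q = softmax(student_logits / T)  # student predictions
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

s = np.array([[2.0, 0.5, -1.0]])
t = np.array([[1.5, 0.8, -0.9]])
print(distillation_loss(s, t))
```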

10.4271/2015-01-2191 article EN SAE technical papers on CD-ROM/SAE technical paper series 2015-06-15

Although Transformer-based deep learning models have been widely used in many natural language processing (NLP) tasks as well as computer vision, they suffer from gigantic model size and long latency. Network pruning can reduce the computational cost and model size. However, existing works mainly focus on irregular (sparse) pruning, which often causes irregular computations and extra indices per remained weight. In this work, we propose a Tensor-core inspired hierarchical model compression method to push...

10.1145/3453688.3461740 article EN Proceedings of the Great Lakes Symposium on VLSI 2022 2021-06-18
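
The contrast with irregular pruning drawn above can be made concrete: block-wise pruning removes whole tiles, so the surviving weights keep a dense, Tensor-core-friendly layout with no per-weight indices. An illustrative magnitude-based sketch (tile size, scoring, and keep ratio are assumptions, not the paper's exact scheme):

```python
import numpy as np

def block_prune(w, block=(4, 4), keep_ratio=0.5):
    """Zero out entire tiles of a weight matrix, keeping the tiles
    with the largest L1 magnitude. Assumes shape divisible by block."""
    rows, cols = w.shape
    br, bc = block
    tiles = w.reshape(rows // br, br, cols // bc, bc)
    norms = np.abs(tiles).sum(axis=(1, 3))     # one score per tile
    k = int(norms.size * keep_ratio)
    thresh = np.sort(norms, axis=None)[-k]     # keep the top-k tiles
    mask = (norms >= thresh)[:, None, :, None]
    return (tiles * mask).reshape(rows, cols)

w = np.random.randn(8, 8)
pruned = block_prune(w)
print((pruned != 0).mean())  # ~0.5 of weights survive, in whole tiles
```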