NFDI4DS | UHH-SEMS - Publication Details

Yu Cheng

ORCID: 0000-0003-4258-0499

About

Contact & Profiles

A5100622629

Research Areas

Topic Modeling
Natural Language Processing Techniques
Adversarial Robustness in Machine Learning
Dark Matter and Cosmic Phenomena
Cosmology and Gravitation Theories
Particle physics theoretical and experimental studies
Speech and dialogue systems
Speech and Audio Processing
Computational Physics and Python Applications
Anomaly Detection Techniques and Applications
Scientific Research and Discoveries
Sentiment Analysis and Opinion Mining
Speech Recognition and Synthesis
Text Readability and Simplification
Advanced Graph Neural Networks
Recommender Systems and Techniques
Bacillus and Francisella bacterial research
Neural Networks and Applications
Psychological and Temporal Perspectives Research
Machine Learning in Healthcare
Neuroscience and Music Perception
Artificial Intelligence in Games
Relativity and Gravitational Theory
Quantum Mechanics and Applications
Adaptive Dynamic Programming Control

Ningbo University
2022-2025

Institute of Psychology, Chinese Academy of Sciences
2025

Microsoft Research (United Kingdom)
2022-2023

Shanghai Jiao Tong University
2023

Microsoft (Finland)
2023

Chinese University of Hong Kong
2023

Peking University
2022

Qingdao University
2022

Alibaba Group (United States)
2022

Suqian University
2022

Patient Knowledge Distillation for BERT Model Compression

OPENALEX - Publications

Siqi Sun Yu Cheng Zhe Gan Jun Liu

Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks. However, the high demand for computing resources in training such models hinders their application in practice. In order to alleviate this resource hunger in large-scale model training, we propose a Patient Knowledge Distillation approach to compress an original large model (teacher) into an equally-effective lightweight shallow network (student). Different from previous knowledge distillation methods, which only use...

10.48550/arxiv.1908.09355 preprint EN other-oa arXiv (Cornell University) 2019-01-01

SemAttack: Natural Textual Attacks via Different Semantic Spaces

Boxin Wang Chejian Xu Xiangyu Liu Yu Cheng Bo Li

Recent studies show that pre-trained language models (LMs) are vulnerable to textual adversarial attacks. However, existing attack methods either suffer from low attack success rates or fail to search efficiently in the exponentially large perturbation space. We propose an efficient and effective framework, SemAttack, to generate natural adversarial text by constructing different semantic perturbation functions. In particular, SemAttack optimizes the generated perturbations constrained on generic semantic spaces, including the typo space, the knowledge space...

10.18653/v1/2022.findings-naacl.14 article EN cc-by Findings of the Association for Computational Linguistics: NAACL 2022 2022-01-01

AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

Qingru Zhang Minshuo Chen Alexander Bukharin Pengcheng He Yu Cheng and 2 more

Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP. However, common practice fine-tunes all of the parameters in a pre-trained model, which becomes prohibitive when a large number of downstream tasks are present. Therefore, many fine-tuning methods are proposed to learn incremental updates of pre-trained weights in a parameter-efficient way, e.g., low-rank increments. These methods often evenly distribute the budget of incremental updates across all pre-trained weight matrices, and overlook the varying importance of different weight parameters. As a consequence,...

10.48550/arxiv.2303.10512 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Dark Matter Annihilation via Breit-Wigner Enhancement with Heavier Mediator

Yu Cheng Shao-Feng Ge Jie Sheng Tsutomu T. Yanagida

10.1016/j.physletb.2025.139290 article EN cc-by Physics Letters B 2025-01-30

LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid

Weigao Sun Disen Lan Yiran Zhong Xiaoye Qu Yu Cheng

Linear sequence modeling approaches, such as linear attention, provide advantages like linear-time training and constant-memory inference over sequence lengths. However, existing sequence parallelism (SP) methods are either not optimized for the right-product-first feature of linear attention or use a ring-style communication strategy, which results in lower computation parallelism and limits their scalability for longer sequences in distributed systems. In this paper, we introduce LASP-2, a new SP method to enhance both communication and computation parallelism when...

10.48550/arxiv.2502.07563 preprint EN arXiv (Cornell University) 2025-02-11

The impact of timing perception strategy on intertemporal decision-making in older adults: the role of subjective time perception

Guanglin Li Yifan Chen Xiao-Ming Lu Yu Cheng Qing Jia and 2 more

With the global aging population, an increasing number of researchers are interested in the intertemporal choice issues faced by older adults. Previous studies have examined how age-related differences in time perception affect intertemporal choices. However, the impact of timing perception strategy on intertemporal decision-making among older adults remains unclear. This study was designed to examine how timing perception strategy influences intertemporal decision-making among older adults while also exploring the possible mechanisms. We manipulated timing perception strategy preferences through priming in two experiments (Experiment 1, n = 160; Experiment 2,...

10.1080/13825585.2025.2459626 article EN Aging Neuropsychology and Cognition 2025-02-21

A Survey of Reasoning with Foundation Models

Jiankai Sun Chuanyang Zheng Enze Xie Zhengying Liu Ruihang Chu and 29 more

https://github.com/reasoning-survey/Awesome-Reasoning-Foundation-Models Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, there is a growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models...

10.31219/osf.io/ac4sp preprint EN 2023-12-13

Right-handed neutrino dark matter with forbidden annihilation

Yu Cheng Shao-Feng Ge Jie Sheng Tsutomu T. Yanagida

The seesaw mechanism with three right-handed neutrinos has one as a well-motivated dark matter candidate if stable, and the other two can explain the baryon asymmetry via the thermal leptogenesis scenario. We explore the possibility of introducing additional particles to make the right-handed neutrino dark matter in equilibrium and freeze out through a forbidden annihilation channel. In the present-day Universe, this forbidden channel can be reactivated by a strong gravitational potential such as the supermassive black hole in our galaxy center. The Fermi-LAT gamma ray data...

10.1103/physrevd.107.123013 article EN cc-by Physical Review D 2023-06-12

Zoomer: Boosting Retrieval on Web-scale Graphs by Regions of Interest

Yuezihan Jiang Yu Cheng Hanyu Zhao Wentao Zhang Xupeng Miao and 4 more

We introduce Zoomer, a system deployed at Taobao, the largest e-commerce platform in China, for training and serving GNN-based recommendations over web-scale graphs. Zoomer is designed for tackling two challenges presented by the massive user data at Taobao: low training/serving efficiency due to the huge scale of the graphs, and low recommendation quality due to the information overload which distracts the recommendation model from specific user intentions. Zoomer achieves this by introducing a key concept, Region of Interests (ROI) in GNNs for recommendations, i.e.,...

10.1109/icde53745.2022.00212 article EN 2022 IEEE 38th International Conference on Data Engineering (ICDE) 2022-05-01

Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

Jiahao Zhu Daizong Liu Pan Zhou Xing Di Yu Cheng and 6 more

Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query. All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then interact them with the query for reasoning. However, we argue that these methods have overlooked two indispensable issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take the adjacent irrelevant frames as new...

10.18653/v1/2022.findings-emnlp.41 article EN cc-by 2022-01-01

DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

Xuxi Chen Tianlong Chen Weizhu Chen Ahmed Hassan Awadallah Zhangyang Wang and 1 more

Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.456 article EN cc-by 2023-01-01

Local Byte Fusion for Neural Machine Translation

Makesh Narsimhan Sreedhar Xiangpeng Wan Yu Cheng Junjie Hu

Subword tokenization schemes are the dominant technique used in current NLP models. However, such schemes can be rigid, and tokenizers built on one corpus may not adapt well to other parallel corpora. It has also been observed that in multilingual corpora, subword tokenization schemes oversegment low-resource languages, leading to a drop in translation performance. An alternative to subword tokenizers is byte-based tokenization, i.e., tokenization into byte sequences using the UTF-8 encoding scheme. Byte tokens often represent inputs at a sub-character granularity,...

10.18653/v1/2023.acl-long.397 article EN cc-by 2023-01-01

Dark Matter Annihilation via Breit-Wigner Enhancement with Heavier Mediator

Yu Cheng Shao-Feng Ge Jie Sheng Tsutomu T. Yanagida

We propose a new scenario in which both the dark matter freeze-out in the early Universe and its possible annihilation for indirect detection around a supermassive black hole are enhanced by a Breit-Wigner resonance. With the mediator mass larger than the total initial dark matter mass, this annihilation is almost forbidden at late times. Thus, the stringent cosmic microwave background constraints do not apply. However, a supermassive black hole can accelerate the dark matter particles to reactivate this resonant annihilation, whose subsequent decay to photons leaves a unique signal. The running...

10.48550/arxiv.2309.12043 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Efficient Robust Training via Backward Smoothing

Jinghui Chen Yu Cheng Zhe Gan Quanquan Gu Jingjing Liu

Adversarial training is so far the most effective strategy in defending against adversarial examples. However, it suffers from high computational costs due to the iterative adversarial attacks in each training step. Recent studies show that it is possible to achieve fast Adversarial Training by performing a single-step attack with random initialization. However, such an approach still lags behind state-of-the-art adversarial training algorithms on both stability and model robustness. In this work, we develop a new understanding towards Fast Adversarial Training, by viewing...

10.48550/arxiv.2010.01278 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Optimization of English Learning Mode under the Influence of Artificial Intelligence Translation

Gang Shen Tao Feng Yu Cheng

With the development of social science and technology, artificial intelligence has been applied to many fields, and artificial intelligence translation has provided great help for language learners. This paper analyzes the necessity of English learning, explores the influence of artificial intelligence translation on English learning, and proposes optimized English learning modes which provide help for the people involved.

10.1155/2022/7755297 article EN cc-by Discrete Dynamics in Nature and Society 2022-01-01

Dark photon kinetic mixing effects for the CDF W-mass measurement

Yu Cheng Xiao-Gang He Fei Huang J. F. Sun Zhi-Peng Xing

A new $U(1)_X$ gauge boson $X$ primarily interacting with a dark sector can have renormalizable kinetic mixing with the standard model (SM) $U(1)_Y$ gauge boson $Y$. Besides introducing interactions between the dark photon and SM particles, this mixing also modifies the interactions among SM particles. The modified interactions can be cast into the oblique $S$, $T$ and $U$ parameters. We find that with the dark photon mass larger than the $Z$ boson mass, the kinetic mixing effects can reduce the tension of the W mass excess problem reported recently by CDF from a $7\sigma$ deviation to within $3\sigma$ compared with the theory prediction. If there is...

10.48550/arxiv.2204.10156 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Multimodal Instruction Tuning with Conditional Mixture of LoRA

Ying Shen Zhiyang Xu Qifan Wang Yu Cheng Wenpeng Yin and 1 more

Multimodal Large Language Models (MLLMs) have demonstrated remarkable proficiency in diverse tasks across different domains, with an increasing focus on improving their zero-shot generalization capabilities for unseen multimodal tasks. Multimodal instruction tuning has emerged as a successful strategy for achieving zero-shot generalization by fine-tuning pre-trained models on diverse multimodal tasks through instructions. As MLLMs grow in complexity and size, the need for parameter-efficient fine-tuning methods like Low-Rank Adaption (LoRA), which fine-tunes with a minimal set of...

10.48550/arxiv.2402.15896 preprint EN arXiv (Cornell University) 2024-02-24

Féeton ($B-L$ Gauge Boson) Dark Matter Testable in Future Direct Detection Experiments

Yu Cheng Jie Sheng Tsutomu T. Yanagida

In this paper, we revisit the féeton (gauge boson of $U(1)_{B-L}$ symmetry) dark matter scenario, and first point out that the $U(1)$ gauge symmetry can be a linear combination of the $B-L$ and the SM hypercharge gauge symmetries. With the redefinition of the $B-L$ charge of fermions, the coupling between the féeton and the electron can be enhanced. After showing the parameter space required from the DM stability and cosmic production, we discuss the potential for verifying them in dark matter direct detection experiments. The results show that future experiments, such as SuperCDMS, have...

10.48550/arxiv.2410.12554 preprint EN arXiv (Cornell University) 2024-10-16

An actor-critic learning framework based on Lyapunov stability for automatic assembly

Xinwang Li Juliang Xiao Yu Cheng Haitao Liu

10.1007/s10489-022-03844-2 article EN Applied Intelligence 2022-06-15

MLIA: modulated LED illumination-based adversarial attack on traffic sign recognition system for autonomous vehicle

Yixuan Shen Yu Cheng Yini Lin Sicheng Long Canjian Jiang and 6 more

A traffic sign recognition (TSR) system is essential for autonomous vehicles and is vulnerable to security threats from adversarial attacks. The existing adversarial attacks on TSR are invasive and suffer from poor concealment and high computational complexity, and thus have low feasibility in real-world scenarios. This paper proposes a non-invasive modulated LED illumination-based adversarial attack scheme. By generating luminance flashes imperceptible to human eyes through fast intensity modulation of LED lighting such as streetlights and exploiting...

10.1109/trustcom56396.2022.00139 article EN 2022-12-01

Electroweak precision tests for triplet scalars

Yu Cheng Xiao-Gang He Fei Huang J. F. Sun Zhi-Peng Xing

Electroweak precision observables are fundamentally important for testing the standard model (SM) or its extensions. The influences on the observables from new physics within the electroweak sector can be expressed in terms of the oblique parameters S, T, U. The recently reported W mass excess anomaly by CDF modifies these parameters in a significant way. By performing a global fit with the new CDF W mass measurement data, we obtain $S=0.03 \pm 0.03$, $T=0.06 \pm 0.02$ and $U=0.16 \pm 0.03$ (or $S=0.14$, $T=0.24$ with $U=0$), which is significantly away from zero as the SM would...

10.48550/arxiv.2208.06760 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Right-Handed Neutrino Dark Matter with Forbidden Annihilation

Yu Cheng Shao-Feng Ge Jie Sheng Tsutomu T. Yanagida

10.48550/arxiv.2304.02997 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Cross-utterance Conditioned Coherent Speech Editing

Yu Cheng Yang Li Weiqin Zu Fanglei Sun Tian Zheng and 1 more

10.21437/interspeech.2023-2558 article EN Interspeech 2023 2023-08-14
