- Topic Modeling
- Natural Language Processing Techniques
- Adversarial Robustness in Machine Learning
- Dark Matter and Cosmic Phenomena
- Cosmology and Gravitation Theories
- Particle physics theoretical and experimental studies
- Speech and dialogue systems
- Speech and Audio Processing
- Computational Physics and Python Applications
- Anomaly Detection Techniques and Applications
- Scientific Research and Discoveries
- Sentiment Analysis and Opinion Mining
- Speech Recognition and Synthesis
- Text Readability and Simplification
- Advanced Graph Neural Networks
- Recommender Systems and Techniques
- Bacillus and Francisella bacterial research
- Neural Networks and Applications
- Psychological and Temporal Perspectives Research
- Machine Learning in Healthcare
- Neuroscience and Music Perception
- Artificial Intelligence in Games
- Relativity and Gravitational Theory
- Quantum Mechanics and Applications
- Adaptive Dynamic Programming Control
Ningbo University
2022-2025
Institute of Psychology, Chinese Academy of Sciences
2025
Microsoft Research (United Kingdom)
2022-2023
Shanghai Jiao Tong University
2023
Microsoft (Finland)
2023
Chinese University of Hong Kong
2023
Peking University
2022
Qingdao University
2022
Alibaba Group (United States)
2022
Suqian University
2022
Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks. However, the high demand for computing resources in training such models hinders their application in practice. In order to alleviate this resource hunger in large-scale model training, we propose a Patient Knowledge Distillation approach to compress an original large model (teacher) into an equally-effective lightweight shallow network (student). Different from previous knowledge distillation methods, which only use...
Recent studies show that pre-trained language models (LMs) are vulnerable to textual adversarial attacks. However, existing attack methods either suffer from low attack success rates or fail to search efficiently in the exponentially large perturbation space. We propose an efficient and effective framework, SemAttack, to generate natural adversarial text by constructing different semantic perturbation functions. In particular, SemAttack optimizes the generated perturbations constrained on generic semantic spaces, including the typo space, knowledge space...
Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP. However, common practice fine-tunes all of the parameters in a pre-trained model, which becomes prohibitive when a large number of downstream tasks are present. Therefore, many fine-tuning methods are proposed to learn incremental updates of pre-trained weights in a parameter efficient way, e.g., low-rank increments. These methods often evenly distribute the budget of incremental updates across all pre-trained weight matrices, and overlook the varying importance of different weight parameters. As a consequence,...
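The "low-rank increments" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the generic form of such an update, W + B A with B of shape d x r and A of shape r x k, and only counts trainable parameters; the helper names and toy sizes are hypothetical, and this is not the method the truncated abstract goes on to describe.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def low_rank_update(w, b, a):
    """Return W + B @ A, where the frozen W is d x k, B is d x r, A is r x k."""
    return add(w, matmul(b, a))


def trainable_params(d, k, r):
    """Number of entries in B and A together: r * (d + k)."""
    return r * (d + k)


# Hypothetical toy sizes, chosen only for illustration.
d, k, r = 4, 3, 1
W = [[0.0] * k for _ in range(d)]      # frozen pre-trained weight (all zeros here)
B = [[1.0], [2.0], [3.0], [4.0]]       # d x r factor
A = [[0.5, 0.0, -0.5]]                 # r x k factor

W_new = low_rank_update(W, B, A)
full = d * k                           # parameters touched by full fine-tuning
low = trainable_params(d, k, r)        # parameters touched by the rank-r increment
print(W_new, full, low)
```

With a small rank r, the increment trains r(d + k) values instead of d k, which is the saving that makes the budget question raised in the abstract meaningful.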
Linear sequence modeling approaches, such as linear attention, provide advantages like linear-time training and constant-memory inference over sequence lengths. However, existing sequence parallelism (SP) methods are either not optimized for the right-product-first feature of linear attention or use a ring-style communication strategy, which results in lower computation parallelism and limits their scalability for longer sequences in distributed systems. In this paper, we introduce LASP-2, a new SP method to enhance both communication and computation parallelism when...
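The "right-product-first" feature named in the abstract above can be sketched in a few lines. The sketch assumes the simplest setting: non-causal attention with no softmax, no feature map and no normalization, so that matrix associativity gives (Q Kᵀ) V = Q (Kᵀ V). The function names and toy matrices are hypothetical and this is not LASP-2 itself.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def left_product_first(q, k, v):
    """(Q K^T) V: builds an n x n intermediate, cost grows with n^2."""
    return matmul(matmul(q, transpose(k)), v)


def right_product_first(q, k, v):
    """Q (K^T V): builds a d x d intermediate, cost grows linearly with n."""
    return matmul(q, matmul(transpose(k), v))


# Toy inputs: sequence length n = 3, head dimension d = 2 (integers for exact equality).
Q = [[1, 0], [0, 1], [1, 1]]
K = [[1, 2], [0, 1], [2, 0]]
V = [[1, 0], [0, 1], [1, 1]]

out_left = left_product_first(Q, K, V)
out_right = right_product_first(Q, K, V)
print(out_left == out_right)
```

Both orders give the same result, but only the right-product order keeps the intermediate state at a fixed d x d size regardless of sequence length.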
With the global aging population, an increasing number of researchers are interested in the intertemporal choice issues faced by older adults. Previous studies have examined how age-related differences in time perception affect intertemporal choices. However, the impact of timing strategy on intertemporal decision-making among older adults remains unclear. This study was designed to examine the influence of timing strategy on intertemporal choice while also exploring possible mechanisms. We manipulated timing preferences through priming in two experiments (Experiment 1, n = 160; Experiment 2,...
https://github.com/reasoning-survey/Awesome-Reasoning-Foundation-Models Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, there is a growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models...
The seesaw mechanism with three right-handed neutrinos has one as a well-motivated dark matter candidate if stable, and the other two can explain the baryon asymmetry via the thermal leptogenesis scenario. We explore the possibility of introducing additional particles to make the right-handed neutrino dark matter in thermal equilibrium and freeze out through a forbidden annihilation channel. Nowadays in the Universe, this forbidden channel can be reactivated by a strong gravitational potential such as the supermassive black hole in our galaxy center. The Fermi-LAT gamma ray data...
We introduce Zoomer, a system deployed at Taobao, the largest e-commerce platform in China, for training and serving GNN-based recommendations over web-scale graphs. Zoomer is designed for tackling two challenges presented by the massive user data at Taobao: low training/serving efficiency due to the huge scale of the graphs, and low recommendation quality due to the information overload which distracts the recommendation model from specific user intentions. Zoomer achieves this by introducing a key concept, Region of Interests (ROI) in GNNs for recommendations, i.e.,...
Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query. All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then interact them with the query for reasoning. However, we argue that these methods have overlooked two indispensable issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take the adjacent irrelevant frames as new...
Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
Subword tokenization schemes are the dominant technique used in current NLP models. However, such schemes can be rigid, and tokenizers built on one corpus may not adapt well to other parallel corpora. It has also been observed that in multilingual corpora, subword tokenization schemes oversegment low-resource languages, leading to a drop in translation performance. An alternative to subword tokenizers is byte-based tokenization, i.e., tokenization into byte sequences using the UTF-8 encoding scheme. Byte tokens often represent inputs at a sub-character granularity,...
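Byte-based tokenization as described in the abstract above can be sketched with Python's built-in UTF-8 codec. The function names are hypothetical and the sketch covers only the text-to-bytes step, not any model from the paper. It shows why byte tokens can sit at sub-character granularity: one non-ASCII character maps to several byte tokens.

```python
def byte_tokenize(text):
    """Encode text as UTF-8 and return one integer token (0-255) per byte."""
    return list(text.encode("utf-8"))


def byte_detokenize(tokens):
    """Invert byte_tokenize: rebuild the string from a list of byte values."""
    return bytes(tokens).decode("utf-8")


tokens_ascii = byte_tokenize("hi")   # one byte per ASCII character
tokens_accent = byte_tokenize("é")   # two bytes for a single accented character
tokens_cjk = byte_tokenize("中")     # three bytes for a single CJK character

print(tokens_ascii, tokens_accent, tokens_cjk)
print(byte_detokenize(tokens_cjk))
```

The vocabulary is fixed at 256 values for any language, at the cost of longer sequences for scripts outside ASCII.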
We propose a new scenario in which both the dark matter freeze-out in the early Universe and its possible annihilation for indirect detection around a supermassive black hole are enhanced by a Breit-Wigner resonance. With the mediator mass larger than the total initial dark matter mass, this annihilation is almost forbidden at late times. Thus, the stringent cosmic microwave background constraints do not apply. However, a supermassive black hole can accelerate the dark matter particles to reactivate this resonant annihilation, whose subsequent decay to photons leaves a unique signal. The running...
Adversarial training is so far the most effective strategy in defending against adversarial examples. However, it suffers from high computational costs due to the iterative adversarial attacks in each training step. Recent studies show that it is possible to achieve fast Adversarial Training by performing a single-step attack with random initialization. However, such an approach still lags behind state-of-the-art adversarial training algorithms on both stability and model robustness. In this work, we develop a new understanding towards Fast Adversarial Training, by viewing...
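The "single-step attack with random initialization" mentioned in the abstract above can be sketched on a toy model. The sketch assumes an L-infinity budget `eps`, a step size `alpha`, and a hand-written logistic-regression loss so that the input gradient is available in closed form; all names and values are hypothetical, and it shows only the attack step, not the training algorithm the truncated abstract goes on to describe.

```python
import math
import random


def logistic_loss(x, y, w, b):
    """Binary cross-entropy of a logistic model on one example (y is 0 or 1)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def loss_grad_wrt_input(x, y, w, b):
    """Closed-form gradient of the loss with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def single_step_attack(x, y, w, b, eps, alpha, rng):
    """Random start inside the eps-ball, one signed-gradient step, then clip."""
    delta = [rng.uniform(-eps, eps) for _ in x]
    grad = loss_grad_wrt_input([xi + di for xi, di in zip(x, delta)], y, w, b)
    delta = [di + alpha * sign(g) for di, g in zip(delta, grad)]
    delta = [max(-eps, min(eps, di)) for di in delta]
    return [xi + di for xi, di in zip(x, delta)]


# Hypothetical toy example and model parameters.
x, y = [0.5, -1.0, 2.0], 1
w, b = [1.0, -2.0, 0.5], 0.1
eps = 0.1
alpha = 1.25 * eps  # assumed step size, slightly larger than the budget

x_adv = single_step_attack(x, y, w, b, eps, alpha, random.Random(0))
loss_clean = logistic_loss(x, y, w, b)
loss_adv = logistic_loss(x_adv, y, w, b)
print(loss_clean, loss_adv)
```

In adversarial training, the model would then be updated on `x_adv` instead of `x`; using one gradient step per example rather than an iterative attack is what makes the procedure fast.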
With the development of social science and technology, artificial intelligence has been applied to many fields, and artificial intelligence translation has provided great help for language learners. This paper analyzes the necessity of artificial intelligence translation in English learning, explores the influence of artificial intelligence translation on English learning, and proposes optimized learning modes which provide people involved.
A new $U(1)_X$ gauge boson $X$ primarily interacting with a dark sector can have renormalizable kinetic mixing with the standard model (SM) $U(1)_Y$ gauge boson $Y$. Besides introducing interactions of the dark photon and SM particles, this mixing also modifies the interactions among SM particles. The modified interactions can be cast into the oblique $S$, $T$ and $U$ parameters. We find that with the dark photon mass larger than the $Z$ boson mass, the kinetic mixing effects can reduce the tension of the W mass excess problem reported recently by CDF from a $7\sigma$ deviation to within $3\sigma$ compared with the theory prediction. If there is...
Multimodal Large Language Models (MLLMs) have demonstrated remarkable proficiency in diverse tasks across different domains, with an increasing focus on improving their zero-shot generalization capabilities for unseen multimodal tasks. Multimodal instruction tuning has emerged as a successful strategy for achieving zero-shot generalization by fine-tuning pre-trained models through instructions. As MLLMs grow in complexity and size, the need for parameter-efficient fine-tuning methods like Low-Rank Adaption (LoRA), which fine-tunes a minimal set of...
In this paper, we revisit the f\'eeton (gauge boson of $U(1)_{B-L}$ symmetry) dark matter scenario, and first point out that the $U(1)$ gauge symmetry can be a linear combination of the $B-L$ and the SM hypercharge gauge symmetries. With the redefinition of the charge of fermions, the coupling between the f\'eeton and the electron can be enhanced. After showing the parameter space required from the DM stability and cosmic production, we discuss the potential for verifying them in dark matter direct detection experiments. The results show that future experiments, such as SuperCDMS, have...
Traffic sign recognition (TSR) systems are essential for autonomous vehicles and are vulnerable to security threats from adversarial attacks. The existing adversarial attacks on TSR are invasive and suffer from poor concealment and high computational complexity, and thus have low feasibility in real-world scenarios. This paper proposes a non-invasive modulated LED illumination-based adversarial attack scheme. By generating luminance flashes imperceptible to human eyes through fast intensity modulation of lighting such as streetlights and exploiting...
Electroweak precision observables are fundamentally important for testing the standard model (SM) or its extensions. The influences on observables from new physics within the electroweak sector can be expressed in terms of the oblique parameters S, T, U. The recently reported W mass excess anomaly by CDF modifies these parameters in a significant way. By performing the global fit with the CDF W mass measurement data, we obtain $S=0.03 \pm 0.03$, $T=0.06 \pm 0.02$ and $U=0.16 \pm 0.03$ (or $S=0.14$, $T=0.24$ with $U=0$), which is significantly away from zero as the SM would...