- Stochastic Gradient Optimization Techniques
- Advanced Malware Detection Techniques
- Machine Learning and ELM
- Ferroelectric and Negative Capacitance Devices
- Advanced Bandit Algorithms Research
- Software Testing and Debugging Techniques
- Adversarial Robustness in Machine Learning
- Domain Adaptation and Few-Shot Learning
- Cloud Computing and Resource Management
- Reinforcement Learning in Robotics
- Neural Networks and Reservoir Computing
- Neural Networks and Applications
- IoT and Edge/Fog Computing
- Advanced Neural Network Applications
- Security and Verification in Computing
- Artificial Intelligence in Healthcare
- Real-Time Systems Scheduling
- Advanced Data Storage Technologies
- Radiation Effects in Electronics
- Imbalanced Data Classification Techniques
- Scientific Computing and Data Management
- Receptor Mechanisms and Signaling
- Software System Performance and Reliability
- Text and Document Classification Technologies
- Open Education and E-Learning
Alibaba Group (China)
2022-2024
Alibaba Group (United States)
2023
Princeton University
2021
Tsinghua University
2019
University of Electronic Science and Technology of China
2015
Sichuan University
2014
Large language models have demonstrated impressive performance on challenging mathematical reasoning tasks, which has triggered discussion of whether that performance is achieved by true reasoning capability or by memorization. To investigate this question, prior work has constructed mathematical benchmarks in which questions undergo simple perturbations -- modifications that still preserve the underlying reasoning patterns of the solutions. However, no work has explored hard perturbations, which fundamentally change the nature of the problem so that the original solution steps do not apply....
Coverage-guided kernel fuzzing is a widely used technique that has helped kernel developers and testers discover numerous vulnerabilities. However, due to the high complexity of the application and hardware environment, there is little study on deploying fuzzing on the enterprise-level Linux kernel. In this paper, collaborating with enterprise developers, we present the industry practice of deploying kernel fuzzing on four different Linux distributions that are responsible for the internal business and external services of the company. We have addressed the following outstanding...
Vulnerable code clones in the operating system (OS) threaten the safety of the smart industrial environment, and most vulnerable OS code clone detection approaches neglect the correlations between functions, which limits their detection effectiveness. In this article, we propose a two-phase framework to find vulnerable OS code clones by learning on the correlations between functions. In the training phase, functions serving as the training set are extracted from the latest code repository, and function features are derived from their AST structure. Then, the external and internal correlations of functions are explored by graph modeling of the functions. Finally, the graph convolutional network for...
It has been observed \citep{zhang2016understanding} that deep neural networks can memorize: they achieve 100\% accuracy on training data. Recent theoretical results explained such behavior in highly overparametrized regimes, where the number of neurons in each layer is larger than the number of training samples. In this paper, we show that neural networks can be trained to memorize training data perfectly in a mildly overparametrized regime, where the number of parameters is just a constant factor more than the number of training samples, and the number of neurons is much smaller.
Customer credit scoring is an important concern for numerous domestic and global industries. It is difficult to achieve satisfactory performance with traditional models constructed on the assumption that the training and test data are subject to the same distribution, because customers usually come from different districts and may be subject to different distributions in reality. This study combines ensemble learning and transfer learning, and proposes a clustering and selecting based dynamic transfer ensemble (CSTE) model to transfer the related source domains to the target domain...
The generalization mystery of overparametrized deep nets has motivated efforts to understand how gradient descent (GD) converges to low-loss solutions that generalize well. Real-life neural networks are initialized from small random values and trained with cross-entropy loss for classification (unlike the "lazy" or "NTK" regime of training, where analysis was more successful), and a recent sequence of results (Lyu and Li, 2020; Chizat and Bach, 2020; Ji and Telgarsky, 2020) provide theoretical evidence that GD may converge...
The performance testing and optimization of cloud applications is challenging, because manual tuning of cloud computing stacks is tedious and automated tuning tools are rarely used for cloud services. To address this issue, we introduce KeenTune, an automated tuning tool designed to optimize application performance and facilitate performance testing. KeenTune is a lightweight and flexible tool that can be deployed with to-be-tuned applications with negligible impact on their performance. Specifically, KeenTune uses a surrogate model implemented with machine learning models to filter out less relevant parameters...
Rt-Linux contains critical modifications that are much less tested than the vanilla kernel, thus placing many systems at risk. In this paper, we present DRLF, a directed fuzzer targeted towards fuzzing any code area in Rt-Linux, allowing for more efficient tests on Rt-Linux's unique sections. DRLF performs directed fuzzing through a kernel-level weighted callgraph construction technique and by prioritizing input sequences that exhibit less distance to the target code. Evaluations show that DRLF delivers better cover speed while...
Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise. In stochastic optimization, such as training neural networks, folklore suggests that momentum may help deep learning optimization by reducing the variance of the stochastic gradient update, but previous theoretical analyses do not find momentum to offer any provable acceleration. Theoretical results in this paper clarify the role of momentum in stochastic settings where the learning rate is small and gradient noise is the dominant source of instability, suggesting that SGD with and without momentum behave similarly in the short and long...
Auto-tuning attracts increasing attention in industry practice as a way to optimize the performance of a system with many configurable parameters. It is particularly useful for cloud applications and services, since they have complex system hierarchies and intricate knob correlations. However, existing tools and algorithms rarely consider practical problems such as workload pressure control, support for distributed deployment, and expensive time costs, etc., which are utterly important for enterprise cloud services. In this work, we...
Bandit problems with linear or concave reward have been extensively studied, but relatively few works have studied bandits with non-concave reward. This work considers a large family of bandit problems where the unknown underlying reward function is non-concave, including the low-rank generalized linear bandit problems and the two-layer neural network with polynomial activation bandit problem. For the low-rank generalized linear bandit problem, we provide a minimax-optimal algorithm in the dimension, refuting both conjectures in [LMT21, JWWN19]. Our algorithms are based on a unified zeroth-order optimization...
Deep Reinforcement Learning (RL) powered by neural net approximation of the Q function has had enormous empirical success. While the theory of RL has traditionally focused on linear function approximation (or eluder dimension) approaches, little is known about nonlinear RL with neural net approximations of the Q functions. This is the focus of this work, where we study function approximation with two-layer neural networks (considering both ReLU and polynomial activation functions). Our first result is a computationally and statistically efficient algorithm in the generative model setting under...
In the recent past, the rapid development of the mobile internet has inspired the widespread use of WiFi (IEEE 802.11) technology. In WiFi, access control of a terminal to the router remains a significant challenge because the PIN (password) and MAC address are easy to guess and forge. In this paper, we present FastID - a practical system that identifies WiFi terminals in real-time by fingerprinting their clocks. Previous approaches to clock fingerprinting require tens of minutes or even hours for data collection, and thus cannot be applied to real-time identification. Even...