NFDI4DS | UHH-SEMS - Publication Details

Shuai Lu

ORCID: 0000-0001-7466-2064

About

Contact & Profiles

A5112547849

Research Areas

Software Engineering Research
Topic Modeling
Software Testing and Debugging Techniques
Natural Language Processing Techniques
Advanced Malware Detection Techniques
Software Reliability and Analysis Research
Bone health and osteoporosis research
Multimodal Machine Learning Applications
Web Data Mining and Analysis
Electrodeposition and Electroless Coatings
Corrosion Behavior and Inhibition
High Entropy Alloys Studies
High-Temperature Coating Behaviors
Electric and Hybrid Vehicle Technologies
Microstructure and Mechanical Properties of Steels
Aluminum Alloys Composites Properties
Climate Change and Health Impacts
Aluminum Alloy Microstructure Properties
Software System Performance and Reliability
Adversarial Robustness in Machine Learning
Metal Alloys Wear and Properties
Space Satellite Systems and Control
Industrial Vision Systems and Defect Detection
GDF15 and Related Biomarkers
Nanoporous metals and alloys

Peking University
2017-2025

Beijing Institute of Neurosurgery
2025

Beijing Jishuitan Hospital
2022-2025

Capital Medical University
2025

Changzhou University
2024-2025

Harbin University of Science and Technology
2024-2025

University of Science and Technology Beijing
2023-2024

Microsoft Research Asia (China)
2022-2024

Nanjing Medical University
2024

Henan University of Science and Technology
2024

Summarizing Source Code with Transferred API Knowledge

OPENALEX - Publications

Xing Hu, Li Ge, Xin Xia, David Lo, Shuai Lu, and 1 more

Code summarization, which aims to generate a succinct natural language description of source code, is extremely useful for code search and comprehension. It has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving them from similar code snippets. However, these approaches rely heavily on whether similar snippets can be retrieved and how similar they are, and they fail to capture API knowledge, which carries vital information about the functionality of the code. In this paper, we propose a novel approach, named...

10.24963/ijcai.2018/314 article EN 2018-07-01

CodeBLEU: a Method for Automatic Evaluation of Code Synthesis

Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, and 5 more

Evaluation metrics play a vital role in the growth of an area, as they define the standard for distinguishing between good and bad models. In code synthesis, the commonly used evaluation metric is BLEU or perfect accuracy, but neither is suitable for evaluating code: BLEU was originally designed for natural language and neglects important syntactic and semantic features of code, while perfect accuracy is too strict and thus underestimates different outputs with the same logic. To remedy this, we introduce a new automatic evaluation metric, dubbed...

10.48550/arxiv.2009.10297 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Automating code review activities by large-scale pre-training

Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, and 6 more

Code review is an essential part of the software development lifecycle, since it aims to guarantee the quality of code. Modern code review activities require developers to view, understand, and even run programs to assess logic, functionality, latency, style, and other factors. As a result, developers spend far too much time reviewing the code of their peers. Accordingly, there is significant demand to automate the code review process. In this research, we focus on utilizing pre-training techniques for tasks in the code review scenario. We collect a...

10.1145/3540250.3549081 article EN Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering 2022-11-07

InferFix: End-to-End Program Repair with LLMs

Matthew Jin, Syed Shahriar, Michele Tufano, Xin Shi, Shuai Lu, and 2 more

The software development life cycle is profoundly influenced by bugs; their introduction, identification, and eventual resolution account for a significant portion of software cost. This has motivated software engineering researchers and practitioners to propose different approaches for automating the identification and repair of software defects.

10.1145/3611643.3613892 article EN 2023-11-30

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, and 9 more

In recent years, artificial intelligence (AI) has made incredible progress. Advanced foundation models such as ChatGPT can offer powerful conversation, in-context learning, and code generation abilities for a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on their acquired common-sense knowledge. Nonetheless, they still face difficulties in specialized tasks because they lack sufficient data during pretraining or make errors in neural network...

10.34133/icomputing.0063 article EN cc-by Intelligent Computing 2023-11-13

WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach

Junjie Huang, Duyu Tang, Wanjun Zhong, Shuai Lu, Linjun Shou, and 3 more

Producing the embedding of a sentence in an unsupervised way is valuable for natural language matching and retrieval problems in practice. In this work, we conduct a thorough examination of pretrained-model-based unsupervised sentence embeddings. We study four pretrained models and conduct massive experiments on seven datasets regarding sentence semantics. We have three main findings. First, averaging all tokens is better than using only the [CLS] vector. Second, combining both top and bottom layers is better than using only the top layers. Lastly, an easy whitening-based...

10.18653/v1/2021.findings-emnlp.23 preprint EN cc-by 2021-01-01
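
The first and third findings in the abstract above can be illustrated with a small sketch. The abstract is truncated before it describes the whitening step, so the code below is not the paper's implementation: it assumes a generic PCA-style whitening, uses random toy arrays in place of real model outputs, and the helper names `mean_pool` and `whiten` are hypothetical. The layer-combination finding is not covered.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padded positions."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1.0, None)  # avoid division by zero
    return summed / counts

def whiten(embeddings, eps=1e-9):
    """Centre the vectors and rescale them so their covariance is ~identity.

    Assumption: generic PCA-style whitening, not necessarily the exact
    normalization used in the paper.
    """
    mu = embeddings.mean(axis=0, keepdims=True)
    centered = embeddings - mu
    cov = centered.T @ centered / embeddings.shape[0]
    u, s, _ = np.linalg.svd(cov)
    w = u @ np.diag(1.0 / np.sqrt(s + eps))
    return centered @ w

# Toy stand-in for model output: 200 sentences, 6 token slots, 4 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(200, 6, 4))
mask = np.ones((200, 6), dtype=int)
mask[:, 4:] = 0  # the last two slots are padding

sentence_vecs = mean_pool(tokens, mask)
whitened = whiten(sentence_vecs)
```

With real data, `tokens` and `mask` would come from a pretrained encoder and its tokenizer; the whitened vectors can then be compared with cosine similarity.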

Initial localized corrosion induced by multiscale precipitates in the new generation high-strength Al-Zn-Mg-Cu alloy

Wei Xue, Yixuan Wang, Jiuyang Xia, Zequn Zhang, Kang Huang, and 5 more

10.1016/j.corsci.2023.111516 article EN Corrosion Science 2023-09-09

Inter- and trans-generational impacts of real-world PM2.5 exposure on male-specific primary hypogonadism

Xiaoyu Wei, Zhonghao Zhang, Yayun Gu, Rong Zhang, Jie Huang, and 21 more

Exposure to PM

10.1038/s41421-024-00657-0 article EN cc-by Cell Discovery 2024-04-23

Influence of impurity content on corrosion behavior of Al-Zn-Mg-Cu alloys in a tropical marine atmospheric environment

Wei Xue, Yixuan Wang, Shuai Wu, Bowei Zhang, Zequn Zhang, and 6 more

10.1016/j.corsci.2024.112319 article EN Corrosion Science 2024-07-24

GraphCodeBERT: Pre-training Code Representations with Data Flow

Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, and 13 more

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, and code summarization. However, existing pre-trained models regard a code snippet as a sequence of tokens while ignoring the inherent structure of code, which provides crucial code semantics and would enhance the code understanding process. We present GraphCodeBERT, a pre-trained model that considers the inherent structure of code. Instead of taking a syntactic-level structure of code like the abstract syntax tree (AST), we use data flow in...

10.48550/arxiv.2009.08366 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Learning to recommend method names with global context

Fang Liu, Ge Li, Zhiyi Fu, Shuai Lu, Yiyang Hao, and 1 more

In programming, the names of program entities, especially methods, are an intuitive characteristic for understanding the functionality of the code. To ensure the readability and maintainability of programs, methods should be named properly. Specifically, the names should be meaningful and consistent with other names used in related contexts in their codebase. In recent years, many automated approaches have been proposed to suggest method names, among which neural machine translation (NMT) based models are widely used and have achieved state-of-the-art results. However, these NMT-based...

10.1145/3510003.3510154 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Effect of ultrasound assistance on the mechanical performance and corrosion resistance of Zn-Ni coatings produced by cathode plasma electrolysis deposition

Shuai Lu, Kang Huang, Ning Zhuang, Jiuyang Xia, Bowei Zhang, and 1 more

10.1016/j.surfcoat.2024.130632 article EN Surface and Coatings Technology 2024-03-07

Effect of Austenitizing Temperature on Microstructure and Wear Resistance of Hot Work Die Steels with Different Silicon Content

Bin Zhang, Shuai Lu, Wei Tan, Zhong-Qi Ma

The study investigates the effects of austenitizing temperature on the microstructure and wear resistance of hot work die steels with different silicon (Si) content. The results indicate that the steel with high Si content, austenitized at 1110 °C, exhibits superior wear resistance, which can be attributed to the precipitation of a large amount of fine vanadium carbides during the tempering process. The elevated temperature facilitates martensite transformation during quenching and increases the hardness of the steels. Low impact toughness is obtained in low Si content...

10.1002/srin.202400952 article EN steel research international 2025-02-21

Early detection for elderly people with musculoskeletal aging related diseases based on artificial intelligence model

Minjuan Li, Shuai Lu, Cheng Cheng, Kai-Yuan Cheng, Maoqi Gong, and 2 more

Late diagnosis is one of the main bottlenecks in the prevention of musculoskeletal aging-related diseases, and it is urgent to build an early detection model. Twenty-two features were included in models based on binary and multiple classification, respectively, built with XGBoost. In testing, the accuracy rate (63.74% to 92.40%) and AUC (0.74 to 0.96) of the binary-classification models were higher than the accuracy rate (61.40% to 85.96%) and AUC (0.63 to 0.86) of the multiple-classification models. The optimal model had an accuracy rate of 87.13% and an AUC of 0.92, including cooking, drinking...

10.21203/rs.3.rs-6124947/v1 preprint EN Research Square (Research Square) 2025-04-16

Effects of Surfactants on Microstructure and Performance of Ni Coatings via Cathodic Plasma Electrolytic Deposition

Shuai Lu, Kang Huang, Xiaowei Sun, Wei Xue, Zequn Zhang, and 2 more

10.1007/s11665-025-11234-1 article EN Journal of Materials Engineering and Performance 2025-05-07

Fluid thermal coupling analysis of double-layer cooled coaxial powder feeder laser sinter

Yuan Zhang, Shuai Lu, Enwen Zhou, Le Wang

10.1080/01495739.2025.2499697 article EN Journal of Thermal Stresses 2025-05-30

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, A. Svyatkovskiy, and 17 more

Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. It also features three baseline systems, including BERT-style, GPT-style, and Encoder-Decoder models, to make it easy for researchers to use the platform. The availability of such data and baselines can...

10.48550/arxiv.2102.04664 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Code Execution with Pre-trained Language Models

Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, and 3 more

Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code. However, most pre-trained models for code intelligence ignore the execution trace and rely only on source code and syntactic structures. In this paper, we investigate how well pre-trained models can understand and perform code execution. We develop a mutation-based data augmentation technique to create a large-scale and realistic Python dataset and task for code execution, which challenges existing models such as Codex. We then present CodeExecutor, a Transformer...

10.18653/v1/2023.findings-acl.308 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

ReContrast: Domain-Specific Anomaly Detection via Contrastive Reconstruction

Jia Guo, Shuai Lu, Lize Jia, Weihang Zhang, Huiqi Li

Most advanced unsupervised anomaly detection (UAD) methods rely on modeling feature representations of frozen encoder networks pre-trained on large-scale datasets, e.g. ImageNet. However, the features extracted from encoders that are borrowed from natural image domains coincide little with the features required in the target UAD domain, such as industrial inspection and medical imaging. In this paper, we propose a novel epistemic UAD method, namely ReContrast, which optimizes the entire network to reduce biases towards...

10.48550/arxiv.2306.02602 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Why Do Neural Dialog Systems Generate Short and Meaningless Replies? A Comparison between Dialog and Translation

Bolin Wei, Shuai Lu, Lili Mou, Hao Zhou, Pascal Poupart, and 2 more

This paper addresses the question: In neural dialog systems, why do sequence-to-sequence (Seq2Seq) neural networks generate short and meaningless replies for open-domain response generation? We conjecture that in a dialog system, due to the randomness of spoken language, there may be multiple equally plausible replies for one utterance, causing the deficiency of a Seq2Seq model. To evaluate our conjecture, we propose a systematic way to mimic the dialog scenario in machine translation systems with both real datasets and elaborately generated toy datasets....

10.1109/icassp.2019.8682634 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

Why Do Neural Dialog Systems Generate Short and Meaningless Replies? A Comparison between Dialog and Translation

Bolin Wei, Shuai Lu, Lili Mou, Hao Zhou, Pascal Poupart, and 2 more

This paper addresses the question: Why do neural dialog systems generate short and meaningless replies? We conjecture that, in a dialog system, an utterance may have multiple equally plausible replies, causing the deficiency of neural networks in the dialog application. We propose a systematic way to mimic the dialog scenario in a machine translation system and manage to reproduce the phenomenon of generating short and less meaningful sentences in the translation setting, showing evidence for our conjecture.

10.48550/arxiv.1712.02250 preprint EN other-oa arXiv (Cornell University) 2017-01-01
