NFDI4DS | UHH-SEMS - Publication Details

Yanhui Li

ORCID: 0000-0003-2282-7175

Publications

About

Contact & Profiles

A5100360608

Research Areas

Software Engineering Research
Semantic Web and Ontologies
Software Reliability and Analysis Research
Software Testing and Debugging Techniques
Service-Oriented Architecture and Web Services
Natural Language Processing Techniques
Software System Performance and Reliability
Software Engineering Techniques and Practices
Logic, Reasoning, and Knowledge
Rough Sets and Fuzzy Logic
Data Management and Algorithms
Adversarial Robustness in Machine Learning
Topic Modeling
Advanced Malware Detection Techniques
Machine Learning and Data Classification
Advanced Database Systems and Queries
Advanced Computational Techniques and Applications
Imbalanced Data Classification Techniques
Web Data Mining and Analysis
Access Control and Trust
Data Mining Algorithms and Applications
Privacy-Preserving Technologies in Data
Logic, programming, and type systems
Transportation Planning and Optimization
Open Source Software Innovations

Nanjing University
2013-2025

Jiangxi University of Finance and Economics
2025

Huazhong University of Science and Technology
2025

Union Hospital
2025

Guilin University of Electronic Technology
2024

Affiliated Hospital of North Sichuan Medical College
2024

Nanjing University of Science and Technology
2012-2023

Liaocheng People's Hospital
2023

Wuhan University of Technology
2023

The University of Texas Southwestern Medical Center
2021

A Game-Theoretical Approach for User Allocation in Edge Computing Environment

OPENALEX - Publications

Qiang He Guangming Cui Xuyun Zhang Feifei Chen Shuiguang Deng and 3 more

Edge Computing provides mobile and Internet-of-Things (IoT) app vendors with a new distributed computing paradigm which allows an app vendor to deploy its app at hired edge servers near app users at the edge of the cloud. This way, app users can be allocated to nearby edge servers to minimize network latency and energy consumption. A cost-effective edge user allocation (EUA) requires maximum app users to be served with minimum overall system cost. Finding a centralized optimal solution to this EUA problem is NP-hard. Thus, we propose EUAGame, a game-theoretic approach that...

10.1109/tpds.2019.2938944 article EN IEEE Transactions on Parallel and Distributed Systems 2019-09-03

How Far We Have Progressed in the Journey? An Examination of Cross-Project Defect Prediction

OPENALEX - Publications

Yuming Zhou Yibiao Yang Hongmin Lu Lin Chen Yanhui Li and 3 more

Background. Recent years have seen an increasing interest in cross-project defect prediction (CPDP), which aims to apply defect prediction models built on source projects to a target project. Currently, a variety of (complex) CPDP models have been proposed with promising prediction performance. Problem. Most, if not all, of the existing CPDP models are not compared against those simple module size models that are easy to implement and have shown good prediction performance in the literature. Objective. We aim to investigate how far we have really progressed in the journey by comparing the performance between the existing CPDP models and the simple module size models....

10.1145/3183339 article EN ACM Transactions on Software Engineering and Methodology 2018-01-31

Tracing the sources of nitrate in the rivers and lakes of the southern areas of the Tibetan Plateau using dual nitrate isotopes

OPENALEX - Publications

Mingming Hu Yuchun Wang Pengcheng Du Yong Shui Aimin Cai and 5 more

10.1016/j.scitotenv.2018.12.149 article EN The Science of The Total Environment 2018-12-11

Training data debugging for the fairness of machine learning software

OPENALEX - Publications

Yanhui Li Linghan Meng Lin Chen Li Yu Di Wu and 2 more

With the widespread application of machine learning (ML) software, especially in high-risk tasks, the concern about their unfairness has been raised towards both developers and users of ML software. The unfairness of ML software indicates the software behavior affected by the sensitive features (e.g., sex), which leads to biased and illegal decisions and has become a worthy problem for the whole software engineering community.

10.1145/3510003.3510091 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Multiple-boundary clustering and prioritization to promote neural network retraining

OPENALEX - Publications

Weijun Shen Yanhui Li Chen Lin Yuanlei Han Yuming Zhou and 1 more

With the increasing application of deep learning (DL) models in many safety-critical scenarios, effective and efficient DL testing techniques are much in demand to improve the quality of DL models. One of the major challenges is the data gap between the training data to construct the models and the testing data to evaluate them. To bridge the gap, testers aim to collect an effective subset of inputs from the testing contexts, with limited labeling effort, for retraining DL models.

10.1145/3324884.3416621 article EN 2020-12-21

How Far Have We Progressed in Identifying Self-admitted Technical Debts? A Comprehensive Empirical Study

OPENALEX - Publications

Zhaoqiang Guo Shiran Liu Jinping Liu Yanhui Li Lin Chen and 2 more

Background. Self-admitted technical debt (SATD) is a special kind of technical debt that is intentionally introduced and remarked by code comments. Those technical debts reduce the quality of software and increase the cost of subsequent software maintenance. Therefore, it is necessary to find out and resolve these debts in time. Recently, many automatic approaches have been proposed to identify SATD. Problem. Popular IDEs support a number of predefined task annotation tags for indicating SATD in comments, which have been used in many projects. However, such clear prior knowledge...

10.1145/3447247 article EN ACM Transactions on Software Engineering and Methodology 2021-07-23

Towards Better Dependency Management: A First Look at Dependency Smells in Python Projects

OPENALEX - Publications

Yulu Cao Lin Chen Wanwangying Ma Yanhui Li Yuming Zhou and 1 more

Managing cross-project dependencies is tricky in modern software development. A primary way to manage dependencies is using dependency configuration files, which brings convenience to the entire software ecosystem, including developers, maintainers, and users. However, developers may introduce dependency smells if dependency configuration files are not well written and maintained. Dependency smells are recurring violations of dependency management and can potentially lead to severe consequences. This paper provides an in-depth look at three dependency smells, namely, ...

10.1109/tse.2022.3191353 article EN IEEE Transactions on Software Engineering 2022-07-18

Weighted Suspiciousness and Balanced Aggregation to Boost Spectrum-based Fault Localization of Deep Learning Models

OPENALEX - Publications

Wenjie Xu Yanhui Li Mingliang Ma Lin Chen Yuming Zhou

Deep learning (DL) models have proven to be highly successful and are now essential to our everyday routines. However, DL models, like traditional software, inevitably contain bugs that affect their performance in real-world scenarios. Effective software engineering techniques are necessary to ensure their dependability. In recent years, fault localization methods for DL models have gained significant attention as a valuable tool for improving the reliability of DL models. Owing to the data-driven programming paradigm, it is challenging...

10.1145/3716849 article EN ACM Transactions on Software Engineering and Methodology 2025-02-12

Views on Life and Death in Taoism and Ancient Greek Religion: Deities Governing Life and Death in Taoism and Ancient Greek Religion

OPENALEX - Publications

Yanhui Li

Life and death are inevitable processes in life, and religions have unfolded rich imaginations and discussions around this topic. Taoism and ancient Greek religion, as religions of the East and West respectively, have formed their own unique views on life and death. This paper, through literature analysis and comparative analysis, studies the views on life and death reflected in the deities that govern life and death, compares the similarities and differences in their views on life and death, and explores the reasons behind them. The study finds that Taoism believes human life originates from the Dao, while ancient Greek religion holds that humans were created...

10.54254/2753-7064/2025.20901 article EN Communications in Humanities Research 2025-02-14

Why and How We Combine Multiple Deep Learning Models With Functional Overlaps

OPENALEX - Publications

Mingliang Ma Yanhui Li Yingxin Chen Chen Lin Yuming Zhou

The evolution (e.g., development and maintenance) of deep learning (DL) models has attracted much attention. One of the main challenges during the development and maintenance of DL models is model training, which often requires a lot of human resources and computing power (such as labeling costs and parameter training). In recent years, to alleviate this problem, researchers have introduced the idea of software engineering (SE) into DL. Researchers consider DL models as a new type of software, borrowing the practice of traditional software reuse, that is, focusing...

10.1002/smr.70003 article EN Journal of Software Evolution and Process 2025-02-01

Integration of Single Cell and Bulk RNA-Sequencing Reveals Key Genes and Immune Cell Infiltration to Construct a Predictive Model and Identify Drug Targets in Endometriosis

OPENALEX - Publications

Hanke Zhang Yuqing Fang Dan Luo Yanhui Li

Endometriosis is a common chronic neuroinflammatory disease with poorly understood pathogenesis. Molecular changes and specific immune cell infiltration in the eutopic endometrium are critical to disease progression. This study aims to explore the mechanisms and molecular differences in the proliferative eutopic endometrium of endometriosis by integrating bulk RNA-seq and single-cell RNA sequencing (scRNA-seq) data, and to develop diagnostic and predictive models for the disease. Gene expression profiles from endometriosis patients and healthy controls were obtained...

10.2147/jir.s497643 article EN cc-by-nc Journal of Inflammation Research 2025-02-01

Investigating Red Packet Fraud in Android Applications: Insights from User Reviews

OPENALEX - Publications

Yu Cheng Xiaofang Qi Yanhui Li

10.2139/ssrn.5162975 preprint EN 2025-01-01

Using Dynamic and Static Techniques to Establish Traceability Links Between Production Code and Test Code on Python Projects: A Replication Study

OPENALEX - Publications

Zhifei Chen Chiheng Jia Yanhui Li Lin Chen

The relationship between test code and production code, that is, test‐to‐code traceability, plays an essential role in the verification, reliability, and certification of software systems. Prior work on test‐to‐code traceability focuses mainly on Java. However, as Python allows more flexible testing styles, it is still unknown whether existing traceability approaches work well on Python projects. In order to address this gap in knowledge, this paper evaluates whether existing traceability approaches can accurately identify test‐to‐code links in Python projects. We collected seven popular Python projects and carried out...

10.1002/smr.70011 article EN Journal of Software Evolution and Process 2025-03-01

Understanding and Identifying Technical Debt in the Co-Evolution of Production and Test Code

OPENALEX - Publications

Yimeng Guo Zhifei Chen Lu Xiao Lin Chen Yanhui Li and 1 more

10.1109/tse.2025.3553112 article EN IEEE Transactions on Software Engineering 2025-01-01

Extensive mutation for testing of word sense disambiguation models

OPENALEX - Publications

Deping Zhang Zhaohui Yang Xiang Huang Yanhui Li

10.1016/j.infsof.2025.107734 article EN Information and Software Technology 2025-04-01

Less is More: Feature Engineering for Fairness and Performance of Machine Learning Software

OPENALEX - Publications

Linghan Meng Yanhui Li Lin Chen Mingliang Ma Yuming Zhou and 1 more

Machine learning (ML) software employs statistical algorithms to perform high-stake tasks in our daily lives, whose results are usually discriminatory due to protected features (e.g., gender), i.e., one part (called privileged, e.g., male) may be more likely to obtain beneficial decisions than the other part (called unprivileged, e.g., female). In alleviating unfairness, developers have obtained widely-held beliefs about the trade-off between performance and fairness for ML software. Surprisingly, recent research on...

10.1145/3730577 article EN ACM Transactions on Software Engineering and Methodology 2025-04-18

Connecting software metrics across versions to predict defects

OPENALEX - Publications

Yibin Liu Yanhui Li Jianbo Guo Yuming Zhou Baowen Xu

Accurate software defect prediction could help software practitioners allocate test resources to defect-prone modules effectively and efficiently. In the last decades, much effort has been devoted to build accurate defect prediction models, including developing quality defect predictors and modeling techniques. However, current widely used defect predictors such as code metrics and process metrics could not well describe how software modules change over the project evolution, which we believe is important for defect prediction. In order to deal with this problem, in this paper, we propose to use the Historical...

10.1109/saner.2018.8330212 preprint EN 2018-03-01

Generating Python Type Annotations from Type Inference: How Far Are We?

OPENALEX - Publications

Yimeng Guo Zhifei Chen Lin Chen Wenjie Xu Yanhui Li and 2 more

In recent years, dynamic languages such as Python have become popular due to their flexibility and productivity. The lack of static typing makes programs face the challenges of fixing type errors, early bug detection, and code understanding. To alleviate these issues, PEP 484 introduced optional type annotations for Python in 2014, but unfortunately, a large number of programs are still not annotated by developers. Annotation generation tools can utilize type inference techniques. However, several important aspects of type annotation...

10.1145/3652153 article EN ACM Transactions on Software Engineering and Methodology 2024-03-11

Could We Predict the Result of a Continuous Integration Build? An Empirical Study

OPENALEX - Publications

Jing Xia Yanhui Li

Software build integrates software modules developed and maintained by different developers in parallel, tests the result of integration, and serves as a crucial step in cooperative software development. Predicting the build result has drawn the interest of academia and industry. In spite of many previous researches, the generalizability of build failure prediction over a wide range of open-source projects remains unclear. In this paper, we used 9 classifiers to construct prediction models and investigated the performance of both cross-validation and on-line predictions on 126...

10.1109/qrs-c.2017.59 article EN 2017-07-01

An Empirical Study on Dynamic Typing Related Practices in Python Systems

OPENALEX - Publications

Zhifei Chen Yanhui Li Bihuan Chen Wanwangying Ma Lin Chen and 1 more

The dynamic typing discipline of Python allows developers to program at a high level of abstraction. However, type related bugs are commonly encountered in Python systems due to the lack of type declaration and static type checking. Especially, the misuse of dynamic typing discipline produces underlying bugs and increases maintenance efforts. In this paper, we introduce six types of dynamic typing related practices in Python programs, which are the common but potentially risky usage of dynamic typing discipline by developers. We also implement a tool named PYDYPE to detect them. Based on this tool, we conduct an empirical study on nine...

10.1145/3387904.3389253 article EN 2020-07-13

Code-line-level Bugginess Identification: How Far have We Come, and How Far have We Yet to Go?

OPENALEX - Publications

Zhaoqiang Guo Shiran Liu Xutong Liu Wei Lai Mingliang Ma and 7 more

Background. Code-line-level bugginess identification (CLBI) is a vital technique that can facilitate developers to identify buggy lines without expending a large amount of human effort. Most of the existing studies tried to mine the characteristics of source codes to train supervised prediction models, which have been reported to be able to discriminate buggy code lines amongst others in a target program. Problem. However, several simple and clear code characteristics, such as the complexity of code lines, have been disregarded in the current literature. Such...

10.1145/3582572 article EN ACM Transactions on Software Engineering and Methodology 2023-02-01

Measuring Discrimination to Boost Comparative Testing for Multiple Deep Learning Models

OPENALEX - Publications

Linghan Meng Yanhui Li Chen Lin Zhi Wang Di Wu and 2 more

The boom of DL technology leads to massive DL models built and shared, which facilitates the acquisition and reuse of DL models. For a given task, we encounter multiple DL models available with the same functionality, which are considered as candidates to achieve this task. Testers are expected to compare multiple DL models and select the more suitable ones w.r.t. the whole testing context. Due to the limitation of labeling effort, testers aim to select an efficient subset of samples to make an as precise rank estimation as possible for these multiple DL models. To tackle this problem, we propose Sample Discrimination based...

10.1109/icse43902.2021.00045 article EN 2021-05-01

Diagnosis of package installation incompatibility via knowledge base

OPENALEX - Publications

Yulu Cao Zhifei Chen Xiaowei Zhang Yanhui Li Lin Chen and 1 more

10.1016/j.scico.2024.103098 article EN Science of Computer Programming 2024-03-01