- Software Engineering Research
- Software Testing and Debugging Techniques
- Software Reliability and Analysis Research
- Advanced Malware Detection Techniques
- Software System Performance and Reliability
- Software Engineering Techniques and Practices
- Topic Modeling
- Web Data Mining and Analysis
- Open Source Software Innovations
- Blockchain Technology Applications and Security
- Web Application Security Vulnerabilities
- Recommender Systems and Techniques
- Natural Language Processing Techniques
- Expert finding and Q&A systems
- Advanced Software Engineering Methodologies
- Scientific Computing and Data Management
- Text and Document Classification Technologies
- Soil Carbon and Nitrogen Dynamics
- FinTech, Crowdfunding, Digital Finance
- Spam and Phishing Detection
- Green IT and Sustainability
- Machine Learning and Data Classification
- Caching and Content Delivery
- Autonomous Vehicle Technology and Safety
- Imbalanced Data Classification Techniques
Huawei Technologies (China), 2021-2025
Zhejiang University, 2014-2025
Zhejiang University of Science and Technology, 2014-2025
Monash University, 2017-2024
Sichuan University, 2015-2024
Nanjing Agricultural University, 2008-2024
Huawei Technologies (United Kingdom), 2015-2024
Guangzhou Automobile Group (China), 2024
University of Waterloo, 2020-2024
Changzhou Architectural Research Institute Group (China), 2024
During software maintenance, code comments help developers comprehend programs and reduce additional time spent on reading and navigating source code. Unfortunately, these comments are often mismatched, missing or outdated in software projects. Developers have to infer the functionality from the source code. This paper proposes a new approach named DeepCom to automatically generate code comments for Java methods. The generated comments aim to help developers understand the functionality of Java methods. DeepCom applies Natural Language Processing (NLP) techniques to learn from a large code corpus and generates comments from learned features. We...
Smart contract, a term which was originally coined to refer to the automation of legal contracts in general, has recently seen much interest due to the advent of blockchain technology. Recently, the term is popularly used to refer to low-level code scripts running on a blockchain platform. Our study focuses exclusively on this subset of smart contracts. Such smart contracts have increasingly been gaining ground, finding numerous important applications (e.g., crowdfunding) in the real world. Despite the increasing popularity, smart contract development still remains somewhat...
Session-based recommendation (SBR) focuses on next-item prediction at a certain time point. As user profiles are generally not available in this scenario, capturing the user intent lying in the item transitions plays a pivotal role. Recent graph neural network (GNN) based SBR methods regard the item transitions as pairwise relations, which neglects the complex high-order information among items. A hypergraph provides a natural way to capture beyond-pairwise relations, while its potential for SBR has remained unexplored. In this paper, we fill this gap by...
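The contrast drawn above between pairwise item transitions and beyond-pairwise relations can be sketched as follows. This is a minimal illustration, assuming each session is modeled as a single hyperedge joining all of its items. The function names and the toy sessions are hypothetical and are not taken from the paper.

```python
def pairwise_transitions(sessions):
    """Directed edges between consecutive items (a pairwise transition graph)."""
    edges = set()
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            edges.add((src, dst))
    return edges


def incidence_matrix(sessions):
    """Item-by-hyperedge incidence matrix.

    Assumption for illustration: one hyperedge per session, containing
    every item clicked in that session.
    """
    items = sorted({item for session in sessions for item in session})
    hyperedges = [frozenset(session) for session in sessions]
    matrix = [[1 if item in edge else 0 for edge in hyperedges] for item in items]
    return items, matrix


# Toy data (hypothetical): two short sessions.
sessions = [["a", "b", "c"], ["b", "d"]]
edges = pairwise_transitions(sessions)
items, H = incidence_matrix(sessions)
```

In the pairwise graph, items `a` and `c` are not directly connected even though they occur in the same session. In the incidence matrix they share a column, which is the kind of beyond-pairwise relation a hypergraph can express.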
Defect prediction is a very meaningful topic, particularly at change-level. Change-level defect prediction, which is also referred to as just-in-time defect prediction, could not only ensure software quality in the development process, but also make developers check and fix defects in time. Nowadays, deep learning is a hot topic in the machine learning literature. Whether deep learning can be used to improve the performance of just-in-time defect prediction is still uninvestigated. In this paper, to bridge this research gap, we propose an approach Deeper which leverages deep learning techniques to predict defect-prone...
Contrastive learning (CL) recently has spurred a fruitful line of research in the field of recommendation, since its ability to extract self-supervised signals from the raw data is well-aligned with recommender systems' needs for tackling the data sparsity issue. A typical pipeline of CL-based recommendation models is first augmenting the user-item bipartite graph with structure perturbations, and then maximizing the node representation consistency between different graph augmentations. Although this paradigm turns out to be...
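The two-step pipeline described above (perturb the graph structure, then maximize consistency between node representations from different augmentations) can be sketched in plain Python. This is an illustration under stated assumptions: random edge dropout stands in for "structure perturbations", and an InfoNCE-style loss stands in for the consistency objective. The abstract names neither, and all function names and the temperature value are hypothetical.

```python
import math
import random


def drop_edges(edges, rate, rng):
    """Structure perturbation (assumed): keep each edge with probability 1 - rate."""
    return [edge for edge in edges if rng.random() >= rate]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE-style loss (assumed objective).

    Row i of view_a and row i of view_b are embeddings of the same node
    under two augmentations (the positive pair); all other rows of view_b
    act as negatives. Lower loss means more consistent representations.
    """
    total = 0.0
    for i, u in enumerate(view_a):
        logits = [cosine(u, v) / temperature for v in view_b]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


# Toy user-item edges and two perturbed copies of the graph (hypothetical data).
rng = random.Random(0)
edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u2", "i3")]
graph_view_1 = drop_edges(edges, rate=0.25, rng=rng)
graph_view_2 = drop_edges(edges, rate=0.25, rng=rng)

# Stand-in embeddings; a real model would encode each perturbed graph.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The encoder that would turn each perturbed graph into node embeddings is deliberately omitted. The sketch only shows that the loss is small when the two views agree on each node and large when they do not.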
Software engineering practitioners often spend a significant amount of time and effort to debug. To help practitioners perform this crucial task, hundreds of papers have proposed various fault localization techniques. Fault localization helps practitioners find the location of a defect given its symptoms (e.g., program failures). These techniques have pinpointed the locations of bugs in various systems of diverse sizes, with varying degrees of success, and for various usage scenarios. Unfortunately, it is unclear whether practitioners appreciate this line of research. To fill this gap, we performed an...
Most software defect prediction approaches are trained and applied on data from the same project. However, often a new project does not have enough training data. Cross-project defect prediction, which uses data from other projects to predict defects in a particular project, provides a new perspective on defect prediction. In this work, we propose a HYbrid moDel Reconstruction Approach (HYDRA) for cross-project defect prediction, which includes two phases: a genetic algorithm (GA) phase and an ensemble learning (EL) phase. These two phases create a massive...
During software development and maintenance, developers spend a considerable amount of time on program comprehension activities. Previous studies show that program comprehension takes up as much as half of a developer's time. However, most of these studies are performed in a controlled setting, or with a small number of participants, and investigate the program comprehension activities only within the IDEs. Yet developers' program comprehension activities go well beyond their IDE interactions. In this paper, we extend our ActivitySpace framework to collect and analyze Human-Computer Interaction (HCI) data...
Code summarization, aiming to generate a succinct natural language description of source code, is extremely useful for code search and code comprehension. It has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving summaries from similar code snippets. However, these approaches heavily rely on whether similar code snippets can be retrieved and how similar the snippets are, and fail to capture the API knowledge in the source code, which carries vital information about the functionality of the source code. In this paper, we propose a novel approach, named...
Employing Vehicle-to-Vehicle communication to enhance perception performance in self-driving technology has attracted considerable attention recently; however, the absence of a suitable open dataset for benchmarking algorithms has made it difficult to develop and assess cooperative perception technologies. To this end, we present the first large-scale open simulated dataset for Vehicle-to-Vehicle perception. It contains over 70 interesting scenes, 11,464 frames, and 232,913 annotated 3D vehicle bounding boxes, collected from 8 towns in CARLA and a digital town...
Code clones are semantically similar pairs of code fragments that are syntactically similar or different. Detection of code clones can help to reduce the cost of software maintenance and prevent bugs. Numerous approaches to detecting code clones have been proposed previously, but most of them focus on detecting syntactic clones and do not work well on semantic clones with different syntactic features. To detect semantic clones, researchers have tried to adopt deep learning for code clone detection to automatically learn latent semantic features from data. Especially, to leverage grammar information, several approaches used...
Commit messages can be regarded as the documentation of software changes. These messages describe the content and purposes of changes, hence are useful for program comprehension and software maintenance. However, due to the lack of time and direct motivation, commit messages are sometimes neglected by developers. To address this problem, Jiang et al. proposed an approach (we refer to it as NMT), which leverages a neural machine translation algorithm to automatically generate short commit messages from code. The reported performance of their approach is promising, however, they...
Developers often need to search for appropriate APIs for their programming tasks. Although most libraries have API reference documentation, it is not easy to find appropriate APIs due to the lexical gap and knowledge gap between the natural language description of the programming task and the API description in the documentation. Here, the lexical gap refers to the fact that the same semantic meaning can be expressed by different words, and the knowledge gap refers to the fact that API documentation mainly describes API functionality and structure but lacks other types of information like concepts and purposes, which are usually key items in the task description. In this...
Self-supervised learning (SSL), which can automatically generate ground-truth samples from raw data, holds vast potential to improve recommender systems. Most existing SSL-based methods perturb the raw data graph with uniform node/edge dropout to generate new views and then conduct self-discrimination based contrastive learning over different views to learn generalizable representations. Under this scheme, only a bijective mapping is built between nodes in two different views, which means that the self-supervision signals from other nodes are being...
Adding an ability for a system to learn inherently adds uncertainty into the system. Given the rising popularity of incorporating machine learning into systems, we wondered how the addition of machine learning alters software development practices. We performed a mixture of qualitative and quantitative studies with 14 interviewees and 342 survey respondents from 26 countries across four continents to elicit significant differences between the development of machine learning systems and the development of non-machine-learning systems. Our study uncovers significant differences in various aspects of software engineering (e.g.,...
Smart contracts are programs running on a blockchain. They are immutable to change, and hence cannot be patched for bugs once deployed. Thus it is critical to ensure they are bug-free and well-designed before deployment. A contract defect is an error, flaw or fault in a smart contract that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The...
Session-based recommendation targets next-item prediction by exploiting user behaviors within a short time period. Compared with other recommendation paradigms, session-based recommendation suffers more from the problem of data sparsity due to the very limited short-term interactions. Self-supervised learning, which can discover ground-truth samples from the raw data, holds vast potential to tackle this problem. However, existing self-supervised recommendation models mainly rely on item/segment dropout to augment data, which is not fit for session-based recommendation because the dropout leads to sparser...
Contrastive learning (CL) has recently been demonstrated to be critical in improving recommendation performance. The underlying principle of CL-based recommendation models is to ensure the consistency between representations derived from different graph augmentations of the user-item bipartite graph. This self-supervised approach allows for the extraction of general features from raw data, thereby mitigating the issue of data sparsity. Despite the effectiveness of this paradigm, the factors contributing to its performance gains have yet to be fully...