- Mobile Crowdsensing and Crowdsourcing
- Software System Performance and Reliability
- Software Engineering Research
- Service-Oriented Architecture and Web Services
- Data Stream Mining Techniques
- Topic Modeling
- Software Reliability and Analysis Research
- Privacy-Preserving Technologies in Data
- Anomaly Detection Techniques and Applications
- Software Testing and Debugging Techniques
- Network Security and Intrusion Detection
- Recommender Systems and Techniques
- Internet Traffic Analysis and Secure E-voting
- Caching and Content Delivery
- Expert Finding and Q&A Systems
- Natural Language Processing Techniques
- Auction Theory and Applications
- Distributed and Parallel Computing Systems
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Evolution and Genetic Dynamics
- Distributed Control Multi-Agent Systems
- Mathematical and Theoretical Epidemiology and Ecology Models
- Adversarial Robustness in Machine Learning
- Web Data Mining and Analysis
Beihang University
2015-2024
Zhengzhou University
2024
Beijing Advanced Sciences and Innovation Center
2020-2023
Beijing Aerospace Flight Control Center
2022-2023
Jiangnan University
2021
Beijing University of Technology
2015
Source code summarization aims to automatically generate concise summaries of source code in natural language, in order to help developers better understand and maintain code. Traditional work generates a summary by utilizing information retrieval techniques, which select terms from the original code or adapt the summaries of similar snippets. Recent studies adopt Neural Machine Translation techniques to summarize snippets using encoder-decoder neural networks. The neural-based approaches prefer the high-frequency words in the corpus and have...
Unsupervised sentence representation learning is a fundamental problem in natural language processing. Recently, contrastive learning has achieved great success on this task. Existing contrastive learning based models usually apply random sampling to select negative examples for training. Previous work in computer vision has shown that hard negative examples help contrastive learning achieve faster convergence and better optimization. However, the importance of hard negatives in sentence representation learning is yet to be explored. In this study, we prove that hard negatives are essential for maintaining strong gradient...
Multitask learning has shown promising performance in learning multiple related tasks simultaneously, and variants of model architectures have been proposed, especially for supervised classification problems. One goal of multitask learning is to extract a good representation that sufficiently captures the relevant part of the input about the output of each task. To achieve this objective, in this paper we design a multitask learning architecture based on the observation that correlations exist between the outputs of some related tasks (e.g. entity recognition and relation extraction...
Automatic software debugging mainly includes two tasks: fault localization and automated program repair. Compared with the traditional spectrum-based and mutation-based methods, deep learning-based methods have been proposed to achieve better performance for fault localization. However, the existing methods ignore semantic features or only consider simple code representations. They do not leverage bug-related knowledge from large-scale open-source projects either. In addition, template-based repair techniques can...
In this paper, a new distributed consensus tracking protocol incorporating local disturbance rejection is devised for a multi-agent system with heterogeneous dynamic uncertainties and disturbances over a directed graph. It is of a two-degree-of-freedom nature. Specifically, a robust controller is designed for tracking, while an estimator is designed for each agent without requiring the input channel information of the disturbances. The asymptotic condition is derived. Moreover, even when the model is not exactly known, the developed method also...
In recent years, template-based and NMT-based automated program repair methods have been widely studied and have achieved promising results. However, there are still disadvantages in both methods. The template-based methods cannot fix bugs whose types are beyond the capabilities of the templates and only use syntax information to guide patch synthesis, while the NMT-based methods intend to generate a small range of fixed code for better performance and may suffer from the OOV (out-of-vocabulary) problem. To solve these problems, we propose a novel neural approach called...
End-to-end learning from crowds has recently been introduced as an EM-free approach to training deep neural networks directly from noisy crowdsourced annotations. It models the relationship between true labels and annotations with a specific type of neural layer, termed the crowd layer, which can be trained using pure backpropagation. Parameters of the crowd layer, however, can hardly be interpreted as annotator reliability, compared with the more principled probabilistic approach. The lack of interpretation further prevents extensions to account for...
Automated bug detection is essential for high-quality software development and has attracted much attention over the years. Among the various bugs, previous studies show that condition expressions are quite error-prone and condition-related bugs are commonly found in practice. Traditional approaches to automated bug detection are usually limited to compilable code and require tedious manual effort. Recent deep learning-based work tends to learn general syntactic features based on the Abstract Syntax Tree (AST) or apply existing Graph...
Training deep neural network (DNN) models, which has become an important task in today's software development, is often costly in terms of computational resources and time. With the inspiration of reuse, building DNN models through reusing existing ones has gained increasing attention recently. Prior approaches to DNN model reuse have two main limitations: 1) reusing the entire model, while only a small part of the model's functionalities (labels) are required, would cause much overhead (e.g., time costs for inference), 2)...
In e-commerce systems, customer reviews are important information for understanding market feedback on certain commodities. However, accurately analyzing reviews is challenging due to the complexity of natural language processing and the informal descriptions in reviews. Existing methods mainly focus on studying efficient algorithms that cannot guarantee the accuracy of review analysis. Crowdsourcing can improve the accuracy of review analysis, while it is subject to extra costs and slow response time. In this work, we combine machine learning...
Using machines to generate text has attracted considerable attention recently. However, low-quality text generated by machines will seriously impact the user experience due to poor readability. Traditional methods for detecting low-quality text heavily depend on hand-crafted features, while most deep learning methods for general text classification tend to model the semantic representation of topics, and thus overlook the coherence that is also useful for detecting such text. In this paper, we propose an end-to-end neural architecture that learns the coherence of text sequences. We conduct...
Knowing developer expertise is critical for achieving effective task allocation. However, it is a great challenge to accurately profile the expertise of developers over the Internet, as their activities often disperse across different online communities. In this regard, the existing works either merely concern a single community, or simply sum up the expertise in individual communities. The former suffers from low accuracy due to incomplete data, while the latter impractically assumes that the communities are completely independent and irrelevant. To overcome those...
Mobile crowdsourcing is an emerging paradigm which utilizes distributed smartphones to monitor diverse phenomena about human activities and the surrounding environment, enabling a large number of mobile applications. For those applications to collect sufficient data, motivating smartphone users to be interested in the campaign becomes very significant. Most incentive mechanisms assume that the tasks are static in the systems. Even for the studies that take uncertain arrival into consideration, they always ignore...
Cloud computing is commonly characterized as a three-layer architecture including IaaS, PaaS and SaaS, while the service-oriented approach is widely considered a promising software development method. In this paper, we report our early experience of moving traditional software development to the cloud environment. Our primary goal is to provide instant development, deployment and running of services for developers. Corresponding to the three layers of cloud computing, our work includes appliance management, an app engine and online development. We elaborate on the design...
Fault localization (FL) and automated program repair (APR) are two main tasks of automatic software debugging. Compared with traditional methods, deep learning-based approaches have been demonstrated to achieve better performance in FL and APR tasks. However, the existing deep learning-based FL methods ignore semantic features or only consider simple code representations. For APR tasks, the existing template-based methods are weak in selecting the correct fix templates for more effective program repair, and are also not able to synthesize patches via the embedded...
In this paper, we present WS‐TaaS, a Web services load testing platform built on the global platform PlanetLab. WS‐TaaS enables the load testing process to be simple, transparent, and as close as possible to the real running scenarios of the target services. First, we briefly introduce the base platform, Service4All. Second, we provide a detailed analysis of the requirements of Web service load testing and present its conceptual architecture as well as the algorithm design for improving resource utilization. Third, we present the implementation details of WS‐TaaS. Finally, we perform an evaluation of WS‐TaaS with a set...