- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Advanced Text Analysis Techniques
- Advanced Graph Neural Networks
- Advanced Data Storage Technologies
- Machine Learning in Materials Science
- Speech Recognition and Synthesis
- Model-Driven Software Engineering Techniques
- Simulation Techniques and Applications
- Software Engineering Research
- Bioinformatics and Genomic Networks
- Computational Drug Discovery Methods
- Cryptography and Data Security
- Complex Network Analysis Techniques
- Machine Learning and Data Classification
- AI in cancer detection
- Network Security and Intrusion Detection
- Soil Mechanics and Vehicle Dynamics
- Text Readability and Simplification
- Speech and dialogue systems
- Internet Traffic Analysis and Secure E-voting
- Text and Document Classification Technologies
- Domain Adaptation and Few-Shot Learning
- Software Testing and Debugging Techniques
- Shandong University of Technology, 2024
- Macquarie University, 2023-2024
- China Mobile (China), 2022-2023
- University Town of Shenzhen, 2023
- Tsinghua University, 2022-2023
- NetEase (China), 2022
- Xinjiang Technical Institute of Physics & Chemistry, 2021
- Chinese Academy of Sciences, 2021
- University of Chinese Academy of Sciences, 2021
- UNSW Sydney, 2020-2021
In the emerging cloud computing paradigm, data owners become increasingly motivated to outsource their complex data management systems from local sites to the commercial public cloud for great flexibility and economic savings. For the consideration of users' privacy, sensitive data have to be encrypted before outsourcing, which makes effective data utilization a very challenging task. In this paper, for the first time, we define and solve the problem of privacy-preserving query over encrypted graph-structured data in cloud computing (PPGQ), and establish a set of strict privacy...
With the increasing adoption of cloud computing for data storage, assuring data service reliability, in terms of data correctness and availability, has been outstanding. While redundancy can be added into the data, the problem becomes challenging in the "pay-as-you-use" cloud paradigm, where we always want to resolve it efficiently for both corruption detection and data repair. Prior distributed storage systems based on erasure codes or network coding techniques have either high decoding computational cost for data users, or too much burden of data repair being...
Graph neural networks (GNNs) are now the mainstream method for mining graph-structured data and learning low-dimensional node- and graph-level embeddings to serve downstream tasks. However, limited by the bottleneck of interpretability that deep neural networks present, existing GNNs have ignored the issue of estimating the appropriate number of dimensions for the embeddings. Hence, we propose a novel framework called Minimum Graph Entropy principle-guided Dimension Estimation, i.e. MGEDE, that learns the appropriate embedding dimensions for both node and graph representations. In...
Posters serve an essential function in marketing and advertising by improving visual communication and brand visibility, thus significantly contributing to industrial design. With the latest developments in controllable T2I diffusion models, research interest has surged in text rendering within synthesized images. Although text rendering accuracy has seen advancements, automatic poster generation remains a relatively untapped area. This paper presents an automatic poster generation framework featuring text rendering capabilities through the use of LLMs. Our framework employs...
With the advancement of deep learning techniques, the performance of Automatic Program Repair (APR) techniques has reached a new level. Previous deep learning-based APR techniques essentially modified program sentences in an Autoregressive (AR) manner, which predicts future values based on past values. Due to the manner of word-by-word generation, the AR-based APR technique has a huge time delay. This negative consequence overshadows the widespread adoption of APR techniques in real-life software development. To address the issue, we aim to apply the Non-Autoregressive (NAR)...
Despite advances in generating fluent texts, existing pretraining models tend to attach incoherent event sequences to involved entities when generating narratives such as stories and news. We conjecture that such issues result from representing entities as static embeddings of superficial words, while neglecting to model their ever-changing states, i.e., the information they carry, as the text unfolds. Therefore, we extend the Transformer model to dynamically conduct entity state updates and sentence realization for narrative generation. We propose a...
Graphs have a superior ability to represent relational data, like chemical compounds, proteins, and social networks. Hence, graph-level learning, which takes a set of graphs as input, has been applied to many tasks including comparison, regression, classification, and more. Traditional approaches to learning a set of graphs heavily rely on hand-crafted features, such as substructures. But while these methods benefit from good interpretability, they often suffer from computational bottlenecks as they cannot skirt the graph isomorphism...
Mining and understanding patients’ disease-development patterns is a major healthcare need. A huge number of research studies have focused on medical resource allocation, survivability prediction, risk management, diagnosis, etc. In this article, we are specifically interested in discovering risk factors for patients with a high probability of developing cancers. We propose a systematic data-driven algorithm built around the idea of association rule mining. More precisely, the rule-mining method is firstly applied...
Despite the success of text-to-text pre-trained models in various natural language generation (NLG) tasks, the generation performance is largely restricted by the number of labeled data in downstream tasks, particularly in data-to-text generation tasks. Existing works mostly utilize abundant unlabeled structured data to conduct unsupervised pre-training for task adaption, which fail to model the complex relationship between source structured data and target texts. Thus, we introduce self-training as a better few-shot learner than task-adaptive pre-training,...
Posters play a crucial role in marketing and advertising, contributing significantly to industrial design by enhancing visual communication and brand visibility. With recent advances in controllable text-to-image diffusion models, more concise research is now focusing on rendering text within synthetic images. Despite improvements in text rendering accuracy, the field of end-to-end poster generation remains underexplored. This complex task involves striking a balance between text rendering accuracy and automated layout to produce...
Context: With the waning of Moore's Law, the software industry is placing increasing importance on finding alternative solutions for continuous performance enhancement. The significance and research results of software performance optimization have been on the rise in recent years, especially with the advancement propelled by Large Language Models (LLMs). However, traditional strategies for rectifying performance flaws have shown significant limitations at the competitive code efficiency optimization level, and research on this topic is surprisingly scarce. Objective: This study aims to...
In this study, a closed multi-channel air-blowing plug seedling pick-up device and a combined tray were designed to address the issues of complex structure, high damage rates and low efficiency in fully automated vegetable transplanter systems. The device operates by sealing the seedlings in the cup, where compressed air is channeled into the sealed cavity through multiple passages during the pick-up process. The upper surface of the seedling is subjected to a uniform force, overcoming the friction and adhesion between the seedling and the tray. This process presses the seedling into the guide tube,...
Haozhe Ji, Rongsheng Zhang, Zhenyu Yang, Zhipeng Hu, Minlie Huang. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
This paper proposes to combine pretrained language models with the modular dialogue paradigm for open-domain dialogue modeling. Our method, semantic-enhanced finetuning, instantiates conversation understanding, planning, and response generation as a language model finetuning task. At inference, we disentangle semantic and token variations by specifying sampling methods and constraints for each module separately. For training and evaluation, we present X-Weibo, a Chinese multi-turn open-domain dialogue dataset with automatic annotation for emotions, DAs,...
Natural language processing techniques have witnessed notable success in many applications, such as dialogue generation, machine translation, and document summarization. Among them, the study of document summarization is an active research area, which aims to generate a concise version of the original document and preserve its most informative content. However, existing work fails to consider the interaction among raw data (in terms of contextual similarity and difference), which results in inaccurate and even conflicting outcomes. In this paper,...
Extracting entities and relations from unstructured sentences is one of the most concerned tasks in the field of natural language processing. However, existing works process entity and relation information in a certain order and suffer from error iteration. In this paper, we introduce a relational triplet joint tagging network (RTJTN), which is divided into a tagging layer and a judgment layer. In the tagging layer, instead of extracting entities and relations separately, we propose a method that allows the model to simultaneously extract entities and relations to prevent error iteration; and, to solve the overlapping...
Although Transformers with fully connected self-attentions are powerful to model long-term dependencies, they are struggling to scale to long texts with thousands of words in language modeling. One of the solutions is to equip the model with a recurrence memory. However, existing approaches directly reuse hidden states from the previous segment that encodes contexts in a uni-directional way. As a result, this prohibits the memory to dynamically interact with the current context that provides up-to-date information for token prediction. To remedy this issue, we...