- Business Process Modeling and Analysis
- Service-Oriented Architecture and Web Services
- Topic Modeling
- Natural Language Processing Techniques
- Semantic Web and Ontologies
- Petri Nets in System Modeling
- Data Quality and Management
- Software Engineering Research
- Text and Document Classification Technologies
- Multimodal Machine Learning Applications
- Software System Performance and Reliability
- Magnetic and transport properties of perovskites and related materials
- Multiferroics and related materials
- Advanced Condensed Matter Physics
- Scheduling and Optimization Algorithms
- Advanced Graph Neural Networks
- Advanced Text Analysis Techniques
- Spam and Phishing Detection
- Advanced Database Systems and Queries
- Open Source Software Innovations
- Software Engineering Techniques and Practices
- Advanced Computational Techniques and Applications
- Data Stream Mining Techniques
- Domain Adaptation and Few-Shot Learning
- Web Data Mining and Analysis
Tsinghua University
2016-2025
Second Affiliated Hospital of Dalian Medical University
2022-2024
Dalian Medical University
2022-2024
Yanshan University
2020-2023
Ningbo Institute of Industrial Technology
2021-2023
Chinese Academy of Sciences
2021-2023
Alibaba Group (Cayman Islands)
2021-2023
Amazon (United States)
2023
Aerospace Information Research Institute
2023
Medical Architecture (United Kingdom)
2023
This paper presents the first comprehensive analysis of ChatGPT's Text-to-SQL ability. Given the recent emergence of the large-scale conversational language model ChatGPT and its impressive capabilities in both conversational abilities and code generation, we sought to evaluate its Text-to-SQL performance. We conducted experiments on 12 benchmark datasets with different languages, settings, or scenarios, and the results demonstrate that ChatGPT has strong text-to-SQL abilities. Although there is still a gap from the current state-of-the-art (SOTA)...
Open relation extraction is the task of extracting open-domain facts from natural language sentences. Existing works either utilize heuristics or distant-supervised annotations to train a supervised classifier over pre-defined relations, or adopt unsupervised methods with additional assumptions that have less discriminative power. In this work, we propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on...
Context: The inclusion of grey literature (GL) is important to remove publication bias while gathering available evidence regarding a certain topic. The number of systematic literature reviews (SLRs) in Software Engineering (SE) is increasing, but we do not know the extent of GL usage in these SLRs. Moreover, Google Scholar is rapidly becoming the search engine of choice for many researchers, but the extent to which it can find primary studies is not known. Objective: This tertiary study is an attempt to i) measure the usage of GL in SLRs in SE. Furthermore, this study proposes...
Chen Qian, Fuli Feng, Lijie Wen, Chunping Ma, Pengjun Xie. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
To alleviate human efforts from obtaining large-scale annotations, Semi-Supervised Relation Extraction methods aim to leverage unlabeled data in addition to learning from limited samples. Existing self-training methods suffer from the gradual drift problem, where noisy pseudo labels on unlabeled data are incorporated during training. To alleviate the noise in pseudo labels, we propose a method called MetaSRE, where a Relation Label Generation Network generates accurate quality assessment on pseudo labels by (meta) learning from the successful and failed attempts on a Relation Classification Network as an additional...
Flexible magnetic materials with robust and controllable perpendicular magnetic anisotropy (PMA) are highly desirable for developing flexible high-performance spintronic devices. However, it is still a challenge to fabricate PMA films on polymers directly. Here, we report a facile method for synthesizing single-crystal freestanding SrRuO3 membranes with controlled crystal structure and orientation using water-soluble Ca3-xSrxAl2O6 sacrificial layers. Through the cooperative effect of crystal structure and orientation,...
As social media platforms are evolving from text-based forums into multi-modal environments, the nature of misinformation in social media is also transforming accordingly. Misinformation spreaders have recently targeted contextual connections between modalities, e.g., text and image. However, existing datasets for rumor detection mainly focus on a single modality, i.e., text. To bridge this gap, we construct MR2, a multimodal multilingual retrieval-augmented dataset for rumor detection. The dataset covers rumors with images...
Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled corpora when human annotation is scarce. Existing works either utilize a self-training scheme to generate pseudo labels, which causes the gradual drift problem, or leverage a meta-learning scheme that does not solicit feedback explicitly. To alleviate selection bias due to the lack of feedback loops in existing LRE learning paradigms, we developed a Gradient Imitation Reinforcement Learning method to encourage pseudo label data to imitate...
Natural Language Inference (NLI) is an increasingly essential task in natural language understanding, which requires inferring the relationship between sentence pairs (premise and hypothesis). Recently, low-resource natural language inference has gained increasing attention, due to significant savings in manual annotation costs and a better fit with real-world scenarios....
The radioactive nature of Large Language Model (LLM) watermarking enables the detection of watermarks inherited by student models when trained on the outputs of watermarked teacher models, making it a promising tool for preventing unauthorized knowledge distillation. However, the robustness of watermark radioactivity against adversarial actors remains largely unexplored. In this paper, we investigate whether student models can acquire the capabilities of teacher models through knowledge distillation while avoiding watermark inheritance. We propose two categories...