- Topic Modeling
- Natural Language Processing Techniques
- Advanced Graph Neural Networks
- Advanced Malware Detection Techniques
- Network Security and Intrusion Detection
- Multimodal Machine Learning Applications
- Recommender Systems and Techniques
- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Video Surveillance and Tracking Methods
- Software Testing and Debugging Techniques
- Advanced Text Analysis Techniques
- Robotics and Sensor-Based Localization
- Advanced Steganography and Watermarking Techniques
- Security and Verification in Computing
- Domain Adaptation and Few-Shot Learning
- Chaos-based Image/Signal Encryption
- Software Engineering Research
- Anomaly Detection Techniques and Applications
- Speech Recognition and Synthesis
- Crime Patterns and Interventions
- Complex Network Analysis Techniques
- Text Readability and Simplification
- Stability and Control of Uncertain Systems
- Image and Object Detection Techniques
Shandong University of Science and Technology
2023-2024
Hengyang Normal University
2020-2024
National University of Singapore
1992-2024
Meizu (China)
2023-2024
Southwest University
2023-2024
National Institute of Metrology
2024
China Agricultural University
2023-2024
Xidian University
2024
Johns Hopkins University
2024
China Information Technology Security Evaluation Center
2023
System auditing provides a low-level view into cyber threats by monitoring system entity interactions. In response to advanced cyber-attacks, one prevalent solution is to apply data provenance analysis on audit records to search for anomalies (anomalous behaviors) or specifications of known attacks. However, existing approaches suffer from several limitations: 1) generating high volumes of false alarms, 2) relying on expert knowledge, and 3) producing coarse-grained detection signals. In this paper, we...
Cross-lingual named entity recognition (CrossNER) faces challenges stemming from uneven performance due to the scarcity of multilingual corpora, especially for non-English data. While prior efforts mainly focus on data-driven transfer methods, a significant aspect that has not been fully explored is aligning both semantic and token-level representations across diverse languages. In this paper, we propose Multi-view Contrastive Learning Named Entity Recognition (MCL-NER). Specifically,...
We focus on essay generation, which is a challenging task that generates paragraph-level text with multiple topics. Progress towards understanding different topics and expressing diversity in this task requires more powerful generators and richer training and evaluation resources. To address this, we develop a multi-topic aware long short-term memory (MTA-LSTM) network. In this model, we maintain a novel coverage vector, which learns the weight of each topic and is sequentially updated during the decoding process. Afterwards, this vector is fed...
Logical rules are widely used to represent domain knowledge and hypotheses, which is fundamental to symbolic reasoning-based human intelligence. Very recently, it has been demonstrated that integrating logical rules into regular learning tasks can further enhance performance in a label-efficient manner. Many attempts have been made to learn logical rules automatically from knowledge graphs (KGs). However, the majority of existing methods entirely rely on observed rule instances to define the score function for evaluation and thus lack...
Jiduan Liu, Jiahao Qifan Wang, Jingang Wei Wu, Yunsen Xian, Dongyan Zhao, Kai Chen, Rui Yan. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
Automatic speaker recognition (ASR) is a stepping-stone technology towards semantic multimedia understanding and benefits versatile downstream applications. In recent years, neural network-based ASR methods have demonstrated remarkable power to achieve excellent performance with sufficient training data. However, it is impractical to collect sufficient data for every user, especially for fresh users. Therefore, a large portion of users usually has a very limited number of training instances. As a consequence, the lack of data prevents...
Fuli Luo, Wei Wang, Jiahao Liu, Yijia Bin Bi, Songfang Huang, Fei Luo Si. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
The study of coke reactivity under high-temperature conditions is crucial for understanding its behavior in industrial processes. This work presents a novel compact heating furnace, developed for in-situ synchrotron small-angle X-ray scattering (SAXS) studies on high-temperature coke. The furnace achieves temperatures up to 1400 °C at a rate of 12 °C/min and enables steam introduction, simulating industrial conditions. Combination with the SAXS setup at synchrotron radiation facilities facilitates real-time monitoring...
Recommender systems often suffer from popularity bias, where frequently interacted items are overrepresented in recommendations. This bias stems from propensity factors influencing training data, leading to imbalanced exposure. In this paper, we introduce a Fair Sampling (FS) approach to address this issue by ensuring that both users and items are selected with equal probability as positive and negative instances. Unlike traditional inverse propensity score (IPS) methods, FS does not require propensity estimation, eliminating errors...
Detecting code functional similarity forms the basis of various software engineering tasks. However, detection is challenging as functionally similar code fragments can be implemented differently, e.g., with irrelevant syntax. Recent studies incorporate program dependencies as semantics to identify syntactically different yet semantically similar programs, but they often focus only on local neighborhoods (e.g., one-hop dependencies), limiting their expressiveness in modeling functionalities. In this paper, we...
Graph Database Management Systems (GDBMSs) store graphs as data. They are used naturally in applications such as social networks, recommendation systems, and program analysis. However, they can be affected by logic bugs, which cause the GDBMSs to compute incorrect results and subsequently affect the applications relying on them. In this work, we propose injective and surjective Graph Query Transformation (GQT) to detect logic bugs in GDBMSs. Given a query Q, we derive a mutated query Q', so that either their result sets are: (i) semantically...
Most existing sentiment classification models use Word2Vec, GloVe, etc. to obtain the word vector representation of text. But these methods ignore the context of words. In response to this problem, a neural network model based on the combination of the BERT (bidirectional encoder representations from transformers) pre-trained language model, a BLSTM (bidirectional long short-term memory network), and an attention mechanism is proposed for text sentiment analysis in this paper. First, the word vector, which includes contextual semantic information, is obtained through...
Android malware detection serves as the front line against malicious apps. With the rapid advancement of machine learning (ML), ML-based Android malware detection has attracted increasing attention due to its capability of automatically capturing patterns from APKs. These learning-driven methods have reported promising results in detecting malware. However, the absence of an in-depth analysis of current research progress makes it difficult to gain a holistic picture of the state of the art in this area. This paper presents a comprehensive investigation...
Developers insert logging statements into source code to monitor system execution, which forms the basis for software debugging and maintenance. To distinguish diverse runtime information, each log is assigned a separate verbosity level (e.g., trace and error). However, choosing an appropriate verbosity level is a challenging and error-prone task due to the lack of specifications for log level usages. Prior solutions aim to suggest log levels based on the code block in which a logging statement resides (i.e., intra-block features). Such suggestions, however, do...
Named entity recognition (NER) is an important research problem in natural language processing. There are three types of NER tasks: flat, nested, and discontinuous entity recognition. Most previous sequential labeling models are task-specific, while recent years have witnessed the rise of generative models due to the advantage of unifying all NER tasks into the seq2seq model framework. Although they achieve promising performance, our pilot studies demonstrate that existing generative models are ineffective at detecting entity boundaries and estimating...
Existing work in multilingual pretraining has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages. However, much of this work only relies on the shared vocabulary and bilingual contexts to encourage the correlation across languages, which is loose and implicit for aligning the contextual representations between languages. In this paper, we plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages. It can effectively avoid the degeneration of predicting masked...
An efficient method to normalize the skew of document images is proposed. The method is based on the idea that the baseline of a line of text is a straight line, and the inclination of this line with respect to the horizon is the skew angle. The detecting process is conducted in three stages. First, the lowest points of the symbols are extracted. Then, the height attribute of these symbols is used to filter out some unnecessary base points. Finally, a statistical approach is used to get the skew angle by using the position information of the base points.
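The three stages described above can be illustrated with a rough sketch. This is not the authors' implementation: the `(x, y_bottom, height)` symbol tuples, the median-height filter with its `height_tolerance` parameter, and the least-squares line fit standing in for the unspecified "statistical approach" are all assumptions made for illustration, as is the convention that `y` grows upward.

```python
import math
from statistics import median


def estimate_skew_degrees(symbols, height_tolerance=0.3):
    """Estimate the skew angle (degrees) of one text line.

    symbols: list of (x, y_bottom, height) tuples, one per symbol,
    where y_bottom is the symbol's lowest point (hypothetical input format).
    """
    if len(symbols) < 2:
        raise ValueError("need at least two symbols")

    # Stage 1: the lowest point of each symbol is (x, y_bottom).
    # Stage 2: keep only symbols whose height is close to the median height,
    # which drops unusually tall symbols whose lowest point may be off the baseline.
    med_h = median(h for _, _, h in symbols)
    points = [(x, y) for x, y, h in symbols
              if abs(h - med_h) <= height_tolerance * med_h]
    if len(points) < 2:
        raise ValueError("too few base points after filtering")

    # Stage 3: least-squares fit of a straight line through the base points.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("base points share one x position")

    # The slope is sxy / sxx; atan2 converts it to an angle.
    return math.degrees(math.atan2(sxy, sxx))
```

Given the estimated angle, the image would be rotated by the opposite angle to normalize the skew; a robust estimator (for example, a median of pairwise slopes) could replace the least-squares fit when many outlying base points remain.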
Applying for jobs usually brings much work for both applicants and HR. Applicants want to apply for the jobs for which they are most suitable. The number of applications for a particular position can be significant, making candidate selection cumbersome. Nowadays, hiring processes are often conducted virtually via email. This creates opportunities for analyzing the data in resumes. Therefore, to improve efficiency, resume parsing algorithms have been developed in recent years to predict resume-based skills or good...
An approach to the design of large variable structure systems subject to control bounds is introduced. The method includes a switching hyperplane design based on generalized inverses and system decomposition. To ensure reaching and achieving the sliding condition, the control is switched between a local equivalent control and a bounded corrective control. The control component design is completed using decomposition into smaller subsystems. An estimate of the corresponding domain of attraction is obtained and used to select appropriate controller bounds. The method is illustrated on a fifth-order...
While transformer-based pre-trained language models (PLMs) have dominated a number of NLP applications, these models are heavy to deploy and expensive to use. Therefore, effectively compressing large-scale PLMs becomes an increasingly important problem. Quantization, which represents high-precision tensors with a low-bit fixed-point format, is a viable solution. However, most existing quantization methods are task-specific, requiring customized training with a large number of trainable parameters on each individual task....
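As a generic illustration of what representing high-precision values in a low-bit format means, the sketch below applies uniform symmetric quantization with a single scale factor. It is one common scheme chosen as an assumption for illustration, not the method of the paper summarized above; the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integer codes in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]
```

Each reconstructed value differs from the original by at most half a quantization step (`scale / 2`), so fewer bits mean a larger step and a coarser approximation.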
Dynamic interaction graphs have been widely adopted to model the evolution of user-item interactions over time. There are two crucial factors when modelling user preferences for link prediction in dynamic interaction graphs: 1) the collaborative relationship among users and 2) personalized interaction patterns. Existing methods often implicitly consider these two factors together, which may lead to noisy user modelling when the two factors diverge. In addition, they usually require time-consuming parameter learning with back-propagation, which is prohibitive for real-time...