- Advanced Graph Neural Networks
- Data Stream Mining Techniques
- Complex Network Analysis Techniques
- Anomaly Detection Techniques and Applications
- Topic Modeling
- Graph Theory and Algorithms
- Machine Learning and Data Classification
- Recommender Systems and Techniques
- Time Series Analysis and Forecasting
- Network Security and Intrusion Detection
- Spam and Phishing Detection
- Data Management and Algorithms
- Bioinformatics and Genomic Networks
- Imbalanced Data Classification Techniques
- Machine Learning and Algorithms
- Human Mobility and Location-Based Analysis
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Caching and Content Delivery
- Face and Expression Recognition
- Text and Document Classification Technologies
- Privacy-Preserving Technologies in Data
- Organic Light-Emitting Diodes Research
- Bayesian Methods and Mixture Models
- Machine Learning and ELM
Nanjing University of Posts and Telecommunications
2024-2025
Guangzhou University
2021-2024
Nanjing University of Science and Technology
2022-2024
Beijing Institute of Big Data Research
2024
Association for Computing Machinery
2023
Chinese Academy of Sciences
2009-2022
Institute of Information Engineering
2014-2022
Northeastern University
2022
Chongqing University
2022
Shandong Agricultural University
2022
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates...
Graph neural networks (GNNs) emerged recently as a powerful tool for analyzing non-Euclidean data such as social network data. Despite their success, the design of graph neural networks requires heavy manual work and domain knowledge. In this paper, we present a graph neural architecture search method (GraphNAS) that enables automatic design of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and trains the recurrent network with policy gradient to maximize the expected...
Multiview learning (MVL), by exploiting the complementary information among multiple feature sets, can improve the performance of many existing learning tasks. Support vector machine (SVM)-based models have been frequently used for MVL. A typical SVM-based MVL model is SVM-2K, which extends SVM for MVL by using the distance minimization version of kernel canonical correlation analysis. However, SVM-2K cannot fully unleash the power of the complementary information among different feature views. Recently, a framework of learning using privileged information (LUPI) has been proposed to model data with complementary information....
Attributed network embedding enables joint representation learning of node links and attributes. Existing attributed network embedding models are designed in continuous Euclidean spaces, which often introduce data redundancy and impose challenges to storage and computation costs. To this end, we present a Binarized Attributed Network Embedding model (BANE for short) to learn binary node representation. Specifically, we define a new Weisfeiler-Lehman proximity matrix to capture data dependence between node links and attributes by aggregating the information from...
In this paper, we address a new research problem on active learning from data streams where data volumes grow continuously and labeling all data is considered expensive and impractical. The objective is to label a small portion of stream data from which a model is derived to predict newly arrived instances as accurately as possible. In order to tackle the challenges raised by data streams' dynamic nature, we propose a classifier ensembling based active learning framework which selectively labels instances from data streams to build an accurate classifier. A minimal variance principle is introduced to guide instance...
Graph Neural Networks (GNNs) have been popularly used for analyzing non-Euclidean data such as social network data and biological data. Despite their success, the design of graph neural networks requires a lot of manual work and domain knowledge. In this paper, we propose a Graph Neural Architecture Search method (GraphNAS for short) that enables automatic search of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS first uses a recurrent network to generate variable-length strings that describe the architectures...
In this paper, we study a new problem of continuous learning from doubly-streaming data where both data volume and feature space increase over time. We refer to the doubly-streaming data as trapezoidal data streams and to the corresponding learning problem as online learning from trapezoidal data streams. The problem is challenging because both data volume and data dimension increase over time, and existing online learning [1], [2], online feature selection [3], and streaming feature selection algorithms [4], [5] are inapplicable. We propose a new Online Learning with Streaming Features...
Inductive link prediction for knowledge graphs aims at predicting missing links between unseen entities, that is, those not shown in the training stage. Most previous works learn entity-specific embeddings of entities, which cannot handle unseen entities. Several recent methods utilize the enclosing subgraph to obtain inductive ability. However, all these works only consider the enclosing part of the subgraph without complete neighboring relations, which leads to the issue that partial neighboring relations are neglected and sparse subgraphs are hard to handle. To address that, we...
Heterogeneous graphs are commonly used to describe networked data with multiple types of nodes and edges. Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for analyzing heterogeneous graphs. However, designing neural architectures for HGNNs requires extensive domain knowledge and time-consuming manual work. Recently, neural architecture search algorithms have become popular in automatically designing neural architectures for homogeneous graph neural networks. In this paper, we present a Heterogeneous Graph Neural Architecture Search algorithm (HGNAS for short) which allows the...
Ensemble learning is a commonly used tool for building prediction models from data streams, due to its intrinsic merits of handling large volumes of stream data. Despite its extraordinary successes in stream data mining, existing ensemble models, in stream data environments, mainly fall into the ensemble classifiers category, without realizing that building classifiers requires a labor-intensive labeling process, and it is often the case that we may have a small number of labeled samples to train a few classifiers, but a large number of unlabeled samples are available to build clusters from data streams....
Ensemble learning is a common tool for data stream classification, mainly because of its inherent advantages of handling large volumes of stream data and concept drifting. Previous studies, to date, have been primarily focused on building accurate ensemble models from stream data. However, a linear scan of a large number of base classifiers in the ensemble during prediction incurs significant costs in response time, preventing ensemble learning from being practical for many real-world time-critical data stream applications, such as Web traffic monitoring, spam detection, and intrusion...
With the unlimited growth of real-world data size and the increasing requirement of real-time processing, immediate processing of big stream data has become an urgent problem. In stream data, hidden patterns commonly evolve over time (i.e., concept drift), and many dynamic learning strategies have been proposed, such as incremental learning and ensemble learning. To the best of our knowledge, there is no work that systematically compares these two methods. In this paper, we conduct a comparative study between these two learning methods. We first introduce the concept...
The interaction of multiple drugs could lead to serious events, which cause injuries and huge medical costs. Accurate prediction of drug-drug interaction (DDI) events can help clinicians make effective decisions and establish appropriate therapy programs. Recently, many AI-based techniques have been proposed for predicting DDI associated events. However, most existing methods pay less attention to the potential correlations between DDI events and other multimodal data such as targets and enzymes. To address this problem, we...
Multiple-resonance thermally activated delayed fluorescence (MR-TADF) materials have attracted extensive attention due to their 100% exciton utilization efficiency and narrowband emissions. Numerous tube-shaped MR-TADF emitters with full-color emissions have been reported, and updated molecular design strategies need to be proposed to find more "recipes" to narrow the emission spectral range. Upon changing the shape of the fluorophore from a tubular to a fan-shaped structure, the investigated molecules exhibit based on...
In this paper, we propose a framework to build prediction models from data streams which contain both labeled and unlabeled examples. We argue that due to the increasing data collection ability but limited resources for labeling, stream data collected at hand may only have a small number of labeled examples, whereas a large portion of data remain unlabeled but can be beneficial for learning. Unleashing the full potential of the unlabeled instances for stream data mining is, however, a significant challenge, considering that even fully labeled data streams may suffer from concept drifting, and inappropriate uses of the unlabeled samples...
Lazy learning, such as k-nearest neighbor learning, has been widely applied to many applications. Known for capturing data locality well, lazy learning can be advantageous for highly dynamic and complex learning environments such as data streams. Yet its high memory consumption and low prediction efficiency have made it less favorable for stream-oriented applications. Specifically, traditional lazy learning stores all the training data, and the inductive process is deferred until a query appears, whereas in stream applications, data records flow continuously in large volumes and the prediction of class labels...
Knowledge Graph (KG) embedding has become crucial for the task of link prediction. Recent work applies encoder-decoder models to tackle this problem, where an encoder is formulated as a graph neural network (GNN) and a decoder is represented by an embedding method. These approaches enforce embedding techniques with structure information. Unfortunately, existing GNN-based frameworks still confront three severe problems: low representational power, stacking in a flat way, and poor robustness to noise. In this work, we propose a novel...
Graph neural networks (GNNs) are popularly used to analyze non-Euclidean graph data. Despite their successes, the design of graph neural networks requires heavy manual work and rich domain knowledge. Recently, neural architecture search algorithms have been widely used to automatically design neural architectures for CNNs and RNNs. Inspired by the success of neural architecture search algorithms, we present a graph neural architecture search algorithm, GraphNAS, that enables automatic design of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS uses a recurrent network as the controller to generate variable-length strings that describe the architectures of graph neural networks,...
Ensemble learning has become a common tool for data stream classification, being able to handle large volumes of stream data and concept drifting. Previous studies focus on building accurate prediction models from stream data. However, a linear scan of a large number of base classifiers in the ensemble during prediction incurs significant costs in response time, preventing ensemble learning from being practical for many real-world time-critical data stream applications, such as Web traffic monitoring, spam detection, and intrusion detection. In these applications, data streams usually arrive at a speed...
The latent friend recommendation in online social media is interesting, yet challenging, because the user-item ratings and the user-user relationships are both sparse. In this paper, we propose a new dual implicit mining-based latent friend recommendation model that simultaneously considers the implicit interest topics of users and the implicit link relationships between users in the local topic cliques. Specifically, we first propose an algorithm, called all reviews from a user and all tags from their corresponding items, to learn the interest topics of users and their topic weights, then compute the user interest topic similarity using symmetric Jensen-Shannon divergence....
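The symmetric Jensen-Shannon divergence mentioned in the abstract above can be illustrated with a minimal sketch. The function names, the base-2 logarithm, and the `1 - JSD` similarity are assumptions made for illustration only; the abstract does not specify how the paper turns the divergence into a similarity score.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Symmetric in p and q, and bounded in [0, 1] when the logarithm is base 2.
    """
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def topic_similarity(p, q):
    """Hypothetical similarity score: 1 minus the divergence (an assumption)."""
    return 1.0 - js_divergence(p, q)


# Hypothetical topic-weight vectors for two users over three topics.
user_a = [0.7, 0.2, 0.1]
user_b = [0.1, 0.3, 0.6]
print(topic_similarity(user_a, user_b))
```

Unlike the plain Kullback-Leibler divergence, this measure gives the same value regardless of argument order and stays finite when one distribution assigns zero weight to a topic.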
In recent years, temporal knowledge graph (TKG) reasoning has received significant attention. Most existing methods assume that all timestamps and corresponding graphs are available during training, which makes it difficult to predict future events. To address this issue, recent works learn to infer future events based on historical information. However, these methods do not comprehensively consider the latent patterns behind temporal changes, pass historical information selectively, update representations appropriately, or predict events accurately....
Live-streaming platforms have recently gained significant popularity by attracting an increasing number of young users and have become a very promising form of online shopping. Similar to traditional online shopping platforms such as Taobao, live-streaming platforms also suffer from malicious online fraudulent behaviors where many transactions are not genuine. The existing anti-fraud models proposed to recognize fraudulent transactions on traditional online shopping platforms are inapplicable on live-streaming platforms. This is mainly because live-streaming platforms are characterized by a unique type of heterogeneous networks with multiple types of nodes...
Currently, blockchain technology has been widely applied to various industries and has attracted wide attention. However, because of its unique anonymity, digital currency has become a haven for all kinds of cyber crimes. It is reported that Ethereum frauds provide huge profits and pose a serious threat to the financial security of the network. To create a desired environment, an effective method is urgently needed to automatically detect and identify frauds in the governance system. In view of this, this paper proposes a method for detecting Ethereum frauds by mining...