- Advanced Graph Neural Networks
- Recommender Systems and Techniques
- Topic Modeling
- Multimodal Machine Learning Applications
- Natural Language Processing Techniques
- Domain Adaptation and Few-Shot Learning
- Graph Theory and Algorithms
- Complex Network Analysis Techniques
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Advanced Bandit Algorithms Research
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Generative Adversarial Networks and Image Synthesis
- Cloud Computing and Resource Management
- Data Management and Algorithms
- Video Analysis and Summarization
- Machine Learning in Healthcare
- Advanced Multi-Objective Optimization Algorithms
- Privacy-Preserving Technologies in Data
- Visual Attention and Saliency Detection
- Advanced Text Analysis Techniques
- Quantum Chromodynamics and Particle Interactions
- Metaheuristic Optimization Algorithms Research
- Advanced Database Systems and Queries
Shenyang Agricultural University
2025
Alibaba Group (United States)
2017-2025
Sichuan University
2025
Wuhan Textile University
2025
Zhejiang University
2021-2024
Alibaba Group (China)
2017-2024
China Agricultural University
2023
Alibaba Group (Cayman Islands)
2019-2023
Peking University
2014-2021
University of California, San Diego
2019-2020
Click-through rate (CTR) prediction, whose goal is to estimate the probability of a user clicking on an item, has become one of the core tasks in advertising systems. For a CTR prediction model, it is necessary to capture the latent user interest behind the user behavior data. Besides, considering the changing external environment and internal cognition, user interest evolves dynamically over time. There are several CTR prediction methods for interest modeling, but most of them regard the representation of behavior as the interest directly and lack specific modeling of the latent interest behind concrete behavior. Moreover,...
Text-to-image generation in the general domain has long been an open problem, which requires both a powerful generative model and cross-modal understanding. We propose CogView, a 4-billion-parameter Transformer with a VQ-VAE tokenizer, to advance this problem. We also demonstrate finetuning strategies for various downstream tasks, e.g., style learning, super-resolution, text-image ranking, and fashion design, as well as methods to stabilize pretraining, e.g., eliminating NaN losses. CogView achieves state-of-the-art FID on...
A user can be represented by what he/she does along the history. A common way to deal with the user modeling problem is to manually extract all kinds of aggregated features over the heterogeneous behaviors, which may fail to fully represent the data itself due to limited human instinct. Recent works usually use RNN-based methods to give an overall embedding of a behavior sequence, which could then be exploited by downstream applications. However, this can only preserve very limited information, or aggregated memories of a person. When an application requires...
An increasing number of machine learning tasks require dealing with large graph datasets, which capture rich and complex relationships among potentially billions of elements. The Graph Neural Network (GNN) has become an effective way to address the graph learning problem by converting the graph data into a low-dimensional space while keeping both the structural and property information to the maximum extent, and constructing a neural network for training and referencing. However, it is challenging to provide efficient graph storage and computation capabilities...
Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, intending to predict the next items that the user might interact with. Recent works usually give an overall embedding from a user's behavior sequence. However, a unified user embedding cannot reflect the user's multiple interests during a period. In this paper, we propose a novel controllable multi-interest framework for sequential recommendation,...
We propose a new CogQA framework for multi-hop reading comprehension question answering in web-scale documents. Founded on the dual process theory in cognitive science, the framework gradually builds a cognitive graph in an iterative process by coordinating an implicit extraction module (System 1) and an explicit reasoning module (System 2). While giving accurate answers, our framework further provides explainable reasoning paths. Specifically, our implementation based on BERT and a graph neural network efficiently handles millions of documents for multi-hop reasoning questions in the HotpotQA fullwiki dataset,...
Graph embedding methods aim to map each vertex into a low-dimensional vector space that preserves certain structural relationships among the vertices in the original graph. Recently, several works have been proposed to learn embeddings based on sampled paths from the graph, e.g., DeepWalk, LINE, and Node2Vec. However, these methods only preserve symmetric proximities, which could be insufficient in many applications, even when the underlying graph is undirected. Besides, they lack theoretical analysis of what exactly...
To learn a sequential recommender, existing methods typically adopt the sequence-to-item (seq2item) training strategy, which supervises a sequence model with a user's next behavior as the label and the user's past behaviors as the input. The seq2item strategy, however, is myopic and usually produces non-diverse recommendation lists. In this paper, we study the problem of mining extra signals for supervision by looking at the longer-term future. There exist two challenges: i) reconstructing a future sequence containing many behaviors is exponentially harder than...
Heterogeneous graph neural networks (HGNNs) have been blossoming in recent years, but the unique data processing and evaluation setups used by each work obstruct a full understanding of their advancements. In this work, we present a systematic reproduction of 12 recent HGNNs using their official codes, datasets, settings, and hyperparameters, revealing surprising findings about the progress of HGNNs. We find that simple homogeneous GNNs, e.g., GCN and GAT, are largely underestimated due to improper settings. GAT with...
Deep candidate generation (DCG), which narrows down the collection of relevant items from billions to hundreds via representation learning, has become prevalent in industrial recommender systems. Standard approaches approximate maximum likelihood estimation (MLE) through sampling for better scalability and address the problem of DCG in a way similar to language modeling. However, live recommender systems face severe exposure bias and have a vocabulary several orders of magnitude larger than that of natural language, implying that MLE will...
The mass table in the deformed relativistic Hartree-Bogoliubov theory in continuum (DRHBc) with the PC-PK1 density functional has been established for even-$Z$ nuclei with $8\le Z\le120$, extended from the previous work for even-even nuclei [Zhang et al. (DRHBc Mass Table Collaboration), At. Data Nucl. Data Tables 144, 101488 (2022)]. The calculated binding energies, two-nucleon and one-neutron separation energies, root-mean-square (rms) radii of neutron, proton, matter, and charge distributions, quadrupole deformations, and neutron...
Graph representation learning has been extensively studied in recent years, in which sampling is a critical point. Prior work usually focuses on sampling positive node pairs, while the strategy for negative sampling is left insufficiently explored. To bridge the gap, we systematically analyze the role of negative sampling from the perspectives of both objective and risk, theoretically demonstrating that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance. To the best of our knowledge, we are the first to derive the theory and quantify that a nice negative sampling distribution is pn(u|v) ∝...
User behavior data in recommender systems are driven by the complex interactions of many latent factors behind users' decision-making processes. The factors are highly entangled, and may range from high-level ones that govern user intentions to low-level ones that characterize a user's preference when executing an intention. Learning representations that uncover and disentangle these latent factors can bring enhanced robustness, interpretability, and controllability. However, learning such disentangled representations is challenging and remains largely...
In this work, we explore a scalable way of building a general representation model toward unlimited modalities. We release ONE-PEACE, a highly extensible model with 4B parameters that can seamlessly align and integrate representations across vision, audio, and language modalities. The architecture of ONE-PEACE comprises modality adapters, shared self-attention layers, and modality FFNs. This design allows for the easy extension to new modalities by adding adapters and FFNs, while also enabling multi-modal fusion through the self-attention layers. To pretrain...
Protecting crop yields is the most important aspect of agricultural production, and one of the measures in preserving yields is the control of pests and diseases; therefore, the identification of pests and diseases is of irreplaceable importance. In recent years, with the maturity of computer vision technology, more possibilities have been provided for implementing plant disease detection. However, although deep learning methods are widely used in various tasks, there are still limitations and obstacles in practical applications. Traditional learning-based...
The answering quality of an aligned large language model (LLM) can be drastically improved if it is given properly crafted prompts. In this paper, we propose ExpertPrompting to elicit the potential of LLMs to answer as distinguished experts. We first utilize In-Context Learning to automatically synthesize detailed and customized descriptions of the expert identity for each specific instruction, and then ask LLMs to provide answers conditioned on such agent background. Based on this augmented prompting strategy, we produce a new set...
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness. Although these two metrics are responsible for different ranges of temporal consistency, existing state-of-the-art methods treat them as a unified problem and use monotonous modeling structures (e.g., RNN or attention-based block) to design their networks. However, with a single kind of modeling structure it is difficult to balance the learning of short-term and long-term temporal correlations, which may bias the network toward one of them,...
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude...
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding,...
Transcription factors are pivotal molecules involved in transcriptional and post-transcriptional regulation in plants, playing a crucial role in combating biological stress. Here, we have characterized a regulatory factor, OsbHLH34, which governs the response of rice to infection by Rhizoctonia solani AG1-IA. The expression of OsbHLH34 significantly impacts the susceptibility of rice to infection. Through the generation of knockout and overexpressing lines, we observed that OsbHLH34 acts as a positive regulator of resistance against sheath blight....
The attention module does not always help deep models learn causal features that are robust in any confounding context, e.g., a foreground object feature that is invariant to different backgrounds. This is because the confounders trick the attention into capturing spurious correlations that benefit the prediction when the training and testing data are IID (identical & independent distribution), while harming the prediction when the data are OOD (out-of-distribution). The sole fundamental solution to learn causal attention is by causal intervention, which requires additional annotations of the confounders,...