Xiaobao Wu

ORCID: 0000-0003-0076-3924
Research Areas
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Natural Language Processing Techniques
  • Computational and Text Analysis Methods
  • Human Pose and Action Recognition
  • Advanced Text Analysis Techniques
  • Video Analysis and Summarization
  • Sentiment Analysis and Opinion Mining
  • Text and Document Classification Technologies
  • Semantic Web and Ontologies
  • Catalytic C–H Functionalization Methods
  • Expert finding and Q&A systems
  • Domain Adaptation and Few-Shot Learning
  • Synthesis and Catalytic Reactions
  • Oxidative Organic Chemistry Reactions
  • Adversarial Robustness in Machine Learning
  • Digital Marketing and Social Media
  • Language and cultural evolution
  • Software Engineering Research
  • Complex Network Analysis Techniques
  • Imbalanced Data Classification Techniques
  • COVID-19 diagnosis using AI
  • Human Motion and Animation
  • Advanced Image and Video Retrieval Techniques
  • Machine Learning and Data Classification

Anhui Agricultural University
2021-2025

University of Science and Technology of China
2025

Hefei National Center for Physical Sciences at Nanoscale
2025

Microscale (United States)
2025

Nanyang Technological University
2022-2024

Tsinghua University
2019-2021

Abstract: Topic models have been prevalent for decades to discover latent topics and infer topic proportions of documents in an unsupervised fashion. They are widely used in various applications like text analysis and context recommendation. Recently, the rise of neural networks has facilitated the emergence of a new research field—neural topic models (NTMs). Different from conventional topic models, NTMs directly optimize parameters without requiring model-specific derivations. This endows NTMs with better scalability and flexibility,...

10.1007/s10462-023-10661-7 article EN cc-by Artificial Intelligence Review 2024-01-25
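
The gradient-based optimization that distinguishes NTMs from conventional topic models is easiest to see in a minimal VAE-style neural topic model. The sketch below is an illustrative stand-in (assuming PyTorch; vocab_size, num_topics, and hidden are hypothetical placeholders), not a reference implementation from the survey.

```python
# Minimal VAE-style neural topic model: encode a bag-of-words vector into topic
# proportions and reconstruct the document through a topic-word decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModel(nn.Module):
    def __init__(self, vocab_size=2000, num_topics=50, hidden=200):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.Softplus())
        self.mu = nn.Linear(hidden, num_topics)        # posterior mean
        self.logvar = nn.Linear(hidden, num_topics)    # posterior log-variance
        self.beta = nn.Linear(num_topics, vocab_size)  # topic-word decoder

    def forward(self, bow):
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        theta = F.softmax(z, dim=-1)                   # topic proportions
        recon = F.log_softmax(self.beta(theta), dim=-1)
        nll = -(bow * recon).sum(-1)                   # reconstruction error
        kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
        return (nll + kld).mean()                      # ELBO-style loss

loss = NeuralTopicModel()(torch.rand(8, 2000))
```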

Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.386 article EN cc-by 2023-01-01

Topic models have prevailed for many years at discovering latent semantics when modeling long documents. However, on short texts they generally suffer from data sparsity because of extremely limited word co-occurrences, and thus tend to yield repetitive or trivial topics of low quality. In this paper, to address this issue, we propose a novel neural topic model in the autoencoding framework, with a new distribution quantization approach that generates peakier distributions more appropriate for short texts....

10.18653/v1/2020.emnlp-main.138 article EN cc-by 2020-01-01
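
The snippet does not spell out the distribution quantization mechanism, but the goal of "peakier" topic distributions can be illustrated with a toy temperature-sharpened softmax; this is a generic stand-in, not the paper's actual quantization approach.

```python
# Toy illustration: lowering the softmax temperature yields peakier topic
# distributions (a stand-in for, not a reproduction of, the paper's method).
import torch
import torch.nn.functional as F

logits = torch.randn(4, 50)                       # hypothetical topic logits
theta_soft = F.softmax(logits, dim=-1)            # standard, flatter distribution
theta_peaky = F.softmax(logits / 0.2, dim=-1)     # temperature 0.2 -> peakier
print(theta_soft.max(-1).values, theta_peaky.max(-1).values)
```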

An asymmetric synthesis of 5-halomethyl pyrazolines and isoxazolines bearing a tertiary stereocenter, via catalytic halocyclization of β,γ-unsaturated hydrazones and ketoximes, is described. Using Brønsted acids of anionic chiral Co(III) complexes as catalysts, a variety of products were obtained in good yields with high enantioselectivities (up to 99% yield, 97:3 er). Preliminary bioassay results indicated that several isoxazoline derivatives exhibited significant antifungal activities.

10.1021/acs.orglett.1c03456 article EN Organic Letters 2021-11-23

To overcome the data sparsity issue in short text topic modeling, existing methods commonly rely on data augmentation or the data characteristic of short texts to introduce more word co-occurrence information. However, most of them do not make full use of the augmented data or the data characteristic: they insufficiently learn the relations among samples in the data, leading to dissimilar topic distributions for semantically similar text pairs. To better address data sparsity, in this paper we propose a novel short text topic modeling framework, the Topic-Semantic Contrastive Topic Model (TSCTM)....

10.18653/v1/2022.emnlp-main.176 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01
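
The core idea named in the abstract—pulling topic distributions of semantically similar samples together and pushing dissimilar ones apart—is a contrastive objective. A minimal InfoNCE-style sketch over topic distributions follows (illustrative only; not the exact TSCTM loss, and the tensors are hypothetical).

```python
# Minimal InfoNCE-style contrastive loss over topic distributions (PyTorch).
import torch
import torch.nn.functional as F

def contrastive_loss(theta, theta_pos, temperature=0.5):
    """theta, theta_pos: (batch, num_topics) anchor and positive distributions."""
    z = F.normalize(theta, dim=-1)
    z_pos = F.normalize(theta_pos, dim=-1)
    logits = z @ z_pos.t() / temperature   # (batch, batch) similarity matrix
    labels = torch.arange(z.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

loss = contrastive_loss(torch.rand(8, 50), torch.rand(8, 50))
```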

A high-performance anionic stereogenic-at-cobalt(III) complex/Oxone catalytic system was developed for various enantioselective intramolecular halocyclizations of olefins using halide salts as halogen sources, delivering structurally diverse halogenated heterocyclic compounds with outstanding...

10.1039/d4cc06610c article EN Chemical Communications 2025-01-01

Previous research on multimodal entity linking (MEL) has primarily employed contrastive learning as the main objective. However, by using the rest of the batch as negative samples without careful consideration, these studies risk exploiting easy features and potentially overlooking the essential details that make entities unique. In this work, we propose JD-CCL (Jaccard Distance-based Conditional Contrastive Learning), a novel approach designed to enhance the matching ability of models. JD-CCL leverages meta-information...

10.48550/arxiv.2501.14166 preprint EN arXiv (Cornell University) 2025-01-23
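
The snippet is cut off, but the named idea—conditioning negative selection on Jaccard distance over entity meta-information—can be pictured with a small helper that prefers negatives whose attribute sets overlap the anchor's most (i.e., harder negatives). All names and values below are hypothetical, not the authors' JD-CCL code.

```python
# Illustrative Jaccard-distance-based hard-negative selection.
def jaccard_distance(a: set, b: set) -> float:
    union = a | b
    return 1.0 - (len(a & b) / len(union) if union else 0.0)

def select_hard_negatives(anchor_attrs, candidates, k=2):
    """Pick the k candidate entities closest to the anchor in Jaccard distance,
    i.e. those sharing the most meta-information and hence hardest to separate."""
    ranked = sorted(candidates.items(),
                    key=lambda kv: jaccard_distance(anchor_attrs, kv[1]))
    return [name for name, _ in ranked[:k]]

anchor = {"person", "politician", "USA"}
candidates = {"e1": {"person", "politician", "UK"},
              "e2": {"building"},
              "e3": {"person", "USA"}}
print(select_hard_negatives(anchor, candidates))   # ['e3', 'e1']
```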

Knowledge Base Question Answering (KBQA) aims to answer natural language questions over a large-scale structured knowledge base (KB). Despite advancements with large language models (LLMs), KBQA still faces challenges in weak KB awareness, an imbalance between effectiveness and efficiency, and a high reliance on annotated data. To address these challenges, we propose KBQA-o1, a novel agentic KBQA method with Monte Carlo Tree Search (MCTS). It introduces a ReAct-based agent process for stepwise logical form generation...

10.48550/arxiv.2501.18922 preprint EN arXiv (Cornell University) 2025-01-31
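
The abstract names Monte Carlo Tree Search over stepwise logical-form generation. A generic MCTS selection/expansion/backpropagation skeleton is sketched below; the expand and reward callables are hypothetical placeholders, and this is not the KBQA-o1 agent itself.

```python
# Generic MCTS skeleton (illustrative; expand/reward are placeholders).
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root, expand, reward, iterations=100):
    for _ in range(iterations):
        node = root
        while node.children:                      # selection by UCB
            node = max(node.children, key=ucb)
        for s in expand(node.state):              # expansion
            node.children.append(Node(s, parent=node))
        leaf = random.choice(node.children) if node.children else node
        r = reward(leaf.state)                    # score the partial logical form
        while leaf is not None:                   # backpropagation
            leaf.visits += 1
            leaf.value += r
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits).state
```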

Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal reasoning capabilities, but they remain susceptible to hallucination, particularly object hallucination, where non-existent objects or incorrect attributes are fabricated in generated descriptions. Existing detection methods achieve strong performance but rely heavily on expensive API calls and iterative LVLM-based validation, making them impractical for large-scale offline use. To address these limitations, we propose...

10.48550/arxiv.2502.12591 preprint EN arXiv (Cornell University) 2025-02-18

Direct Preference Optimization (DPO) often struggles with long-chain mathematical reasoning. Existing approaches, such as Step-DPO, typically improve this by focusing on the first erroneous step in the reasoning chain. However, they overlook all other steps and rely heavily on humans or GPT-4 to identify erroneous steps. To address these issues, we propose Full-Step-DPO, a novel DPO framework tailored for mathematical reasoning. Instead of optimizing only the first erroneous step, it leverages step-wise rewards from the entire reasoning chain. This is achieved by training...

10.48550/arxiv.2502.14356 preprint EN arXiv (Cornell University) 2025-02-20
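
As a rough picture of weighting every reasoning step rather than only the first erroneous one, the sketch below averages a DPO-style logistic loss over per-step log-ratio terms weighted by step-wise rewards. The weighting scheme and tensor shapes are hypothetical, not the exact Full-Step-DPO objective.

```python
# Illustrative step-wise weighted DPO-style loss (PyTorch; hypothetical shapes).
import torch
import torch.nn.functional as F

def stepwise_dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected,
                      step_weights, beta=0.1):
    """All tensors: (num_steps,) per-step log-probs under the policy and a frozen
    reference model; step_weights come from a step-wise reward model, summing to 1."""
    margins = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -(step_weights * F.logsigmoid(margins)).sum()

loss = stepwise_dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4),
                         torch.randn(4), torch.softmax(torch.randn(4), dim=0))
```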

An asymmetric oxidation of N,N-dialkyl sulfenamides is achieved by using anionic stereogenic-at-cobalt(III) complexes as catalysts. This protocol provides an alternative approach to access a diverse set of chiral tertiary sulfinamides with high enantioselectivities (24 examples, up to 94:6 e.r.). Additionally, control experiments suggest that this transformation could proceed through a cationic S(IV) intermediate.

10.1021/acs.orglett.4c04857 article EN Organic Letters 2025-02-26

Chiral sulfilimines, aza analogues of sulfoxides, are essential in natural products and pharmaceuticals, highlighting the importance of their synthesis by asymmetric catalysis. However, efficient approaches for synthesizing chiral diaryl sulfilimines are still rare and challenging, particularly for those with two sterically similar aryl groups. Herein, we present a mild protocol for generating diverse enantioenriched diaryl and alkyl aryl sulfilimines via copper-catalyzed enantioselective S-arylation of N-acyl sulfenamides with diaryliodonium salts....

10.1021/acs.orglett.5c00132 article EN Organic Letters 2025-03-20

To equip artificial intelligence with a comprehensive understanding of the temporal world, video and 4D panoptic scene graph generation abstracts visual data into nodes that represent entities and edges that capture relations. Existing methods encode entity masks tracked across temporal dimensions (mask tubes), then predict their relations with a pooling operation, which does not fully utilize the motion indicative of the entities' relations. To overcome this limitation, we introduce a contrastive representation learning...

10.1609/aaai.v39i6.32665 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Temporal grounding, which localizes video moments related to a natural language query, is a core problem of vision-language learning and video understanding. To encode moments of varying lengths, recent methods employ a multi-level structure known as a feature pyramid. In this structure, lower levels concentrate on short-range moments, while higher levels address long-range moments. Because higher levels experience downsampling to accommodate increasing moment length, their capacity to capture information is reduced, which consequently leads to degraded...

10.1609/aaai.v39i6.32666 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11
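
The multi-level structure described above builds coarser levels by repeatedly downsampling the clip features. A minimal strided-convolution pyramid over 1D video features is sketched below as a generic illustration; it is not the paper's architecture, and all dimensions are hypothetical.

```python
# Minimal 1D feature pyramid: each level halves the temporal resolution (PyTorch).
import torch
import torch.nn as nn

class FeaturePyramid(nn.Module):
    def __init__(self, dim=512, levels=4):
        super().__init__()
        self.downs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=3, stride=2, padding=1)
            for _ in range(levels - 1))

    def forward(self, x):                 # x: (batch, dim, num_clips)
        feats = [x]
        for down in self.downs:
            feats.append(down(feats[-1]))
        return feats                      # one tensor per level, coarser with depth

levels = FeaturePyramid()(torch.randn(2, 512, 64))
print([f.shape[-1] for f in levels])      # [64, 32, 16, 8]
```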

Hierarchical topic modeling aims to discover latent topics from a corpus and organize them into a hierarchy to understand documents with desirable semantic granularity. However, existing work struggles with producing topic hierarchies of low affinity, rationality, and diversity, which hampers document understanding. To overcome these challenges, we in this paper propose the Transport Plan and Context-aware Hierarchical Topic Model (TraCo). Instead of the early simple topic dependencies, it proposes a transport plan dependency method. It constrains...

10.1609/aaai.v38i17.29895 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24
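
The transport-plan dependency named above can be pictured with a small entropic optimal-transport (Sinkhorn) routine that produces a doubly-constrained coupling between lower-level and higher-level topics. This is a generic sketch under assumed uniform marginals, not TraCo's implementation.

```python
# Generic Sinkhorn iteration computing an entropic optimal-transport plan (PyTorch).
import torch

def sinkhorn(cost, a, b, eps=0.1, iters=50):
    """cost: (n_child, n_parent) distances; a, b: marginal weights of each side."""
    K = torch.exp(-cost / eps)
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u)
        u = a / (K @ v)
    return u.unsqueeze(1) * K * v.unsqueeze(0)    # transport plan (coupling)

cost = torch.rand(20, 5)                          # hypothetical topic-to-topic distances
plan = sinkhorn(cost, torch.full((20,), 1 / 20), torch.full((5,), 1 / 5))
print(plan.sum(1), plan.sum(0))                   # rows ≈ a, columns ≈ b
```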

Cross-lingual topic models have been prevalent for cross-lingual text analysis by revealing aligned latent topics. However, most existing methods suffer from producing repetitive topics that hinder further analysis, and from performance decline caused by low-coverage dictionaries. In this paper, we propose Cross-lingual Topic Modeling with Mutual Information (InfoCTM). Instead of the direct alignment in previous work, we propose a topic alignment with mutual information method. This works as a regularization to properly align topics and prevent degenerate...

10.1609/aaai.v37i11.26612 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video summarization. With a growing number of tasks and limited training data, the full fine-tuning approach leads to costly model storage and unstable training. To overcome these shortcomings, we introduce lightweight adapters to the pre-trained model and only update them at fine-tuning time. However, existing adapters fail to capture intrinsic relations among video frames or textual words....

10.1609/aaai.v38i17.29847 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24
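
The lightweight adapters mentioned above follow the standard bottleneck pattern: a down-projection, a nonlinearity, an up-projection, and a residual connection, inserted into a frozen backbone. The sketch below shows that generic pattern, not the paper's relation-aware variant.

```python
# Generic bottleneck adapter with a residual connection (PyTorch).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)   # down-projection
        self.up = nn.Linear(bottleneck, dim)     # up-projection
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))  # residual update only

# Only the adapter parameters are trained; the pretrained backbone stays frozen.
out = Adapter()(torch.randn(2, 16, 768))
```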

Large Language Models (LLMs), which bridge the gap between human language understanding and complex problem-solving, achieve state-of-the-art performance on several NLP tasks, particularly in few-shot and zero-shot settings. Despite the demonstrable efficacy of LLMs, due to constraints on computational resources, users have to engage with open-source models or outsource the entire training process to third-party platforms. However, research has demonstrated that language models are susceptible to potential security...

10.36227/techrxiv.172832726.62863760/v1 preprint EN 2024-10-07

Since effective semantic representations are utilized in many practical applications, inferring discriminative and coherent latent topics from short texts is a critical and basic task. Traditional topic models like Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) do not behave well on short texts due to the data sparsity problem. A novel model called the Biterm Topic Model (BTM), which models unordered word-pairs (i.e., biterms) over the whole corpus, was proposed to solve this problem. However, both the performance...

10.1109/ijcnn.2019.8852366 article EN 2019 International Joint Conference on Neural Networks (IJCNN) 2019-07-01
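
BTM's central object is the biterm: every unordered word pair co-occurring within a short text, pooled over the whole corpus. A small extraction helper illustrates the idea (a sketch, with whitespace tokenization assumed).

```python
# Extracting biterms (unordered word pairs) from short texts, as modeled by BTM.
from itertools import combinations

def extract_biterms(docs):
    biterms = []
    for doc in docs:
        words = doc.split()
        biterms.extend(frozenset(pair) for pair in combinations(words, 2))
    return biterms

print(extract_biterms(["cheap flight deal", "flight deal today"]))
```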

Topic models have been prevalent for decades with various applications. However, existing topic models commonly suffer from the notorious topic collapsing: discovered topics semantically collapse towards each other, leading to highly repetitive topics, insufficient topic discovery, and damaged model interpretability. In this paper, we propose a new neural topic model, the Embedding Clustering Regularization Topic Model (ECRTM). Besides the reconstruction error, it employs a novel Embedding Clustering Regularization (ECR), which forces each topic embedding to be the center of a separately...

10.48550/arxiv.2306.04217 preprint EN other-oa arXiv (Cornell University) 2023-01-01
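
The regularizer described—forcing each topic embedding to act as the center of its own cluster of word embeddings—can be approximated with a soft-assignment clustering penalty. The sketch below is a generic stand-in for the ECR term, with hypothetical dimensions.

```python
# Generic soft clustering regularizer pulling topic embeddings toward the word
# embeddings softly assigned to them (PyTorch; illustrative stand-in for ECR).
import torch
import torch.nn.functional as F

def clustering_regularizer(word_emb, topic_emb, temperature=0.1):
    """word_emb: (V, d) word embeddings; topic_emb: (K, d) topic embeddings."""
    dist = torch.cdist(word_emb, topic_emb)           # (V, K) pairwise distances
    assign = F.softmax(-dist / temperature, dim=-1)   # soft word-to-topic assignment
    return (assign * dist).mean()                     # small when topics sit at cluster centers

reg = clustering_regularizer(torch.randn(2000, 128), torch.randn(50, 128))
```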

Existing solutions to zero-shot text classification either conduct prompting with pre-trained language models, which is sensitive to the choices of templates, or rely on large-scale annotated data of relevant tasks for meta-tuning. In this work, we propose a new paradigm based on self-supervised learning to solve zero-shot text classification by tuning the models with unlabeled data, called self-supervised tuning. By exploring the inherent structure of free texts, we propose a first sentence prediction objective to bridge the gap between unlabeled data and text classification tasks. After tuning, the model learns to predict the first sentence in...

10.18653/v1/2023.findings-acl.110 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

Temporal Language Grounding seeks to localize video moments that semantically correspond to a natural language query. Recent advances employ the attention mechanism to learn the relations between video and text. However, naive attention might not be able to appropriately capture such relations, resulting in ineffective distributions where target moments are difficult to separate from the remaining ones. To resolve the issue, we propose an energy-based model framework to explicitly model moment-query distributions. Moreover, we propose DemaFormer, a novel...

10.18653/v1/2023.findings-emnlp.235 article EN cc-by 2023-01-01