Xuejing Liu

ORCID: 0000-0001-9612-3707
Research Areas
  • Multimodal Machine Learning Applications
  • Topic Modeling
  • Human Pose and Action Recognition
  • Domain Adaptation and Few-Shot Learning
  • Natural Language Processing Techniques
  • Organic Light-Emitting Diodes Research
  • Organic Electronics and Photovoltaics
  • Advanced Image and Video Retrieval Techniques
  • Advanced Neural Network Applications
  • Video Surveillance and Tracking Methods
  • Metaheuristic Optimization Algorithms Research
  • Conducting polymers and applications
  • Image Processing Techniques and Applications
  • Luminescence and Fluorescent Materials
  • Surface Roughness and Optical Measurements
  • Handwritten Text Recognition Techniques
  • Optimization and Packing Problems
  • Industrial Vision Systems and Defect Detection
  • Optimization and Search Problems
  • Text Readability and Simplification
  • Optical measurement and interference techniques
  • Evaluation Methods in Various Fields
  • Advanced Text Analysis Techniques
  • Allelopathy and phytotoxic interactions
  • Advanced Multi-Objective Optimization Algorithms

Group Sense (China)
2023-2024

NARI Group (China)
2023

Chinese Academy of Sciences
2014-2022

University of Chinese Academy of Sciences
2014-2022

Institute of Computing Technology
2019-2022

Xi'an Polytechnic University
2020-2021

Hebei GEO University
2007-2020

State Key Laboratory of Polymer Physics and Chemistry
2014-2016

Changchun Institute of Applied Chemistry
2014-2016

Hong Kong Baptist University
2016

Vehicle Re-Identification is to find images of the same vehicle from various views in the cross-camera scenario. The main challenges of this task are the large intra-instance distance caused by different views and the subtle inter-instance discrepancy caused by similar vehicles. In this paper, we propose a parsing-based view-aware embedding network (PVEN) to achieve view-aware feature alignment and enhancement for vehicle ReID. First, we introduce a parsing network to parse a vehicle into four views, and then align the features by mask average pooling. Such alignment provides a fine-grained representation...

10.1109/cvpr42600.2020.00713 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Weakly supervised referring expression grounding aims at localizing the referential object in an image according to the linguistic query, where the mapping between the object and the query is unknown in the training stage. To address this problem, we propose a novel end-to-end adaptive reconstruction network (ARN). It builds the correspondence between region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction. Specifically, we first extract the subject, location and context features to represent the proposals and the query respectively. Then, we design a module to compute...

10.1109/iccv.2019.00270 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

A novel red heteroleptic iridium complex, Ir(DPA-Flpy-CF₃)₂acac, was synthesized, and its corresponding solution-processed PhOLED shows a record power efficiency of 44.5 lm W⁻¹ with CIE coordinates of (0.64, 0.36).

10.1039/c6tc01270a article EN Journal of Materials Chemistry C 2016-01-01

Weakly supervised Referring Expression Grounding (REG) aims to ground a particular target in an image described by a language expression while lacking the correspondence between the target and the expression. Two main problems exist in weakly supervised REG. First, the lack of region-level annotations introduces ambiguities between proposals and queries. Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects. To address the above challenges, we...

10.1109/tpami.2022.3186410 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-01-01

Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to the linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage. In referring expressions, people usually describe a target entity in terms of its relationship with other contextual entities as well as its visual attributes. However, previous weakly supervised REG methods rarely pay attention to the relationship between the entities. In this paper, we propose a knowledge-guided pairwise reconstruction network...

10.1145/3343031.3351074 article EN Proceedings of the 27th ACM International Conference on Multimedia 2019-10-15

Visual grounding (VG) aims to locate a specific target in an image based on a given language query. The discriminative information from context is important for distinguishing the target from other objects, particularly for targets that have the same category as others. However, most previous methods underestimate such information. Moreover, they are usually designed for the standard scene (without any novel object), which limits their generalization to the open-vocabulary scene. In this paper, we propose a framework with...

10.1109/tpami.2023.3339628 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-12-05

We present the Qwen2-VL Series, an advanced upgrade of the previous Qwen-VL models that redefines the conventional predetermined-resolution approach in visual processing. Qwen2-VL introduces the Naive Dynamic Resolution mechanism, which enables the model to dynamically process images of varying resolutions into different numbers of visual tokens. This approach allows the model to generate more efficient and accurate visual representations, closely aligning with human perceptual processes. The model also integrates Multimodal Rotary Position Embedding (M-RoPE),...

10.48550/arxiv.2409.12191 preprint EN arXiv (Cornell University) 2024-09-18

Organometal halide perovskites (OHPs) are becoming a hot topic in the field of display and lighting. Unlike the strategy used for solar cells, that is, using a several hundred nanometers thick OHP film for fully absorbing light to convert it to electricity, thin-film OHPs (<50 nm) are advantageous for restraining the self-absorption drawback and are thus beneficial for preparing efficient light-emitting diodes (LEDs). Here we manipulate the excess molar ratio of the MABr/PbBr2 precursors and the post-annealing temperature to obtain uniform films and suppress...

10.1021/acs.jpclett.6b02160 article EN The Journal of Physical Chemistry Letters 2016-10-13

Recent advancements in multimodal foundation models (e.g., CLIP) have excelled in zero-shot generalization. Prompt tuning, which is involved in the knowledge transfer from foundation models to downstream tasks, has gained significant attention recently. Existing prompt-tuning methods in cross-modal learning, however, either solely focus on the language branch, or learn vision-language interaction in a shallow mechanism. In this context, we propose a Deeply coupled Cross-modal Prompt learning (DCP) method based on CLIP. DCP flexibly accommodates...

10.18653/v1/2023.findings-acl.504 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

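The knitting needle cylinder is one of the core parts of a hosiery machine. The operation of its knitting needles can directly affect the production quality and efficiency of the machine. To reduce the production loss of the hosiery machine caused by knitting needle faults, a knitting needle fault detection system for hosiery machines based on a synergistic combination of laser detection and machine vision is proposed in this paper. When the hosiery machine was operating normally, the photoelectric detector collected the laser signal reflected by the knitting needles, and the needles were monitored using the ratio of adjacent peak-to-peak distances of the signals. When a fault was detected, the machine stopped immediately, and the charge-coupled device...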

10.1177/0040517520935210 article EN Textile Research Journal 2020-06-28

The ant colony optimization (ACO) algorithm is a metaheuristic and stochastic search technology, which has been one of the effective tools for solving discrete optimization problems. However, there are two bottlenecks for large-scale optimization problems: the ACO algorithm needs too much time to converge, and the solutions may not be really optimal. This paper proposes a novel ACO algorithm for the multidimensional knapsack problems (MKP), which employs a new pheromone diffusion model and a mutation scheme. First, in light of the preference for better solutions, the association distances among...

10.1109/iat.2007.26 article EN 2007-11-01

Problems with knitting needles are one of the main causes of production loss of fabric. In order to detect knitting needle problems quickly and accurately, this paper proposes a hosiery machine knitting needle detection system based on machine vision. Meanwhile, according to the working condition of real knitting needles, a simulated knitting needle cylinder rotary platform is built. The system can detect knitting needle problems, so that the issue is identified as soon as a fabric defect begins to appear. The losses caused by knitting needle bending and fracture are reduced at the source. In image processing, the vertical projection algorithm...

10.1177/0040517519899173 article EN Textile Research Journal 2020-01-20

The trade-off between charge transport and energy transfer is realized by manipulating the aggregation of the dendrimer host H2 with a binary solvent mixture, along with a 25% device efficiency enhancement for the FIrpic-based blue PhOLED.

10.1039/c5tc00625b article EN Journal of Materials Chemistry C 2015-01-01

In this study, we aim to reduce generation latency for Named Entity Recognition (NER) with Large Language Models (LLMs). The main cause of high latency in LLMs is the sequential decoding process, which autoregressively generates all labels and mentions for NER, significantly increasing the sequence length. To this end, we introduce Parallel Decoding in LLM for NE (PaDeLLM-NER), an approach that integrates seamlessly into existing generative model frameworks without necessitating additional modules or architectural...

10.48550/arxiv.2402.04838 preprint EN arXiv (Cornell University) 2024-02-07

Vehicle Re-Identification is to find the same vehicle from images captured in different views under cross-camera scenarios. Traditional methods focus on depicting the holistic appearance of a vehicle, but they suffer from hard samples with the same type and color. Recent works leverage discriminative visual cues to solve this problem, where three challenges exist as follows. First, vehicle features are misaligned and distorted because of viewpoint variance. Second, the discriminative visual cues are usually subtle, which are easy to be diluted by the large area...

10.1109/tmm.2022.3154102 article EN IEEE Transactions on Multimedia 2022-03-01

Vehicle Re-Identification is to find images of the same vehicle from various views in the cross-camera scenario. The main challenges of this task are the large intra-instance distance caused by different views and the subtle inter-instance discrepancy caused by similar vehicles. In this paper, we propose a parsing-based view-aware embedding network (PVEN) to achieve view-aware feature alignment and enhancement for vehicle ReID. First, we introduce a parsing network to parse a vehicle into four views, and then align the features by mask average pooling. Such alignment provides a fine-grained...

10.48550/arxiv.2004.05021 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Referring Expression Grounding (REG) aims at localizing a particular object in an image according to a language expression. Recent REG methods have achieved promising performance, but most of them are constrained to limited object categories due to the scale of current REG datasets. In this paper, we explore REG in a new scenario, where the REG model can ground novel objects out of the REG training data. With this motivation, we propose a Concept-Context Disentangled network (CCD) which transfers concepts from auxiliary classification data with...

10.1145/3394171.3413677 article EN Proceedings of the 28th ACM International Conference on Multimedia 2020-10-12

Despite the notable advancements achieved by leveraging pre-trained vision-language (VL) models through few-shot tuning for downstream tasks, our detailed empirical study highlights a significant dependence of few-shot learning outcomes on the careful selection of training examples - a facet that has been previously overlooked in research. In this study, we delve into devising more effective strategies for the meticulous selection of few-shot training examples, as opposed to relying on random sampling, to enhance the potential of existing few-shot prompt learning methodologies....

10.48550/arxiv.2405.13532 preprint EN arXiv (Cornell University) 2024-05-22

This paper introduces SynthDoc, a novel synthetic document generation pipeline designed to enhance Visual Document Understanding (VDU) by generating high-quality, diverse datasets that include text, images, tables, and charts. Addressing the challenges of data acquisition and the limitations of existing datasets, SynthDoc leverages publicly available corpora and advanced rendering tools to create a comprehensive and versatile dataset. Our experiments, conducted using the Donut model, demonstrate that models trained with...

10.48550/arxiv.2408.14764 preprint EN arXiv (Cornell University) 2024-08-26