Weijia Wu

ORCID: 0000-0003-3912-7212
Research Areas
  • Handwritten Text Recognition Techniques
  • Video Analysis and Summarization
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Natural Language Processing Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Vehicle License Plate Recognition
  • Image Processing and 3D Reconstruction
  • Topic Modeling
  • Image Retrieval and Classification Techniques
  • Human Pose and Action Recognition
  • Digital Media Forensic Detection
  • Image and Signal Denoising Methods
  • Spectroscopy Techniques in Biomedical and Chemical Research
  • Spectroscopy and Chemometric Analyses
  • Human Motion and Animation
  • Advanced Chemical Sensor Technologies
  • Lung Cancer Diagnosis and Treatment
  • Face recognition and analysis
  • Advanced Data Compression Techniques
  • Protein Degradation and Inhibitors
  • Anomaly Detection Techniques and Applications
  • Cell Image Analysis Techniques

Zhejiang University
2019-2025

Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital
2025

University of Electronic Science and Technology of China
2025

China University of Geosciences
2024

China Mobile (China)
2024

Sichuan Agricultural University
2022-2023

Second Affiliated Hospital of Zhejiang University
2023

National University of Singapore
2023

Jilin University
2023

Hefei University
2022

Collecting and annotating images with pixel-wise labels is time-consuming and laborious. In contrast, synthetic data can be freely available using a generative model (e.g., DALL-E, Stable Diffusion). In this paper, we show that it is possible to automatically obtain accurate semantic masks of synthetic images generated by the off-the-shelf Stable Diffusion model, which uses only text-image pairs during training. Our approach, termed DiffuMask, exploits the potential of the cross-attention map between text and image, which is natural and seamless to extend...

10.1109/iccv51070.2023.00117 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Rapid nondestructive testing of peanut seed vigor is of great significance in current research. Before seeds are sown, effective screening of high-quality seeds for planting is crucial to improve the quality of crop yield, and seed vitality is one of the important indicators to evaluate seed quality, which can represent the potential ability of seeds to germinate quickly and grow into normal seedlings or plants. Meanwhile, the advantage of nondestructive testing technology is that the seeds themselves will not be damaged. In this study, hyperspectral technology and superoxide dismutase activity were...

10.3389/fpls.2023.1127108 article EN cc-by Frontiers in Plant Science 2023-02-27

Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained significant attention from the community. These models can be easily customized for new concepts using low-rank adaptations (LoRAs). However, the utilization of multiple concept LoRAs to jointly support multiple customized concepts presents a challenge. We refer to this scenario as decentralized multi-concept customization, which involves single-client concept tuning and center-node concept fusion. In this paper, we propose a new framework called Mix-of-Show that...

10.48550/arxiv.2305.18292 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Current deep networks are very data-hungry and benefit from training on large-scale datasets, which are often time-consuming to collect and annotate. By contrast, synthetic data can be generated infinitely using generative models such as DALL-E and diffusion models, with minimal effort and cost. In this paper, we present DatasetDM, a generic dataset generation model that can produce diverse synthetic images and the corresponding high-quality perception annotations (e.g., segmentation masks, depth). Our method builds upon...

10.48550/arxiv.2308.06160 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Weakly supervised object localization (WSOL) remains challenging when learning object localization models from image category labels. Conventional methods that discriminatively train activation models ignore representative yet less discriminative object parts. In this study, we propose a generative prompt model (GenPromp), defining the first generative pipeline to localize less discriminative object parts by formulating WSOL as a conditional image denoising procedure. During training, GenPromp converts image category labels to learnable prompt embeddings which are fed to a generative model to conditionally recover...

10.1109/iccv51070.2023.00584 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Generally, pre-training and long-time training computation are necessary for obtaining a good-performance text detector based on deep networks. In this paper, we present a new scene text detection network (called FANet) with a Fast convergence speed and Accurate text localization. The proposed FANet is an end-to-end text detector based on transformer feature learning and normalized Fourier descriptor modeling, where the Fourier Descriptor Proposal Network and the Iterative Text Decoding Network are designed to efficiently and accurately identify text proposals....

10.1109/icme55011.2023.00035 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

Model binarization can significantly compress model size, reduce energy consumption, and accelerate inference through efficient bit-wise operations. Although binarizing convolutional neural networks has been extensively studied, there is little work on exploring the binarization of vision Transformers, which underpin most recent breakthroughs in visual recognition. To this end, we propose to solve two fundamental challenges to push the horizon of Binary Vision Transformers (BiViT). First, the traditional binary method does not...

10.1109/iccv51070.2023.00520 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Text spotting, a task involving the extraction of textual information from image or video sequences, faces challenges in cross-domain adaption, such as image-to-image and image-to-video generalization. In this paper, we introduce a new method, termed VimTS, which enhances the generalization ability of the model by achieving better synergy among different tasks. Typically, we propose a Prompt Queries Generation Module and a Tasks-aware Adapter to effectively convert the original single-task model into a multi-task model suitable for...

10.1109/tpami.2025.3528950 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2025-01-01

This study aimed to evaluate the cost-effectiveness of ASCT and maintenance therapy strategies for transplant-eligible patients with newly diagnosed multiple myeloma from a Chinese healthcare perspective. A short-run decision tree and a long-run Markov model were created to assess the mean costs and quality-adjusted life-years (QALYs) of ASCT plus maintenance therapy over a lifetime horizon. Utility values were sourced from published literature, while costs were based on a single-center retrospective analysis and national drug bidding data. The strategy two-year...

10.1080/14737167.2025.2461636 article EN Expert Review of Pharmacoeconomics & Outcomes Research 2025-02-04

Dry eye disease (DED) is a multifactorial illness affecting tears and the ocular surface. The neurokinin 1 receptor (NK1R) is a target for controlling T helper 17 (Th17) and regulatory T cell (Treg) imbalances. This work creates a silk fibroin (SF) nanoparticle hydrogel that targets NK1R with CP‐99,994 (CP). Combining CP with SF to generate stable nanoparticles while integrating a flexible material results in a sustained‐release ophthalmic drop formulation (SF@CP@Gel), which provides long‐lasting...

10.1002/advs.202404835 article EN cc-by Advanced Science 2025-02-22

As the number of randomized clinical trials (RCTs) demonstrating the survival benefits of combination therapies in previously treated multiple myeloma (MM) patients increases, it is essential to determine the most cost-effective treatment through robust economic evaluation. This study aims to assess the cost-effectiveness of combination therapies for first/second-relapse MM from the perspective of the Chinese healthcare system. A Markov model was developed to evaluate three therapy groups based on their primary drugs (bortezomib, lenalidomide, and...

10.1186/s13561-025-00611-0 article EN cc-by-nc-nd Health Economics Review 2025-03-15

Question Answering (QA) systems are used to provide proper responses to users' questions automatically. Sentence matching is an essential task in QA systems and is usually reformulated as a Paraphrase Identification (PI) problem. Given a question, the aim of the task is to find the most similar question from a QA knowledge base. In this paper, we propose a Multi-task Sentence Encoding Model (MSEM) for the PI problem, wherein a connected graph is employed to depict the relation between sentences, and a multi-task learning model is applied to address both the sentence matching and sentence intent...

10.1109/ijcnn.2019.8852327 article EN 2019 International Joint Conference on Neural Networks (IJCNN) 2019-07-01

Collecting and annotating images with pixel-wise labels is time-consuming and laborious. In contrast, synthetic data can be freely available using a generative model (e.g., DALL-E, Stable Diffusion). In this paper, we show that it is possible to automatically obtain accurate semantic masks of synthetic images generated by the off-the-shelf Stable Diffusion model, which uses only text-image pairs during training. Our approach, called DiffuMask, exploits the potential of the cross-attention map between text and image, which is natural and seamless to extend...

10.48550/arxiv.2303.11681 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Importance: China, which has one-third of the worldwide smoking population, has a substantial cancer burden, with lung cancer being the leading cause of cancer-related death. The effectiveness of screening for mortality reduction has been confirmed, but the cost-effectiveness of diverse screening modalities remains unclear. Objective: To compare the cost-effectiveness of screening with low-dose computed tomography (LDCT) and a biomarker (micro-RNA signature classifier [MSC]) with that of LDCT alone by interval and cumulative exposure. Design, Setting,...

10.1001/jamanetworkopen.2022.13634 article EN cc-by-nc-nd JAMA Network Open 2022-05-24

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world applications. Post-training quantization (PTQ) of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training. Nonetheless, applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples. Specifically, for each denoising step, quantization noise leads to deviations...

10.48550/arxiv.2305.10657 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Current video text spotting methods can achieve preferable performance, powered with sufficient labeled training data. However, labeling data manually is time-consuming and labor-intensive. To overcome this, using low-cost synthetic data is a promising alternative. This paper introduces a novel video text synthesis technique called FlowText, which utilizes optical flow estimation to synthesize a large amount of text video data at a low cost for training robust video text spotters. Unlike existing methods that focus on image-level synthesis, FlowText...

10.1109/icme55011.2023.00262 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse video generations. Given a set of video clips of the same motion concept, the task of Motion Customization is to adapt existing text-to-video diffusion models to generate videos with this motion. For example, generating a video of a car moving in a prescribed manner under specific camera movements to make a movie, or a video illustrating how a bear would lift weights to inspire creators. Adaptation methods have been developed for customizing appearance like subject or style, yet...

10.48550/arxiv.2310.08465 preprint EN other-oa arXiv (Cornell University) 2023-01-01