Yu Qiao

ORCID: 0009-0002-0192-059X
Research Areas
  • Face and Expression Recognition
  • Face recognition and analysis
  • Advanced Image and Video Retrieval Techniques
  • Advanced Vision and Imaging
  • Traffic Prediction and Management Techniques
  • Natural Language Processing Techniques
  • Anomaly Detection Techniques and Applications
  • Topic Modeling
  • Fire Detection and Safety Systems
  • Human Pose and Action Recognition
  • Video Surveillance and Tracking Methods
  • Glaucoma and retinal disorders
  • Ocular Surface and Contact Lens
  • EEG and Brain-Computer Interfaces
  • Image and Video Quality Assessment
  • Gait Recognition and Analysis
  • Video Coding and Compression Technologies
  • Brain Tumor Detection and Classification
  • Multimodal Machine Learning Applications

Tiangong University
2023

Shenzhen Institutes of Advanced Technology
2020-2021

Recent advancements have established Diffusion Transformers (DiTs) as a dominant framework in generative modeling. Building on this success, Lumina-Next achieves exceptional performance in the generation of photorealistic images with Next-DiT. However, its potential for video generation remains largely untapped, with significant challenges in modeling the spatiotemporal complexity inherent to video data. To address this, we introduce Lumina-Video, a framework that leverages the strengths of Next-DiT while introducing tailored solutions...

10.48550/arxiv.2502.06782 preprint EN arXiv (Cornell University) 2025-02-10

The rapid advance of Large Language Models (LLMs) has catalyzed the development of Vision-Language Models (VLMs). Monolithic VLMs, which avoid modality-specific encoders, offer a promising alternative to compositional ones but face the challenge of inferior performance. Most existing monolithic VLMs require tuning pre-trained LLMs to acquire vision abilities, which may degrade their language capabilities. To address this dilemma, this paper presents a novel high-performance VLM named HoVLE. We note that LLMs have been shown...

10.48550/arxiv.2412.16158 preprint EN arXiv (Cornell University) 2024-12-20

The NerveStitcher demonstrated that a set of in vivo confocal microscopy (IVCM) images can be merged under the framework of a graph convolutional neural network. However, the high similarity of nerve structures in IVCM images results in mis-stitching when images in the internal sequence happen to be dropped. In particular, a large gap between a pair of adjacent images sometimes cannot be detected. In this paper, we advance the concept of global optical flow and integrate it into the existing framework. The improvements in algorithm robustness are caused by...

10.1145/3634875.3634882 article EN 2023-10-20