Qifan Yu

ORCID: 0000-0003-0029-5622
Research Areas
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Video Analysis and Summarization
  • Advanced Malware Detection Techniques
  • Advanced Image and Video Retrieval Techniques
  • Natural Language Processing Techniques
  • Security and Verification in Computing
  • Psychedelics and Drug Studies
  • Topic Modeling
  • Speech and dialogue systems
  • Visual Attention and Saliency Detection
  • Video Surveillance and Tracking Methods
  • Cinema and Media Studies
  • Innovative Teaching and Learning Methods
  • Data Visualization and Analytics
  • Image and Video Quality Assessment
  • Blockchain Technology Applications and Security
  • Educational Technology and Assessment
  • Cryptography and Data Security
  • Generative Adversarial Networks and Image Synthesis
  • Human Motion and Animation
  • Adversarial Robustness in Machine Learning

Zhejiang University
2023-2024

Hohai University
2022-2023

Scene Graph Generation (SGG) aims to extract <subject, predicate, object> relationships in images for vision understanding. Although recent works have made steady progress on SGG, they still suffer from long-tail distribution issues: tail predicates are more costly to train and harder to distinguish due to a small amount of annotated data compared to frequent predicates. Existing re-balancing strategies try to handle it via prior rules but are confined to pre-defined conditions, which are not scalable to various models...

10.1109/iccv51070.2023.01971 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

The rising demand for creating lifelike avatars in the digital realm has led to an increased need for generating high-quality human videos guided by textual descriptions and poses. We propose Dancing Avatar, designed to fabricate human motion videos driven by poses and textual cues. Our approach employs a pretrained T2I diffusion model to generate each video frame in an autoregressive fashion. The crux of the innovation lies in our adept utilization of the T2I diffusion model for producing video frames successively while preserving contextual relevance. We surmount the hurdles posed...

10.48550/arxiv.2308.07749 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images. In parallel, the problem of data scarcity has brought a growing interest in employing AIGC technology for high-quality data expansion. However, this paradigm requires well-designed prompt engineering, and cost-less data expansion and labeling remain under-explored. Inspired by LLMs' powerful capability in task guidance, we propose a new paradigm of annotated data expansion named ChatGenImage. The core idea behind...

10.48550/arxiv.2305.12799 preprint EN other-oa arXiv (Cornell University) 2023-01-01

With the rising prominence of smart contracts, security attacks targeting them have increased, posing severe threats to their security and intellectual property rights. Existing simplistic datasets hinder effective vulnerability detection, raising concerns. To address these challenges, we propose BiAn, a source code level smart contract obfuscation method that generates complex test datasets. BiAn protects...

10.1109/tse.2023.3298609 article EN IEEE Transactions on Software Engineering 2023-07-27

Scene Graph Generation (SGG) aims to extract <subject, predicate, object> relationships in images for vision understanding. Although recent works have made steady progress on SGG, they still suffer from long-tail distribution issues: tail predicates are more costly to train and harder to distinguish due to a small amount of annotated data compared to frequent predicates. Existing re-balancing strategies try to handle it via prior rules but are confined to pre-defined conditions, which are not scalable to various models...

10.48550/arxiv.2303.13233 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks. However, the hallucinations inherent in machine-generated data, which could lead to hallucinatory outputs in MLLMs, remain under-explored. This work aims to investigate various hallucinations (i.e., object, relation, attribute hallucinations) and mitigate those hallucinatory toxicities in large-scale machine-generated visual instruction datasets. Drawing on the human ability to identify factual...

10.48550/arxiv.2311.13614 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Video Large Language Models (Video-LLMs) have recently shown strong performance in basic video understanding tasks, such as captioning and coarse-grained question answering, but struggle with compositional reasoning that requires multi-step spatio-temporal inference across object relations, interactions, and events. The hurdles to enhancing this capability include extensive manual labor, the lack of compositionality in existing data, and the absence of explicit supervision. In this paper, we propose STEP, a novel...

10.48550/arxiv.2412.00161 preprint EN arXiv (Cornell University) 2024-11-29

Instruction tuning fine-tunes pre-trained Multi-modal Large Language Models (MLLMs) to handle real-world tasks. However, the rapid expansion of visual instruction datasets introduces data redundancy, leading to excessive computational costs. We propose a collaborative framework, DataTailor, which leverages three key principles--informativeness, uniqueness, and representativeness--for effective data selection. We argue that a valuable sample should be informative of the task, non-redundant, and represent...

10.48550/arxiv.2412.06293 preprint EN arXiv (Cornell University) 2024-12-09

Ethereum smart contracts face serious security problems, which not only cause huge economic losses, but also destroy the credit system. To solve this problem, code obfuscation techniques are applied to improve their complexity and security. However, current source code obfuscation methods have insufficient anti-decompilation ability. Therefore, we propose a novel bytecode obfuscation approach called BOSC, based on four kinds of obfuscation techniques, which is directed at Solidity. The experimental results show that, after obfuscation,...

10.1109/apsec57359.2022.00083 article EN 2022-12-01