Zhu Liu

ORCID: 0000-0003-0975-2711
Research Areas
  • Video Analysis and Summarization
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Advanced Image Fusion Techniques
  • Music and Audio Processing
  • Image Enhancement Techniques
  • Advanced Vision and Imaging
  • Visual Attention and Saliency Detection
  • Image Processing Techniques and Applications
  • Remote-Sensing Image Classification
  • Video Surveillance and Tracking Methods
  • Multimodal Machine Learning Applications
  • Speech and Audio Processing
  • Photoacoustic and Ultrasonic Imaging
  • Advanced Image Processing Techniques
  • Advanced Neural Network Applications
  • Thermography and Photoacoustic Techniques
  • Digital Media Forensic Detection
  • Web Data Mining and Analysis
  • Industrial Vision Systems and Defect Detection
  • Data Management and Algorithms
  • Network Security and Intrusion Detection
  • Multimedia Communication and Technology
  • Adversarial Robustness in Machine Learning
  • Infrared Target Detection Methodologies

Dalian University of Technology
2018-2024

State Grid Corporation of China (China)
2017-2022

Alibaba Group (Cayman Islands)
2022

Beijing University of Posts and Telecommunications
2021

Yanshan University
2020

Taizhou University
2020

Alibaba Group (United States)
2019

Bellevue Hospital Center
2019

AT&T (United States)
2006-2018

China Ocean Shipping (China)
2016

Multi-modality image fusion and segmentation play a vital role in autonomous driving and robotic operation. Early efforts focus on boosting the performance for only one task, e.g., fusion or segmentation, making it hard to reach the 'Best of Both Worlds'. To overcome this issue, in this paper, we propose a Multi-interactive Feature learning architecture for image fusion and Segmentation, namely SegMiF, and exploit dual-task correlation to promote the performance of both tasks. The SegMiF is of a cascade structure, containing a fusion sub-network and a commonly used segmentation sub-network. By...

10.1109/iccv51070.2023.00745 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

10.1023/a:1008066223044 article EN The Journal of VLSI Signal Processing Systems for Signal Image and Video Technology 1998-01-01

Recently, multi-modality scene perception tasks, e.g., image fusion and scene understanding, have attracted widespread attention for intelligent vision systems. However, early efforts always consider boosting a single task unilaterally and neglecting others, seldom investigating their underlying connections for joint promotion. To overcome these limitations, we establish the hierarchical dual tasks-driven deep model to bridge these tasks. Concretely, we firstly construct an image fusion module to fuse complementary...

10.24963/ijcai.2023/138 article EN 2023-08-01

Infrared-visible image fusion (IVIF) is a fundamental and critical task in the field of computer vision. Its aim is to integrate the unique characteristics of both infrared and visible spectra into a holistic representation. Since 2018, a growing number and diversity of IVIF approaches have stepped into the deep-learning era, introducing a broad spectrum of networks and loss functions for improving visual enhancement. As research deepens and practical demands grow, several intricate issues like data compatibility, perception...

10.1109/tpami.2024.3521416 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-12-23

Infrared and visible image fusion is a powerful technique that combines complementary information from different modalities for downstream semantic perception tasks. Existing learning-based methods show remarkable performance, but suffer from an inherent vulnerability to adversarial attacks, causing a significant decrease in accuracy. In this work, a perception-aware fusion framework is proposed to promote segmentation robustness in adversarial scenes. We first conduct systematic analyses about the components of image fusion,...

10.1145/3581783.3611928 article EN 2023-10-26

10.1016/j.jvcir.2020.102851 article EN Journal of Visual Communication and Image Representation 2020-07-20

Multi-modality image fusion refers to generating a complementary image that integrates typical characteristics from source images. In recent years, we have witnessed the remarkable progress of deep learning models for multi-modality fusion. Existing CNN-based approaches strain every nerve to design various architectures for realizing these tasks in an end-to-end manner. However, these handcrafted designs are unable to cope with the highly demanding fusion tasks, resulting in blurred targets and lost textural details. To...

10.1145/3474085.3475299 article EN Proceedings of the 30th ACM International Conference on Multimedia 2021-10-17

A video sequence usually consists of separate scenes, and each scene includes many shots. For video understanding purposes, it is most important to detect scene breaks. To analyze the content of each scene, detection of shot breaks is also required. Usually, a scene break is associated with a simultaneous change of image, motion, and audio characteristics, while a shot break is only accompanied by changes in image or motion or both. We propose to use audio information along with image and motion information to accomplish video segmentation at different levels. Promising results have been obtained with videos...

10.1109/icip.1998.727252 article EN 2002-11-27

Infrared-visible image fusion (IVIF) is a critical task in computer vision, aimed at integrating the unique features of both infrared and visible spectra into a unified representation. Since 2018, the field has entered the deep learning era, with an increasing variety of approaches introducing a range of networks and loss functions to enhance visual performance. However, challenges such as data compatibility, perception accuracy, and efficiency remain. Unfortunately, there is a lack of recent comprehensive surveys that...

10.48550/arxiv.2501.10761 preprint EN arXiv (Cornell University) 2025-01-18

Scene classification and segmentation are fundamental steps for efficient accessing, retrieving and browsing of large amounts of video data. We have developed a scene classification scheme using a Hidden Markov Model (HMM)-based classifier. By utilizing the temporal behaviors of different scene classes, the HMM classifier can effectively classify presegmented clips into one of the predefined scene classes. In this paper, we describe three approaches for joint classification and segmentation based on HMM, which search for the most likely class transition path by using the dynamic programming...

10.1109/tmm.2005.843346 article EN IEEE Transactions on Multimedia 2005-05-17

Video copy detection techniques are essential for a number of applications including discovering copyright infringement of multimedia content, monitoring commercial air time, and querying videos by example. Over the last decade, video copy detection has received rapidly growing attention from the research community. To encourage more innovative technology and benchmark the state of the art approaches in this field, the TRECVID conference series, sponsored by NIST, initiated an evaluation task on content based video copy detection in 2008. In this paper, we...

10.1145/1743384.1743409 article EN 2010-03-29

In recent years, learning-based methods have achieved significant advancements in multi-exposure image fusion. However, two major stumbling blocks hinder the development, including pixel misalignment and inefficient inference. Reliance on aligned image pairs in existing methods causes susceptibility to artifacts due to device motion. Additionally, existing techniques often rely on handcrafted architectures with huge network engineering, resulting in redundant parameters, adversely impacting inference efficiency and flexibility....

10.1109/tcsvt.2024.3351933 article EN IEEE Transactions on Circuits and Systems for Video Technology 2024-01-09

Video deraining is an important issue for outdoor vision systems and has been investigated extensively. However, designing optimal architectures by aggregating the model formation and data distribution is a challenging task for video deraining. In this paper, we develop a model-guided triple-level optimization framework to deduce network architecture with a cooperating auto-searching mechanism, named Triple-level Model Inferred Cooperating Searching (TMICS), for dealing with various rain circumstances. In particular,...

10.1109/tip.2021.3128327 article EN IEEE Transactions on Image Processing 2021-11-30

The proposed shot boundary determination (SBD) algorithm contains a set of finite state machine (FSM) based detectors for pure cut, fast dissolve, fade in, fade out, dissolve, and wipe. Support vector machines (SVM) are applied to the cut and dissolve detectors to further boost performance. Our SBD system was highly effective when evaluated in TRECVID 2006 (TREC video retrieval evaluation) and its performance was ranked highest overall.

10.1109/icme.2007.4284943 article EN 2007-07-01
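
The idea of an FSM-based cut detector in the entry above can be illustrated with a minimal Python sketch. This is not the authors' detector: the three states, the `high` and `low` thresholds, and the assumption that the input is a list of per-frame dissimilarity scores in [0, 1] are all hypothetical choices made for illustration. The sketch declares a pure cut only when a single large frame-to-frame change is immediately followed by a quiet frame pair, and rejects sustained changes as something other than a pure cut.

```python
from enum import Enum, auto


class State(Enum):
    STABLE = auto()     # consecutive frames look alike
    CANDIDATE = auto()  # one large change was just seen
    UNSTABLE = auto()   # the change persisted, so it is not a pure cut


def detect_cuts(diffs, high=0.5, low=0.2):
    """Return indices i where a cut is declared between frame i and i + 1.

    diffs[i] is an assumed dissimilarity score between frame i and
    frame i + 1. The thresholds are hypothetical, not published values.
    """
    cuts = []
    state = State.STABLE
    candidate = None
    for i, d in enumerate(diffs):
        if state is State.STABLE:
            if d >= high:
                state = State.CANDIDATE
                candidate = i
        elif state is State.CANDIDATE:
            if d <= low:
                # An isolated spike followed by calm: confirm the cut.
                cuts.append(candidate)
                state = State.STABLE
            else:
                # The change continues, e.g. motion or a gradual transition.
                state = State.UNSTABLE
        else:  # State.UNSTABLE
            if d <= low:
                state = State.STABLE
    return cuts


if __name__ == "__main__":
    scores = [0.05, 0.1, 0.9, 0.05, 0.1, 0.6, 0.7, 0.6, 0.1, 0.05]
    print(detect_cuts(scores))
```

In this sketch a spike on the very last frame pair is never confirmed, because confirmation needs one following score; a real system would also need separate machines for the gradual transition types.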

Infrared and visible image fusion plays a vital role in the field of computer vision. Previous approaches make efforts to design various fusion rules in the loss functions. However, these experimentally designed fusion rules make the methods more and more complex. Besides, most of them only focus on boosting the visual effects, thus showing unsatisfactory performance for the follow-up high-level vision tasks. To address these challenges, in this letter, we develop a semantic-level fusion network to sufficiently utilize the semantic guidance, emancipating the experimentally designed fusion rules. In...

10.1109/lsp.2023.3266980 article EN IEEE Signal Processing Letters 2023-01-01

With the rapid development of smart grids, the number of various types of power IoT terminal devices has grown by leaps and bounds. An attack on either a difficult-to-protect end device or any node in a large and complex network can put the grid at risk. The traffic generated by Distributed Denial of Service (DDoS) attacks is characterised by short bursts of time, making it difficult to apply existing centralised detection methods that rely on manual setting of attack characteristics to changing scenarios. In this paper, a DDoS attack detection model based...

10.3390/en15217882 article EN cc-by Energies 2022-10-24

This paper addresses the problem of recovering the semantic structure of broadcast news. A hierarchy of retrievable units is automatically constructed by integrating information from different media. The hierarchy provides a compact, yet meaningful, abstraction of the broadcast news data, similar to a conventional table of content that can serve as an effective index table, facilitating the capability of browsing through large amounts of data in a nonlinear fashion. The recovery of the semantic structure further enables automated solutions in constructing visual...

10.1117/12.333880 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 1998-12-17

Video classification and segmentation are fundamental steps for efficient accessing, retrieval and browsing of large amounts of video data. We have developed a scene classification scheme using a hidden Markov model (HMM) based classifier. By utilizing the temporal behaviors of different scene classes, the HMM classifier can effectively classify video segments into one of the pre-defined scene classes. In this paper, we describe two approaches for joint classification and segmentation based on HMM, which work by searching for the most likely class transition path using the dynamic programming technique.

10.1109/icme.2000.871064 article EN 2002-11-07
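
The search for the most likely class transition path by dynamic programming, mentioned in the entry above, can be sketched as a standard Viterbi recursion. This is a generic illustration, not the paper's two approaches: it assumes each video segment already has a log-likelihood under every scene class (for example from per-class HMMs), and the names `log_lik`, `log_trans`, and `log_prior` are hypothetical. Segment boundaries then fall wherever the decoded class label changes.

```python
import math


def most_likely_class_path(log_lik, log_trans, log_prior):
    """Viterbi decoding over scene classes.

    log_lik[t][k]   : assumed log-likelihood of segment t under class k
    log_trans[j][k] : assumed log-probability of moving from class j to k
    log_prior[k]    : assumed log-probability of starting in class k
    """
    num_classes = len(log_prior)
    score = [log_prior[k] + log_lik[0][k] for k in range(num_classes)]
    back = []  # back[t - 1][k] = best predecessor of class k at segment t
    for t in range(1, len(log_lik)):
        new_score, ptr = [], []
        for k in range(num_classes):
            best_j = max(range(num_classes),
                         key=lambda j: score[j] + log_trans[j][k])
            new_score.append(score[best_j] + log_trans[best_j][k]
                             + log_lik[t][k])
            ptr.append(best_j)
        score = new_score
        back.append(ptr)
    # Trace back from the best final class.
    k = max(range(num_classes), key=lambda j: score[j])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path


def boundaries(path):
    """Indices t where the decoded class differs from segment t - 1."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


if __name__ == "__main__":
    log = math.log
    lik = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.1, 0.9], [0.2, 0.8]]
    log_lik = [[log(p) for p in row] for row in lik]
    log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
    log_prior = [log(0.5), log(0.5)]
    path = most_likely_class_path(log_lik, log_trans, log_prior)
    print(path, boundaries(path))
```

In the toy input, the second segment slightly favors class 1 on its own, but the sticky transition probabilities keep it in class 0, which is the kind of temporal smoothing that joint decoding provides over classifying each segment independently.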

To effectively extract the typical features of the bearing, a new method that combines local mean decomposition Shannon entropy and an improved kernel principal component analysis model was proposed. First, the features are extracted by a time–frequency domain method, local mean decomposition, using the Shannon entropy to process the original separated product functions, so as to get the original features. However, the extracted features still contain superfluous information; a nonlinear multi-feature processing technique, kernel principal component analysis, is introduced to fuse the characters. The weight factor....

10.1177/1687814016661087 article EN cc-by Advances in Mechanical Engineering 2016-08-01

In recent years, there has been a growing interest in combining learnable modules with numerical optimization to solve low-level vision tasks. However, most existing approaches focus on designing specialized schemes to generate image/feature propagation. There is a lack of unified consideration to construct propagative modules, provide theoretical analysis tools, and design effective learning mechanisms. To mitigate the above issues, this paper proposes an optimization-inspired framework to aggregate...

10.1109/tip.2023.3328486 article EN IEEE Transactions on Image Processing 2023-01-01

Major casts, for example, the anchor persons or reporters in news broadcast programs and principal characters in movies, play an important role in video, and their occurrences provide meaningful indices for organizing and presenting video content. This paper describes a new approach for automatically generating a list of major casts in a video sequence based on multiple modalities, specifically, speaker information in the audio track and face information in the video track. The core algorithm is composed of three steps. First, speaker boundaries are detected and speaker segments...

10.1109/tmm.2006.886360 article EN IEEE Transactions on Multimedia 2006-12-19

Multi-modality image fusion and segmentation play a vital role in autonomous driving and robotic operation. Early efforts focus on boosting the performance for only one task, e.g., fusion or segmentation, making it hard to reach the 'Best of Both Worlds'. To overcome this issue, in this paper, we propose a Multi-interactive Feature learning architecture for image fusion and Segmentation, namely SegMiF, and exploit dual-task correlation to promote the performance of both tasks. The SegMiF is of a cascade structure, containing...

10.48550/arxiv.2308.02097 preprint EN other-oa arXiv (Cornell University) 2023-01-01