Qiming Zhang

ORCID: 0000-0003-0060-0543
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Neural Networks Stability and Synchronization
  • Neural Networks and Reservoir Computing
  • Nonlinear Differential Equations Analysis
  • Neural Networks and Applications
  • Human Pose and Action Recognition
  • Photonic and Optical Devices
  • Visual Attention and Saliency Detection
  • Optical Network Technologies
  • Advanced Memory and Neural Computing
  • Multimodal Machine Learning Applications
  • Stochastic Dynamics and Bifurcation
  • Video Surveillance and Tracking Methods
  • Higher Education and Teaching Methods
  • Anomaly Detection Techniques and Applications
  • Infrared Target Detection Methodologies
  • Advanced Image and Video Retrieval Techniques
  • Stability and Controllability of Differential Equations
  • Remote-Sensing Image Classification
  • Target Tracking and Data Fusion in Sensor Networks
  • Graph Labeling and Dimension Problems
  • Spectral Theory in Mathematical Physics
  • Nonlinear Partial Differential Equations
  • Adversarial Robustness in Machine Learning

The University of Sydney
2017-2025

Beihang University
2019-2025

Tencent (China)
2025

University of Shanghai for Science and Technology
2021-2024

Harbin Institute of Technology
2022-2024

Qingdao University of Science and Technology
2023-2024

Xi'an Jiaotong University
2022-2023

Shanghai University
2023

Columbia University
2023

Guilin University of Technology
2022

Abstract The growing demands of brain science and artificial intelligence create an urgent need for the development of artificial neural networks (ANNs) that can mimic the structural and functional features of biological human brain networks. Nanophotonics, which is the study of the behaviour of light and light–matter interaction at the nanometre scale, has unveiled new phenomena and led to applications beyond the diffraction limit of light. These emerging nanophotonic devices have enabled scientists to develop paradigm shifts in research into ANNs. In...

10.1038/s41377-019-0151-0 article EN cc-by Light Science & Applications 2019-05-08

Although no specific domain knowledge is considered in the design, plain vision transformers have shown excellent performance in visual recognition tasks. However, little effort has been made to reveal the potential of such simple structures for pose estimation. In this paper, we show the surprisingly good capabilities of plain vision transformers from various aspects, namely simplicity of model structure, scalability of model size, flexibility of training paradigm, and transferability of knowledge between models, through a simple baseline called ViTPose....

10.48550/arxiv.2204.12484 preprint EN other-oa arXiv (Cornell University) 2022-01-01
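The ViTPose recipe described above (a plain, non-hierarchical transformer encoder followed by a lightweight decoder over keypoint heatmaps) can be illustrated with a toy NumPy forward pass. This is an illustrative sketch under simplifying assumptions, not the paper's implementation: a single attention block stands in for the full encoder, a linear projection stands in for the decoder, and all names and shapes are made up for the example.

```python
import numpy as np

def attention(x, Wq, Wk, Wv):
    # single-head self-attention over a token sequence x of shape (N, D)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def vitpose_sketch(img, num_keypoints=17, patch=16, dim=32, rng=None):
    """Toy ViTPose-style forward pass: patchify -> one plain transformer
    block -> project tokens to per-keypoint heatmaps on the patch grid."""
    rng = rng or np.random.default_rng(0)
    H, W, C = img.shape
    gh, gw = H // patch, W // patch
    # patch embedding: cut the image into (patch x patch) tiles and flatten
    patches = img.reshape(gh, patch, gw, patch, C).transpose(0, 2, 1, 3, 4)
    tokens = patches.reshape(gh * gw, patch * patch * C)
    We = rng.standard_normal((tokens.shape[1], dim)) * 0.02
    x = tokens @ We
    # one non-hierarchical attention block with a residual connection
    Wq, Wk, Wv = (rng.standard_normal((dim, dim)) * 0.02 for _ in range(3))
    x = x + attention(x, Wq, Wk, Wv)
    # "lightweight decoder": a linear head to num_keypoints heatmap channels
    Wh = rng.standard_normal((dim, num_keypoints)) * 0.02
    return (x @ Wh).reshape(gh, gw, num_keypoints)
```

For a 64x64 RGB input and 16-pixel patches this yields a 4x4 grid of 17 keypoint heatmap channels, mirroring the coarse heatmap output that a real decoder would then upsample.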

Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers (ViTs) being the primary choice due to their good scalability and representation ability. However, large-scale models for remote sensing (RS) tasks have not yet been sufficiently explored. In this article, we resort to plain ViTs with about 100 million parameters and make the first attempt to propose large vision models tailored to RS tasks and investigate how such large models perform. To handle the large image sizes and objects of arbitrary orientations, a new rotated...

10.1109/tgrs.2022.3222818 article EN IEEE Transactions on Geoscience and Remote Sensing 2022-11-21

Transformers have shown great potential in various computer vision tasks owing to their strong capability of modeling long-range dependency using the self-attention mechanism. Nevertheless, vision transformers treat an image as a 1D sequence of visual tokens, lacking an intrinsic inductive bias (IB) in modeling local visual structures and dealing with scale variance. Alternatively, they require large-scale training data and longer training schedules to learn the IB implicitly. In this paper, we propose a novel Vision Transformer Advanced by...

10.48550/arxiv.2106.03348 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Optical machine learning has emerged as an important research area that, by leveraging the advantages inherent to optical signals, such as parallelism and high speed, paves the way for a future where optical hardware can process data at the speed of light. In this work, we present such devices for data processing in the form of single-layer nanoscale holographic perceptrons trained to perform optical inference tasks. We experimentally show the functionality of these passive devices in the example of decryptors that authenticate single keys or whole classes of keys through symmetric...

10.1038/s41377-021-00483-z article EN cc-by Light Science & Applications 2021-03-03

Unsupervised domain adaptation (UDA) aims to enhance the generalization capability of a certain model from a source domain to a target domain. UDA is of particular significance since no extra effort is devoted to annotating target domain samples. However, the different data distributions in the two domains, or \emph{domain shift/discrepancy}, inevitably compromise the UDA performance. Although there has been progress in matching the marginal distributions between the two domains, the classifier favors source domain features and makes incorrect predictions on the target domain due to category-agnostic feature alignment....

10.48550/arxiv.1910.13049 preprint EN other-oa arXiv (Cornell University) 2019-01-01
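A common building block for the marginal-distribution matching mentioned above is a discrepancy measure between source and target feature batches. The sketch below uses the simplest such measure, a linear-kernel squared MMD (the distance between feature means), purely for context; the paper's actual alignment objective differs, and the function name is illustrative.

```python
import numpy as np

def mmd2(source, target):
    """Squared maximum mean discrepancy with a linear kernel:
    the squared Euclidean distance between the mean feature vectors
    of the source and target batches. Zero iff the means coincide."""
    return float(np.sum((source.mean(axis=0) - target.mean(axis=0)) ** 2))
```

Minimizing such a term (added to the source classification loss) pulls the two marginal feature distributions together; the point made in the abstract is that this alone is category-agnostic, so class-conditional structure can still be misaligned.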

Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution image from a low-resolution observation. However, the prevailing CNN-based approaches have shown limitations in building long-range dependencies and capturing the interaction information between spectral features. This results in inadequate utilization of spectral information and artifacts after upsampling. To address this issue, we propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative...

10.1109/iccv51070.2023.02109 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline dubbed ViTPose. ViTPose employs a plain and non-hierarchical vision transformer as an encoder to encode features and a lightweight decoder to decode keypoints in either a top-down or a bottom-up manner. It can be scaled up to 1B parameters by...

10.1109/tpami.2023.3330016 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-11-03

Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and smaller memory footprint. However, the design of hand-crafted windows, which is data-agnostic, constrains the flexibility of transformers to adapt to objects of varying sizes, shapes, and orientations. To address this issue, we propose a novel quadrangle attention (QA) method that extends window-based attention to a general quadrangle formulation. Our method employs an end-to-end learnable quadrangle regression module that predicts a transformation...

10.1109/tpami.2023.3347693 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-08

Human parsing, or human body part semantic segmentation, has been an active research topic due to its wide potential applications. In this paper, we propose a novel GRAph PYramid Mutual Learning (Grapy-ML) method to address the cross-dataset human parsing problem, where the annotations are at different granularities. Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently. At each level, the GPM...

10.1609/aaai.v34i07.6728 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability. However, large-scale models for remote sensing (RS) tasks have not yet been sufficiently explored. In this paper, we resort to plain vision transformers with about 100 million parameters and make the first attempt to propose large vision models tailored to RS tasks and investigate how such large models perform. To handle the large image sizes and objects of arbitrary orientations, a new rotated varied-size...

10.48550/arxiv.2208.03987 preprint EN other-oa arXiv (Cornell University) 2022-01-01

In this paper, we propose a novel progressive parameter pruning method for Convolutional Neural Network acceleration, named Structured Probabilistic Pruning (SPP), which effectively prunes weights of convolutional layers in a probabilistic manner. Unlike existing deterministic pruning approaches, where unimportant weights are permanently eliminated, SPP introduces a pruning probability for each weight, and pruning is guided by sampling from these probabilities. A mechanism is designed to increase and decrease the pruning probabilities based on...

10.48550/arxiv.1709.06994 preprint EN other-oa arXiv (Cornell University) 2017-01-01
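The core idea of probability-guided pruning can be sketched as follows: each weight carries a pruning probability that is nudged up when the weight has small magnitude and down when it is large, and the binary mask is then sampled from these probabilities rather than fixed once. This is a minimal illustrative toy, not the paper's SPP algorithm; the median-rank update rule, the step size, and all names are assumptions made for the example.

```python
import numpy as np

def spp_step(weights, probs, delta=0.1, rng=None):
    """One sketch iteration of probability-guided pruning:
    raise the pruning probability of the smaller-magnitude half of the
    weights, lower it for the larger half, then sample a keep-mask."""
    rng = rng or np.random.default_rng(0)
    order = np.argsort(np.abs(weights))            # indices, small -> large
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(weights))         # magnitude rank per weight
    probs = np.clip(
        probs + np.where(ranks < len(weights) // 2, delta, -delta), 0.0, 1.0)
    mask = rng.random(len(weights)) >= probs       # keep with prob 1 - p
    return weights * mask, probs
```

Because pruning is sampled rather than permanent, a weight zeroed in one iteration can reappear in the next, which is the property that distinguishes this family from deterministic one-shot elimination.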

The rapid development of artificial intelligence has stimulated interest in novel designs of photonic neural networks. As three-dimensional (3D) photonic networks, diffractive neural networks (DNNs), relying on the diffractive phenomena of light, have demonstrated superb performance in the direct parallel processing of two-dimensional (2D) optical data at the speed of light. Despite these outstanding achievements, DNNs utilize centimeter-scale devices to generate the input passively, making miniaturization and on-chip integration a challenging task....

10.1515/nanoph-2022-0437 article EN cc-by Nanophotonics 2023-01-12

The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Tracking, (iii) USV-based Obstacle Segmentation, and (iv) USV-based Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings...

10.1109/wacvw58289.2023.00033 article EN 2023-01-01

In this paper, shunting inhibitory cellular neural networks (SICNNs) with neutral type delays and time-varying leakage delays are investigated. By applying the Lyapunov functional method and differential inequality techniques, a set of sufficient conditions is obtained for the existence and exponential stability of pseudo almost periodic solutions of the model. An example is given to support the theoretical findings. Our results improve and generalize those of previous studies.

10.3109/0954898x.2014.978406 article EN Network Computation in Neural Systems 2014-11-11
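For context, shunting inhibitory cellular neural networks of the kind studied in this line of work are typically written in a form like the following; the notation here is illustrative and the exact delay structure varies from paper to paper:

```latex
\dot{x}_{ij}(t) = -a_{ij}\, x_{ij}\big(t-\eta_{ij}(t)\big)
  - \sum_{C_{kl}\in N_r(i,j)} C_{ij}^{kl}\,
    f\big(x_{kl}(t-\tau_{kl}(t))\big)\, x_{ij}(t)
  + L_{ij}(t)
```

Here $x_{ij}$ is the activity of cell $(i,j)$, $a_{ij}$ the passive decay rate, $\eta_{ij}(t)$ the time-varying leakage delay, $C_{ij}^{kl}$ the shunting coupling from neighboring cells in $N_r(i,j)$, $f$ the activation function, $\tau_{kl}(t)$ the transmission delays, and $L_{ij}(t)$ the external input. The Lyapunov-functional approach then bounds solutions of this system to establish existence and exponential stability.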

In this letter, a class of Cohen-Grossberg shunting inhibitory cellular neural networks with time-varying delays and impulses is investigated. Sufficient conditions for the existence and exponential stability of anti-periodic solutions of such networks are established. Our results are new and complementary to previously known results. An example is given to illustrate the feasibility and effectiveness of our main results.

10.1162/neco_a_00642 article EN Neural Computation 2014-07-24

Multi-view camera-based 3D object detection has become popular due to its low cost, but accurately inferring 3D geometry solely from camera data remains challenging and may lead to inferior performance. Although distilling precise LiDAR knowledge could help tackle this challenge, the benefits of LiDAR information could be greatly hindered by the significant modality gap between the different sensory modalities. To address this issue, we propose a Simulated multi-modal Distillation (SimDistill) method by carefully crafting...

10.1609/aaai.v38i7.28577 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24
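The generic mechanism underlying cross-modal distillation is a soft-label loss between teacher and student predictions. The sketch below shows the standard temperature-scaled KL distillation loss purely for context; SimDistill's actual multi-modal objective is considerably more elaborate, and the function names here are illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-softened softmax along the last axis
    e = np.exp(z / T - np.max(z / T, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label distillation: KL divergence between the softened
    teacher distribution and the softened student distribution,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)   # teacher targets
    q = softmax(student_logits, T)   # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)
```

The loss is non-negative and vanishes exactly when student and teacher produce the same distribution, which is what drives the student (here, a camera-only detector) toward the teacher's (LiDAR-informed) behavior.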

Parameter pruning is a promising approach for CNN compression and acceleration that eliminates redundant model parameters with tolerable performance degradation. Despite its effectiveness, existing regularization-based parameter pruning methods usually drive weights towards zero with large, constant regularization factors, which neglects the fragility of the expressiveness of CNNs and thus calls for a more gentle regularization scheme so that the networks can adapt during pruning. To achieve this, we propose a novel method, named IncReg, to...

10.1109/ijcnn.2019.8852463 preprint EN 2019 International Joint Conference on Neural Networks (IJCNN) 2019-07-01

Modern Convolutional Neural Networks (CNNs) are usually restricted by their massive computation and high storage requirements. Parameter pruning is a promising approach for CNN compression and acceleration that eliminates redundant model parameters with tolerable performance degradation. Despite its effectiveness, existing regularization-based parameter pruning methods drive weights towards zero with large, constant regularization factors, which neglects the fragility of the expressiveness of CNNs and thus calls for a more gentle regularization scheme so...

10.1109/jstsp.2019.2961233 article EN IEEE Journal of Selected Topics in Signal Processing 2019-12-24
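The "gentle" scheme described above can be sketched as a per-weight L2 factor that grows incrementally on the currently least-important weights instead of being set to one large constant. This is an illustrative toy, not the paper's IncReg method; the magnitude-based ranking, the step sizes, and all names are assumptions made for the example.

```python
import numpy as np

def increg_update(w, reg, grad, lr=0.1, reg_step=0.01, prune_frac=0.5):
    """One sketch iteration of incremental regularization: gently raise
    the per-weight L2 factor on the least-important (smallest-magnitude)
    fraction of weights, then take an SGD step with that penalty."""
    order = np.argsort(np.abs(w))                  # small -> large magnitude
    n_weak = int(len(w) * prune_frac)
    reg = reg.copy()
    reg[order[:n_weak]] += reg_step                # penalty grows gradually
    w = w - lr * (grad + reg * w)                  # SGD with per-weight L2
    return w, reg
```

Repeated over many iterations, the weak weights are squeezed toward zero progressively while the important ones keep a zero penalty, so the network can redistribute expressiveness before anything is removed.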

Abstract Artificial intelligence applications in extreme environments place high demands on hardware robustness, power consumption, and speed. Recently, diffractive neural networks have demonstrated superb advantages in high-throughput, light-speed reasoning. However, the robustness and lifetime of existing diffractive neural networks cannot be guaranteed, severely limiting their compactness and long-term inference accuracy. Here, we developed a millimeter-scale, robust bilayer-integrated diffractive neural network chip with a virtually unlimited lifetime for...

10.1038/s44172-024-00211-6 article EN cc-by Communications Engineering 2024-05-01