Qi Tian

ORCID: 0009-0003-2676-5300
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Human Pose and Action Recognition
  • Video Surveillance and Tracking Methods
  • Advanced Neural Network Applications
  • Video Analysis and Summarization
  • Remote-Sensing Image Classification
  • Face and Expression Recognition
  • Gait Recognition and Analysis
  • Generative Adversarial Networks and Image Synthesis
  • Music and Audio Processing
  • Face Recognition and Analysis
  • Advanced Vision and Imaging
  • Robotics and Sensor-Based Localization
  • Medical Image Segmentation Techniques
  • Anomaly Detection Techniques and Applications
  • Visual Attention and Saliency Detection
  • Image Processing Techniques and Applications
  • Advanced Image Processing Techniques
  • Text and Document Classification Technologies
  • Topic Modeling
  • Machine Learning and Data Classification
  • Speech and Audio Processing

Tianjin Children's Hospital
2023-2025

University of Pittsburgh
2022-2025

First Hospital of Qinhuangdao
2024-2025

Huawei Technologies (China)
2017-2024

Renmin Hospital of Wuhan University
2022-2024

Hohai University
2024

Wuhan University
2022-2024

Shandong Mental Health Center
2024

Shandong University
2024

Jilin University
2022-2024

Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network. To facilitate research towards conquering those issues, this paper contributes a new dataset called MSMT17 with important features: 1) raw videos are taken by a 15-camera network deployed in both indoor and outdoor scenes, 2)...

10.1109/cvpr.2018.00016 preprint EN 2018-06-01

Feature extraction and matching are two crucial components in person Re-Identification (ReID). The large pose deformations and the complex view variations exhibited by the captured images significantly increase the difficulty of learning and matching features from person images. To overcome these difficulties, in this work we propose a Pose-driven Deep Convolutional (PDC) model to learn improved feature extraction and matching models end to end. Our deep architecture explicitly leverages human part cues to alleviate pose variations and learn robust representations from both the global...

10.1109/iccv.2017.427 article EN 2017-10-01

The huge variance of human pose and the misalignment of detected human images significantly increase the difficulty of person Re-Identification (Re-ID). Moreover, efficient Re-ID systems are required to cope with the massive visual data being produced by video surveillance systems. To solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively. GLAD explicitly leverages the local and global cues in the human body to generate a discriminative and robust...

10.1145/3123266.3123279 preprint EN Proceedings of the 25th ACM International Conference on Multimedia 2017-10-19

Microblogs have become a popular media platform for reporting and propagating news. However, fake news spreading on microblogs would severely jeopardize their public credibility. To identify the truthfulness of news on microblogs, images are very crucial content. In this paper, we explore the key role of image content in the task of automatic news verification on microblogs. Existing approaches depend on features extracted mainly from the text of tweets, while image content is often ignored. According to our study, however, images have great influence...

10.1109/tmm.2016.2617078 article EN IEEE Transactions on Multimedia 2016-10-12

Geometric deep learning is increasingly important thanks to the popularity of 3D sensors. Inspired by recent advances in the NLP domain, the self-attention transformer is introduced to consume point clouds. We develop Point Attention Transformers (PATs), using a parameter-efficient Group Shuffle Attention (GSA) to replace the costly Multi-Head Attention. We demonstrate its ability to process size-varying inputs, and prove its permutation equivariance. Besides, prior work uses heuristics that depend on the input data (e.g., Furthest...

10.1109/cvpr.2019.00344 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Emotion recognition is challenging due to the emotional gap between emotions and audio-visual features. Motivated by the powerful feature learning ability of deep neural networks, this paper proposes to bridge the emotional gap using a hybrid deep model, which first produces segment features with Convolutional Neural Networks (CNNs) and 3D-CNN, then fuses them in Deep Belief Networks (DBNs). The proposed method is trained in two stages. First, CNN and 3D-CNN models pre-trained on corresponding large-scale image and video classification tasks are...

10.1109/tcsvt.2017.2719043 article EN IEEE Transactions on Circuits and Systems for Video Technology 2017-06-23

We propose novel dynamic multiscale graph neural networks (DMGNN) to predict 3D skeleton-based human motions. The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning. This multiscale graph is adaptive during training and dynamic across network layers. Based on this graph, we propose a multiscale graph computational unit (MGCU) to extract features at individual scales and fuse features across scales. The entire model is action-category-agnostic and follows an encoder-decoder framework. The encoder consists of a sequence of MGCUs to learn motion features....

10.1109/cvpr42600.2020.00029 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Most existing person re-identification algorithms either extract robust visual features or learn discriminative metrics for person images. However, the underlying manifold on which those images reside is rarely investigated. This raises the problem that the learned metric is not smooth with respect to the local geometry structure of the data manifold. In this paper, we study person re-identification with manifold-based affinity learning, which has not received enough attention in this area. An unconventional manifold-preserving algorithm is proposed, which can 1)...

10.1109/cvpr.2017.358 article EN 2017-07-01

Learning discriminative representations for unseen person images is critical for person Re-Identification (ReID). Most current approaches learn deep representations in classification tasks, which essentially minimize the empirical classification risk on the training set. As shown in our experiments, such representations commonly focus on several body parts discriminative to the training set, rather than the entire human body. Inspired by the structural risk minimization principle in SVM, we revise the traditional deep representation learning procedure to minimize both the empirical classification risk and the representation learning risk. The representation learning risk is evaluated by the proposed part loss,...

10.1109/tip.2019.2891888 article EN IEEE Transactions on Image Processing 2019-01-10

Recently, deep Convolutional Neural Networks (CNNs) have achieved human-level performance in edge detection thanks to their rich and abstract edge representation capacities. However, the high performance of CNN-based edge detection is achieved with a large pretrained CNN backbone, which is memory and energy consuming. In addition, it is surprising that the previous wisdom from traditional edge detectors, such as Canny, Sobel, and LBP, is rarely investigated in the rapidly developing deep learning era. To address these issues, we propose a simple, lightweight yet effective...

10.1109/iccv48922.2021.00507 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Person re-identification (re-ID) has been a popular topic in the computer vision and pattern recognition communities for a decade. Several important milestones in recent years, such as metric-based and deeply-learned re-ID, have promoted this topic. However, most existing re-ID works are designed for closed-world scenarios rather than realistic open-world settings, which limits the practical application of the re-ID technique. On one hand, the performance of the latest re-ID methods has surpassed human-level performance on several commonly used benchmarks...

10.1109/tcsvt.2019.2898940 article EN IEEE Transactions on Circuits and Systems for Video Technology 2019-02-12

This paper proposes the Global-Local Temporal Representation (GLTR) to exploit the multi-scale temporal cues in video sequences for video person Re-Identification (ReID). GLTR is constructed by first modeling the short-term temporal cues among adjacent frames, then capturing the long-term relations among inconsecutive frames. Specifically, the short-term temporal cues are modeled by parallel dilated convolutions with different temporal dilation rates to represent the motion and appearance of the pedestrian. The long-term relations are captured by a temporal self-attention model to alleviate the occlusions and noises...

10.1109/iccv.2019.00406 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
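
The parallel dilated temporal convolutions mentioned in the GLTR abstract above can be illustrated with a 1-D toy example. This is a minimal sketch and not the paper's implementation: the function names `dilated_conv1d` and `parallel_dilated_features`, the zero padding, and the dilation rates are assumptions chosen purely for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D cross-correlation with zero padding.

    Kernel taps are spaced `dilation` steps apart, so a larger dilation
    covers a longer temporal span with the same number of weights.
    """
    center = len(kernel) // 2
    output = []
    for t in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t + (k - center) * dilation
            if 0 <= idx < len(signal):  # positions outside the sequence count as zero
                total += weight * signal[idx]
        output.append(total)
    return output


def parallel_dilated_features(signal, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates and stack the results per frame."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [list(frame) for frame in zip(*branches)]
```

Each output frame then carries one value per dilation rate, which is the sense in which parallel branches capture short-term cues at several temporal scales.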

For a long time we have been combating overfitting in the CNN training process with model regularization, including weight decay, model averaging, data augmentation, etc. In this paper, we present DisturbLabel, an extremely simple algorithm which randomly replaces a part of the labels with incorrect values in each iteration. Although it seems weird to intentionally generate incorrect training labels, we show that DisturbLabel prevents the network training from over-fitting by implicitly averaging over exponentially many networks trained...

10.1109/cvpr.2016.514 preprint EN 2016-06-01
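
The label-replacement step described in the DisturbLabel abstract above can be sketched in a few lines. This is a hedged illustration, not the authors' code: the function name `disturb_labels`, the noise-rate parameter `alpha`, and the choice to resample uniformly over all classes (so a disturbed label can occasionally equal the true one) are assumptions made for this sketch.

```python
import random


def disturb_labels(labels, num_classes, alpha, rng=None):
    """Return a copy of `labels` with some entries randomly replaced.

    Each label is, with probability `alpha`, replaced by a class drawn
    uniformly from range(num_classes); otherwise it is kept unchanged.
    Intended to be called on every mini-batch, so the noise differs
    from one training iteration to the next.
    """
    rng = rng or random.Random()
    disturbed = []
    for y in labels:
        if rng.random() < alpha:
            disturbed.append(rng.randrange(num_classes))  # assumed uniform resampling
        else:
            disturbed.append(y)
    return disturbed
```

In a training loop one would call `disturb_labels` on the batch labels just before computing the loss, leaving the images and the rest of the pipeline untouched.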

While video-based person re-identification (Re-ID) has drawn increasing attention and made great progress in recent years, it is still very challenging to effectively overcome the occlusion problem and the visual ambiguity problem for visually similar negative samples. On the other hand, we observe that different frames of a video can provide complementary information for each other, and the structural information of pedestrians can provide extra discriminative cues for appearance features. Thus, modeling the temporal relations and the spatial relations within a frame...

10.1109/cvpr42600.2020.00335 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Objects in unmanned aerial vehicle (UAV) images are generally small due to the high photography altitude. Although many efforts have been made in object detection, how to accurately and quickly detect small objects is still one of the remaining open challenges. In this paper, we propose a feature fusion and scaling-based single shot detector (FS-SSD) for small object detection in UAV images. The FS-SSD is an enhancement based on FSSD, a variant of the original single shot multibox detector (SSD). We add an extra scaling branch of the deconvolution module with an average...

10.1109/tcsvt.2019.2905881 article EN IEEE Transactions on Circuits and Systems for Video Technology 2019-03-20

Person re-identification aims at identifying a certain person across non-overlapping multi-camera networks. It is a fundamental and challenging task in automated video surveillance. Most existing research relies mainly on hand-crafted features, resulting in unsatisfactory performance. In this paper, we propose a multi-scale triplet convolutional neural network which captures the visual appearance of a person at various scales. We propose to optimize the network parameters by a comparative similarity loss on massive sample triplets,...

10.1145/2964284.2967209 article EN Proceedings of the 24th ACM International Conference on Multimedia 2016-09-29
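
The abstract above mentions a comparative similarity loss over sample triplets, but the truncated text does not give its exact form. As a stand-in, the following minimal sketch shows the widely used hinge-style triplet loss on squared Euclidean distances; the function names and the `margin` default are assumptions for illustration and may differ from the loss used in the paper.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive (same
    identity) than to the negative (different identity) by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)
```

A triplet whose positive already sits much closer to the anchor than its negative contributes no loss, so training effort concentrates on the triplets that are still ranked incorrectly or too narrowly.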

The past year has witnessed the rapid development of applying the Transformer module to vision problems. While some researchers have demonstrated that Transformer-based models enjoy a favorable ability of fitting data, there is still a growing body of evidence showing that these models suffer from over-fitting, especially when the training data is limited. This paper offers an empirical study by performing step-by-step operations to gradually transition a Transformer-based model to a convolution-based model. The results we obtain during the transition process...

10.1109/iccv48922.2021.00063 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

With the prevalence of multimedia content on the Web, developing recommender solutions that can effectively leverage the rich signal in multimedia data is an urgent need. Owing to the success of deep neural networks in representation learning, recent advances in multimedia recommendation have largely focused on exploring deep learning methods to improve recommendation accuracy. To date, however, there has been little effort to investigate the robustness of multimedia representation and its impact on the performance of multimedia recommendation. In this paper, we shed light on the robustness of the multimedia recommender system. Using a state-of-the-art recommendation framework and image...

10.1109/tkde.2019.2893638 article EN IEEE Transactions on Knowledge and Data Engineering 2019-01-18

Recent years have witnessed the remarkable progress of applying deep learning models in video person re-identification (Re-ID). A key factor for video person Re-ID is to effectively construct discriminative and robust video feature representations for many complicated situations. Part-based approaches employ spatial and temporal attention to extract representative local features. While correlations between parts are ignored in previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph...

10.1109/tip.2020.3001693 article EN IEEE Transactions on Image Processing 2020-01-01

10.1109/cvpr52733.2024.01920 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

License plate detection is widely considered a solved problem, with many systems already in operation. However, the existing algorithms or systems work well only under some controlled conditions. There are still many challenges for license plate detection in an open environment, such as various observation angles, background clutter, scale changes, multiple plates, uneven illumination, and so on. In this paper, we propose a novel scheme to automatically locate license plates by principal visual word (PVW) discovery and local feature...

10.1109/tip.2012.2199506 article EN IEEE Transactions on Image Processing 2012-08-15

The explosive growth in the number of 3-D objects makes efficient retrieval technology highly desired. Extensive research efforts have been dedicated to view-based 3-D object retrieval for its advantage of using 2-D views to represent 3-D objects. In this paradigm, typically the retrieval is accomplished by matching the views of the query object with those of the objects in the database. However, using all the query views may not only introduce difficulty in rapid retrieval but also degrade retrieval accuracy when there is a mismatch between the query views and the object views in the database. In this work, we propose an interactive 3-D object retrieval scheme. Given a set of query views, we first perform...

10.1109/tmm.2011.2160619 article EN IEEE Transactions on Multimedia 2011-06-29