Jialian Wu

ORCID: 0000-0003-2629-3270
Research Areas
  • Advanced Neural Network Applications
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Advanced Radiotherapy Techniques
  • Human Pose and Action Recognition
  • Adversarial Robustness in Machine Learning
  • Video Analysis and Summarization
  • Digital Radiography and Breast Imaging
  • Radiation Dose and Imaging
  • Image and Signal Denoising Methods
  • Sperm and Testicular Function
  • Anomaly Detection Techniques and Applications
  • Image and Video Quality Assessment
  • Human Mobility and Location-Based Analysis
  • Automated Road and Building Extraction
  • Streptococcal Infections and Treatments
  • Natural Language Processing Techniques
  • Text and Document Classification Technologies
  • Bee Products Chemical Analysis
  • Topic Modeling
  • Advanced Measurement and Detection Methods
  • Machine Learning and Data Classification
  • Neonatal and Maternal Infections

University at Buffalo, State University of New York
1995-2024

Wuhan University
2021-2022

Buffalo State University
2021

Animal Technology Institute Taiwan
1999

Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking. In this paper, we present a new online joint detection and tracking model, TraDeS (TRAck to DEtect and Segment), exploiting tracking clues to assist detection end-to-end. TraDeS infers the object tracking offset by a cost volume, which is used to propagate previous object features for improving current object detection and segmentation. The effectiveness and superiority of TraDeS are shown on 4 datasets, including MOT (2D tracking), nuScenes (3D tracking), MOTS and Youtube-VIS (instance segmentation...

10.1109/cvpr46437.2021.01217 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Satellite video is an emerging type of earth observation tool, which has attracted increasing attention because of its application in dynamic analysis. However, most studies only focus on improving the spatial resolution of satellite video imagery. In contrast, few works are committed to enhancing the temporal resolution, and joint spatial-temporal improvement is studied even less. Joint spatial-temporal enhancement can not only produce high-resolution imagery for subsequent applications, but also provide the potential for clear motion dynamics...

10.1016/j.jag.2022.102731 article EN cc-by-nc-nd International Journal of Applied Earth Observation and Geoinformation 2022-02-25

Despite the previous success of object analysis, detecting and segmenting a large number of object categories with a long-tailed data distribution remains a challenging problem and is less investigated. For a large-vocabulary classifier, the chance of obtaining noisy logits is much higher, which can easily lead to a wrong recognition. In this paper, we exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes, and construct a classification tree that is responsible for parsing an object instance...

10.1145/3394171.3413970 article EN Proceedings of the 28th ACM International Conference on Multimedia 2020-10-12

State-of-the-art pedestrian detectors have performed promisingly on non-occluded pedestrians, yet they are still confronted by heavy occlusions. Although many previous works have attempted to alleviate the pedestrian occlusion issue, most of them rest on still images. In this paper, we exploit the local temporal context of pedestrians in videos and propose a tube feature aggregation network (TFAN) aiming at enhancing pedestrian detectors against severe occlusions. Specifically, for an occluded pedestrian in the current frame, we iteratively search for its relevant counterparts...

10.1109/cvpr42600.2020.01344 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

This paper presents a Generative RegIon-to-Text transformer, GRiT, for object understanding. The spirit of GRiT is to formulate object understanding as <region, text> pairs, where region locates objects and text describes objects. For example, the text in object detection denotes class names while that in dense captioning refers to descriptive sentences. Specifically, GRiT consists of a visual encoder to extract image features, a foreground object extractor to localize objects, and a text decoder to generate open-set object descriptions. With the same model architecture, GRiT can...

10.48550/arxiv.2212.00280 preprint EN other-oa arXiv (Cornell University) 2022-01-01

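Detecting small-scale pedestrians is one of the most challenging problems in pedestrian detection. Due to the lack of visual details, the representations of small-scale pedestrians tend to be too weak to be distinguished from background clutter. In this paper, we conduct an in-depth analysis of the small-scale pedestrian detection problem, which reveals that weak representations of small-scale pedestrians are the main cause for a classifier to miss them. To address this issue, we propose a novel Self-Mimic Learning (SML) method to improve the detection performance on small-scale pedestrians. We enhance the representations of small-scale pedestrians by mimicking the rich representations from large-scale pedestrians. Specifically, we design a mimic...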

10.1145/3394171.3413634 article EN Proceedings of the 28th ACM International Conference on Multimedia 2020-10-12

Multi-view pedestrian detection aims to predict a bird's eye view (BEV) occupancy map from multiple camera views. This task is confronted with two challenges: how to establish the 3D correspondences from the views to the BEV map and how to assemble occupancy information across views. In this paper, we propose a novel Stacked HOmography Transformations (SHOT) approach, which is motivated by approximating projections in 3D world coordinates via a stack of homographies. We first construct a stack of transformations for projecting views to the ground plane at different...

10.1109/iccv48922.2021.00599 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Video Instance Segmentation (VIS) aims to simultaneously classify, segment, and track multiple object instances in videos. Recent clip-level VIS takes a short video clip as input each time, showing stronger performance than frame-level VIS (tracking-by-segmentation), as more temporal context from multiple frames is utilized. Yet, most clip-level methods are neither end-to-end learnable nor real-time. These limitations are addressed by the recent VIS transformer (VisTR) [25], which performs VIS end-to-end within a clip. However, VisTR suffers...

10.1109/cvpr52688.2022.00103 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

With the development of video satellites, multimoving object tracking in satellite video is possible and has become a new challenging task. The difficulties are mainly caused by the characteristics of satellite videos: 1) small objects; 2) low contrast between objects and background; and 3) a background in a state of continuous motion. These characteristics make it difficult for advanced multiobject tracking algorithms designed for natural videos to give full play to their advantages, resulting in vast false alarms, missed objects, ID switches, and low-confidence bounding boxes. To...

10.1109/tgrs.2021.3139121 article EN IEEE Transactions on Geoscience and Remote Sensing 2021-12-28

Object detection and instance segmentation with a large number of object categories and a long-tailed data distribution are challenging for most existing deep learning models. As the number of classes increases, the outputs of a classifier become sensitive to the likely noisy logits, which can easily result in an incorrect recognition. To alleviate the large-vocabulary problem, we cluster fine-grained classes into coarser parent classes, and then build a classification tree to classify a fine-grained class via its parent class. Because the number of parent classes is much fewer, their logits are more...

10.1109/tmm.2021.3106096 article EN publisher-specific-oa IEEE Transactions on Multimedia 2021-08-20
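
The classification-tree idea described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-level tree in which the score of a fine-grained class is the product of its parent's softmax probability and its own softmax probability among its siblings. The names `tree_probs`, `parent_logits`, `child_logits`, and `parent_of` are hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def tree_probs(parent_logits, child_logits, parent_of):
    """Score fine-grained classes through their parent classes.

    parent_of[c] is the index of the parent of fine-grained class c
    (an assumed encoding of the tree, not taken from the paper).
    """
    parent_p = softmax(parent_logits)

    # Group fine-grained classes by parent so each sibling set is normalized separately.
    groups = {}
    for child, parent in enumerate(parent_of):
        groups.setdefault(parent, []).append(child)

    probs = [0.0] * len(child_logits)
    for parent, children in groups.items():
        sibling_p = softmax([child_logits[c] for c in children])
        for c, p in zip(children, sibling_p):
            # P(child) = P(parent) * P(child | parent)
            probs[c] = parent_p[parent] * p
    return probs


# Example: three fine-grained classes, two parents (classes 0 and 1 share parent 0).
example = tree_probs([2.0, 0.0], [0.0, 0.0, 1.0], [0, 0, 1])
```

Because each sibling group is normalized on its own, a noisy logit on a fine-grained class can only redistribute probability within its parent, which is one way to read the abstract's motivation.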

Multi-label image classification aims to predict multiple labels for a single image. However, the difficulties of predicting different labels may vary dramatically due to semantic variations of the label as well as the image context. Direct learning of multi-label classification models has the risk of being biased and overfitting those difficult labels, e.g., deep network based classifiers are over-trained on the difficult labels and, therefore, lead to false-positive errors during testing. To handle difficult labels in multi-label image classification, we propose to calibrate the model, which not only predicts the labels but...

10.1145/3474085.3475406 article EN Proceedings of the 29th ACM International Conference on Multimedia 2021-10-17

When adopting deep neural networks for a new vision task, a common practice is to start with fine-tuning some off-the-shelf well-trained network models from the community. Since a new task may require training a different network architecture with new domain data, taking advantage of off-the-shelf models is not trivial and generally requires considerable trial-and-error and parameter tuning. In this paper, we denote a well-trained model as a teacher network and a model for the new task as a student network. We aim to ease the efforts of transferring knowledge from the teacher to the student network, robust to the gaps between their network architectures,...

10.1609/aaai.v35i3.16358 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Despite the previous success of object analysis, detecting and segmenting a large number of object categories with a long-tailed data distribution remains a challenging problem and is less investigated. For a large-vocabulary classifier, the chance of obtaining noisy logits is much higher, which can easily lead to a wrong recognition. In this paper, we exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes, and construct a classification tree that is responsible for parsing an object instance...

10.48550/arxiv.2008.05676 preprint EN other-oa arXiv (Cornell University) 2020-01-01

The sgp gene of Streptococcus mutans was recently detected immediately downstream from the dgk gene within the same operon. In this study, the sgp gene was subcloned into the pMAL-c2 vector and SGP (S. mutans G protein) was overexpressed in Escherichia coli as a fusion protein with maltose-binding protein at a level of 40% of total cellular protein. One-step amylose affinity chromatography purification yielded a fusion product of approximately 95% purity. SGP was purified following cleavage with protease factor Xa and DEAE-Sephacel chromatography. In nucleotide binding assays,...

10.1128/iai.63.7.2516-2521.1995 article EN Infection and Immunity 1995-07-01

Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking. In this paper, we present a new online joint detection and tracking model, TraDeS (TRAck to DEtect and Segment), exploiting tracking clues to assist detection end-to-end. TraDeS infers the object tracking offset by a cost volume, which is used to propagate previous object features for improving current object detection and segmentation. The effectiveness and superiority of TraDeS are shown on 4 datasets, including MOT (2D tracking), nuScenes (3D tracking), MOTS and Youtube-VIS (instance segmentation...

10.48550/arxiv.2103.08808 preprint EN other-oa arXiv (Cornell University) 2021-01-01

To develop a library of graphic human models that closely match patients undergoing interventional fluoroscopic procedures in order to obtain an accurate estimate of their skin dose. A dose tracking system (DTS) has been developed that calculates the patient's skin dose in real time during fluoroscopic procedures based on a graphical simulation of the x-ray system and the patient. The calculation is performed using a lookup table containing values of mGy per mAs at a reference point and an inverse-square correction for the distance from the source to individual points on the skin. For proper...

10.1118/1.4734743 article EN Medical Physics 2012-06-01
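
The dose calculation summarized in the abstract above (a reference-point value in mGy per mAs, scaled by an inverse-square correction for distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the DTS software: the function name `skin_dose_mgy`, its parameters, and the numbers in the example are hypothetical, and any additional corrections the real system applies are omitted.

```python
def skin_dose_mgy(dose_per_mas_ref, mas, ref_distance_cm, point_distance_cm):
    """Estimate dose at a skin point from a reference-point calibration value.

    dose_per_mas_ref  : mGy per mAs at the reference point (e.g. read from a lookup table)
    mas               : tube current-time product delivered, in mAs
    ref_distance_cm   : source-to-reference-point distance
    point_distance_cm : source-to-skin-point distance
    """
    if ref_distance_cm <= 0 or point_distance_cm <= 0:
        raise ValueError("distances must be positive")
    # Dose at the reference point for the delivered mAs.
    ref_dose = dose_per_mas_ref * mas
    # Inverse-square law: dose falls off with the square of the distance from the source.
    return ref_dose * (ref_distance_cm / point_distance_cm) ** 2


# Hypothetical example: a skin point twice as far from the source as the reference point.
dose_at_point = skin_dose_mgy(0.05, 100.0, 60.0, 120.0)
```

Doubling the distance from the source reduces the estimated dose to one quarter of the reference-point value, which is why accurate patient geometry matters for the estimate.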

The video instance segmentation (VIS) task requires classifying, segmenting, and tracking object instances over all frames in a video clip. Recently, VisTR [1] has been proposed as an end-to-end transformer-based VIS framework, while demonstrating state-of-the-art performance. However, VisTR is slow to converge during training, requiring around 1000 GPU hours due to the high computational cost of its transformer attention module. To improve the training efficiency, we propose Deformable VisTR, leveraging...

10.1109/icassp43922.2022.9746665 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Purpose: To develop pediatric graphic models for a real‐time skin dose‐tracking system (DTS) and to verify the accuracy of skin dose and dose‐rate determination with these models for pediatric cardiac projections. Methods: A DTS has been developed to track skin dose during fluoroscopic interventional procedures and provide a graphical representation of the cumulative dose distribution in real time. Accurate operation of the DTS has been verified for adult patient models. To extend the application of this software to pediatric procedures, a series of 3D pediatric models with varying heights ranging from 60 to 128 cm and three weight ranges...

10.1118/1.4814135 article EN Medical Physics 2013-06-01

The video instance segmentation (VIS) task requires classifying, segmenting, and tracking object instances over all frames in a video clip. Recently, VisTR has been proposed as an end-to-end transformer-based VIS framework, while demonstrating state-of-the-art performance. However, VisTR is slow to converge during training, requiring around 1000 GPU hours due to the high computational cost of its transformer attention module. To improve the training efficiency, we propose Deformable VisTR, leveraging...

10.48550/arxiv.2203.06318 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Video Instance Segmentation (VIS) aims to simultaneously classify, segment, and track multiple object instances in videos. Recent clip-level VIS takes a short video clip as input each time, showing stronger performance than frame-level VIS (tracking-by-segmentation), as more temporal context from multiple frames is utilized. Yet, most clip-level methods are neither end-to-end learnable nor real-time. These limitations are addressed by the recent VIS transformer (VisTR), which performs VIS end-to-end within a clip. However, VisTR suffers from long...

10.48550/arxiv.2203.01853 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Recent progress in environment sensing technology focuses more on measuring the physical properties of the environment, e.g., temperature and noise, but lacks the ability to understand people's subjective responses, or feelings about indoor comfort. Feelings depend on both environmental conditions and individual needs and preferences. Different people may feel differently in the same room experiencing the same conditions. In this work, we apply a crowdsensing based approach to predict personalized indoor comfort. We assume that similar users share...

10.1109/mipr54900.2022.00053 article EN 2022-08-01