Yujiao Shi

ORCID: 0000-0001-6028-9051
Research Areas
  • Robotics and Sensor-Based Localization
  • Advanced Image and Video Retrieval Techniques
  • Advanced Vision and Imaging
  • Advanced Neural Network Applications
  • Computer Graphics and Visualization Techniques
  • Video Surveillance and Tracking Methods
  • 3D Surveying and Cultural Heritage
  • Metaheuristic Optimization Algorithms Research
  • Remote Sensing and LiDAR Applications
  • Advanced Image Processing Techniques
  • Evolutionary Algorithms and Applications
  • Image and Object Detection Techniques
  • Impact of Light on Environment and Health
  • Multimodal Machine Learning Applications
  • Robotic Path Planning Algorithms
  • Satellite Image Processing and Photogrammetry
  • Infrared Target Detection Methodologies
  • Space Satellite Systems and Control
  • Domain Adaptation and Few-Shot Learning
  • Modular Robots and Swarm Intelligence
  • Spacecraft Design and Technology
  • Artificial Immune Systems Applications
  • Advanced Numerical Analysis Techniques
  • Medical Image Segmentation Techniques
  • Image Enhancement Techniques

ShanghaiTech University
2024

Tianjin Agricultural University
2024

Australian National University
2019-2023

Australian Centre for Robotic Vision
2020

Nanjing University of Posts and Telecommunications
2014-2019

Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e.g., satellite) images. Existing approaches treat the task as pure location estimation by learning discriminative feature descriptors, but neglect orientation alignment. It is well-recognized that knowing the orientation between ground and aerial images can significantly reduce matching ambiguity between these two views, especially when the ground-level images have a limited...

10.1109/cvpr42600.2020.00412 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

This paper addresses the problem of cross-view image geo-localization, where the geographic location of a ground-level street-view query image is estimated by matching it against a large-scale aerial map (e.g., a high-resolution satellite image). State-of-the-art deep-learning based methods tackle this problem as deep metric learning, which aims to learn global feature representations of the scene seen by the two different views. Although such methods obtain promising results, they fail to exploit a crucial cue relevant...

10.1609/aaai.v34i07.6875 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map. Existing methods often treat this problem as cross-view image retrieval, and use learned deep features to match the ground-level query image to a partition (e.g., a small patch) of the satellite map. With these methods, the localization accuracy is limited by the partitioning density of the satellite map (often on the order of tens of meters). Departing from this conventional wisdom, this paper presents a novel solution that can achieve highly accurate localization. The key idea...

10.1109/cvpr52688.2022.01650 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

The artificial bee colony is a popular evolutionary algorithm that exhibits strong exploration ability but slow convergence. This paper proposes two new updating equations to boost the performance of the employed and onlooker bees, respectively. In the new equations, intelligent learning strategies give the bees a chance to learn from individuals with better performance. New control operators are also utilized to balance the global and local searches. Second, we define a search direction mechanism to overcome the oscillation...

10.1109/tii.2018.2857198 article EN IEEE Transactions on Industrial Informatics 2018-07-18

We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images. Our prior work treats the above task as pure image retrieval by selecting the satellite reference image most similar to the ground-level query image. However, such an approach often produces coarse location estimates because the geotag of the retrieved satellite image only corresponds to the image center while the ground camera can be...

10.1109/tpami.2022.3189702 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-01-01

This paper presents a new approach for synthesizing a novel street-view panorama given a satellite image, as if captured from the geographical location at the center of the satellite image. Existing works approach this as an image generation problem, adopting generative adversarial networks to implicitly learn the cross-view transformations, but ignore the geometric constraints. In this paper, we make the geometric correspondences between the satellite and street-view images explicit so as to facilitate the transfer of information between domains. Specifically, we observe that when a 3D point is...

10.1109/tpami.2022.3140750 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-01-07

Image retrieval-based cross-view localization methods often lead to very coarse camera pose estimation, due to the limited sampling density of the database satellite images. In this paper, we propose a method to increase the accuracy of a ground camera's location and orientation by estimating the relative rotation and translation between the ground-level image and its matched/retrieved satellite image. Our approach designs a geometry-guided transformer that combines the benefits of conventional geometry and learnable transformers to map the ground-view...

10.1109/iccv51070.2023.01967 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Large garages are ubiquitous yet intricate scenes that present unique challenges due to their monotonous colors, repetitive patterns, reflective surfaces, and transparent vehicle glass. Conventional Structure from Motion (SfM) methods for camera pose estimation and 3D reconstruction often fail in these environments due to poor correspondence construction. To address these challenges, we introduce LetsGo, a LiDAR-assisted Gaussian splatting framework for large-scale garage modeling and rendering. We develop a handheld...

10.1145/3687762 article EN cc-by-nc-sa ACM Transactions on Graphics 2024-11-19

Generating street-view images from satellite imagery is a challenging task, particularly in maintaining accurate pose alignment and incorporating diverse environmental conditions. While diffusion models have shown promise in generative tasks, their ability to maintain strict pose alignment throughout the diffusion process is limited. In this paper, we propose a novel Iterative Homography Adjustment (IHA) scheme applied during the denoising process, which effectively addresses pose misalignment and ensures spatial consistency in the generated...

10.48550/arxiv.2502.03498 preprint EN arXiv (Cornell University) 2025-02-05

Unmanned Aerial Vehicles (UAVs), also known as drones, have become increasingly popular in recent years due to their ability to capture high-quality multimedia data from the sky. With the rise of multimedia applications, such as aerial photography, cinematography, and mapping, UAVs have emerged as a powerful tool for gathering rich and diverse multimedia content. This workshop aims to bring together researchers, practitioners, and enthusiasts interested in UAV multimedia to explore the latest advancements, challenges, and opportunities in this exciting field. The...

10.1145/3581783.3610937 article EN 2023-10-26

We address the problem of novel view synthesis (NVS) from a few sparse source view images. Conventional image-based rendering methods estimate scene geometry and synthesize novel views in two separate steps. However, erroneous geometry estimation will decrease NVS performance, as view synthesis highly depends on the quality of the estimated scene geometry. In this paper, we propose an end-to-end NVS framework to eliminate the error propagation issue. To be specific, we construct a volume under the target view and design a source-view visibility estimation (SVE) module to determine...

10.1109/cvpr46437.2021.00955 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Large garages are ubiquitous yet intricate scenes in our daily lives, posing challenges characterized by monotonous colors, repetitive patterns, reflective surfaces, and transparent vehicle glass. Conventional Structure from Motion (SfM) methods for camera pose estimation and 3D reconstruction fail in these environments due to poor correspondence construction. To address these challenges, this paper introduces LetsGo, a LiDAR-assisted Gaussian splatting approach for large-scale garage modeling and rendering. We...

10.48550/arxiv.2404.09748 preprint EN arXiv (Cornell University) 2024-04-15

Visual tracking is one of the most important applications in computer vision. Because the tracking process can be formulated as a dynamic optimization problem, particle swarm optimization (PSO), an effective algorithm for solving optimization problems, has been widely used. However, it has been shown that traditional PSO converges easily to a local optimum. In this paper, we adopt quantum-behaved particle swarm optimization (QPSO) for visual tracking. QPSO has better global convergence than PSO and can overcome the shortcomings of the PSO algorithm. In order to achieve better performance, we improve the framework based on...

10.1109/chicc.2015.7260232 article EN 2015-07-01

Image registration is a hot topic in the field of image processing, the aim of which is to find the best spatial transformation between two images by optimizing a similarity metric. Mutual information, as an effective and reliable criterion, is used in this paper. Local optimization techniques always fail in this process because the function of the similarity metric with respect to the transformation parameters is usually non-convex and irregular; thus, a global optimization method is required. This paper proposes an improved artificial bee colony algorithm hybridized with differential evolution for...

10.1109/chicc.2016.7553778 article EN 2016-07-01

This paper addresses the problem of cross-view image geo-localization, where the geographic location of a ground-level street-view query image is estimated by matching it against a large-scale aerial map (e.g., a high-resolution satellite image). State-of-the-art deep-learning based methods tackle this problem as deep metric learning, which aims to learn global feature representations of the scene seen by the two different views. Although such methods obtain promising results, they fail to exploit a crucial cue relevant...

10.48550/arxiv.1907.05021 preprint EN other-oa arXiv (Cornell University) 2019-01-01

As a modern Evolutionary Algorithm, Differential Evolution (DE) is usually criticized for its slow convergence when compared to Particle Swarm Optimization (PSO) on PSO's benchmark functions. In this paper, by combining the merits of PSO and DE, we first present a new hybrid DE algorithm to accelerate the convergence speed. Then a novel mutation strategy with local and global search operators is proposed for balancing the exploration ability and the convergence rate of the improved DE. The algorithm is applied to a set of test problems and compared with the basic algorithms and their variants....

10.1109/sde.2014.7031540 article EN 2014-12-01

Multi-level threshold segmentation techniques are among the most important parts of image processing. They are simple, robust, and accurate. However, some of them have a long computation time, and it grows exponentially as the number of thresholds increases. This paper proposes an improved differential evolution with a novel mutation strategy and an adaptive parameter controlling method (MApcDE) so as to avoid time-consuming computation and overcome the relation between computation time and dimensions. The OTSU method, which maximizes the variance between foreground and background...

10.1109/ccdc.2015.7162447 article EN 2015 27th Chinese Control and Decision Conference (CCDC) 2015-05-01

Vision-based localization for autonomous driving has been of great interest among researchers. When a pre-built 3D map is not available, the techniques of visual simultaneous localization and mapping (SLAM) are typically adopted. Due to error accumulation, visual SLAM (vSLAM) usually suffers from long-term drift. This paper proposes a framework to increase the localization accuracy by fusing vSLAM with a deep-learning-based ground-to-satellite (G2S) image registration method. In this framework, a coarse (spatial correlation bound check)...

10.48550/arxiv.2404.09169 preprint EN arXiv (Cornell University) 2024-04-14

Given a ground-level query image and a geo-referenced aerial image that covers the query's local surroundings, fine-grained cross-view localization aims to estimate the location of the ground camera inside the aerial image. Recent works have focused on developing advanced networks trained with accurate ground truth (GT) locations of ground images. However, the trained models always suffer a performance drop when applied to images in a new target area that differs from training. In most deployment scenarios, acquiring fine GT, i.e. accurate GT locations, for...

10.48550/arxiv.2406.00474 preprint EN arXiv (Cornell University) 2024-06-01