Yiyang Zhou

ORCID: 0000-0002-1534-8005
Research Areas
  • Advanced Neural Network Applications
  • 3D Surveying and Cultural Heritage
  • Robotics and Sensor-Based Localization
  • Autonomous Vehicle Technology and Safety
  • Video Surveillance and Tracking Methods
  • Machine Learning and Data Classification
  • Text and Document Classification Technologies
  • CO2 Sequestration and Geologic Interactions
  • Enhanced Oil Recovery Techniques
  • 3D Shape Modeling and Analysis
  • Remote Sensing and LiDAR Applications
  • Advanced Image and Video Retrieval Techniques
  • Video Analysis and Summarization
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Face and Expression Recognition
  • Underwater Vehicles and Communication Systems
  • Hydrocarbon exploration and reservoir analysis
  • Spacecraft and Cryogenic Technologies
  • Advanced Battery Technologies Research
  • Control and Dynamics of Mobile Robots
  • Advanced Vision and Imaging
  • Nuclear and radioactivity studies
  • Music and Audio Processing
  • Industrial Vision Systems and Defect Detection

Xi'an Jiaotong University
2023-2025

Civil Aviation University of China
2024

University of North Carolina at Chapel Hill
2024

Beijing University of Posts and Telecommunications
2024

University of North Carolina Health Care
2024

Systems Control (United States)
2020-2022

University of California, Berkeley
2019-2022

Communications & Power Industries (United States)
2020

Beijing Normal University
2019

Nanjing University
2016

Mapping and localization is a critical module of autonomous driving, and significant achievements have been reached in this field. Beyond Global Navigation Satellite System (GNSS), research in point cloud registration, visual feature matching, and inertial navigation has greatly enhanced the accuracy and robustness of mapping and localization in different scenarios. However, highly urbanized scenes are still challenging: LIDAR- and camera-based methods perform poorly with numerous dynamic objects; GNSS-based solutions experience...

10.1109/icra40945.2020.9196526 article EN 2020-05-01

High definition (HD) maps have demonstrated their essential roles in enabling full autonomy, especially in complex urban scenarios. As a crucial layer of the HD map, lane-level maps are particularly useful: they contain geometrical and topological information for both lanes and intersections. However, large-scale construction is limited by tedious human labeling and high maintenance costs, especially in scenarios with complicated road structures and irregular markings. This paper proposes an approach based on...

10.1109/iros51168.2021.9636205 article EN 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021-09-27

Recent advancements in video generation have significantly improved the ability to synthesize videos from text instructions. However, existing models still struggle with key challenges such as instruction misalignment, content hallucination, safety concerns, and bias. Addressing these limitations, we introduce MJ-BENCH-VIDEO, a large-scale preference benchmark designed to evaluate video generation across five critical aspects: Alignment, Safety, Fineness, Coherence & Consistency, and Bias & Fairness. This...

10.48550/arxiv.2502.01719 preprint EN arXiv (Cornell University) 2025-02-03

Driven by the complementarity and consistency inherent in multiview data, multiview clustering (MVC) has garnered widespread attention in various domains. Real-world data often encounters the issue of missing information, leading to a surge of interest in the domain of incomplete MVC (IMVC). Despite existing approaches having made significant progress in addressing IMVC, two challenges persist: 1) many alignment-based methodologies tend to overlook the topological relationships among instances and 2) view representations based on...

10.1109/tnnls.2025.3540437 article EN IEEE Transactions on Neural Networks and Learning Systems 2025-01-01

Detecting dynamic objects and predicting static road information such as drivable areas and ground heights are crucial for safe autonomous driving. Previous works studied each perception task separately and lacked a collective quantitative analysis. In this work, we show that it is possible to perform all tasks via a simple and efficient multi-task network. Our proposed network, LidarMTL, takes raw LiDAR point clouds as inputs and predicts six outputs for 3D object detection and road understanding. The network is based on an...

10.1109/iros51168.2021.9635858 article EN 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021-09-27

The availability of many real-world driving datasets is a key reason behind the recent progress of object detection algorithms in autonomous driving. However, there exist ambiguity or even failures in the labels due to the error-prone annotation process or sensor observation noise. Current public datasets only provide deterministic labels without considering their inherent uncertainty, as do common training and evaluation metrics for detectors. As a result, an in-depth evaluation among different methods remains challenging, and detectors...

10.1109/tits.2021.3096943 article EN IEEE Transactions on Intelligent Transportation Systems 2021-07-27

The availability of real-world datasets is the prerequisite for developing object detection methods for autonomous driving. While ambiguity exists in the labels due to the error-prone annotation process or sensor observation noises, current datasets only provide deterministic annotations without considering their uncertainty. This precludes an in-depth evaluation among different methods, especially those that explicitly model predictive probability. In this work, we propose a generative model to estimate bounding box...

10.1109/iros45743.2020.9340798 article EN 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020-10-24

With information from multiple input modalities, sensor fusion-based algorithms usually outperform their single-modality counterparts in robotics. Camera and LIDAR, with complementary semantic and depth information, are the typical choices for detection tasks in complicated driving environments. For most camera-LIDAR fusion algorithms, however, the calibration of the sensor suite will greatly impact performance. More specifically, the algorithm requires an accurate geometric relationship among the sensors as input, it...

10.1109/itsc55140.2022.9922085 article EN 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC) 2022-10-08

Point cloud completion estimates complete shapes from incomplete point clouds to obtain higher-quality data. Most existing methods only consider global object features, ignoring the spatial and semantic information of adjacent points. They cannot distinguish structural information well between different parts, and the robustness of the models is poor. To tackle these challenges, we propose an interaction-based generative network for point cloud completion (DualGenerator). It contains an upper adversarial generation path and a lower variational...

10.1109/lra.2023.3310406 article EN IEEE Robotics and Automation Letters 2023-08-30

Instruction-following Vision Large Language Models (VLLMs) have achieved significant progress recently on a variety of tasks. These approaches merge strong pre-trained vision models and large language models (LLMs). Since these components are trained separately, the learned representations need to be aligned with joint training on additional image-language pairs. This procedure is not perfect and can cause the model to hallucinate, that is, provide answers that do not accurately reflect the image, even when the core LLM is highly...

10.48550/arxiv.2402.11411 preprint EN arXiv (Cornell University) 2024-02-17

The availability of real-world datasets is the prerequisite for developing object detection methods for autonomous driving. While ambiguity exists in the labels due to the error-prone annotation process or sensor observation noises, current datasets only provide deterministic annotations without considering their uncertainty. This precludes an in-depth evaluation among different methods, especially those that explicitly model predictive probability. In this work, we propose a generative model to estimate bounding box...

10.48550/arxiv.2003.03644 preprint EN other-oa arXiv (Cornell University) 2020-01-01

This work focuses on the potential of Vision LLMs (VLLMs) in visual reasoning. Different from prior studies, we shift our focus from evaluating standard performance to introducing a comprehensive safety evaluation suite, covering both out-of-distribution (OOD) generalization and adversarial robustness. For the OOD evaluation, we present two novel VQA datasets, each with one variant, designed to test model performance under challenging conditions. In exploring adversarial robustness, we propose a straightforward attack strategy for...

10.48550/arxiv.2311.16101 preprint EN other-oa arXiv (Cornell University) 2023-01-01

The paper mainly builds a machine learning model for tennis match score prediction and conducts a significance analysis of the effects of concomitant conditions using PSM. Firstly, we constructed a new metrics system, including whether the athlete is on the serving side, his/her personal skill, level of fatigue, and mentality during the match, tested the significance of these metrics by binary logistic regression, and used various models such as XGBOOST, SVC, and LGBM to build the model. Then, trained an based data improved system realize...

10.62051/0epgvv37 article EN cc-by-nc Transactions on Computer Science and Intelligent Systems Research 2024-08-12