- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Autonomous Vehicle Technology and Safety
- Robotics and Sensor-Based Localization
- Data Management and Algorithms
- Data Visualization and Analytics
- Video Surveillance and Tracking Methods
- Advanced Vision and Imaging
- Human Pose and Action Recognition
- Augmented Reality Applications
- Multimodal Machine Learning Applications
- Artificial Intelligence in Games
- Image and Video Quality Assessment
- Peer-to-Peer Network Technologies
- Distributed and Parallel Computing Systems
- Visual Attention and Saliency Detection
- Geochemistry and Geologic Mapping
- 3D Shape Modeling and Analysis
- Transportation and Mobility Innovations
- Robotic Path Planning Algorithms
- Human Motion and Animation
- Infrared Target Detection Methodologies
- Digital Games and Media
- Text Readability and Simplification
- Digital Media Forensic Detection
New York University
2021-2025
Shanghai Electric (China)
2023-2024
Huazhong University of Science and Technology
2021-2024
China University of Geosciences (Beijing)
2024
City University of New York
2023
Fujian Business University
2023
Horizon Robotics (China)
2022
Georgia Institute of Technology
2017-2021
National Central University
2020
Tianjin Research Institute of Electric Science (China)
2020
Instance segmentation on point clouds is a fundamental task in 3D scene perception. In this work, we propose a concise clustering-based framework named HAIS, which makes full use of the spatial relation of points and point sets. Considering that clustering-based methods may result in over-segmentation or under-segmentation, we introduce hierarchical aggregation to progressively generate instance proposals, i.e., point aggregation for preliminarily clustering points into sets and set aggregation for generating complete instances from sets. Once the complete instances are obtained, a sub-network of intra-instance...
In this paper, we propose a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation. Previously, most instance segmentation methods heavily relied on object detection and performed mask prediction based on bounding boxes or dense centers. In contrast, we propose a sparse set of instance activation maps, as a new object representation, to highlight informative regions for each foreground object. Then instance-level features are obtained by aggregating features according to the highlighted regions for recognition and segmentation. Moreover,...
Autonomous driving requires a comprehensive understanding of the surrounding environment for reliable trajectory planning. Previous works rely on dense rasterized scene representation (e.g., agent occupancy and semantic map) to perform planning, which is computationally intensive and misses instance-level structure information. In this paper, we propose VAD, an end-to-end vectorized paradigm for autonomous driving, which models the driving scene as a fully vectorized representation. The proposed vectorized paradigm has two significant advantages. On one...
Machine learning advances have afforded an increase in algorithms capable of creating art, music, stories, games, and more. However, it is not yet well understood how machine learning algorithms might best collaborate with people to support creative expression. To investigate how practicing designers perceive the role of AI in the creative process, we developed a game level design tool for Super Mario Bros.-style games with a built-in AI level designer. In this paper we discuss our design of the Morai Maker intelligent tool through two mixed-methods studies with a total of over...
Labeling objects with pixel-wise segmentation requires a huge amount of human labor compared to bounding boxes. Most existing methods for weakly supervised instance segmentation focus on designing heuristic losses with priors from bounding boxes. However, we find that box-supervised methods can produce some fine segmentation masks, and we wonder whether the detectors could learn from these fine masks while ignoring low-quality masks. To answer this question, we present BoxTeacher, an efficient and end-to-end training framework for high-performance weakly supervised instance segmentation, which leverages...
High-definition (HD) maps provide abundant and precise environmental information of the driving scene, serving as a fundamental and indispensable component for planning in an autonomous driving system. We present MapTR, a structured end-to-end Transformer for efficient online vectorized HD map construction. We propose a unified permutation-equivalent modeling approach, i.e., modeling a map element as a point set with a group of equivalent permutations, which accurately describes the shape of the map element and stabilizes the learning process. We design a hierarchical query...
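The idea of a point set with a group of equivalent permutations can be illustrated with a minimal sketch. The abstract does not spell out which orderings belong to the group, so the choice below is an assumption: an open polyline is treated as equivalent to its reverse, and a closed polygon as equivalent to every cyclic shift in either direction. The function name `equivalent_permutations` is hypothetical and is not taken from the MapTR code.

```python
def equivalent_permutations(points, closed):
    """Enumerate orderings of `points` that describe the same shape.

    Assumption (not stated in the abstract): an open polyline has two
    equivalent orderings (forward, backward); a closed polygon with n
    vertices has 2 * n (every start vertex, both directions).
    """
    forward = list(points)
    backward = forward[::-1]
    if not closed:
        return [forward, backward]
    perms = []
    for direction in (forward, backward):
        for start in range(len(direction)):
            # Rotate the sequence so that it begins at index `start`.
            perms.append(direction[start:] + direction[:start])
    return perms


# Illustrative data only: a lane divider (open) and a crossing (closed).
lane_divider = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)]
crossing = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
polyline_perms = equivalent_permutations(lane_divider, closed=False)
polygon_perms = equivalent_permutations(crossing, closed=True)
```

A matching loss built on this idea would compare a prediction against every ordering in the group and keep the best one, so that no single arbitrary ordering is imposed as ground truth.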
3D detection based on a surround-view camera system is a critical technique in autopilot. In this work, we present Polar Parametrization for 3D detection, which reformulates position parametrization, velocity decomposition, perception range, label assignment and loss function in the polar coordinate system. Polar Parametrization establishes explicit associations between image patterns and prediction targets, exploiting the view symmetry of surround-view cameras as inductive bias to ease optimization and boost performance. Based on Polar Parametrization,...
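As a rough illustration of what position parametrization and velocity decomposition in polar coordinates mean, the sketch below applies the textbook Cartesian-to-polar conversion in the ground plane. It is not the paper's implementation; the function names and the choice of the ego vehicle as origin are assumptions made for the example.

```python
import math


def to_polar(x, y):
    """Convert a ground-plane position (ego vehicle assumed at the origin)
    to (radius, azimuth in radians)."""
    return math.hypot(x, y), math.atan2(y, x)


def decompose_velocity(x, y, vx, vy):
    """Split a Cartesian velocity into radial and tangential components
    relative to the ray from the origin to (x, y)."""
    _, azimuth = to_polar(x, y)
    cos_a, sin_a = math.cos(azimuth), math.sin(azimuth)
    v_radial = vx * cos_a + vy * sin_a        # along the viewing ray
    v_tangential = -vx * sin_a + vy * cos_a   # perpendicular to the ray
    return v_radial, v_tangential


# Illustrative values only.
radius, azimuth = to_polar(3.0, 4.0)
v_rad, v_tan = decompose_velocity(0.0, 1.0, 2.0, 3.0)
```

Because each camera looks outward along a ray, radius and azimuth line up with image depth and horizontal image position, which is one way to read the abstract's "explicit associations between image patterns and prediction targets".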
Learning Bird's Eye View (BEV) representation from surrounding-view cameras is of great importance for autonomous driving. In this work, we propose a Geometry-guided Kernel Transformer (GKT), a novel 2D-to-BEV representation learning mechanism. GKT leverages the geometric priors to guide the transformer to focus on discriminative regions and unfolds kernel features to generate the BEV representation. For fast inference, we further introduce a look-up table (LUT) indexing method to get rid of the camera's calibrated parameters at...
Existing end-to-end autonomous driving (AD) algorithms typically follow the Imitation Learning (IL) paradigm, which faces challenges such as causal confusion and the open-loop gap. In this work, we establish a 3DGS-based closed-loop Reinforcement Learning (RL) training paradigm. By leveraging 3DGS techniques, we construct a photorealistic digital replica of the real physical world, enabling the AD policy to extensively explore the state space and learn to handle out-of-distribution scenarios through large-scale trial and error. To...
The concept of an AI assistant for task guidance is rapidly shifting from a science fiction staple to an impending reality. Such a system is inherently complex, requiring models for perceptual grounding, attention, and reasoning, an intuitive interface that adapts to the performer's needs, and the orchestration of data streams from many sensors. Moreover, all data acquired by the system must be readily available for post-hoc analysis to enable developers to understand performer behavior and quickly detect failures. We introduce TIM, the first end-to-end...
Exploring large virtual environments, such as cities, is a central task in several domains, such as gaming and urban planning. VR systems can greatly help with this task by providing an immersive experience; however, a common issue with viewing and navigating a city in the traditional sense is that users can either obtain a local or a global view, but not both at the same time, requiring them to continuously switch between perspectives, losing context and distracting them from their analysis. In this paper, our goal is to allow users to navigate to points of...
Media streaming, with an edge-cloud setting, has been adopted for a variety of applications such as entertainment, visualization, and design. Unlike video/audio streaming, where the content is usually consumed passively, virtual reality applications require 3D assets stored on the edge to facilitate frequent edge-side interactions such as object manipulation and viewpoint movement. Compared to audio and video streaming, 3D asset streaming often requires larger data sizes and yet lower latency to ensure sufficient rendering quality, resolution, perceptual...
The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneously perceive the 3D environment, reason about physical tasks, and model the performer, all in real-time. Within this framework, a wide variety of sensors are needed to generate data across different modalities, such as audio, video, depth, speech, and time-of-flight....
Text presented in augmented reality provides in-situ, real-time information for users. However, this content can be challenging to apprehend quickly when engaging in cognitively demanding AR tasks, especially when it is presented on a head-mounted display. We propose ARTiST, an automatic text simplification system that uses a few-shot prompt and GPT-3 models to specifically optimize the text length and semantic content for augmented reality. Developed out of a formative study that included seven users and three experts, our system combines a customized error...
In-vehicle automated safety features aim to increase safety; however, they are not always perfect. When automated systems fail, they leave the driver unprepared to recover quickly and safely. Reliability displays, informing the driver of the system's confidence in itself, could help keep drivers aware of the automation's status when failures occur. This study proposed two metrics for displaying this information to the driver: automation reliability (AR), a system-centric metric, and required driver engagement (RDE), a human-centric metric. Visual...
Small object detection requires the detection head to scan a large number of positions on image feature maps, which is extremely hard for computation- and energy-efficient lightweight generic detectors. To accurately detect small objects with limited computation, we propose a two-stage framework with low computation complexity, termed TinyDet. It enables high-resolution feature maps for dense anchoring to better cover small objects, proposes a sparsely-connected convolution for computation reduction, enhances the early stage features in the backbone,...
With the explosive growth in demand for lithium (Li) resources, the Mufushan area has been a hotspot for Li deposit exploration in China in recent years. Geochemical maps and geochemical anomaly maps are basic maps for mineral resource exploration. A fixed-value method to contour the geochemical map is presented here, in which Li concentrations are divided into 19 levels on 18 fixed values, ranging from 5 μg/g (corresponding to the detection limit) to 1858 μg/g (corresponding to the cut-off grade of the hard-rock type) and illustrated in six color tones corresponding to areas of low background, high anomaly,...
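The arithmetic behind "19 levels on 18 fixed values" can be sketched as a simple binning step: 18 ascending breakpoints split the concentration axis into 19 intervals. Only the two endpoints (5 μg/g and 1858 μg/g) come from the abstract; the 16 intermediate breakpoints below are hypothetical, generated with geometric spacing purely for illustration, and the function name `contour_level` is likewise an assumption.

```python
from bisect import bisect_right

DETECTION_LIMIT = 5.0    # μg/g, from the abstract
CUTOFF_GRADE = 1858.0    # μg/g, from the abstract
NUM_FIXED_VALUES = 18

# Hypothetical breakpoints: geometric spacing between the two known
# endpoints. The paper's actual 18 fixed values are not given here.
_ratio = (CUTOFF_GRADE / DETECTION_LIMIT) ** (1 / (NUM_FIXED_VALUES - 1))
FIXED_VALUES = [DETECTION_LIMIT * _ratio ** i for i in range(NUM_FIXED_VALUES)]
FIXED_VALUES[-1] = CUTOFF_GRADE  # remove floating-point drift at the top end


def contour_level(concentration):
    """Return a level index from 0 to 18 for a Li concentration in μg/g.

    Level 0 is below the detection limit; level 18 is at or above the
    cut-off grade.
    """
    return bisect_right(FIXED_VALUES, concentration)


# Illustrative sample concentrations (μg/g), not measured data.
samples = [1.0, 5.0, 120.0, 1858.0, 2500.0]
levels = [contour_level(c) for c in samples]
```

Because the breakpoints are fixed rather than derived from each dataset's percentiles, maps produced from different surveys share one legend and can be compared directly.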
High-definition (HD) maps provide abundant and precise static environmental information of the driving scene, serving as a fundamental and indispensable component for planning in an autonomous driving system. In this paper, we present Map TRansformer, an end-to-end framework for online vectorized HD map construction. We propose a unified permutation-equivalent modeling approach, i.e., modeling a map element as a point set with a group of equivalent permutations, which accurately describes the shape of the map element and stabilizes the learning process....