Jiankai Sun

ORCID: 0000-0001-5633-1739
Research Areas
  • Privacy-Preserving Technologies in Data
  • Advanced Neural Network Applications
  • Reinforcement Learning in Robotics
  • Advanced Image and Video Retrieval Techniques
  • Robotics and Sensor-Based Localization
  • Robot Manipulation and Learning
  • Human Pose and Action Recognition
  • Anomaly Detection Techniques and Applications
  • Topic Modeling
  • Adversarial Robustness in Machine Learning
  • Traffic Prediction and Management Techniques
  • Domain Adaptation and Few-Shot Learning
  • Autonomous Vehicle Technology and Safety
  • Natural Language Processing Techniques
  • Gait Recognition and Analysis
  • Multimodal Machine Learning Applications
  • Vehicular Ad Hoc Networks (VANETs)
  • Internet Traffic Analysis and Secure E-voting
  • Cryptography and Data Security
  • Remote Sensing and LiDAR Applications
  • Machine Fault Diagnosis Techniques
  • Soft Robotics and Applications
  • Explainable Artificial Intelligence (XAI)
  • Medical Image Segmentation Techniques
  • Robotic Path Planning Algorithms

Stanford University
2022-2024

Huaqiao University
2023

Chinese University of Hong Kong
2019-2023

Wuhan Textile University
2023

Southwest Jiaotong University
2022-2023

Tencent (China)
2022

Vaughn College of Aeronautics and Technology
2022

Shanghai Jiao Tong University
2019-2020

Zhejiang Sci-Tech University
2012

We survey applications of pretrained foundation models in robotics. Traditional deep learning models in robotics are trained on small datasets tailored for specific tasks, which limits their adaptability across diverse applications. In contrast, foundation models pretrained on internet-scale data appear to have superior generalization capabilities, and in some instances display an emergent ability to find zero-shot solutions to problems that are not present in the training data. Foundation models may hold the potential to enhance various components of the robot...

10.1177/02783649241281508 article EN The International Journal of Robotics Research 2024-09-25

Camera re-localization is an important but challenging task in applications like robotics and autonomous driving. Recently, retrieval-based methods have been considered a promising direction, as they can be easily generalized to novel scenes. Although significant progress has been made, we observe that the performance bottleneck of previous methods actually lies in the retrieval module. These methods use the same features for both retrieval and relative pose regression tasks, which have potential conflicts in learning. To this end, here we present...

10.1109/iccv.2019.00296 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Two-party split learning is a popular technique for learning a model across feature-partitioned data. In this work, we explore whether it is possible for one party to steal the private label information from the other party during split training, and whether there are methods that can protect against such attacks. Specifically, we first formulate a realistic threat model and propose a privacy loss metric to quantify label leakage in split learning. We then show that there exist two simple yet effective methods within the threat model that allow one party to accurately recover the ground-truth labels owned by the other party. To...

10.48550/arxiv.2102.08504 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Imitation learning from human demonstrations is a promising paradigm for teaching robots manipulation skills in the real world. However, learning complex long-horizon tasks often requires an unattainable amount of demonstrations. To reduce the high data requirement, we resort to human play data - video sequences of people freely interacting with the environment using their hands. Even with different morphologies, we hypothesize that human play data contain rich and salient information about physical interactions that can readily facilitate robot...

10.48550/arxiv.2302.12422 preprint EN other-oa arXiv (Cornell University) 2023-01-01

3D vehicle detection based on point cloud is a challenging task in real-world applications such as autonomous driving. Although significant progress has been made, we observe two aspects to be further improved. First, the semantic context information in LiDAR is seldom explored in previous works, which may help identify ambiguous vehicles. Second, the distribution of point cloud on vehicles varies continuously with increasing depths, which may not be well modeled by a single model. In this work, we propose a unified model SegVoxelNet...

10.1109/icra40945.2020.9196556 article EN 2020-05-01

Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from the observations. To facilitate robot perception with such a surrounding sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation as well as a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse the first-view observations into a top-down-view semantic map indicating the spatial location of all the objects at...

10.1109/lra.2020.3004325 article EN IEEE Robotics and Automation Letters 2020-06-23

In this work, we study the problem of how to leverage instructional videos to facilitate the understanding of human decision-making processes, focusing on training a model with the ability to plan a goal-directed procedure from real-world videos. Learning structured and plannable state and action spaces directly from unstructured videos is the key technical challenge of our task. There are two problems: first, the appearance gap between the training and validation datasets could be large for unstructured videos; second, these gaps lead to decision errors that compound...

10.1109/lra.2022.3150855 article EN IEEE Robotics and Automation Letters 2022-02-14

Accurate prediction of the health status is critical for the reliability and safety of lithium-ion batteries (LIBs). However, some methods do not consider physical information in battery capacity degradation and typically overlook the capacity regeneration phenomenon (CRP) in their predictions. In this study, a multi-resolution ensemble prediction method based on physics-informed deep learning for LIBs health status is proposed. Specifically, multi-resolution decomposition is performed on capacity degradation trends to analyze global and local features. Global features are...

10.1088/1361-6501/ada849 article EN Measurement Science and Technology 2025-01-09

We present SIREN for registration of multi-robot Gaussian Splatting (GSplat) maps, with zero access to camera poses, images, and inter-map transforms for initialization or fusion of local submaps. To realize these capabilities, SIREN harnesses the versatility and robustness of semantics in three critical ways to derive a rigorous registration pipeline for multi-robot GSplat maps. First, SIREN utilizes semantics to identify feature-rich regions of the local maps where the registration problem is better posed, eliminating the need for any initialization which is generally required in prior work. Second, SIREN identifies...

10.48550/arxiv.2502.06519 preprint EN arXiv (Cornell University) 2025-02-10

In many real-world applications where specifying a proper reward function is difficult, it is desirable to learn policies from expert demonstrations. Adversarial Inverse Reinforcement Learning (AIRL) is one of the most common approaches for learning from demonstrations. However, due to the stochastic policy, the current computation graph of AIRL is no longer end-to-end differentiable like Generative Adversarial Networks (GANs), resulting in the need for high-variance gradient estimation methods and large sample size. In this work, we propose Model-based...

10.1109/lra.2021.3061397 article EN IEEE Robotics and Automation Letters 2021-02-23

We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an extensive survey on "what do you want robots to do for you?". The first is the definition of 1,000 everyday activities, grounded in 50 scenes (houses, gardens, restaurants, offices, etc.) with more than 9,000 objects annotated with rich physical and semantic properties. The second is OMNIGIBSON, a novel simulation environment that supports these activities via...

10.48550/arxiv.2403.09227 preprint EN arXiv (Cornell University) 2024-03-14

Split learning is a distributed training framework that allows multiple parties to jointly train a machine learning model over vertically partitioned data (partitioned by attributes). The idea is that only intermediate computation results, rather than private features and labels, are shared between parties so that raw training data remains private. Nevertheless, recent works showed that the plaintext implementation of split learning suffers from severe privacy risks, in that a semi-honest adversary can easily reconstruct labels. In this work, we propose...

10.48550/arxiv.2203.02073 preprint EN cc-by arXiv (Cornell University) 2022-01-01

https://github.com/reasoning-survey/Awesome-Reasoning-Foundation-Models Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, there is growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models...

10.31219/osf.io/ac4sp preprint EN 2023-12-13

We survey applications of pretrained foundation models in robotics. Traditional deep learning models in robotics are trained on small datasets tailored for specific tasks, which limits their adaptability across diverse applications. In contrast, foundation models pretrained on internet-scale data appear to have superior generalization capabilities, and in some instances display an emergent ability to find zero-shot solutions to problems that are not present in the training data. Foundation models may hold the potential to enhance various components of the robot...

10.48550/arxiv.2312.07843 preprint EN cc-by arXiv (Cornell University) 2023-01-01

In this paper, we address the problem of forecasting the trajectory of an egocentric camera wearer (ego-person) in crowded spaces. The ability learned from the data of different camera wearers walking around in the real world can be transferred to assist visually impaired people in navigation, as well as to instill human navigation behaviours in mobile robots, enabling better human-robot interactions. To this end, a novel dataset was constructed, containing trajectories of people navigating crowded spaces while wearing a camera, as well as extracted rich contextual data....

10.1109/lra.2022.3188101 article EN IEEE Robotics and Automation Letters 2022-07-04

In this paper, we propose a novel framework, called Semi-supervised Embedding in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector representation that systematically captures the topological proximity, attribute affinity and label similarity of vertices in a partially labeled attributed network (PLAN). Our method is designed to work in both transductive and inductive settings while explicitly alleviating noise effects from outliers. Experimental results on various datasets drawn...

10.48550/arxiv.1703.08100 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Vertical federated learning (vFL) has gained much attention and been deployed to solve machine learning problems with data privacy concerns in recent years. However, some recent work demonstrated that vFL is vulnerable to privacy leakage even though only the forward intermediate embedding (rather than raw features) and backpropagated gradients (rather than raw labels) are communicated between the involved participants. As labels often contain highly sensitive information, some recent work has been proposed to prevent label leakage from the backpropagated gradients effectively in vFL. However, these works only identified and defended...

10.48550/arxiv.2203.01451 preprint EN other-oa arXiv (Cornell University) 2022-01-01

3D vehicle detection based on point cloud is a challenging task in real-world applications such as autonomous driving. Although significant progress has been made, we observe two aspects to be further improved. First, the semantic context information in LiDAR is seldom explored in previous works, which may help identify ambiguous vehicles. Second, the distribution of point cloud on vehicles varies continuously with increasing depths, which may not be well modeled by a single model. In this work, we propose a unified model SegVoxelNet...

10.48550/arxiv.2002.05316 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Grasping in cluttered scenes is challenging for robot vision systems, as detection accuracy can be hindered by partial occlusion of objects. We adopt a reinforcement learning (RL) framework and 3D vision architectures to search for feasible viewpoints for grasping by the use of hand-mounted RGB-D cameras. To overcome the disadvantages of photo-realistic environment simulation, we propose a large-scale dataset called Real Embodied Dataset (RED), which includes full-viewpoint real samples on the upper hemisphere with amodal...

10.1109/icra40945.2020.9197185 article EN 2020-05-01

Vertical Federated Learning (vFL) allows multiple parties that own different attributes (e.g. features and labels) of the same data entity (e.g. a person) to jointly train a model. To prepare the training data, vFL needs to identify the common data entities shared by all parties. It is usually achieved by Private Set Intersection (PSI), which identifies the intersection of training samples from all parties by using personal identifiable information (e.g. email) as sample IDs to align data instances. As a result, PSI would make sample IDs of the intersection visible to all parties, and therefore each party...

10.48550/arxiv.2106.05508 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Low-bit quantization makes it challenging to maintain high performance with limited model capacity (e.g., 4-bit for both weights and activations). Naturally, the distributions of both weights and activations in a deep neural network are Gaussian-like. Nevertheless, due to the limited bitwidth of a low-bit model, uniform-like distributed weights and activations have been shown to be more friendly to quantization while preserving accuracy. Motivated by this observation, we propose Scale-Clip, a Distribution Reshaping technique that can reshape weights or activations into a uniform-like distribution in a dynamic manner. Furthermore, to increase...

10.1109/cvprw50498.2020.00348 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020-06-01