Junnan Li

ORCID: 0000-0002-1405-2034
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Video Surveillance and Tracking Methods
  • Machine Learning and Data Classification
  • Anomaly Detection Techniques and Applications
  • AI in cancer detection
  • Gait Recognition and Analysis
  • COVID-19 diagnosis using AI
  • Brain Tumor Detection and Classification
  • Microplastics and Plastic Pollution
  • Advanced Image and Video Retrieval Techniques
  • Software Engineering Research
  • Caching and Content Delivery
  • Network Packet Processing and Optimization
  • Artificial Intelligence in Healthcare
  • Cloud Computing and Resource Management
  • IoT and Edge/Fog Computing
  • Network Security and Intrusion Detection
  • Privacy-Preserving Technologies in Data
  • Natural Language Processing Techniques
  • Nanoparticles: synthesis and applications
  • Topic Modeling
  • Medical Imaging and Analysis

Manchester University
2024

Shanghai Jiao Tong University
2015-2023

Salesforce (United States)
2019-2023

Singapore-HUJ Alliance for Research and Enterprise
2022

National University of Defense Technology
2020-2022

National University of Singapore
2016-2020

Fudan University
2018-2020

Xi'an Jiaotong University
2018

Yunnan University
2016

This paper presents Prototypical Contrastive Learning (PCL), an unsupervised representation learning method that addresses the fundamental limitations of instance-wise contrastive learning. PCL not only learns low-level features for the task of instance discrimination, but more importantly, it implicitly encodes semantic structures of the data into the learned embedding space. Specifically, we introduce prototypes as latent variables to help find the maximum-likelihood estimation of the network parameters in...

10.48550/arxiv.2005.04966 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Semi-supervised learning has been an effective paradigm for leveraging unlabeled data to reduce the reliance on labeled data. We propose CoMatch, a new semi-supervised learning method that unifies dominant approaches and addresses their limitations. CoMatch jointly learns two representations of the training data: their class probabilities and low-dimensional embeddings. The two representations interact with each other to jointly evolve. The embeddings impose a smoothness constraint on the class probabilities to improve the pseudo-labels, whereas the pseudo-labels regularize the structure...

10.1109/iccv48922.2021.00934 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations. First, they often adopt a specific architecture (encoder-only or decoder-only) or rely on a unified encoder-decoder network for different downstream tasks, lacking the flexibility to operate in the optimal architecture for a specific task. Secondly, they often employ a limited set of pretraining objectives which might not be relevant to some tasks and hence result in substantial performance degradation....

10.18653/v1/2023.emnlp-main.68 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Data heterogeneity across clients in federated learning (FL) settings is a widely acknowledged challenge. In response, personalized federated learning (PFL) emerged as a framework to curate local models for clients' tasks. In PFL, a common strategy is to develop local and global models jointly - the global model (for generalization) informs the local models, and the local models (for personalization) are aggregated to update the global model. A key observation is that if we can improve the generalization ability of local models, then we can improve the generalization of global models, which in turn builds better personalized models. In this work, we consider class imbalance, an...

10.1609/aaai.v37i6.25891 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

Recent successes in human action recognition with deep learning methods mostly adopt the supervised learning paradigm, which requires a significant amount of manually labeled data to achieve good performance. However, label collection is an expensive and time-consuming process. In this work, we propose an unsupervised learning framework, which exploits unlabeled data to learn video representations. Different from previous works in video representation learning, our unsupervised learning task is to predict 3D motion in multiple target views using a video representation from a source view. By...

10.48550/arxiv.1809.01844 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We introduce Reward-Guided Speculative Decoding (RSD), a novel framework aimed at improving the efficiency of inference in large language models (LLMs). RSD synergistically combines a lightweight draft model with a more powerful target model, incorporating a controlled bias to prioritize high-reward outputs, in contrast to existing speculative decoding methods that enforce strict unbiasedness. RSD employs a process reward model to evaluate intermediate decoding steps and dynamically decide whether to invoke the target model, optimizing the trade-off...

10.48550/arxiv.2501.19324 preprint EN arXiv (Cornell University) 2025-01-31

Training deep learning based video classifiers for action recognition requires a large amount of labeled videos. The labeling process is labor-intensive and time-consuming. On the other hand, a large amount of weakly-labeled images are uploaded to the Internet by users every day. To harness the rich and highly diverse set of Web images, a scalable approach is to crawl these images to train a deep learning based classifier, such as Convolutional Neural Networks (CNN). However, due to the domain shift problem, the performance of deep classifiers trained on Web images tends to degrade when directly deployed to videos. One way...

10.1145/3123266.3123432 article EN Proceedings of the 25th ACM International Conference on Multimedia 2017-10-19

Dielectric elastomer actuators (DEAs) have been widely employed as artificial muscles in soft robots. Due to material viscoelasticity and nonlinear electromechanical coupling, it is challenging to accurately model a viscoelastic DEA, especially when the actuator is of a complex or irregular configuration. Control of DEAs is thus challenging but significant. In this letter, we propose a model-free method for control of DEAs, based on deep reinforcement learning. We perform dynamic feedback control by considering time-dependent...

10.1109/lra.2019.2898710 article EN IEEE Robotics and Automation Letters 2019-02-11

Self-supervised feature representations have been shown to be useful for supervised classification, few-shot learning, and adversarial robustness. We show that features obtained using self-supervised learning are comparable to, or better than, supervised learning for domain generalization in computer vision. We introduce a new self-supervised pretext task of predicting responses to Gabor filter banks and demonstrate that multi-task learning of compatible pretext tasks improves domain generalization performance as compared to training individual tasks alone. Features learnt through...

10.48550/arxiv.2003.13525 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions. However, the methods used by existing multimodal pre-training frameworks to gather multimodal data for 3D applications lack scalability and comprehensiveness, potentially constraining the full potential of multimodal learning. The main bottleneck lies in the language modality's scalability and comprehensiveness. To address this, we introduce ULIP-2, a tri-modal pre-training framework that leverages...

10.48550/arxiv.2305.08275 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

10.1007/s11263-020-01295-1 article EN International Journal of Computer Vision 2020-02-03

The recent development of commodity 360° cameras has enabled a single video to capture an entire scene, which offers promising potential in surveillance scenarios. However, research in omnidirectional video analysis has lagged behind the hardware advances. In this work, we address the important problem of action recognition in top-view 360° videos. Due to the wide field-of-view, 360° videos usually capture multiple people performing actions at the same time. Furthermore, the appearance of people is deformed. The proposed framework first transforms...

10.1109/wacv45572.2020.9093283 article EN 2020-03-01

Training deep object detectors requires a significant amount of human-annotated images with accurate object labels and bounding box coordinates, which are extremely expensive to acquire. Noisy annotations are much more easily accessible, but they could be detrimental for learning. We address the challenging problem of training object detectors with noisy annotations, where the noise contains a mixture of label noise and bounding box noise. We propose a learning framework which jointly optimizes object labels, bounding box coordinates, and model parameters by performing alternating noise correction and model training. To...

10.48550/arxiv.2003.01285 preprint EN other-oa arXiv (Cornell University) 2020-01-01