- Advanced Bandit Algorithms Research
- Reinforcement Learning in Robotics
- Military Defense Systems Analysis
- Guidance and Control Systems
- Adversarial Robustness in Machine Learning
- Optimization and Search Problems
- UAV Applications and Optimization
- Aerospace and Aviation Technology
- Musculoskeletal Pain and Rehabilitation
- Smart Grid Energy Management
- Auction Theory and Applications
- Robotics and Sensor-Based Localization
- Data Stream Mining Techniques
- Pain Management and Opioid Use
- Machine Learning and ELM
- Simulation Techniques and Applications
- Age of Information Optimization
- Machine Learning and Algorithms
- Influenza Virus Research Studies
- Advanced Image Processing Techniques
- Game Theory and Voting Systems
- Spacecraft Dynamics and Control
- Face and Expression Recognition
- Pain Management and Placebo Effect
- Image Processing Techniques and Applications
Institute of Electronics
2024
Nanjing University of Information Science and Technology
2020-2024
Tsinghua University
2019
This study aimed to investigate the pain situation, functional limitations, treatments used, care-seeking behaviors, and educational preferences of adults with musculoskeletal pain in mainland China. An online questionnaire was developed through expert validation, and participants were recruited via social media platforms. Inclusion criteria required access to the Internet via smartphones, while individuals with significant cognitive impairments or severe mental illness were excluded. Of 1566 participants, predominantly male (951) a...
We study the problem of allocating T indivisible items that arrive online to agents with additive valuations. The allocation must satisfy a prominent fairness notion, envy-freeness up to one item (EF1), at each round. To make this possible, we allow the reallocation of previously allocated items, but aim to minimize these so-called adjustments. For the case of two agents, we show that algorithms that are informed about the values of future items can get by without any adjustments, whereas uninformed algorithms require Theta(T) adjustments. For the general case of three or...
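The EF1 condition used in this abstract can be sketched as a simple check: under additive valuations, agent i should value its own bundle at least as much as any other agent's bundle after removing that bundle's single most valuable item (in i's eyes). This is an illustrative check only, not the paper's online algorithm; the value-matrix representation is an assumption.

```python
import numpy as np

def is_ef1(values, bundles):
    """Check envy-freeness up to one item (EF1) for additive valuations.

    values[i][g] : agent i's value for item g
    bundles[i]   : set of item indices currently held by agent i
    """
    n = len(bundles)
    for i in range(n):
        own = sum(values[i][g] for g in bundles[i])
        for j in range(n):
            if i == j:
                continue
            other = sum(values[i][g] for g in bundles[j])
            # most valuable single item in j's bundle, from i's perspective
            best = max((values[i][g] for g in bundles[j]), default=0)
            if own < other - best:
                return False
    return True

# two agents, four items: neither agent envies the other up to one item
vals = [[5, 1, 3, 2],
        [2, 4, 1, 5]]
is_ef1(vals, [{0, 2}, {1, 3}])  # True
```

In the online setting studied in the paper, such a check would be run after each arriving item, with reallocations ("adjustments") made when no EF1-preserving assignment exists.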
Semi-supervised learning has been proven to be effective in utilizing unlabeled samples to mitigate the problem of limited labeled data. Traditional semi-supervised methods generate pseudo-labels for unlabeled samples and train a classifier using both labeled and pseudo-labeled samples. However, in data-scarce scenarios, reliance on the initial pseudo-label generation can degrade performance. Methods based on consistency regularization have shown promising results by encouraging consistent outputs for different semantic variations of the same sample obtained...
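The consistency-regularization idea mentioned above can be sketched as a loss term that penalizes disagreement between a model's predictions on two augmented views of the same unlabeled sample. This is a minimal illustration; the model, augmentations, and loss form used in the paper are not specified here, so these names are assumptions.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_weak, logits_strong):
    """Consistency regularization sketch: penalize the squared difference
    between predicted class distributions for two augmented views
    (e.g., weak vs. strong augmentation) of the same unlabeled sample."""
    p = softmax(logits_weak)
    q = softmax(logits_strong)
    return float(np.mean((p - q) ** 2))
```

During training, this term would be added to the supervised loss on labeled data, encouraging the classifier to be invariant to semantic-preserving perturbations.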
We study federated contextual linear bandits, where $M$ agents cooperate with each other to solve a global bandit problem with the help of a central server. We consider the asynchronous setting, where all agents work independently and communication between one agent and the server does not trigger other agents' communication. We propose a simple algorithm named \texttt{FedLinUCB} based on the principle of optimism. We prove that the regret is bounded by $\tilde{O}(d\sqrt{\sum_{m=1}^M T_m})$ and the communication complexity is $\tilde{O}(dM^2)$, where $d$ is the dimension of the context vector and $T_m$...
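The per-agent optimistic selection underlying LinUCB-style algorithms can be sketched as follows. This is a generic single-agent sketch, not the paper's \texttt{FedLinUCB}: the exact confidence radius and the asynchronous synchronization rule with the server are simplified assumptions.

```python
import numpy as np

class LinUCBAgent:
    """Minimal optimistic linear bandit agent (illustrative sketch)."""

    def __init__(self, d, lam=1.0, beta=1.0):
        self.A = lam * np.eye(d)   # regularized Gram matrix
        self.b = np.zeros(d)       # feature-weighted reward sum
        self.beta = beta           # confidence-radius scale

    def choose(self, contexts):
        # pick the arm maximizing the upper confidence bound
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        bonus = np.sqrt(np.einsum('ij,jk,ik->i', contexts, A_inv, contexts))
        return int(np.argmax(contexts @ theta + self.beta * bonus))

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x
```

In the federated setting, each agent would accumulate local $(A, b)$ increments and exchange them with the central server only occasionally, which is what keeps the communication complexity low.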
Command and control in modern air combat shows a pressing demand for intelligent decision-making. Research on decision-making techniques relies on data accumulation through simulations, which still remains underdeveloped at present. In this paper we introduce a tactical-level simulation system for decision-making, which can simulate combat between formations and has multiple application modes, namely Man-Man, Man-Machine, and Machine-Machine. This system can collect diversified fine-grained data and provide support for the training of air-combat...
In emergency search and rescue scenarios, the quick location of trapped people is essential. However, disasters can render the Global Positioning System (GPS) unusable. Unmanned aerial vehicles (UAVs) with localization devices can serve as mobile anchors due to their agility and high line-of-sight (LoS) probability. Nonetheless, the number of available UAVs during the initial stages of disaster relief is limited, so innovative methods are needed to quickly plan UAV trajectories to locate non-uniformly distributed dynamic targets...
Aligning large language models (LLMs) with human preferences plays a key role in building modern generative models and can be achieved by reinforcement learning from human feedback (RLHF). Despite their superior performance, current RLHF approaches often require a large amount of human-labelled preference data, which is expensive to collect. In this paper, inspired by the success of active learning, we address this problem by proposing query-efficient RLHF methods. We first formalize the alignment problem as a contextual dueling bandit and design an...
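The contextual dueling bandit formulation typically models pairwise preferences with a Bradley-Terry-style link, and a query-efficient method can prioritize the most uncertain comparisons. The sketch below illustrates that idea; the feature representation and the "closest to 0.5" selection heuristic are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def preference_prob(theta, x_a, x_b):
    """Bradley-Terry style preference model commonly used in dueling-bandit
    formulations of RLHF: P(a preferred over b) = sigmoid(<theta, x_a - x_b>)."""
    z = theta @ (x_a - x_b)
    return 1.0 / (1.0 + np.exp(-z))

def most_informative_pair(theta, pairs):
    """Active-learning heuristic (assumption): query the duel whose predicted
    preference probability is closest to 0.5, i.e., the most uncertain one."""
    probs = [preference_prob(theta, a, b) for a, b in pairs]
    return int(np.argmin([abs(p - 0.5) for p in probs]))
```

Querying human labelers only on such uncertain pairs is what makes the overall procedure query-efficient.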
We study constant regret guarantees in reinforcement learning (RL). Our objective is to design an algorithm that incurs only finite regret over infinite episodes with high probability. We introduce an algorithm, Cert-LSVI-UCB, for misspecified linear Markov decision processes (MDPs), where both the transition kernel and the reward function can be approximated by some linear function up to a misspecification level $\zeta$. At the core of Cert-LSVI-UCB is an innovative certified estimator, which facilitates a fine-grained concentration...
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLMs). However, the effectiveness of this approach can be influenced by adversaries, who may intentionally provide misleading preferences to manipulate the output in an undesirable or harmful direction. To tackle this challenge, we study a specific model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary. We propose...
Although Convolutional Neural Networks (CNNs) have significantly advanced SAR image super-resolution (SR) technology in recent years, it remains very challenging to reconstruct images with large-scale factors, such as ×4 and ×8, due to the limited information available from the low-resolution image. Co-registered high-resolution optical imagery has been successfully applied to enhance reconstruction quality through its discriminative characteristics. Compared with single-frame SR reconstruction technology, image-guided methods achieve better...
We study multi-agent reinforcement learning in the setting of episodic Markov decision processes, where multiple agents cooperate via communication through a central server. We propose a provably efficient algorithm based on value iteration that enables asynchronous communication while preserving the advantage of cooperation with low communication overhead. With linear function approximation, we prove that our algorithm enjoys an $\tilde{\mathcal{O}}(d^{3/2}H^2\sqrt{K})$ regret with $\tilde{\mathcal{O}}(dHM^2)$ communication complexity, where $d$ is the feature dimension, $H$...
This study tackles the challenges of adversarial corruption in model-based reinforcement learning (RL), where the transition dynamics can be corrupted by an adversary. Existing studies on corruption-robust RL mostly focus on the setting of model-free RL, where robust least-squares regression is often employed for value function estimation. However, these techniques cannot be directly applied to model-based RL. In this paper, we focus on model-based RL and take the maximum likelihood estimation (MLE) approach to learn the transition model. Our work encompasses both online...
We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs), whose transition probability can be parameterized as a linear function of a given feature mapping, we propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde O(d\sqrt{H^3K})$, where $d$ is the feature dimension, $H$ is the planning horizon, and $K$ is the number of episodes. Our algorithm is based on a weighted regression scheme with a carefully designed weight, which...
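The weighted regression step at the heart of such algorithms can be sketched as weighted ridge regression. This is a generic sketch only: in the paper the weights are derived from carefully designed variance estimates, which are not reproduced here.

```python
import numpy as np

def weighted_ridge(X, y, w, lam=1.0):
    """Weighted ridge regression sketch:
        theta = (X^T W X + lam * I)^{-1} X^T W y
    where W = diag(w). Down-weighting high-variance samples is what yields
    the sharper, variance-aware confidence sets used by such algorithms."""
    d = X.shape[1]
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X + lam * np.eye(d), X.T @ W @ y)
```

With all weights equal this reduces to ordinary ridge regression; the regret improvement comes from choosing the weights adaptively.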
We study the linear contextual bandit problem in the presence of adversarial corruption, where the reward at each round is corrupted by an adversary, and the corruption level (i.e., the sum of corruption magnitudes over the horizon) is $C\geq 0$. The best-known algorithms for this setting are limited in that they are either computationally inefficient, require a strong assumption on the corruption, or have regret at least $C$ times worse than without corruption. In this paper, to overcome these limitations, we propose a new algorithm based on the principle of optimism in the face...
Recently, several studies (Zhou et al., 2021a; Zhang et al., 2021b; Kim et al., 2021; Zhou and Gu, 2022) have provided variance-dependent regret bounds for linear contextual bandits, which interpolate between the worst-case regime and the deterministic reward regime. However, these algorithms are either computationally intractable or unable to handle unknown variance of the noise. In this paper, we present a novel solution to this open problem by proposing the first computationally efficient algorithm for linear bandits with heteroscedastic noise. Our algorithm is adaptive to the unknown variance of the noise...
We study linear contextual bandits in the misspecified setting, where the expected reward function can be approximated by a linear function class up to a bounded misspecification level $\zeta>0$. We propose an algorithm based on a novel data selection scheme, which only selects the contextual vectors with large uncertainty for online regression. We show that, when $\zeta$ is dominated by $\tilde O (\Delta / \sqrt{d})$, with $\Delta$ being the minimal sub-optimality gap and $d$ the dimension of the contextual vectors, our algorithm enjoys the same gap-dependent regret bound...
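The uncertainty-based data selection described in this abstract can be sketched with the standard elliptical uncertainty measure: a context $x$ is kept for regression only if $x^\top A^{-1} x$ exceeds a threshold, where $A$ is the Gram matrix of previously selected contexts. The threshold value and selection rule below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def select_uncertain(A, x, tau):
    """Keep context x for online regression only if its elliptical
    uncertainty x^T A^{-1} x exceeds the threshold tau."""
    return float(x @ np.linalg.solve(A, x)) > tau

# Gram matrix grows only with the selected contexts:
A = np.eye(2)
x = np.array([1.0, 0.0])
if select_uncertain(A, x, 0.5):
    A += np.outer(x, x)
```

Discarding low-uncertainty contexts limits how much the misspecification error can accumulate in the regression, which is what enables the gap-dependent regret bound.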