Kai Fan

ORCID: 0000-0002-8256-0807
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Generative Adversarial Networks and Image Synthesis
  • Speech Recognition and Synthesis
  • Bayesian Methods and Mixture Models
  • Gaussian Processes and Bayesian Inference
  • Markov Chains and Monte Carlo Methods
  • Evolutionary Algorithms and Applications
  • Machine Learning and Algorithms
  • Anomaly Detection Techniques and Applications
  • Image and Signal Denoising Methods
  • Metaheuristic Optimization Algorithms Research
  • Advanced Database Systems and Queries
  • Complex Systems and Time Series Analysis
  • Model Reduction and Neural Networks
  • Advanced Text Analysis Techniques
  • Advanced Image Processing Techniques
  • Data Mining Algorithms and Applications
  • Biomedical Text Mining and Ontologies
  • Sparse and Compressive Sensing Techniques
  • AI and Big Data Applications
  • Stock Market Forecasting Methods
  • Internet Traffic Analysis and Secure E-voting
  • Random Matrices and Applications

University of International Business and Economics
2021-2023

Alibaba Group (United States)
2018-2023

Alibaba Group (China)
2021-2022

Alibaba Group (Cayman Islands)
2018-2022

North China Institute of Aerospace Engineering
2021

Tongji University
2021

Shenzhen Institutes of Advanced Technology
2020

China Southern Power Grid (China)
2020

Duke University
2015-2019

Harbin University of Science and Technology
2019

The Generative Adversarial Network (GAN) has achieved great success in generating realistic (real-valued) synthetic data. However, convergence issues and difficulties dealing with discrete data hinder the applicability of GAN to text. We propose a framework for generating realistic text via adversarial training. We employ a long short-term memory network as the generator and a convolutional network as the discriminator. Instead of using the standard objective of GAN, we propose matching the high-dimensional latent feature distributions of real and synthetic sentences via a kernelized...

10.48550/arxiv.1706.03850 preprint EN other-oa arXiv (Cornell University) 2017-01-01
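The feature-matching objective sketched in the abstract above can be illustrated with a kernel discrepancy between discriminator features of real and generated sentences. The following is a minimal numpy sketch of a squared MMD with an RBF kernel; the feature matrices, the kernel choice, and the bandwidth are my own illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel matrix between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(real_feats, fake_feats, sigma=1.0):
    # Squared maximum mean discrepancy between two feature samples;
    # a generator trained under this objective drives it toward zero.
    kxx = rbf_kernel(real_feats, real_feats, sigma).mean()
    kyy = rbf_kernel(fake_feats, fake_feats, sigma).mean()
    kxy = rbf_kernel(real_feats, fake_feats, sigma).mean()
    return kxx + kyy - 2 * kxy

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(64, 16))  # stand-in for CNN features of real sentences
fake = rng.normal(0.5, 1.0, size=(64, 16))  # stand-in for features of generated sentences
print(mmd2(real, fake))  # positive when the two feature distributions differ
```

In the paper's setting the features would come from the convolutional discriminator rather than being sampled directly.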

Abstract For many biological applications, exploration of the massive parametric space of a mechanism-based model can impose a prohibitive computational demand. To overcome this limitation, we present a framework to improve computational efficiency by orders of magnitude. The key concept is to train a neural network using a limited number of simulations generated by a mechanistic model. This number is small enough that the simulations can be completed in a short time frame but large enough to enable reliable training. The trained neural network is then used to explore a much larger parametric space. We...

10.1038/s41467-019-12342-y article EN cc-by Nature Communications 2019-09-25
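The surrogate idea above can be sketched in a few lines: simulate a small training set with the expensive model, fit a cheap emulator, then screen a far larger parameter space with the emulator alone. In this toy sketch the "mechanistic model" is a hypothetical one-parameter kinetics function, and a polynomial least-squares fit stands in for the neural network the paper uses.

```python
import numpy as np

rng = np.random.default_rng(1)

def mechanistic_model(k):
    # Hypothetical expensive simulation: a toy saturating-kinetics response.
    return k / (1.0 + k) ** 2

# Step 1: a small, affordable batch of real simulations.
k_train = rng.uniform(0.1, 10.0, size=200)
y_train = mechanistic_model(k_train)

# Step 2: fit a cheap surrogate (polynomial least squares in log-parameter,
# standing in for the neural network used in the paper).
X = np.vander(np.log(k_train), N=8)
coef, *_ = np.linalg.lstsq(X, y_train, rcond=None)

# Step 3: screen a much larger parameter space with the surrogate only.
k_big = rng.uniform(0.1, 10.0, size=100_000)
y_pred = np.vander(np.log(k_big), N=8) @ coef
```

The economics are the point: 200 expensive simulations buy 100,000 cheap surrogate evaluations.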

Unsupervised neural machine translation (UNMT) has recently achieved remarkable results \cite{lample2018phrase} with only large monolingual corpora in each language. However, the uncertainty of associating target with source sentences makes UNMT theoretically an ill-posed problem. This work investigates the possibility of utilizing images for disambiguation to improve the performance of UNMT. Our assumption is intuitively based on the invariant property of the image, i.e., the description of the same visual content by different...

10.1109/cvpr.2019.01073 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

The performance of machine translation (MT) systems is usually evaluated by the metric BLEU when golden references are provided. However, in the case of model inference or production deployment, golden references are expensive to obtain, requiring human annotation with bilingual expertise. In order to address the issue of quality estimation (QE) without reference, we propose a general framework for automatic evaluation of MT output for the QE task of the Conference on Statistical Machine Translation (WMT). We first build a conditional target language...

10.1609/aaai.v33i01.33016367 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Recent advances in sequence modeling have highlighted the strengths of the transformer architecture, especially in achieving state-of-the-art machine translation results. However, depending on up-stream systems, e.g., speech recognition or word segmentation, the input to the translation system can vary greatly. The goal of this work is to extend the attention mechanism to naturally consume a lattice in addition to the traditional sequential input. We first propose a general framework for speech translation, where the input is the output of automatic speech recognition (ASR), which contains multiple...

10.18653/v1/p19-1649 preprint EN cc-by 2019-01-01

We present a deep generative model for learning to predict classes not seen at training time. Unlike most existing methods for this problem, which represent each class as a point (via a semantic embedding), we represent each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes. We use these latent-space distributions as a prior for a supervised variational autoencoder (VAE), which also facilitates learning highly discriminative feature representations of the inputs. The entire framework is learned end-to-end using only...

10.48550/arxiv.1711.05820 preprint EN other-oa arXiv (Cornell University) 2017-01-01
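One way to write the class-conditional latent prior described above, in notation of my own choosing rather than the paper's:

```latex
% Class-specific latent-space prior conditioned on the attribute vector a_c
% of class c; \mu_\psi and \sigma^2_\psi are learned networks (my notation).
p(\mathbf{z} \mid c) \;=\;
  \mathcal{N}\!\left(\mathbf{z};\; \mu_\psi(\mathbf{a}_c),\;
  \mathrm{diag}\!\left(\sigma^2_\psi(\mathbf{a}_c)\right)\right)
```

Because the prior is a function of attributes rather than a lookup over seen classes, it can be evaluated for unseen classes at test time.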

Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions. For practical reasons, the family of distributions in VI is usually constrained so that it does not include the exact posterior, even as a limit point. Thus, no matter how long VI is run, the resulting approximation will not approach the exact posterior. We propose instead to consider a more flexible approximating family consisting of all possible finite...

10.48550/arxiv.1611.05559 preprint EN other-oa arXiv (Cornell University) 2016-01-01
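The "all possible finite..." family the abstract is describing appears to be finite mixtures over a simpler base family; written out (notation mine):

```latex
% Approximating family of all finite mixtures over a base family Q_0.
\mathcal{Q} \;=\; \Big\{\, q \;:\; q(\theta) = \sum_{i=1}^{T} w_i\, q_i(\theta),\;\;
  T \in \mathbb{N},\;\; w_i \ge 0,\;\; \sum_{i=1}^{T} w_i = 1,\;\; q_i \in \mathcal{Q}_0 \,\Big\}
```

Since finite mixtures are dense in a much larger class of distributions, a boosting-style procedure that greedily adds one component at a time can, in principle, approach the exact posterior rather than stopping at the best member of a fixed family.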

Distant supervised relation extraction has been successfully applied to large corpora with thousands of relations. However, the inevitable wrong labeling problem introduced by distant supervision will hurt the performance of relation extraction. In this paper, we propose a method with a neural noise converter to alleviate the impact of noisy data, and a conditional optimal selector to make proper predictions. Our noise converter learns a structured transition matrix on the logit level and captures the property of the distant supervised dataset. The conditional optimal selector, on the other hand, helps make the prediction decision on an entity...

10.1609/aaai.v33i01.33017273 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17
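The transition-matrix idea above can be sketched as follows: the model's clean-label distribution is mapped through a row-stochastic matrix to a distribution over the noisy labels that distant supervision actually provides. The matrix values and the four-label setup here are purely illustrative; in the paper the matrix is learned, and the conversion operates at the logit level.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical predicted distribution over 4 relation labels for one sentence.
logits = np.array([2.0, 0.5, -1.0, 0.1])
p_true = softmax(logits)

# Row-stochastic noise model: T[i, j] = P(noisy label j | true label i).
# Learned in the paper; fixed here for illustration.
T = np.array([
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.02, 0.08, 0.10, 0.80],
])

# Distribution over the *observed* (possibly noisy) labels used for training;
# the training loss is computed against p_noisy, while p_true is kept for inference.
p_noisy = p_true @ T
```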

Many document-level neural machine translation (NMT) systems have explored the utility of context-aware architecture, usually requiring an increasing number of parameters and computational complexity. However, little attention is paid to the baseline model. In this paper, we research extensively the pros and cons of the standard transformer in document-level translation, and find that the auto-regressive property can simultaneously bring both the advantage of consistency and the disadvantage of error accumulation. Therefore, we propose a surprisingly simple...

10.18653/v1/2020.emnlp-main.81 article EN cc-by 2020-01-01

Recent advancements in large language models (LLMs) have substantially enhanced their mathematical reasoning abilities. However, these models still struggle with complex problems that require multiple reasoning steps, frequently leading to logical or numerical errors. While calculation mistakes can largely be addressed by integrating a code interpreter, identifying logical errors within intermediate reasoning steps is more challenging. Moreover, manually annotating these steps for training is not only expensive but also demands specialized expertise....

10.48550/arxiv.2405.03553 preprint EN arXiv (Cornell University) 2024-05-06

Learning in deep models using Bayesian methods has generated significant attention recently. This is largely because of the feasibility of modern Bayesian methods to yield scalable learning and inference, while maintaining a measure of uncertainty in the model parameters. Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning. In SG-MCMC, multivariate stochastic gradient Nosé-Hoover thermostats (mSGNHT) augment each parameter of interest with a momentum and a thermostat variable to maintain stationary...

10.1609/aaai.v30i1.10199 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2016-02-21
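For reference, the per-parameter thermostat updates in standard mSGNHT take roughly the following form (following Ding et al.'s SGNHT with elementwise thermostats; notation mine, step size $h$, stochastic gradient $\nabla\tilde{U}$, diffusion constant $D$):

```latex
\theta_{t+1} = \theta_t + p_t\, h
\qquad
p_{t+1} = p_t - \xi_t \odot p_t\, h - \nabla \tilde{U}(\theta_t)\, h
          + \sqrt{2 D h}\, \zeta_t, \quad \zeta_t \sim \mathcal{N}(0, I)
\qquad
\xi_{t+1} = \xi_t + \left(p_t \odot p_t - 1\right) h
```

The thermostat $\xi$ adaptively absorbs the unknown stochastic-gradient noise: whenever the kinetic energy per parameter drifts from its target value of 1, the friction adjusts to restore it.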

The Hierarchical Graph-Coupled Hidden Markov Model (hGCHMM) is a useful tool for tracking and predicting the spread of contagious diseases, such as influenza, by leveraging social contact data collected from individual wearable devices. However, existing inference algorithms depend on the assumption that infection rates are small in probability, typically close to 0. The purpose of this paper is to build a unified learning framework for latent state estimation in the hGCHMM, regardless of the infection rate and transition function. We derive...

10.1609/aaai.v30i1.9894 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2016-03-05

The purpose of this study is to leverage modern technology (mobile or web apps) to enrich epidemiology data and infer the transmission of disease. We develop hierarchical Graph-Coupled Hidden Markov Models (hGCHMMs) to simultaneously track the spread of infection in a small cell phone community and capture person-specific infection parameters by leveraging a link prior that incorporates additional covariates. In this paper we investigate two link functions, the beta-exponential link and the sigmoid link, both of which allow the development of a principled...

10.1145/2783258.2783326 article EN 2015-08-07

We propose a new method that uses deep learning techniques to accelerate the popular alternating direction method of multipliers (ADMM) solution for inverse problems. The ADMM updates consist of a proximity operator, a least squares regression that includes a big matrix inversion, and an explicit solution for updating the dual variables. Typically, inner loops are required to solve the first two sub-minimization problems, due to the intractability of the prior and the matrix inversion. To avoid such drawbacks or limitations, we propose an inner-loop free update rule with...

10.48550/arxiv.1709.01841 preprint EN other-oa arXiv (Cornell University) 2017-01-01
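The three ADMM steps the abstract refers to, written for a generic regularized inverse problem $\min_x \tfrac{1}{2}\lVert y - Ax \rVert^2 + \lambda R(x)$ with splitting $x = z$ (my notation, not the paper's):

```latex
x^{k+1} = \left(A^{\top}A + \rho I\right)^{-1}
          \left(A^{\top} y + \rho\,(z^{k} - u^{k})\right)
          \quad \text{(least squares; big matrix inversion)}
\qquad
z^{k+1} = \operatorname{prox}_{(\lambda/\rho) R}\!\left(x^{k+1} + u^{k}\right)
          \quad \text{(proximity operator)}
\qquad
u^{k+1} = u^{k} + x^{k+1} - z^{k+1}
          \quad \text{(explicit dual update)}
```

The first two steps are the ones that normally require inner iterative solvers; replacing them with learned, inner-loop-free updates is the acceleration being proposed.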

We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well. This is accomplished by generalizing the gradient computation in stochastic backpropagation via the reparametrization trick with lower complexity. As an illustrative example, we apply this approach to the problems of Bayesian logistic regression and variational auto-encoder (VAE). Additionally, we compute bounds on the estimator...

10.48550/arxiv.1509.02866 preprint EN other-oa arXiv (Cornell University) 2015-01-01
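The reparametrization trick the abstract builds on can be shown in a few lines of numpy: writing $z = \mu + \sigma\varepsilon$ with $\varepsilon \sim \mathcal{N}(0,1)$ turns gradients of $\mathbb{E}_q[f(z)]$ into ordinary Monte Carlo averages of pathwise derivatives. The integrand $f(z) = z^2$ is my own toy choice, picked because $\mathbb{E}[f] = \mu^2 + \sigma^2$ gives exact gradients to compare against.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(z):   # illustrative integrand; E[f] = mu^2 + sigma^2 for z ~ N(mu, sigma^2)
    return z ** 2

def df(z):  # its derivative, used for the pathwise gradient
    return 2 * z

mu, sigma = 1.0, 0.5
eps = rng.standard_normal(200_000)
z = mu + sigma * eps  # reparametrized sample: z ~ N(mu, sigma^2)

# Pathwise Monte Carlo gradient estimates:
grad_mu = df(z).mean()             # estimates d/dmu    E[f] = 2*mu    = 2.0
grad_sigma = (df(z) * eps).mean()  # estimates d/dsigma E[f] = 2*sigma = 1.0
```

The paper's contribution extends this first-order machinery to second-order (Hessian-based) information; the sketch above only shows the underlying gradient trick.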

Massage therapy (MT) is a useful complementary and alternative therapy widely used in treating low back pain (LBP), including lumbar disc herniation (LDH). However, few studies have revealed quantitative entropy-based features of electroencephalography (EEG) for MT effectiveness in LDH patients. This study investigated the immediate effects of Chinese massage on four EEG rhythms, using eight entropy measures (approximate entropy (ApEn), sample entropy (SampEn), wavelet entropy (WaveEn), Hilbert-Huang Transform marginal spectrum...

10.1109/access.2020.2964050 article EN cc-by IEEE Access 2020-01-01
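Of the entropy measures listed above, sample entropy (SampEn) is easy to state concretely: count template matches of length m and m+1 under a Chebyshev tolerance, then take the negative log of their ratio. This is a generic textbook implementation (with the common r = 0.2·std convention), not the paper's exact feature pipeline.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    # Sample entropy SampEn(m, r) of a 1-D series; r is a fraction of
    # the series' standard deviation (a common convention).
    x = np.asarray(x, dtype=float)
    tol = r * x.std()

    def count_matches(k):
        # All length-k templates, compared with the Chebyshev distance.
        templates = np.array([x[i:i + k] for i in range(len(x) - k + 1)])
        d = np.abs(templates[:, None, :] - templates[None, :, :]).max(-1)
        n = len(templates)
        # Ordered pairs (i != j) within tolerance, excluding self-matches.
        return (d <= tol).sum() - n

    a = count_matches(m + 1)
    b = count_matches(m)
    return -np.log(a / b)

rng = np.random.default_rng(3)
regular = np.sin(np.linspace(0, 20 * np.pi, 500))  # highly regular signal
noise = rng.standard_normal(500)                   # irregular signal
print(sample_entropy(regular), sample_entropy(noise))
```

A regular signal scores low SampEn and white noise scores high, which is why such features can separate physiological states in EEG.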

Cross-Lingual Summarization (CLS) is a task that extracts important information from a source document and summarizes it into a summary in another language. It is challenging because it requires a system to understand, summarize, and translate at the same time, making it highly related to Monolingual Summarization (MS) and Machine Translation (MT). In practice, the training resources for machine translation are far richer than those for cross-lingual and monolingual summarization. Thus, incorporating a machine translation corpus into CLS training would be beneficial for its performance. However, present work only...

10.1145/3477495.3532071 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

A triple of speech translation data comprises speech, transcription, and translation. In the end-to-end paradigm, text machine translation (MT) usually plays the role of a teacher model for speech translation (ST) via knowledge distillation. Parameter sharing with the teacher is often adopted to construct the ST model architecture; however, the two modalities are independently fed and trained with different losses. This situation does not match ST’s properties across the two modalities and also limits the upper bound of the performance. Inspired by works on video Transformer, we propose a simple...

10.18653/v1/2023.acl-short.153 article EN cc-by 2023-01-01