NFDI4DS | UHH-SEMS - Publication Details

Conghui Tan

ORCID: 0000-0003-3993-4751

About

Contact & Profiles

A5101874795

Research Areas

Sparse and Compressive Sensing Techniques
Privacy-Preserving Technologies in Data
Stochastic Gradient Optimization Techniques
Music and Audio Processing
Speech Recognition and Synthesis
Speech and Audio Processing
Tensor decomposition and applications
Face and Expression Recognition
Mobile Crowdsensing and Crowdsourcing
Markov Chains and Monte Carlo Methods
Advanced Optimization Algorithms Research
Recommender Systems and Techniques
Advanced Bandit Algorithms Research
Optimization and Variational Analysis
Topic Modeling
Statistical Methods and Inference
Hydrological Forecasting Using AI
Metaheuristic Optimization Algorithms Research
Data Quality and Management
Acoustic Wave Phenomena Research
Complexity and Algorithms in Graphs
Digital Media Forensic Detection
Water resources management and optimization
Artificial Intelligence in Healthcare and Education

West Bengal Electronics Industry Development Corporation Limited (India)
2019-2020

Chinese University of Hong Kong
2016-2020

ITT Technical Institute
2005

Central Server Free Federated Learning over Single-sided Trust Social Networks

OPENALEX - Publications

Chaoyang He, Conghui Tan, Hanlin Tang, Shuang Qiu, Ji Liu

Federated learning has become increasingly important for modern machine learning, especially for data privacy-sensitive scenarios. Existing federated learning mostly adopts the central server-based architecture or centralized architecture. However, in many social network scenarios, centralized federated learning is not applicable (e.g., a central agent or server connecting all users may not exist, or the communication cost to the central server is not affordable). In this paper, we consider a generic setting: 1) the central server may not exist, and 2) the social network is unidirectional or of single-sided trust (i.e., user A trusts user B but...

10.48550/arxiv.1910.04956 preprint EN cc-by arXiv (Cornell University) 2019-01-01

Barzilai-Borwein Step Size for Stochastic Gradient Descent

Conghui Tan, Shiqian Ma, Yu‐Hong Dai, Yuqiu Qian

One of the major issues in stochastic gradient descent (SGD) methods is how to choose an appropriate step size while running the algorithm. Since the traditional line search technique does not apply for stochastic optimization algorithms, the common practice in SGD is either to use a diminishing step size, or to tune a fixed step size by hand, which can be time consuming in practice. In this paper, we propose to use the Barzilai-Borwein (BB) method to automatically compute step sizes for SGD and its variant: stochastic variance reduced gradient (SVRG) method, which leads to two algorithms: SGD-BB...

10.48550/arxiv.1605.04131 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Heterogeneous Latent Topic Discovery for Semantic Text Mining

Yawen Li, Di Jiang, Rongzhong Lian, Xueyang Wu, Conghui Tan, and 2 more

In order to mine latent semantics from text data, word embedding and topic modeling are two major methodologies in industry. From a pragmatic perspective, each of these two lines of semantic models faces increasing challenges in real-life applications. However, modern text mining tasks typically require a panoramic view of the latent semantics. Hence, discovering heterogeneous latent semantics (e.g., heterogeneous types of topics) is critical for the performance of text mining tasks, and it is necessary to design a model that meets this demand. Furthermore, with the arrival of big data...

10.1109/tkde.2021.3077025 article EN IEEE Transactions on Knowledge and Data Engineering 2021-01-01

Fast and Secure Distributed Nonnegative Matrix Factorization

Yuqiu Qian, Conghui Tan, Danhao Ding, Hui Li, Nikos Mamoulis

Nonnegative matrix factorization (NMF) has been successfully applied in several data mining tasks. Recently, there is an increasing interest in the acceleration of NMF, due to its high cost on large matrices. On the other hand, the privacy issue of NMF over federated data is worthy of attention, since NMF is prevalently applied in image and text analysis, which may involve leveraging privacy data (e.g., medical image and record) across several parties (e.g., hospitals). In this paper, we study...

10.1109/tkde.2020.2985964 article EN IEEE Transactions on Knowledge and Data Engineering 2020-04-10

A GDPR-compliant Ecosystem for Speech Recognition with Transfer, Federated, and Evolutionary Learning

Di Jiang, Conghui Tan, Jinhua Peng, Chaotao Chen, Xueyang Wu, and 7 more

Automatic Speech Recognition (ASR) is playing a vital role in a wide range of real-world applications. However, commercial ASR solutions are typically “one-size-fits-all” products, and clients are inevitably faced with the risk of severe performance degradation in field tests. Meanwhile, with new data regulations such as the European Union’s General Data Protection Regulation (GDPR) coming into force, ASR vendors, which traditionally utilize the speech training data in a centralized approach, are becoming increasingly helpless to solve...

10.1145/3447687 article EN ACM Transactions on Intelligent Systems and Technology 2021-05-05

Neural Moderation of ASMR Erotica Content in Social Networks

Yixin Chen, Di Jiang, Conghui Tan, Yuanfeng Song, Chen Zhang, and 1 more

With the popularity of video/audio streaming applications in recent years, the wide spread of Autonomous Sensory Meridian Response (ASMR) erotica content is becoming a serious issue in social networks. Due to the subtle nature of ASMR erotica and its relative rareness in real scenarios, detecting ASMR erotica content is a challenging task. In this article, we propose a novel neural framework for ASMR erotica moderation. The proposed framework consists of a pipeline of strategies to tackle the challenges unique to ASMR erotica content, such as data scarcity and imbalanced data. Based on...

10.1109/tkde.2023.3283501 article EN IEEE Transactions on Knowledge and Data Engineering 2023-06-07

A De Novo Divide-and-Merge Paradigm for Acoustic Model Optimization in Automatic Speech Recognition

Conghui Tan, Di Jiang, Jinhua Peng, Xueyang Wu, Qian Xu, and 1 more

Due to the rising awareness of privacy protection and the voluminous scale of speech data, it is becoming infeasible for Automatic Speech Recognition (ASR) system developers to train the acoustic model with complete data as before. In this paper, we propose a novel Divide-and-Merge paradigm to solve the salient problems plaguing the ASR field. In the Divide phase, multiple acoustic models are trained based upon different subsets of the complete speech data, while in the Merge phase two novel algorithms are utilized to generate a high-quality acoustic model based upon those trained on data subsets. We first propose the Genetic...

10.24963/ijcai.2020/513 article EN 2020-07-01

A Study of Applying Genetic Algorithm to Predict Reservoir Water Quality

L. Chen, Mohammad S. Jamal, Conghui Tan, Basmah Alabbadi

This paper is aimed at demonstrating a genetic algorithm method and applying it to predict the water quality of a reservoir in Taiwan island using remote sensing data. Genetic algorithms will be combined with operation tree (GAOT) to find the relationships between input data and output data. A fittest function type will be obtained automatically from this method. The advantages of GA are global optimization, nonlinearity, flexibility and parallelism. In the current case study, GAOT is used to construct the relationship between algae concentration and Landsat...

10.7763/ijmo.2017.v7.566 article EN International Journal of Modeling and Optimization 2017-04-01

Application of a sequential pattern learning system to connected speech recognition

Alice E. Smith, Jeffrey N. Denenberg, Thomas B. Slack, Conghui Tan, R. Wohlford

An Experimental Learning Element (ELE) for learning and recognizing sequential patterns is being developed as an adaptable pattern classifier of a larger system. Once external patterns are converted into a linear sequence of named objects, the ELE can build models that associate input object sequences with expected output state sequences. The ELE has been successfully demonstrated in recognizing hand-printed characters. This paper describes the ELE and compares its performance with a Dynamic Time Warp (DTW) based speech recognition system...

10.1109/icassp.1985.1168282 article EN 2005-03-23

DSANLS

Yuqiu Qian, Conghui Tan, Nikos Mamoulis, David W. Cheung

Nonnegative matrix factorization (NMF) has been successfully applied in different fields, such as text mining, image processing, and video analysis. NMF is the problem of determining two nonnegative low rank matrices U and V, for a given input matrix M, such that M ≈ UV⊤. There is an increasing interest in parallel and distributed NMF algorithms, due to the high cost of centralized NMF on large matrices. In this paper, we propose a distributed sketched alternating nonnegative least squares (DSANLS) framework for NMF, which utilizes a matrix sketching technique to reduce...

10.1145/3159652.3159662 article EN 2018-02-02

Accelerated dual-averaging primal–dual method for composite convex minimization

Conghui Tan, Yuqiu Qian, Shiqian Ma, Tong Zhang

Dual averaging-type methods are widely used in industrial machine learning applications due to their ability to promote solution structure (e.g., sparsity) efficiently. In this paper, we propose a novel accelerated dual-averaging primal-dual algorithm for minimizing a composite convex function. We also derive a stochastic version of the proposed method which solves empirical risk minimization, and its advantages in handling sparse data are demonstrated both theoretically and empirically.

10.1080/10556788.2020.1713779 article EN Optimization methods & software 2020-01-20

Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation

Victor Junqiu Wei, W.B. Wang, Di Jiang, Conghui Tan, Rongzhong Lian

Due to the rising awareness of privacy protection and the voluminous scale of speech data, it is becoming infeasible for Automatic Speech Recognition (ASR) system developers to train the acoustic model with complete data as before. For example, the data may be owned by different curators, and it is not allowed to share it with others. In this paper, we propose a novel paradigm to solve salient problems plaguing the ASR field. In the first stage, multiple acoustic models are trained based upon different subsets of the complete speech data, while in the second phase, two novel algorithms are utilized to generate...

10.48550/arxiv.2410.15620 preprint EN arXiv (Cornell University) 2024-10-20

Stochastic Primal-Dual Method for Empirical Risk Minimization with $\mathcal{O}(1)$ Per-Iteration Complexity

Conghui Tan, Tong Zhang, Shiqian Ma, Liu Ji

The regularized empirical risk minimization problem with a linear predictor appears frequently in machine learning. In this paper, we propose a new stochastic primal-dual method to solve this class of problems. Different from existing methods, our proposed methods only require O(1) operations in each iteration. We also develop a variance-reduction variant of the algorithm that converges linearly. Numerical experiments suggest that our methods are faster than existing ones such as proximal SGD, SVRG and SAGA on high-dimensional...

10.48550/arxiv.1811.01182 preprint EN other-oa arXiv (Cornell University) 2018-01-01

A Platform for Deploying the TFE Ecosystem of Automatic Speech Recognition

Yuanfeng Song, Rongzhong Lian, Yixin Chen, Di Jiang, Xuefang Zhao, and 3 more

Since data regulations such as the European Union's General Data Protection Regulation (GDPR) have taken effect, the traditional two-step Automatic Speech Recognition (ASR) optimization strategy (i.e., training a one-size-fits-all model with the vendor's centralized data and fine-tuning the model with clients' private data) has become infeasible. To meet these privacy requirements, TFE, a novel GDPR-compliant ASR ecosystem, has been proposed by us to incorporate transfer learning, federated learning, and evolutionary learning towards...

10.1145/3503161.3547731 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10
