NFDI4DS | UHH-SEMS - Publication Details

Chao Zhang

ORCID: 0000-0002-7730-5131

About

Contact & Profiles

A5100460206

Research Areas

Speech Recognition and Synthesis
Speech and Audio Processing
Music and Audio Processing
Natural Language Processing Techniques
Speech and dialogue systems
Topic Modeling
Sentiment Analysis and Opinion Mining
Text and Document Classification Technologies
Security and Verification in Computing
Emotion and Mood Recognition
Catalytic C–H Functionalization Methods
Advanced Malware Detection Techniques
Blockchain Technology Applications and Security
Analytical chemistry methods development
Neural Networks and Applications
Anomaly Detection Techniques and Applications
Misinformation and Its Impacts
Opinion Dynamics and Social Influence
Blind Source Separation Techniques
Analytical Chemistry and Sensors
Mass Spectrometry Techniques and Applications
Oxidative Organic Chemistry Reactions
Vanadium and Halogenation Chemistry
Face and Expression Recognition
Multimodal Machine Learning Applications

Tsinghua University
2013-2024

University of Cambridge
2018-2024

Jingdong (China)
2020-2023

Google (United States)
2023

Bridge University
2022

Georgia Institute of Technology
2020

Donghua University
2017

China Institute of Atomic Energy
2002

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications

OPENALEX - Publications

Chao Zhang, Zichao Yang, Xiaodong He, Li Deng

Deep learning methods have revolutionized speech recognition, image recognition, and natural language processing since 2010. Each of these tasks involves a single modality in its input signals. However, many applications in the artificial intelligence field involve multiple modalities. Therefore, it is of broad interest to study the more difficult and complex problem of modeling across multiple modalities. In this paper, we provide a technical review of available models for multimodal intelligence. The main focus is the combination of vision and natural language modalities,...

10.1109/jstsp.2020.2987728 article EN IEEE Journal of Selected Topics in Signal Processing 2020-03-01

Application of the Biological Conjugate between Antibody and Colloid Au Nanoparticles as Analyte to Inductively Coupled Plasma Mass Spectrometry

Chao Zhang, Zhenyu Zhang, Binbing Yu, Jinjun Shi, Xinrong Zhang

This paper describes a study of the atomization of nanoparticles by inductively coupled plasma mass spectrometry (ICPMS) and develops a novel nonisotopic immunoassay by coupling a sandwich-type immunoreaction to ICPMS. The goat-anti-rabbit immunoglobulin G (IgG) labeled with colloidal gold served as an analyte in ICPMS for the indirect measurement of rabbit-anti-human IgG. Matrix effect studies showed that the signal was not sensitive to the organic matrix. A relatively good correlation (r² = 0.9528) between the proposed...

10.1021/ac0103468 article EN Analytical Chemistry 2001-11-30

αDiff: cross-version binary code similarity detection with DNN

Bingchang Liu, Wei Huo, Chao Zhang, Wenchao Li, Feng Li and 2 more

Binary code similarity detection (BCSD) has many applications, including patch analysis, plagiarism detection, malware detection, and vulnerability search. Existing solutions usually perform comparisons over specific syntactic features extracted from binary code, based on expert knowledge. They have either high performance overheads or low accuracy. Moreover, few are suitable for detecting similarities between cross-version binaries, which may diverge not only in structures but also slightly in semantics.

10.1145/3238147.3238199 article EN 2018-08-20

Room-temperature Pd-catalyzed C–H chlorination by weak coordination: one-pot synthesis of 2-chlorophenols with excellent regioselectivity

Xiuyun Sun, Yonghui Sun, Chao Zhang, Yu Rao

A room-temperature Pd(II)-catalyzed regioselective chlorination reaction has been developed for a facile one-pot synthesis of a broad range of 2-chlorophenols. The reaction demonstrates excellent regioselectivity and reactivity for C–H chlorination. This represents one of the rare examples of mild C–H functionalization at ambient temperature.

10.1039/c3cc47431c article EN Chemical Communications 2013-11-26

Cobalt-catalyzed C–H activation and regioselective intermolecular annulation with allenes

Tianlei Li, Chao Zhang, Yonghua Tan, Weidong Pan, Yu Rao

An efficient Co(II)-catalyzed intermolecular annulation of N-(quinolin-8-yl)benzamide with allenes for the synthesis of novel isoquinolin-1(2H)-one scaffolds has been developed.

10.1039/c6qo00567e article EN Organic Chemistry Frontiers 2016-11-08

Emotion Recognition by Fusing Time Synchronous and Time Asynchronous Representations

Wen Wu, Chao Zhang, Philip C. Woodland

In this paper, a novel two-branch neural network model structure is proposed for multimodal emotion recognition, which consists of a time synchronous branch (TSB) and a time asynchronous branch (TAB). To capture correlations between each word and its acoustic realisation, the TSB combines speech and text modalities at each input window frame and then uses pooling across time to form a single embedding vector. The TAB, by contrast, provides cross-utterance information by integrating sentence embeddings from a number of context utterances...

10.1109/icassp39728.2021.9414880 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition

Xianrui Zheng, Chao Zhang, Philip C. Woodland

Language models (LMs) pre-trained on massive amounts of text, in particular bidirectional encoder representations from Transformers (BERT), generative pre-training (GPT), and GPT-2, have become a key technology for many natural language processing tasks. In this paper, we present results using fine-tuned GPT, GPT-2, and their combination for automatic speech recognition (ASR). Unlike the unidirectional LMs GPT and GPT-2, BERT is bidirectional, so the direct product of its output probabilities is no longer a valid prior probability. A conversion...

10.1109/asru51503.2021.9688232 article EN 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021-12-13

Connecting Speech Encoder and Large Language Model for ASR

Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan and 4 more

The impressive capability and versatility of large language models (LLMs) have aroused increasing attention in automatic speech recognition (ASR), with several pioneering studies attempting to build integrated ASR models by connecting a speech encoder with an LLM. This paper presents a comparative study of three commonly used structures as connectors, including fully connected layers, multi-head cross-attention, and Q-Former. Speech encoders from the Whisper model series as well as LLMs from the Vicuna model series with different sizes were studied....

10.1109/icassp48485.2024.10445874 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Pd(ii) catalyzed ortho C–H iodination of phenylcarbamates at room temperature using cyclic hypervalent iodine reagents

Xiuyun Sun, Xia Yao, Chao Zhang, Yu Rao

A novel approach to access ortho-iodinated phenols using cyclic hypervalent iodine reagents through palladium(II)-catalyzed C–H activation has been developed by weak coordination. The reaction showed excellent regioselectivity, reactivity, and good functional group tolerance. A unique mechanism was proposed.

10.1039/c5cc02533h article EN Chemical Communications 2015-01-01

Discriminative Neural Clustering for Speaker Diarisation

Qiujia Li, Florian Kreyssig, Chao Zhang, Philip C. Woodland

In this paper, we propose Discriminative Neural Clustering (DNC) that formulates data clustering with a maximum number of clusters as a supervised sequence-to-sequence learning problem. Compared to traditional unsupervised clustering algorithms, DNC learns clustering patterns from training data without requiring an explicit definition of a similarity measure. An implementation of DNC based on the Transformer architecture is shown to be effective on a speaker diarisation task using the challenging AMI dataset. Since AMI contains only 147 complete...

10.1109/slt48900.2021.9383617 article EN 2021 IEEE Spoken Language Technology Workshop (SLT) 2021-01-19

Finding Cracks in Shields: On the Security of Control Flow Integrity Mechanisms

Yuan Li, Mingzhe Wang, Chao Zhang, Xingman Chen, Songtao Yang and 1 more

Control-flow integrity (CFI) is a promising technique to mitigate control-flow hijacking attacks. In the past decade, dozens of CFI mechanisms have been proposed by researchers. Despite the claims made by the researchers themselves, the security promises of these mechanisms have not been carefully evaluated, and thus are questionable.

10.1145/3372297.3417867 article EN Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security 2020-10-30

Synthesis of a Highly Fluorescent β-Diketone−Europium Chelate and Its Utility in Time-Resolved Fluoroimmunoassay of Serum Total Thyroxine

Fengbo Wu, Shi-Quan Han, Chao Zhang, Youfeng He

A new highly fluorescent β-diketone−europium chelate was synthesized and employed as a tracer to develop a time-resolved fluoroimmunoassay (TRFIA) for the detection of serum total thyroxine (T4). The tetradentate β-diketone chelator, 1,10-bis(thiophene-2'-yl)-4,4,5,5,6,6,7,7-octafluorodecane-1,3,8,10-tetraone (BTOT), was structurally composed of two units of thenoyltrifluoroacetone (TTA) derivatives but expressed fluorescence that was greatly enhanced, compared to that of the original TTA molecules, in the presence of excess...

10.1021/ac025727f article EN Analytical Chemistry 2002-10-22

PACMem

Yuan Li, Wende Tan, Zhizheng Lv, Songtao Yang, Mathias Payer and 2 more

Memory safety is a key security property that stops memory corruption vulnerabilities. Different types of enforcement solutions have been proposed and adopted by sanitizers or mitigations to catch and stop such bugs, at the development or deployment phase. However, existing solutions either provide partial memory safety or have overwhelmingly high performance overheads.

10.1145/3548606.3560598 article EN Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security 2022-11-07

A new europium β-diketone chelate for ultrasensitive time-resolved fluorescence immunoassays

Fengbo Wu, Chao Zhang

10.1016/s0003-2697(02)00390-1 article EN Analytical Biochemistry 2002-12-01

Discriminative Neural Clustering for Speaker Diarisation

Qiujia Li, Florian Kreyssig, Chao Zhang, Philip C. Woodland

In this paper, we propose Discriminative Neural Clustering (DNC) that formulates data clustering with a maximum number of clusters as a supervised sequence-to-sequence learning problem. Compared to traditional unsupervised clustering algorithms, DNC learns clustering patterns from training data without requiring an explicit definition of a similarity measure. An implementation of DNC based on the Transformer architecture is shown to be effective on a speaker diarisation task using the challenging AMI dataset. Since AMI contains only 147 complete...

10.48550/arxiv.1910.09703 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Joint Aspect-Sentiment Analysis with Minimal User Guidance

Honglei Zhuang, Fang Guo, Chao Zhang, Liyuan Liu, Jiawei Han

Aspect-based sentiment analysis is a substantial step towards text understanding which benefits numerous applications. Since most existing algorithms require a large amount of labeled data or external language resources, applying them on a new domain is usually expensive and time-consuming. We aim to build an aspect-based sentiment analysis model from an unlabeled corpus with minimal guidance from users, i.e., only a small set of seed words for each aspect class and each sentiment class. We employ an autoencoder structure with attention to learn two dictionary...

10.1145/3397271.3401179 article EN 2020-07-25

Tree-Constrained Pointer Generator for End-to-End Contextual Speech Recognition

Guangzhi Sun, Chao Zhang, Philip C. Woodland

Contextual knowledge is important for real-world automatic speech recognition (ASR) applications. In this paper, a novel tree-constrained pointer generator (TCPGen) component is proposed that incorporates such knowledge as a list of biasing words into both attention-based encoder-decoder and transducer end-to-end ASR models in a neural-symbolic way. TCPGen structures the biasing words into an efficient prefix tree to serve as its symbolic input and creates a neural shortcut between the tree and the final output distribution to facilitate recognising...

10.1109/asru51503.2021.9687915 article EN 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021-12-13

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Sepand Mavandadi and 2 more

10.21437/interspeech.2022-11249 article EN Interspeech 2022 2022-09-16

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan and 4 more

Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory information consisting of at least three types of sounds: speech, audio events, and music. In this paper, we propose SALMONN, a speech audio language music open neural network, built by integrating a pre-trained text-based large language model (LLM) with speech and audio encoders into a single multimodal model. SALMONN enables the LLM to directly process and understand audio inputs and achieve...

10.48550/arxiv.2310.13289 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Graph Neural Networks for Contextual ASR With the Tree-Constrained Pointer Generator

Guangzhi Sun, Chao Zhang, Philip C. Woodland

Incorporating biasing words obtained through contextual knowledge is paramount in automatic speech recognition (ASR) applications. This paper proposes an innovative method for achieving end-to-end contextual ASR using graph neural network (GNN) encodings based on the tree-constrained pointer generator method. GNN node encodings facilitate lookahead for future word pieces in the process of decoding at each tree node by incorporating information about all branches rooted from it. This results in a more precise prediction of the generation...

10.1109/taslp.2024.3389645 article EN cc-by IEEE/ACM Transactions on Audio Speech and Language Processing 2024-01-01

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Chengwei Qin, Pin‐Yu Chen and 2 more

We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) systems in diverse target domains, such as noise and accents. STAR is developed for prevalent speech foundation models based on Transformer-related architecture with auto-regressive decoding (e.g., Whisper, Canary). Specifically, we propose a novel indicator that empirically integrates step-wise information during decoding to assess the token-level...

10.48550/arxiv.2405.14161 preprint EN arXiv (Cornell University) 2024-05-23

ICP-MS-based competitive immunoassay for the determination of total thyroxin in human serum

Chao Zhang, Fengbo Wu, Xinrong Zhang

To further expand the range of analytes that can be detected by using ICP-MS coupled with bioanalytical methods, we have employed a new separation system based on highly active surface streptavidin and biotinylated monoclonal antibody (McAb) in a competitive immunoassay followed by ICP-MS detection. Specifically, we demonstrated its application for the determination of total thyroxine (T4) in human serum using Eu³⁺ as a label. In this method, immobilized to pre-coated bovine serum albumin (BSA)-biotin microwells showed...

10.1039/b205623b article EN Journal of Analytical Atomic Spectrometry 2002-08-27

Combination of deep speaker embeddings for diarisation

Guangzhi Sun, Chao Zhang, Philip C. Woodland

10.1016/j.neunet.2021.04.020 article EN Neural Networks 2021-04-21

BET: black-box efficient testing for convolutional neural networks

Jialai Wang, Han Qiu, Yi Rong, Hengkai Ye, Qi Li and 2 more

It is important to test convolutional neural networks (CNNs) to identify defects (e.g. error-inducing inputs) before deploying them in security-sensitive scenarios. Although existing white-box testing methods can effectively test CNN models with high neuron coverage, they are not applicable to privacy-sensitive scenarios where full knowledge of the target CNN models is lacking. In this work, we propose a novel Black-box Efficient Testing (BET) method for CNN models. The core insight of BET is that CNNs are generally prone to be affected...

10.1145/3533767.3534386 article EN 2022-07-15

Combining hybrid DNN-HMM ASR systems with attention-based models using lattice rescoring

Qiujia Li, Chao Zhang, Philip C. Woodland

The traditional hybrid deep neural network (DNN)–hidden Markov model (HMM) system and the attention-based encoder–decoder (AED) model are both commonly used automatic speech recognition (ASR) approaches with distinct characteristics and advantages. While hybrid systems are per-frame-based and highly modularised to leverage external phonetic and linguistic knowledge, AED models operate on a per-label basis and jointly learn the acoustic and language information using a single model in an end-to-end trainable fashion. In this paper, we...

10.1016/j.specom.2022.12.002 article EN cc-by Speech Communication 2022-12-25
