NFDI4DS | UHH-SEMS - Publication Details

Efficient Multi-Scale Attention Module with Cross-Spatial Learning

OPENALEX - Publications

Daliang Ouyang, Su He, Guozhong Zhang, Mingzhu Luo, Huaiyong Guo, and 2 more

The remarkable effectiveness of channel or spatial attention mechanisms for producing more discernible feature representations has been illustrated in various computer vision tasks. However, modeling cross-channel relationships with channel dimensionality reduction may bring side effects when extracting deep visual representations. In this paper, a novel efficient multi-scale attention (EMA) module is proposed. Focusing on retaining the information in each channel and decreasing the computational overhead, we reshape part of the channels into...

10.1109/icassp49357.2023.10096516 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Enhancing Knowledge Tracing via Adversarial Training

Xiaopeng Guo, Zhijie Huang, Jie Gao, Mingyu Shang, Maojing Shu, and 1 more

We study the problem of knowledge tracing (KT), where the goal is to trace students' knowledge mastery over time so as to make predictions on their future performance. Owing to the good representation capacity of deep neural networks (DNNs), recent advances in KT have increasingly concentrated on exploring DNNs to improve the performance of KT. However, we empirically reveal that DNN-based KT models may run the risk of overfitting, especially on small datasets, leading to limited generalization. In this paper, by leveraging the current advances in adversarial...

10.1145/3474085.3475554 article EN Proceedings of the 29th ACM International Conference on Multimedia 2021-10-17

One-for-All: An Efficient Variable Convolution Neural Network for In-Loop Filter of VVC

Zhijie Huang, Jun Sun, Xiaopeng Guo, Mingyu Shang

Recently, many studies on convolution neural network (CNN) based in-loop filters have been proposed to improve coding efficiency. However, most existing CNN-based in-loop filters tend to train and deploy multiple networks for various quantization parameters (QP) and frame types (FT), which drastically increases the resources needed for training these models and the memory burden on the video codec. In this paper, we propose a novel variable convolution neural network (VCNN) based in-loop filter for VVC, which can effectively handle compressed videos with different QPs and FTs via a single...

10.1109/tcsvt.2021.3089498 article EN IEEE Transactions on Circuits and Systems for Video Technology 2021-06-15

An Energy-Efficient Configurable 1-D CNN-Based Multi-Lead ECG Classification Coprocessor for Wearable Cardiac Monitoring Devices

Chen Zhang, Zhijie Huang, Changchun Zhou, Ao Qie, Xin’an Wang

Many electrocardiogram (ECG) processors have been widely used for cardiac monitoring. However, most of them have relatively low energy efficiency and lack configurability in the number of classification leads and in the inference algorithm models. A multi-lead ECG coprocessor is proposed in this paper, which can perform efficient ECG anomaly detection. In order to achieve high sensitivity and positive precision in R-peak detection, a method based on zero-crossing slope adaptive threshold comparison is proposed. Also,...

10.1109/tbcas.2025.3530790 article EN IEEE Transactions on Biomedical Circuits and Systems 2025-01-01

Adaptive Deep Reinforcement Learning-Based In-Loop Filter for VVC

Zhijie Huang, Jun Sun, Xiaopeng Guo, Mingyu Shang

Deep learning-based in-loop filters have recently demonstrated great improvement in both coding efficiency and subjective quality in video coding. However, most existing deep learning-based in-loop filters tend to develop a sophisticated model in exchange for good performance, and they employ a single network structure for all reconstructed samples, which lacks sufficient adaptiveness to the various video content, limiting their performance to some extent. In contrast, this paper proposes an adaptive deep reinforcement learning-based in-loop filter (ARLF) for versatile video coding (VVC)....

10.1109/tip.2021.3084345 article EN IEEE Transactions on Image Processing 2021-01-01

STRANet: Soft-Target and Restriction-Aware Neural Network for Efficient VVC Intra Coding

Tianyi Sun, Yanze Wang, Zhijie Huang, Jun Sun

10.1109/tcsvt.2024.3428474 article EN IEEE Transactions on Circuits and Systems for Video Technology 2024-07-15

Multi-Gradient Convolutional Neural Network Based In-Loop Filter for VVC

Zhijie Huang, Yunchang Li, Jun Sun

While recent research on convolutional neural network (CNN) based in-loop filters for High Efficiency Video Coding (HEVC) has achieved great success, the performance of these models on the new standard Versatile Video Coding (VVC) may degrade due to the many newly adopted techniques, which make the compression process finer and capture more image details. In this work, the performance on VVC of two CNN in-loop filters proposed for HEVC is investigated and a multi-gradient convolutional neural network based in-loop filter (MGNLF) is proposed. The model exploits the divergence and second derivative...

10.1109/icme46284.2020.9102826 article EN 2020 IEEE International Conference on Multimedia and Expo (ICME) 2020-06-09

LSTM-based Deep Learning Models for Answer Ranking

Zhenzhen Li, Jiuming Huang, Zhongcheng Zhou, Haoyu Zhang, Chang Shoufeng, and 1 more

The learning problem of ranking arises in many tasks, including question answering, information retrieval, and movie recommendation. In these tasks, the ordering of the answers, documents, or movies returned is a critical aspect of the system. Recently, deep learning approaches have gained a lot of attention from the research community and industry for their ability to automatically learn the optimal feature representation for a given task. We aim to solve the answer ranking problem in a practical question answering system with deep learning approaches. In this paper, we define a composite representation for questions...

10.1109/dsc.2016.37 article EN 2016-06-01

An Efficient QP Variable Convolutional Neural Network Based In-loop Filter for Intra Coding

Zhijie Huang, Xiaopeng Guo, Mingyu Shang, Jie Gao, Jun Sun

In this paper, a novel QP variable convolutional neural network based in-loop filter is proposed for VVC intra coding. To avoid training and deploying multiple networks, we develop an efficient QP attention module (QPAM) which can capture compression noise levels for different QPs and emphasize meaningful features along the channel dimension. Then we embed QPAM into the residual block and, based on it, design a network architecture that is equipped with controllability for different QPs. To make the model focus more on examples that have more artifacts or are hard to...

10.1109/dcc50243.2021.00011 article EN 2021-03-01

Question Similarity Modeling with Bidirectional Long Short-Term Memory Neural Network

Chao An, Jiuming Huang, Chang Shoufeng, Zhijie Huang

Modeling sentence similarity has long been a challenging task in the field of natural language processing (NLP) because of the ambiguity and variability of linguistic expression. Specifically, in community question answering (CQA), the corresponding research hotspot is question retrieval. To get the question most similar to the user's query, we propose a question model built with Bidirectional Long Short-Term Memory (BLSTM) neural networks, which can also be used in other fields, such as sentence similarity computation, paraphrase detection, and so on. We...

10.1109/dsc.2016.13 article EN 2016-06-01

FastCNN: Towards Fast and Accurate Spatiotemporal Network for HEVC Compressed Video Enhancement

Zhijie Huang, Jun Sun, Xiaopeng Guo

Deep neural networks have achieved remarkable success in HEVC compressed video quality enhancement. However, most existing multiframe-based methods either deliver unsatisfactory results or consume a significant amount of resources to leverage the temporal information of neighboring frames. For the sake of practicality, a thorough investigation of the architecture design of the video quality enhancement network regarding performance, model parameters, and running speed is essential. In this article, we first propose an efficient...

10.1145/3569583 article EN ACM Transactions on Multimedia Computing Communications and Applications 2022-10-27

Pedestrian Detection Algorithm in Video Analysis Based on Centrist

Zhijie Huang

Due to the application requirement of single-source tracking in a dynamic background, it is difficult to get accurate pedestrian target detection. By comparing the performance and quality of the CENTRIST and HOG describing operators, we adopt CENTRIST feature extraction and combine it with an off-line SVM classifier to train the model for pedestrian detection. Furthermore, we propose to utilize the idea of edge importance classification to deepen the contour feature. Meanwhile, we put forward a scheme to remove local texture and background noise to improve the algorithm detection...

10.1109/icitbs.2016.95 article EN 2016-12-01

GTA: a new General Tensor Accelerator with Better Area Efficiency and Data Reuse

Chenyang Ai, Lechuan Zhao, Zhijie Huang, Cangyuan Li, Xinan Wang, and 1 more

Recently, tensor algebra has witnessed significant applications across various domains. Each operator in tensor algebra features a different computational workload and precision. However, current general accelerators, such as VPU, GPGPU, and CGRA, support tensor operators with low energy and area efficiency. This paper conducts an in-depth exploration of a general accelerator for tensor processing. First, we find the similarity between matrix multiplication and precision multiplication, and create a method for classifying tensor operators. Then, we implement...

10.48550/arxiv.2405.02196 preprint EN arXiv (Cornell University) 2024-05-03

Exploration for Efficient Depthwise Separable Convolution Networks Deployment on FPGA

Zhijie Huang, Ao Qie, Chen Zhang, Jie Yang, Xin’an Wang

10.1109/aicas59952.2024.10595964 article EN 2024-04-22

MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation

Xinda Wu, Zhijie Huang, Kejun Zhang, Jiaxing Yu, Xu Tan, and 3 more

Pre-trained language models have achieved impressive results in various music understanding and generation tasks. However, existing pre-training methods for symbolic melody generation struggle to capture multi-scale, multi-dimensional structural information in note sequences, due to the domain knowledge discrepancy between text and music. Moreover, the lack of available large-scale symbolic melody datasets limits the pre-training improvement. In this paper, we propose MelodyGLM, a multi-task pre-training framework for generating melodies with long-term structure....

10.48550/arxiv.2309.10738 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Efficient Intra Coding through Hierarchical CU Partition Prediction for VVC

Tianyi Sun, Yanze Wang, Zhijie Huang, Jun Sun

Versatile Video Coding (VVC) has adopted a quad-tree with nested multi-type tree (QTMT) partition structure to improve the rate-distortion (RD) performance, but this greatly increases the complexity due to the brute-force RDO process used to search for the best partition type. Some methods cannot fully utilize the information of deeper CUs when training, while others utilize the large CUs as a whole and may neglect the specific information inside smaller CUs. Therefore, in this paper, we propose a learning-based approach to predict the partition modes of every CU by utilizing the information of deeper CUs. Firstly, the model...

10.1109/vcip59821.2023.10402786 article EN 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP) 2023-12-04

Generalized Compressed Video Restoration by Multi-Scale Temporal Fusion and Hierarchical Quality Score Estimation

Zhijie Huang, Tianyi Sun, Xiaopeng Guo, Yanze Wang, Jun Sun

Learning-based methods have achieved excellent performance for compressed video restoration (CVR) in recent years. However, existing networks aggregate multi-frame information inefficiently and are usually developed for specific quantization parameters (QPs), which is not convenient for practical usage. Moreover, current works only consider the Constant QP (CQP) setting, but do not discuss the performance of the model in more realistic scenarios, e.g., Constant Rate Factor (CRF) and Constant Bitrate (CBR). In this paper, we propose a generalized...

10.1109/icme55011.2023.00092 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

An Efficient Logo Insertion Method for Video Coding in HEVC

Yunchang Li, Zhijie Huang, Jun Sun

Inserting a logo into HEVC video streams is highly demanded in video applications. In this paper, we present an efficient logo insertion method for video coding in HEVC. To reduce the impact of the inserted logo, the proposed method mitigates the encoding dependence on the logo by partitioning the frame into separated regions. For the lossless region, we reduce the bit rate overhead according to the error propagation model. Information reusing is applied to partly re-encode the quality-loss area to maintain the quality. The computational complexity is reduced by reusing the information. Experimental results...

10.1109/mmsp.2019.8901816 article EN 2019-09-01

An Efficient QP Variable Convolutional Neural Network Based In-loop Filter for Intra Coding

Zhijie Huang, Xiaopeng Guo, Mingyu Shang, Jie Gao, Jun Sun

In this paper, a novel QP variable convolutional neural network based in-loop filter is proposed for VVC intra coding. To avoid training and deploying multiple networks, we develop an efficient QP attention module (QPAM) which can capture compression noise levels for different QPs and emphasize meaningful features along the channel dimension. Then we embed QPAM into the residual block and, based on it, design a network architecture that is equipped with controllability for different QPs. To make the model focus more on examples that have more artifacts or are hard to...

10.48550/arxiv.2012.15003 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Enhancing Knowledge Tracing via Adversarial Training

Xiaopeng Guo, Zhijie Huang, Jie Gao, Mingyu Shang, Maojing Shu, and 1 more

We study the problem of knowledge tracing (KT), where the goal is to trace students' knowledge mastery over time so as to make predictions on their future performance. Owing to the good representation capacity of deep neural networks (DNNs), recent advances in KT have increasingly concentrated on exploring DNNs to improve the performance of KT. However, we empirically reveal that DNN-based KT models may run the risk of overfitting, especially on small datasets, leading to limited generalization. In this paper, by leveraging the current advances in adversarial...

10.48550/arxiv.2108.04430 preprint EN other-oa arXiv (Cornell University) 2021-01-01