- Video Coding and Compression Technologies
- Advanced Image Processing Techniques
- Advanced Vision and Imaging
- Image and Video Quality Assessment
- Topic Modeling
- Adversarial Robustness in Machine Learning
- Expert finding and Q&A systems
- Advanced Image and Video Retrieval Techniques
- Machine Learning in Healthcare
- Intelligent Tutoring Systems and Adaptive Learning
- Natural Language Processing Techniques
- Image Enhancement Techniques
- Image and Video Stabilization
- Computational Physics and Python Applications
- Power Systems Fault Detection
- Music and Audio Processing
- Vibration and Dynamic Analysis
- Music Technology and Sound Studies
- Industrial Vision Systems and Defect Detection
- Islanding Detection in Power Systems
- Video Surveillance and Tracking Methods
- AI-based Problem Solving and Planning
- Tensor decomposition and applications
- Video Analysis and Summarization
- Image Processing Techniques and Applications
Peking University
2019-2025
The remarkable effectiveness of channel or spatial attention mechanisms for producing more discernible feature representations has been illustrated in various computer vision tasks. However, modeling cross-channel relationships with channel dimensionality reduction may bring side effects in extracting deep visual representations. In this paper, a novel efficient multi-scale attention (EMA) module is proposed. Focusing on retaining the information on each channel and decreasing the computational overhead, we reshape part of the channels into...
We study the problem of knowledge tracing (KT), where the goal is to trace students' knowledge mastery over time so as to make predictions on their future performance. Owing to the good representation capacity of deep neural networks (DNNs), recent advances in KT have increasingly concentrated on exploring DNNs to improve the performance of KT. However, we empirically reveal that DNN-based KT models may run the risk of overfitting, especially on small datasets, leading to limited generalization. In this paper, by leveraging the current advances in adversarial...
Recently, many convolutional neural network (CNN) based in-loop filters have been proposed to improve coding efficiency. However, most existing CNN-based filters tend to train and deploy multiple networks for various quantization parameters (QP) and frame types (FT), which drastically increases the resources needed to train these models and the memory burden on the video codec. In this paper, we propose a novel variable CNN (VCNN) based in-loop filter for VVC, which can effectively handle compressed videos with different QPs and FTs via a single...
Many electrocardiogram (ECG) processors have been widely used for cardiac monitoring. However, most of them have relatively low energy efficiency, and the lack of configurability in classification leads to a low number of inference algorithm models. A multi-lead ECG coprocessor is proposed in this paper, which can perform efficient ECG anomaly detection. In order to achieve high sensitivity and positive precision in R-peak detection, a method based on zero-crossing slope and adaptive threshold comparison is proposed. Also,...
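The abstract above names the two ingredients of its R-peak detector (a zero crossing of the slope and a comparison against an adaptive threshold) but is truncated before any detail. The following is only a minimal software sketch of that general idea, not the coprocessor's actual method. The function name `detect_r_peaks`, the parameters `threshold_ratio` and `refractory_s`, and the 0.875/0.125 smoothing weights are all assumptions chosen for illustration.

```python
def detect_r_peaks(signal, fs, threshold_ratio=0.5, refractory_s=0.2):
    """Return sample indices of candidate R-peaks in a 1-D ECG trace.

    Hypothetical sketch: a sample is a candidate when the slope changes
    from positive to non-positive (a zero crossing of the first
    difference) and its amplitude exceeds a threshold that adapts to
    the amplitude of previously accepted peaks.
    """
    refractory = int(refractory_s * fs)  # minimum spacing between peaks, in samples
    peaks = []
    # Running estimate of the typical R-peak amplitude. Seeding it with the
    # global maximum is a simplification that a streaming design would avoid.
    peak_level = max(signal)
    for i in range(1, len(signal) - 1):
        rising = signal[i] - signal[i - 1]
        falling = signal[i + 1] - signal[i]
        if not (rising > 0 and falling <= 0):
            continue  # slope did not cross zero from above here
        if signal[i] < threshold_ratio * peak_level:
            continue  # too small relative to recent R-peaks
        if peaks and i - peaks[-1] < refractory:
            # Two candidates inside one refractory window: keep the taller one.
            if signal[i] > signal[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)
        # Adapt the threshold by smoothing toward the newly accepted amplitude.
        peak_level = 0.875 * peak_level + 0.125 * signal[i]
    return peaks
```

Calling `detect_r_peaks(samples, fs=250)` on a list of samples returns the accepted indices. A hardware design would more likely use fixed-point arithmetic and a causal threshold initialisation.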
Deep learning-based in-loop filters have recently demonstrated great improvement for both coding efficiency and subjective quality in video coding. However, most existing deep learning-based in-loop filters tend to develop a sophisticated model in exchange for good performance, and they employ a single network structure for all reconstructed samples, which lacks sufficient adaptiveness to the various video content, limiting their performance to some extent. In contrast, this paper proposes an adaptive deep reinforcement learning-based in-loop filter (ARLF) for versatile video coding (VVC)....
While recent research on convolutional neural network (CNN) based in-loop filters for High Efficiency Video Coding (HEVC) has achieved great success, the performance of these models on the new standard Versatile Video Coding (VVC) may degrade due to the many novel techniques adopted, which make the compression process finer and capture more image details. In this work, the performances on VVC of two CNN-based in-loop filters proposed for HEVC are investigated and a multi-gradient CNN-based in-loop filter (MGNLF) is proposed. The proposed model exploits the divergence and second derivative...
The learning problem of ranking arises in many tasks, including question answering, information retrieval, and movie recommendation. In these tasks, the ordering of the answers, documents or movies returned is a critical aspect of the system. Recently, deep learning approaches have gained a lot of attention from the research community and industry for their ability to automatically learn the optimal feature representation for a given task. We aim to solve the answer ranking problem in a practical question answering system with deep learning approaches. In this paper, we define composite questions...
In this paper, a novel QP-variable convolutional neural network based in-loop filter is proposed for VVC intra coding. To avoid training and deploying multiple networks, we develop an efficient QP attention module (QPAM) which can capture compression noise levels for different QPs and emphasize meaningful features along the channel dimension. Then we embed QPAM into the residual block and, based on it, design a network architecture that is equipped with controllability for different QPs. To make the proposed model focus more on examples that have more compression artifacts or are hard to...
Modeling sentence similarity has long been a challenging task in the field of natural language processing (NLP), owing to the ambiguity and variability of linguistic expression. Specifically, in community question answering (CQA), the corresponding hotspot is question retrieval. To get the question most similar to the user's query, we propose a model built on Bidirectional Long Short-Term Memory (BLSTM) neural networks, which can also be used in other fields, such as sentence similarity computation, paraphrase detection, and so on. We...
Deep neural networks have achieved remarkable success in HEVC compressed video quality enhancement. However, most existing multiframe-based methods either deliver unsatisfactory results or consume a significant amount of resources to leverage the temporal information of neighboring frames. For the sake of practicality, a thorough investigation of the architecture design of the enhancement network regarding performance, model parameters, and running speed is essential. In this article, we first propose an efficient...
Due to the application requirement of single source tracking in a dynamic background, it is difficult to get accurate pedestrian target detection. By comparing the performance and quality of the describing operators CENTRIST and HOG, we adopt feature extraction combined with an SVM off-line classifier to train the model for pedestrian detection. Furthermore, propose importance utilize edge classification thought deepen contour feature. Meanwhile, we put forward a scheme to remove local texture and background noise to improve the algorithm detection...
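For readers unfamiliar with one of the two descriptors compared above: CENTRIST is a histogram of census transform codes. The sketch below illustrates that descriptor on a plain 2-D list of grey levels. It is not the authors' pipeline, which the truncated abstract does not spell out. The function names are hypothetical, and the bit convention (a bit is 1 when the centre pixel is greater than or equal to the neighbour, with neighbours scanned row by row) is an assumption of this sketch.

```python
def census_transform(image):
    """Return the 8-bit census code of every interior pixel.

    `image` is a 2-D list of grey levels. Border pixels are skipped
    because they lack a full 3x3 neighbourhood.
    """
    height, width = len(image), len(image[0])
    # The eight neighbours, scanned left to right and top to bottom.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            centre = image[y][x]
            code = 0
            for dy, dx in offsets:
                # Assumed convention: bit is 1 when centre >= neighbour.
                bit = 1 if centre >= image[y + dy][x + dx] else 0
                code = (code << 1) | bit
            row.append(code)
        codes.append(row)
    return codes


def centrist_histogram(image):
    """Return the 256-bin histogram of census codes for `image`."""
    histogram = [0] * 256
    for row in census_transform(image):
        for code in row:
            histogram[code] += 1
    return histogram
```

The 256-element histogram is the feature vector that would then be passed to a classifier such as an SVM. In practice it is usually computed per image block and the blocks are concatenated.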
Recently, tensor algebra has witnessed significant applications across various domains. Each operator in tensor algebra features a different computational workload and precision. However, current general accelerators, such as VPU, GPGPU, and CGRA, support tensor operators with low energy and area efficiency. This paper conducts an in-depth exploration of a general accelerator for tensor processing. First, we find the similarity between matrix multiplication and precision multiplication, and create a method for classifying tensor operators. Then, we implement...
Pre-trained language models have achieved impressive results in various music understanding and generation tasks. However, existing pre-training methods for symbolic melody generation struggle to capture multi-scale, multi-dimensional structural information in note sequences, due to the domain knowledge discrepancy between text and music. Moreover, the lack of available large-scale symbolic melody datasets limits the pre-training improvement. In this paper, we propose MelodyGLM, a multi-task pre-training framework for generating melodies with long-term structure....
Versatile Video Coding (VVC) has adopted a quad-tree with nested multi-type tree (QTMT) partition structure to improve the rate-distortion (RD) performance, but this greatly increases the complexity due to the brute-force RDO process to search for the best partition type. Some methods cannot fully utilize the information of deeper CUs when training, while others utilize large CUs as a whole and may neglect the specific information inside smaller CUs. Therefore, in this paper, we propose a learning-based approach to predict the partition modes of every CU utilizing this information. Firstly, the model...
Learning-based methods have achieved excellent performance for compressed video restoration (CVR) in recent years. However, existing networks aggregate multi-frame information inefficiently and are usually developed for specific quantization parameters (QPs), which is not convenient for practical usage. Moreover, current works only consider the Constant QP (CQP) setting, but do not discuss the performance of the model in more realistic scenarios, e.g., Constant Rate Factor (CRF) and Constant Bitrate (CBR). In this paper, we propose a generalized...
Inserting a logo into HEVC video streams is highly demanded in video applications. In this paper, we present an efficient logo insertion method for HEVC coding. To reduce the impact of the inserted logo, the proposed method mitigates the encoding dependence on the logo by partitioning the frame into separated regions. For lossless region, bit rate overhead according to error propagation model. information reusing partly re-encode quality-loss area maintain quality. The computational complexity reduced information. Experimental results...