Xiao Lin

ORCID: 0009-0006-8716-2601
Research Areas
  • Advanced Data Compression Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Vision and Imaging
  • Video Coding and Compression Technologies
  • Advanced Image and Video Retrieval Techniques
  • Advanced Adaptive Filtering Techniques
  • Advanced Image Processing Techniques
  • Security and Verification in Computing
  • Image and Video Quality Assessment
  • Speech and Audio Processing
  • Embedded Systems Design Techniques
  • Advanced Malware Detection Techniques
  • Robot Manipulation and Learning
  • Network Security and Intrusion Detection
  • Advanced Neural Network Applications
  • Multimedia Communication and Technology
  • Computer Graphics and Visualization Techniques
  • Embedded Systems and FPGA Design
  • Image and Signal Denoising Methods
  • Multimodal Machine Learning Applications
  • Matrix Theory and Algorithms
  • Neural Networks and Applications
  • Digital Filter Design and Implementation
  • Aesthetic Perception and Analysis
  • Human Pose and Action Recognition

Tongji University
2024

Nanchang Hangkong University
2023

Nanjing University of Aeronautics and Astronautics
2022

Hong Kong University of Science and Technology
2011

Shanghai Jiao Tong University
2011

Institute for Infocomm Research
2003-2005

Hanalei Watershed Hui
2003

Nanyang Technological University
1999-2002

10.18653/v1/2024.findings-naacl.179 article EN Findings of the Association for Computational Linguistics: NAACL 2024 2024-01-01

This paper presents a rate control scheme for H.264 by introducing the concept of basic unit and a linear prediction model. The basic unit can be a macroblock (MB), a slice, or a frame. It is used to obtain a trade-off between overall coding efficiency and bits fluctuation. The linear model is used to solve the chicken-and-egg dilemma existing in H.264. Both constant bit rate (CBR) and variable bit rate (VBR) cases are studied. Our scheme has been adopted...

10.1109/icip.2004.1419405 article EN 2005-04-19
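
The abstract above does not say which quantity the linear prediction model predicts, so the following is only a minimal sketch under an assumption: that a per-unit complexity measure for the current basic unit is predicted from the co-located unit of the previous frame as `a1 * prev + a2`, with the coefficients refit by least squares. The names `fit_linear` and `predict_complexity` are hypothetical and are not taken from the paper.

```python
def fit_linear(prev, cur):
    """Least-squares fit of cur ~= a1 * prev + a2 over paired observations."""
    n = len(prev)
    mean_p = sum(prev) / n
    mean_c = sum(cur) / n
    var = sum((p - mean_p) ** 2 for p in prev)
    if var == 0:
        # Degenerate case: all previous values equal, so use unit slope
        # and put the whole difference into the offset.
        return 1.0, mean_c - mean_p
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(prev, cur))
    a1 = cov / var
    a2 = mean_c - a1 * mean_p
    return a1, a2


def predict_complexity(a1, a2, prev_value):
    """Predict the current unit's complexity before it has been encoded."""
    return a1 * prev_value + a2


# Hypothetical history of (previous-frame, current-frame) complexity values.
a1, a2 = fit_linear([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
estimate = predict_complexity(a1, a2, 5.0)
```

Predicting the value from already-coded data is one way to break a circular dependency in which the quantizer must be chosen before the statistics it depends on are available.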

We present a method for extracting local visual perceptual cues and its application to rate control of videophone, in order to ensure that the scarce bits are assigned for maximum coding quality. The optimum quantization step is determined with a rate-distortion model considering the signal. For the extraction of the cues, luminance adaptation and texture masking are used as the stimulus-driven factors, while skin color serves as the cognition-driven factor in the current implementation. Both objective and subjective quality evaluations are given by...

10.1109/tcsvt.2005.844458 article EN IEEE Transactions on Circuits and Systems for Video Technology 2005-04-01

Text-driven image editing aims to manipulate images with the guidance of natural language description. Text is much more natural and intuitive than many other interaction modes, and has attracted attention recently. However, compared with classical supervised learning tasks, there is no standard benchmark dataset for text-driven interactive image editing up to now. Therefore, it is hard to train an end-to-end model for pixel-aligned image editing driven by text. Some methods follow the paradigm of text-to-image models by incorporating the target text into the process of image generation....

10.1109/tmm.2023.3289755 article EN IEEE Transactions on Multimedia 2023-07-07

This paper presents an exemplar-based video inpainting mechanism that restores the area of the removed object, and this can be further employed to extract the background of videos. The region inpainted in the video is still moving foreground. Our method consists of a simple preprocessing stage and a video inpainting step. The preprocessing stage constructs a Gaussian Mixture Model (GMM) for both the background and the foreground separately, then makes use of the GMMs to distinguish the foreground from the background in the entire video. That saves the time of calculating optical flow mosaics, as many algorithms do. As for video inpainting, we firstly...

10.1109/cmsp.2011.169 article EN 2011-05-01

Textual backdoor attacks pose significant security threats. Current detection approaches, typically relying on intermediate feature representation or reconstructing potential triggers, are task-specific and less effective beyond sentence classification, struggling with tasks like question answering and named entity recognition. We introduce TABDet (Task-Agnostic Backdoor Detector), a pioneering task-agnostic method for backdoor detection. TABDet leverages final layer logits combined with an efficient pooling...

10.48550/arxiv.2403.17155 preprint EN arXiv (Cornell University) 2024-03-25

We introduce a PDA-based live video streaming system on the GPRS network based on the MPEG-4 compression standard. Due to the limited computational resources of the PDA, all key modules of the codec are efficiently implemented and optimized, such as multithreading, buffer design, wireless communication, encoder and decoder. Several novel techniques are developed in the coding as well as the post-processing stages of the system.

10.1109/icme.2003.1221580 article EN 2003-01-01

For compression of audio waveform, prediction is one of the important key components. We propose a novel multi-stage adaptive linear predictor (MSALP) for high fidelity audio waveform. The MSALP achieves higher prediction gain compared with the conventional linear predictor (LP). Besides, the MSALP uses a smaller number of coefficients. The MSALP is embedded into a lossless audio waveform compression system. As a result, the compression ratio of the system is obviously improved. By selecting known algorithms, some comparison results are presented.

10.1109/icme.2001.1237673 article EN 2001 IEEE International Conference on Multimedia and Expo (ICME) 2001-01-01

This paper presents our study on the feasibility and effectiveness of using the MELP (mixed excitation linear prediction) model for coding wideband (7 kHz) speech signals at a transmission bit rate of 8 kbps. In order to achieve reasonably good subjective quality of the decoded speech while maintaining a low operating bit rate at the same time, modifications to the pitch estimation, LP analysis/synthesis and post filtering stages of the original MELP are discussed. Informal listening tests show that the proposed coder is rated to be slightly better than...

10.1109/icassp.2000.859165 article EN 2002-11-07

10.2139/ssrn.4670638 preprint EN 2023-01-01

In this paper, we propose a new video post-processing algorithm for reducing coding artifacts. The algorithm is based on a regularized image restoration technique, in which a nonconvex regularization function is used to enforce smoothness constraints on the spatial pixels for suppressing coding artifacts while preserving important image attributes. Temporal regularization is carried out along the motion trajectories. We apply half-quadratic regularization and a piecewise constraint strategy to simplify the optimization problem. Explicit filtering formulae are obtained...

10.1109/icip.2003.1246655 article EN 2004-06-03

10.11175/eastpro.2011.0.298.0 article EN Proceedings of the Eastern Asia Society for Transportation Studies The 9th International Conference of Eastern Asia Society for Transportation Studies, 2011 2011-01-01

This paper focuses on the design of an adaptive lossless compression algorithm for audio waveforms, which aims to be used in Internet, PC game, and library archive applications, etc. Windows-based software is developed, which can also be used in the DOS or UNIX operating system. Based on the Rice code (Rice and Plant, 1971), a scheme to adaptively select the best set of parameters to enhance the compression ratio is proposed. In order to obtain a higher compression ratio, a linear predictor is employed. Our algorithm achieved a slightly higher compression ratio than a known commercial package and is compatible with another...

10.1109/icosp.1998.770824 article EN 2002-11-27
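
The combination described in the abstract above (a linear predictor followed by Rice coding with an adaptively chosen parameter) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the first-order difference predictor, the zigzag mapping of signed residuals, the exhaustive search over `k`, and all function names are assumptions made for the example.

```python
def zigzag(v):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def residuals(samples):
    """First-order prediction: each sample is predicted by the previous one."""
    prev, out = 0, []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out


def reconstruct(res):
    """Undo residuals() by accumulating the prediction errors."""
    prev, out = 0, []
    for r in res:
        prev += r
        out.append(prev)
    return out


def rice_encode(values, k):
    """Rice code: unary quotient (ones ended by a zero) plus a k-bit remainder."""
    bits = []
    for v in values:
        u = zigzag(v)
        bits.append("1" * (u >> k) + "0")
        if k:
            bits.append(format(u & ((1 << k) - 1), f"0{k}b"))
    return "".join(bits)


def rice_decode(bits, k, count):
    """Decode `count` values written by rice_encode with the same k."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating zero of the unary part
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((q << k) | r))
    return out


def best_k(values, max_k=15):
    """Pick the parameter k that minimises the coded length of this block."""
    return min(range(max_k + 1),
               key=lambda k: sum((zigzag(v) >> k) + 1 + k for v in values))


samples = [100, 102, 105, 104, 101, 99]   # hypothetical audio samples
res = residuals(samples)
k = best_k(res)
decoded = reconstruct(rice_decode(rice_encode(res, k), k, len(res)))
```

Choosing `k` per block lets the code track changes in residual magnitude, which is the kind of adaptation the abstract refers to when it mentions selecting the best set of parameters.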

Nowadays, image inpainting methods based on deep learning lead to information loss when acquiring features, which is not conducive to the restoration of texture details and ignores semantic features. Besides, the great majority of them generate results with unreasonable structures. In response to the above problems, an image inpainting network based on a multi-scale feature joint attention model is proposed. First of all, using advanced. When depth fusion used reduce in convolution process. Afterwards, mechanism only strengthens...

10.3724/sp.j.1089.2022.19172 article EN Journal of Computer-Aided Design & Computer Graphics 2022-08-01

Artwork generation is an important research area of computer vision. Recently, many kinds of generative models have achieved great success in natural image generation. However, artwork generation has rarely been studied due to the unfixed structure of artworks. Combined with the prevailing diffusion models, we propose a simple yet effective framework, named ArtDiff, for artwork generation. Given the name of an artist, ArtDiff can generate diverse and novel artworks which reflect the style of the artist. To the best of our knowledge, ArtDiff is the first model...

10.1145/3604078.3604167 article EN 2023-05-19

Complex scene generation is an important and challenging image synthesis task. Though latent-space-based conditional generative methods get impressive results, the accurate locating of objects for more detailed situation editable contiguous are still a snag. In addition, previous methods with spatial awareness easily lose location information after a series of fusion operations. Focusing on such problems, in this paper we propose an object-wise constraint-free location-lossless way of layout embedding by...

10.1145/3604078.3604150 article EN 2023-05-19

10.24251/hicss.2022.603 article EN Proceedings of the Annual Hawaii International Conference on System Sciences 2022-01-01

This paper presents the issues associated with the real-time implementation of the MELP 2.4 kbps speech codec by using a TI fixed-point DSP. It briefly reviews the MELP algorithm and the procedure used in porting the C codes into TMS320C54x assembly codes. Various factors such as memory, speed and compatibility are also discussed.

10.1109/icosp.2000.891596 article EN 2002-11-07

We present a low bit rate base band speech communication system by using the PC and Windows environment. The speech codec is the US DoD standard MELP with a bit rate of 2.4 kbps. This system is especially suitable for channel transmission at less than ... kbps and provides an RS-232 serial port interface for a normal modem as well as TCP/IP for LAN. A Pentium 133 PC with a sound card is enough for such a full duplex implementation.

10.1109/mmsp.1999.793911 article EN 1999-01-01