- Advanced Image Processing Techniques
- Advanced Image and Video Retrieval Techniques
- Video Coding and Compression Technologies
- Advanced Vision and Imaging
- Video Surveillance and Tracking Methods
- Advanced Data Compression Techniques
- Advanced Neural Network Applications
- Face recognition and analysis
- Image Enhancement Techniques
- Image and Video Quality Assessment
- Generative Adversarial Networks and Image Synthesis
- Image and Signal Denoising Methods
- Visual Attention and Saliency Detection
- Digital Media Forensic Detection
- Tensor decomposition and applications
- Biometric Identification and Security
- Image Processing Techniques and Applications
- Human Pose and Action Recognition
- Medical Image Segmentation Techniques
- CCD and CMOS Imaging Sensors
- Anomaly Detection Techniques and Applications
- Gait Recognition and Analysis
- Stochastic processes and financial applications
- Planetary Science and Exploration
- Electricity Theft Detection Techniques
InterDigital (United States)
2022-2025
Seokyeong University
2020-2023
Simon Fraser University
2017-2022
LG (South Korea)
2013
Kwangwoon University
2011
University of California, Berkeley
2008
At present, and increasingly so in the future, much of the captured visual content will not be seen by humans. Instead, it will be used for automated machine vision analytics and may require occasional human viewing. Examples of such applications include traffic monitoring, visual surveillance, autonomous navigation, and industrial machine vision. To address such requirements, we develop an end-to-end learned image codec whose latent space is designed to support scalability from simpler to more complicated tasks. The simplest task...
Recent studies have shown that the efficiency of deep neural networks in mobile applications can be significantly improved by distributing the computational workload between the mobile device and the cloud. This paradigm, termed collaborative intelligence, involves communicating feature data between the mobile device and the cloud. The efficiency of such an approach can be further improved by lossy compression of the feature data, which has not been examined to date. In this work we focus on collaborative object detection and study the impact of both near-lossless and lossy compression of feature data on its accuracy. We also propose a strategy for improving...
In this paper, we present a pixel-wise unified rate quantization (R-Q) model for low-complexity rate control on configurable coding units of high efficiency video coding (HEVC). In the case of HEVC, which employs a hierarchical coding block structure, multiple R-Q models can be employed for various block sizes. However, we found that the ratios of distortions over bits for all the blocks are nearly constant because of the employment of the rate distortion optimization technique. Hence, one relationship model between rate and quantization can be derived from the characteristic of similar ratios of distortions over bits regardless...
Collaborative intelligence is a new paradigm for efficient deployment of deep neural networks across the mobile-cloud infrastructure. By dividing the network between the mobile and the cloud, it is possible to distribute the computational workload such that the overall energy and/or latency of the system is minimized. However, this necessitates sending deep feature data from the mobile to the cloud in order to perform inference. In this work, we examine the differences between deep feature data and natural image data, and propose a simple and effective near-lossless deep feature compressor. The proposed method...
We propose a novel frame prediction method using a deep neural network (DNN), with the goal of improving video coding efficiency. The proposed DNN makes use of decoded frames, at both the encoder and the decoder, to predict the textures of the current coding block. Unlike conventional inter-prediction, the proposed method does not require any motion information to be transferred between the encoder and the decoder. Still, both uni-directional and bi-directional predictions are possible using the proposed DNN, which is enabled by the use of a temporal index channel, in addition to the color channels. In this...
Image and video compression has traditionally been tailored to human vision. However, modern applications such as visual analytics and surveillance rely on computers "seeing" and analyzing the images before (or instead of) humans. For these applications, it is important to adjust compression to computer vision. In this paper we present a bit allocation and rate control strategy that is tailored to object detection. Using the initial convolutional layers of a state-of-the-art object detector, we create an importance map that can guide bit allocation to areas that are important for object detection. The proposed...
Occluded person re-identification is a challenging task which aims to find or distinguish a specific person when the human body is occluded by obstacles, other persons, or oneself. Some recent state-of-the-art works adopting transformer and/or pose-guided methods have improved feature representation and performance, but still struggle with both weak representation and heavy structure. In this paper, we suggest a novel structure of transformer-based methods for occluded person re-identification, as follows. First, for data augmentation, instead of deleting an arbitrary area, only...
Recent AI applications such as Collaborative Intelligence with neural networks involve transferring deep feature tensors between various computing devices. This necessitates tensor compression in order to optimize the usage of bandwidth-constrained channels between devices. In this paper we present a prediction scheme called Back-and-Forth (BaF) prediction, developed for deep feature tensors, which allows us to dramatically reduce tensor size and improve its compressibility. Our experiments with a state-of-the-art object detector...
We propose a neural network model to estimate the current frame from two reference frames, using affine transformation and adaptive spatially-varying filters. The estimated affine transformation allows for using shorter filters compared to existing approaches for deep frame prediction. The predicted frame is used as a reference for coding the current frame. Since the proposed model is available at both the encoder and the decoder, there is no need to code or transmit motion information for the predicted frame. By making use of dilated convolutions and reduced filter length, our model is significantly smaller, yet more accurate, than any...
We investigate latent-space scalability for multi-task collaborative intelligence, where one of the tasks is object detection and the other is input reconstruction. In our proposed approach, part of the latent space can be selectively decoded to support object detection, while the remainder can be decoded when input reconstruction is needed. Such an approach allows reduced computational resources when only object detection is required, and this can be achieved without reconstructing input pixels. By varying the scaling factors of various terms in the training loss function, the system can be trained to achieve...
Web refresh crawling is the problem of keeping a cache of web pages fresh, that is, having the most recent copy available when a page is requested, given a limited bandwidth available to the crawler. Under the assumption that the change and request events, resp., to each web page follow independent Poisson processes, the optimal scheduling policy was derived by Azar et al. 2018. In this paper, we study an extension of this problem where side information indicating content changes, such as various types of web pings, for example, signals from sitemaps, content delivery networks,...
We present an object labelled dataset called SFU-HW-Objects-v1, which contains object labels for a set of raw video sequences. The dataset can be useful for the cases where both object detection accuracy and video coding efficiency need to be evaluated on the same dataset. Object ground-truths for 18 of the High Efficiency Video Coding (HEVC) v1 Common Test Conditions (CTC) sequences have been labelled. The object categories used for the labeling are based on the Common Objects in Context (COCO) labels. A total of 21 object classes are found in the test sequences, out of the 80 original COCO label...
In collaborative intelligence applications, part of a deep neural network (DNN) is deployed on a relatively low-complexity device such as a mobile phone or edge device, and the remainder of the DNN is processed where more computing resources are available, such as in the cloud. This paper presents a novel lightweight compression technique designed specifically to code the activations of a split DNN layer, while having a low complexity suitable for edge devices and not requiring any retraining. We also present a modified entropy-constrained...
As an increasing amount of image and video content will be analyzed by machines, there is demand for a new codec paradigm that is capable of compressing visual input primarily for the purpose of computer vision inference, while secondarily supporting input reconstruction. In this work, we propose a learned compression architecture that can be used to build such a codec. We introduce a novel variational formulation that explicitly takes feature data relevant to the desired inference task as input at the encoder side. As such, our learned scalable image codec encodes...
We propose a novel framework to compress human-centric videos for both human viewing and machine analytics. Our system uses three coding branches to combine the power of generic face-prior learning with data-dependent detail recovery. The generic branch embeds faces into a discrete code space described by a learned high-quality (HQ) codebook, to reconstruct an HQ baseline face. The domain-adaptive branch adjusts the reconstruction to fit the current data domain by adding domain-specific information through a supplementary codebook....
In this paper, we propose two structures for scalable video coding (SVC) based on HEVC. Several inter-layer prediction mechanisms are introduced to improve the coding efficiency of the proposed HEVC-based SVC. The inter-layer predictions are developed for single-loop and multi-loop decoding structures. We found that the proposed SVC is able to decrease the average bitrates of the enhancement layers by about 10.2% in the all-intra case and 7.4% in the random access case, compared with single layer no in decoding. In addition, achieves gains 2.6% case.
Video content is watched not only by humans, but increasingly also by machines. For example, machine learning models analyze surveillance video for security and traffic monitoring, search through YouTube videos for inappropriate content, and so on. In this paper, we propose a scalable video coding framework that supports machine vision (specifically, object detection) through its base layer bitstream and human vision via its enhancement layer bitstream. The proposed framework includes components from both conventional and Deep Neural Network (DNN)-based...
By numerically integrating the orbits of the giant planets and of test particles over a period of four billion years, we follow the evolution of the location of the midplane of the Kuiper belt. The Classical Kuiper belt conforms to a warped sheet that precesses with a 1.9 Myr period. The present-day location of the Kuiper belt plane can be computed using linear secular perturbation theory: the local normal to the plane is given by the theory's forced inclination vector, which is specific to every semimajor axis. The Kuiper belt plane does not coincide with the invariable plane, but deviates from it by up to a few degrees in stable...
Finding faces in images is one of the most important tasks in computer vision, with applications in biometrics, surveillance, human-computer interaction, and other areas. In our earlier work, we demonstrated that it is possible to tell whether or not an image contains a face by only examining the HEVC syntax, without fully reconstructing the image. In the present work we move further in this direction by showing how to localize faces in HEVC-coded images, without full reconstruction. We also demonstrate the benefits that such an approach can have...
Image and video analytics are being increasingly used on a massive scale. Not only is the amount of data growing, but the complexity of the data processing pipelines is also increasing, thereby exacerbating the problem. It is becoming increasingly important to save computational resources wherever possible. We focus on one of the poster problems of visual analytics, face detection, and approach the issue of reducing the computation by asking: Is it possible to detect a face without full image reconstruction from the High Efficiency Video Coding (HEVC) bitstream? We demonstrate that...
When it comes to image compression in digital cameras, denoising is traditionally performed prior to compression. However, there are applications where image noise may be necessary to demonstrate the trustworthiness of the image, such as court evidence and image forensics. This means that the noise itself needs to be coded, in addition to the clean image itself. In this paper, we present a learning-based image compression framework where image denoising and compression are performed jointly. The latent space of the image codec is organized in a scalable manner such that the clean image can be decoded from a subset of the latent space (the base layer), while the noisy image is decoded from the full latent space at a higher rate....
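As global crises such as unprecedented climate change and the spread of novel infectious diseases become more frequent, the disaster response capacity and resilience of cities are being emphasized, and the orientation of smart cities is changing accordingly. Successful smart city implementation requires setting goals and strategies that take local conditions and circumstances into account while also reflecting global trends. This study examines the paradigm shift in smart cities and aims to present directions and implications for their implementation. To this end, we reviewed the changes through trends in the evaluation indicators of seven smart city indices and case analyses of 11 overseas smart cities. The analysis confirmed a shift in the core values of the indices toward inclusiveness, sustainability, connectivity, and innovation, and we presented the policy implication that, beyond the piecemeal application of new technologies and the smartening of infrastructure, policymaking and urban development should embody values such as improving citizens' quality of life and sustainability. The implementation trends examined in this study are expected to provide supporting evidence for establishing future smart city strategies.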