Chung-Ching Lin

ORCID: 0000-0003-3296-9062
Research Areas
  • Radio Frequency Integrated Circuit Design
  • Human Pose and Action Recognition
  • Multimodal Machine Learning Applications
  • Advanced Vision and Imaging
  • Microwave Engineering and Waveguides
  • Video Surveillance and Tracking Methods
  • Millimeter-Wave Propagation and Modeling
  • Advanced Image and Video Retrieval Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advancements in PLL and VCO Technologies
  • Analog and Mixed-Signal Circuit Design
  • Advanced Power Amplifier Design
  • Robotics and Sensor-Based Localization
  • Advanced Neural Network Applications
  • Antenna Design and Optimization
  • Image Processing Techniques and Applications
  • Generative Adversarial Networks and Image Synthesis
  • Integrated Circuits and Semiconductor Failure Analysis
  • Natural Language Processing Techniques
  • Semiconductor materials and devices
  • Antenna Design and Analysis
  • Energy Harvesting in Wireless Networks
  • 3D IC and TSV technologies
  • Topic Modeling
  • Anomaly Detection Techniques and Applications

Washington State University
2019-2025

Microsoft Research (United Kingdom)
2021-2024

Film Independent
2024

IBM Research - Thomas J. Watson Research Center
2013-2021

IBM (United States)
2014-2020

University of California, Los Angeles
2020

Bioengineering Center
2018

Southern Methodist University
2017-2018

University of Illinois Urbana-Champaign
2016

Yuan Ze University
2014

The goal of image stitching is to create natural-looking mosaics free of artifacts that may occur due to relative camera motion, illumination changes, and optical aberrations. In this paper, we propose a novel stitching method that uses a smooth stitching field over the entire target image, while accounting for all the local transformation variations. Computing the warp is fully automated and uses a combination of local homography and global similarity transformations, both of which are estimated with respect to the target. We mitigate the perspective distortion in...

10.1109/cvpr.2015.7298719 article EN 2015-06-01

The canonical approach to video captioning dictates that a caption generation model learn from offline-extracted dense video features. These feature extractors usually operate on video frames sampled at a fixed frame rate and are often trained on image/video understanding tasks, without adaptation to video captioning data. In this work, we present SwinBERT, an end-to-end transformer-based model for video captioning, which takes video frame patches directly as inputs and outputs a natural language description. Instead of leveraging multiple 2D/3D feature extractors, our...

10.1109/cvpr52688.2022.01742 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory skills, such as visual understanding, to achieve stronger generic intelligence. In this paper, we analyze the latest model, GPT-4V(ision), to deepen the understanding of LMMs. The analysis focuses on the intriguing tasks that GPT-4V can perform, containing test samples to probe the quality and genericity of GPT-4V's capabilities, its supported inputs and working modes, and the effective ways to prompt the model. In our approach to exploring GPT-4V, we curate...

10.48550/arxiv.2309.17421 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Unified vision-language frameworks have greatly advanced in recent years, most of which adopt an encoder-decoder architecture to unify image-text tasks as sequence-to-sequence generation. However, existing video-language (VidL) models still require task-specific designs in model architecture and training objectives for each task. In this work, we explore a unified VidL framework, LAVENDER, where Masked Language Modeling [13] (MLM) is used as the common interface for all pre-training and downstream tasks. Such...

10.1109/cvpr52729.2023.02214 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

10.1109/cvpr52733.2024.00891 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

We propose a modified variational autoencoder (VAE) architecture built on top of Mask R-CNN for instance-level video segmentation and tracking. The method builds a shared encoder and three parallel decoders, yielding three disjoint branches for predictions of future frames, object detection boxes, and instance segmentation masks. To effectively solve multiple learning tasks, we introduce a Gaussian Process model to enhance the statistical representation of the VAE by relaxing the prior strong independent and identically distributed (iid)...

10.1109/cvpr42600.2020.01316 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

The best beam steering directions are estimated through beam training, which is one of the most important and challenging tasks in millimeter-wave and sub-terahertz communications. Novel array architectures and signal processing techniques are required to avoid the prohibitive beam training overhead associated with large antenna arrays and narrow beams. In this work, we leverage recent developments in true-time-delay (TTD) arrays with large delay-bandwidth products to accelerate beam training using frequency-dependent probing beams. We propose and study two TTD...

10.1109/tcsi.2021.3054428 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2021-02-08

We present a cross-modal Transformer-based framework, which jointly encodes video data and text labels for zero-shot action recognition (ZSAR). Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner. The model design provides a natural mechanism for visual and semantic representations to be learned in a shared knowledge space, whereby it encourages the learned visual embedding to be discriminative and more semantically consistent. In zero-shot inference, we devise a simple semantic transfer...

10.1109/cvpr52688.2022.01935 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

The canonical approach to video captioning dictates that a caption generation model learn from offline-extracted dense video features. These feature extractors usually operate on video frames sampled at a fixed frame rate and are often trained on image/video understanding tasks, without adaptation to video captioning data. In this work, we present SwinBERT, an end-to-end transformer-based model for video captioning, which takes video frame patches directly as inputs and outputs a natural language description. Instead of leveraging multiple 2D/3D feature extractors, our...

10.48550/arxiv.2111.13196 preprint EN other-oa arXiv (Cornell University) 2021-01-01

This paper presents a prior-less method for tracking and clustering an unknown number of human faces and maintaining their individual identities in unconstrained videos. The key challenge is to accurately track faces with partial occlusion and drastic appearance changes in multiple shots resulting from significant variations of makeup, facial expression, head pose, and illumination. To address this challenge, we propose a new multi-face tracking and re-identification algorithm, which provides high accuracy in face association in the...

10.1109/cvpr.2018.00063 article EN 2018-06-01

Initial access in millimeter-wave (mmW) wireless is critical toward successful realization of the fifth-generation (5G) wireless networks and beyond. Limited bandwidth in existing standards and the use of phase-shifters in analog/hybrid phased-antenna arrays (PAAs) are not suited for these emerging standards demanding low-latency direction finding. This work proposes a reconfigurable true-time-delay (TTD)-based spatial signal processor (SSP) with frequency-division beam training methodology and wideband beam-squint less data...

10.1109/jssc.2022.3178798 article EN publisher-specific-oa IEEE Journal of Solid-State Circuits 2022-06-08

Spatial signal processors (SSP) for emerging millimeter-wave wireless networks are critically dependent on link discovery. To avoid loss in communication, mobile devices need to locate narrow directional beams with millisecond latency. In this work, we demonstrate a true-time-delay (TTD) array with digitally reconfigurable delay elements enabling both fast beam-training at the receiver and wideband data communications. In beam-training mode, large delay-bandwidth products are implemented to accelerate beam training using...

10.1109/esscirc53450.2021.9567822 article EN ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC) 2021-09-13

This article presents a process- and temperature-invariant, high-resolution, highly linear, low-power phase interpolator (PI) as an enabler for discrete-time spatial signal processors (SSPs) and various mixed-mode RF transceiver architectures. Using current integration techniques, the proposed PI generates an adaptable constant slope-and-swing ramp to achieve significantly lower power suited for multiple antenna elements. Switched-capacitor-based bias generation enables tracking of the ramp generator over process,...

10.1109/jssc.2023.3242935 article EN IEEE Journal of Solid-State Circuits 2023-02-14

We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V, combined with specialized tools in vision, audio, and speech, to facilitate advanced video understanding. MM-VID is designed to address the challenges posed by long-form videos and intricate tasks such as reasoning within hour-long content and grasping storylines spanning multiple episodes. MM-VID uses a video-to-script generation with GPT-4V to transcribe multimodal elements into a long textual script. The generated script details character...

10.48550/arxiv.2310.19773 preprint EN other-oa arXiv (Cornell University) 2023-01-01

This study explores the concept of equivariance in vision-language foundation models (VLMs), focusing specifically on the multimodal similarity function that is not only the major training objective but also the core delivery to support downstream tasks. Unlike the existing image-text similarity objective, which only categorizes matched pairs as similar and unmatched pairs as dissimilar, equivariance also requires similarity to vary faithfully according to the semantic changes. This allows VLMs to generalize better to nuanced and unseen multimodal compositions. However, modeling equivariance is challenging as the ground truth...

10.1109/iccv51070.2023.01102 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

The decadal research in integrated true-time-delay arrays has seen organic growth, enabling realization of wideband beamformers for large arrays with wide aperture widths. This article introduces highly reconfigurable delay elements implementable at analog or digital baseband that enable multiple Spatial Signal Processing (SSP) functions including wideband beamforming, wideband interference cancellation, and fast beam training. Details of the beam-training algorithm, system design considerations, system architecture and circuits...

10.1109/mcas.2022.3214408 article EN publisher-specific-oa IEEE Circuits and Systems Magazine 2022-01-01

This paper proposes a new ego-motion estimation and background/foreground classification method to effectively segment moving objects from videos captured by a moving camera on a moving platform. Existing methods for moving-camera detection impose serious constraints. In our approach, an ellipsoid scene shape is applied in the motion model and a complicated ego-motion formula is derived. A genetic algorithm is introduced to accurately solve the ego-motion parameters. After motion recovery, the noisy result is refined by motion vector correlation and the foreground is classified by pixel...

10.1109/icpr.2010.121 article EN 2010-08-01

Most methods for Bundle Adjustment (BA) in computer vision are either centralized or operate incrementally. This leads to poor scaling and affects the quality of the solution as the number of images grows in large-scale structure from motion (SfM). Furthermore, they cannot be used in scenarios where image acquisition and processing must be distributed. We address this problem with a new distributed BA algorithm. Our distributed formulation uses the alternating direction method of multipliers (ADMM), and, since each processor sees...

10.1109/iccvw.2017.251 article EN 2017-10-01

In this paper, we propose a hierarchical computational system architecture to support the target domain of real-time mobile computing in the context of unmanned aerial vehicles (UAVs). The overall architectural vision includes support for resilience in the presence of uncertainties in the operational environment of surveillance UAVs. We report measurement-based results that are obtained from a UAV proxy demonstration apparatus. The apparatus consists of a Raspberry Pi (RPi) board that serves as an on-board computer, working with a laptop...

10.1109/iccd.2015.7357189 article EN 2015-10-01