NFDI4DS | UHH-SEMS - Publication Details

Chung-Ching Lin

ORCID: 0000-0003-3296-9062

About

Contact & Profiles

A5065235694

Research Areas

Radio Frequency Integrated Circuit Design
Human Pose and Action Recognition
Multimodal Machine Learning Applications
Advanced Vision and Imaging
Microwave Engineering and Waveguides
Video Surveillance and Tracking Methods
Millimeter-Wave Propagation and Modeling
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Advancements in PLL and VCO Technologies
Analog and Mixed-Signal Circuit Design
Advanced Power Amplifier Design
Robotics and Sensor-Based Localization
Advanced Neural Network Applications
Antenna Design and Optimization
Image Processing Techniques and Applications
Generative Adversarial Networks and Image Synthesis
Integrated Circuits and Semiconductor Failure Analysis
Natural Language Processing Techniques
Semiconductor materials and devices
Antenna Design and Analysis
Energy Harvesting in Wireless Networks
3D IC and TSV technologies
Topic Modeling
Anomaly Detection Techniques and Applications

Washington State University
2019-2025

Microsoft Research (United Kingdom)
2021-2024

Film Independent
2024

IBM Research - Thomas J. Watson Research Center
2013-2021

IBM (United States)
2014-2020

University of California, Los Angeles
2020

Bioengineering Center
2018

Southern Methodist University
2017-2018

University of Illinois Urbana-Champaign
2016

Yuan Ze University
2014

Adaptive as-natural-as-possible image stitching

OPENALEX - Publications

Chung-Ching Lin, Sharathchandra Pankanti, Karthikeyan Natesan Ramamurthy, Aleksandr Y. Aravkin

The goal of image stitching is to create natural-looking mosaics free of artifacts that may occur due to relative camera motion, illumination changes, and optical aberrations. In this paper, we propose a novel stitching method that uses a smooth stitching field over the entire target image, while accounting for all the local transformation variations. Computing the warp is fully automated and uses a combination of local homography and global similarity transformations, both of which are estimated with respect to the target. We mitigate the perspective distortion in...

10.1109/cvpr.2015.7298719 article EN 2015-06-01

SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning

Kevin Lin, Linjie Li, Chung-Ching Lin, Faisal Ahmed, Zhe Gan, and 3 more

The canonical approach to video captioning dictates a caption generation model to learn from offline-extracted dense video features. These feature extractors usually operate on video frames sampled at a fixed frame rate and are often trained on image/video understanding tasks, without adaption to video captioning data. In this work, we present SwinBERT, an end-to-end transformer-based model for video captioning, which takes video frame patches directly as inputs, and outputs a natural language description. Instead of leveraging multiple 2D/3D feature extractors, our...

10.1109/cvpr52688.2022.01742 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)

Zhengyuan Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Chung-Ching Lin, and 2 more

Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory skills, such as visual understanding, to achieve stronger generic intelligence. In this paper, we analyze the latest model, GPT-4V(ision), to deepen the understanding of LMMs. The analysis focuses on the intriguing tasks that GPT-4V can perform, containing test samples to probe the quality and genericity of GPT-4V's capabilities, its supported inputs and working modes, and the effective ways to prompt the model. In our approach to exploring GPT-4V, we curate...

10.48550/arxiv.2309.17421 preprint EN other-oa arXiv (Cornell University) 2023-01-01

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling

Linjie Li, Zhe Gan, Kevin Lin, Chung-Ching Lin, Zicheng Liu, and 2 more

Unified vision-language frameworks have greatly advanced in recent years, most of which adopt an encoder-decoder architecture to unify image-text tasks as sequence-to-sequence generation. However, existing video-language (VidL) models still require task-specific designs in model architecture and training objectives for each task. In this work, we explore a unified VidL framework LAVENDER, where Masked Language Modeling [13] (MLM) is used as the common interface for all pre-training and downstream tasks. Such...

10.1109/cvpr52729.2023.02214 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Disco: Disentangled Control for Realistic Human Dance Generation

Tan Wang, Linjie Li, Kevin Lin, Yuanhao Zhai, Chung-Ching Lin, and 4 more

10.1109/cvpr52733.2024.00891 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Video Instance Segmentation Tracking With a Modified VAE Architecture

Chung-Ching Lin, Ying Hung, Rogério Feris, Linglin He

We propose a modified variational autoencoder (VAE) architecture built on top of Mask R-CNN for instance-level video segmentation and tracking. The method builds a shared encoder and three parallel decoders, yielding three disjoint branches for predictions of future frames, object detection boxes, and instance segmentation masks. To effectively solve multiple learning tasks, we introduce a Gaussian Process model to enhance the statistical representation of VAE by relaxing the prior strong independent and identically distributed (iid)...

10.1109/cvpr42600.2020.01316 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Fast Beam Training With True-Time-Delay Arrays in Wideband Millimeter-Wave Systems

Veljko Boljanovic, Yan Han, Chung-Ching Lin, Soumen Mohapatra, Deukhyoun Heo, and 2 more

The best beam steering directions are estimated through beam training, which is one of the most important and challenging tasks in millimeter-wave and sub-terahertz communications. Novel array architectures and signal processing techniques are required to avoid prohibitive beam training overhead associated with large antenna arrays and narrow beams. In this work, we leverage recent developments in true-time-delay (TTD) arrays with large delay-bandwidth products to accelerate beam training using frequency-dependent probing beams. We propose and study two TTD...

10.1109/tcsi.2021.3054428 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2021-02-08

Crossmodal Representation Learning for Zero-shot Action Recognition

Chung-Ching Lin, Kevin Lin, Lijuan Wang, Zicheng Liu, Linjie Li

We present a cross-modal Transformer-based framework, which jointly encodes video data and text labels for zero-shot action recognition (ZSAR). Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner. The model design provides a natural mechanism for visual and semantic representations to be learned in a shared knowledge space, whereby it encourages the learned visual embedding to be discriminative and more semantically consistent. In zero-shot inference, we devise a simple semantic transfer...

10.1109/cvpr52688.2022.01935 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning

Chaoyi Zhang, Kevin Lin, Zhengyuan Yang, Jianfeng Wang, Linjie Li, and 3 more

10.1109/cvpr52733.2024.01295 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

A TTD-Based Fast Precise Localization Enabled by Passive–Active Signal Combiner With Negative-Capacitance Stabilized RAMP

Qiuyan Xu, Aditya Wadaskar, Foad Beheshti, Chung-Ching Lin, Huan Hu, and 2 more

10.1109/jssc.2025.3546958 article EN IEEE Journal of Solid-State Circuits 2025-01-01

SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning

Kevin Lin, Linjie Li, Chung-Ching Lin, S. Faisal Ahmed, Zhe Gan, and 3 more

10.48550/arxiv.2111.13196 preprint EN other-oa arXiv (Cornell University) 2021-01-01

A Prior-Less Method for Multi-face Tracking in Unconstrained Videos

Chung-Ching Lin, Ying Hung

This paper presents a prior-less method for tracking and clustering an unknown number of human faces and maintaining their individual identities in unconstrained videos. The key challenge is to accurately track faces with partial occlusion and drastic appearance changes in multiple shots resulting from significant variations of makeup, facial expression, head pose and illumination. To address this challenge, we propose a new multi-face tracking and re-identification algorithm, which provides high accuracy in face association in the...

10.1109/cvpr.2018.00063 article EN 2018-06-01

Multi-Mode Spatial Signal Processor With Rainbow-Like Fast Beam Training and Wideband Communications Using True-Time-Delay Arrays

Chung-Ching Lin, Chase Puglisi, Veljko Boljanovic, Yan Han, Erfan Ghaderi, and 5 more

Initial access in millimeter-wave (mmW) wireless is critical toward successful realization of the fifth-generation (5G) wireless networks and beyond. Limited bandwidth in existing standards and the use of phase-shifters in analog/hybrid phased-antenna arrays (PAAs) are not suited for these emerging standards demanding low-latency direction finding. This work proposes a reconfigurable true-time-delay (TTD)-based spatial signal processor (SSP) with frequency-division beam training methodology and wideband beam-squint less data...

10.1109/jssc.2022.3178798 article EN publisher-specific-oa IEEE Journal of Solid-State Circuits 2022-06-08

A 4-Element 800MHz-BW 29mW True-Time-Delay Spatial Signal Processor Enabling Fast Beam-Training with Data Communications

Chung-Ching Lin, Chase Puglisi, Erfan Ghaderi, Soumen Mohapatra, Deukhyoun Heo, and 4 more

Spatial signal processors (SSP) for emerging millimeter-wave wireless networks are critically dependent on link discovery. To avoid loss in communication, mobile devices need to locate the narrow directional beams with millisecond latency. In this work, we demonstrate a true-time-delay (TTD) array with digitally reconfigurable delay elements enabling both fast beam-training at the receiver with wideband data communications. In beam-training mode, large delay-bandwidth products are implemented to accelerate beam training using...

10.1109/esscirc53450.2021.9567822 article EN ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC) 2021-09-13

Low-Power Process and Temperature-Invariant Constant Slope-and-Swing Ramp-Based Phase Interpolator

Soumen Mohapatra, Chung-Ching Lin, Subhanshu Gupta, Deukhyoun Heo

This article presents a process- and temperature-invariant high-resolution highly linear low-power phase interpolator (PI) as an enabler for discrete-time spatial signal processors (SSPs) and various mixed-mode RF transceiver architectures. Using current integration techniques, the proposed PI generates an adaptable constant slope-and-swing ramp to achieve significantly lower power suited for multiple antenna elements. Switched-capacitor-based bias generation enables tracking of the ramp generator over process,...

10.1109/jssc.2023.3242935 article EN IEEE Journal of Solid-State Circuits 2023-02-14

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Kevin Lin, S. Faisal Ahmed, Linjie Li, Chung-Ching Lin, Ehsan Azarnasab, and 7 more

We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V, combined with specialized tools in vision, audio, and speech, to facilitate advanced video understanding. MM-VID is designed to address the challenges posed by long-form videos and intricate tasks such as reasoning within hour-long content and grasping storylines spanning multiple episodes. MM-VID uses a video-to-script generation with GPT-4V to transcribe multimodal elements into a long textual script. The generated script details character...

10.48550/arxiv.2310.19773 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Equivariant Similarity for Vision-Language Foundation Models

Tan Wang, Kevin Lin, Linjie Li, Chung-Ching Lin, Zhengyuan Yang, and 3 more

This study explores the concept of equivariance in vision-language foundation models (VLMs), focusing specifically on the multimodal similarity function that is not only the major training objective but also the core delivery to support downstream tasks. Unlike the existing image-text similarity objective which only categorizes matched pairs as similar and unmatched pairs as dissimilar, equivariance also requires similarity to vary faithfully according to the semantic changes. This allows VLMs to generalize better to nuanced and unseen multimodal compositions. However, modeling equivariance is challenging as the ground truth...

10.1109/iccv51070.2023.01102 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Wideband Beamforming With Rainbow Beam Training Using Reconfigurable True-Time-Delay Arrays for Millimeter-Wave Wireless [Feature]

Chung-Ching Lin, Veljko Boljanovic, Yan Han, Erfan Ghaderi, Mohammad Ali Mokri, and 9 more

The decadal research in integrated true-time-delay arrays has seen organic growth, enabling realization of wideband beamformers for large arrays with wide aperture widths. This article introduces highly reconfigurable delay elements implementable at analog or digital baseband that enable multiple Spatial Signal Processing (SSP) functions including wideband beamforming, wideband interference cancellation, and fast beam training. Details of the beam-training algorithm, system design considerations, system architecture and circuits...

10.1109/mcas.2022.3214408 article EN publisher-specific-oa IEEE Circuits and Systems Magazine 2022-01-01

Detecting Moving Objects Using a Camera on a Moving Platform

Chung-Ching Lin, Marilyn Wolf

This paper proposes a new ego-motion estimation and background/foreground classification method to effectively segment moving objects from videos captured by a moving camera on a moving platform. Existing methods for moving-camera detection impose serious constraints. In our approach, an ellipsoid scene shape is applied in the motion model and a complicated ego-motion estimation formula is derived. A genetic algorithm is introduced to accurately solve the ego-motion parameters. After ego-motion recovery, the noisy result is refined by motion vector correlation and the foreground is classified by pixel...

10.1109/icpr.2010.121 article EN 2010-08-01

Distributed Bundle Adjustment

Karthikeyan Natesan Ramamurthy, Chung-Ching Lin, Aleksandr Y. Aravkin, Sharath Pankanti, Raphael Viguier

Most methods for Bundle Adjustment (BA) in computer vision are either centralized or operate incrementally. This leads to poor scaling and affects the quality of solution as the number of images grows in large scale structure from motion (SfM). Furthermore, they cannot be used in scenarios where image acquisition and processing must be distributed. We address this problem with a new distributed BA algorithm. Our distributed formulation uses alternating direction method of multipliers (ADMM), and, since each processor sees...

10.1109/iccvw.2017.251 article EN 2017-10-01

Resilient, UAV-embedded real-time computing

Augusto Vega, Chung-Ching Lin, Karthik Swaminathan, Alper Buyuktosunoglu, Sharathchandra Pankanti, and 1 more

In this paper, we propose a hierarchical computational system architecture to support the target domain of realtime mobile computing in the context of unmanned aerial vehicles (UAVs). The overall architectural vision includes support for resilience in the presence of uncertainties in the operational environment of surveillance UAVs. We report measurement-based results that are obtained from a UAV proxy demonstration apparatus. The apparatus consists of a Raspberry Pi (RPi) board that serves as an on-board computer, working in conjunction with a laptop...

10.1109/iccd.2015.7357189 article EN 2015-10-01
