Yichao Zhou

ORCID: 0009-0003-8632-446X
Research Areas
  • Advanced Vision and Imaging
  • Topic Modeling
  • Digital Media Forensic Detection
  • Image and Signal Denoising Methods
  • Image and Object Detection Techniques
  • Face Recognition and Analysis
  • Optical Measurement and Interference Techniques
  • Robotics and Sensor-Based Localization
  • Biomedical Text Mining and Ontologies
  • Generative Adversarial Networks and Image Synthesis
  • Natural Language Processing Techniques
  • Handwritten Text Recognition Techniques
  • Web Data Mining and Analysis
  • 3D Shape Modeling and Analysis
  • Data-Driven Disease Surveillance
  • Advanced Image Processing Techniques
  • Data Quality and Management
  • Anomaly Detection Techniques and Applications
  • Scientific Computing and Data Management
  • Image Retrieval and Classification Techniques
  • Misinformation and Its Impacts
  • Semantic Web and Ontologies
  • 3D Surveying and Cultural Heritage
  • Data Mining Algorithms and Applications
  • Image Enhancement Techniques

Nanjing University of Science and Technology
2015-2024

University of Chicago
2024

Google (United States)
2022-2023

University of California, Los Angeles
2017-2022

University of California, Berkeley
2019-2022

Czech Academy of Sciences, Institute of Computer Science
2018

Courant Institute of Mathematical Sciences
2018

New York University
2018

Tel Aviv University
2018

ETH Zurich
2018

We present a conceptually simple yet effective algorithm to detect wireframes in a given image. Compared to the previous methods, which first predict an intermediate heat map and then extract straight lines with heuristic algorithms, our method is end-to-end trainable and can directly output a vectorized wireframe that contains semantically meaningful and geometrically salient junctions and lines. To better understand the quality of the outputs, we propose a new metric for evaluation that penalizes overlapped line segments...

10.1109/iccv.2019.00105 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

In this work, we introduce the novel problem of identifying dense canonical 3D coordinate frames from a single RGB image. We observe that each pixel in an image corresponds to a surface in the underlying 3D geometry, where a canonical frame can be identified as represented by three orthogonal axes, one along its normal direction and two in its tangent plane. We propose an algorithm to predict these axes from RGB. Our first insight is that these frames can be computed automatically with recently introduced field synthesis methods, which provide training data for the task...

10.1109/iccv.2019.00873 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
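
The frame described in the abstract above (one axis along the surface normal, two in the tangent plane) can be illustrated with a minimal sketch. This is not the paper's method: the helper-axis choice below picks an arbitrary tangent direction, whereas the paper predicts canonical tangent directions from the image. The function name frame_from_normal and the 0.9 threshold are hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(v):
    """Scale a non-zero 3-vector to unit length."""
    length = math.sqrt(dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)


def frame_from_normal(normal):
    """Return (n, t1, t2): a unit normal and two unit tangent axes.

    Assumption: the tangent direction is arbitrary here. It is derived
    from a helper axis chosen only to avoid being parallel to the normal.
    """
    n = normalize(normal)
    # Use the x-axis as helper unless the normal is nearly aligned with it.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    t1 = normalize(cross(helper, n))  # lies in the tangent plane
    t2 = cross(n, t1)                 # completes the orthonormal frame
    return n, t1, t2


n, t1, t2 = frame_from_normal((1.0, 2.0, 3.0))
```

All three returned axes are unit length and mutually orthogonal, which is the only property the sketch is meant to show.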

Extracting event temporal relations is a critical task for information extraction and plays an important role in natural language understanding. Prior systems leverage deep learning and pre-trained models to improve the performance of the task. However, these systems often suffer from two shortcomings: 1) when performing maximum a posteriori (MAP) inference based on neural models, previous systems only used structured knowledge that is assumed to be absolutely correct, i.e., hard constraints; 2) biased predictions dominant...

10.18653/v1/2020.emnlp-main.461 article EN cc-by 2020-01-01

We present a foundation model for zero-shot metric monocular depth estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high-frequency details. The predictions are metric, with absolute scale, without relying on the availability of metadata such as camera intrinsics. And the model is fast, producing a 2.25-megapixel depth map in 0.3 seconds on a standard GPU. These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision...

10.48550/arxiv.2410.02073 preprint EN arXiv (Cornell University) 2024-10-02

We present a simple yet effective end-to-end trainable deep network with geometry-inspired convolutional operators for detecting vanishing points in images. Traditional convolutional neural networks rely on aggregating edge features and do not have mechanisms to directly exploit the geometric properties of vanishing points as the intersections of parallel lines. In this work, we identify a canonical conic space in which a network can effectively compute global information locally, and we propose a novel operator named conic convolution that can be implemented...

10.48550/arxiv.1910.06316 preprint EN other-oa arXiv (Cornell University) 2019-01-01
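
The geometric property mentioned in the abstract above, a vanishing point as the image-plane intersection of lines that are parallel in 3D, can be sketched with homogeneous coordinates. This is a classical two-line construction for illustration only, not the network described in the paper. The function names and the eps tolerance are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the lines supporting two image segments.

    Each segment is a pair of (x, y) endpoints. Returns the intersection
    as (x, y), or None when the image lines are themselves parallel, in
    which case the vanishing point lies at infinity.
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Two segments that converge toward the top of the image,
# like the two rails of a railway track.
left_rail = ((0.0, 0.0), (1.0, 2.0))
right_rail = ((4.0, 0.0), (3.0, 2.0))
vp = vanishing_point(left_rail, right_rail)
```

With real detections, many noisy segments vote for each vanishing point, so a robust estimator would replace this exact two-line intersection.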

The outbreak of the novel coronavirus, COVID-19, has become one of the most severe pandemics in human history. In this paper, we propose to leverage social media users as sensors to simultaneously predict the pandemic trends and suggest potential risk factors for public health experts to understand spread situations and recommend proper interventions. More precisely, we develop deep learning models to recognize important entities and their relations over time, thereby establishing dynamic heterogeneous graphs to describe...

10.1098/rsta.2021.0125 article EN cc-by Philosophical Transactions of the Royal Society A Mathematical Physical and Engineering Sciences 2021-11-22

Being able to infer 3D structures from 2D images with geometric principles, vanishing points have been a well-recognized concept in vision research. It has been widely used in autonomous driving, SLAM, and AR/VR for applications including road direction estimation, camera calibration, and pose estimation. Existing vanishing point detection methods often need to trade off between robustness, precision, and inference speed. In this paper, we introduce VaPiD, a novel neural network-based rapid Vanishing Point Detector that...

10.1109/iccv48922.2021.01262 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

We have developed ACROBAT (Annotation for Case Reports using Open Biomedical Annotation Terms), a typing system for detailed information extraction from clinical text. This resource supports the identification and categorization of entities, events, and relations within clinical text documents, including clinical case reports (CCRs) and the free-text components of electronic health records. Using 200 CCRs, we annotated a wide variety of real-world disease presentations. The resulting dataset, MACCROBAT2018, is a rich...

10.1101/19009118 preprint EN cc-by medRxiv (Cold Spring Harbor Laboratory) 2019-10-22

10.1109/icip51287.2024.10647603 article EN 2024 IEEE International Conference on Image Processing (ICIP) 2024-09-27

There has been a steady need to precisely extract structured knowledge from the web (i.e., HTML documents). Given a web page, extracting an object along with various attributes of interest (e.g., price, publisher, author, and genre for a book) can facilitate a variety of downstream applications such as large-scale knowledge base construction, e-commerce product search, and personalized recommendation. Considering that each web page is rendered from an HTML DOM tree, existing approaches formulate the problem as a DOM tree node tagging task. However, they...

10.48550/arxiv.2101.02415 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Smoke removal is an important and meaningful issue for endoscopic surgery, which can enhance the visual quality of endoscopic images. Because it is practically impossible to construct a large training dataset of pair-matched images with/without smoke, Generative Adversarial Nets (GANs) based models are usually used for image desmoking. But they have difficulties in either locating the accurate smoke area or recovering realistic internal organ tissue details. In this paper, we propose a new approach, called...

10.1109/tcbb.2022.3204673 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022-09-06

We present a conceptually simple yet effective algorithm to detect wireframes in a given image. Compared to the previous methods, which first predict an intermediate heat map and then extract straight lines with heuristic algorithms, our method is end-to-end trainable and can directly output a vectorized wireframe that contains semantically meaningful and geometrically salient junctions and lines. To better understand the quality of the outputs, we propose a new metric for evaluation that penalizes overlapped line segments...

10.48550/arxiv.1905.03246 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Given a web page, extracting an object along with various attributes of interest (e.g., price, publisher, author, and genre for a book) can facilitate a variety of downstream applications such as large-scale knowledge base construction, e-commerce product search, and personalized recommendation. Prior approaches have either relied on computationally expensive visual feature engineering or required large amounts of training data to get acceptable precision. In this paper, we propose a novel method, LeArNing...

10.1145/3488560.3498424 article EN Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining 2022-02-11

We present Recurrent Feature Alignment (ReFA), an end-to-end neural network for the very rapid creation of production-grade face assets from multi-view images. ReFA is on par with industrial pipelines in quality for producing accurate, complete, registered, and textured assets directly applicable to physically-based rendering, but produces the asset end-to-end, fully automatically, at a significantly faster speed of 4.5 FPS, which is unprecedented among neural-based techniques. Our method represents geometry...

10.1145/3550454.3555509 article EN ACM Transactions on Graphics 2022-11-30

Removing noise and other artifacts in the electrocardiogram (ECG) is a critical preprocessing step for further heart disease analysis and diagnosis. In this paper, we propose a sparse representation based ECG signal denoising and baseline wandering (BW) correction algorithm. Unlike traditional filtering-based methods, like the Fourier or Wavelet transform, which use fixed bases, the proposed algorithm models the ECG signal as the superposition of a few inner structures plus additive random noise, while those structures can be learned from...

10.1109/sips.2015.7344997 article EN 2015-10-01

Professional basketball provides an intriguing example of a dynamic spatio-temporal game that incorporates both hidden strategy policies and situational decision making. During a game, the coaches and players are assumed to follow a general game plan, but are also forced to make spur-of-the-moment decisions based on immediate conditions on the court. However, because it is challenging to process heterogeneous signals on the court and the space of potential actions and outcomes is massive, it is hard for players to find an optimal strategy on the fly given a short amount of time to observe...

10.1145/3511808.3557105 article EN Proceedings of the 31st ACM International Conference on Information & Knowledge Management 2022-10-16

COVID-19 has caused lasting damage to almost every domain in public health, society, and economy. To monitor the pandemic trend, existing studies rely on the aggregation of traditional statistical models and epidemic spread theory. In other words, historical statistics of COVID-19, as well as population mobility data, become the essential knowledge for monitoring the pandemic trend. However, these solutions can barely provide precise prediction and satisfactory explanations for long-term disease surveillance while the ubiquitous...

10.1145/3459637.3482222 article EN 2021-10-26

Understanding visually-rich business documents to extract structured data and automate business workflows has been receiving attention both in academia and industry. Although recent multi-modal language models have achieved impressive results, we find that existing benchmarks do not reflect the complexity of real documents seen in industry. In this work, we identify the desiderata for a more comprehensive benchmark and propose one we call Visually Rich Document Understanding (VRDU). VRDU contains two datasets that represent several challenges: rich schema...

10.1145/3580305.3599929 article EN cc-by-sa Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

3D reconstruction from a single RGB image is a challenging problem in computer vision. Previous methods are usually solely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability. In this work, we focus on object-level 3D reconstruction and present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondence of the symmetry. Our method...

10.48550/arxiv.2006.10042 preprint EN other-oa arXiv (Cornell University) 2020-01-01