Runtao Liu

ORCID: 0000-0002-7260-4060
Research Areas
  • Data Management and Algorithms
  • Advanced Vision and Imaging
  • Welding Techniques and Residual Stresses
  • Generative Adversarial Networks and Image Synthesis
  • Multimodal Machine Learning Applications
  • Simulation and Modeling Applications
  • Advanced Database Systems and Queries
  • Handwritten Text Recognition Techniques
  • Non-Destructive Testing Techniques
  • Metal and Thin Film Mechanics
  • Advanced Surface Polishing Techniques
  • Computational Geometry and Mesh Generation
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Image and Video Retrieval Techniques
  • Robotic Path Planning Algorithms
  • Silicon and Solar Cell Technologies
  • Geographic Information Systems Studies
  • Advanced Numerical Analysis Techniques
  • Domain Adaptation and Few-Shot Learning
  • Data Mining Algorithms and Applications
  • Optimization and Packing Problems
  • Thermal Expansion and Ionic Conductivity
  • Advanced Computational Techniques and Applications
  • Tunneling and Rock Mechanics

Shanxi Provincial Children's Hospital
2017-2024

Sun Yat-sen University
2022-2024

Hong Kong University of Science and Technology
2024

University of Hong Kong
2024

Dalian University of Technology
2021-2024

University of California, Berkeley
2019-2023

Johns Hopkins University
2023

International Computer Science Institute
2019-2023

Shandong University
2021-2022

Peking University
2017-2019

Referring object detection and referring image segmentation are important tasks that require joint understanding of visual information and natural language. Yet there has been evidence that current benchmark datasets suffer from bias, and state-of-the-art models cannot be easily evaluated on their intermediate reasoning process. To address these issues and complement similar efforts in visual question answering, we build CLEVR-Ref+, a synthetic diagnostic dataset for referring expression comprehension. The precise locations...

10.1109/cvpr.2019.00431 preprint EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Object detection in natural scenes has been widely researched in the past decade, and many deep learning based methods have achieved good performance on this task. This paper focuses on how to transfer and refine those object detection approaches from natural scene images to document images, and proposes a deep learning-based page object (e.g., tables, formulae, figures) detection method. On the basis of the traditional Convolutional Neural Network...

10.1109/icdar.2017.46 article EN 2017-11-01

This paper presents ThinkDiff, a novel alignment paradigm that empowers text-to-image diffusion models with multimodal in-context understanding and reasoning capabilities by integrating the strengths of vision-language models (VLMs). Existing finetuning methods largely focus on pixel-level reconstruction rather than reasoning, and are constrained by the complexity and limited availability of reasoning-based datasets. ThinkDiff addresses these challenges by leveraging vision-language training as a proxy task, aligning VLMs with the decoder of an...

10.48550/arxiv.2502.10458 preprint EN arXiv (Cornell University) 2025-02-12

Referring object detection and referring image segmentation are important tasks that require joint understanding of visual information and natural language. Yet there has been evidence that current benchmark datasets suffer from bias, and state-of-the-art models cannot be easily evaluated on their intermediate reasoning process. To address these issues and complement similar efforts in visual question answering, we build CLEVR-Ref+, a synthetic diagnostic dataset for referring expression comprehension. The precise locations...

10.48550/arxiv.1901.00850 preprint EN other-oa arXiv (Cornell University) 2019-01-01

10.1007/s00170-022-09131-1 article EN The International Journal of Advanced Manufacturing Technology 2022-04-19

Multimodal Large Language Models (MLLMs) excel in generating responses based on visual inputs. However, they often suffer from a bias towards generating responses similar to their pretraining corpus, overshadowing the importance of visual information. We treat this bias as a "preference" for pretraining statistics, which hinders the model's grounding in the visual input. To mitigate this issue, we propose Bootstrapped Preference Optimization (BPO), which conducts preference learning with datasets containing negative responses bootstrapped from the model itself. Specifically, following...

10.48550/arxiv.2403.08730 preprint EN arXiv (Cornell University) 2024-03-13

Metal–organic frameworks (MOFs) are promising materials for quasi-solid-state electrolytes as a result of their tunable crystal structure and ion-selective capabilities. However, the rational design of MOF-based quasi-solid-state electrolytes for high-energy lithium batteries still requires improvement. In this study, MOF-based quasi-solid-state electrolytes (MQSSEs) were synthesized using various MOFs, and the effects of different metal active sites and ligand groups on electrochemical performance were systematically investigated. The results indicate that have more significant impact ion...

10.1021/acs.energyfuels.4c01821 article EN Energy & Fuels 2024-06-04

Pseudopolyrotaxane was obtained through the grinding of a mixture of O-trimethyl-α-cyclodextrin and polytetrahydrofuran in a mortar by solvent-free synthesis, and it was fixed to a stable polyrotaxane by a successive end-capping reaction with a bulky isocyanate in the solid state in a mortar. Higher molecular weight polytetrahydrofurans (Mn > 1000) successfully produced the corresponding polyrotaxanes in moderate yields and coverage ratios. O-Trimethyl-β-cyclodextrin and poly(ethylene glycol) also formed pseudopolyrotaxanes by this method.

10.1002/pola.21913 article EN Journal of Polymer Science Part A Polymer Chemistry 2007-03-08

Background. Vitamin D can play a vital role in autoimmune diseases. Epidemiologic evidence demonstrates that vitamin D deficiency exists in adult patients with vitiligo. Objectives. To investigate 25-hydroxyvitamin D (25(OH)D) levels in children with vitiligo and explore possible relevant factors. Methods. A total of 114 patients and 100 controls were included in our case-control study. We analyzed the required data collected by questionnaire and examination to reveal the correlation with 25(OH)D levels. Results. The mean serum 43.62 ±...

10.1177/0009922817734362 article EN Clinical Pediatrics 2017-10-09

10.1007/s00170-022-08706-2 article EN The International Journal of Advanced Manufacturing Technology 2022-01-21

With more and more scientific documents becoming available in PDF format, recognition of formulae in these documents is of great significance. In this paper, we propose a symbol dominance based approach to recovering formula structures by using the rich information extracted directly from PDF files. The hierarchical structure of a formula is represented by a relationship tree, and the tree is built recursively based on symbol dominance, which considers both the spatial layout of symbols and the typesetting conventions of mathematics. In addition, special character method identify...

10.1109/icdar.2017.189 article EN 2017-11-01

This paper proposes the first GAN inversion-based method for multi-class sketch-based image generation (MC-SBIG). MC-SBIG is a challenging task that requires strong prior knowledge due to the significant domain gap between sketches and natural images. Existing learning-based approaches rely on a large-scale paired dataset to learn the mapping between these two modalities. However, since public paired sketch-photo data are scarce, learning-based methods struggle to achieve satisfactory results. In this work, we introduce a new approach...

10.1109/wacv56688.2023.00430 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

Optimal axial pressure facilitates the stabilization of interphasial chemistry and enhances Zn²⁺ transport at the electrode–electrolyte interface, thereby guiding uniform and dense deposition in aqueous batteries.

10.1039/d3ta07523k article EN Journal of Materials Chemistry A 2024-01-01

With the ability to generate high-quality images, text-to-image (T2I) models can be exploited for creating inappropriate content. To prevent misuse, existing safety measures are either based on text blacklists, which can be easily circumvented, or on harmful content classification, requiring large datasets for training and offering low flexibility. Hence, we propose Latent Guard, a framework designed to improve safety measures in text-to-image generation. Inspired by blacklist-based approaches, Latent Guard learns a latent space on top of the T2I...

10.48550/arxiv.2404.08031 preprint EN arXiv (Cornell University) 2024-04-11

With the recent advancement in large language models (LLMs), there is a growing interest in combining LLMs with multimodal learning. Previous surveys of multimodal large language models (MLLMs) mainly focus on understanding. This survey elaborates on multimodal generation across different domains, including image, video, 3D, and audio, where we highlight notable advancements and milestone works in these fields. Specifically, we exhaustively investigate the key technical components behind the methods and the datasets utilized in these studies. Moreover, we dig into...

10.48550/arxiv.2405.19334 preprint EN arXiv (Cornell University) 2024-05-29