- Topic Modeling
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Natural Language Processing Techniques
- Generative Adversarial Networks and Image Synthesis
- Advanced Neural Network Applications
- Human Pose and Action Recognition
- Glaucoma and retinal disorders
- Robot Manipulation and Learning
- Video Surveillance and Tracking Methods
- Retinal Diseases and Treatments
- Image Retrieval and Classification Techniques
- COVID-19 diagnosis using AI
- Human Motion and Animation
- Robotic Path Planning Algorithms
- Fungal Infections and Studies
- Explainable Artificial Intelligence (XAI)
- Connective Tissue Growth Factor Research
- 3D Shape Modeling and Analysis
- Anomaly Detection Techniques and Applications
- Brain Tumor Detection and Classification
- Cancer-related molecular mechanisms research
- Reinforcement Learning in Robotics
- Adversarial Robustness in Machine Learning
- Handwritten Text Recognition Techniques
The University of Tokyo
2024
Beijing University of Posts and Telecommunications
2024
The University of Sydney
2020-2023
China Earthquake Administration
2023
Tianjin University of Technology
2022
Civil Aviation University of China
2022
Huawei Technologies (China)
2020
National University of Singapore
2020
Shanghai First People's Hospital
2016-2018
Shanghai Jiao Tong University
2016-2018
Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during...
We introduce Texygen, a benchmarking platform to support research on open-domain text generation models. Texygen has not only implemented a majority of text generation models, but also covered a set of metrics that evaluate the diversity, quality and consistency of the generated texts. The platform could help standardize research on text generation and improve the reproducibility and reliability of future work in text generation.
Large language models (LLMs) have demonstrated excellent zero-shot generalization to new tasks. However, effective utilization of LLMs for zero-shot visual question-answering (VQA) remains challenging, primarily due to the modality disconnect and the task disconnect between the LLM and VQA tasks. End-to-end training on multimodal data may bridge the disconnects, but is inflexible and computationally expensive. To address this issue, we propose Img2LLM, a plug-and-play module that provides LLM prompts to enable LLMs to perform zero-shot VQA tasks without...
We introduce Texygen, a benchmarking platform to support research on open-domain text generation models. Texygen has not only implemented a majority of text generation models, but also covered a set of metrics that evaluate the diversity, quality and consistency of the generated texts. The platform could help standardize research on text generation and facilitate the sharing of fine-tuned open-source implementations among researchers for their work. As a consequence, this would help in improving the reproducibility and reliability of future work in text generation.
Purpose: The aim of this study was to evaluate the repeatability and reproducibility of foveal avascular zone (FAZ) area measurements using AngioPlex spectral domain optical coherence tomography (OCT) angiography in normal subjects. Methods: Twenty-two healthy subjects (25 eyes) underwent FAZ measurement with OCT angiography. Each volunteer was separately examined 3 consecutive times by 2 experienced observers. The FAZ area was measured with ImageJ software....
To evaluate the reliability of vessel density measurements in the peripapillary retina using optical coherence tomography angiography (OCT-A) and to analyze the correlation with retinal nerve fiber layer (RNFL) thickness in healthy subjects. Thirty-five healthy volunteers were recruited for the study. The optic disc region was scanned three times using spectral-domain OCT (SD-OCT) and split-spectrum amplitude decorrelation angiography by two skilled examiners. Vessel density was automatically calculated by the software of the RTVue-XR (version 2015.1.1.98). RNFL on...
Unsupervised image-to-image (I2I) translation aims to learn a domain mapping function that can preserve the semantics of the input images without paired data. However, because the underlying semantics distributions in the source and target domains are often mismatched, current distribution matching-based methods may distort the semantics when matching distributions, resulting in inconsistency between the input and translated images, which is known as the semantics distortion problem. In this paper, we focus on low-level I2I translation, where the structure...
Text-to-image generative models have shown a remarkable ability to produce high-quality images. However, existing methods still face difficulties in exemplar-guided image editing without destroying the given objects' identity in the exemplar image. To address this problem, we propose a new framework called Paste and Harmonize via Denoising, which leverages pre-trained diffusion models to facilitate text-driven transfer of objects from an exemplar image to the edited image while preserving their appearance characteristics. The framework consists...
Recent studies have increasingly demonstrated that large language models (LLMs) possess significant theory of mind (ToM) capabilities, showing the potential for simulating the tracking of mental states in generative agents. In this study, we propose a novel paradigm called ToM-agent, designed to empower LLM-based generative agents to simulate ToM in open-domain conversational interactions. ToM-agent disentangles confidence from mental states, facilitating the emulation of an agent's perception of its counterpart's mental states, such as beliefs,...
Large language models (LLMs) have demonstrated excellent zero-shot generalization to new language tasks. However, effective utilization of LLMs for zero-shot visual question-answering (VQA) remains challenging, primarily due to the modality disconnection and task disconnection between the LLM and the VQA task. End-to-end training on vision and language data may bridge the disconnections, but is inflexible and computationally expensive. To address this issue, we propose Img2Prompt, a plug-and-play module that provides prompts that can bridge the aforementioned disconnections, so...
Synthesizing novel view images from a few views is a challenging but practical problem. Existing methods often struggle with producing high-quality results or necessitate per-object optimization in such few-view settings due to the insufficient information provided. In this work, we explore leveraging the strong 2D priors in pre-trained diffusion models for synthesizing novel view images. 2D diffusion models, nevertheless, lack 3D awareness, leading to distorted image synthesis and compromising the identity. To address these...
The present study was designed to evaluate the effects of doxazosin on experimental choroidal neovascularization (CNV) in mice. Six- to 8-week-old male C57BL/6 mice were divided into a control group and a doxazosin-treated group (5 mg/kg, i.p., daily). Experimental CNV was induced by laser photocoagulation. Seven and 14 days after CNV induction, fluorescein angiography, choroidal flat mounts, and histological studies were performed to evaluate fluorescence leakage, CNV area, and the thickness of CNV lesions, respectively. In addition, western blot analysis was carried...
Zero-shot human-AI coordination holds the promise of collaborating with humans without human data. Prevailing methods try to train the ego agent with a population of partners via self-play. However, these methods suffer from two problems: 1) The diversity of a population with finite partners is limited, thereby limiting the capacity of the trained ego agent to collaborate with a novel human; 2) Current methods only provide a common best response for every partner in the population, which may result in poor zero-shot coordination performance with a novel partner or humans. To address these issues, we first propose the policy...
Unlike perfect information games, where all elements are known to every player, imperfect information games emulate the real-world complexities of decision-making under uncertain or incomplete information. GPT-4, the recent breakthrough in large language models (LLMs) trained on massive passive data, is notable for its knowledge retrieval and reasoning abilities. This paper delves into the applicability of GPT-4's learned knowledge for imperfect information games. To achieve this, we introduce Suspicion-Agent, an innovative agent that...
Moving object recognition plays an important role in camera-only active safety systems and intelligent autonomous vehicles. For these applications, reliable detection performance is required; however, pedestrian detection is challenging due to the variety of pedestrians' clothing and actions. Besides, real-time performance is also critical. This paper aims to optimize detection performance by combining both temporal-domain and spatial-domain methods. Accordingly, we first use the Background Subtraction (BS) technique to detect moving objects. Then, AdaBoost...
Network quantization is an effective method for the deployment of neural networks on memory- and energy-constrained mobile devices. In this paper, we propose a Dynamic Network Quantization (DNQ) framework which is composed of two modules: a bit-width controller and a quantizer. Unlike most existing quantization methods that use a universal quantization bit-width for the whole network, we utilize policy gradient to train an agent to learn the bit-width of each layer by the bit-width controller. This controller can make a trade-off between accuracy and compression ratio. Given the quantization bit-width sequence, the quantizer adopts the quantization distance as...
Selecting the number of slices is a key step in implementing the sliced average variance estimation (SAVE) method. To our knowledge, there is no widely accepted method for it in practical applications, and an incorrect number of slices may lead to inaccurate conclusions. In the traditional multivariate sufficient dimension reduction procedure, one usually adopts the fused approach, which combines the kernel operators of SAVE with various numbers of slices, to solve this problem. Due to the infinite dimension of functional data, this approach cannot be directly applied to functional SAVE (FSAVE). Hence...
With the further development of China's society and economy, modern people are no longer satisfied with adequate food and clothing alone; spiritual satisfaction has become an indispensable part of people's busy lives, and potted plants have become a necessity for some people. In a developed society, the pace of life is getting faster and faster, and it is easy to forget watering. People go out frequently, and often no one in the family is home to water the plants for several days, so watering is a big problem. In order to solve this problem, a set of automatic irrigation control...
Minghan Wang, Hao Yang, Yao Deng, Ying Qin, Lizhi Lei, Daimeng Wei, Hengchao Shang, Ning Xie, Xiaochun Li, Jiaxian Guo. Proceedings of the 17th International Conference on Spoken Language Translation. 2020.