- Visual Attention and Saliency Detection
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Face Recognition and Perception
- Olfactory and Sensory Function Studies
- Image and Video Quality Assessment
- Adversarial Robustness in Machine Learning
- Visual perception and processing mechanisms
- Human Pose and Action Recognition
- Generative Adversarial Networks and Image Synthesis
- Video Surveillance and Tracking Methods
- Domain Adaptation and Few-Shot Learning
- Video Analysis and Summarization
- Gaze Tracking and Assistive Technology
- Multimodal Machine Learning Applications
- Anomaly Detection Techniques and Applications
- Neural dynamics and brain function
- Digital Media Forensic Detection
- Advanced Vision and Imaging
- Advanced Image Fusion Techniques
- Multisensory perception and integration
- Aesthetic Perception and Analysis
- Music and Audio Processing
- Evolutionary Algorithms and Applications
- Topic Modeling
University of Southern California
2011-2023
Southern California University for Professional Studies
2011-2023
Abeam Technologies (United States)
2019-2022
Microsoft Research (United Kingdom)
2022
HCL Technologies (India)
2020-2021
University of Central Florida
2015-2020
Aalto University
2019
Nankai University
2018
Centro de Investigación en Red en Enfermedades Cardiovasculares
2017
Florida Southern College
2016-2017
Foreground map evaluation is crucial for gauging the progress of object segmentation algorithms, in particular in the field of salient object detection, where the purpose is to accurately detect and segment the most salient object in a scene. Several widely-used measures such as Area Under the Curve (AUC), Average Precision (AP), and the recently proposed weighted F-measure (Fbw) have been used to evaluate the similarity between a non-binary saliency map (SM) and a ground-truth (GT) map. These measures are based on pixel-wise errors and often ignore structural similarities. Behavioral vision...
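The point about pixel-wise measures ignoring structure can be illustrated with a minimal sketch. It assumes two common pixel-wise scores, mean absolute error and a thresholded F-measure with the conventional weight beta^2 = 0.3; the function names, the threshold, and the toy maps are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch only: names, threshold, and toy data are assumptions.

def mae(sm, gt):
    """Mean absolute error between a saliency map and a binary ground truth."""
    total = sum(abs(s - g) for rs, rg in zip(sm, gt) for s, g in zip(rs, rg))
    count = sum(len(row) for row in gt)
    return total / count


def f_measure(sm, gt, threshold=0.5, beta2=0.3):
    """F-measure of the thresholded map; beta2 = 0.3 is an assumed convention."""
    tp = fp = fn = 0
    for rs, rg in zip(sm, gt):
        for s, g in zip(rs, rg):
            pred = s >= threshold
            if pred and g:
                tp += 1
            elif pred and not g:
                fp += 1
            elif not pred and g:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


# Ground truth: a 2x2 object in the middle of a 4x4 image.
gt = [[0, 0, 0, 0],
      [0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]

# map_a has one false positive touching the object (structure mostly intact).
map_a = [[0, 0, 0, 0],
         [0, 1, 1, 1],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]

# map_b has one false positive isolated in a far corner.
map_b = [[1, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]

print(mae(map_a, gt), mae(map_b, gt))  # 0.0625 0.0625
print(f_measure(map_a, gt) == f_measure(map_b, gt))  # True
```

Both maps contain exactly one wrong pixel, so the two pixel-wise scores cannot tell them apart even though the error sits in structurally different places.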
Recent progress on saliency detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and saliency detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still a large room for improvement over generic FCN models that do not explicitly deal with the scale-space problem. The Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In...
The existing binary foreground map (FM) measures address various types of errors in either pixel-wise or structural ways. These measures consider pixel-level match or image-level information independently, while cognitive vision studies have shown that human vision is highly sensitive to both global information and local details in scenes. In this paper, we take a detailed look at current binary FM evaluation measures and propose a novel and effective E-measure (Enhanced-alignment measure). Our measure combines pixel values with the mean value in one...
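One way to combine local pixel values with an image-level mean in a single term can be sketched as follows. This is a hedged illustration, not the authors' reference code: it assumes that each map is centered by subtracting its own mean, that the two centered maps are compared per pixel with a normalized alignment term, and that the result is mapped to [0, 1] with a quadratic function. The function name `e_measure` and the epsilon guard are assumptions, and degenerate ground truths (all foreground or all background) are not treated specially here.

```python
# Illustrative sketch only: the exact form below is an assumption about how
# pixel values and the image-level mean can be combined in one term.

def e_measure(fm, gt, eps=1e-12):
    """Enhanced-alignment style score between a binary map and a ground truth."""
    flat_fm = [float(v) for row in fm for v in row]
    flat_gt = [float(v) for row in gt for v in row]
    n = len(flat_gt)

    # Image-level statistics: the global mean of each map.
    mu_fm = sum(flat_fm) / n
    mu_gt = sum(flat_gt) / n

    total = 0.0
    for f, g in zip(flat_fm, flat_gt):
        # Pixel-level values, centered by the global mean (bias terms).
        bf = f - mu_fm
        bg = g - mu_gt
        # Alignment in [-1, 1]: +1 when the bias terms agree, -1 when opposed.
        align = 2.0 * bf * bg / (bf * bf + bg * bg + eps)
        # Quadratic mapping of the alignment to [0, 1].
        total += (align + 1.0) ** 2 / 4.0
    return total / n


gt = [[0, 1],
      [1, 0]]
inverted = [[1, 0],
            [0, 1]]
partial = [[0, 1],
           [0, 0]]

print(e_measure(gt, gt))        # close to 1 for a perfect match
print(e_measure(inverted, gt))  # close to 0 for a fully inverted map
print(e_measure(partial, gt))   # strictly between 0 and 1
```

Because every pixel is compared relative to its map's mean, a change in the global foreground proportion shifts every per-pixel term, which is how image-level information enters a pixel-wise sum.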
We extensively compare, qualitatively and quantitatively, 40 state-of-the-art models (28 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over 6 challenging datasets for the purpose of benchmarking salient object detection and segmentation methods. From the results obtained so far, our evaluation shows consistent rapid progress over the last few years in terms of both accuracy and running time. The top contenders in this benchmark significantly outperform the models identified as the best in the previous benchmark conducted just two years ago....
Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision. While many models have been proposed and several applications have emerged, a deep understanding of achievements and issues remains lacking. We aim to provide a comprehensive review of recent progress in salient object detection and situate this field among other closely related areas such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. Covering 228...
Recent progress on saliency detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and saliency detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still a large room for improvement over generic FCN models that do not explicitly deal with the scale-space problem. The Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on salience...
Visual attention is a process that enables biological and machine vision systems to select the most relevant regions from a scene. Relevance is determined by two components: 1) top-down factors driven by task and 2) bottom-up factors that highlight image regions that are different from their surroundings. The latter are often referred to as "visual saliency." Modeling visual saliency has been the subject of numerous research efforts during the past 20 years, with many successful applications in computer vision and robotics. Available models have been tested...
This paper presents a new method for detecting salient objects in images using convolutional neural networks (CNNs). The proposed network, named PAGE-Net, offers two key contributions. The first is the exploitation of an essential pyramid attention structure for salient object detection. This enables the network to concentrate more on salient regions while considering multi-scale saliency information. Such a stacked attention design provides a powerful tool to efficiently improve the representation ability of the corresponding layer with an enlarged...
Effective integration of contextual information is crucial for salient object detection. To achieve this, most existing methods based on the 'skip' architecture mainly focus on how to integrate hierarchical features of Convolutional Neural Networks (CNNs). They simply apply concatenation or element-wise operations to incorporate high-level semantic cues and low-level detailed information. However, this can degrade the quality of predictions because cluttered and noisy information can also be passed through. To address this problem, we...
Deep convolutional neural networks (CNNs) have been successfully applied to a wide variety of problems in computer vision, including salient object detection. To detect and segment salient objects accurately, it is necessary to extract and combine high-level semantic features with low-level fine details simultaneously. This happens to be a challenge for CNNs, as repeated subsampling operations such as pooling and convolution lead to a significant decrease in the initial image resolution, which results in loss of spatial details and finer...
We introduce a saliency model based on two key ideas. The first one is considering local and global image patch rarities as two complementary processes. The second one is based on our observation that for different images, one of the RGB and Lab color spaces outperforms the other in saliency detection. We propose a framework that measures patch rarities in each color space and combines them in a final map. For each color channel, first, the input image is partitioned into non-overlapping patches and then each patch is represented by a vector of coefficients that linearly reconstruct it from a learned dictionary of patches from natural scenes....
Large language models have been demonstrated to be valuable in different fields. ChatGPT, developed by OpenAI, has been trained using massive amounts of data and simulates human conversation by comprehending context and generating appropriate responses. It has garnered significant attention due to its ability to effectively answer a broad range of inquiries, with fluent and comprehensive answers surpassing prior public chatbots in both security and usefulness. However, a comprehensive analysis of ChatGPT’s failures is lacking, which is the...
Despite significant recent progress, the best available visual saliency models still lag behind human performance in predicting eye fixations in free-viewing of natural scenes. The majority of models are based on low-level visual features, and the importance of top-down factors has not yet been fully explored or modeled. Here, we combine low-level features such as orientation, color, and intensity, and saliency maps of previous bottom-up models, with top-down cognitive visual features (e.g., faces, humans, cars, etc.) and learn a direct mapping from those features to eye fixations using Regression, SVM, and AdaBoost...
Predicting where people look in static scenes, a.k.a. visual saliency, has received significant research interest recently. However, relatively less effort has been spent in understanding and modeling visual attention over dynamic scenes. This work makes three contributions to video saliency research. First, we introduce a new benchmark, called DHF1K (Dynamic Human Fixation 1K), for predicting fixations during dynamic scene free-viewing, which is a long-time need in this field. DHF1K consists of 1K high-quality...
In a very influential yet anecdotal illustration, Yarbus suggested that human eye-movement patterns are modulated top down by different task demands. While the hypothesis that it is possible to decode the observer's task from eye movements has received some support (e.g., Henderson, Shinkareva, Wang, Luke, & Olejarczyk, 2013; Iqbal & Bailey, 2004), Greene, Liu, and Wolfe (2012) argued against it by reporting a failure. In this study, we perform a more systematic investigation of this problem, probing a larger number...
Saliency modeling has been an active research area in computer vision for about two decades. Existing state-of-the-art models perform very well in predicting where people look in natural scenes. There is, however, the risk that these models may have been overfitting themselves to available small-scale biased datasets, thus trapping progress in a local minimum. To gain a deeper insight regarding current issues in saliency modeling and to better gauge progress, we recorded eye movements of 120 observers while they freely viewed a large...
In this work, we contribute to video saliency research in two ways. First, we introduce a new benchmark for predicting human eye movements during dynamic scene free-viewing, which has long been urged in this field. Our dataset, named DHF1K (Dynamic Human Fixation), consists of 1K high-quality, elaborately selected video sequences spanning a large range of scenes, motions, object types, and background complexity. Existing video saliency datasets lack the variety and generality of common dynamic scenes and fall short in covering challenging situations...
Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark...
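The idea behind a shuffled AUC score can be sketched in a few lines. This is a simplified illustration under stated assumptions: positives are saliency values at the fixated locations of the evaluated image, negatives are saliency values at fixation locations borrowed from other images, and the AUC is computed with a rank-based (Mann-Whitney) estimate. Practical implementations typically sample negatives repeatedly and average; the function names and toy maps below are hypothetical.

```python
# Illustrative sketch only: names, data, and the rank-based AUC are assumptions.

def auc(pos, neg):
    """Probability that a positive scores above a negative (ties count half)."""
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def shuffled_auc(sal_map, fixations, other_fixations):
    """AUC with negatives taken at fixation locations from other images."""
    pos = [sal_map[r][c] for r, c in fixations]
    neg = [sal_map[r][c] for r, c in other_fixations]
    return auc(pos, neg)


# A pure center-prior map: high in the middle, low elsewhere.
center_map = [[0.1, 0.1, 0.1],
              [0.1, 1.0, 0.1],
              [0.1, 0.1, 0.1]]

# A map that highlights the actually fixated top-left corner.
offcenter_map = [[0.9, 0.1, 0.1],
                 [0.1, 0.2, 0.1],
                 [0.1, 0.1, 0.1]]

# Fixations from other images cluster at the center (center-bias).
others = [(1, 1)]

print(shuffled_auc(center_map, [(1, 1)], others))     # 0.5
print(shuffled_auc(offcenter_map, [(0, 0)], others))  # 1.0
```

Because the negatives share the same center-bias as the positives, a map that only encodes a central prior scores at chance level, while a map that picks out image-specific fixated regions is still rewarded.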
Research in visual saliency has been focused on two major types of models, namely fixation prediction and salient object detection. The relationship between the two, however, has been less explored. In this paper, we propose to employ the former model type to identify and segment salient objects in scenes. We build a novel neural network called Attentive Saliency Network (ASNet) that learns to detect salient objects from fixation maps. The fixation map, derived at the upper network layers, captures a high-level understanding of the scene. Salient object detection is then viewed as...
Previous research in visual saliency has been focused on two major types of models, namely fixation prediction and salient object detection. The relationship between the two, however, has been less explored. In this work, we propose to employ the former model type to identify salient objects. We build a novel Attentive Saliency Network (ASNet; available at https://github.com/wenguanwang/ASNet) that learns to detect salient objects from fixations. The fixation map, derived at the upper network layers, mimics human visual attention mechanisms...
Learning to generate natural scenes has always been a challenging task in computer vision. It is even more painstaking when the generation is conditioned on images with drastically different views. This is mainly because understanding, corresponding, and transforming appearance and semantic information across the views is not trivial. In this paper, we attempt to solve a novel problem of cross-view image synthesis, aerial to street-view and vice versa, using conditional generative adversarial networks (cGAN). Two new...