- Visual Attention and Saliency Detection
- Video Surveillance and Tracking Methods
- Advanced Neural Network Applications
- Image Enhancement Techniques
- Advanced Vision and Imaging
- Smart Grid and Power Systems
- Hand Gesture Recognition Systems
- Natural Language Processing Techniques
- Anomaly Detection Techniques and Applications
- Human Pose and Action Recognition
- Topic Modeling
- Machine Learning and Data Classification
- Hydraulic Fracturing and Reservoir Analysis
- Advanced Optical Sensing Technologies
- Advanced Image and Video Retrieval Techniques
- Advanced Image Fusion Techniques
- 3D Shape Modeling and Analysis
- Advanced Memory and Neural Computing
- Face and Expression Recognition
- Hydrocarbon exploration and reservoir analysis
- Simulation and Modeling Applications
- Drilling and Well Engineering
- Data Stream Mining Techniques
- Coal Properties and Utilization
- Advanced Computational Techniques and Applications
University of Shanghai for Science and Technology
2020-2024
Princeton University
2022-2023
Dalian University of Technology
2023
Sinopec (China)
2008-2023
Changchun University of Technology
2016-2023
Zhejiang University
2021-2023
Alibaba Group (United States)
2023
Sichuan University
2019-2022
Zhangzhou Vocational and Technical College
2022
Huzhou Vocational and Technical College
2022
Most polyp segmentation methods use convolutional neural networks (CNNs) as their backbone, leading to two key issues when exchanging information between the encoder and decoder: (1) taking into account the differences in contribution between different-level features, and (2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence...
Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features, and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence and elusive properties of polyps, we introduce...
Event-based cameras bring a unique capability to tracking, being able to function in challenging real-world conditions as a direct result of their high temporal resolution and high dynamic range. These imagers capture events asynchronously that encode rich temporal and spatial information. However, effectively extracting this information from events remains an open challenge. In this work, we propose a spiking transformer network, STNet, for single object tracking. STNet dynamically extracts and fuses information from both temporal and spatial domains. In particular, the...
Inspired by the complementarity between conventional frame-based and bio-inspired event-based cameras, we propose a multi-modal approach to fuse visual cues from the frame and event domains to enhance single object tracking performance, especially in degraded conditions (e.g., scenes with high dynamic range, low light, and fast-motion objects). The proposed approach can effectively and adaptively combine meaningful information from both domains. Our approach’s effectiveness is enforced by a novel designed cross-domain...
Transparent and semi-transparent materials pose significant challenges for existing scene understanding and segmentation algorithms due to their lack of RGB texture, which impedes the extraction of meaningful features. In this work, we exploit that the light-matter interactions on glass provide unique intensity-polarization cues for each observed wavelength of light. We present a novel learning-based network that leverages both trichromatic (RGB) intensities as well as linear polarization cues from a single photograph...
Existing semantic segmentation works have mainly focused on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders their application on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer (AFFormer). AFFormer adopts a parallel architecture to leverage prototype representations as specific learnable local descriptions...
We present a novel mirror segmentation method that leverages depth estimates from ToF-based cameras as an additional cue to disambiguate challenging cases where the contrast or relation in RGB colors between the mirror reflection and the surrounding scene is subtle. A key observation is that ToF depth estimates do not report the true depth of the mirror surface, but instead return the total length of the reflected light paths, thereby creating obvious depth discontinuities at the mirror boundaries. To exploit depth information in mirror segmentation, we first construct a large-scale RGB-D...
In recent years, marine animal study has gained increasing research attention, which raises significant demands for fine-grained marine animal segmentation (MAS) techniques. In addition, deep learning has been widely adopted for object segmentation and has achieved promising performance. However, deep learning-based MAS still lacks investigation due to the shortage of a large-scale dataset. To tackle this issue, we construct the first large-scale MAS dataset, called...
The virtual patient, a unique computer simulation of the patient's face, teeth, oral mucosa, and bone, provides an extraordinary mechanism for digital dental implant surgery planning and prosthetic design. However, seamless registration of digital scans with functional information in the context of a virtual articulator remains a challenge. This report describes the treatment of a 47-year-old male with full-mouth guided immediate implant placement and immediate loading of CAD/CAM interim prostheses. Utilizing a novel digital workflow, a multifactorial vertical dimension...
With the widespread application of digital impression techniques in prosthetic dentistry, accurate intraoral scan mounting and virtual articulator parameter setting as per patients' anatomic structures are essential for treatment planning and restoration fabrication, especially in complex rehabilitation cases; meanwhile, marginal fit checking, occlusal adjustment, and porcelain layering of restorations are also crucial procedures in all cases, for which the analog procedure to mount the maxillary arches on a mechanical articulator is...
We present a deep reinforcement learning method of progressive view inpainting for colored semantic point cloud scene completion under volume guidance, achieving high-quality scene reconstruction from only a single RGB-D image with severe occlusion. Our approach is end-to-end, consisting of three modules: 3D scene volume reconstruction, 2D RGB-D and segmentation image inpainting, and multi-view selection for completion. Given a single RGB-D image, our method first predicts its semantic segmentation map and goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide to the next view inpainting step, which...
Time-resolved imaging is an emerging sensing modality that has been shown to enable advanced applications, including remote sensing, fluorescence lifetime imaging, and even non-line-of-sight sensing. Single-photon avalanche diodes (SPADs) outperform relevant time-resolved imaging technologies thanks to their excellent photon sensitivity and superior temporal resolution on the order of tens of picoseconds. The capability of SPADs to exceed the sensing limits of conventional cameras also draws attention to photon-efficient...
Large language models (LLMs) have demonstrated remarkable performance and tremendous potential across a wide range of tasks. However, deploying these models has been challenging due to the astronomical number of model parameters, which demands large memory capacity and high memory bandwidth. In this paper, we propose an effective approach that makes the deployment of LLMs more efficient. We support an automatic INT4 weight-only quantization flow and design a special LLM runtime with highly-optimized kernels...
The dynamic membrane potential threshold, as one of the essential properties of a biological neuron, is a spontaneous regulation mechanism that maintains neuronal homeostasis, i.e., the constant overall spiking firing rate of a neuron. As such, the neuron firing rate is regulated by the dynamic spiking threshold, which has been extensively studied in biology. Existing work in the machine learning community does not employ bioinspired spiking threshold schemes. This work aims at bridging this gap by introducing a novel bioinspired dynamic energy-temporal threshold (BDETT) scheme for spiking neural networks (SNNs)....
We introduce a wearable single-eye emotion recognition device and a real-time approach to recognizing emotions from partial observations of an emotion that is robust to changes in lighting conditions. At the heart of our method is a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event-based cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher temporal resolution (on the order of μs vs. 10s of ms). Thus, the captured events can encode rich cues under...
It is very challenging to reconstruct a high dynamic range (HDR) image from a low dynamic range (LDR) image, as it is an ill-posed problem. This paper proposes a luminance attentive network named LANet for HDR reconstruction from a single LDR image. Our method is based on two fundamental observations: (1) HDR images stored in relative luminance are scale-invariant, which means the HDR images will hold the same information when multiplied by any positive real number. Based on this observation, we propose a novel normalization method called “HDR calibration” for...
Camouflaged object detection (COD), which aims to identify objects that conceal themselves in their surroundings, has recently drawn increasing research efforts in the field of computer vision. In practice, the success of deep learning based COD is mainly determined by two key factors: (i) a significantly large receptive field, which provides rich context information, and (ii) an effective fusion strategy, which aggregates multi-level features for accurate COD. Motivated by these observations, in this paper,...
Recognizing named entities (NEs) is commonly conducted as a classification problem that predicts a class tag for a word or an NE candidate in a sentence. In shallow structures, categorized features are weighted to support the prediction. Recent developments in neural networks have adopted deep structures that map categorized features into continuous representations. This approach unfolds a dense space saturated with high-order abstract semantic information, where the prediction is based on distributed feature representations. In this paper, the positions of NEs...