Bo Dong

ORCID: 0000-0001-9189-9506
Research Areas
  • Visual Attention and Saliency Detection
  • Video Surveillance and Tracking Methods
  • Advanced Neural Network Applications
  • Image Enhancement Techniques
  • Advanced Vision and Imaging
  • Smart Grid and Power Systems
  • Hand Gesture Recognition Systems
  • Natural Language Processing Techniques
  • Anomaly Detection Techniques and Applications
  • Human Pose and Action Recognition
  • Topic Modeling
  • Machine Learning and Data Classification
  • Hydraulic Fracturing and Reservoir Analysis
  • Advanced Optical Sensing Technologies
  • Advanced Image and Video Retrieval Techniques
  • Advanced Image Fusion Techniques
  • 3D Shape Modeling and Analysis
  • Advanced Memory and Neural Computing
  • Face and Expression Recognition
  • Hydrocarbon Exploration and Reservoir Analysis
  • Simulation and Modeling Applications
  • Drilling and Well Engineering
  • Data Stream Mining Techniques
  • Coal Properties and Utilization
  • Advanced Computational Techniques and Applications

University of Shanghai for Science and Technology
2020-2024

Princeton University
2022-2023

Dalian University of Technology
2023

Sinopec (China)
2008-2023

Changchun University of Technology
2016-2023

Zhejiang University
2021-2023

Alibaba Group (United States)
2023

Sichuan University
2019-2022

Zhangzhou Vocational and Technical College
2022

Huzhou Vocational and Technical College
2022

Most polyp segmentation methods use convolutional neural networks (CNNs) as their backbone, leading to two key issues when exchanging information between the encoder and decoder: (1) taking into account the differences in contribution between different-level features, and (2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence...

10.26599/air.2023.9150015 article EN cc-by CAAI Artificial Intelligence Research 2023-06-30
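
A minimal sketch of the fusion idea the abstract raises, assuming a generic multi-level encoder: each feature level gets a learned contribution weight before the levels are fused into one segmentation map. The module and its sizes are illustrative, not the published method's code.

# Illustrative only: weighted fusion of multi-level encoder features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFeatureFusion(nn.Module):
    """Fuses multi-level features with learned per-level weights."""
    def __init__(self, in_channels=(64, 128, 320), mid=64, num_classes=1):
        super().__init__()
        # 1x1 convs project every level to a common channel width
        self.proj = nn.ModuleList(nn.Conv2d(c, mid, 1) for c in in_channels)
        # one learnable scalar weight per feature level
        self.level_weights = nn.Parameter(torch.ones(len(in_channels)))
        self.head = nn.Conv2d(mid, num_classes, 1)

    def forward(self, feats):
        target_size = feats[0].shape[-2:]          # finest resolution
        w = torch.softmax(self.level_weights, dim=0)
        fused = 0
        for i, f in enumerate(feats):
            f = self.proj[i](f)
            f = F.interpolate(f, size=target_size, mode="bilinear",
                              align_corners=False)
            fused = fused + w[i] * f               # weighted sum fusion
        return self.head(fused)

# toy multi-level features standing in for transformer encoder outputs
feats = [torch.randn(1, 64, 88, 88),
         torch.randn(1, 128, 44, 44),
         torch.randn(1, 320, 22, 22)]
mask_logits = WeightedFeatureFusion()(feats)
print(mask_logits.shape)   # torch.Size([1, 1, 88, 88])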

Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence and the elusive properties of polyps, we introduce...

10.48550/arxiv.2108.06932 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Event-based cameras bring a unique capability to tracking, being able to function in challenging real-world conditions as a direct result of their high temporal resolution and high dynamic range. These imagers capture events asynchronously that encode rich temporal and spatial information. However, effectively extracting this information from events remains an open challenge. In this work, we propose a spiking transformer network, STNet, for single object tracking. STNet dynamically extracts and fuses information from both domains. In particular, the...

10.1109/cvpr52688.2022.00860 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
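
A rough illustrative sketch, not the published STNet: a leaky integrate-and-fire path summarizes temporal cues from event slices while a transformer layer models spatial tokens, and the two are fused. All names and dimensions are assumptions, and the hard-threshold spiking here is simplified (real SNN training uses surrogate gradients).

# Illustrative spiking + transformer fusion, not the paper's architecture.
import torch
import torch.nn as nn

class LIFLayer(nn.Module):
    """Leaky integrate-and-fire layer over a sequence of event slices."""
    def __init__(self, dim, decay=0.5, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.decay, self.threshold = decay, threshold

    def forward(self, x):                      # x: (T, B, dim)
        mem, spikes = torch.zeros_like(x[0]), []
        for t in range(x.shape[0]):
            mem = self.decay * mem + self.fc(x[t])
            s = (mem >= self.threshold).float()
            mem = mem * (1.0 - s)              # reset where a spike fired
            spikes.append(s)
        return torch.stack(spikes).mean(0)     # temporal firing-rate feature

class SpikingTransformerFusion(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.temporal = LIFLayer(dim)
        self.spatial = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                                  batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, event_seq, tokens):
        # event_seq: (T, B, dim) event slices; tokens: (B, N, dim) spatial tokens
        temporal_feat = self.temporal(event_seq)           # (B, dim)
        spatial_feat = self.spatial(tokens).mean(dim=1)    # (B, dim)
        return self.fuse(torch.cat([temporal_feat, spatial_feat], dim=-1))

out = SpikingTransformerFusion()(torch.randn(8, 2, 64), torch.randn(2, 49, 64))
print(out.shape)   # torch.Size([2, 64])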

Inspired by the complementarity between conventional frame-based and bio-inspired event-based cameras, we propose a multi-modal approach to fuse visual cues from the frame and event domains to enhance single object tracking performance, especially in degraded conditions (e.g., scenes with high dynamic range, low light, or fast-moving objects). The proposed approach can effectively and adaptively combine meaningful information from both domains. Our approach's effectiveness is enforced by a newly designed cross-domain...

10.1109/iccv48922.2021.01280 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
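
An illustrative sketch of adaptively fusing frame-domain and event-domain features with cross-attention and a learned gate; the module name and shapes are assumptions, the token counts of the two domains are assumed equal, and the paper's actual cross-domain scheme differs.

# Illustrative cross-domain fusion, not the published module.
import torch
import torch.nn as nn

class CrossDomainFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.frame_to_event = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.event_to_frame = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, frame_tokens, event_tokens):
        # each domain attends to the other one
        f_enh, _ = self.event_to_frame(frame_tokens, event_tokens, event_tokens)
        e_enh, _ = self.frame_to_event(event_tokens, frame_tokens, frame_tokens)
        # adaptive, per-token weighting of the two enhanced streams
        g = self.gate(torch.cat([f_enh, e_enh], dim=-1))
        return g * f_enh + (1.0 - g) * e_enh

fused = CrossDomainFusion()(torch.randn(2, 49, 64), torch.randn(2, 49, 64))
print(fused.shape)   # torch.Size([2, 49, 64])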

Transparent and semi-transparent materials pose significant challenges for existing scene understanding and segmentation algorithms due to their lack of RGB texture, which impedes the extraction of meaningful features. In this work, we exploit the fact that light-matter interactions on glass provide unique intensity-polarization cues for each observed wavelength of light. We present a novel learning-based network that leverages both trichromatic (RGB) intensities as well as linear polarization cues from a single photograph...

10.1109/cvpr52688.2022.01229 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
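
A minimal sketch, assuming per-pixel degree- and angle-of-linear-polarization maps are available, of feeding RGB intensities together with polarization cues into one segmentation model; it only illustrates the input design, not the published network.

# Illustrative RGB + polarization input design, not the paper's model.
import torch
import torch.nn as nn

class RGBPolarizationSegNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.rgb_stream = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.pol_stream = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1))

    def forward(self, rgb, dolp, aolp):
        pol = torch.stack([dolp, aolp], dim=1)        # (B, 2, H, W)
        feat = torch.cat([self.rgb_stream(rgb), self.pol_stream(pol)], dim=1)
        return self.head(feat)                        # per-pixel class logits

rgb = torch.rand(1, 3, 128, 128)
dolp = torch.rand(1, 128, 128)      # degree of linear polarization in [0, 1]
aolp = torch.rand(1, 128, 128)      # angle of linear polarization (normalized)
print(RGBPolarizationSegNet()(rgb, dolp, aolp).shape)  # torch.Size([1, 2, 128, 128])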

Existing semantic segmentation works have mainly focused on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders their application on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer (AFFormer). AFFormer adopts a parallel architecture to leverage prototype representations as specific learnable local descriptions...

10.1609/aaai.v37i1.25126 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26
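
A loose sketch of processing features with a learned filter in the frequency domain, only to convey the flavor of a frequency-based module; it is not the actual AFFormer block, and the layer name and shapes are assumptions.

# Illustrative learned frequency-domain filtering of feature maps.
import torch
import torch.nn as nn

class LearnedFrequencyFilter(nn.Module):
    def __init__(self, channels, height, width):
        super().__init__()
        # one complex-valued weight per rFFT frequency bin and channel
        self.weight = nn.Parameter(
            torch.randn(channels, height, width // 2 + 1, 2) * 0.02)

    def forward(self, x):                        # x: (B, C, H, W)
        freq = torch.fft.rfft2(x, norm="ortho")  # complex spectrum
        w = torch.view_as_complex(self.weight)
        freq = freq * w                          # learned per-bin reweighting
        return torch.fft.irfft2(freq, s=x.shape[-2:], norm="ortho")

x = torch.randn(1, 64, 32, 32)
print(LearnedFrequencyFilter(64, 32, 32)(x).shape)  # torch.Size([1, 64, 32, 32])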

We present a novel mirror segmentation method that leverages depth estimates from ToF-based cameras as an additional cue to disambiguate challenging cases where the contrast or relation in RGB colors between the reflection and the surrounding scene is subtle. A key observation is that ToF cameras do not report the true depth of the mirror surface, but instead return the total length of the reflected light paths, thereby creating obvious depth discontinuities at mirror boundaries. To exploit this information for segmentation, we first construct a large-scale RGB-D...

10.1109/cvpr46437.2021.00306 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
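
A small sketch of the cue described above: sharp jumps in ToF depth (reflected path length) can be turned into an explicit boundary map and stacked with RGB before segmentation. The function and thresholds are illustrative, not the paper's pipeline.

# Illustrative depth-discontinuity cue from ToF depth.
import torch
import torch.nn.functional as F

def depth_discontinuity_map(depth, threshold=0.1):
    """depth: (B, 1, H, W) ToF depth; returns a soft boundary map."""
    # horizontal / vertical Sobel kernels pick up abrupt path-length changes
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(-1, -2)
    gx = F.conv2d(depth, kx, padding=1)
    gy = F.conv2d(depth, ky, padding=1)
    magnitude = torch.sqrt(gx ** 2 + gy ** 2)
    return torch.sigmoid((magnitude - threshold) / threshold)

depth = torch.rand(1, 1, 64, 64) * 3.0          # toy depth in meters
boundary = depth_discontinuity_map(depth)
rgbd_input = torch.cat([torch.rand(1, 3, 64, 64), depth, boundary], dim=1)
print(rgbd_input.shape)   # torch.Size([1, 5, 64, 64]) -> feed to a segmentation net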

In recent years, marine animal study has gained increasing research attention, which raises significant demands for fine-grained marine animal segmentation (MAS) techniques. In addition, deep learning has been widely adopted in object segmentation and has achieved promising performance. However, deep learning-based MAS still lacks investigation due to the shortage of a large-scale dataset. To tackle this issue, we construct the first large-scale MAS dataset, called...

10.1109/tcsvt.2021.3093890 article EN IEEE Transactions on Circuits and Systems for Video Technology 2021-07-02

The virtual patient, a unique computer simulation of the patient's face, teeth, oral mucosa, and bone, provides an extraordinary mechanism for digital dental implant surgery planning and prosthetic design. However, seamless registration of scans with functional information in the context of an articulator remains a challenge. This report describes the treatment of a 47-year-old male with full-mouth guided immediate implant placement and loading of CAD/CAM interim prostheses. Utilizing a novel workflow, the multifactorial vertical dimension...

10.1111/jopr.13204 article EN Journal of Prosthodontics 2020-05-19

With the widespread application of digital impression techniques in prosthetic dentistry, accurate intraoral scan mounting and virtual articulator parameter setting according to patients' anatomic structures are essential for treatment planning and restoration fabrication, especially in complex rehabilitation cases; meanwhile, marginal fit checking, occlusal adjustment, and porcelain layering of restorations are also crucial procedures in all cases, for which the analog procedure to mount maxillary arches on a mechanical articulator is...

10.1111/jopr.13570 article EN Journal of Prosthodontics 2022-07-22

We present a deep reinforcement learning method of progressive view inpainting for colored semantic point cloud scene completion under volume guidance, achieving high-quality reconstruction from only a single RGB-D image with severe occlusion. Our approach is end-to-end, consisting of three modules: 3D volume reconstruction, 2D RGB-D and segmentation inpainting, and multi-view selection for completion. Given a single RGB-D image, our method first predicts its semantic segmentation map and goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide for the next step, which...

10.1109/tpami.2023.3264449 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-04-05
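
A very rough skeleton of the progressive loop the abstract describes: reconstruct a guiding volume, let a policy pick the next view, inpaint that view in 2D, and merge the result back. Every function here is a hypothetical stand-in rather than the authors' modules.

# Hypothetical skeleton of volume-guided progressive view inpainting.
import numpy as np

def reconstruct_volume(points):      # stand-in for the 3D volume branch
    return np.zeros((32, 32, 32))

def select_next_view(volume, step):  # stand-in for the RL view-selection policy
    return {"azimuth": 45.0 * step, "elevation": 20.0}

def render_and_inpaint(points, view, volume):  # stand-in for 2D inpainting
    return np.random.rand(16, 3)     # newly hallucinated points

def complete_scene(points, num_views=3):
    for step in range(num_views):
        volume = reconstruct_volume(points)          # volumetric guidance
        view = select_next_view(volume, step)        # chosen by the policy
        new_points = render_and_inpaint(points, view, volume)
        points = np.concatenate([points, new_points], axis=0)
    return points

print(complete_scene(np.random.rand(100, 3)).shape)  # (148, 3)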

Time-resolved imaging is an emerging sensing modality that has been shown to enable advanced applications, including remote sensing, fluorescence lifetime imaging, and even non-line-of-sight sensing. Single-photon avalanche diodes (SPADs) outperform other relevant time-resolved technologies thanks to their excellent photon sensitivity and superior temporal resolution on the order of tens of picoseconds. This capability of exceeding the limits of conventional cameras also draws attention to photon-efficient...

10.1609/aaai.v39i9.33043 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Large language models (LLMs) have demonstrated remarkable performance and tremendous potential across a wide range of tasks. However, deploying these models has been challenging due to the astronomical number of model parameters, which demands large memory capacity and high memory bandwidth. In this paper, we propose an effective approach that can make the deployment of LLMs more efficient. We support an automatic INT4 weight-only quantization flow and design a special LLM runtime with highly-optimized kernels...

10.48550/arxiv.2311.00502 preprint EN cc-by arXiv (Cornell University) 2023-01-01
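
A minimal sketch of group-wise symmetric INT4 weight-only quantization to illustrate the general idea; the group size is an assumption, and the paper's actual flow and highly-optimized runtime kernels are far more involved than this plain PyTorch reference.

# Illustrative group-wise symmetric INT4 weight-only quantization.
import torch

def quantize_int4(weight, group_size=128):
    """weight: (out, in). Returns int4 codes (stored in int8) and per-group scales."""
    out_f, in_f = weight.shape
    w = weight.reshape(out_f, in_f // group_size, group_size)
    scale = w.abs().amax(dim=-1, keepdim=True) / 7.0      # int4 range: [-8, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7).to(torch.int8)
    return q, scale

def dequantize_int4(q, scale):
    return (q.float() * scale).reshape(q.shape[0], -1)

w = torch.randn(256, 256)
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)
print((w - w_hat).abs().max())       # per-weight quantization error stays small
x = torch.randn(1, 256)
y = x @ w_hat.t()                    # weight-only: activations stay in float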

The dynamic membrane potential threshold, as one of the essential properties of a biological neuron, is a spontaneous regulation mechanism that maintains neuronal homeostasis, i.e., a constant overall spiking firing rate of the neuron. As such, the neuron's firing rate is regulated by a dynamic threshold, which has been extensively studied in biology. Existing work in the machine learning community does not employ bioinspired dynamic threshold schemes. This work aims at bridging this gap by introducing a novel bioinspired dynamic energy-temporal threshold (BDETT) scheme for spiking neural networks (SNNs)...

10.48550/arxiv.2206.04426 preprint EN other-oa arXiv (Cornell University) 2022-01-01
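
A toy sketch of the general mechanism of a dynamic firing threshold: a LIF neuron's threshold drifts with recent membrane-potential statistics so the overall firing rate stays moderate. The update rule below is illustrative only and is not the paper's BDETT formulation.

# Illustrative LIF dynamics with a statistics-driven dynamic threshold.
import torch

def lif_dynamic_threshold(inputs, decay=0.6, base_threshold=1.0, alpha=0.5):
    """inputs: (T, N) input currents for N neurons over T steps."""
    mem = torch.zeros(inputs.shape[1])
    threshold = torch.full((inputs.shape[1],), base_threshold)
    spikes = []
    for t in range(inputs.shape[0]):
        mem = decay * mem + inputs[t]
        s = (mem >= threshold).float()
        mem = mem * (1.0 - s)                       # reset after a spike
        # homeostasis: raise the threshold when recent potentials run high,
        # relax it back toward the base value otherwise
        threshold = base_threshold + alpha * (mem.mean() - mem.min())
        spikes.append(s)
    return torch.stack(spikes)

out = lif_dynamic_threshold(torch.rand(50, 8))
print(out.mean())   # overall firing rate stays moderate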

We introduce a wearable single-eye emotion recognition device and a real-time approach to recognizing emotions from partial observations of an emotion that is robust to changes in lighting conditions. At the heart of our method is a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher temporal resolution (on the order of μs vs. tens of ms). Thus, the captured events can encode rich cues under...

10.1145/3588432.3591511 article EN 2023-07-19

It is very challenging to reconstruct a high dynamic range (HDR) image from a low dynamic range (LDR) image, as it is an ill-posed problem. This paper proposes a luminance attentive network named LANet for HDR reconstruction from a single LDR image. Our method is based on two fundamental observations: (1) HDR images stored in relative luminance are scale-invariant, which means the images will hold the same information when multiplied by any positive real number. Based on this observation, we propose a novel normalization called “HDR calibration” for...

10.1111/cgf.14412 article EN Computer Graphics Forum 2021-10-01
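
A small numerical sketch of the scale-invariance observation: dividing a relative-luminance HDR image by one of its own luminance statistics yields the same result for any positive rescaling. The statistic chosen here is an assumption; the paper's calibration is more specific.

# Illustrative demonstration of scale invariance in relative luminance.
import numpy as np

def calibrate(hdr, percentile=50):
    """Normalize a relative-luminance HDR image by a fixed luminance statistic."""
    luminance = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    return hdr / np.percentile(luminance, percentile)

hdr = np.random.rand(64, 64, 3).astype(np.float32) + 1e-3
scaled = 37.5 * hdr                       # any positive scale factor
print(np.allclose(calibrate(hdr), calibrate(scaled)))   # True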

Camouflaged object detection (COD), which aims to identify objects that conceal themselves in their surroundings, has recently drawn increasing research effort in the field of computer vision. In practice, the success of deep learning based COD is mainly determined by two key factors: (i) a significantly large receptive field, which provides rich context information, and (ii) an effective fusion strategy, which aggregates multi-level features for accurate COD. Motivated by these observations, in this paper,...

10.48550/arxiv.2101.05687 preprint EN other-oa arXiv (Cornell University) 2021-01-01
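
An illustrative sketch, not the paper's network, of the two ingredients highlighted above: parallel dilated convolutions to enlarge the receptive field, followed by a simple cross-level fusion for the final prediction. Module names and channel sizes are assumptions.

# Illustrative receptive-field enlargement plus cross-level fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReceptiveFieldBlock(nn.Module):
    """Parallel dilated branches approximate a large receptive field cheaply."""
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 3, 5))
        self.merge = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        return self.merge(torch.cat([b(x) for b in self.branches], dim=1))

class SimpleCODHead(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.rfb_low = ReceptiveFieldBlock(channels)
        self.rfb_high = ReceptiveFieldBlock(channels)
        self.predict = nn.Conv2d(channels, 1, 1)

    def forward(self, low, high):
        high = F.interpolate(self.rfb_high(high), size=low.shape[-2:],
                             mode="bilinear", align_corners=False)
        return self.predict(self.rfb_low(low) + high)   # cross-level fusion

low, high = torch.randn(1, 32, 88, 88), torch.randn(1, 32, 22, 22)
print(SimpleCODHead()(low, high).shape)   # torch.Size([1, 1, 88, 88])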

Recognizing named entities (NEs) is commonly conducted as a classification problem that predicts a class tag for a word or an NE candidate in a sentence. In shallow structures, categorized features are weighted to support the prediction. Recent developments in neural networks have adopted deep structures that map the input into continuous representations. This approach unfolds a dense space saturated with high-order abstract semantic information, where the prediction is based on distributed features. In this paper, the positions of NEs...

10.48550/arxiv.2011.14330 preprint EN other-oa arXiv (Cornell University) 2020-01-01
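
A tiny sketch of NER framed as per-token classification: each token's distributed representation, here a toy word embedding plus a position embedding, is mapped to an entity tag. Purely illustrative; the tag set and model are assumptions, not the paper's method.

# Illustrative per-token NER classifier over distributed representations.
import torch
import torch.nn as nn

class TokenTagger(nn.Module):
    def __init__(self, vocab_size=1000, dim=64, num_tags=5, max_len=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        self.pos_emb = nn.Embedding(max_len, dim)     # encodes token positions
        self.classifier = nn.Linear(dim, num_tags)    # e.g. O, B-PER, I-PER, ...

    def forward(self, token_ids):                     # (B, L) integer ids
        positions = torch.arange(token_ids.shape[1], device=token_ids.device)
        h = self.word_emb(token_ids) + self.pos_emb(positions)
        return self.classifier(h)                     # (B, L, num_tags)

logits = TokenTagger()(torch.randint(0, 1000, (2, 16)))
print(logits.argmax(-1).shape)   # torch.Size([2, 16]) predicted tag per token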