Bin Sun

ORCID: 0000-0002-7029-8784
Research Areas
  • Topic Modeling
  • Remote Sensing in Agriculture
  • Advanced Image and Video Retrieval Techniques
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Remote Sensing and Land Use
  • Remote-Sensing Image Classification
  • Multimodal Machine Learning Applications
  • Geometric and Algebraic Topology
  • Land Use and Ecosystem Services
  • Advanced Image Fusion Techniques
  • Advanced Neural Network Applications
  • Image and Signal Denoising Methods
  • Virus-based gene therapy research
  • Emotion and Mood Recognition
  • Human Pose and Action Recognition
  • Cancer Research and Treatments
  • Domain Adaptation and Few-Shot Learning
  • Image Retrieval and Classification Techniques
  • Homotopy and Cohomology in Algebraic Topology
  • Image Processing Techniques and Applications
  • Remote Sensing and LiDAR Applications
  • Rangeland Management and Livestock Ecology
  • RNA modifications and cancer
  • Advanced Vision and Imaging

Hunan University
2015-2025

Dalian University of Technology
2019-2025

Xian Yang Central Hospital
2025

Beijing Institute of Technology
2020-2025

Yangpu Hospital of Tongji University
2025

First Affiliated Hospital of Anhui Medical University
2018-2024

Anhui Medical University
2018-2024

China University of Petroleum, Beijing
2024

Chinese Academy of Sciences
2004-2024

China Academy of Engineering Physics
2022-2024

Facial Expression Recognition (FER) in the wild is extremely challenging due to occlusions, variant head poses, face deformation and motion blur under unconstrained conditions. Although substantial progress has been made in automatic FER over the past few decades, previous studies were mainly designed for lab-controlled FER. Real-world occlusions, variant head poses and other issues definitely increase the difficulty of FER on account of these information-deficient regions and complex backgrounds. Different from pure CNN-based methods, we...

10.1109/taffc.2021.3122146 article EN IEEE Transactions on Affective Computing 2021-10-26

Convolutional neural networks (CNNs) have achieved great success in image super-resolution (SR). However, most deep CNN-based SR models require massive computation to obtain high performance. Downsampling features for multi-resolution fusion is an efficient and effective way to improve the performance of visual recognition. Still, it is counter-intuitive in the SR task, which needs to project a low-resolution input to high resolution. In this paper, we propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by...

10.1609/aaai.v37i2.25333 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

Background: We developed a computer-assisted diagnosis model to evaluate the feasibility of automated classification of intrapapillary capillary loops (IPCLs) to improve the detection of esophageal squamous cell carcinoma (ESCC). Methods: We recruited patients who underwent magnifying endoscopy with narrow-band imaging for evaluation of a suspicious condition. Case images were evaluated to establish a gold standard IPCL classification according to the endoscopic and histological findings. A double-labeling fully convolutional...

10.1055/a-0756-8754 article EN Endoscopy 2018-11-23

Hyperspectral images (HSIs) are often degraded by a mixture of various types of noise during the imaging process, including Gaussian noise, impulse noise, and stripes. Such complex noise could plague subsequent HSI processing. Generally, most HSI denoising methods formulate sparsity optimization problems with convex norm constraints, which over-penalize large entries of vectors and may result in a biased solution. In this paper, a nonconvex regularized low-rank and sparse matrix decomposition (NonRLRS) method is...

10.1109/tip.2019.2926736 article EN IEEE Transactions on Image Processing 2019-07-12

Automatic speech recognition (ASR) is the major human–machine interface in many intelligent systems, such as intelligent homes, autonomous driving, and servant robots. However, its performance usually deteriorates significantly in the presence of external noise, which limits its application scenes. Audio-visual speech recognition (AVSR) takes visual information as a complementary modality to enhance audio speech recognition effectively, particularly in noisy conditions. Recently, transformer-based architectures have been used to model video...

10.1109/tnnls.2022.3163771 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-04-12

Action prediction based on video is an important problem in the computer vision field with many applications, such as preventing accidents and criminal activities. It is challenging to predict actions at the early stage because of the large variations between observed videos and complete ones. Besides, intra-class variations cause confusion for the predictors as well. In this paper, we propose a mem-LSTM model to predict actions at the early stage, in which a memory module is introduced to record several "hard-to-predict" samples and a variety of observations. Our method uses...

10.1609/aaai.v32i1.12324 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-27

Active learning (AL) and semisupervised learning (SSL) are both promising solutions to hyperspectral image classification. Given a few initial labeled samples, this work combines AL and SSL in a novel manner, aiming to obtain more manually labeled and pseudolabeled samples and use them together with the initial labeled samples to improve classification performance. First, based on a comparison of the segmentation and spectral-spatial results obtained by the random walker (RW) and extended RW (ERW) algorithms, the unlabeled samples are separated into two different sets, i.e., low-...

10.1109/tgrs.2016.2604290 article EN IEEE Transactions on Geoscience and Remote Sensing 2016-09-16

Image rasterization is a mature technique in computer graphics, while image vectorization, the reverse path of rasterization, remains a major challenge. Recent advanced deep learning-based models achieve vectorization and semantic interpolation of vector graphs and demonstrate a better topology when generating new figures. However, these models cannot be easily generalized to out-of-domain testing data. The generated SVGs also contain complex and redundant shapes that are not quite convenient for further editing....

10.1109/cvpr52688.2022.01583 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Temporal answer grounding in instructional video (TAGV) is a new task naturally derived from temporal sentence grounding in general video (TSGV). Given an untrimmed instructional video and a text question, this task aims at locating the frame span from the video that can semantically answer the question, i.e., the visual answer. Existing methods tend to solve the TAGV problem with a span-based predictor, taking information from the video to predict the start and end frames. However, due to the weak correlations between the semantic features of the textual question and the visual answer, current methods using the span-based predictor do not work well...

10.1109/tpami.2024.3411045 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-01

Glioblastoma (GBM) is the most common primary malignant brain tumor. The precise identification of GBM is very important for diagnosis and treatment. Hyperspectral imaging is a fast, non-contact, accurate, and safe modern medical detection technology, which is expected to be a new tool for intraoperative diagnosis. In order to make full use of the spectral and spatial information of hyperspectral images (HSIs) to achieve identification, a method based on the fusion of multiple deep models (FMDM) is proposed for in-vivo human HSI...

10.1109/tim.2021.3117634 article EN IEEE Transactions on Instrumentation and Measurement 2021-01-01

Previous methods for dynamic facial expression recognition in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos. To solve this problem, we propose the spatio-temporal Transformer (STT) to capture discriminative features within each frame and model contextual relationships among frames. Spatio-temporal dependencies are captured and integrated by our unified Transformer. Specifically, given an image sequence consisting of multiple frames as input,...

10.48550/arxiv.2205.04749 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Additive manufacturing of carbon-fiber-reinforced polymer (CFRP) has been widely used in many fields. However, issues such as inconsistent fiber orientation distribution and void formation during the layer stacking process have hindered further optimization of the composite material’s performance. This study aimed to address these challenges by conducting a comprehensive investigation into the influence of carbon fiber content and printing parameters on the micro-morphology, thermal properties, and mechanical properties...

10.3390/polym15183722 article EN Polymers 2023-09-11

In the context of global warming, the sustainability of farmland ecosystems is increasingly impacted by multiple disturbances from both natural and human-induced sources. This study constructed a conceptual model and an indicator system of farmland ecosystem resilience (FER) based on the disturbance-response processes of farmland ecosystems. The FER assessment, supported by 30 specific indicators, was tested in Ethiopia, one of the most food-insecure countries in the world, and the factors impeding FER were discussed based on obstacle degree values (ODVs). The results...

10.1016/j.ecolind.2023.109900 article EN cc-by-nc-nd Ecological Indicators 2023-01-17

Grassland is the second largest terrestrial ecosystem and a fundamental land resource for human survival and development. Although grassland degradation is recognized as a crucial ecological problem, there is no consensus on the area, scope, and degree of its global trends, making the implementation of Sustainable Development Goal (SDG) 15.3 and the achievement of a degradation-neutral world uncertain. This study quantitatively explored grassland degradation trends from 2000 to 2020 by coupling vegetation growth and its response to climate change. Furthermore,...

10.1080/17538947.2023.2207840 article EN cc-by International Journal of Digital Earth 2023-05-02

Previous methods for dynamic facial expression recognition (DFER) in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos. Transformer-based methods for DFER can achieve better performance but result in higher FLOPs and computational costs. To solve these problems, the local-global spatio-temporal Transformer (LOGO-Former) is proposed to capture discriminative features within each frame and model contextual relationships among frames while...

10.1109/icassp49357.2023.10095448 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Semisupervised semantic segmentation is an effective way to reduce the expensive manual annotation cost and take advantage of unlabeled data for remote sensing (RS) image interpretation. Recent related research has mainly adopted two strategies: self-training and consistency regularization. Self-training tries to acquire accurate pseudo-labels to explicitly expand the training set. However, existing methods cannot accurately identify false pseudo-labels, suffering from their negative impact on model...

10.1109/tgrs.2021.3134277 article EN IEEE Transactions on Geoscience and Remote Sensing 2021-12-09