Shiyang Feng

ORCID: 0009-0001-6675-2948
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Remote-Sensing Image Classification
  • Evaluation Methods in Various Fields
  • Domain Adaptation and Few-Shot Learning
  • Automated Road and Building Extraction
  • Advanced Image Fusion Techniques
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Remote Sensing and Land Use
  • Topic Modeling
  • Technology and Security Systems
  • E-commerce and Technology Innovations
  • Natural Language Processing Techniques

Fudan University
2024-2025

Donghua University
2019

Although remote sensing (RS) data with multiple modalities can be used to significantly improve the accuracy of semantic segmentation in RS data, how to effectively extract multimodal information through feature fusion remains a challenging task. Specifically, existing methods still face two major challenges: 1) Due to diverse imaging mechanisms, the boundaries of the same foreground may vary across different modalities, leading to the inclusion of unwanted background semantics in the fused features; 2) from exhibit...

10.1109/tgrs.2025.3526247 article EN IEEE Transactions on Geoscience and Remote Sensing 2025-01-01

10.1109/jstars.2025.3545831 article EN cc-by IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2025-01-01

Object detection in remote sensing images (RSIs) remains a challenging task due to complex variations in object scale, dense arrangements, and arbitrary orientations. Compared to the widely used multi-stage and one-stage approaches, query-based methods, which avoid post-processing procedures and implement end-to-end inference, have recently attracted much attention. However, existing methods still face two main challenges: 1) The feature sampling regions predicted by query vectors often fail to be aligned with...

10.1109/tgrs.2024.3352011 article EN IEEE Transactions on Geoscience and Remote Sensing 2024-01-01

Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extraction and understanding tasks, and their capacity to process within-document data formats such as charts and equations remains under-explored. To...

10.48550/arxiv.2406.11633 preprint EN arXiv (Cornell University) 2024-06-17

Spatial transformer networks have been used in a layered form in conjunction with convolutional networks to enable the model to transform data spatially. In this paper, we propose a combined spatial transformer network (STN) and Long Short-Term Memory (LSTM) network to classify digits in sequences formed by MNIST elements. This LSTM-STN model has a top-down attention mechanism that profits from the LSTM layer, so that the STN layer can perform short-term independent elements for the statement in the process of spatial transformation, thus avoiding the distortion that may be caused when the entire...

10.1109/itaic.2019.8785574 article EN 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) 2019-05-01