NFDI4DS | UHH-SEMS - Publication Details

Osamu Yoshie

ORCID: 0000-0002-4192-554X

About

Contact & Profiles

A5057487414

Research Areas

Advanced Neural Network Applications
Video Surveillance and Tracking Methods
Natural Language Processing Techniques
Domain Adaptation and Few-Shot Learning
Topic Modeling
Advanced Image and Video Retrieval Techniques
Speech and dialogue systems
Digital Games and Media
Face and Expression Recognition
Speech and Audio Processing
Advanced Vision and Imaging
Music and Audio Processing
Anomaly Detection Techniques and Applications
Semantic Web and Ontologies
Artificial Intelligence in Games
Educational Games and Gamification
Thermal Radiation and Cooling Technologies
Human Motion and Animation
Multimodal Machine Learning Applications
Speech Recognition and Synthesis
Gear and Bearing Dynamics Analysis
Neural Networks and Applications
Digital Transformation in Industry
Manufacturing Process and Optimization
Emotion and Mood Recognition

Waseda University
2016-2025

Division of Undergraduate Education
2022

Framework
2022

Graduate School USA
2005-2018

Fudan University
2018

Tokyo University of Science
1993-1995

The University of Tokyo
1994

OTA: Optimal Transport Assignment for Object Detection

OPENALEX - Publications

Zheng Ge Songtao Liu Zeming Li Osamu Yoshie Jian Sun

Recent advances in label assignment in object detection mainly seek to independently define positive/negative training samples for each ground-truth (gt) object. In this paper, we innovatively revisit the label assignment from a global perspective and propose to formulate the assigning procedure as an Optimal Transport (OT) problem, a well-studied topic in Optimization Theory. Concretely, we define the unit transportation cost between each demander (anchor) and supplier (gt) pair as the weighted summation of their classification and regression losses. After...

10.1109/cvpr46437.2021.00037 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing

OPENALEX - Publications

Xin Huang Zheng Ge Zequn Jie Osamu Yoshie

Although significant progress has been made in pedestrian detection recently, pedestrian detection in crowded scenes is still challenging. The heavy occlusion between pedestrians poses great challenges to standard Non-Maximum Suppression (NMS). A relatively low intersection over union (IoU) threshold leads to missing highly overlapped pedestrians, while a higher one brings in plenty of false positives. To avoid such a dilemma, this paper proposes a novel Representative Region NMS (R2NMS) approach leveraging the less occluded...

10.1109/cvpr42600.2020.01076 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

PP-YOLOv2: A Practical Object Detector

OPENALEX - Publications

Xin Huang Xinxin Wang Wenyu Lv Xiaying Bai Xiang Long and 8 more

Being effective and efficient is essential to an object detector for practical use. To meet these two concerns, we comprehensively evaluate a collection of existing refinements to improve the performance of PP-YOLO while keeping the inference time almost unchanged. This paper analyzes these refinements and empirically evaluates their impact on the final model through an incremental ablation study. Things we tried that didn't work will also be discussed. By combining multiple refinements, we boost PP-YOLO's performance from 45.9% mAP to 49.5% mAP on COCO2017 test-dev. Since...

10.48550/arxiv.2104.10419 preprint EN other-oa arXiv (Cornell University) 2021-01-01

SST: Spatial and Semantic Transformers for Multi-Label Image Recognition

OPENALEX - Publications

Zhaomin Chen Quan Cui Borui Zhao Renjie Song Xiaoqin Zhang and 1 more

Multi-label image recognition has attracted considerable research attention and achieved great success in recent years. Capturing label correlations is an effective way to advance the performance of multi-label image recognition. Two types of label correlations have been principally studied, i.e., spatial and semantic correlations. However, in the literature, previous methods considered only one of them. In this work, inspired by the Transformer, we propose a plug-and-play module, named Spatial and Semantic Transformers (SST),...

10.1109/tip.2022.3148867 article EN IEEE Transactions on Image Processing 2022-01-01

LLA: Loss-aware label assignment for dense pedestrian detection

OPENALEX - Publications

Zheng Ge Jianfeng Wang Xin Huang Songtao Liu Osamu Yoshie

Label assignment has been widely studied in general object detection because of its great impact on detectors' performance. In the field of dense pedestrian detection, human bodies are often heavily entangled, making label assignment more important. However, none of the existing methods focuses on crowd scenarios. Motivated by this, we propose Loss-aware Label Assignment (LLA) to boost the performance of detectors. Concretely, LLA first calculates classification (cls) and regression (reg) losses between each anchor and ground-truth...

10.1016/j.neucom.2021.07.094 article EN cc-by Neurocomputing 2021-08-06

Multilayer optical thin film design with deep Q learning

OPENALEX - Publications

An-Qing Jiang Osamu Yoshie Liang‐Yao Chen

Multilayer optical film plays a significant role in broad fields of application. Due to the nonlinear relationship between the dispersion characteristics of materials and the actual performance parameters of thin films, it is challenging to optimize the film structure with traditional models. In this paper, we present an implementation of Deep Q-learning, which is suited for the most part to optical thin film. As a set of concrete demonstrations, we optimize a solar absorber. The optimal program could optimize the absorber in 500 epochs (about 200 steps per epoch)...

10.1038/s41598-020-69754-w article EN cc-by Scientific Reports 2020-07-29

Gated‐Attention Model with Reinforcement Learning for Solving Dynamic Job Shop Scheduling Problem

OPENALEX - Publications

Goytom Gebreyesus Getu Fellek Ahmed Farid Shigeru Fujimura Osamu Yoshie

The job shop scheduling problem (JSSP) is one of the well‐known NP‐hard combinatorial optimization problems (COPs); it aims to optimize the sequential assignment of finite machines to a set of jobs while adhering to specified constraints. Conventional solution approaches, which include heuristic dispatching rules and evolutionary algorithms, have been largely in use to solve JSSPs. Recently, reinforcement learning (RL) has gained popularity for delivering better solution quality. In this research, we propose an end‐to‐end deep...

10.1002/tee.23788 article EN IEEJ Transactions on Electrical and Electronic Engineering 2023-03-24

A Simple Framework for Text-Supervised Semantic Segmentation

OPENALEX - Publications

Muyang Yi Quan Cui Hao Wu Cheng Yang Osamu Yoshie and 1 more

Text-supervised semantic segmentation is a novel research topic that allows semantic segments to emerge with image-text contrasting. However, pioneering methods could be subject to specifically designed network architectures. This paper shows that a vanilla contrastive language-image pretraining (CLIP) model is an effective text-supervised semantic segmentor by itself. First, we reveal that a vanilla CLIP is inferior at localization and segmentation due to its optimization being driven by densely aligning visual and language representations. Second, we propose the...

10.1109/cvpr52729.2023.00683 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Accent modification for speech recognition of non-native speakers using neural style transfer

OPENALEX - Publications

Kacper Radzikowski Le Wang Osamu Yoshie Robert Nowak

Nowadays automatic speech recognition (ASR) systems can achieve higher and higher accuracy rates depending on the methodology applied and the datasets used. The rate decreases significantly when the ASR system is used with a non-native speaker of the language to be recognized. The main reason for this is the specific pronunciation and accent features related to the mother tongue of that speaker, which influence the pronunciation. At the same time, an extremely limited volume of labeled data makes it difficult to train, from the ground up,...

10.1186/s13636-021-00199-3 article EN cc-by EURASIP Journal on Audio Speech and Music Processing 2021-02-18

Graph Transformer with Reinforcement Learning for Vehicle Routing Problem

OPENALEX - Publications

Getu Fellek Ahmed Farid Goytom Gebreyesus Shigeru Fujimura Osamu Yoshie

The vehicle routing problem (VRP) is one of the classic combinatorial optimization problems, where an optimal tour to visit customers is required with a minimum total cost in the presence of some constraints. Recently, VRP has been solved with the use of deep reinforcement learning (DRL), where node sets are considered (represented) as a graph structure. Existing Transformer-based DRL solutions for VRP rely only on node information, ignoring the role of the edges between nodes. In this paper, we propose an attention‐based end‐to‐end DRL model to solve VRP which...

10.1002/tee.23771 article EN IEEJ Transactions on Electrical and Electronic Engineering 2023-02-13

Examining viewers’ impulsive buying behaviour in sports livestreaming commerce

OPENALEX - Publications

Haoyu Liu Kim Hua Tan Leanne Chung Osamu Yoshie Yuya Ieiri

Livestreaming commerce is increasingly influencing the sports industry’s supply chain. This study seeks to understand how the quality of service characteristics enhances customer flow experiences and encourages impulsive buying behaviour. It also delves into the relationship between flow experience and impulsive buying behaviour, particularly examining how fan identification moderates this dynamic. Data from 274 participants, who recounted their recent shopping experiences while watching sports on SLSPs, were analysed. Structural equation...

10.1007/s12063-024-00536-7 article EN cc-by Operations Management Research 2025-01-04

Fasttalker: An Unified Framework for Generating Speech and Conversational Gestures from Text

OPENALEX - Publications

Jian Zhang Osamu Yoshie Zixin Guo Minggui He

10.2139/ssrn.5128348 preprint EN 2025-01-01

BFA: Best-Feature-Aware Fusion for Multi-View Fine-grained Manipulation

OPENALEX - Publications

Zihan Lan Weixin Mao Haosheng Li Le Wang Tiancai Wang and 2 more

In real-world scenarios, multi-view cameras are typically employed for fine-grained manipulation tasks. Existing approaches (e.g., ACT) tend to treat multi-view features equally and directly concatenate them for policy learning. However, this introduces redundant visual information and brings higher computational costs, leading to ineffective manipulation. A fine-grained manipulation task tends to involve multiple stages, while the view that contributes most to each stage varies over time. In this paper, we propose a plug-and-play...

10.48550/arxiv.2502.11161 preprint EN arXiv (Cornell University) 2025-02-16

ARM : nnU-Net with Arena Mechanism for Medical Image Segmentation

OPENALEX - Publications

Haoran Luo Cong Guan Tengfei Shao Shenglei Li Tomoji Kishi and 1 more

10.1109/icassp49660.2025.10888982 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

FastTalker: An unified framework for generating speech and conversational gestures from text

OPENALEX - Publications

Jian Zhang Zixin Guo Minggui He Osamu Yoshie

10.1016/j.neucom.2025.130074 article EN Neurocomputing 2025-03-01

A consumer behavior analytics model for commercial district marketing using network-structured stamp rally data

OPENALEX - Publications

Yuya Ieiri Tengfei Shao Osamu Yoshie

10.1016/j.dajour.2025.100567 article EN cc-by Decision Analytics Journal 2025-04-01

PS-RCNN: Detecting Secondary Human Instances in a Crowd via Primary Object Suppression

OPENALEX - Publications

Zheng Ge Zequn Jie Xin Huang Rong Xu Osamu Yoshie

Detecting human bodies in highly crowded scenes is a challenging problem. Two main reasons cause this problem: 1) weak visual cues of heavily occluded instances can hardly provide sufficient information for accurate detection; 2) heavily occluded instances are more easily suppressed by Non-Maximum-Suppression (NMS). To address these two issues, we introduce a variant of two-stage detectors called PS-RCNN. PS-RCNN first detects slightly occluded or unoccluded objects with an R-CNN [1] module (referred to as P-RCNN), and then suppresses the...

10.1109/icme46284.2020.9102793 article EN 2020 IEEE International Conference on Multimedia and Expo (ICME) 2020-06-09

G-DGANet: Gated deep graph attention network with reinforcement learning for solving traveling salesman problem

OPENALEX - Publications

Getu Fellek Ahmed Farid Shigeru Fujimura Osamu Yoshie Goytom Gebreyesus

10.1016/j.neucom.2024.127392 article EN Neurocomputing 2024-02-15

Traffic engineering framework with machine learning based meta-layer in software-defined networks

OPENALEX - Publications

Yanjun Li Xiaobo Li Osamu Yoshie

Software-defined networking is an emerging architecture that separates the control plane and the data plane. This paradigm enables flexible network resource allocation for traffic engineering, which aims to gain better network capacity and improved delay and loss performance. As we know, many heuristic algorithms have been developed to solve the dynamic routing problem. However, they incur a high computational time cost, which raises the crucial question of whether such an approach to this NP-complete problem is of any use in practice. This paper proposes...

10.1109/icnidc.2014.7000278 article EN 2014-09-01

Multilayered metal-dielectric film structure for highly efficient solar selective absorption

OPENALEX - Publications

Ertao Hu Xin-Xing Liu Yuan Yao Kai-Yan Zang Zong-Jie Tu and 14 more

To improve the optical absorptance of a solar selective absorber over a wide wavelength range, an eight-layered metal-dielectric film structure was designed by the transfer matrix method and fabricated with the magnetron sputtering method. The experimental results showed that the multilayered film structure yields a high absorptance of 98.3% with excellent spectral selectivity over a wide angular range in the solar radiation region of 250–2000 nm, a total hemispherical emittance of 0.12 at 400 K, and nearly unchanged reflectance after heat treatment at 673 K for 48 h in vacuum,...

10.1088/2053-1591/aacdb3 article EN Materials Research Express 2018-06-20

High efficiency of photon-to-heat conversion with a 6-layered metal/dielectric film structure in the 250-1200 nm wavelength region

OPENALEX - Publications

Ming Hui Liu Er-Tao Hu Yuan Yao Kai-Yan Zang Ning He and 8 more

The optical properties and thermal stability of a 6-layered metal/dielectric film structure are investigated in this work. A high average optical absorption of > 98% is achieved in the broad spectral range of 250-1200 nm in the experimental results, in good agreement with our simulated results. The samples have a typical layered structure of SiO2 (57.3 nm)/Ti (5.7 nm)/SiO2 (67.1 nm)/Ti (11.6 nm)/SiO2 (51.4 nm)/Cu (>100 nm), deposited on optically polished Si or K9-glass substrates by magnetron sputtering. The sample has an AM1.5G solar...

10.1364/oe.22.0a1843 article EN cc-by Optics Express 2014-11-13
