NFDI4DS | UHH-SEMS - Publication Details

Tetsuya Takiguchi

ORCID: 0000-0001-5005-7679

About

Contact & Profiles

A5009283470

Research Areas

Speech and Audio Processing
Speech Recognition and Synthesis
Music and Audio Processing
Voice and Speech Disorders
Blind Source Separation Techniques
Advanced Image and Video Retrieval Techniques
Face and Expression Recognition
Advanced Adaptive Filtering Techniques
Image Retrieval and Classification Techniques
Video Analysis and Summarization
Indoor and Outdoor Localization Technologies
Speech and dialogue systems
Phonetics and Phonology Research
Face recognition and analysis
Natural Language Processing Techniques
Hand Gesture Recognition Systems
Neural Networks and Applications
Topic Modeling
Advanced Data Compression Techniques
Video Surveillance and Tracking Methods
Image and Signal Denoising Methods
Advanced Vision and Imaging
Hearing Loss and Rehabilitation
Image Processing Techniques and Applications
Emotion and Mood Recognition

Kobe University
2016-2025

Kanazawa Medical Center
2017-2024

National Hospital Organization
2017-2024

Nagoya University
2024

Kumamoto Health Science University
2022

Nara Institute of Science and Technology
1996-2020

Multidisciplinary Digital Publishing Institute (Switzerland)
2020

Duke University
2020

The University of Tokyo
2009-2019

Hitotsubashi University
2019

Voice conversion in high-order eigen space using deep belief nets

OPENALEX - Publications

Toru Nakashika Ryoichi Takashima Tetsuya Takiguchi Yasuo Ariki

This paper presents a voice conversion technique using Deep Belief Nets (DBNs) to build high-order eigen spaces of the source/target speakers, where it is easier to convert the source speech to the target speech than in the traditional cepstrum space. DBNs have a deep architecture that automatically discovers abstractions to maximally express the original input features. If we train the DBNs on only an individual speaker, it can be considered that there is less phonological information and relatively more speaker individuality in the output features at...

10.21437/interspeech.2013-102 article EN Interspeech 2013 2013-08-25

Exemplar-based voice conversion in noisy environment

Ryoichi Takashima Tetsuya Takiguchi Yasuo Ariki

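This paper presents a voice conversion (VC) technique for noisy environments, where parallel exemplars are introduced to encode the source speech signal and synthesize the target speech signal. The parallel exemplars (dictionary) consist of the source exemplars and target exemplars, having the same texts uttered by the source and target speakers. The input source signal is decomposed into the source exemplars, noise exemplars obtained from the input signal, and their weights (activities). Then, by using the weights of the source exemplars, the converted signal is constructed from the target exemplars. We carried out speaker conversion tasks using clean speech data and noise-added speech data. The effectiveness of this method was confirmed by comparing...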

10.1109/slt.2012.6424242 article EN 2012 IEEE Spoken Language Technology Workshop (SLT) 2012-12-01

Pain induces stable, active microcircuits in the somatosensory cortex that provide a therapeutic target

Takuya Okada Daisuke Kato Yuki Nomura Norihiko Obata Xiangyu Quan and 10 more

Upregulation of N-type Ca2+ channel dependent subunits increases functional connections and synchronization for pain formation.

10.1126/sciadv.abd8261 article EN cc-by-nc Science Advances 2021-03-19

GMM-Based Emotional Voice Conversion Using Spectrum and Prosody Features

Ryo Aihara Ryoichi Takashima Tetsuya Takiguchi Yasuo Ariki

We propose Gaussian Mixture Model (GMM)-based emotional voice conversion using spectrum and prosody features. In recent years, speech recognition and synthesis techniques have been developed, and an emotional voice conversion technique is required for synthesizing more expressive voices. The common emotional conversion was based on transformation of neutral prosody to emotional prosody by using a huge speech corpus. In this paper, we convert a neutral voice to an emotional voice using GMMs. GMM-based spectrum conversion is widely used to modify non-linguistic information such as voice characteristics while keeping linguistic information unchanged. Because the conventional method...

10.5923/j.ajsp.20120205.06 article EN American Journal of Signal Processing 2012-12-01

Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines

Toru Nakashika Tetsuya Takiguchi Yasuo Ariki

This paper presents a voice conversion (VC) method that utilizes the recently proposed probabilistic models called recurrent temporal restricted Boltzmann machines (RTRBMs). One RTRBM is used for each speaker, with the goal of capturing high-order temporal dependencies in an acoustic sequence. Our algorithm starts from the separate training of one RTRBM for a source speaker and another for a target speaker using speaker-dependent training data. Because each RTRBM attempts to discover abstractions to maximally express the training data at each time step, as well as the temporal dependencies in the training data, we expect...

10.1109/taslp.2014.2379589 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2014-12-09

Microbiomes of the normal middle ear and ears with chronic otitis media

Shujiro Minami Hideki Mutai Tomoko Suzuki Arata Horii Naoki Oishi and 6 more

Objective: The aim of this study was to profile and compare the middle ear microbiomes of human subjects with and without chronic otitis media. Study Design: Prospective multicenter cohort study. Methods: All consecutive patients undergoing tympanoplasty surgery for chronic otitis media or ear surgery for conditions other than otitis media were recruited. Sterile swab samples were collected from the middle ear mucosa during surgery. The variable region 4 of the 16S rRNA gene in each sample was amplified using region‐specific primers adapted for the Illumina MiSeq sequencer (Illumina,...

10.1002/lary.26579 article EN The Laryngoscope 2017-04-11

Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine

Toru Nakashika Tetsuya Takiguchi Yasuhiro Minami

In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model. VC is a technique where only speaker-specific information in source speech is converted while keeping the phonological information unchanged. Most of the existing VC methods rely on parallel data-pairs of speech data from the source and target speakers uttering the same sentences. However, the use of parallel data in training causes several problems: 1) the data used for the training are limited to the predefined sentences, 2) the trained model is only applied to the speaker pair used in the training, and 3) mismatches in alignment may occur....

10.1109/taslp.2016.2593263 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2016-07-19

Speech intonation in children with autism spectrum disorder

Yasushi Nakai Ryoichi Takashima Tetsuya Takiguchi Satoshi Takada

10.1016/j.braindev.2013.07.006 article EN Brain and Development 2013-08-20

High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion

Toru Nakashika Tetsuya Takiguchi Yasuo Ariki

This paper presents a voice conversion (VC) method that utilizes recently proposed recurrent temporal restricted Boltzmann machines (RTRBMs) for each speaker, with the goal of capturing high-order temporal dependencies in an acoustic sequence. Our algorithm starts from the separate training of two RTRBMs for a source and target speaker using speaker-dependent training data. Since each RTRBM attempts to discover abstractions at each time step, as well as the temporal dependencies in the training data, we expect that the models represent speaker-specific latent features in high-order spaces. In our...

10.21437/interspeech.2014-447 article EN Interspeech 2014 2014-09-14

Spillover effects between energies, gold, and stock: the United States versus China

Xie He Tetsuya Takiguchi Tadahiro Nakajima Shigeyuki Hamori

This study investigates the time–frequency dynamics of return and volatility spillovers between stock market three commodity markets: natural gas, crude oil, gold via a comparative analysis United States China is conducted with help new empirical methods. Our findings are as follows. First, in terms time, oil strongest two markets. Crude emits net negative spillover to US market, positive Chinese market. By contrast, effect transmitted markets both countries through gold. However, has on In...

10.1177/0958305x20907081 article EN Energy & Environment 2020-03-02

MM-iTransformer: A Multimodal Approach to Economic Time Series Forecasting with Textual Data

Shangyang Mou Qiang Xue Jinhui Chen Tetsuya Takiguchi Yasuo Ariki

This paper introduces a novel multimodal framework for economic time series forecasting, integrating textual information with historical price data to enhance predictive accuracy. The proposed method employs a multi-head attention mechanism to dynamically align textual embeddings with temporal price data, capturing previously unrecognized cross-modal dependencies and enhancing the model’s ability to interpret event-driven market dynamics. This enables the model to model complex market behaviors in a unified and effective manner. Experimental...

10.3390/app15031241 article EN cc-by Applied Sciences 2025-01-25

DialFill: Utilizing Dialogue Filling to Integrate Retrieved Knowledge in Responses

Qiang Xue Tetsuya Takiguchi Yasuo Ariki

10.1109/access.2025.3555650 article EN cc-by IEEE Access 2025-01-01

Participation of endogenous IGF‐I and TGF‐β1 with enamel matrix derivative‐stimulated cell growth in human periodontal ligament cells

Kenichi Okubo Makoto Kobayashi Tetsuya Takiguchi Takatora Takada Atsushi Ohazama and 2 more

Previous studies have provided the biological basis for the therapeutic use of enamel matrix derivative (EMD) at sites of periodontal regeneration. A purpose of this study is to determine the effects of EMD on cell growth, osteoblastic differentiation and insulin-like growth factor-I (IGF-I) and transforming growth factor-beta 1 (TGF-beta 1) production in human periodontal ligament cells (HPLC). We also examined the participation of endogenous IGF-I and TGF-beta 1 with EMD-stimulated cell growth in these cells. HPLCs used were treated with EMD alone or in combination...

10.1034/j.1600-0765.2003.01607.x article EN Journal of Periodontal Research 2003-01-28

Emotional voice conversion using deep neural networks with MCC and F0 features

Zhaojie Luo Tetsuya Takiguchi Yasuo Ariki

An artificial neural network is one of the most important models for training features in a voice conversion task. Typically, Neural Networks (NNs) are not effective in processing low-dimensional F0 features, thus the performance of those methods based on neural networks in training Mel Cepstral Coefficients (MCC) is not outstanding. However, F0 can robustly represent various prosody signals (e.g., emotional prosody). In this study, we propose an effective method based on the NNs to train the normalized-segment-F0 features (NSF0) for emotional prosody conversion. Meanwhile,...

10.1109/icis.2016.7550889 article EN 2016-06-01

Lip reading using a dynamic feature of lip images and convolutional neural networks

Yiting Li Yuki Takashima Tetsuya Takiguchi Yasuo Ariki

In this paper, a lip-reading method using a novel dynamic feature of lip images is proposed. The dynamic feature is calculated as the first-order regression coefficients using a few neighboring frames (images). It constitutes a better representation of the time derivatives to the basic static image. The dynamic feature is processed by convolution neural networks (CNNs), which are able to reduce the negative influence caused by shaking of the subject and face alignment blurring at the feature-extraction level. Its effectiveness has been confirmed by word-recognition experiments...

10.1109/icis.2016.7550888 article EN 2016-06-01

End-to-end Dysarthric Speech Recognition Using Multiple Databases

Yuki Takashima Tetsuya Takiguchi Yasuo Ariki

We present in this paper an end-to-end automatic speech recognition (ASR) system for a person with an articulation disorder resulting from athetoid cerebral palsy. In the case of a person with this type of articulation disorder, the speech style is quite different from that of a physically unimpaired person, and the amount of their speech data available to train the model is limited because their burden is large due to strain on the speech muscles. Therefore, the performance of ASR systems for people with an articulation disorder degrades significantly. In this paper, we propose an end-to-end ASR framework trained by not only Japanese but also non-Japanese...

10.1109/icassp.2019.8683803 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

Two-Step Acoustic Model Adaptation for Dysarthric Speech Recognition

Ryoichi Takashima Tetsuya Takiguchi Yasuo Ariki

This paper introduces a model adaptation approach for a speaker-dependent dysarthric speech recognition system. The dysarthria we focus on in this paper is caused by athetoid cerebral palsy, which causes involuntary muscle movements in those with the disease. For this reason, dysarthric people's speech is often unstable and difficult for conventional automatic speech recognition (ASR) systems to recognize. A model-adaptation approach, which adapts an ASR model to dysarthric speech, is one possible solution. However, because of the difference in speaking styles between dysarthric and non-dysarthric...

10.1109/icassp40776.2020.9053725 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Recombinant Human Bone Morphogenetic Protein-2 Stimulates Osteoblastic Differentiation in Cells Isolated from Human Periodontal Ligament

Masaaki Kobayashi Tetsuya Takiguchi Ryoko Suzuki Akira Yamaguchi Katsutoshi Deguchi and 5 more

Periodontal ligament cells may play an important role in the successful regeneration of the periodontium. We investigated the effects of recombinant human bone morphogenetic protein-2 (rhBMP-2), one of the most potent growth factors that stimulates osteoblast differentiation and bone formation, on cell growth and osteoblastic differentiation in human periodontal ligament cells (HPLC) isolated from four adult patients. rhBMP-2 induced no significant changes in cell growth in any HPLCs. rhBMP-2 at concentrations over 50 ng/mL significantly stimulated alkaline phosphatase (ALPase) activity...

10.1177/00220345990780100701 article EN Journal of Dental Research 1999-10-01

Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments

Ryoichi Takashima Tetsuya Takiguchi Yasuo Ariki

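This paper presents a voice conversion (VC) technique for noisy environments, where parallel exemplars are introduced to encode the source speech signal and synthesize the target speech signal. The parallel exemplars (dictionary) consist of the source exemplars and target exemplars, having the same texts uttered by the source and target speakers. The input source signal is decomposed into the source exemplars, noise exemplars and their weights (activities). Then, by using the weights of the source exemplars, the converted signal is constructed from the target exemplars. We carried out speaker conversion tasks using clean speech data and noise-added speech data. The effectiveness of this method was confirmed by comparing its effectiveness with that...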

10.1587/transfun.e96.a.1946 article EN IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences 2013-01-01

Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines

Toru Nakashika Tetsuya Takiguchi Yasuo Ariki

This paper presents a voice conversion technique using speaker-dependent Restricted Boltzmann Machines (RBM) to build high-order eigen spaces of source/target speakers, where it is easier to convert the source speech to the target speech than in the traditional cepstrum space. We build a deep architecture that concatenates the two speaker-dependent RBMs with neural networks, expecting that they automatically discover abstractions to express the original input features. Under this concept, if we train the RBMs using only the speech of an individual speaker that includes various phonemes...

10.1587/transinf.e97.d.1403 article EN IEICE Transactions on Information and Systems 2014-01-01

Voice conversion based on Non-negative matrix factorization using phoneme-categorized dictionary

Ryo Aihara Toru Nakashika Tetsuya Takiguchi Yasuo Ariki

We present in this paper an exemplar-based voice conversion (VC) method using a phoneme-categorized dictionary. Sparse representation-based VC using Non-negative matrix factorization (NMF) is employed for spectral conversion between different speakers. In our previous NMF-based VC method, source exemplars and target exemplars are extracted from parallel training data, having the same texts uttered by the source and target speakers. The input source signal is represented using the source exemplars and their weights. Then, the converted speech is constructed from the target exemplars and the weights related to the source exemplars. However,...

10.1109/icassp.2014.6855137 article EN 2014-05-01

Detecting Abnormal Word Utterances in Children With Autism Spectrum Disorders

Yasushi Nakai Tetsuya Takiguchi Gakuyo Matsui Noriko Yamaoka Satoshi Takada

Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders (n = 30) and typical development (n = 51). Using stimuli limited to single-word utterances, machine-learning-based voice analysis was superior to speech therapist judgments. There was a significantly higher true-positive than false-negative rate for machine-learning-based voice analysis but not for speech therapists. Results are discussed in terms of some artificiality...

10.1177/0031512517716855 article EN Perceptual and Motor Skills 2017-06-26

Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

Zhaojie Luo Jinhui Chen Tetsuya Takiguchi Yasuo Ariki

An artificial neural network is an important model for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as Mel Cepstral Coefficients (MCC), which represent the spectrum features. However, a simple representation of fundamental frequency (F0) is not enough for NNs to deal with emotional VC. This is because the time sequence of F0 for an emotional voice changes drastically. Therefore, in our previous method, we used the continuous wavelet transform (CWT)...

10.1186/s13636-017-0116-2 article EN cc-by EURASIP Journal on Audio Speech and Music Processing 2017-08-01
