Yusuke Miyao

ORCID: 0000-0002-0678-3400
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Speech and dialogue systems
  • Text Readability and Simplification
  • Semantic Web and Ontologies
  • Biomedical Text Mining and Ontologies
  • Multimodal Machine Learning Applications
  • Advanced Text Analysis Techniques
  • Algorithms and Data Compression
  • Video Analysis and Summarization
  • Human Pose and Action Recognition
  • Web Data Mining and Analysis
  • Text and Document Classification Technologies
  • Speech Recognition and Synthesis
  • Language, Metaphor, and Cognition
  • Domain Adaptation and Few-Shot Learning
  • Stock Market Forecasting Methods
  • Syntax, Semantics, Linguistic Variation
  • semigroups and automata theory
  • Machine Learning and Algorithms
  • Rough Sets and Fuzzy Logic
  • Software Engineering Research
  • Advanced Database Systems and Queries
  • Logic, programming, and type systems
  • Mathematics, Computing, and Information Processing

The University of Tokyo
2009-2025

National Institute of Advanced Industrial Science and Technology
2016-2023

Tokyo Institute of Technology
2019-2023

Administration for Community Living
2023

IT University of Copenhagen
2023

American Jewish Committee
2023

Ochanomizu University
2018-2023

Imperial College London
2022

University of Wuppertal
2020

National Institute of Informatics
2010-2019

Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Dan Flickinger, Jan Hajič, Angelina Ivanova, Yi Zhang. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). 2014.

10.3115/v1/s14-2008 article EN cc-by Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) 2014-01-01

This paper defines a generative probabilistic model of parse trees, which we call PCFG-LA. This model is an extension of PCFG in which non-terminal symbols are augmented with latent variables. Fine-grained CFG rules are automatically induced from a parsed corpus by training a PCFG-LA model using the EM algorithm. Because exact parsing with a PCFG-LA is NP-hard, several approximations are described and empirically compared. In experiments using the Penn WSJ corpus, our automatically trained model gave a performance of 86.6% (F1, sentences ≤ 40 words), comparable to that...

10.3115/1219840.1219850 article EN 2005-01-01

Probabilistic modeling of lexicalized grammars is difficult because these grammars exploit complicated data structures, such as typed feature structures. This prevents us from applying common methods of probabilistic modeling in which a complete structure is divided into sub-structures under the assumption of statistical independence among sub-structures. For example, part-of-speech tagging of a sentence is decomposed into tagging of each word, and CFG parsing is split into applications of CFG rules. These methods have relied on the structure of the target problem, namely lattices or...

10.1162/coli.2008.34.1.35 article EN Computational Linguistics 2008-03-01

Motivation: While text mining technologies for biomedical research have gained popularity as a way to take advantage of the explosive growth of information in the form of papers, selecting appropriate natural language processing (NLP) tools is still difficult for researchers who are not familiar with recent advances in NLP. This article provides a comparative evaluation of several state-of-the-art natural language parsers, focusing on the task of extracting protein–protein interaction (PPI) from biomedical papers. We measure how each...

10.1093/bioinformatics/btn631 article EN cc-by-nc Bioinformatics 2008-12-09

Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Silvie Cinková, Dan Flickinger, Jan Hajič, Zdeňka Urešová. Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). 2015.

10.18653/v1/s15-2153 article EN cc-by Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) 2015-01-01

Temporal relation classification is becoming an active research field. Many methods have been proposed, but most of them focus on extracting features from external resources. Less attention has been paid to a significant advance in a closely related task: relation extraction. In this work, we borrow a state-of-the-art method in relation extraction by adopting bidirectional long short-term memory (Bi-LSTM) along dependency paths (DP). We make a "common root" assumption to extend DP representations of cross-sentence links. In the...

10.18653/v1/p17-2001 article EN cc-by 2017-01-01

Because of the importance of protein-protein interaction (PPI) extraction from text, many corpora have been proposed with slightly differing definitions of proteins and PPI. Since no single corpus is large enough to saturate a machine learning system, it is necessary to learn from multiple different corpora. In this paper, we propose a solution to this challenge. We designed a rich feature vector, and we applied a support vector machine modified for corpus weighting (SVM-CW) to complete the task of multiple-corpora PPI extraction. The rich feature vector, made from multiple useful kernels, is used to express...

10.3115/1699510.1699527 article EN public-domain 2009-01-01

Background: Work on pharmacovigilance systems using texts from PubMed and Twitter typically targets different elements and uses different annotation guidelines, resulting in a scenario where there is no comparable set of documents from both sources annotated in the same manner. Objective: This study aimed to provide a corpus that can be used to study drug reports from these two sources of information, allowing researchers in the area of natural language processing (NLP) to perform experiments to better understand the similarities and differences between Twitter and PubMed....

10.2196/publichealth.6396 article EN cc-by JMIR Public Health and Surveillance 2017-05-03

This paper reports the development of log-linear models for disambiguation in wide-coverage HPSG parsing. The estimation of log-linear models requires high computational cost, especially with wide-coverage grammars. Using techniques to reduce the estimation cost, we trained the models using 20 sections of the Penn Treebank. A series of experiments empirically evaluated the estimation techniques, and also examined the performance of the disambiguation models on the parsing of real-world sentences.

10.3115/1219840.1219851 article EN 2005-01-01

This paper introduces a novel framework for the accurate retrieval of relational concepts from huge texts. Prior to retrieval, all sentences are annotated with predicate argument structures and ontological identifiers by applying a deep parser and a term recognizer. During run time, user requests are converted into queries of region algebra on these annotations. Structural matching with pre-computed semantic annotations establishes the accurate and efficient retrieval of relational concepts. This framework was applied to a text retrieval system for MEDLINE. Experiments on the retrieval of biomedical...

10.3115/1220175.1220303 article EN 2006-01-01

Soichiro Murakami, Akihiko Watanabe, Akira Miyazawa, Keiichi Goshima, Toshihiko Yanase, Hiroya Takamura, Yusuke Miyao. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017.

10.18653/v1/p17-1126 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01

Existing dialogue models may encounter scenarios which are not well-represented in the training data, and as a result generate responses that are unnatural, inappropriate, or unhelpful. We propose the "Ask an Expert" framework, in which the model is trained with access to an "expert" which it can consult at each turn. Advice is solicited via a structured dialogue with the expert, and the model is optimized to selectively utilize (or ignore) it given the context and dialogue history. In this work the expert takes the form of an LLM. We evaluate this framework in a mental health support domain, where the structure...

10.18653/v1/2023.findings-acl.417 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

This paper presents techniques to apply semi-CRFs to Named Entity Recognition tasks with a tractable computational cost. Our framework can handle an NER task that has long named entities and many labels, which increase the computational cost. To reduce the computational cost, we propose two techniques: the first is the use of feature forests, which enables us to pack feature-equivalent states, and the second is the introduction of a filtering process which significantly reduces the number of candidate states. This framework allows us to use a rich set of features extracted from the chunk-based representation...

10.3115/1220175.1220234 article EN 2006-01-01

We demonstrate a simple and easy-to-use system to produce logical semantic representations of sentences. Our software operates by composing semantic formulas bottom-up given a CCG parse tree. It uses flexible semantic templates to specify semantic patterns. Templates for English and Japanese accompany our software, and they are easy to understand, use and extend to cover other linguistic phenomena or languages. We also provide scripts to use our semantic representations in a textual entailment task, and a visualization tool to display semantically augmented CCG trees in HTML.

10.18653/v1/p16-4015 article EN cc-by 2016-01-01