NFDI4DS | UHH-SEMS - Publication Details

Yusuke Miyao

ORCID: 0000-0002-0678-3400

About

Contact & Profiles

A5004444958

Research Areas

Natural Language Processing Techniques
Topic Modeling
Speech and dialogue systems
Text Readability and Simplification
Semantic Web and Ontologies
Biomedical Text Mining and Ontologies
Multimodal Machine Learning Applications
Advanced Text Analysis Techniques
Algorithms and Data Compression
Video Analysis and Summarization
Human Pose and Action Recognition
Web Data Mining and Analysis
Text and Document Classification Technologies
Speech Recognition and Synthesis
Language, Metaphor, and Cognition
Domain Adaptation and Few-Shot Learning
Stock Market Forecasting Methods
Syntax, Semantics, Linguistic Variation
semigroups and automata theory
Machine Learning and Algorithms
Rough Sets and Fuzzy Logic
Software Engineering Research
Advanced Database Systems and Queries
Logic, programming, and type systems
Mathematics, Computing, and Information Processing

The University of Tokyo
2009-2025

National Institute of Advanced Industrial Science and Technology
2016-2023

Tokyo Institute of Technology
2019-2023

Administration for Community Living
2023

IT University of Copenhagen
2023

American Jewish Committee
2023

Ochanomizu University
2018-2023

Imperial College London
2022

University of Wuppertal
2020

National Institute of Informatics
2010-2019

SemEval 2014 Task 8: Broad-Coverage Semantic Dependency Parsing

OPENALEX - Publications

Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Dan Flickinger, Jan Hajič, Angelina Ivanova, Yi Zhang. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). 2014.

10.3115/v1/s14-2008 article EN cc-by Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) 2014-01-01

Probabilistic CFG with latent annotations

Takuya Matsuzaki Yusuke Miyao Jun’ichi Tsujii

This paper defines a generative probabilistic model of parse trees, which we call PCFG-LA. This model is an extension of PCFG in which non-terminal symbols are augmented with latent variables. Fine-grained CFG rules are automatically induced from a parsed corpus by training a PCFG-LA model using an EM-algorithm. Because exact parsing with a PCFG-LA is NP-hard, several approximations are described and empirically compared. In experiments using the Penn WSJ corpus, our automatically trained model gave a performance of 86.6% (F1, sentences ≤ 40 words), comparable to that...

10.3115/1219840.1219850 article EN 2005-01-01

Event Extraction from Biomedical Papers Using a Full Parser

Akane Yakushiji Yuka Tateisi Yusuke Miyao Jun’ichi Tsujii

10.1142/9789814447362_0040 article EN Biocomputing 2000-12-01

Feature Forest Models for Probabilistic HPSG Parsing

Yusuke Miyao Jun’ichi Tsujii

Probabilistic modeling of lexicalized grammars is difficult because these grammars exploit complicated data structures, such as typed feature structures. This prevents us from applying common methods of probabilistic modeling in which a complete structure is divided into sub-structures under the assumption of statistical independence among sub-structures. For example, part-of-speech tagging of a sentence is decomposed into tagging of each word, and CFG parsing is split into applications of CFG rules. These methods have relied on the structure of the target problem, namely lattices or...

10.1162/coli.2008.34.1.35 article EN Computational Linguistics 2008-03-01

Evaluating contributions of natural language parsers to protein–protein interaction extraction

Yusuke Miyao Kenji Sagae Rune Sætre Takuya Matsuzaki Jun’ichi Tsujii

Abstract Motivation: While text mining technologies for biomedical research have gained popularity as a way to take advantage of the explosive growth of information in the form of papers, selecting appropriate natural language processing (NLP) tools is still difficult for researchers who are not familiar with recent advances in NLP. This article provides a comparative evaluation of several state-of-the-art natural language parsers, focusing on the task of extracting protein–protein interaction (PPI) from biomedical papers. We measure how each...

10.1093/bioinformatics/btn631 article EN cc-by-nc Bioinformatics 2008-12-09

SemEval 2015 Task 18: Broad-Coverage Semantic Dependency Parsing

Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Silvie Cinková, Dan Flickinger, Jan Hajič, Zdeňka Urešová. Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). 2015.

10.18653/v1/s15-2153 article EN cc-by Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) 2015-01-01

Protein–protein interaction extraction by leveraging multiple kernels and parsers

Makoto Miwa Rune Sætre Yusuke Miyao Jun’ichi Tsujii

10.1016/j.ijmedinf.2009.04.010 article EN International Journal of Medical Informatics 2009-06-05

Classifying Temporal Relations by Bidirectional LSTM over Dependency Paths

Fei Cheng Yusuke Miyao

Temporal relation classification is becoming an active research field. Many methods have been proposed, but most of them focus on extracting features from external resources. Less attention has been paid to a significant advance in a closely related task: relation extraction. In this work, we borrow a state-of-the-art method in relation extraction by adopting bidirectional long short-term memory (Bi-LSTM) along dependency paths (DP). We make a "common root" assumption to extend DP representations of cross-sentence links. In the...

10.18653/v1/p17-2001 article EN cc-by 2017-01-01

A rich feature vector for protein-protein interaction extraction from multiple corpora

Makoto Miwa Rune Sætre Yusuke Miyao Jun’ichi Tsujii

Because of the importance of protein-protein interaction (PPI) extraction from text, many corpora have been proposed with slightly differing definitions of proteins and PPI. Since no single corpus is large enough to saturate a machine learning system, it is necessary to learn from multiple different corpora. In this paper, we propose a solution to this challenge. We designed a rich feature vector and applied a support vector machine modified for corpus weighting (SVM-CW) to complete the task of PPI extraction. The rich feature vector, made from useful kernels, is used to express...

10.3115/1699510.1699527 article EN public-domain 2009-01-01

TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations

Nestor Alvaro Yusuke Miyao Nigel Collier

Background: Work on pharmacovigilance systems using texts from PubMed and Twitter typically targets different elements and uses different annotation guidelines, resulting in a scenario where there is no comparable set of documents from both sources annotated in the same manner. Objective: This study aimed to provide a corpus that can be used to study drug reports from these two sources of information, allowing researchers in the area of natural language processing (NLP) to perform experiments to better understand the similarities and differences between Twitter and PubMed....

10.2196/publichealth.6396 article EN cc-by JMIR Public Health and Surveillance 2017-05-03

Probabilistic disambiguation models for wide-coverage HPSG parsing

Yusuke Miyao Jun’ichi Tsujii

This paper reports the development of log-linear models for disambiguation in wide-coverage HPSG parsing. The estimation of log-linear models requires high computational cost, especially with wide-coverage grammars. Using techniques to reduce the estimation cost, we trained the models using 20 sections of the Penn Treebank. A series of experiments empirically evaluated the estimation techniques, and also examined the performance of the disambiguation models on the parsing of real-world sentences.

10.3115/1219840.1219851 article EN 2005-01-01

Semantic retrieval for the accurate identification of relational concepts in massive textbases

Yusuke Miyao Tomoko Ohta Katsuya Masuda Yoshimasa Tsuruoka Kazuhiro Yoshida and 2 more

This paper introduces a novel framework for the accurate retrieval of relational concepts from huge texts. Prior to retrieval, all sentences are annotated with predicate argument structures and ontological identifiers by applying a deep parser and a term recognizer. During run time, user requests are converted into queries of region algebra on these annotations. Structural matching with pre-computed semantic annotations establishes the accurate and efficient retrieval of relational concepts. This framework was applied to a text retrieval system for MEDLINE. Experiments on the retrieval of biomedical...

10.3115/1220175.1220303 article EN 2006-01-01

Learning to Generate Market Comments from Stock Prices

Soichiro Murakami, Akihiko Watanabe, Akira Miyazawa, Keiichi Goshima, Toshihiko Yanase, Hiroya Takamura, Yusuke Miyao. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017.

10.18653/v1/p17-1126 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01

Ask an Expert: Leveraging Language Models to Improve Strategic Reasoning in Goal-Oriented Dialogue Models

Qiang Zhang Jason Naradowsky Yusuke Miyao

Existing dialogue models may encounter scenarios which are not well-represented in the training data, and as a result generate responses that are unnatural, inappropriate, or unhelpful. We propose the "Ask an Expert" framework in which the model is trained with access to an "expert" which it can consult at each turn. Advice is solicited via a structured dialogue with the expert, and the model is optimized to selectively utilize (or ignore) it given the context and dialogue history. In this work the expert takes the form of an LLM. We evaluate this framework in a mental health support domain, where the structure...

10.18653/v1/2023.findings-acl.417 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

Improving the scalability of semi-Markov conditional random fields for named entity recognition

Daisuke Okanohara Yusuke Miyao Yoshimasa Tsuruoka Jun’ichi Tsujii

This paper presents techniques to apply semi-CRFs to Named Entity Recognition tasks with a tractable computational cost. Our framework can handle an NER task that has long named entities and many labels, which increase the computational cost. To reduce the computational cost, we propose two techniques: the first is the use of feature forests, which enables us to pack feature-equivalent states, and the second is the introduction of a filtering process which significantly reduces the number of candidate states. This framework allows us to use a rich set of features extracted from the chunk-based representation...

10.3115/1220175.1220234 article EN 2006-01-01

ccg2lambda: A Compositional Semantics System

Pascual Martínez-Gómez Koji Mineshima Yusuke Miyao Daisuke Bekki

We demonstrate a simple and easy-to-use system to produce logical semantic representations of sentences. Our software operates by composing semantic formulas bottom-up given a CCG parse tree. It uses flexible semantic templates to specify semantic patterns. Templates for English and Japanese accompany our software, and they are easy to understand, use and extend to cover other linguistic phenomena or languages. We also provide scripts to use our semantic representations in a textual entailment task, and a visualization tool to display semantically augmented CCG trees in HTML.

10.18653/v1/p16-4015 article EN cc-by 2016-01-01
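
The ccg2lambda abstract above describes composing semantic formulas bottom-up over a CCG parse tree. The following is a minimal illustrative sketch of that general idea only, not the ccg2lambda software or its template format. The names `Node`, `LEXICON`, and `compose`, the rule labels `"lex"`, `"fa"`, and `"ba"`, and the toy lexical entries are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A node of a toy binary CCG derivation (hypothetical structure)."""
    rule: str                       # "lex", "fa" (forward appl.), or "ba" (backward appl.)
    word: Optional[str] = None      # set only on lexical leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# Hypothetical lexical "templates": constants are strings, functors are
# Python functions that build a formula string once applied to arguments.
LEXICON = {
    "John": "john",
    "Mary": "mary",
    "runs": lambda subj: f"run({subj})",
    "loves": lambda obj: lambda subj: f"love({subj},{obj})",
}


def compose(node: Node):
    """Compute the meaning of a node from the meanings of its children."""
    if node.rule == "lex":
        return LEXICON[node.word]
    left = compose(node.left)
    right = compose(node.right)
    if node.rule == "fa":           # X/Y  Y  =>  X : the functor is on the left
        return left(right)
    if node.rule == "ba":           # Y  X\Y  =>  X : the functor is on the right
        return right(left)
    raise ValueError(f"unknown rule: {node.rule}")


# Derivation for "John loves Mary": (John ((loves Mary)))
tree = Node(
    "ba",
    left=Node("lex", word="John"),
    right=Node("fa", left=Node("lex", word="loves"), right=Node("lex", word="Mary")),
)
print(compose(tree))
```

Representing functors as ordinary functions keeps the sketch short; a real system would need typed lambda terms and a much richer set of combinatory rules.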
