- Magnetic Properties of Alloys
- Magnetic properties of thin films
- Topic Modeling
- Natural Language Processing Techniques
- Rare-earth and actinide compounds
- Magnetic Properties and Applications
- Advanced X-ray and CT Imaging
- Medical Imaging Techniques and Applications
- Magnetic and transport properties of perovskites and related materials
- Metallic Glasses and Amorphous Alloys
- Multimodal Machine Learning Applications
- Radiation Dose and Imaging
- Crystallization and Solubility Studies
- X-ray Diffraction in Crystallography
- Acoustic Wave Resonator Technologies
- Advanced Algorithms and Applications
- Wireless Communication Security Techniques
- Mechanical and Optical Resonators
- Hydrogen Storage and Materials
- Gas Sensing Nanomaterials and Sensors
- Advanced Measurement and Detection Methods
- Advanced Measurement and Metrology Techniques
- Text Readability and Simplification
- Optical measurement and interference techniques
- Energy Harvesting in Wireless Networks
PLA Information Engineering University
2020-2024
University of Chinese Academy of Sciences
2024
Aerospace Information Research Institute
2024
University of Hong Kong
2000-2023
University of Washington
2021-2023
Allen Institute
2023
Tianjin University of Science and Technology
2007-2021
Nanjing University of Aeronautics and Astronautics
2020
Nanjing University of Posts and Telecommunications
2019
Peking University
2016-2018
Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang. Proceedings of the Workshop on Machine Reading for Question Answering. 2018.
Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these systems, showing that there is still much left to be done. We introduce a new English reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs. In this crowdsourced, adversarially-created, 96k-question benchmark, a system must resolve references in a question, perhaps to multiple input positions, and perform...
Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Atharva Naik, Arjun Ashok, Arut Selvan Dhanasekaran, Anjana Arunkumar, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Kuntal Kumar Pal, Maitreya Patel, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Savan Doshi, Shailaja Keyur Sampat, Siddhartha...
Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. Our pipeline...
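The bootstrapping loop described above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: `propose_instruction` stands in for an LM call conditioned on in-context examples, and the exact-match deduplication is a crude stand-in for the similarity-based filtering a real pipeline would use.

```python
import random

def self_instruct(seed_pool, propose_instruction, rounds=3):
    """Grow an instruction pool by bootstrapping off model generations.

    seed_pool: initial human-written instructions.
    propose_instruction: stand-in for an LM that, given sampled
    in-context examples, proposes one new instruction string.
    """
    pool = list(seed_pool)
    seen = {s.lower() for s in pool}  # crude exact-match dedup filter
    for _ in range(rounds):
        # Sample a few existing instructions as in-context examples.
        examples = random.sample(pool, min(2, len(pool)))
        candidate = propose_instruction(examples)
        if candidate.lower() not in seen:  # keep only novel instructions
            seen.add(candidate.lower())
            pool.append(candidate)        # survivors rejoin the pool
    return pool

# Toy usage with a deterministic stub "model" that ignores its examples.
proposals = iter(["Summarize the text.", "write a poem.", "Summarize the text."])
pool = self_instruct(["Write a poem.", "Translate to French."],
                     lambda examples: next(proposals))
print(len(pool))  # → 3 (one novel instruction survives, two are filtered)
```

The design point is simply that generation, filtering, and pool growth form one loop, so the instruction set expands without further human writing.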
Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu. Findings of the Association for Computational Linguistics: ACL 2023.
Previous work introduced transition-based algorithms to form a unified architecture for parsing rhetorical structures (including span, nuclearity, and relation), but did not achieve satisfactory performance. In this paper, we propose that a transition-based model is more appropriate for the naked discourse tree (i.e., identifying span and nuclearity) due to data sparsity. At the same time, we argue that relation labeling can benefit from the naked structure and should be treated elaborately with consideration of three kinds of relations including...
Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies due to their sole reliance on the parametric knowledge they encapsulate. Retrieval-Augmented Generation (RAG), an ad hoc approach that augments LMs with retrieval of relevant knowledge, decreases such issues. However, indiscriminately retrieving and incorporating a fixed number of retrieved passages, regardless of whether retrieval is necessary, or whether the passages are relevant, diminishes LM...
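The contrast the abstract draws — always-on retrieval versus retrieval only when needed — can be sketched as below. This is a minimal illustration of adaptive, on-demand retrieval, not the paper's actual method: `needs_retrieval`, `retrieve`, and `generate` are all hypothetical stand-ins for an LM's self-assessment, a retriever, and a generator.

```python
def adaptive_generate(query, needs_retrieval, retrieve, generate):
    """Consult the retriever only when the query is judged to need
    external knowledge, instead of always prepending a fixed number
    of passages."""
    passages = retrieve(query) if needs_retrieval(query) else []
    return generate(query, passages)

# Toy usage with stubbed components.
needs = lambda q: "who" in q.lower() or "when" in q.lower()
retrieve = lambda q: ["[passage relevant to the query]"]
generate = lambda q, ps: f"answer({q}; {len(ps)} passages)"
print(adaptive_generate("Who wrote Hamlet?", needs, retrieve, generate))
print(adaptive_generate("Write a haiku.", needs, retrieve, generate))
```

The creative-writing query skips retrieval entirely, which is exactly the failure mode of fixed-passage RAG that the abstract criticizes.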
Despite thousands of researchers, engineers, and artists actively working on improving text-to-image generation models, systems often fail to produce images that accurately align with the text inputs. We introduce TIFA (Text-to-Image Faithfulness evaluation with question Answering), an automatic evaluation metric that measures the faithfulness of a generated image to its text input via visual question answering (VQA). Specifically, given a text input, we automatically generate several question-answer pairs using a language model. We calculate image faithfulness by...
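The scoring step can be sketched as: the faithfulness of an image is the fraction of prompt-derived question-answer pairs that a VQA model answers correctly on that image. This is an illustrative sketch, not the paper's implementation; the dict-based "image" and the `vqa_answer` stub below are stand-ins for a real image and VQA model.

```python
def tifa_style_score(image, qa_pairs, vqa_answer):
    """Fraction of QA pairs the VQA model answers correctly on the image."""
    correct = sum(
        1 for question, expected in qa_pairs
        if vqa_answer(image, question).strip().lower() == expected.lower()
    )
    return correct / len(qa_pairs)

# Toy usage: the "image" is a dict a stub VQA model can "read".
fake_image = {"what animal is shown?": "dog", "what color is it?": "brown"}
stub_vqa = lambda img, q: img.get(q, "unknown")
qa = [("what animal is shown?", "dog"), ("what color is it?", "black")]
print(tifa_style_score(fake_image, qa, stub_vqa))  # → 0.5
```

Because the questions are generated from the text prompt itself, a low score localizes which part of the prompt the image failed to depict.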
In this work we explore recent advances in instruction-tuning language models on a range of open instruction-following datasets. Despite claims that open models can be on par with state-of-the-art proprietary models, these claims are often accompanied by limited evaluation, making it difficult to compare models across the board and determine the utility of various resources. We provide a large set of instruction-tuned models from 6.7B to 65B parameters in size, trained on 12 instruction datasets ranging from manually curated (e.g., OpenAssistant)...
A sentence can be translated into more than one correct sentence. However, most of the existing neural machine translation models only use one of the correct translations as the target, and the other correct sentences are punished as incorrect in the training stage. Since the correct translations of a sentence share a similar bag-of-words, it is possible to distinguish the correct translations from the incorrect ones by the bag-of-words. In this paper, we propose an approach that uses both the sentences and the bag-of-words as targets in the training stage, in order to encourage the model to generate the potentially correct sentences that have not appeared in the training set. We evaluate our approach on a...
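The dual-target idea above can be sketched as a combined loss: the usual per-token cross-entropy on one reference sentence, plus an order-free term that rewards probability mass on every word in the bag-of-words. This is a simplified sketch under strong assumptions (a single word distribution `probs` rather than per-position distributions, and an illustrative interpolation weight `lam`), not the paper's formulation.

```python
import math

def token_nll(probs, target_tokens):
    """Standard negative log-likelihood over one reference sentence."""
    return -sum(math.log(probs[t]) for t in target_tokens)

def bow_nll(probs, bag_of_words):
    """Order-free negative log-likelihood against the bag-of-words target."""
    return -sum(math.log(probs[w]) for w in bag_of_words)

def combined_loss(probs, target_tokens, bag_of_words, lam=0.5):
    """Interpolate the sentence target with the bag-of-words target, so
    mass on words from other correct translations is not punished."""
    return token_nll(probs, target_tokens) + lam * bow_nll(probs, bag_of_words)

# Toy usage: "chat" is absent from the chosen reference but present in the
# bag-of-words, so the second term still credits probability placed on it.
probs = {"the": 0.2, "cat": 0.5, "chat": 0.3}
print(round(combined_loss(probs, ["the", "cat"], ["the", "cat", "chat"]), 3))
```

The point of the second term is exactly what the abstract argues: alternative correct wordings share a bag-of-words, so they stop being treated as pure errors.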
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and composition. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow...
Health literacy has emerged as a crucial factor in making appropriate health decisions and ensuring treatment outcomes. However, medical jargon and the complex structure of professional language in this domain make health information especially hard to interpret. Thus, there is an urgent unmet need for automated methods to enhance the accessibility of the biomedical literature to the general population. This problem can be framed as a type of translation between the language of healthcare professionals and that of the general public. In this paper, we introduce a novel task...
A series of compounds with the ${\mathrm{ThMn}}_{12}$-type structure, R${\mathrm{Fe}}_{11.35}$${\mathrm{Nb}}_{0.65}$ (R=Y, Sm, Gd, Tb, Dy, Ho, Er, and Lu), were synthesized. The corresponding nitrides, obtained by gas-solid reactions, retained the same structure as their parent compounds, but with a relative volume expansion of 3%. The Nb atoms occupy the 8i sites in the ${\mathrm{ThMn}}_{12}$ structure. The highest Curie temperatures are 597 K and 773 K for ${\mathrm{GdFe}}_{11.35}$${\mathrm{Nb}}_{0.65}$ and its nitride,...
This paper introduces DuReader, a new large-scale, open-domain Chinese machine reading comprehension (MRC) dataset, designed to address real-world MRC. DuReader has three advantages over previous MRC datasets: (1) data sources: questions and documents are based on Baidu Search and Baidu Zhidao; answers are manually generated. (2) question types: it provides rich annotations for more question types, especially yes-no and opinion questions, which leaves more opportunity for the research community. (3) scale: it contains 200K questions, 420K...
Models of language trained on very large corpora have been demonstrated to be useful for natural language processing. As fixed artifacts, they have become the object of intense study, with many researchers "probing" the extent to which they acquire and readily demonstrate linguistic abstractions, factual and commonsense knowledge, and reasoning abilities. Recent work has applied several probes to intermediate training stages to observe the developmental process of a large-scale model (Chiang et al., 2020). Following this effort, we systematically...
Language models (LMs) have become ubiquitous in both NLP research and commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, this technical report details the first release of OLMo, a...
A new interstitial nitride, Sm3(Fe0.933Ti0.067)29Ny (y=5), has been synthesized. Its x-ray diffraction pattern can be indexed in the Nd3(Fe,Ti)29-type monoclinic symmetry with lattice parameters a=1.098 nm, b=0.882 nm, c=0.985 nm, and β=97.50°. The compound exhibits ferromagnetic ordering with a Curie temperature Tc=750 K. The saturation magnetization Ms is 160 A m2/kg at 4.2 K and 140 A m2/kg at 293 K, and the anisotropy field is 18.1 T at 4.2 K and 12.8 T at 293 K. A coercivity of μ0iHc=0.83 T was obtained, and a magnet based on this compound has been developed.