Fei Mi

ORCID: 0000-0001-6358-9922
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Multimodal Machine Learning Applications
  • Software Engineering Research
  • Domain Adaptation and Few-Shot Learning
  • Hate Speech and Cyberbullying Detection
  • Mycorrhizal Fungi and Plant Interactions
  • Fungal Biology and Applications
  • Luminescence Properties of Advanced Materials
  • Plant Pathogens and Fungal Diseases
  • Recommender Systems and Techniques
  • AI in Service Interactions
  • Gout, Hyperuricemia, Uric Acid
  • Perovskite Materials and Applications
  • Text Readability and Simplification
  • Luminescence and Fluorescent Materials
  • Animal Genetics and Reproduction
  • CCD and CMOS Imaging Sensors
  • Alcohol Consumption and Health Effects
  • CRISPR and Genetic Engineering
  • Machine Learning in Healthcare
  • Advanced Photocatalysis Techniques
  • Advanced Bandit Algorithms Research
  • Mental Health via Writing

Guangdong Medical College
2023-2025

Kunming Medical University
2019-2024

IT University of Copenhagen
2023

Huawei Technologies (Sweden)
2021-2023

Chinese University of Hong Kong
2023

National University of Singapore
2023

Huawei Technologies (China)
2022-2023

Tokyo Institute of Technology
2023

Administration for Community Living
2023

American Jewish Committee
2023

Quantum dot light-emitting diode (QLED) displays are considered a next-generation technology, but previously reported quantum dots (QDs) consisting of heavy metals are toxic and harmful. This work examined earth-abundant, metal-free, graphitic C3N4 (g-C3N4) with exceptional optical and electronic properties, excellent chemical and thermal stability, an appropriate band gap, and non-toxicity for QLED applications. The dependence of the luminescence performance on reaction atmosphere and temperature; transformation...

10.1016/j.mattod.2018.06.008 article EN cc-by-nc-nd Materials Today 2018-08-10

Natural language generation (NLG) is an essential component of task-oriented dialogue systems. Despite the recent success of neural approaches for NLG, they are typically developed for particular domains with rich annotated training examples. In this paper, we study NLG in a low-resource setting to generate sentences in new scenarios with a handful of training examples. We formulate the problem from a meta-learning perspective, and propose a generalized optimization-based approach (Meta-NLG) based on the well-recognized model-agnostic meta-learning (MAML)...

10.24963/ijcai.2019/437 article EN 2019-07-28

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for source code (e.g., CuBERT and CodeBERT) have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code search, code clone detection, and program translation. Current approaches typically consider the source code as a plain sequence of tokens, or inject structure information (e.g., AST and data-flow)...

10.48550/arxiv.2108.04556 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Offensive language detection is increasingly crucial for maintaining a civilized social media platform and deploying pre-trained language models. However, this task in Chinese is still under exploration due to the scarcity of reliable datasets. To this end, we propose a benchmark, COLD, for Chinese offensive language analysis, including a Chinese Offensive Language Dataset (COLDATASET) and a baseline detector (COLDETECTOR) which is trained on the dataset. We show that COLD contributes to Chinese offensive language detection, which is challenging for existing resources. We then deploy COLDETECTOR and conduct detailed analyses...

10.18653/v1/2022.emnlp-main.796 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01

Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect (hallucinated) information. Hence, aligning LLMs with human expectations has become an active area of interest within the research community. This survey presents...

10.48550/arxiv.2307.12966 preprint EN cc-by arXiv (Cornell University) 2023-01-01

It is a great challenge to maintain thermally stable luminescence of red phosphors in white light-emitting diodes (LEDs), because of the large Stokes shift. To overcome this challenge, this work elucidates the intrinsic mechanism of thermal quenching in CaAlSiN3:Eu2+. The empty 5d orbital of Eu2+ is partly filled with electrons as the temperature increases, as observed using XANES; and the exceptional expansion of the local Eu–N bond length, the ratio of which is far larger than that of the volume of the crystal lattice brought about by doping Eu2+, measured...

10.1021/acs.chemmater.6b02121 article EN Chemistry of Materials 2016-07-10

Yunnan Province, China, is famous for its abundant wild edible mushroom diversity and is a rich source for the world's wild mushroom trade markets. However, much remains unknown about these mushrooms, including the number of species and their distributions. In this study, we collected and analyzed 3585 samples from markets in 35 counties across Yunnan Province from 2010 to 2019. Among these samples, we successfully obtained DNA barcode sequences from 2198 samples. Sequence comparisons revealed that they likely belonged to 159 known species in 56 different genera, 31...

10.3390/jof7040310 article EN cc-by Journal of Fungi 2021-04-17

A desirable dialog system should be able to continually learn new skills without forgetting old ones, and thereby adapt to new domains or tasks in its life cycle. However, continually training a model often leads to the well-known catastrophic forgetting issue. In this paper, we present Continual Prompt Tuning, a parameter-efficient framework that not only avoids forgetting but also enables knowledge transfer between tasks. To avoid forgetting, we store a few prompt tokens' embeddings for each task while freezing the backbone pre-trained model....

10.18653/v1/2022.acl-long.80 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning. Extensive evaluations of masking BERT, RoBERTa, and DistilBERT on eleven diverse NLP tasks show that our masking scheme yields performance comparable to finetuning, yet has a much smaller memory footprint when several tasks need to be inferred. Intrinsic evaluations show that representations computed by masked models encode information necessary for solving downstream tasks....

10.18653/v1/2020.emnlp-main.174 preprint EN cc-by 2020-01-01

Natural language generation (NLG) is an essential component of task-oriented dialog systems. Despite the recent success of neural approaches for NLG, they are typically developed in an offline manner for particular domains. To better fit real-life applications where new data come in a stream, we study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities incrementally. The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends...

10.18653/v1/2020.findings-emnlp.310 article EN cc-by 2020-01-01

Session-based recommendation has received growing attention recently due to increasing privacy concerns. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and thus requires "continual learning" in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs...

10.1145/3383313.3412218 article EN 2020-09-19

Myostatin (MSTN) encodes a negative regulator of skeletal muscle mass that might have applications for promoting muscle growth in livestock. In this study, we aimed to test whether targeted MSTN editing, mediated by transcription activator-like effector nucleases (TALENs), is a viable approach to create myostatin-modified goats (Capra hircus). We obtained a pair of TALENs (MTAL-2) that could recognize and cut the target site in the goat genome. Fibroblasts from pedigreed goats were co-transfected with MTAL-2, and 272 monoclonal cell...

10.1186/s12861-016-0126-9 article EN cc-by BMC Developmental Biology 2016-07-26

Large Language Models (LLMs), such as ChatGPT, greatly empower dialogue systems with strong language understanding and generation capabilities. However, most previous works prompt the LLMs to directly generate a response based on the dialogue context, overlooking the underlying linguistic cues about the user status exhibited in the context. In such in-depth dialogue scenarios, it is challenging for existing LLMs to figure out the user's hidden needs and respond satisfactorily through single-step inference. To this end, we propose a novel...

10.18653/v1/2023.findings-emnlp.806 article EN cc-by 2023-01-01

The main goal of modeling human conversation is to create agents which can interact with people in both open-ended and goal-oriented scenarios. End-to-end trained neural dialog systems are an important line of research for such generalized dialog models as they do not resort to any situation-specific handcrafting of rules. However, incorporating personalization into such systems is a largely unexplored topic as there are no existing corpora to facilitate such work. In this paper, we present a new dataset of dialogs influenced by speaker...

10.48550/arxiv.1706.07503 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Context: Evidence regarding the association of long-term exposure to air pollution with bone strength or osteoporosis is rare, especially in highly polluted low- and middle-income countries. Little is known about whether the association between air pollution and bone strength changes at different bone strength distributions. Objective: Using baseline data from the China Multi-Ethnic Cohort, we investigated the association between long-term air pollution exposure and bone strength. Methods: We used multiple linear models to estimate the association between air pollution and bone strength, and conducted quantile regression to investigate the variation of this association across the distribution of bone strength. The 3-year...

10.1210/clinem/dgab462 article EN publisher-specific-oa The Journal of Clinical Endocrinology & Metabolism 2021-07-15

Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering. Existing deep-learning approaches model code generation as text generation, either constrained by grammar structures in the decoder, or driven by pre-trained language models on a large-scale code corpus (e.g., CodeGPT, PLBART, and CodeT5). However, few of them account for the compilability of the generated programs. To improve the compilability of the generated programs,...

10.18653/v1/2022.findings-acl.2 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

Among all the safety concerns that hinder the deployment of open-domain dialog systems (e.g., offensive languages, biases, and toxic behaviors), social bias presents an insidious challenge. Addressing this challenge requires rigorous analyses and normative reasoning. In this paper, we focus our investigation on social bias measurement to facilitate the development of unbiased dialog systems. We first propose a novel Dial-Bias Framework for analyzing the social bias in conversations using a holistic method beyond bias lexicons or dichotomous...

10.18653/v1/2022.findings-emnlp.262 article EN cc-by 2022-01-01