NFDI4DS | UHH-SEMS - Publication Details

Fei Mi

ORCID: 0000-0001-6358-9922

About

Contact & Profiles

A5012014905

Research Areas

Topic Modeling
Natural Language Processing Techniques
Speech and dialogue systems
Multimodal Machine Learning Applications
Software Engineering Research
Domain Adaptation and Few-Shot Learning
Hate Speech and Cyberbullying Detection
Mycorrhizal Fungi and Plant Interactions
Fungal Biology and Applications
Luminescence Properties of Advanced Materials
Plant Pathogens and Fungal Diseases
Recommender Systems and Techniques
AI in Service Interactions
Gout, Hyperuricemia, Uric Acid
Perovskite Materials and Applications
Text Readability and Simplification
Luminescence and Fluorescent Materials
Animal Genetics and Reproduction
CCD and CMOS Imaging Sensors
Alcohol Consumption and Health Effects
CRISPR and Genetic Engineering
Machine Learning in Healthcare
Advanced Photocatalysis Techniques
Advanced Bandit Algorithms Research
Mental Health via Writing

Guangdong Medical College
2023-2025

Kunming Medical University
2019-2024

IT University of Copenhagen
2023

Huawei Technologies (Sweden)
2021-2023

Chinese University of Hong Kong
2023

National University of Singapore
2023

Huawei Technologies (China)
2022-2023

Tokyo Institute of Technology
2023

Administration for Community Living
2023

American Jewish Committee
2023

Graphitic C3N4 quantum dots for next-generation QLED displays

OPENALEX - Publications

Liangrui He Fei Mi Jie Chen Yunfei Tian Yang Jiang and 7 more

Quantum dot light-emitting diode (QLED) displays are considered a next-generation technology, but previously reported quantum dots (QDs) consisting of heavy metals are toxic and harmful. This work examined earth-abundant, metal-free, graphitic C3N4 (g-C3N4), with exceptional optical and electronic properties, excellent chemical and thermal stability, an appropriate band gap, and non-toxicity, for QLED applications. The dependence of the luminescence performance on reaction atmosphere and temperature; transformation...

10.1016/j.mattod.2018.06.008 article EN cc-by-nc-nd Materials Today 2018-08-10

Meta-Learning for Low-resource Natural Language Generation in Task-oriented Dialogue Systems

Fei Mi Minlie Huang Jiyong Zhang Boi Faltings

Natural language generation (NLG) is an essential component of task-oriented dialogue systems. Despite the recent success of neural approaches for NLG, they are typically developed for particular domains with rich annotated training examples. In this paper, we study NLG in a low-resource setting to generate sentences in new scenarios with a handful of training examples. We formulate the problem from a meta-learning perspective, and propose a generalized optimization-based approach (Meta-NLG) based on the well-recognized model-agnostic meta-learning (MAML)...

10.24963/ijcai.2019/437 article EN 2019-07-28

SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

Xin Wang Yasheng Wang Fei Mi Pingyi Zhou Yao Wan and 5 more

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for source code (e.g., CuBERT and CodeBERT) have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code search, code clone detection, and program translation. Current approaches typically consider the source code as a plain sequence of tokens, or inject the structure information (e.g., AST and data-flow)...

10.48550/arxiv.2108.04556 preprint EN other-oa arXiv (Cornell University) 2021-01-01

COLD: A Benchmark for Chinese Offensive Language Detection

Jiawen Deng Jingyan Zhou Hao Sun Chujie Zheng Fei Mi and 2 more

Offensive language detection is increasingly crucial for maintaining a civilized social media platform and deploying pre-trained language models. However, this task in Chinese is still under exploration due to the scarcity of reliable datasets. To this end, we propose a benchmark, COLD, for Chinese offensive language analysis, including a Chinese Offensive Language Dataset (COLDATASET) and a baseline detector (COLDETECTOR), which is trained on the dataset. We show that the COLD benchmark contributes to Chinese offensive language detection, which is challenging for existing resources. We then deploy the COLDETECTOR and conduct detailed analyses...

10.18653/v1/2022.emnlp-main.796 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01

Aligning Large Language Models with Human: A Survey

Yufei Wang Wanjun Zhong Liangyou Li Fei Mi Xingshan Zeng and 4 more

Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect (hallucinated) information. Hence, aligning LLMs with human expectations has become an active area of interest within the research community. This survey presents...

10.48550/arxiv.2307.12966 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Understanding the Local and Electronic Structures toward Enhanced Thermal Stable Luminescence of CaAlSiN3:Eu2+

Lei Chen Fei Mi Zhao Zhang Yang Jiang Shifu Chen and 7 more

It is a great challenge to maintain thermally stable luminescence of red phosphors in white light-emitting diodes (LEDs), because of the large Stokes shift. For the purpose of overcoming this challenge, this work elucidates the intrinsic mechanism of thermal quenching of CaAlSiN3:Eu2+. The empty 5d orbital of Eu2+ is partly filled with electrons upon increasing, as observed using XANES; and the exceptional expansion of the local Eu–N bond length, the ratio of which is far larger than the expansion of the volume of the crystal lattice brought by doping Eu2+, is measured...

10.1021/acs.chemmater.6b02121 article EN Chemistry of Materials 2016-07-10

Exploring the Species Diversity of Edible Mushrooms in Yunnan, Southwestern China, by DNA Barcoding

Ying Zhang Meizi Mo Liu Yang Fei Mi Yang Cao and 4 more

Yunnan Province, China, is famous for its abundant wild edible mushroom diversity and is a rich source of wild mushrooms for the world’s mushroom trade markets. However, much remains unknown about these mushrooms, including the number of species and their distributions. In this study, we collected and analyzed 3585 mushroom samples from markets in 35 counties across Yunnan Province from 2010 to 2019. Among these samples, we successfully obtained DNA barcode sequences from 2198 samples. Sequence comparisons revealed that these samples likely belonged to 159 known species in 56 different genera, 31...

10.3390/jof7040310 article EN cc-by Journal of Fungi 2021-04-17

Continual Prompt Tuning for Dialog State Tracking

Qi Zhu Bing Li Fei Mi Xiaoyan Zhu Minlie Huang

A desirable dialog system should be able to continually learn new skills without forgetting old ones, and thereby adapt to new domains or tasks in its life cycle. However, continually training a model often leads to the well-known catastrophic forgetting issue. In this paper, we present Continual Prompt Tuning, a parameter-efficient framework that not only avoids forgetting but also enables knowledge transfer between tasks. To avoid forgetting, we store a few prompt tokens' embeddings for each task while freezing the backbone pre-trained model....

10.18653/v1/2022.acl-long.80 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Cafestol inhibits colon cancer cell proliferation and tumor growth in xenograft mice by activating LKB1/AMPK/ULK1-dependent autophagy

Yuemei Feng JiZhuo Yang Yihan Wang Xue Wang Qian Ma and 8 more

10.1016/j.jnutbio.2024.109623 article EN The Journal of Nutritional Biochemistry 2024-03-15

Masking as an Efficient Alternative to Finetuning for Pretrained Language Models

Mengjie Zhao Tao Lin Fei Mi Martin Jaggi Hinrich Schütze

We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning. Extensive evaluations of masking BERT, RoBERTa, and DistilBERT on eleven diverse NLP tasks show that our masking scheme yields performance comparable to finetuning, yet has a much smaller memory footprint when several tasks need to be inferred. Intrinsic evaluations show that representations computed by masked language models encode information necessary for solving downstream tasks....

10.18653/v1/2020.emnlp-main.174 preprint EN cc-by 2020-01-01

Continual Learning for Natural Language Generation in Task-oriented Dialog Systems

Fei Mi Liangwei Chen Mengjie Zhao Minlie Huang Boi Faltings

Natural language generation (NLG) is an essential component of task-oriented dialog systems. Despite the recent success of neural approaches for NLG, they are typically developed in an offline manner for particular domains. To better fit real-life applications where new data come in a stream, we study NLG in a “continual learning” setting to expand its knowledge to new domains or functionalities incrementally. The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends...

10.18653/v1/2020.findings-emnlp.310 article EN cc-by 2020-01-01

ADER: Adaptively Distilled Exemplar Replay Towards Continual Learning for Session-based Recommendation

Fei Mi Xiaoyu Lin Boi Faltings

Session-based recommendation has received growing attention recently due to the increasing privacy concern. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs...

10.1145/3383313.3412218 article EN 2020-09-19

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization

Zhexin Zhang Junxiao Yang Pei Ke Fei Mi Hongning Wang and 1 more

10.18653/v1/2024.acl-long.481 article EN 2024-01-01

Dynamic Data Selection with Normalized Gradient-Based Influence Approximation for Targeted Fine-Tuning of Llms

Zige Wang Qi Zhu Fei Mi Yasheng Wang Haotian Wang and 1 more

10.2139/ssrn.5206107 preprint EN 2025-01-01

Efficient TALEN-mediated myostatin gene editing in goats

Baoli Yu Rui Lü Yuguo Yuan Ting Zhang Shaozheng Song and 5 more

Myostatin (MSTN) encodes a negative regulator of skeletal muscle mass that might have applications for promoting muscle growth in livestock. In this study, we aimed to test whether targeted MSTN editing, mediated by transcription activator-like effector nucleases (TALENs), is a viable approach to create myostatin-modified goats (Capra hircus). We obtained a pair of TALENs (MTAL-2) that could recognize and cut the targeted MSTN site in the goat genome. Fibroblasts from pedigreed goats were co-transfected with MTAL-2, and 272 monoclonal cell...

10.1186/s12861-016-0126-9 article EN cc-by BMC Developmental Biology 2016-07-26

Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs

Hongru Wang Rui Wang Fei Mi Yang Deng Zezhong Wang and 3 more

Large Language Models (LLMs), such as ChatGPT, greatly empower dialogue systems with strong language understanding and generation capabilities. However, most of the previous works prompt the LLMs to directly generate a response based on the dialogue context, overlooking the underlying linguistic cues about the user status exhibited in the context. Such in-depth dialogue scenarios are challenging for existing LLMs to figure out the user's hidden needs and respond satisfactorily through single-step inference. To this end, we propose a novel...

10.18653/v1/2023.findings-emnlp.806 article EN cc-by 2023-01-01

Personalization in Goal-Oriented Dialog

Chaitanya K. Joshi Fei Mi Boi Faltings

The main goal of modeling human conversation is to create agents which can interact with people in both open-ended and goal-oriented scenarios. End-to-end trained neural dialog systems are an important line of research for such generalized dialog models as they do not resort to any situation-specific handcrafting of rules. However, incorporating personalization into such systems is a largely unexplored topic as there are no existing corpora to facilitate such work. In this paper, we present a new dataset of goal-oriented dialogs which are influenced by speaker...

10.48550/arxiv.1706.07503 preprint EN other-oa arXiv (Cornell University) 2017-01-01

The Association Between Long-term Exposure to Ambient Air Pollution and Bone Strength in China

Jialong Wu Bing Guo Han Guan Fei Mi Jingru Xu and 11 more

Context: Evidence regarding the association of long-term exposure to air pollution on bone strength or osteoporosis is rare, especially in highly polluted low- and middle-income countries. Little is known about whether the association between air pollution and bone strength changes at different bone strength distributions. Objective: Using the baseline data from the China Multi-Ethnic Cohort, we investigated the association between long-term air pollution exposure and bone strength. Methods: We used multiple linear models to estimate the association between air pollution and bone strength, and conducted quantile regression to investigate the variation of this association in the distribution of bone strength. The 3-year...

10.1210/clinem/dgab462 article EN publisher-specific-oa The Journal of Clinical Endocrinology & Metabolism 2021-07-15

Compilable Neural Code Generation with Compiler Feedback

Xin Wang Yasheng Wang Yao Wan Fei Mi Yitong Li and 5 more

Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering. Existing deep-learning approaches model code generation as text generation, either constrained by grammar structures in the decoder, or driven by pre-trained language models on a large-scale code corpus (e.g., CodeGPT, PLBART, and CodeT5). However, few of them account for the compilability of the generated programs. To improve the compilability of the generated programs,...

10.18653/v1/2022.findings-acl.2 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark

Jingyan Zhou Jiawen Deng Fei Mi Yitong Li Yasheng Wang and 4 more

Among all the safety concerns that hinder the deployment of open-domain dialog systems (e.g., offensive languages, biases, and toxic behaviors), social bias presents an insidious challenge. Addressing this challenge requires rigorous analyses and normative reasoning. In this paper, we focus our investigation on social bias measurement to facilitate the development of unbiased dialog systems. We first propose a novel Dial-Bias Framework for analyzing the social bias in conversations using a holistic method beyond bias lexicons or dichotomous...

10.18653/v1/2022.findings-emnlp.262 article EN cc-by 2022-01-01
