NFDI4DS | UHH-SEMS - Publication Details

MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property

Benchmark (surveying)

DOI: 10.48550/arxiv.2402.16389

Publication Date: 2024-02-26

AUTHORS (12)

Shiwen Ni

Minghuan Tan

Yuelin Bai

Fuqiang Niu

Min Yang

Bowen Zhang

Ruifeng Xu

Xiaojun Chen

Chengming Li

Xiping Hu

Ye Li

Jianping Fan

ABSTRACT

Large language models (LLMs) have demonstrated impressive performance in various natural language processing (NLP) tasks. However, there is limited understanding of how well LLMs perform in specific domains (e.g., the intellectual property (IP) domain). In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain. The MoZIP benchmark includes three challenging tasks: IP multiple-choice quiz (IPQuiz), IP question answering (IPQA), and patent matching (PatentMatch). In addition, we also develop a new IP-oriented multilingual large language model (called MoZi), which is a BLOOMZ-based model that has been supervised fine-tuned with multilingual IP-related text data. We evaluate our proposed MoZi model and four well-known LLMs (i.e., BLOOMZ, BELLE, ChatGLM and ChatGPT) on the MoZIP benchmark. Experimental results demonstrate that MoZi outperforms BLOOMZ, BELLE and ChatGLM by a noticeable margin, while it had lower scores compared with ChatGPT. Notably, the performance of current LLMs on the MoZIP benchmark has much room for improvement, and even the most powerful ChatGPT does not reach the passing level. Our source code, data, and models are available at https://github.com/AI-for-Science/MoZi.
