Visualization of Materials Science Topics in Publications of Institutional Repository using Natural Language Processing

topic map; FAIR data; Science; open access repository; Q; 05 social sciences; 0509 other social sciences
DOI: 10.3897/rio.8.e95679 Publication Date: 2022-10-12T16:31:56Z
ABSTRACT
SAMURAI (NIMS 2022), a directory service of researchers at the National Institute for Materials Science (NIMS) in Japan, was launched in 2009 following the development of the NIMS institutional repository (Tanifuji et al. 2019). The concept of SAMURAI is to synchronize researchers' profile information with their publications, which are self-archived in the system. SAMURAI was renewed in 2017 with functions interoperable with ORCID. It supports various links, not only to individual articles and patents but also to databases such as KAKEN (Database of Grants-in-Aid for Scientific Research by NII). By implementing a unique ResearcherID, SAMURAI has yielded fully identified authors and journal articles from NIMS research members. Through this directory, NIMS is promoting materials research, supporting the management of its research activities, and introducing its work to the public.

In this work, we present an application that describes each researcher's output topics automatically from the papers archived in the repository. It uses a materials science specific natural language processing (NLP) approach developed in our study (Dieb 2021) and visualizes the research trend of the researchers. This approach can maximize absorption by a general audience and corresponds to the open science policy.

A list of publications' digital object identifiers (DOIs; DOI 2022) for each researcher is constructed from his or her SAMURAI profile. (In SAMURAI, DOIs are stored in a PostgreSQL database.) Using the DOIs, recent publications were retrieved from the text data mining platform (TDM-PF) in XML format, which is mainly available from 2003. Representative topic terms related to materials science and engineering are extracted. We utilize term frequency analysis and automatic extraction of materials names to extract these necessary and informative terms. Additionally, domain knowledge resources such as dictionaries are used. Data are preprocessed using noise reduction, such as removing English stop words, and physical units filtering. Such terms do not have significance on their own. A word cloud is used for visualization (Fig. 1). This gives us the opportunity to apply our NLP experience to mine public data as a step towards data-driven materials science.
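The preprocessing and term frequency steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stop word and unit lists, the tokenizer, and the function names `tokenize` and `term_frequencies` are all hypothetical and deliberately tiny, and the materials-name extraction and domain dictionaries mentioned in the abstract are left out.

```python
import re
from collections import Counter

# Illustrative, deliberately small lists (assumptions, not the study's resources).
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "was", "at", "for", "to", "with", "on", "by"}
PHYSICAL_UNITS = {"nm", "k", "gpa", "ev", "mm", "s", "h"}

# A token starts with a letter and may contain letters and hyphens;
# bare numbers are therefore never emitted as tokens.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z\-]*")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def term_frequencies(documents):
    """Count terms across documents, skipping stop words and physical units."""
    counts = Counter()
    for document in documents:
        for token in tokenize(document):
            if token in STOP_WORDS or token in PHYSICAL_UNITS:
                continue  # such terms carry no significance on their own
            counts[token] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "The thin film was annealed at 300 K for 2 h.",
        "Annealed thin film of the alloy, 50 nm thick.",
    ]
    for term, count in term_frequencies(sample).most_common(5):
        print(term, count)
```

A frequency table of this kind is the usual input for a word cloud. With the third-party `wordcloud` package, for instance, it could be passed to `WordCloud().generate_from_frequencies(...)`, although the abstract does not say which visualization tool was used.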