- Particle physics theoretical and experimental studies
- Quantum Chromodynamics and Particle Interactions
- High-Energy Particle Collisions Research
- Neutrino Physics Research
- Computational Physics and Python Applications
- Dark Matter and Cosmic Phenomena
- Particle Detector Development and Performance
- Topic Modeling
- Semantic Web and Ontologies
- Black Holes and Theoretical Physics
- Particle Accelerators and Free-Electron Lasers
- Graph Theory and Algorithms
- Advanced Database Systems and Queries
- Nuclear physics research studies
- Data Management and Algorithms
- Facility Location and Emergency Management
- Medical Imaging Techniques and Applications
- Web Data Mining and Analysis
- Scientific Computing and Data Management
- Natural Language Processing Techniques
- Stochastic processes and statistical mechanics
- Atomic and Subatomic Physics Research
- Text and Document Classification Technologies
- Vehicle Routing Optimization Methods
- Advanced Graph Neural Networks
Peking University
2012-2025
State Key Laboratory of Nuclear Physics and Technology
2020-2025
Sichuan University
2024-2025
Ningbo No. 2 Hospital
2023-2025
Chinese Academy of Sciences
2011-2024
Computer Network Information Center
2012-2024
Université Paris-Saclay
2023
Laboratoire de Physique des 2 Infinis Irène Joliot-Curie
2023
Institut National de Physique Nucléaire et de Physique des Particules
2023
Centre National de la Recherche Scientifique
2023
In this paper we describe a new release of Web scale entity graph that serves as the backbone Microsoft Academic Service (MAS), major production effort with broadened scope to namesake vertical search engine has been publicly available since 2008 research prototype. At core MAS is heterogeneous comprised six types entities model scholarly activities: field study, author, institution, paper, venue, and event. addition obtaining these from publisher feeds in previous effort, version include...
The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on and related historical coronavirus research. CORD-19 designed to facilitate the development text mining information retrieval systems over its rich collection metadata structured full papers. Since release, has been downloaded 200K times served as basis many discovery systems. In this article, we describe mechanics dataset construction, highlighting challenges key design decisions, provide an overview...
An ongoing project explores the extent to which artificial intelligence (AI), specifically in areas of natural language processing and semantic reasoning, can be exploited facilitate studies science by deploying software agents equipped with understanding capabilities read scholarly publications on web. The knowledge extracted these AI is organized into a heterogeneous graph, called Microsoft Academic Graph (MAG), where nodes edges represent entities engaging communications relationships...
This REgional Carbon Cycle Assessment and Processes regional study provides a synthesis of the carbon balance terrestrial ecosystems in East Asia, region comprised China, Japan, North South Korea, Mongolia. We estimate current Asia its driving mechanisms during 1990–2009 using three different approaches: inventories combined with satellite greenness measurements, ecosystem cycle models atmospheric inversion models. The magnitudes Asia's sink from these approaches are comparable:...
To enable efficient exploration of Web-scale scientific knowledge, it is necessary to organize publications into a hierarchical concept structure. In this work, we present large-scale system (1) identify hundreds thousands concepts, (2) tag these identified concepts millions by leveraging both text and graph structure, (3) build six-level hierarchy with subsumption-based model. The builds the most comprehensive cross-domain ontology published date, more than 200 thousand over one million...
In this article, we are interested in routing vehicles to service a large-scale bioterrorism emergency. We describe the specifics of such emergency and decompose problem into two stages: planning stage an operational stage. stage, generate routes well advance any take account planned information revealed at time emergency, decide delivery quantity adjustments routes. propose mathematical formulations solution approaches for both stages. Lastly, demonstrate effectiveness our...
Progress in science has advanced the development of human society across history, with dramatic revolutions shaped by information theory, genetic cloning, and artificial intelligence, among many scientific achievements produced 20th century. However, way that advances itself is much less well-understood. In this work, we study evolution over past century presenting an anatomy 89 million digitalized papers published between 1900 2015. We find benefited from shift individual work to...
We present the design and methodology for large scale hybrid paper recommender system used by Microsoft Academic. The provides recommendations approximately 160 million English research papers patents. Our approach handles incomplete citation information while also alleviating cold-start problem that often affects other systems. use Academic Graph (MAG), titles, available abstracts of to build a recommendation list all documents, thereby combining co-citation content based approaches. Tuning...
The prevalence of metabolic syndrome among people living with HIV (PLWH) is increasing worldwide. This study aimed to develop and validate a nomogram predict the risk in PLWH receiving antiretroviral therapy (ART) China, accounting for both traditional HIV-specific factors. A retrospective cohort was conducted ART at designated treatment center Yinzhou District, China. total 774 patients were randomly assigned development validation cohorts 5:5 ratio. Predictive variables identified using...
Many aspects and properties of Recommender Systems have been well studied in the past decade, however, impact User Fatigue has mostly ignored literature. fatigue represents phenomenon that a user quickly loses interest on recommended item if same presented to this multiple times before. The direct caused by is dramatic decrease Click Through Rate (CTR, i.e., ratio clicks impressions). In paper, we present comprehensive study research online recommender systems. By analyzing behavioral logs...
Conservation laws are considered to be fundamental of nature. It has broad applications in many fields, including physics, chemistry, biology, geology, and engineering. Solving the differential equations associated with conservation is a major branch computational mathematics. The recent success machine learning, especially deep learning areas such as computer vision natural language processing, attracted lot attention from community mathematics inspired intriguing works combining traditional...
Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant labels from large candidate set. Most existing LMTC approaches rely on massive human-annotated training data, which are often costly obtain and suffer long-tailed label distribution (i.e., many occur only few times in the set). In this paper, we study under zero-shot setting, does not require any annotated documents relies surface names descriptions. To train classifier that calculates...
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. For example, online retailers (e.g., Amazon eBay) use taxonomies product recommendation, search engines Google Bing) leverage to enhance query understanding. Enormous efforts have been made on constructing either manually or semi-automatically. However, with the fast-growing volume content, existing will become outdated fail capture emerging knowledge. Therefore, in applications,...
The genomic variations of SARS-CoV-2 continue to emerge and spread worldwide. Some mutant strains show increased transmissibility virulence, which may cause reduced protection provided by vaccines. Thus, it is necessary continuously monitor analyze the SARS-CoV-2 genomes. We established an evaluation prewarning system, system (VarEPS), including known virtual mutations genomes achieve rapid risks posed strains. From perspective genomics structural biology, database comprehensively analyzes...
Preprint is a version of scientific paper that publicly distributed preceding formal peer review. Since the launch arXiv in 1991, preprints have been increasingly over Internet as opposed to copies. It allows open online access disseminate original research within few days, often at very low operating cost. This work overviews how preprint has evolving and impacting community past thirty years alongside growth Web. In this work, we first report number exponentially increased 63 times 30...
Here, we present the manually curated Global Catalogue of Pathogens (gcPathogen), an extensive genomic resource designed to facilitate rapid and accurate pathogen analysis, epidemiological exploration monitoring antibiotic resistance features virulence factors. The catalogue seamlessly integrates analyzes data associated metadata for human pathogens isolated from infected patients, animal hosts, food environment. list is supported by evidence medical or government pathogenic lists...