Z. Shen

ORCID: 0000-0003-1391-5384
Research Areas
  • Particle physics theoretical and experimental studies
  • Quantum Chromodynamics and Particle Interactions
  • High-Energy Particle Collisions Research
  • Neutrino Physics Research
  • Computational Physics and Python Applications
  • Dark Matter and Cosmic Phenomena
  • Particle Detector Development and Performance
  • Topic Modeling
  • Semantic Web and Ontologies
  • Black Holes and Theoretical Physics
  • Particle Accelerators and Free-Electron Lasers
  • Graph Theory and Algorithms
  • Advanced Database Systems and Queries
  • Nuclear physics research studies
  • Data Management and Algorithms
  • Facility Location and Emergency Management
  • Medical Imaging Techniques and Applications
  • Web Data Mining and Analysis
  • Scientific Computing and Data Management
  • Natural Language Processing Techniques
  • Stochastic processes and statistical mechanics
  • Atomic and Subatomic Physics Research
  • Text and Document Classification Technologies
  • Vehicle Routing Optimization Methods
  • Advanced Graph Neural Networks

Peking University
2012-2025

State Key Laboratory of Nuclear Physics and Technology
2020-2025

Sichuan University
2024-2025

Ningbo No. 2 Hospital
2023-2025

Chinese Academy of Sciences
2011-2024

Computer Network Information Center
2012-2024

Université Paris-Saclay
2023

Laboratoire de Physique des 2 Infinis Irène Joliot-Curie
2023

Institut National de Physique Nucléaire et de Physique des Particules
2023

Centre National de la Recherche Scientifique
2023

In this paper we describe a new release of a Web scale entity graph that serves as the backbone of Microsoft Academic Service (MAS), a major production effort with a broadened scope to the namesake vertical search engine that has been publicly available since 2008 as a research prototype. At the core of MAS is a heterogeneous entity graph comprised of six types of entities that model scholarly activities: field of study, author, institution, paper, venue, and event. In addition to obtaining these entities from publisher feeds as in the previous effort, in this version we include...

10.1145/2740908.2742839 article EN 2015-05-18

The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many Covid-19 text mining and discovery systems. In this article, we describe the mechanics of dataset construction, highlighting challenges and key design decisions, provide an overview...

10.48550/arxiv.2004.10706 preprint EN cc-by arXiv (Cornell University) 2020-01-01

An ongoing project explores the extent to which artificial intelligence (AI), specifically in the areas of natural language processing and semantic reasoning, can be exploited to facilitate the studies of science by deploying software agents equipped with natural language understanding capabilities to read scholarly publications on the web. The knowledge extracted by these AI agents is organized into a heterogeneous graph, called Microsoft Academic Graph (MAG), where the nodes and the edges represent the entities engaging in scholarly communications and the relationships...

10.1162/qss_a_00021 article EN cc-by Quantitative Science Studies 2020-01-23

This REgional Carbon Cycle Assessment and Processes regional study provides a synthesis of the carbon balance of terrestrial ecosystems in East Asia, a region comprised of China, Japan, North and South Korea, and Mongolia. We estimate the current carbon balance of East Asia and its driving mechanisms during 1990–2009 using three different approaches: inventories combined with satellite greenness measurements, ecosystem carbon cycle models, and atmospheric inversion models. The magnitudes of East Asia's carbon sink from these three approaches are comparable:...

10.5194/bg-9-3571-2012 article EN cc-by Biogeosciences 2012-09-07

To enable efficient exploration of Web-scale scientific knowledge, it is necessary to organize scientific publications into a hierarchical concept structure. In this work, we present a large-scale system to (1) identify hundreds of thousands of scientific concepts, (2) tag these identified concepts to hundreds of millions of scientific publications by leveraging both text and graph structure, and (3) build a six-level concept hierarchy with a subsumption-based model. The system builds the most comprehensive cross-domain scientific concept ontology published to date, with more than 200 thousand concepts and over one million...

10.18653/v1/p18-4015 article EN cc-by 2018-01-01

In this article, we are interested in routing vehicles to service a large-scale bioterrorism emergency. We describe the specifics of such an emergency and decompose the problem into two stages: a planning stage and an operational stage. In the planning stage, we generate routes well in advance of any emergency, and in the operational stage, we take into account the planned routes and the information revealed at the time of the emergency, and decide delivery quantity adjustments to the routes. We propose mathematical formulations and solution approaches for both stages. Lastly, we demonstrate the effectiveness of our...

10.1002/net.20337 article EN Networks 2009-10-16

Progress in science has advanced the development of human society across history, with dramatic revolutions shaped by information theory, genetic cloning, and artificial intelligence, among the many scientific achievements produced in the 20th century. However, the way that science advances itself is much less well-understood. In this work, we study the evolution of scientific development over the past century by presenting an anatomy of 89 million digitalized papers published between 1900 and 2015. We find that science has benefited from the shift from individual work to...

10.1145/3097983.3098016 article EN 2017-08-04

We present the design and methodology for the large scale hybrid paper recommender system used by Microsoft Academic. The system provides recommendations for approximately 160 million English research papers and patents. Our approach handles incomplete citation information while also alleviating the cold-start problem that often affects other recommender systems. We use the Microsoft Academic Graph (MAG), titles, and available abstracts of research papers to build a recommendation list for all documents, thereby combining co-citation and content based approaches. Tuning...

10.1145/3308558.3313700 preprint EN 2019-05-13

10.1007/s13762-024-06198-z article EN International Journal of Environmental Science and Technology 2025-01-04

The prevalence of metabolic syndrome among people living with HIV (PLWH) is increasing worldwide. This study aimed to develop and validate a nomogram to predict the risk of metabolic syndrome in PLWH receiving antiretroviral therapy (ART) in China, accounting for both traditional and HIV-specific risk factors. A retrospective cohort study was conducted among PLWH receiving ART at a designated treatment center in Yinzhou District, China. A total of 774 patients were randomly assigned to development and validation cohorts in a 5:5 ratio. Predictive variables were identified using...

10.3389/fcimb.2025.1514823 article EN cc-by Frontiers in Cellular and Infection Microbiology 2025-02-20

10.1016/j.nima.2025.170473 article EN Nuclear Instruments and Methods in Physics Research Section A Accelerators Spectrometers Detectors and Associated Equipment 2025-03-01

Many aspects and properties of Recommender Systems have been well studied in the past decade; however, the impact of User Fatigue has been mostly ignored in the literature. User fatigue represents the phenomenon that a user quickly loses interest in the recommended item if the same item has been presented to this user multiple times before. The direct impact caused by user fatigue is the dramatic decrease of the Click Through Rate (CTR, i.e., the ratio of clicks to impressions). In this paper, we present a comprehensive study on the research of user fatigue in online recommender systems. By analyzing user behavioral logs...

10.1145/2872427.2874813 article EN 2016-04-11

Conservation laws are considered to be fundamental laws of nature. They have broad applications in many fields, including physics, chemistry, biology, geology, and engineering. Solving the differential equations associated with conservation laws is a major branch of computational mathematics. The recent success of machine learning, especially deep learning in areas such as computer vision and natural language processing, has attracted a lot of attention from the community of computational mathematics and inspired many intriguing works in combining machine learning with traditional...

10.4208/cicp.oa-2020-0194 article EN Communications in Computational Physics 2020-01-01

Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant labels from a large candidate set. Most existing LMTC approaches rely on massive human-annotated training data, which are often costly to obtain and suffer from a long-tailed label distribution (i.e., many labels occur only a few times in the training set). In this paper, we study LMTC under the zero-shot setting, which does not require any annotated documents with labels and only relies on label surface names and descriptions. To train a classifier that calculates...

10.1145/3485447.3512174 article EN Proceedings of the ACM Web Conference 2022 2022-04-25

Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. For example, online retailers (e.g., Amazon and eBay) use taxonomies for product recommendation, and web search engines (e.g., Google and Bing) leverage taxonomies to enhance query understanding. Enormous efforts have been made on constructing taxonomies either manually or semi-automatically. However, with the fast-growing volume of web content, existing taxonomies will become outdated and fail to capture emerging knowledge. Therefore, in many applications,...

10.1145/3366423.3380132 preprint EN 2020-04-20

The genomic variations of SARS-CoV-2 continue to emerge and spread worldwide. Some mutant strains show increased transmissibility and virulence, which may cause reduced protection provided by vaccines. Thus, it is necessary to continuously monitor and analyze the genomic variations of SARS-CoV-2 genomes. We established an evaluation and prewarning system, the SARS-CoV-2 variations evaluation and prewarning system (VarEPS), including known and virtual mutations of SARS-CoV-2 genomes, to achieve rapid evaluation of the risks posed by mutant strains. From the perspective of genomics and structural biology, the database comprehensively analyzes...

10.1093/nar/gkab921 article EN Nucleic Acids Research 2021-09-30

A preprint is a version of a scientific paper that is publicly distributed preceding formal peer review. Since the launch of arXiv in 1991, preprints have been increasingly distributed over the Internet as opposed to paper copies. It allows open online access to disseminate the original research within a few days, often at a very low operating cost. This work overviews how preprint has been evolving and impacting the research community over the past thirty years alongside the growth of the Web. In this work, we first report that the number of preprints has exponentially increased 63 times in 30...

10.48550/arxiv.2102.09066 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Here, we present the manually curated Global Catalogue of Pathogens (gcPathogen), an extensive genomic resource designed to facilitate rapid and accurate pathogen analysis, epidemiological exploration, and monitoring of antibiotic resistance features and virulence factors. The catalogue seamlessly integrates and analyzes genomic data and associated metadata for human pathogens isolated from infected patients, animal hosts, food, and the environment. The pathogen list is supported by evidence from medical or government pathogenic lists...

10.1093/nar/gkad875 article EN cc-by Nucleic Acids Research 2023-10-18

Progress in science has advanced the development of human society across history, with dramatic revolutions shaped by information theory, genetic cloning, and artificial intelligence, among the many scientific achievements produced in the 20th century. However, the way that science advances itself is much less well-understood. In this work, we study the evolution of scientific development over the past century by presenting an anatomy of 89 million digitalized papers published between 1900 and 2015. We find that science has benefited from the shift from individual work to...

10.48550/arxiv.1704.05150 preprint EN other-oa arXiv (Cornell University) 2017-01-01