Fengze Dai

ORCID: 0000-0002-8539-0356
Research Areas
  • Surface Treatment and Residual Stress
  • Erosion and Abrasive Machining
  • High-Velocity Impact and Material Behavior
  • High-Temperature Coating Behaviors
  • Laser Material Processing Techniques
  • Metal and Thin Film Mechanics
  • Diamond and Carbon-based Materials Research
  • Tendon Structure and Treatment
  • Glass properties and applications
  • Advanced ceramic materials synthesis
  • High Entropy Alloys Studies
  • Tribology and Lubrication Engineering
  • Hydrogen embrittlement and corrosion behaviors in metals
  • Laser-induced spectroscopy and plasma
  • Nuclear Materials and Properties
  • Metal Alloys Wear and Properties
  • Advanced materials and composites
  • Nuclear materials and radiation effects
  • High voltage insulation and dielectric phenomena
  • Healthcare and Venom Research
  • Advanced Welding Techniques Analysis
  • Surface Modification and Superhydrophobicity
  • Catalytic Processes in Materials Science
  • Additive Manufacturing Materials and Processes
  • Electrospun Nanofibers in Biomedical Applications

Jiangsu University
2015-2024

Wenzhou University
2022-2023

Shanghai Electric (China)
2022

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates...

10.48550/arxiv.2501.12948 preprint EN arXiv (Cornell University) 2025-01-22
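
The RL recipe behind DeepSeek-R1-Zero is Group Relative Policy Optimization (GRPO), which samples a group of responses per prompt, scores them with a rule-based reward, and normalizes each reward against its own group's mean and standard deviation instead of training a separate value model. Below is a minimal sketch of that group-relative advantage computation, assuming NumPy; the function name and the toy verifier scores are illustrative, not code from the paper.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each sampled response's reward
    against the mean and std of its own group (all samples for one prompt).
    `rewards` has shape (num_prompts, group_size)."""
    rewards = np.asarray(rewards, dtype=np.float64)
    mean = rewards.mean(axis=1, keepdims=True)
    std = rewards.std(axis=1, keepdims=True)
    return (rewards - mean) / (std + eps)

# Toy example: one prompt, four sampled answers scored by a rule-based
# verifier (1.0 if the final answer checks out, 0.0 otherwise).
print(group_relative_advantages([[1.0, 0.0, 0.0, 1.0]]))
# -> [[ 1. -1. -1.  1.]]: correct answers are reinforced, incorrect penalized
```

Because the baseline comes from the group itself, the policy gradient needs no learned critic, which is part of what makes the large-scale RL-without-SFT setup tractable.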

DeepSeek-V3 Technical Report

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts the Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse...

10.48550/arxiv.2412.19437 preprint EN arXiv (Cornell University) 2024-12-26
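
The auxiliary-loss-free load balancing works by adding a per-expert bias to the router's affinity scores before top-k expert selection, then nudging that bias after each batch: down for overloaded experts, up for underloaded ones. The bias steers only which experts are selected; in the paper, the gating weights are still derived from the raw affinity scores. Here is a toy sketch of that idea, assuming NumPy; the function names, the step size gamma, and the fixed-score demo loop are illustrative, not the paper's implementation.

```python
import numpy as np

def route_tokens(scores, bias, k=2):
    """Pick top-k experts per token. The bias shifts *selection* only;
    gating weights would still come from the raw affinity scores."""
    biased = scores + bias                      # bias steers who gets picked
    return np.argsort(-biased, axis=1)[:, :k]   # indices of chosen experts

def update_bias(bias, topk, num_experts, gamma=1e-3):
    """Auxiliary-loss-free balancing: lower the bias of overloaded experts
    and raise it for underloaded ones after each batch."""
    load = np.bincount(topk.ravel(), minlength=num_experts)
    return bias - gamma * np.sign(load - load.mean())

# Toy demo: 8 tokens, 4 experts, sigmoid affinities as in DeepSeek-V3.
rng = np.random.default_rng(0)
scores = 1.0 / (1.0 + np.exp(-rng.normal(size=(8, 4))))
bias = np.zeros(4)
for _ in range(100):
    topk = route_tokens(scores, bias)
    bias = update_bias(bias, topk, num_experts=4)
print("expert load:", np.bincount(topk.ravel(), minlength=4))
```

The appeal of this design is that it balances expert load without an auxiliary loss term, so the balancing pressure never competes with the language-modeling objective in the gradient.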