Enhancing Generic Reaction Yield Prediction through Reaction Condition-Based Contrastive Learning
Benchmark (surveying)
DOI:
10.34133/research.0292
Publication Date:
2023-12-12T13:17:44Z
AUTHORS (12)
ABSTRACT
Deep learning (DL)-driven efficient synthesis planning may profoundly transform the paradigm for designing novel pharmaceuticals and materials. However, the progress of many DL-assisted synthesis planning (DASP) algorithms has suffered from the lack of reliable automated pathway evaluation tools. As a critical metric for evaluating chemical reactions, accurate prediction of reaction yields helps improve the practicality of DASP algorithms in real-world scenarios. Currently, accurately predicting the yields of interesting reactions still faces numerous challenges, mainly the absence of high-quality generic reaction yield datasets and of robust generic yield predictors. To compensate for the limitations of high-throughput yield datasets, we curated a generic reaction yield dataset containing 12 reaction categories and rich reaction condition information. Subsequently, by utilizing 2 pretraining tasks based on masked language modeling and contrastive learning, we proposed a powerful bidirectional encoder representations from transformers (BERT)-based reaction yield predictor named Egret. It achieved comparable or even superior performance to the best previous models on 4 benchmark datasets and established state-of-the-art performance on the newly curated dataset. We found that reaction-condition-based contrastive learning enhances the model's sensitivity to reaction conditions, and that Egret is capable of capturing subtle differences between reactions involving identical reactants and products but different reaction conditions. Furthermore, we proposed a new scoring function that incorporated Egret into the evaluation of multistep synthesis routes. Test results showed that yield-incorporated scoring facilitated the prioritization of literature-supported high-yield reaction pathways for target molecules. In addition, through a meta-learning strategy, we further improved the reliability of the model's predictions for reaction types with limited data and lower data quality. Our results suggest that Egret holds the potential to become an essential component of next-generation DASP tools.
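The abstract does not give the form of the yield-incorporated scoring function, so the following is only an illustrative sketch of how predicted yields could be used to rank multistep routes. It assumes a linear route whose overall yield is the product of its per-step fractional yields. The names `route_yield`, `rank_routes`, and `toy_predictor` are hypothetical, and the lookup table stands in for a trained yield predictor such as Egret; none of this is taken from the paper.

```python
from math import prod
from typing import Callable, List, Sequence

# A reaction step is represented here as a plain string label (assumption);
# a real pipeline would more likely pass reaction SMILES plus conditions.
Step = str
YieldPredictor = Callable[[Step], float]


def route_yield(steps: Sequence[Step], predict_yield: YieldPredictor) -> float:
    """Overall fractional yield of a linear route: product of step yields."""
    return prod(predict_yield(step) for step in steps)


def rank_routes(
    routes: Sequence[Sequence[Step]], predict_yield: YieldPredictor
) -> List[Sequence[Step]]:
    """Sort candidate routes from highest to lowest predicted overall yield."""
    return sorted(
        routes, key=lambda route: route_yield(route, predict_yield), reverse=True
    )


# Toy stand-in for a learned predictor: hypothetical fractional yields.
_TOY_YIELDS = {"A1": 0.90, "A2": 0.80, "B1": 0.95, "B2": 0.50}


def toy_predictor(step: Step) -> float:
    """Look up a made-up yield for a step label (illustration only)."""
    return _TOY_YIELDS[step]


route_a = ["A1", "A2"]  # predicted overall yield about 0.72
route_b = ["B1", "B2"]  # predicted overall yield about 0.475
ranked = rank_routes([route_b, route_a], toy_predictor)
```

In this toy case `route_a` ranks first: although `route_b` contains the single best step, its low-yield second step dominates the product. A practical scoring function would likely combine such a yield term with other route-quality criteria.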