- Topic Modeling
- Natural Language Processing Techniques
- Recommender Systems and Techniques
- Speech and dialogue systems
- Multimodal Machine Learning Applications
- Gold and Silver Nanoparticles Synthesis and Applications
- Advanced Image and Video Retrieval Techniques
- Advanced Text Analysis Techniques
- Image Retrieval and Classification Techniques
- Advanced Computational Techniques and Applications
- Data Management and Algorithms
- Domain Adaptation and Few-Shot Learning
- Spectroscopy Techniques in Biomedical and Chemical Research
- Electrochemical sensors and biosensors
- Web Data Mining and Analysis
- Text Readability and Simplification
- Higher Education and Teaching Methods
- Neural Networks and Applications
- Text and Document Classification Technologies
- Advanced MEMS and NEMS Technologies
- Semantic Web and Ontologies
- Carbon Nanotubes in Composites
- Thermal properties of materials
- Speech Recognition and Synthesis
- Plasmonic and Surface Plasmon Research
Yunnan University of Finance And Economics
2023-2025
Central South University
2021-2025
State Key Laboratory of Powder Metallurgy
2025
Inner Mongolia Electric Power (China)
2023-2024
University of California, Santa Cruz
2014-2024
Collaborative Innovation Center of Advanced Microstructures
2012-2024
Nanjing University
2012-2024
Chinese Academy of Sciences
2008-2024
Nanjing Audit University
2019-2024
Wuhan University of Technology
2011-2024
Collaborative Filtering (CF)-based recommendation algorithms, such as Latent Factor Models (LFM), work well in terms of prediction accuracy. However, the latent features make it difficult to explain the recommendation results to users. Fortunately, with the continuous growth of online user reviews, the information available for training a recommender system is no longer limited to just numerical star ratings or user/item features. By extracting explicit user opinions about various aspects of a product from the reviews, it is possible to learn more details...
This paper introduces a new technique for finding latent software bugs called change classification. Change classification uses a machine learning classifier to determine whether a new software change is more similar to prior buggy changes or to clean changes. In this manner, change classification predicts the existence of bugs in software changes. The classifier is trained using features (in the machine learning sense) extracted from the revision history of a software project, as stored in its software configuration management repository. The trained classifier can classify changes with 78% accuracy and 65% recall (on average). Change classification has several desirable...
For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages. This shared task combines the shared tasks of the previous five years under a unique dependency-based formalism similar to the 2008 task. In this paper, we define the shared task, describe how the data sets were created and show their...
A personalized conversational sales agent could have much commercial potential. E-commerce companies such as Amazon, eBay, JD, and Alibaba are piloting this kind of agent with their users. However, the research on this topic is very limited, and existing solutions are based either on a single-round ad hoc search engine or on a traditional multi-round dialog system. They usually utilize only the user inputs in the current session, ignoring users' long-term preferences. On the other hand, it is well known that the sales conversion rate can be...
Recent indentation experiments indicate that wurtzite BN (w-BN) exhibits surprisingly high hardness that rivals that of diamond. Here we unveil a novel two-stage shear deformation mechanism responsible for this unexpected result. We show by first-principles calculations that large normal compressive pressures under indenters can compel w-BN into a stronger structure through a volume-conserving bond-flipping structural phase transformation during indentation, which produces significant enhancement in its strength,...
Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Dan Flickinger, Jan Hajič, Angelina Ivanova, Yi Zhang. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). 2014.
Ranking is a core task in recommender systems, which aims at providing an ordered list of items to users. Typically, a ranking function is learned from the labeled dataset to optimize the global performance, and it produces a ranking score for each individual item. However, it may be sub-optimal because the scoring function applies to each item individually and does not explicitly consider the mutual influence between items, as well as the differences in users' preferences or intents. Therefore, we propose a personalized re-ranking model for recommender systems. The...
We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy of 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before finetuning...
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data. The model is also further...
Sentiment classification refers to the task of automatically identifying whether a given piece of text expresses a positive or negative opinion towards the subject at hand. The proliferation of user-generated web content such as blogs, discussion forums and online review sites has made it possible to perform large-scale mining of public opinion. Sentiment modeling is thus becoming a critical component of market intelligence and social media technologies that aim to tap into the collective wisdom of crowds. In this paper, we consider...
A content-based personalized recommendation system learns user-specific profiles from user feedback so that it can deliver information tailored to each individual user's interest. A system serving millions of users can learn a better user profile for a new user, or a user with little feedback, by borrowing information from other users through the use of a Bayesian hierarchical model. Learning the model parameters to optimize the joint data likelihood is very computationally expensive. The commonly used EM algorithm converges slowly due to the sparseness of the data in IR...
Faceted search is becoming a popular method to allow users to interactively search and navigate complex information spaces. A faceted search system presents users with key-value metadata that is used for query refinement. While popular in e-commerce and digital libraries, not much research has been conducted on which metadata to present to a user in order to improve the search experience. Nor are there repeatable benchmarks for evaluating a faceted search engine. This paper proposes the use of collaborative filtering and personalization to customize the search interface to each user's behavior. This paper also...