- Topic Modeling
- Natural Language Processing Techniques
- Software Engineering Research
- Advanced Malware Detection Techniques
- Machine Learning in Materials Science
- Multimodal Machine Learning Applications
- Green IT and Sustainability
- Software Testing and Debugging Techniques
- Web Data Mining and Analysis
- Manufacturing Process and Optimization
- Caching and Content Delivery
- Information Retrieval and Search Behavior
- Autonomous Vehicle Technology and Safety
- Traffic control and management
- Sentiment Analysis and Opinion Mining
- Advanced Image and Video Retrieval Techniques
- Reinforcement Learning in Robotics
- Model-Driven Software Engineering Techniques
- Sports Analytics and Performance
- Optimization and Search Problems
- Vehicle Dynamics and Control Systems
- Personal Information Management and User Behavior
- Open Source Software Innovations
- Speech Recognition and Synthesis
- Domain Adaptation and Few-Shot Learning
University of California, Los Angeles
2018-2024
California Institute of Technology
2024
Hunan University
2022-2024
Peking University
2017-2020
Stock trend prediction plays a critical role in seeking maximized profit from stock investment. However, precise trend prediction is very difficult because of the highly volatile and non-stationary nature of the stock market. Exploding information on the Internet, together with the advancing development of natural language processing and text mining techniques, has enabled investors to unveil market trends and volatility from online content. Unfortunately, the quality, trustworthiness, and comprehensiveness of online content related to the stock market vary drastically, and a large portion...
Recently, a number of algorithms under the theme of 'unbiased learning-to-rank' have been proposed, which can reduce position bias, the major type of bias in click data, and train a high-performance ranker with click data. Most existing algorithms, based on the inverse propensity weighting (IPW) principle, first estimate the click bias at each position, and then train an unbiased ranker with the estimated biases using a learning-to-rank algorithm. However, there has not been a method for pairwise learning-to-rank that can simultaneously conduct debiasing of click data and training of a ranker using a pairwise loss function....
Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages, e.g., more English texts are labeled than texts in any other language, which creates considerable inequality in the quality of related information services received by users speaking different languages. To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language that has abundant labeled examples (i.e., the source language, usually...
Existing approaches for learning word embeddings often assume there are sufficient occurrences of each word in the corpus, such that the representation of words can be accurately estimated from their contexts. However, in real-world scenarios, out-of-vocabulary (a.k.a. OOV) words that do not appear in the training corpus emerge frequently. How to learn accurate representations of these words to augment a pre-trained embedding from only a few observations is a challenging research problem. In this paper, we formulate the learning of OOV embeddings as a few-shot regression problem...
Collaborative Filtering (CF), as one of the most popular approaches, is widely employed in recommender systems but suffers from the cold-start problem, where interactions are very limited for new users in the system. To deal with this issue, previous work has largely focused on utilizing various auxiliary information, such as user profiles and social relationships, to infer user preferences. However, such auxiliary information is not always available due to reasons such as privacy concerns, so CF approaches have to count on the interactions. Moreover,...
Answering open-domain questions requires world knowledge about in-context entities. As pre-trained Language Models (LMs) lack the power to store all required knowledge, external knowledge sources, such as knowledge graphs, are often used to augment LMs. In this work, we propose knOwledge REasOning empowered Language Model (OREO-LM), which consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact with a differentiable Knowledge Graph Reasoning module collaboratively. In this way,...
In this paper, we propose an autonomous information seeking visual question answering framework, AVIS. Our method leverages a Large Language Model (LLM) to dynamically strategize the utilization of external tools and to investigate their outputs, thereby acquiring the indispensable knowledge needed to provide answers to the posed questions. Responding to visual questions that necessitate external knowledge, such as "What event is commemorated by the building depicted in this image?", is a complex task. This task presents a combinatorial...
Language agents have become a promising solution to complex interactive tasks. One of the key ingredients in the success of language agents is the reward model on the trajectory of the agentic workflow, which provides valuable guidance during training or inference. However, due to the lack of annotations of intermediate interactions, most existing works use an outcome reward model to optimize policies across entire trajectories. This may lead to sub-optimal policies and hinder overall performance. To address this, we propose QLASS (Q-guided Language Agent Stepwise...
This paper presents DataSciBench, a comprehensive benchmark for evaluating Large Language Model (LLM) capabilities in data science. Recent related benchmarks have primarily focused on single tasks, easily obtainable ground truth, and straightforward evaluation metrics, which limits the scope of tasks that can be evaluated. In contrast, DataSciBench is constructed based on a more comprehensive and curated collection of natural and challenging prompts with uncertain ground truth and evaluation metrics. We develop a semi-automated pipeline for generating...
We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form on note characteristics, such as note density or chord progression, many of which are non-differentiable, which poses a challenge when using them for guided diffusion. We propose Stochastic Control Guidance (SCG), a novel guidance method that only requires forward evaluation of rule functions and can work with pre-trained diffusion models in a plug-and-play way, thus achieving...
Automated-test-generation tools generate test cases to enable dynamic analysis of Android apps, such as functional testing. These tools build a GUI model to describe the app states during the app execution, and generate a script that performs actions on UI widgets to form a test case. However, when the test cases are re-executed, the apps under analysis often do not behave consistently. The major reasons for such limited reproducibility are (1) backend-service dependencies, which cause non-determinism in app behaviors, and (2) the severe fragmentation of the Android platform (i.e., the alarming...
Compared to the Web, where each web page has a global URL for external access, a specific 'page' inside a mobile app cannot be easily accessed unless the user performs several steps from the landing page of this app. Recently, the concept of 'deep link' is expected to be a promising solution and has been advocated by major service providers to enable targeting and opening a specific page of an app externally with an accessible uniform resource identifier. In this paper, we present a large-scale empirical study to investigate how deep links are really adopted, over...
In the appstore-centric ecosystem, app developers have an urgent requirement to optimize their release strategy to maximize the user adoption of their apps. To address this problem, we introduce an approach to assisting developers to select the proper release opportunity based on the purpose of the update and the current condition of the app. Before that, we propose the update interval to characterize the release patterns of apps, and find the significance of the updates through empirical analysis. We mined the release-history data of 17,820 apps from 33 categories in Google Play, over a period of 105 days. With...
Designing a desirable and aesthetic manifestation of web graphic user interfaces (GUI) is a challenging task for web developers. After determining a web page's content, developers usually refer to existing pages and adapt the styles from desired pages into the target one. However, it is not only difficult to find appropriate pages to exhibit the target page's content, but also tedious to incorporate styles from different pages harmoniously in the target page. To tackle these two issues, we propose FaceOff, a data-driven automation system that assists the manifestation design of web GUI. FaceOff constructs...
Large Language Models (LLMs) have shown promise in assisting scientific discovery. However, such applications are currently limited by LLMs' deficiencies in understanding intricate scientific concepts, deriving symbolic equations, and solving advanced numerical calculations. To bridge these gaps, we introduce SciGLM, a suite of scientific language models able to conduct college-level scientific reasoning. Central to our approach is a novel self-reflective instruction annotation framework to address the data scarcity challenge in the science...
A spatial-dependent robust control strategy is proposed for the on-ramp merging problem based on the coordination of connected and automated vehicles. In the strategy, the planning stage is weakened while the control stage is strengthened. More specifically, the planning stage mainly forms a virtual platoon containing all the vehicles inside the communication zone. In the control stage, the time-varying parameter uncertainties in the vehicle model are considered. A robust controller with uniform boundedness, uniform ultimate boundedness, and robustness is delicately designed for each vehicle to analytically...
Dynamic planning in multi-agent systems has been explored to improve decision-making in various domains. Professional basketball serves as a compelling example of a dynamic spatio-temporal game, encompassing both concealed strategic policies and decision-making. However, processing the diverse on-court signals and navigating the vast space of potential actions and outcomes makes it difficult for existing approaches to swiftly identify optimal strategies in response to evolving circumstances. In this study, we first...
In the appstore-centric ecosystem, app developers have an urgent requirement to optimize their release strategy to maximize the success opportunity of their apps. To address this problem, we introduce an approach to assisting developers to select the proper release opportunity based on the purpose of the update and the current condition of the app. Before that, we propose the interval of an update to its previous update to characterize release patterns, and find the significance of the updates through empirical analysis. We mined the update-history data of 17,820 apps from 33 categories in Google Play, over a period of 105 days. With 41,028...
Generating test cases through automatic app exploration is very useful for analyzing and testing Android apps. However, test cases generated by current app-exploration tools are not reproducible, i.e., when the generated test case is re-executed, the app cannot reach the same state as the explored one. As a result, app developers are not able to reproduce the failure or crash reported during the exploration, to conduct regression testing after fixing the bug, or to execute the same test in different environments. In this paper, we present DroidWalker, a dynamic-analysis tool to generate...
The mobile application (app) has become the main entrance to access the Internet on handheld devices. Unlike the Web, where each webpage has a global URL to reach directly, a specific "content page" of an app can be opened only by exploring the app with several operations from the landing page. The interoperability between apps is quite fixed and thus limits the value-added "linked data" between apps. Recently, deep link has been proposed to enable targeting and opening a specific page of an app externally with an accessible uniform resource identifier (URI). However,...
For the sake of safety, vehicle path tracking control should not only ensure the stability of the tracking error, containing lateral offset and orientation error, but also guarantee that both transient and steady states are within a specified safe boundary. However, the time-varying uncertainties in the system make the control design a tough task. This paper develops an adaptive robust control (ARC) which guarantees a bounded property for autonomous vehicles. First, to handle the bounded requirement, a barrier function based state transformation converts the constrained system into...