- Generative Adversarial Networks and Image Synthesis
- Semantic Web and Ontologies
- Reinforcement Learning in Robotics
- Web Data Mining and Analysis
- Advanced Vision and Imaging
- Natural Language Processing Techniques
- Video Analysis and Summarization
- Domain Adaptation and Few-Shot Learning
- Topic Modeling
- Stock Market Forecasting Methods
- Image Retrieval and Classification Techniques
- Adversarial Robustness in Machine Learning
- Handwritten Text Recognition Techniques
- Image Processing Techniques and Applications
- Complex Systems and Time Series Analysis
- Time Series Analysis and Forecasting
- Medical Image Segmentation Techniques
- Digital Imaging for Blood Diseases
- Advanced Neural Network Applications
- Machine Learning and Data Classification
- AI in cancer detection
- Digital Media Forensic Detection
- Medical Imaging and Analysis
- Visual and Cognitive Learning Processes
- Advanced Data Storage Technologies
Huawei Technologies (China)
2022-2023
Huawei Technologies (United Kingdom)
2022
Huawei Technologies (France)
2018
Institute of Computing Technology
2014
Chinese Academy of Sciences
2009-2014
Few-shot image generation is a challenging task even when using state-of-the-art Generative Adversarial Networks (GANs). Due to the unstable GAN training process and the limited training data, the generated images are often of low quality and low diversity. In this work, we propose a new "editing-based" method, i.e., Attribute Group Editing (AGE), for few-shot image generation. The basic assumption is that any image is a collection of attributes and the editing direction for a specific attribute is shared across all categories. AGE examines the internal...
Recent Large Vision-Language Models (LVLMs) have shown promising reasoning capabilities on text-rich images from charts, tables, and documents. However, the abundant text within such images may increase the model's sensitivity to language. This raises the need to evaluate LVLM performance on cross-lingual text-rich visual inputs, where the language in the image differs from the language of the instructions. To address this, we introduce XT-VQA (Cross-Lingual Text-Rich Visual Question Answering), a benchmark designed to assess how LVLMs handle...
It is a popular belief that model-based Reinforcement Learning (RL) is more sample efficient than model-free RL, but in practice, this is not always true due to overweighed model errors. In complex and noisy settings, model-based RL tends to have trouble using the model if it does not know when to trust the model. In this work, we find that better model usage can make a huge difference. We show theoretically that if the use of model-generated data is restricted to state-action pairs where the model error is small, the performance gap between model and real rollouts can be reduced. This motivates us...
In the field of quantitative trading, it is common practice to transform raw historical stock data into indicative signals for the market trend. Such signals are called alpha factors. Alphas in formula forms are more interpretable and thus favored by practitioners concerned with risk. In practice, a set of formulaic alphas is often used together for better modeling precision, so we need to find synergistic formulaic alpha sets that work well together. However, most traditional alpha generators mine alphas one by one separately, overlooking the fact that the alphas would be...
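Using Unified Probabilistic Matrix Factorization for Contextual Advertisement Recommendation. DOI: 10.3724/SP.J.1001.2013.04238. Fund project: National Natural Science Foundation of China (60873243). Abstract: Contextual advertising matches ads to user interests and web page content, which can enhance the user experience and raise the ad click-through rate. Because ad revenue is directly tied to the click-through rate, accurately predicting it is the key to increasing contextual advertising revenue. Currently, contextual ad recommendation faces the following problems: (1) the numbers of web pages and users are very large; (2)...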
Constrained Reinforcement Learning (CRL), which pursues maximizing long-term returns while constraining costs, has attracted broad interest in recent years. Although CRL can be cast as a multi-objective optimization problem, it still faces the key challenge that gradient-based Pareto optimization methods tend to stick to known Pareto-optimal solutions even when they yield poor returns (e.g., the safest self-driving car that never moves) or violate the constraints (e.g., the record-breaking racer that crashes the car). In this paper, we propose...
Both text detection and structured data extraction are imperative in an optical character recognition (OCR) processing pipeline. Text detection, especially for indistinct, diverse, and multi-language text regions, is one of the most challenging tasks in computer vision and has attracted increasing attention recently. Moreover, although there are some studies on mining related to structured data extraction, it has not received its deserved attention as one of the important steps in OCR. The previous methods for structural data extraction, including layout template-based,...
In recent years, creative content generations like style transfer and neural photo editing have attracted more and more attention. Among these, cartoonization of real-world scenes has promising applications in the entertainment industry. Different from image translations focusing on improving the style effect of generated images, video cartoonization has additional requirements on temporal consistency. In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an...
In high-dimensional time-series analysis, it is essential to have a set of key factors (namely, the style factors) that explain the change of the observed variable. For example, volatility modeling in finance relies on a set of risk factors, and climate studies in climatology rely on a set of causal factors. The ideal low-dimensional style factors should balance significance (with high explanatory power) and stability (consistent, no significant fluctuations). However, previous supervised and unsupervised feature extraction methods can hardly...
Large-scale data parallel applications such as web indexing and data mining demand plenty of computing and storage resources. As a widely adopted solution, Hadoop partitions and distributes large datasets into chunks across multiple nodes in clusters to process them in parallel. For a single cluster node, there are typically several concurrently running applications, sharing and competing for system resources such as CPU, memory, disk, and network bandwidth. The competition will cause unfairness for some jobs and extend their response time....
Through reading the documentation in the context, tool-using language models can dynamically extend their capability using external tools. The cost is that we have to input the lengthy documentation every time the model needs to use the tool, occupying the input window as well as slowing down the decoding process. Given the progress in general-purpose compression, soft context compression is a suitable approach to alleviate the problem. However, when compressing tool documentation, existing methods suffer from the weaknesses of key information loss...
Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability. However, real function-calling data is quite challenging to collect and annotate, while synthetic data generated by existing pipelines tends to lack coverage and accuracy. In this paper, we present ToolACE, an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. ToolACE leverages a novel self-evolution synthesis...
Few-shot image generation aims to generate data of an unseen category based on only a few samples. Apart from basic content generation, a bunch of downstream applications hopefully benefit from this task, such as low-data detection and few-shot classification. To achieve this goal, the generated images should guarantee category retention for classification beyond visual quality and diversity. In our preliminary work, we present an "editing-based" framework, Attribute Group Editing (AGE), for reliable few-shot image generation, which largely improves...
For capturing colored document images, e.g., posters and magazines, it is common that multiple degradations such as shadows, wrinkles, etc., are simultaneously introduced due to external factors. Restoring multi-degraded colored document images is a great challenge, yet overlooked, as most existing algorithms focus on enhancing color-ignored document images via binarization. Thus, we propose DocStormer, a novel algorithm designed to restore multi-degraded colored documents to their potential pristine PDF. The contributions are: firstly, "Perceive-then-Restore"...
This paper addresses the problem of ranking Internet service quality by taking a machine learning approach using multiple features. This helps find good services for applications that use services as building blocks. Unlike other ranking problems, the goodness of service qualities is dependent upon key features. The key features vary across different service categories and have unequally discriminative natures. The paper divides the problem into four subtasks, including categorizing services according to functionalities, identifying key features that determine quality, denoising feature...
Machine learning based medical image analysis highly depends on datasets. Biases in the dataset can be learned by the model and degrade the generalizability of the applications. There are studies on debiased models. However, scientists and practitioners find it difficult to identify implicit biases in the datasets, which causes a lack of reliable unbiased test datasets to validate models. To tackle this issue, we first define the data intrinsic bias attribute, and then propose a novel bias identification framework for medical image datasets. The framework contains two major components,...