NFDI4DS | UHH-SEMS - Publication Details

Attribute Group Editing for Reliable Few-shot Image Generation

OPENALEX - Publications

Guanqi Ding Xinzhe Han Shuhui Wang Shuzhe Wu Xin Jin and 2 more

Few-shot image generation is a challenging task even using the state-of-the-art Generative Adversarial Networks (GANs). Due to the unstable GAN training process and the limited training data, the generated images are often of low quality and low diversity. In this work, we propose a new "editing-based" method, i.e., Attribute Group Editing (AGE), for few-shot image generation. The basic assumption is that any image is a collection of attributes and the editing direction for a specific attribute is shared across all categories. AGE examines the internal...

10.1109/cvpr52688.2022.01091 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective

Xinmiao Yu Xiaocheng Feng Yun Li Minghui Liao Yaqi Yu and 8 more

Recent Large Vision-Language Models (LVLMs) have shown promising reasoning capabilities on text-rich images from charts, tables, and documents. However, the abundant text within such images may increase the model's sensitivity to language. This raises the need to evaluate LVLM performance on cross-lingual text-rich visual inputs, where the language in the image differs from the language of the instructions. To address this, we introduce XT-VQA (Cross-Lingual Text-Rich Visual Question Answering), a benchmark designed to assess how LVLMs handle...

10.1609/aaai.v39i9.33049 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Trust the Model When It Is Confident: Masked Model-based Actor-Critic

Feiyang Pan Jia He Dandan Tu Qing He

It is a popular belief that model-based Reinforcement Learning (RL) is more sample efficient than model-free RL, but in practice, it is not always true due to overweighed model errors. In complex and noisy settings, model-based RL tends to have trouble using the model if it does not know when to trust the model. In this work, we find that better model usage can make a huge difference. We show theoretically that if the use of model-generated data is restricted to state-action pairs where the model error is small, the performance gap between model and real rollouts can be reduced. It motivates us...

10.48550/arxiv.2010.04893 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning

Shuo Yu Hongyan Xue Xiang Ao Feiyang Pan Jia He and 2 more

In the field of quantitative trading, it is common practice to transform raw historical stock data into indicative signals for the market trend. Such signals are called alpha factors. Alphas in formula forms are more interpretable and thus favored by practitioners concerned with risk. In practice, a set of formulaic alphas is often used together for better modeling precision, so we need to find synergistic formulaic alpha sets that work well together. However, most traditional alpha generators mine alphas one by one separately, overlooking the fact that the alphas would be...

10.1145/3580305.3599831 article EN cc-by Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Using Unified Probabilistic Matrix Factorization for Contextual Advertisement Recommendation

Dandan Tu Chengchun Shu Hou‐Yong Yu

Supported by the National Natural Science Foundation of China (60873243). Contextual advertising matches user interests and web page content, which can improve the user experience and raise the ad click-through rate. Because advertising revenue is directly tied to the click-through rate, accurately predicting it is the key to increasing contextual advertising revenue. At present, contextual advertisement recommendation faces the following problems: (1) the numbers of web pages and users are very large; (2)...

10.3724/sp.j.1001.2013.04238 article EN Journal of Software 2014-01-14

Gradient-Adaptive Pareto Optimization for Constrained Reinforcement Learning

Zixian Zhou Mengda Huang Feiyang Pan Jia He Xiang Ao and 2 more

Constrained Reinforcement Learning (CRL) has attracted broad interest in recent years; it pursues maximizing long-term returns while constraining costs. Although CRL can be cast as a multi-objective optimization problem, it still faces the key challenge that gradient-based Pareto optimization methods tend to stick to known Pareto-optimal solutions even when they yield poor returns (e.g., the safest self-driving car that never moves) or violate the constraints (e.g., the record-breaking racer that crashes the car). In this paper, we propose...

10.1609/aaai.v37i9.26353 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

Concise and Precise Context Compression for Tool-Using Language Models

Yang Xu Yunlong Feng Honglin Mu Yutai Hou Yitong Li and 7 more

10.18653/v1/2024.findings-acl.974 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

A unified scheme of text localization and structured data extraction for joint OCR and data mining

Yibin Ye Shenggao Zhu Jing Wang Qi Du Yezhang Yang and 3 more

Both text detection and structured data extraction are imperative in an optical character recognition (OCR) processing pipeline. Text detection, especially for indistinct, diverse, multi-language text regions, is one of the most challenging tasks in computer vision and has attracted increasing attention recently. Moreover, although there are some studies on data mining related to structured data extraction, it has not received its deserved attention as one of the important steps of OCR. The previous methods for structural data extraction, including layout template-based,...

10.1109/bigdata.2018.8622129 article EN 2018 IEEE International Conference on Big Data (Big Data) 2018-12-01

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency

Zhenhuan Liu Liang Li Huajie Jiang Xin Jin Dandan Tu and 2 more

In recent years, creative content generations like style transfer and neural photo editing have attracted more and more attention. Among these, cartoonization of real-world scenes has promising applications in the entertainment industry. Different from image translations focusing on improving the style effect of generated images, video cartoonization has additional requirements on temporal consistency. In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an...

10.1609/aaai.v36i2.20078 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Style Miner: Find Significant and Stable Explanatory Factors in Time Series with Constrained Reinforcement Learning

Dapeng Li Feiyang Pan Jia He Zhiwei Xu Dandan Tu and 1 more

In high-dimensional time-series analysis, it is essential to have a set of key factors (namely, the style factors) that explain the change of the observed variable. For example, volatility modeling in finance relies on a set of risk factors, and climate change studies in climatology rely on a set of causal factors. The ideal low-dimensional style factors should balance significance (with high explanatory power) and stability (consistent, no significant fluctuations). However, previous supervised and unsupervised feature extraction methods can hardly...

10.48550/arxiv.2303.11716 preprint EN other-oa arXiv (Cornell University) 2023-01-01

XConveyer: Guarantee Hadoop Throughput via Lightweight OS-Level Virtualization

An Qin Dandan Tu Chengchun Shu Chang Gao

Large-scale data parallel applications such as web indexing and data mining demand plenty of computing and storage resources. As a widely adopted solution, Hadoop partitions and distributes large datasets into chunks across multiple nodes in clusters to process them in parallel. On a single cluster node, there are typically several concurrently running applications, sharing and competing for system resources such as CPU, memory, disk, and network bandwidth. The competition will cause unfairness for some jobs and extend their response time....

10.1109/gcc.2009.62 article EN 2009-08-01

Concise and Precise Context Compression for Tool-Using Language Models

Yang Xu Yunlong Feng Honglin Mu Yutai Hou Yitong Li and 7 more

Through reading the documentation in the context, tool-using language models can dynamically extend their capability using external tools. The cost is that we have to input lengthy documentation every time the model needs to use the tool, occupying the input window as well as slowing down the decoding process. Given the progress in general-purpose compression, soft context compression is a suitable approach to alleviate the problem. However, when compressing tool documentation, existing methods suffer from the weaknesses of key information loss...

10.48550/arxiv.2407.02043 preprint EN arXiv (Cornell University) 2024-07-02

ToolACE: Winning the Points of LLM Function Calling

Weiwen Liu Xu Huang Xingshan Zeng Xinlong Hao Yuanman Hu and 22 more

Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability. However, real function-calling data is quite challenging to collect and annotate, while synthetic data generated by existing pipelines tends to lack coverage and accuracy. In this paper, we present ToolACE, an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. ToolACE leverages a novel self-evolution synthesis...

10.48550/arxiv.2409.00920 preprint EN arXiv (Cornell University) 2024-09-01

Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective

Xinmiao Yu Xiaocheng Feng Yun Li Minghui Liao Yaqi Yu and 8 more

Recent Large Vision-Language Models (LVLMs) have shown promising reasoning capabilities on text-rich images from charts, tables, and documents. However, the abundant text within such images may increase the model's sensitivity to language. This raises the need to evaluate LVLM performance on cross-lingual text-rich visual inputs, where the language in the image differs from the language of the instructions. To address this, we introduce XT-VQA (Cross-Lingual Text-Rich Visual Question Answering), a benchmark designed to assess how LVLMs handle...

10.48550/arxiv.2412.17787 preprint EN arXiv (Cornell University) 2024-12-23

Stable Attribute Group Editing for Reliable Few-shot Image Generation

Guanqi Ding Xinzhe Han Shuhui Wang Xin Jin Dandan Tu and 1 more

Few-shot image generation aims to generate data of an unseen category based on only a few samples. Apart from basic content generation, a bunch of downstream applications hopefully benefit from this task, such as low-data detection and few-shot classification. To achieve this goal, the generated images should guarantee category retention for classification beyond the visual quality and diversity. In our preliminary work, we present an "editing-based" framework, Attribute Group Editing (AGE), for reliable few-shot image generation, which largely improves...

10.48550/arxiv.2302.00179 preprint EN other-oa arXiv (Cornell University) 2023-01-01

DocStormer: Revitalizing Multi-Degraded Colored Document Images to Pristine PDF

Chaowei Liu Jichun Li Yihua Teng Chaoqun Wang Nuo Xu and 2 more

When capturing colored document images, e.g. posters and magazines, it is common that multiple degradations such as shadows, wrinkles, etc., are simultaneously introduced due to external factors. Restoring multi-degraded colored document images is a great challenge, yet overlooked, as most existing algorithms focus on enhancing color-ignored document images via binarization. Thus, we propose DocStormer, a novel algorithm designed to restore multi-degraded colored documents to their potential pristine PDF. The contributions are: firstly, "Perceive-then-Restore"...

10.48550/arxiv.2310.17910 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

Rank Internet Service Quality Using Multiple Features: A Machine Learning Approach

Dandan Tu Chengchun Shu Jingwei Shi Tao Zhu Shuang Wang and 1 more

This paper addresses the problem of ranking Internet service quality by taking a machine learning approach using multiple features. This helps find good services for applications that use Internet services as building blocks. Unlike other ranking problems, the goodness of service quality is dependent upon key features. The features vary across different service categories and have unequally discriminative natures. This paper divides the problem into four subtasks including categorizing services according to functionalities, identifying the key features that determine quality, denoising feature...

10.1109/skg.2010.27 article EN 2010-11-01

Intrinsic Bias Identification on Medical Image Datasets

Shijie Zhang Lanjun Wang Lian Ding An‐An Liu Senhua Zhu and 1 more

Machine learning based medical image analysis highly depends on datasets. Biases in the dataset can be learned by the model and degrade the generalizability of the applications. There are studies on debiased models. However, it is difficult for scientists and practitioners to identify implicit biases in the datasets, which causes a lack of reliable unbiased test datasets to validate models. To tackle this issue, we first define the data intrinsic bias attribute, and then propose a novel bias identification framework for medical image datasets. The framework contains two major components,...

10.48550/arxiv.2203.12872 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency

Zhenhuan Liu Liang Li Huajie Jiang Xin Jin Dandan Tu and 2 more

In recent years, creative content generations like style transfer and neural photo editing have attracted more and more attention. Among these, cartoonization of real-world scenes has promising applications in the entertainment industry. Different from image translations focusing on improving the style effect of generated images, video cartoonization has additional requirements on temporal consistency. In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an...

10.48550/arxiv.2204.00795 preprint EN cc-by arXiv (Cornell University) 2022-01-01