- Multimodal Machine Learning Applications
- Topic Modeling
- Natural Language Processing Techniques
- Text and Document Classification Technologies
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Blockchain Technology Applications and Security
- Caching and Content Delivery
- Distributed systems and fault tolerance
- Video Surveillance and Tracking Methods
- Video Analysis and Summarization
- Image Retrieval and Classification Techniques
- Gait Recognition and Analysis
- Indoor and Outdoor Localization Technologies
- Software Testing and Debugging Techniques
- Spam and Phishing Detection
- Marine animal studies overview
- Chaos-based Image/Signal Encryption
- Cellular Automata and Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Steganography and Watermarking Techniques
- Formal Methods in Verification
- Human Mobility and Location-Based Analysis
- Animal Vocal Communication and Behavior
- Rough Sets and Fuzzy Logic
Tsinghua University
2013-2023
National Engineering Research Center for Information Technology in Agriculture
2021-2023
Shanghai Jiao Tong University
2022-2023
Shanghai Municipal Education Commission
2023
Shanghai Maritime University
2023
Changzhou University
2023
Foshan University
2022
Hefei University of Technology
2021
Jinan University
2019
Institute of Agricultural Machinery
2009
Trajectory prediction is confronted with the dilemma of capturing the multi-modal nature of future dynamics with both diversity and accuracy. In this paper, we present a distribution discrimination (DisDis) method to predict personalized motion patterns by distinguishing the potential distributions. Motivated by the observation that the motion pattern of each person is personalized due to his/her habit, our DisDis learns the latent distribution to represent different motion patterns and optimizes it by contrastive discrimination. This distribution discrimination encourages latent distributions to be more discriminative. Our method can be integrated...
Multimodal pre-training with text, layout, and image has made significant progress for Visually Rich Document Understanding (VRDU), especially for fixed-layout documents such as scanned document images. However, there are still a large number of digital documents where the layout information is not fixed and needs to be interactively and dynamically rendered for visualization, making existing layout-based pre-training approaches not easy to apply. In this paper, we propose MarkupLM for document understanding tasks with markup languages as the backbone,...
Open-Domain Question Answering (ODQA) aims to answer questions without explicitly providing specific background documents. This task becomes notably challenging in a zero-shot setting where no data is available to train tailored retrieval-reader models. While recent Large Language Models (LLMs) like GPT-3 have demonstrated their effectiveness in zero-shot ODQA using direct prompting methods, these methods still fall short of fully harnessing the potential of LLMs when implicitly invoked. In this paper, we...
Learning a generalizable and comprehensive similarity metric to depict the semantic discrepancies between images is the foundation of many computer vision tasks. While existing methods approach this goal by learning an ensemble of embeddings with diverse objectives, the backbone network still receives a mix of all the training signals. Differently, we propose a deep factorized metric learning (DFML) method to factorize the training signal and employ different samples to train various components of the backbone network. We factorize the network into different sub-blocks and devise a learnable router to adaptively...
Applying a sharding protocol to address scalability challenges in alliance chains is popular. However, inevitable cross-shard transactions significantly hamper performance even at low ratios, negating the benefits of sharding when they dominate as the shard scale grows. This article proposes a new sharding protocol suitable for alliance chains that reduces the impact of cross-shard transactions, improving system performance. It adopts a directed acyclic graph ledger, enabling parallel transaction processing, and employs a dynamic transaction confirmation consensus for simplicity. The protocol's...
In open-domain question-answering (ODQA), most existing questions require single-hop reasoning on commonsense. To further extend this task, we officially introduce open-domain multi-hop reasoning (ODMR) by answering multi-hop questions with explicit reasoning steps in an open-domain setting. Recently, large language models (LLMs) have found significant utility in facilitating ODQA without an external corpus. Furthermore, chain-of-thought (CoT) prompting boosts the reasoning capability of LLMs to a greater extent with manual or automated paradigms. However, existing automated methods lack quality...
Alliance chain has gained widespread popularity in industrial and commercial fields due to its multi-centralization and node manageability. Current implementations of the alliance chain suffer from scalability obstacles, such as communication congestion and throughput drop, when the number of nodes increases. In this paper, a novel dynamic transaction confirmation sharding protocol is proposed, which improves transaction processing efficiency by partitioning and assigning different transactions to different shards. It utilizes a consensus...
With increasing size and complexity, more and more embedded systems are built from interconnected components. Testing is necessary to ensure the compatibility of the composite components and the correctness of the integrated system. Interface automata (IAs) provide a light-weight formal method for modelling the external observable behaviour of components and their compositions. The paper presents a systematic and automatic testing method based on an extended IA (EIA) model. EIA enriches the modelling capability of IA by adding data constraints to the basic automata. Based...
In recent years, significant progress has been made in video instance segmentation (VIS), with many offline and online methods achieving state-of-the-art performance. While offline methods have the advantage of producing temporally consistent predictions, they are not suitable for real-time scenarios. Conversely, online methods are more practical, but maintaining temporal consistency remains a challenging task. In this paper, we propose a novel online method for video instance segmentation, called TCOVIS, which fully exploits the temporal information in a video clip. The core of our...
Large Language Models (LLMs) are known to have limited extrapolation ability beyond their pre-trained context window, constraining their application in downstream tasks with lengthy inputs. Recent studies have sought to extend LLMs' context window by modifying rotary position embedding (RoPE), a popular position encoding method adopted by well-known LLMs such as LLaMA, PaLM, and GPT-NeoX. However, prior works like Position Interpolation (PI) and YaRN are resource-intensive and lack comparative experiments to assess their applicability. In this...
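For readers unfamiliar with the two techniques named in this abstract, the following is a minimal sketch of RoPE and of Position Interpolation as it is commonly described: position indices are scaled down so that an extended window maps back into the pre-trained range. It is not the method proposed in the paper, whose details are truncated here. The function names, the window lengths, the base of 10000, and the consecutive-pair rotation layout are all assumptions made for illustration.

```python
import math


def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Return the dim/2 rotation angles RoPE uses for one position.

    scale < 1 models Position Interpolation: the position index is
    compressed before the angles are computed. (Hypothetical helper.)
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    scaled_pos = position * scale
    return [scaled_pos * base ** (-2.0 * i / dim) for i in range(dim // 2)]


def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate consecutive pairs (vec[2i], vec[2i+1]) by the RoPE angles.

    The consecutive-pair layout is an assumption; some implementations
    pair the first half of the vector with the second half instead.
    """
    angles = rope_angles(position, len(vec), base, scale)
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


TRAIN_LEN = 2048    # assumed pre-training context window
TARGET_LEN = 8192   # assumed extended context window
PI_SCALE = TRAIN_LEN / TARGET_LEN  # interpolation factor, here 0.25
```

With these assumed lengths, `rope_angles(8192, 8, scale=PI_SCALE)` yields the same angles as `rope_angles(2048, 8)`, which is the sense in which interpolation keeps every position inside the range seen during pre-training. The rotation also preserves the property that the dot product of a rotated query and key depends only on their relative offset.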
In recent years, people's awareness of wildlife protection has been increasing, and the demand for wildlife observation and recording has increased. However, the existing sound source localization system needs to continuously collect signals and convert time-domain signals into frequency-domain signals. These shortcomings increase the power consumption of the device and adversely affect battery life, and the result is easily disturbed by echoes and environmental noise in the target waves. In this study, the algorithm and hardware are improved on the basis...
Multi-label feature selection aims to select discriminative attributes in the multi-label scenario, but most existing multi-label feature selection methods fail to consider streaming features, i.e. features that gradually flow in one by one, which is more common in real-world applications. In addition, though there are already some representative works on streaming feature selection, they fail to tackle the class-imbalance problem, which exists widely in multi-label learning. In fact, the class-imbalance problem will lead to performance degradation of multi-label learning models. Thus considering the class-imbalance problem in the streaming feature selection scenario is beneficial...
In a large forest environment, the factors causing fire are nonlinear and uncertain. If the data collected by a sensor is simply analyzed and compared, the false alarm rate will be higher. How to combine several sensors for effective fire warning is a difficult point. In order to improve the accuracy of fire prediction, and aiming at the shortcomings of the traditional forest fire prevention and early warning system, we propose a method based on a fuzzy Bayesian network. First, the fuzzy control system and the Bayesian network are placed in series, and the data is pre-processed. The pre-processed data is sent to the previously...