- Anomaly Detection Techniques and Applications
- Advanced Neural Network Applications
- Topic Modeling
- Natural Language Processing Techniques
- Human Pose and Action Recognition
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Advanced Vision and Imaging
- Video Surveillance and Tracking Methods
- Multimodal Machine Learning Applications
- Machine Learning and Data Classification
- Image Retrieval and Classification Techniques
- Advanced Graph Neural Networks
- Embedded Systems Design Techniques
- Advanced Text Analysis Techniques
- Image and Object Detection Techniques
- Medical Image Segmentation Techniques
- Generative Adversarial Networks and Image Synthesis
- Face and Expression Recognition
- Fault Detection and Control Systems
- Software Engineering Research
- Fire Detection and Safety Systems
- Cervical Cancer and HPV Research
- Sentiment Analysis and Opinion Mining
- Imbalanced Data Classification Techniques
Beijing Forestry University
2011-2022
University of Southern California
2021-2022
Tencent (China)
2020-2021
Harbin Institute of Technology
2020-2021
Chinese Academy of Sciences
2020
National University of Defense Technology
2020
University of Oulu
2020
Toa Pharmaceutical (Japan)
2020
Google (United States)
2019-2020
Johns Hopkins University
2018-2020
Most recent methods for crowd counting are based on convolutional neural networks (CNNs), which have a strong ability to extract local features. However, CNNs inherently fail to model global context due to their limited receptive fields, whereas transformers can model global context easily. In this paper, we propose a simple approach called CCTrans to simplify the design pipeline. Specifically, we utilize a pyramid vision transformer backbone to capture global information, a pyramid feature aggregation (PFA) module to combine low-level and high-level features, an...
Action quality assessment is crucial in areas such as sports, surgery and assembly lines, where action skills can be evaluated. In this paper, we propose the Segment-based P3D-fused network S3D, built upon ED-TCN, and push the performance on the UNLV-Dive dataset by a significant margin. We verify that segment-aware training performs better than full-video training, which turns out to focus on the water spray. We show that temporal segmentation can be embedded with little effort.
Supervised relation extraction methods based on deep neural networks play an important role in the recent information extraction field. However, at present, their performance still fails to reach a good level due to the existence of complicated relations. On the other hand, recently proposed pre-trained language models (PLMs) have achieved great success in multiple natural language processing tasks through fine-tuning when combined with the model of downstream tasks. However, the original standard tasks of PLMs do not include the relation extraction task yet. We believe that...
Most data selection research in machine translation focuses on improving a single domain. We perform data selection for multiple domains at once. This is achieved by carefully introducing instance-level domain-relevance features and automatically constructing a training curriculum to gradually concentrate on multi-domain relevant and noise-reduced data batches. Both the choice of features and the use of the curriculum are crucial for balancing all domains, including out-of-domain. In large-scale experiments, the multi-domain curriculum simultaneously reaches or outperforms the individual...
In this work, we present graph star net (GraphStar), a novel and unified graph neural net architecture which utilizes a message-passing relay and attention mechanism for multiple prediction tasks: node classification, graph classification and link prediction. GraphStar addresses many earlier challenges facing graph neural nets and achieves non-local representation without increasing the model depth or bearing heavy computational costs. We also propose a new method to tackle topic-specific sentiment analysis based on text as...
This paper focuses on semi-supervised object detection (SSOD), which makes good use of unlabeled data to boost performance. We face the following obstacles when adapting the knowledge distillation (KD) framework to SSOD. (1) The teacher model serves a dual role as a teacher and a student, such that the teacher predictions on unlabeled images may limit the upper bound of the student. (2) The imbalance issue caused by the large quantity of consistent predictions between the teacher and the student hinders an efficient knowledge transfer between them. To mitigate these issues, we propose a novel SSOD model called...
Privacy of data is a critical concern when applying Machine Learning (ML) techniques to domains with sensitive data. Homomorphic Encryption (HE), by enabling computations on encrypted data, has emerged as a promising approach to perform inference on ML models such as the Convolution Neural Network (CNN) in a privacy-preserving manner. A significant portion of the total inference latency is in performing convolution over homomorphic encrypted data (HE-Convolution). For plaintext convolution, low latency accelerator designs have been proposed using algorithms...
Spectral clustering is an emerging research topic that has numerous applications, such as data dimension reduction and image segmentation. In spectral clustering, as new data points are added continuously, dynamic data sets are processed in an on-line way to avoid costly re-computation. In this paper, we propose a new representative measure to compress the original data sets and maintain a set of representative points by continuously updating the Eigen-system with the incidence vector. According to these extracted points, we generate instant cluster labels as new data points arrive. Our method...
Recently, deep learning methods have been extensively applied to action recognition in videos. Most existing deep networks treat every video frame equally and directly assign a video label to all the frames sampled from it. However, discriminative actions may occur sparsely in a few key frames of a video, and most other frames are less relevant or even irrelevant to the class. Treating all frames equally will hurt performance. To address this issue, we propose a temporal attention model which learns to recognize human actions in videos while focusing selectively on...
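The idea of weighting frames unequally can be illustrated with a minimal sketch. This is not the model from the abstract, whose details are truncated above; it only assumes that each frame already has a feature vector and a scalar relevance score, and it pools the frames with softmax weights. The function name `temporal_attention_pool` and its inputs are hypothetical.

```python
import math


def temporal_attention_pool(frame_features, frame_scores):
    """Pool per-frame feature vectors into one video vector.

    Assumption (not from the source): each frame has a scalar relevance
    score; a softmax over the scores gives the attention weights.
    """
    if not frame_features or len(frame_features) != len(frame_scores):
        raise ValueError("need one score per frame and at least one frame")

    # Numerically stable softmax over the frame scores.
    top = max(frame_scores)
    exps = [math.exp(s - top) for s in frame_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frame features, one dimension at a time.
    dim = len(frame_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, pooled
```

With equal scores this reduces to plain average pooling, which corresponds to the "treat every frame equally" baseline the abstract argues against; a frame with a much higher score dominates the pooled vector.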
Goal: Squamous cell carcinoma of the cervix is one of the most prevalent cancers worldwide in females. Traditionally, the most indispensable diagnosis of cervix squamous carcinoma is histopathological assessment, which is achieved under a microscope by a pathologist. However, human evaluation of pathology slides is highly dependent on the experience of the pathologist, thus big inter- and intra-observer variability exists. Digital pathology, in combination with deep learning, provides an opportunity to improve the objectivity and efficiency of histopathologic analysis....
In this paper, we propose an effective THresholding method based on the ORder Statistic, called THORS, to convert an arbitrary scoring-type classifier, which can induce a continuous cumulative distribution function of the score, into a cost-sensitive one. The procedure uses the order statistic to find an optimal threshold for classification, requiring almost no knowledge of the classifier itself. Unlike common data-driven methods, we analytically show that THORS has theoretically guaranteed performance, theoretical bounds on the costs, and...
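To make the notion of turning a scoring classifier into a cost-sensitive one concrete, here is a minimal sketch of a generic data-driven threshold search. It is explicitly not the THORS procedure, which the abstract does not spell out; it simply tries every distinct score as a threshold and keeps the one with the lowest empirical misclassification cost. The function name and the cost parameters `cost_fp` and `cost_fn` are assumptions for illustration.

```python
def pick_cost_threshold(scores, labels, cost_fp, cost_fn):
    """Return (threshold, cost) minimizing empirical misclassification cost.

    A sample is predicted positive when score >= threshold.
    Labels are 1 (positive) or 0 (negative). Illustrative only: a plain
    exhaustive search, not the order-statistic method named in the text.
    """
    # Every distinct score is a candidate; +inf means "predict all negative".
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for t in candidates:
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        false_neg = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = cost_fp * false_pos + cost_fn * false_neg
        if best is None or cost < best[1]:
            best = (t, cost)
    return best
```

Raising `cost_fn` relative to `cost_fp` pushes the chosen threshold down, so fewer positives are missed at the price of more false alarms; the classifier's scores themselves are never changed.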
Our "You Only Move Once" (YOMO) detector, based on depthwise separable convolutions, is a single-stage face detector that balances accuracy and latency. YOMO performs scale-invariantly by utilizing a top-down architecture with feature agglomeration and multiple detection modules instead of an image pyramid approach. At the same time, we propose a semi-soft random cropping algorithm that enables the different detection modules to be adequately trained with different scales of samples. Several experiments are conducted on the FDDB dataset with discrete and continuous...
In meta-learning, the knowledge learned from previous tasks is transferred to new ones, but this transfer only works if the tasks are related. Sharing information between unrelated tasks might hurt performance, and it is unclear how to transfer knowledge across tasks with a hierarchical structure. Our research extends a model agnostic meta-learning model, MAML, by exploiting hierarchical task relationships. Our algorithm, TreeMAML, adapts the model to each task with a few gradient steps, but the adaptation follows the hierarchical tree structure: in each step, gradients are pooled across task clusters, and subsequent steps follow...
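The phrase "gradients are pooled across task clusters" can be sketched for a single adaptation step as follows. This is a hedged illustration, not the TreeMAML algorithm: it assumes gradients are plain lists of floats, that a task-to-cluster assignment is already given, and that pooling means averaging within a cluster. The names `pool_gradients_by_cluster`, `task_grads` and `cluster_of` are hypothetical.

```python
def pool_gradients_by_cluster(task_grads, cluster_of):
    """Replace each task's gradient with the mean gradient of its cluster.

    task_grads: dict mapping task name -> gradient (list of floats).
    cluster_of: dict mapping task name -> cluster id.
    Assumption (not from the source): pooling is a simple average.
    """
    sums, counts = {}, {}
    for task, grad in task_grads.items():
        cluster = cluster_of[task]
        if cluster not in sums:
            sums[cluster] = [0.0] * len(grad)
            counts[cluster] = 0
        sums[cluster] = [a + b for a, b in zip(sums[cluster], grad)]
        counts[cluster] += 1

    # Every task in a cluster receives that cluster's averaged gradient.
    return {
        task: [v / counts[cluster_of[task]] for v in sums[cluster_of[task]]]
        for task in task_grads
    }
```

Putting all tasks in one cluster shares a single averaged gradient across them, while putting each task in its own cluster leaves every gradient unchanged; a tree of clusterings would move between these two extremes over successive steps.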
The Mixture-of-Expert (MoE) technique plays a crucial role in expanding the size of DNN model parameters. However, it faces the challenge of extended all-to-all communication latency during the training process. Existing methods attempt to mitigate this issue by overlapping all-to-all communication with expert computation. Yet, these methods frequently fall short of achieving sufficient overlap, consequently restricting the potential for performance enhancements. In our study, we extend the scope of this challenge by considering overlap at the broader training graph level....
Image registration is currently employed to assist with diagnostic tasks such as neurodegenerative disease diagnosis. Aging also affects the brain, and our knowledge of age-related brain diseases is encumbered by age-induced bias; understanding the mechanisms requires a deeper understanding of aging. Thus we have devised a pipeline for generating age-specific templates using a novel deep-learning algorithm detailed in a separate report. Our results show qualitative changes in brain morphology across age groups, tissue...
Almost all state-of-the-art object detectors employ a convolutional neural network (CNN) to extract features. However, how to fully utilize spatial information is a challenge. In this paper, we propose an effective framework for object detection. Our motivation is that multi-scale representation and context are extremely important for object detection. For multi-scale representation, our method combines hierarchical feature maps into a fusion map, which has abundant high-level semantics. For context, we exploit context by stacking multi-region feature maps. The...
As a research hotspot of computer vision, crowd counting methods have achieved success in natural images. But aerial images are rarely explored, and existing methods do not perform well because of the higher resolution, smaller object scale and more complex scenes. Therefore, this paper proposes a lightweight dual-task network (LDNet) for crowd counting, which only uses a bifurcated structure to overcome these new challenges without complicated pipelines. To realize this, a complete but efficient Guidance Branch is...
This paper focuses on Semi-Supervised Object Detection (SSOD). Knowledge Distillation (KD) has been widely used for semi-supervised image classification. However, adapting these methods for SSOD has the following obstacles. (1) The teacher model serves a dual role as a teacher and a student, such that the teacher predictions on unlabeled images may be very close to those of the student, which limits the upper bound of the student. (2) The class imbalance issue in SSOD hinders an efficient knowledge transfer from teacher to student. To address these problems, we propose a novel method...