- Natural Language Processing Techniques
- Human Pose and Action Recognition
- Video Analysis and Summarization
- Anomaly Detection Techniques and Applications
- Big Data and Business Intelligence
- Topic Modeling
- Advanced Text Analysis Techniques
- Handwritten Text Recognition Techniques
- Diabetic Foot Ulcer Assessment and Management
- Data Mining Algorithms and Applications
- Customer churn and segmentation
Walmart (United States)
2022-2025
Yahoo (United States)
2021
We present a model for temporally precise action spotting in videos, which uses a dense set of detection anchors, predicting a detection confidence and corresponding fine-grained temporal displacement for each anchor. We experiment with two trunk architectures, both of which are able to incorporate large contexts while preserving the smaller-scale features required for localization: a one-dimensional version of a U-Net, and a Transformer encoder (TE). We also suggest best practices for training models of this kind, by applying Sharpness-Aware...
The rapid evolution of text-to-image diffusion models has opened the door to generative AI, enabling the translation of textual descriptions into visually compelling images with remarkable quality. However, a persistent challenge within this domain is the optimization of prompts to effectively convey abstract concepts through concrete objects. For example, text encoders can hardly express "peace", while they can easily illustrate olive branches and white doves. This paper introduces a novel approach named Prompt Optimizer for...
Comprehensive understanding of key players and actions in multiplayer sports broadcast videos is a challenging problem. Unlike news or finance videos, sports videos have limited text. While both action recognition and detection have seen robust research, understanding contextual text in video frames still remains one of the most impactful avenues of sports video understanding. In this work we study extremely accurate semantic text detection and recognition in sports clocks, and the challenges therein. We observe unique properties of sports clocks, which make it hard to utilize general-purpose pre-trained...