Jan Zahálka

ORCID: 0000-0002-6743-3607
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Video Analysis and Summarization
  • Image Retrieval and Classification Techniques
  • Multimodal Machine Learning Applications
  • Cancer-related molecular mechanisms research
  • Data Visualization and Analytics
  • RNA modifications and cancer
  • MicroRNA in disease regulation
  • Topic Modeling
  • Music and Audio Processing
  • Recommender Systems and Techniques
  • Machine Learning and Algorithms
  • Human Mobility and Location-Based Analysis
  • Artificial Intelligence in Healthcare and Education
  • Mental Health via Writing
  • Text and Document Classification Technologies
  • Wound Healing and Treatments
  • Human Pose and Action Recognition
  • Data-Driven Disease Surveillance
  • Integrated Circuits and Semiconductor Failure Analysis
  • Electrostatic Discharge in Electronics
  • Anomaly Detection Techniques and Applications
  • Pressure Ulcer Prevention and Management
  • Ethics and Social Impacts of AI
  • Image and Video Quality Assessment

Czech Technical University in Prague
2010-2024

Delft University of Technology
2023

ELLIS Alicante
2023

Sudop Praha (Czechia)
2022

University of Amsterdam
2014-2017

Amsterdam University of the Arts
2014-2016

The size and importance of visual multimedia collections grew rapidly over the last years, creating a need for sophisticated analytics systems enabling large-scale, interactive, insightful analysis. These systems need to integrate the human's natural expertise in analyzing multimedia with the machine's ability to process large-scale data. The paper starts off with a comprehensive overview of representation, learning, and interaction techniques from both the human's and the machine's point of view. To this end, hundreds of references from related disciplines (visual analytics,...

10.1109/vast.2014.7042476 article EN 2014-10-01

In this paper, we propose City Melange, an interactive and multimodal content-based venue explorer. Our framework matches the interacting user to users of social media platforms exhibiting similar taste. The data collection integrates location-based networks such as Foursquare with general multimedia sharing platforms such as Flickr or Picasa. The user interacts with a set of images and thus implicitly with the underlying semantics. The semantic information is captured through convolutional deep net features in the visual domain and latent topics...

10.1109/tmm.2015.2480007 article EN IEEE Transactions on Multimedia 2015-09-18

This paper presents Blackthorn, an efficient interactive multimodal learning approach facilitating analysis of multimedia collections of up to 100 million items on a single high-end workstation. Blackthorn features data compression, feature selection, and optimizations to the learning process. The Ratio-64 representation introduced in this paper costs only tens of bytes per item yet preserves most of the visual and textual semantic information with good accuracy. The optimized model scores the Ratio-64-compressed data directly, greatly...

10.1109/tmm.2017.2755986 article EN IEEE Transactions on Multimedia 2017-09-22

The interplay between artificial intelligence (AI) and psychology, particularly in personality assessment, represents an important emerging area of research. Accurate personality trait estimation is crucial not only for enhancing personalization in human-computer interaction but also for a wide variety of applications ranging from mental health to education. This paper analyzes the capability of a generic chatbot, ChatGPT, to effectively infer personality traits from short texts. We report the results of a comprehensive user study featuring texts...

10.1016/j.chbah.2024.100088 article EN cc-by Computers in Human Behavior Artificial Humans 2024-07-26

We present an enhanced version of Exquisitor, our interactive and scalable media exploration system. At its core, Exquisitor is an interactive learning system using relevance feedback on media items to build a model of the users' information need. Relying on efficient media representation and indexing, it facilitates real-time user interaction. The new features for the Lifelog Search Challenge 2020 include support for timeline browsing, search functionality for finding positive examples, and significant interface improvements. Participation in...

10.1145/3379172.3391718 article EN 2020-06-04

We propose a multimedia analytics solution for getting insight into image collections by extending the powerful analytic capabilities of pivot tables, found in ubiquitous spreadsheets, to multimedia. We formalize the concept of multimedia pivot tables and give design rules and methods for the multimodal summarization, structuring, and browsing of the collection based on these tables, all optimized to support an analyst in getting structural and conclusive insights. Our proposed solution provides truly interactive visual content through detection results, as well as tags,...

10.1109/tmm.2016.2614380 article EN IEEE Transactions on Multimedia 2016-09-28

In this paper, we present analytic quality (AQ), a novel paradigm for the design and evaluation of multimedia analysis methods. AQ complements existing evaluation methods based on either machine-driven benchmarks or user studies. AQ includes the notion of insight gain and the time needed to acquire it, both critical aspects of large-scale multimedia collections analysis. To incorporate insight, AQ introduces a user model. In this model, each simulated user, or artificial actor, builds its insight over time, at any time operating with multiple categories of relevance....

10.1145/2733373.2806279 article EN 2015-10-13

Interactive learning is an umbrella term for methods that attempt to understand the information need of the user and formulate queries that satisfy that need. We propose to apply the state of the art in interactive multimodal learning to visual lifelog exploration and search, using the Exquisitor system. Exquisitor is a highly scalable interactive learning system, which uses semantic features extracted from visual content and text to suggest relevant media items to the user, based on relevance feedback on previously suggested items. Findings from our initial experiments indicate that Exquisitor will likely work well...

10.1145/3326460.3329156 article EN 2019-06-05

In this paper, we introduce II-20 (Image Insight 2020), a multimedia analytics approach for analytic categorization of image collections. Advanced visualizations for image collections exist, but they need tight integration with a machine model to support the task of analytic categorization. Directly employing computer vision and interactive learning techniques gravitates towards search. Analytic categorization, however, is not classification (the difference between the two is called the pragmatic gap): human...

10.1109/tvcg.2020.3030383 article EN IEEE Transactions on Visualization and Computer Graphics 2020-10-21

As large language models (LLMs) permeate more and more applications, an assessment of their associated security risks becomes increasingly necessary. The potential for exploitation by malicious actors, ranging from disinformation to data breaches and reputation damage, is substantial. This paper addresses a gap in current research by specifically focusing on the security risks posed by LLMs within the prompt-based interaction scheme, which extends beyond the widely covered ethical and societal implications. Our work proposes...

10.48550/arxiv.2311.11415 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Exquisitor is a scalable media exploration system based on interactive learning. To satisfy a user's information need, the system asks the user for feedback on media items and uses that feedback to interactively construct a classifier, which is in turn used to identify the next potentially relevant set of items. To facilitate effective exploration of a collection, the system offers filters to narrow the scope of exploration, search functionality for finding good examples, and support for timeline browsing of videos or image sequences. For this year's Lifelog Search Challenge, we have enhanced...

10.1145/3463948.3469255 article EN 2021-08-20

In this demonstration, we present Exquisitor, a media explorer capable of learning user preferences in real time during interactions with the 99.2 million images of YFCC100M. Exquisitor owes its efficiency to innovations in data representation, compression, and indexing. Exquisitor can complete each interaction round, including presenting the most relevant results, in less than 30 ms using only a single CPU core and modest RAM. In short, Exquisitor can bring large-scale interactive learning to standard desktops and laptops, and even high-end mobile devices.

10.1145/3343031.3350580 preprint EN Proceedings of the 30th ACM International Conference on Multimedia 2019-10-15

In this paper we propose New Yorker Melange, an interactive city explorer, which navigates New York venues through the eyes of New Yorkers having a similar taste to the interacting user. To gain insight into New Yorkers' preferences and properties of the venues, a dataset of more than a million venue images and associated annotations has been collected from Foursquare, Picasa, and Flickr. As visual and text features, we use semantic concepts extracted by a convolutional deep net and latent Dirichlet allocation topics. To identify different...

10.1145/2647868.2656403 article EN 2014-10-31

This paper presents Blackthorn, an efficient interactive multimodal learning approach facilitating analysis of multimedia collections of 100 million items on a single high-end workstation. This is achieved by data compression and optimizations to the interactive learning process. The compressed i-I64 data representation costs tens of bytes per item yet preserves most of the visual and textual semantic information. The optimized model scores the i-I64-compressed data directly, greatly reducing the computational requirements. The experiments show that...

10.1145/2911996.2912062 article EN 2016-06-06

The interplay between artificial intelligence (AI) and psychology, particularly in personality assessment, represents an important emerging area of research. Accurate personality trait estimation is crucial not only for enhancing personalization in human-computer interaction but also for a wide variety of applications ranging from mental health to education. This paper analyzes the capability of a generic chatbot, ChatGPT, to effectively infer personality traits from short texts. We report the results of a comprehensive user study featuring texts...

10.48550/arxiv.2312.16070 preprint EN cc-by arXiv (Cornell University) 2023-01-01

User Relevance Feedback (URF) is a class of interactive learning methods that rely on the interaction between a human user and a system to analyze a media collection. To improve URF system evaluation and design better systems, it is important to understand the impact that different interaction strategies can have. Based on the literature and observations from real user sessions at the Lifelog Search Challenge and Video Browser Showdown, we consider strategies related to (a) labeling positive and negative examples, and (b) applying filters based on users' domain knowledge. Experiments show there...

10.1145/3460426.3463663 article EN 2021-08-24

Increasing scale is a dominant trend in today's multimedia collections, which especially impacts interactive applications. To facilitate interactive exploration of large multimedia collections, new approaches are needed that are capable of learning on the fly new analytic categories based on the visual and textual content. To facilitate general use on standard desktops, laptops, and mobile devices, they must furthermore work with limited computing resources. We present Exquisitor, a highly scalable interactive learning approach for intelligent exploration of the large-scale YFCC100M image collection with extremely...

10.48550/arxiv.1904.08689 preprint EN other-oa arXiv (Cornell University) 2019-01-01