Heysem Kaya

ORCID: 0000-0001-7947-5508
Research Areas
  • Emotion and Mood Recognition
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Face and Expression Recognition
  • Music and Audio Processing
  • Machine Learning and ELM
  • Ethics and Social Impacts of AI
  • Sentiment Analysis and Opinion Mining
  • Neural Networks and Applications
  • Natural Language Processing Techniques
  • Mental Health via Writing
  • Mental Health Research Topics
  • Face Recognition and Analysis
  • Explainable Artificial Intelligence (XAI)
  • Domain Adaptation and Few-Shot Learning
  • Generative Adversarial Networks and Image Synthesis
  • Animal Vocal Communication and Behavior
  • Video Analysis and Summarization
  • Voice and Speech Disorders
  • Privacy-Preserving Technologies in Data
  • Human Pose and Action Recognition
  • Primate Behavior and Ecology
  • Bipolar Disorder and Treatment
  • Advanced Data Compression Techniques
  • Bayesian Methods and Mixture Models

Utrecht University
2020-2025

Altrecht GGZ
2022

Namık Kemal University
2016-2019

Boğaziçi University
2010-2015

In-Q-Tel
2000

The Audio/Visual Emotion Challenge and Workshop (AVEC 2018) "Bipolar disorder, cross-cultural affect recognition" is the eighth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as to compare the relative merits of various approaches from...

10.1145/3266302.3266316 preprint EN 2018-10-15

Predictive emission monitoring systems (PEMS) are important tools for validation and backing up of costly continuous emission monitoring systems used in gas-turbine-based power plants. Their implementation relies on the availability of appropriate and ecologically valid data. In this paper, we introduce a novel PEMS dataset collected over five years from a gas turbine for the predictive modeling of CO and NOx emissions. We analyze the data using a recent machine learning paradigm, and present useful insights about the predictions. Furthermore, benchmark...

10.3906/elk-1807-87 article EN TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES 2019-08-10

The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: in the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation Sub-Challenge, a three-way assessment of the level of escalation in a dialogue is featured; and in the Primates Sub-Challenge, species vs background need to be classified. We describe the baseline feature extraction and classifiers based on the 'usual'...

10.21437/interspeech.2021-19 article EN Interspeech 2021 2021-08-27

This paper presents our work on ACM MM Audio Visual Emotion Corpus 2014 (AVEC 2014) using the baseline features in accordance with the challenge protocol. For prediction, we use Canonical Correlation Analysis (CCA) in the affect sub-challenge (ASC) and Moore-Penrose generalized inverse (MPGI) in the depression sub-challenge (DSC). The video baseline provides histograms of Local Gabor Binary Patterns from Three Orthogonal Planes (LGBP-TOP) features. Based on preliminary experiments on AVEC 2013 data, we focus on the inner facial regions that...

10.1145/2661806.2661814 article EN 2014-11-03

We propose a two-level system for apparent age estimation from facial images. Our system first classifies samples into overlapping age groups. Within each group, the apparent age is estimated with local regressors, whose outputs are then fused for the final estimate. We use a deformable parts model based face detector, and features from a pretrained deep convolutional network. Kernel extreme learning machines are used for classification. We evaluate our system on the ChaLearn Looking at People 2016 - Apparent Age Estimation challenge dataset, and report 0.3740...

10.1109/cvprw.2016.103 article EN 2016-06-01

Explainability and interpretability are two critical aspects of decision support systems. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of apparent personality recognition. To the best of our knowledge, this is the first effort in this direction. We describe a challenge we organized on first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, proposed solutions...

10.1109/taffc.2020.2973984 article EN IEEE Transactions on Affective Computing 2020-02-14

This paper presents our contribution to the ACM ICMI 2015 Emotion Recognition in the Wild Challenge (EmotiW 2015). We participate in both the static facial expression (SFEW) and audio-visual emotion recognition challenges. In both challenges, we use a set of visual descriptors and their early and late fusion schemes. For AFEW, we also exploit a set of popularly used spatio-temporal modeling alternatives and carry out multi-modal fusion. For classification, we employ two least squares regression based learners that are shown to be fast...

10.1145/2818346.2830588 article EN 2015-11-09

In this study we make use of Canonical Correlation Analysis (CCA) based feature selection for continuous depression recognition from speech. Besides its common use in multi-modal/multi-view feature extraction, CCA can be easily employed as a feature selector. We introduce several novel CCA based filter (ranking) methods, showing their relations to previous work. We test the suitability of the proposed methods on the AVEC 2013 dataset under the ACM MM Challenge protocol. Using 17% of the features, we obtained a relative improvement of 30%...

10.1109/icassp.2014.6854298 article EN 2014-05-01

Computational Paralinguistics has several unresolved issues, one of which is coping with large variability due to speakers, spoken content and corpora. In this paper, we address the variability compensation issue by proposing a novel method composed of i) Fisher vector encoding of low level descriptors extracted from the signal, ii) speaker z-normalization applied after speaker clustering, iii) non-linear normalization of features and iv) classification based on Kernel Extreme Learning Machines and Partial Least Squares regression....

10.21437/interspeech.2015-193 article EN Interspeech 2015 2015-09-06

We describe an end-to-end system for explainable automatic job candidate screening from video CVs. In this application, audio, face and scene features are first computed from an input video CV, using rich feature sets. These multiple modalities are fed into modality-specific regressors to predict apparent personality traits and a variable that predicts whether the subject will be invited to the interview. The base learners are stacked to an ensemble of decision trees to produce the outputs of the quantitative stage, and a single decision tree, combined with...

10.1109/cvprw.2017.210 article EN 2017-07-01

Inpatient violence is a common and severe problem within psychiatry. Knowing who might become violent can influence staffing levels and mitigate severity. Predictive machine learning models can assess each patient's likelihood of becoming violent based on clinical notes. Yet, while machine learning models benefit from having more data, data availability is limited as hospitals typically do not share their data for privacy preservation. Federated Learning (FL) can overcome the data limitation by training models in a decentralised manner, without disclosing...

10.1016/j.eswa.2022.116720 article EN cc-by Expert Systems with Applications 2022-03-10

As emotions play a central role in human communication, automatic emotion recognition has attracted increasing attention in the last two decades. While multimodal systems enjoy high performances on lab-controlled data, they are still far from providing ecological validity on non-lab-controlled, namely “in-the-wild” data. This work investigates audiovisual deep learning approaches to the in-the-wild emotion recognition problem. Inspired by the outstanding performance of end-to-end and transfer learning techniques, we explored...

10.3390/mti6020011 article EN cc-by Multimodal Technologies and Interaction 2022-01-27

Affective computing, particularly emotion and personality trait recognition, is of increasing interest in many research disciplines. The interplay of emotion and personality shows itself in the first impression left on other people. Moreover, the ambient information, e.g. the environment and objects surrounding the subject, also affects these impressions. In this work, we employ pre-trained Deep Convolutional Neural Networks to extract facial and ambient information from images for predicting apparent personality. We investigate Local Gabor Binary...

10.1109/icpr.2016.7899605 article EN 2016-12-01

This paper introduces a new audio-visual Bipolar Disorder (BD) corpus for the affective computing and psychiatric communities. The corpus is annotated for BD state, as well as Young Mania Rating Scale (YMRS) by psychiatrists. The paper also presents an audio-visual pipeline for BD state classification. The investigated features include functionals of appearance descriptors extracted from fine-tuned Deep Convolutional Neural Networks (DCNN), geometric features obtained using tracked facial landmarks, as well as acoustic features extracted via the openSMILE tool. Furthermore,...

10.1109/aciiasia.2018.8470362 article EN 2018-05-01

Recent developments in Artificial Intelligence (AI) have greatly benefited society, but they also come with risks. One of those risks is that AI has the potential to discriminate against certain groups of people. To address this risk, benchmark regulations such as the AI Act have been created, requiring AI systems to be fair and tasking auditors with ensuring their compliance. In order to do so, auditors use fairness measures. However, selecting a specific fairness definition from the various available options and choosing a measure from the numerous...

10.31219/osf.io/cpxmf_v1 preprint EN 2025-02-27

10.1109/icassp49660.2025.10889722 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Recent developments in Artificial Intelligence (AI) have greatly benefited society, but they also come with risks. One of those risks is that AI has the potential to discriminate against certain groups of people. To address this risk, benchmark regulations such as the AI Act have been created, requiring AI systems to be fair and tasking auditors with ensuring their compliance. In order to do so, auditors use fairness measures. However, selecting a specific fairness definition from the various available options and choosing a measure from the numerous...

10.31219/osf.io/cpxmf_v2 preprint EN 2025-03-26