NFDI4DS | UHH-SEMS - Publication Details

Álan L. V. Guedes

ORCID: 0000-0003-0110-9975

About

Contact & Profiles

A5059620310

Research Areas

Multimedia Communication and Technology
Video Analysis and Summarization
Image and Video Quality Assessment
Human Pose and Action Recognition
Speech and dialogue systems
Advanced Image and Video Retrieval Techniques
Usability and User Interface Design
Peer-to-Peer Network Technologies
Advanced Image Processing Techniques
Web Applications and Data Management
Music and Audio Processing
Face recognition and analysis
Multimodal Machine Learning Applications
Telecommunications and Broadcasting Technologies
Image and Signal Denoising Methods
Digital Media Forensic Detection
Education and Digital Technologies
Video Surveillance and Tracking Methods
Speech and Audio Processing
Online Learning and Analytics
Biometric Identification and Security
Video Coding and Compression Technologies
Educational Games and Gamification
Power Systems and Technologies
Subtitles and Audiovisual Media

Pontifical Catholic University of Rio de Janeiro
2015-2024

University College London
2023

Universidade Federal da Paraíba
2011-2012

Extending multimedia languages to support multimodal user interactions

OPENALEX - Publications

Álan L. V. Guedes, Roberto Gerson de Albuquerque Azevedo, Simone Diniz Junqueira Barbosa

10.1007/s11042-016-3846-8 article EN Multimedia Tools and Applications 2016-10-01

Subjective Evaluation of 360-degree Sensory Experiences

Álan L. V. Guedes, Roberto Gerson de Albuquerque Azevedo, Pascal Frossard, Sérgio Colcher, Simone Diniz Junqueira Barbosa

Traditionally, most multimedia content has been developed to stimulate two of the human senses, i.e., sight and hearing. Due to recent technological advancements, however, innovative services have been developed that provide more realistic, immersive, and engaging experiences to the audience. Omnidirectional (i.e., 360-degree) video, for instance, is becoming increasingly popular. It allows the viewer to navigate the full 360-degree view of a scene from a specific point. In particular, when consumed through head-mounted displays,...

10.1109/mmsp.2019.8901743 article EN 2019-09-01

An Authoring Model for Interactive 360 Videos

Paulo Renato da Costa Mendes, Álan L. V. Guedes, Daniel de Sousa Moraes, Roberto Gerson Albuquerque Azevedo, Sérgio Colcher

The recent availability of consumer-level head-mounted displays and omnidirectional cameras has been driving an explosion of 360 video content. Transforming the original recorded content into meaningful interactive multimedia presentations that support viewers in tasks such as learning, entertainment, and telepresence, however, is not trivial and requires new tools. Such tools must provide an easy-to-use authoring model for the integration of different media objects and active user interface elements. In this paper, based...

10.1109/icmew46912.2020.9105958 article EN 2020-06-09

An introduction to data stream processing

Marcos Roriz, Fernando B. V. Magalhães, Álan L. V. Guedes, Sérgio Colcher, Markus Endler

Most information technology courses focus on presenting the traditional static data processing model (e.g., DBMS) and do not present the data stream processing model. Such a model can be applied to fields such as the Internet of Things, business, finance, logistics, and smart cities, and in industry. Complex Event Processing is a programming approach to handling data stream processing. It provides primitives to process streams and to detect the occurrence of patterns in them. The objective of this short course is to present complex event processing (CEP) as a means of dealing with...

10.1145/3323503.3345028 article EN 2019-10-10

Deep learning methods for video understanding

Gabriel N. P. dos Santos, Pedro V. A. de Freitas, Antonio José G. Busson, Álan L. V. Guedes, Ruy Luiz Milidiú, and 1 more

Methods based on Deep Learning have become the state of the art in several multimedia challenges. However, there is a shortage of professionals able to apply Deep Learning in industry. Therefore, this tutorial aims to present the foundations of Deep Learning and ways to develop applications that use Deep Learning methods for video analysis tasks. Likewise, it is an opportunity for students and information technology professionals to qualify themselves. The main focus of the short course is the fundamentals and technologies for developing such DL applications.

10.1145/3323503.3345029 article EN 2019-10-10

Extending NCL to Support Multiuser and Multimodal Interactions

Álan L. V. Guedes, Roberto Gerson de Albuquerque Azevedo, Sérgio Colcher, Simone Diniz Junqueira Barbosa

Recent advances in technologies for speech, touch, and gesture recognition have given rise to a new class of user interfaces that not only explores multiple modalities but also allows for multiple interacting users. Even so, current declarative multimedia languages (e.g., HTML, SMIL, and NCL) support only limited forms of user input (mainly keyboard and mouse) for a single user. In this paper, we aim at studying how the NCL language could take advantage of those new recognition technologies. To do so, we revisit the model behind NCL, named NCM (Nested Context...

10.1145/2976796.2976869 article EN 2016-11-07

An ergonomic evaluation method using a mobile depth sensor and pose estimation

Pedro V. A. de Freitas, Paulo Renato da Costa Mendes, Antonio José G. Busson, Álan L. V. Guedes, Giovanni Lucca F. da Silva, and 2 more

An ergonomic evaluation is an observation of a person in order to identify musculoskeletal disorders (WMSDs) caused by prolonged or repeated harmful poses that the person adopts during work tasks. Nowadays, an ergonomist or other health professional performs such evaluations based on a set of posture rules and checklists, which can be subjective and thus lead to erroneous risk classifications. Moreover, this evaluation is usually carried out in the patient's environment. In order to make those evaluations more objective and concise, we propose a method using a mobile depth sensor....

10.1145/3323503.3349550 article EN 2019-10-10

A baseline for NSFW video detection in e-learning environments

Pedro V. A. de Freitas, Gabriel N. P. dos Santos, Antonio José G. Busson, Álan L. V. Guedes, Sérgio Colcher

The broad use of video capture and of services for video storage and transmission has enabled the production of a massive volume of video data. This usage presents a challenge in controlling the type of content that is loaded onto these services. The Internet slang NSFW (Not Safe For Work) is often used as a warning for media that contain inappropriate content, such as nudity, intense sexuality, violence, gore, or other potentially disturbing subject matter. Convolutional Neural Network (CNN) architectures, or ConvNets, have become the primary method...

10.1145/3323503.3360625 article EN 2019-10-10

A Cluster-Matching-Based Method for Video Face Recognition

Paulo Renato da Costa Mendes, Antonio José G. Busson, Sérgio Colcher, Daniel Schwabe, Álan L. V. Guedes, and 1 more

Face recognition systems are present in many modern solutions and in thousands of applications in our daily lives. However, current solutions are not easily scalable, especially when it comes to the addition of new targeted people. We propose a cluster-matching-based approach for face recognition in video. In our approach, we use unsupervised learning to cluster the faces present in both the dataset and the videos selected for face recognition. Moreover, we design a cluster-matching heuristic to associate the clusters in both sets that is also capable of identifying when a face belongs to a non-registered person. Our...

10.1145/3428658.3430967 preprint EN 2020-11-25

A CNN-Based Tool to Index Emotion on Anime Character Stickers

Ivan Jesus, Jessica Cardoso, Antonio José G. Busson, Álan L. V. Guedes, Sérgio Colcher, and 1 more

Anime character stickers consist of a detailed illustration of characters that represents emotions. Generally, messaging apps let users save received stickers, but in some cases the task of manually searching for a specific sticker may be frustrating when a large number of them are stored. In this work, we propose a CNN-based tool for emotion indexing of anime character stickers. We built a dataset with 12,668 stickers labeled in 3 classes (Sad, Happy, and Angry). In experiments, our model achieves an 84.01% global F1-score. Additionally, we describe...

10.1109/ism46123.2019.00071 article EN 2019-12-01

Decoder-Side Quality Enhancement of JPEG Images Using Deep Learning-Based Prediction Models for Quantized DCT Coefficients

Antonio José G. Busson, Paulo Renato da Costa Mendes, Daniel de Sousa Moraes, Álvaro Mário G. da Veiga, Sérgio Colcher, and 1 more

Many recent works have successfully applied some types of Convolutional Neural Networks (CNNs) to reduce the noticeable distortion resulting from the lossy JPEG compression technique. Most of them are built upon processing in the spatial domain. In this work, we propose a JPEG image decoder that is purely based on the frequency-to-frequency domain: it reads the quantized DCT coefficients received from a low-quality image bitstream and, using a deep learning-based model, predicts the missing coefficients in order to recompose the same image with enhanced...

10.1145/3428658.3430966 article EN 2020-11-25

A Deep Learning Approach to Detect Pornography Videos in Educational Repositories

Pedro V. A. de Freitas, Antonio José G. Busson, Álan L. V. Guedes, Sérgio Colcher

A large number of videos are uploaded to educational platforms every minute. Those platforms are responsible for any sensitive media uploaded by their users. An automated detection system to identify pornographic content could assist human workers by pre-selecting suspicious videos. In this paper, we propose a multimodal approach to adult content detection. We use two Deep Convolutional Neural Networks to extract high-level features from both the image and audio sources of a video. Then, we concatenate those features and evaluate the performance...

10.5753/cbie.sbie.2020.1253 article EN Anais do XXXI Simpósio Brasileiro de Informática na Educação (SBIE 2020) 2020-11-24

Interactive 360-degree Videos in Ginga-NCL Using Head-Mounted-Displays as Second Screen Devices

Gabriel Alves De Souza, Daniel Silva, Matheus Delgado, Renato Rodrigues, Paulo Renato da Costa Mendes, and 3 more

The recent availability of HMD (Head-Mounted Display) devices has spurred advances in Virtual Reality and Augmented Reality. It has allowed scenarios of immersion of users in realistic environments, e.g., synthesized 3D environments for games and simulation, and omnidirectional videos, namely 360° videos. These devices are commonly used in conjunction with home computers or in controlled environments (e.g., museums). However, despite these scenarios, current interactive TV systems still focus on using content in two-dimensional layouts. could...

10.1145/3428658.3430972 article EN 2020-11-25

Should I See or Should I Go

Arthur Costa Serra, Paulo Renato da Costa Mendes, Pedro V. A. de Freitas, Antonio José G. Busson, Álan L. V. Guedes, and 1 more

A very large amount of multimedia data is continually being shared through social networks. In these public spaces, administrators are legally responsible for moderating and controlling the content uploaded or posted to their platforms. However, the traffic of media in some private spaces, such as chat rooms and messaging platforms, for example, is often protected, sometimes by end-to-end encryption, and is therefore not subject to this kind of monitoring, which makes them prone to the spread of inappropriate content, such as pornography, violence, and other...

10.1145/3470482.3479639 article EN 2021-09-28

A Multimodal CNN-based Tool to Censure Inappropriate Video Scenes

Pedro V. A. de Freitas, Paulo Renato da Costa Mendes, Gabriel N. P. dos Santos, Antonio José G. Busson, Álan L. V. Guedes, and 2 more

Due to the extensive use of video-sharing platforms and services for their storage, the amount of such media on the internet has become massive. This volume of data makes it difficult to control the kind of content that may be present in such video files. One of the main concerns regarding video content is whether it has an inappropriate subject matter, such as nudity, violence, or other potentially disturbing content. More than telling whether a video is either appropriate or inappropriate, it is also important to identify which parts of it contain such content, preserving parts that would be discarded in a simple...

10.48550/arxiv.1911.03974 preprint EN other-oa arXiv (Cornell University) 2019-01-01

An ontology-based approach to integrate TV and IoT middlewares

Danne Makleyston Gomes Pereira, Francisco Silva, Carlos de Salles Soares Neto, Davi Viana, Luciano Coutinho, and 1 more

10.1007/s11042-020-09645-4 article EN Multimedia Tools and Applications 2020-09-10

Quality Enhancement of Highly Degraded Music Using Deep Learning-Based Prediction Models for Lost Frequencies

Arthur Costa Serra, Antonio José G. Busson, Álan L. V. Guedes, Sérgio Colcher

Audio quality degradation can have many causes. For musical applications, this fragmentation may lead to highly unpleasant experiences. Restoration algorithms may be employed to reconstruct missing parts of the audio in a similar way as for image reconstruction, in an approach called audio inpainting. Current state-of-the-art methods for audio inpainting cover limited scenarios, with well-defined gap windows and little variety of musical genres. In this work, we propose a Deep-Learning-based (DL-based) method for audio inpainting accompanied by a dataset...

10.1145/3470482.3479635 article EN 2021-09-28

Modeling Multimodal-Multiuser Interactions in Declarative Multimedia Languages

Álan L. V. Guedes, Roberto Gerson de Albuquerque Azevedo, Sérgio Colcher, Simone Diniz Junqueira Barbosa

Recent advances in hardware and software technologies have given rise to a new class of human-computer interfaces that both explores multiple modalities and allows for multiple collaborating users. When compared to the development of traditional single-user WIMP (windows, icons, menus, pointer)-based applications, however, applications supporting the seamless integration of multimodal-multiuser interactions bring new specification and runtime requirements. With the aim of assisting the development of multimedia applications that integrate multimodal-multiuser interactions, this paper: (1)...

10.1145/3342558.3345400 article EN 2019-09-19

Shaping the Video Conferences of Tomorrow With AI

Paulo Renato da Costa Mendes, Eduardo Silva Vieira, Pedro V. A. de Freitas, Antonio José G. Busson, Álan L. V. Guedes, and 2 more

Before the COVID-19 pandemic, video was already one of the main media used on the internet. During the pandemic, video conferencing services became even more important, coming to be one of the main instruments to enable most social and professional human activities. Given the social distancing policies, people are spending more time using these online services for working, learning, and also for leisure activities. Videoconferencing software became the standard communication for home-office and remote learning. Nevertheless, there are still a lot of issues to be addressed on these platforms, and many different aspects...

10.5753/webmedia_estendido.2020.13082 article EN 2020-11-30

Video Quality Enhancement Using Deep Learning-Based Prediction Models for Quantized DCT Coefficients in MPEG I-frames

Antonio José G. Busson, Paulo Renato da Costa Mendes, Daniel de Sousa Moraes, Álvaro M da Veiga, Álan L. V. Guedes, and 1 more

Recent works have successfully applied some types of Convolutional Neural Networks (CNNs) to reduce the noticeable distortion resulting from the lossy JPEG/MPEG compression technique. Most of them are built upon processing in the spatial domain. In this work, we propose an MPEG video decoder that is purely based on the frequency-to-frequency domain: it reads the quantized DCT coefficients received from a low-quality I-frames bitstream and, using a deep learning-based model, predicts the missing coefficients in order to recompose the same...

10.1109/ism.2020.00012 article EN 2020-12-01

A Clustering-Based Method for Automatic Educational Video Recommendation Using Deep Face-Features of Lecturers

Paulo Renato da Costa Mendes, Eduardo Silva Vieira, Álan L. V. Guedes, Antonio José G. Busson, Sérgio Colcher

Discovering and accessing specific content within educational video bases is a challenging task, mainly because of the abundance of video content and its diversity. Recommender systems are often used to enhance the ability to find and select content. But recommendation mechanisms, especially those based on textual information, exhibit some limitations, such as being error-prone due to manually created keywords or to imprecise speech recognition. This paper presents a method for generating educational video recommendations using deep face-features...

10.1109/ism.2020.00034 article EN 2020-12-01

GingaSpace

Álan L. V. Guedes, L. Costa, Fernando Santos De Mattos Brito, Ana Paula Nunes Guimarães, José Ivan Bezerra Vilarouca Filho, and 2 more

Advances in interactive digital TV have enabled the introduction of application scenarios that explore Internet content and multiple device interaction. However, the authorship and interoperability of applications for such scenarios is hampered by the diversity of technologies and devices involved. This paper presents a software architecture for a portable application store based on the H.761 ITU recommendation for IPTV services. The concept is implemented as a Ginga-NCL application, which retrieves and executes other applications. The description of the proposed architecture,...

10.1145/2526188.2526239 article EN 2013-11-05

Specification of Multimodal Interactions in NCL

Álan L. V. Guedes, Roberto Gerson de Albuquerque Azevedo, Márcio Ferreira Moreno, Luiz Fernando Gomes Soares

This paper proposes an approach to integrate multimodal events (both user-generated, e.g., audio recognizer, motion sensors; and user-consumed, e.g., speech synthesizer, haptic synthesizer) into programming languages for the declarative specification of multimedia applications. More precisely, it presents extensions to the NCL (Nested Context Language) multimedia language. NCL is the standard declarative language for the development of interactive applications for Brazilian Digital TV and an ITU-T Recommendation for IPTV services. NCL applications extended with the multimodal features are...

10.1145/2820426.2820436 article EN 2015-10-27
