- Topic Modeling
- Service-Oriented Architecture and Web Services
- Software Engineering Techniques and Practices
- Advanced Database Systems and Queries
- Natural Language Processing Techniques
- Software Engineering Research
- Semantic Web and Ontologies
- Business Process Modeling and Analysis
- Advanced Software Engineering Methodologies
- Software System Performance and Reliability
- Artificial Intelligence in Law
- Blockchain Technology Applications and Security
- Distributed Systems and Fault Tolerance
- Information Technology Governance and Strategy
- Privacy-Preserving Technologies in Data
- Privacy, Security, and Data Protection
- Collaboration in Agile Enterprises
- Advanced Text Analysis Techniques
- Biomedical Text Mining and Ontologies
- Cryptography and Data Security
- Data Management and Algorithms
- Distributed and Parallel Computing Systems
- Data Quality and Management
- Scientific Computing and Data Management
- Usability and User Interface Design
- Technical University of Munich, 2016-2025
- Universität Bayern, 2024
- Afe Babalola University, 2020
- München Klinik, 2017-2019
- Siemens (Germany), 2017
- Information Technology University, 2013-2016
- University of Edinburgh, 2012
- Software (Spain), 2006
- Software (Germany), 1999-2001
- Hamburg University of Technology, 1994-2000
Predicting protein function and structure from sequence is one important challenge for computational biology. For 26 years, most state-of-the-art approaches combined machine learning and evolutionary information. However, for some applications the retrieval of related proteins is becoming too time-consuming. Additionally, evolutionary information is less powerful for small families, e.g. the Dark Proteome. Both these problems are addressed by the new methodology introduced here. We introduce a novel way to represent protein sequences as continuous...
Conversational interfaces have recently gained a lot of attention. One of the reasons for the current hype is the fact that chatbots (one particularly popular form of conversational interfaces) can nowadays be created without any programming knowledge, thanks to different toolkits and so-called Natural Language Understanding (NLU) services. While these NLU services are already widely used in both industry and science, so far they have not been analysed systematically. In this paper, we present a method to evaluate...
Over the last two decades, agile methods have transformed and brought unique changes to software development practice by strongly emphasizing team collaboration, customer involvement, and change tolerance. The success of agile methods for small, co-located teams has inspired organizations to increasingly apply agile practices to large-scale efforts. Since these methods were originally designed for small teams, unprecedented challenges occur when introducing them at a larger scale, such as inter-team coordination and communication,...
Decentralized identifiers and verifiable credentials have been proposed as a self-sovereign, privacy-friendly alternative to centralized proprietary authentication services. Currently, a W3C standard exists that attempts to unify existing proposals and find a common layer for decentralized identification and verification. However, there are some limitations of this standard in comparison to established, centrally controlled platforms concerning trust, privacy, and usability. In this paper, we first describe all workflows which...
Keyphrase extraction is the process of automatically selecting a small set of the most relevant phrases from a given text. Supervised keyphrase extraction approaches need large amounts of labeled training data and perform poorly outside the training domain (Bennani-Smires et al., 2018). In this paper, we present PatternRank, which leverages pretrained language models and part-of-speech tagging for unsupervised keyphrase extraction from single documents. Our experiments show PatternRank achieves higher precision, recall, and F1-scores than previous state-of-the-art...
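The two-stage idea behind this kind of unsupervised keyphrase extraction can be sketched in plain Python: first extract candidate phrases, then rank them by similarity to the document as a whole. This is an illustrative sketch only, not the PatternRank implementation; a crude stopword-based splitter stands in for the part-of-speech pattern matching, and bag-of-words cosine similarity stands in for pretrained language model embeddings.

```python
import math
import re
from collections import Counter

def candidate_phrases(text):
    """Split on stopwords to approximate noun-phrase candidates.

    A real system would keep only spans matching part-of-speech
    patterns such as (adjective)* (noun)+.
    """
    stopwords = {"the", "a", "an", "of", "and", "is", "in", "to",
                 "for", "from", "that", "this", "are", "with", "we"}
    words = re.findall(r"[A-Za-z][A-Za-z-]+", text.lower())
    phrases, current = [], []
    for w in words:
        if w in stopwords:
            if current:
                phrases.append(" ".join(current))
            current = []
        else:
            current.append(w)
    if current:
        phrases.append(" ".join(current))
    return phrases

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_keyphrases(text, top_n=3):
    """Rank candidate phrases by similarity to the full document."""
    doc_vec = Counter(re.findall(r"[a-z-]+", text.lower()))
    scores = {}
    for phrase in candidate_phrases(text):
        vec = Counter(phrase.split())
        scores[phrase] = max(scores.get(phrase, 0.0), cosine(vec, doc_vec))
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(rank_keyphrases(
    "Keyphrase extraction selects relevant phrases from a given text."))
```

Because ranking needs no labeled data, the approach transfers across domains, which is exactly where supervised extractors struggle.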
Text classification of unseen classes is a challenging Natural Language Processing task and is mainly attempted using two different types of approaches. Similarity-based approaches attempt to classify instances based on similarities between text document representations and class description representations. Zero-shot classification approaches aim to generalize knowledge gained from a training task by assigning appropriate labels of unknown classes to documents. Although existing studies have already investigated individual approaches of these categories, the...
This study describes the roles of architects in scaling agile frameworks with the help of a structured literature review. We aim to provide a primary analysis of 20 identified scaling agile frameworks. Subsequently, we thoroughly describe three popular frameworks: Scaled Agile Framework, Large Scale Scrum, and Disciplined Agile 2.0. After specifying the main concepts of these frameworks, we characterize enterprise, software, solution, and information architects as four roles. Finally, we discuss generalizable findings on the role...
Abstract Background: One common task in Computational Biology is the prediction of aspects of protein function and structure from their amino acid sequence. For 26 years, most state-of-the-art approaches toward this end have been marrying machine learning and evolutionary information. The retrieval of related proteins from ever-growing sequence databases is becoming so time-consuming that the analysis of entire proteomes becomes challenging. On top of that, evolutionary information is less powerful for small families, e.g. the Dark Proteome...
In today's assistant landscape, personalisation enhances interactions, fosters long-term relationships, and deepens engagement. However, many systems struggle with retaining user preferences, leading to repetitive requests and disengagement. Furthermore, the unregulated and opaque extraction of user preferences in industry applications raises significant concerns about privacy and trust, especially in regions with stringent regulations like Europe. In response to these challenges, we propose a memory system for voice...
Like any digital certificate, Verifiable Credentials (VCs) require a way to revoke them in case of an error or key compromise. Existing solutions for VC revocation, most prominently the Bitstring Status List, are not viable for many use cases, since they leak the issuer's behavior, which in turn leaks internal business metrics: for instance, the exact staff fluctuation through the issuance and revocation of employee IDs. We introduce CRSet, a revocation mechanism that allows an issuer to encode revocation information for years' worth of VCs as Bloom...
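A minimal sketch of the underlying idea: a Bloom filter stores set membership in a fixed-size bit array, so a verifier can check whether a credential ID is revoked without the issuer publishing an explicit list. This is a generic Bloom filter, not the CRSet construction (which, per the abstract, builds on this primitive and adds more machinery); the credential IDs and parameters are invented for illustration.

```python
import hashlib

class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, rare false positives."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k independent bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

revoked = BloomFilter()
revoked.add("vc-2024-0001")          # hypothetical credential ID
print("vc-2024-0001" in revoked)     # True: revoked IDs always match
print("vc-2024-0002" in revoked)     # almost certainly False
```

Because the filter has a fixed size regardless of how many IDs it holds, observers cannot infer issuance or revocation volumes from its length, which is the metric-leakage problem the abstract describes.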
Over recent years, the blockchain ecosystem has grown significantly with the emergence of new Layer-1 (L1) and Layer-2 (L2) networks. These blockchains typically host Decentralized Exchanges (DEXes) for trading assets such as native currencies and stablecoins. While this diversity enriches the ecosystem, it also fragments liquidity, posing challenges for DEXes offering the same assets across multiple blockchains. This fragmentation leads to price discrepancies, creating opportunities like arbitrages for profit-seeking...
The field of text privatization often leverages the notion of $\textit{Differential Privacy}$ (DP) to provide formal guarantees in the rewriting or obfuscation of sensitive textual data. A common and nearly ubiquitous form of DP application necessitates the addition of calibrated noise to vector representations of text, either at the data- or model-level, which is governed by the privacy parameter $\varepsilon$. However, this almost undoubtedly leads to considerable utility loss, thereby highlighting one major drawback of DP in NLP. In this...
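The noise-addition mechanism the abstract refers to can be sketched as follows: each dimension of a text embedding receives Laplace noise with scale $\Delta/\varepsilon$, so a smaller $\varepsilon$ (stronger privacy) means more noise and thus more of the utility loss discussed above. This is a simplified per-dimension sketch, not the paper's method; real text-privatization mechanisms often calibrate noise to a metric over the embedding space, and the sensitivity value here is an assumed placeholder.

```python
import random

def laplace_noise(scale):
    # A Laplace(0, scale) sample is the difference of two
    # i.i.d. Exponential(1/scale) samples.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def privatize(embedding, epsilon, sensitivity=1.0):
    """Add calibrated Laplace noise to every dimension of an embedding.

    scale = sensitivity / epsilon, so privacy (small epsilon) and
    utility (low noise) pull in opposite directions.
    """
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale) for x in embedding]

vec = [0.2, -0.5, 0.9]
print(privatize(vec, epsilon=10.0))  # mild distortion
print(privatize(vec, epsilon=0.1))   # heavy distortion
```

Running the two calls above makes the trade-off tangible: at $\varepsilon = 10$ the output stays near the original vector, while at $\varepsilon = 0.1$ it is dominated by noise.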
Lexical Substitution is the task of replacing a single word in a sentence with a similar one. This replacement should ideally be one that is not necessarily only synonymous, but also fits well into the surrounding context of the target word, while preserving the sentence's grammatical structure. Recent advances have leveraged masked token prediction of Pre-trained Language Models to generate replacements for a given word in a sentence. Building on this technique, we introduce ConCat, a simple augmented approach which utilizes the original sentence to bolster...
Retrieval-augmented generation (RAG) has emerged as an approach to augment large language models (LLMs) by reducing their reliance on static knowledge and improving answer factuality. RAG retrieves relevant context snippets and generates answers based on them. Despite its increasing industrial adoption, a systematic exploration of RAG components is lacking, particularly regarding the ideal size of the provided context and the choice of base LLM and retrieval method. To help guide the development of robust RAG systems, we evaluate various...
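The retrieve-then-generate flow can be sketched minimally: score corpus snippets against the query, keep the top-k, and splice them into the prompt handed to the LLM. This is an illustrative sketch, not the paper's setup; lexical word overlap stands in for a real retriever (dense embeddings or BM25), the corpus is invented, and the generation step is left as prompt assembly since the choice of base LLM is one of the variables the paper studies.

```python
from collections import Counter

def overlap_score(query, snippet):
    # Word-overlap count stands in for a dense or BM25 retriever.
    q = Counter(query.lower().split())
    s = Counter(snippet.lower().split())
    return sum((q & s).values())

def retrieve(query, corpus, k=2):
    """Return the k snippets most similar to the query."""
    return sorted(corpus, key=lambda s: overlap_score(query, s),
                  reverse=True)[:k]

def build_prompt(query, corpus, k=2):
    # k controls context size, one of the design knobs the paper evaluates.
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "RAG retrieves context snippets before generation.",
    "Bloom filters trade space for false positives.",
    "Context size influences answer factuality.",
]
print(build_prompt("How does context size affect RAG answers?", corpus))
```

Varying `k` in `build_prompt` is exactly the kind of context-size experiment the abstract says is under-explored: too little context starves the model, too much buries the relevant snippet.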
Fact verification (FV) aims to assess the veracity of a claim based on relevant evidence. The traditional approach for automated FV includes a three-part pipeline relying on short evidence snippets and encoder-only inference models. More recent approaches leverage the multi-turn nature of LLMs to address FV as a step-by-step problem, where questions inquiring additional context are generated and answered until there is enough information to make a decision. This iterative method makes the verification process rational and explainable. While...