Julián Moreno Schneider

ORCID: 0000-0003-1418-9935
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Semantic Web and Ontologies
  • Artificial Intelligence in Law
  • Digital Humanities and Scholarship
  • Video Analysis and Summarization
  • Speech and Dialogue Systems
  • Hate Speech and Cyberbullying Detection
  • Flexible and Reconfigurable Manufacturing Systems
  • Scientific Computing and Data Management
  • Authorship Attribution and Profiling
  • Misinformation and Its Impacts
  • Web Data Mining and Analysis
  • Biomedical Text Mining and Ontologies
  • Digital Innovation in Industries
  • Speech Recognition and Synthesis
  • Image Processing and 3D Reconstruction
  • Text Readability and Simplification
  • Research Data Management Practices
  • European and International Law Studies
  • Advanced Database Systems and Queries
  • Spam and Phishing Detection
  • Mathematics, Computing, and Information Processing
  • Finance, Taxation, and Governance
  • Technology, Environment, Urban Planning

German Research Centre for Artificial Intelligence
2016-2024

Stuttgart University of Applied Sciences
2022

Deutsche Nationalbibliothek
2021

Allgemeine Unfallversicherungsanstalt
2021

Institut für Automatisierung und Informatik
2021

Universidad Católica San Antonio de Murcia
2020

Universidad Carlos III de Madrid
2010-2014

We present a system for the detection of the stance of headlines with regard to their corresponding article bodies. The approach can be applied in fake news, especially clickbait detection scenarios. The component is part of a larger platform for the curation of digital content; we consider veracity and relevancy an increasingly important part of curating online information. We want to contribute to the debate on how to deal with fake news and related online phenomena with technological means, by providing means to separate related from unrelated headlines and further classifying the related headlines. On...

10.18653/v1/w17-4215 article EN cc-by 2017-01-01

In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata. Building upon BERT, a deep neural language model, we demonstrate how to combine text representations with metadata and knowledge graph embeddings, which encode author information. Compared to the standard BERT approach we achieve considerably better results for the classification task. For a more coarse-grained classification using eight labels we achieve an F1-score of 87.20, while a more detailed classification using 343 labels yields an F1-score of 64.70. We make the source code...

10.48550/arxiv.1909.08402 preprint EN cc-by arXiv (Cornell University) 2019-01-01

Growing concerns about climate change and sustainability are driving manufacturers to take significant steps toward reducing their carbon footprints. For these manufacturers, a first step towards this goal is to identify the environmental impact of the individual components of their products. We propose a system leveraging large language models (LLMs) to automatically map components from manufacturer Bills of Materials (BOMs) to Life Cycle Assessment (LCA) database entries by using LLMs to expand on available component information....

10.48550/arxiv.2502.07418 preprint EN arXiv (Cornell University) 2025-02-11

We describe a dataset developed for Named Entity Recognition in German federal court decisions. It consists of approx. 67,000 sentences with over 2 million tokens. The resource contains 54,000 manually annotated entities, mapped to 19 fine-grained semantic classes: person, judge, lawyer, country, city, street, landscape, organization, company, institution, court, brand, law, ordinance, European legal norm, regulation, contract, court decision, and legal literature. The legal documents were, furthermore,...

10.48550/arxiv.2003.13016 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Georg Rehm, Stelios Piperidis, Kalina Bontcheva, Jan Hajic, Victoria Arranz, Andrejs Vasiļjevs, Gerhard Backfried, Jose Manuel Gomez-Perez, Ulrich Germann, Rémi Calizzano, Nils Feldhus, Stefanie Hegele, Florian Kintzel, Katrin Marheinecke, Julian Moreno-Schneider, Dimitris Galanis, Penny Labropoulou, Miltos Deligiannis, Katerina Gkirtzou, Athanasia Kolovou, Dimitris Gkoumas, Leon Voukoutis, Ian Roberts, Jana Hamrlova, Dusan Varis, Lukas Kacena, Khalid Choukri, Valérie Mapelli, Mickaël Rigault, Julija...

10.18653/v1/2021.eacl-demos.26 article EN cc-by 2021-01-01

We describe our submissions for SemEval-2017 Task 8, Determining Rumour Veracity and Support for Rumours. The Digital Curation Technologies (DKT) team at the German Research Center for Artificial Intelligence (DFKI) participated in two subtasks: Subtask A (determining the stance of a message) and Subtask B (determining the veracity of a message, closed variant). In both cases, our implementation consisted of a Multivariate Logistic Regression (Maximum Entropy) classifier coupled with hand-written patterns and rules (heuristics) applied in a post-process...

10.18653/v1/s17-2085 article EN cc-by Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017-01-01

Recommender systems assist legal professionals in finding relevant literature for supporting their case. Despite its importance for the profession, legal applications do not reflect the latest advances in recommender systems and representation learning research. Simultaneously, legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets. Thus, these studies have limited reproducibility. To address the gap between research and practice, we explore a set of state-of-the-art document representation methods...

10.1145/3462757.3466073 article EN 2021-06-21

We present a data set consisting of German news articles labeled for political bias on a five-point scale in a semi-supervised way. While earlier work on hyperpartisan news detection uses binary classification (i.e., hyperpartisan or not) and English data, we argue for a more fine-grained classification, covering the full political spectrum (i.e., far-left, left, centre, right, far-right) and for extending research to German data. Understanding political bias helps in accurately detecting hate speech and online abuse. We experiment with different classification methods for political bias detection. Their...

10.18653/v1/2021.woah-1.13 article EN cc-by 2021-01-01

The EU-funded project Lynx focuses on the creation of a knowledge graph for the legal domain (Legal Knowledge Graph, LKG) and its use for the semantic processing, analysis and enrichment of documents from the legal domain. This article describes the use cases covered in the project, the entire developed platform and the semantic analysis services that operate on the documents.

10.1016/j.is.2021.101966 article EN cc-by Information Systems 2021-12-06

In all domains and sectors, the demand for intelligent systems to support the processing and generation of digital content is rapidly increasing. The availability of vast amounts of content and the pressure to publish new content quickly and in rapid succession requires faster, more efficient and smarter processing and generation methods. With a consortium of ten partners from research and industry and a broad range of expertise in AI, Machine Learning and Language Technologies, the QURATOR project, funded by the German Federal Ministry of Education and Research, develops a sustainable and innovative...

10.48550/arxiv.2004.12195 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Georg Rehm, Julian Moreno Schneider, Peter Bourgonje, Ankit Srivastava, Jan Nehring, Armin Berger, Luca König, Sören Räuchle, Jens Gerth. Proceedings of the Events and Stories in News Workshop. 2017.

10.18653/v1/w17-2707 article DE 2017-01-01

We explore to what extent knowledge about the pre-trained language model that is used is beneficial for the task of abstractive summarization. To this end, we experiment with conditioning the encoder and decoder of a Transformer-based neural model on the BERT language model. In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size. We also explore how locality modelling, i.e., the explicit restriction of calculations to the local context, can affect the summarization ability of the Transformer. This is done...

10.48550/arxiv.2003.13027 preprint EN other-oa arXiv (Cornell University) 2020-01-01
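
The chunk-wise processing mentioned in the abstract above can be illustrated with a generic sliding-window split. This is only a minimal sketch under stated assumptions, not the authors' BERT-windowing implementation: the function name `split_into_windows`, the `stride` parameter, and the use of overlapping chunks are all hypothetical choices made for illustration.

```python
from typing import List


def split_into_windows(tokens: List[str], window_size: int, stride: int) -> List[List[str]]:
    """Split a token sequence into chunks of at most `window_size` tokens.

    Hypothetical helper: consecutive chunks start `stride` tokens apart, so
    they overlap whenever stride < window_size. The last chunk may be shorter.
    """
    if window_size <= 0 or stride <= 0 or stride > window_size:
        raise ValueError("require 0 < stride <= window_size")
    if not tokens:
        return []

    windows: List[List[str]] = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        # Stop once the current window reaches the end of the sequence.
        if start + window_size >= len(tokens):
            break
        start += stride
    return windows


if __name__ == "__main__":
    example = [f"tok{i}" for i in range(10)]
    for chunk in split_into_windows(example, window_size=4, stride=2):
        print(chunk)
```

Each chunk could then be passed to an encoder with a fixed input limit; how the per-chunk outputs are combined is not described in the truncated abstract and is left open here.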

With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five different levels (of increasing complexity) of platform interoperability that we suggest to implement in a wider federation of AI/LT platforms. We illustrate the approach using the five emerging AI/LT platforms AI4EU, ELG, Lynx, QURATOR and SPEAKER.

10.48550/arxiv.2004.08355 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We present a prototypical content curation dashboard, to be used in the newsroom, and several of its underlying semantic content analysis components (such as named entity recognition, entity linking, summarisation and temporal expression analysis). The idea is to enable journalists (a) to process incoming content (agency reports, twitter feeds, blog posts, social media etc.) and (b) to create new articles more easily and efficiently. The prototype system also allows the automatic annotation of events in incoming content for the purpose of supporting journalists in identifying important,...

10.18653/v1/w17-4212 article EN cc-by 2017-01-01

Legal technology is currently receiving a lot of attention from various angles. In this contribution we describe the main technical components of a system that is currently under development in the European innovation project Lynx, which includes partners from industry and research. The key contribution of this paper is a workflow manager that enables the flexible orchestration of workflows based on a portfolio of Natural Language Processing and Content Curation services as well as a Multilingual Legal Knowledge Graph that contains semantic information and meaningful references to...

10.48550/arxiv.2003.12900 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Georg Rehm, Julián Moreno-Schneider, Jorge Gracia, Artem Revenko, Victor Mireles, Maria Khvalchik, Ilan Kernerman, Andis Lagzdins, Marcis Pinnis, Artus Vasilevskis, Elena Leitner, Jan Milde, Pia Weißenhorn. Proceedings of the Natural Legal Language Processing Workshop 2019.

10.18653/v1/w19-2207 article EN 2019-01-01