- Advanced Graph Neural Networks
- Topic Modeling
- Machine Learning in Materials Science
- Data Quality and Management
- Semantic Web and Ontologies
- Cognitive Science and Mapping
- Complex Network Analysis Techniques
- X-ray Diffraction in Crystallography
- Simulation Techniques and Applications
- Bayesian Modeling and Causal Inference
- Digital Rights Management and Security
- Electron and X-Ray Spectroscopy Techniques
- Scientific Computing and Data Management
- Service-Oriented Architecture and Web Services
- Web Data Mining and Analysis
- Advanced Database Systems and Queries
- Machine Learning in Bioinformatics
- Molecular spectroscopy and chirality
- Library Science and Information Systems
- Computational Drug Discovery Methods
- Machine Learning and Algorithms
- Data Mining Algorithms and Applications
- Embedded Systems Design Techniques
- Crystallization and Solubility Studies
- History and advancements in chemistry
Intel (United States)
2023-2024
McGill University
2021-2022
Centre Universitaire de Mila
2022
Technische Universität Dresden
2020
ITMO University
2015-2018
University of Bonn
2015-2018
Fraunhofer Institute for Intelligent Analysis and Information Systems
2018
The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. To assess the reproducibility of previously published results, we re-implemented and evaluated 21 models in the PyKEEN software package. In this paper, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, as well as provide insight as to why this might be the case. We then performed a large-scale benchmarking on four...
Recently, transformer architectures for graphs emerged as an alternative to established techniques for machine learning with graphs, such as (message-passing) graph neural networks. So far, they have shown promising empirical results, e.g., on molecular prediction datasets, often attributed to their ability to circumvent graph neural networks' shortcomings, such as over-smoothing and over-squashing. Here, we derive a taxonomy of graph transformer architectures, bringing some order to this emerging field. We overview their theoretical properties,...
Generating novel crystalline materials has the potential to lead to advancements in fields such as electronics, energy storage, and catalysis. The defining characteristic of crystals is their symmetry, which plays a central role in determining their physical properties. However, existing crystal generation methods either fail to generate materials that display the symmetries of real-world crystals, or simply replicate the symmetry information from examples in a database. To address this limitation, we propose SymmCD,...
While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to...
The link prediction task on knowledge graphs without explicit negative triples in the training data motivates the usage of rank-based metrics. Here, we review existing rank-based metrics and propose desiderata for improved metrics to address the lack of interpretability and comparability of existing metrics across datasets of different sizes and properties. We introduce a simple theoretical framework for rank-based metrics upon which we investigate two avenues for improvements to existing metrics via alternative aggregation functions and concepts from probability theory. We finally propose several new rank-based metrics that are more easily...
Semantic computing and enterprise Linked Data have recently gained traction in enterprises. Although the concept of Enterprise Knowledge Graphs (EKGs) has meanwhile received some attention, a formal conceptual framework for designing such graphs has not yet been developed. By EKG we refer to a semantic network of concepts, properties, individuals and links representing and referencing foundational and domain knowledge relevant for an enterprise. Through the efforts reported in this paper, we aim to bridge the gap between the increasing...
Artificial intelligence and machine learning have shown great promise in their ability to accelerate novel materials discovery. As researchers and domain scientists seek to unify and consolidate chemical knowledge, the case for models with potential to generalize across different tasks within materials science - so-called "foundation models" - grows with ambitions. This manuscript reviews our recent progress with the development of Open MatSci ML Toolkit, and details experiments that lay the groundwork for foundation model research and development with our framework....
Foundation models that can perform inference on any new task without requiring specific training have revolutionized machine learning in vision and language applications. However, applications involving graph-structured data remain a tough nut for foundation models, due to challenges in the unique feature and label spaces associated with each graph. Traditional graph ML models such as graph neural networks (GNNs) trained on graphs cannot perform inference on a new graph with feature and label spaces different from the training ones. Furthermore, existing models learn functions...
We propose MatSci ML, a novel benchmark for modeling MATerials SCIence using Machine Learning (MatSci ML) methods focused on solid-state materials with periodic crystal structures. Applying machine learning methods to solid-state materials is a nascent field with substantial fragmentation largely driven by the great variety of datasets used to develop machine learning models. This fragmentation makes comparing the performance and generalizability of different methods difficult, thereby hindering overall research progress in the field. Building on top of open-source datasets, including...
For many years, link prediction on knowledge graphs has been a purely transductive task, not allowing for reasoning on unseen entities. Recently, increasing efforts are put into exploring semi- and fully inductive scenarios, enabling inference over unseen and emerging entities. Still, all these approaches only consider triple-based KGs, whereas their richer counterparts, hyper-relational KGs (e.g., Wikidata), have not yet been properly studied. In this work, we classify different inductive settings and study the benefits of employing...
Foundation models in language and vision have the ability to run inference on any textual and visual inputs thanks to transferable representations such as a vocabulary of tokens in language. Knowledge graphs (KGs) have different entity and relation vocabularies that generally do not overlap. The key challenge of designing foundation models on KGs is to learn such transferable representations that enable inference on any graph with arbitrary entity and relation vocabularies. In this work, we make a step towards such foundation models and present ULTRA, an approach for learning universal and transferable graph representations. ULTRA builds relational...
Recent equivariant models have shown significant progress in not just chemical property prediction, but as surrogates for dynamical simulations of molecules and materials. Many of the top performing models in this category are built within the framework of tensor products, which preserves equivariance by restricting interactions and transformations to those that are allowed by symmetry selection rules. Despite being a core part of the modeling process, there has not yet been much attention into understanding what information...
Formulating and answering logical queries is a standard communication interface for knowledge graphs (KGs). Alleviating the notorious incompleteness of real-world KGs, neural methods achieved impressive results in link prediction and complex query answering tasks by learning representations of entities, relations, and queries. Still, most existing query answering methods rely on transductive entity embeddings and cannot generalize to KGs containing new entities without retraining the entity embeddings. In this work, we study the inductive query answering task where...
The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend toward open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a plethora of unstructured data on the Web which we assume contains semantics. For this reason, we propose an approach to derive semantics from web tables, which are still the most popular publishing tool on the Web. The paper also discusses methods and services for unstructured data extraction and processing, as well as machine learning techniques to enhance such a workflow. The eventual result is a framework to process,...
The amount of Linked Data, both open, made available on the Web, and private, exchanged across companies and organizations, has been increasing in recent years. This data can be distributed in the form of Knowledge Graphs (KGs), but maintaining these KGs is mainly the responsibility of data owners or providers. Moreover, building applications on top of KGs in order to provide, for instance, analytics, data access control, and privacy is left to the end user or data consumers. However, many resources in terms of development costs and equipment are required by...
Compositional generalization, the ability of an agent to generalize to unseen combinations of latent factors, is easy for humans but hard for deep neural networks. A line of research in cognitive science has hypothesized a process, "iterated learning," to help explain how human language developed this ability; the theory rests on simultaneous pressures towards compressibility (when an ignorant agent learns from an informed one) and expressivity (when it uses the representation for downstream tasks). Inspired by this process, we propose to improve...