NFDI4DS | UHH-SEMS - Publication Details

Oktie Hassanzadeh

ORCID: 0000-0001-5307-9857

About

Contact & Profiles

OpenAlex author ID: A5068065546

Research Areas

Semantic Web and Ontologies
Data Quality and Management
Biomedical Text Mining and Ontologies
Topic Modeling
Natural Language Processing Techniques
Advanced Database Systems and Queries
Advanced Graph Neural Networks
Service-Oriented Architecture and Web Services
Advanced Text Analysis Techniques
Web Data Mining and Analysis
Computational Drug Discovery Methods
Multi-Agent Systems and Negotiation
Wikis in Education and Collaboration
AI-based Problem Solving and Planning
Scientific Computing and Data Management
Data Management and Algorithms
Graph Theory and Algorithms
Species Distribution and Climate Change
Privacy-Preserving Technologies in Data
Bioinformatics and Genomic Networks
Bayesian Modeling and Causal Inference
Text and Document Classification Technologies
Data Mining Algorithms and Applications
Pharmacovigilance and Adverse Drug Reactions
Data Visualization and Analytics

Alliance for Safe Kids
2024

IBM Research - Thomas J. Watson Research Center
2013-2023

IBM (United States)
2012-2023

University of Toronto
2007-2015

Sharif University of Technology
2005

Framework for evaluating clustering algorithms in duplicate detection

OPENALEX - Publications

Oktie Hassanzadeh, Fei Chiang, Hyun‐Chul Lee, Renée J. Miller

The presence of duplicate records is a major data quality concern in large databases. To detect duplicates, entity resolution, also known as duplication detection or record linkage, is used as part of the data cleaning process to identify records that potentially refer to the same real-world entity. We present the Stringer system, which provides an evaluation framework for understanding what barriers remain towards the goal of truly scalable and general-purpose duplication detection algorithms. In this paper, we use Stringer to evaluate clusters (groups of potential...

10.14778/1687627.1687771 article EN Proceedings of the VLDB Endowment 2009-08-01

Linked open drug data for pharmaceutical research and development

Matthias Samwald, Anja Jentzsch, Christopher M. L. S. Bouton, Claus Stie Kallesøe, Egon Willighagen, and 6 more

There is an abundance of information about drugs available on the Web. Data sources range from medicinal chemistry results, over the impact on gene expression, to outcomes in clinical trials. These data are typically not connected together, which reduces the ease with which insights can be gained. Linking Open Drug Data (LODD) is a task force within the World Wide Web Consortium's (W3C) Health Care and Life Sciences Interest Group (HCLS IG). LODD has surveyed publicly available data about drugs, created Linked Data representations of the data sets,...

10.1186/1758-2946-3-19 article EN cc-by Journal of Cheminformatics 2011-05-16

Toward a complete dataset of drug–drug interaction information from publicly available sources

Serkan Ayvaz, John R. Horn, Oktie Hassanzadeh, Qian Zhu, Johann Stan, and 7 more

Although potential drug–drug interactions (PDDIs) are a significant source of preventable drug-related harm, there is currently no single complete source of PDDI information. In the current study, all publicly available sources of PDDI information that could be identified using a comprehensive and broad search were combined into a single dataset. The combined dataset merged fourteen different sources, including 5 clinically-oriented information sources, 4 Natural Language Processing (NLP) corpora, and 5 Bioinformatics/Pharmacovigilance information sources. As...

10.1016/j.jbi.2015.04.006 article EN cc-by-nc-nd Journal of Biomedical Informatics 2015-04-25

Large-scale structural and textual similarity-based mining of knowledge graph to predict drug–drug interactions

Ibrahim Abdelaziz, Achille Fokoue, Oktie Hassanzadeh, Ping Zhang, Mohammad Sadoghi

10.1016/j.websem.2017.06.002 article EN Journal of Web Semantics 2017-05-01

Schema management for document stores

Lanjun Wang, Shuo Zhang, Juwei Shi, Limei Jiao, Oktie Hassanzadeh, and 2 more

Document stores that provide the efficiency of a schema-less interface are widely used by developers in mobile and cloud applications. However, the simplicity achieved controversially leads to complexity for data management due to the lack of a schema. In this paper, we present a schema management framework for document stores. This framework discovers and persists schemas of JSON records in a repository, and also supports queries and schema summarization. The major technical challenge comes from the varied structures caused by model evolution. In the discovery phase, we apply...

10.14778/2777598.2777601 article EN Proceedings of the VLDB Endowment 2015-05-01

Answering Binary Causal Questions Through Large-Scale Text Mining: An Evaluation Using Cause-Effect Pairs from Human Experts

Oktie Hassanzadeh, Debarun Bhattacharjya, Mark Feblowitz, Kavitha Srinivas, Michael Perrone, and 2 more

In this paper, we study the problem of answering questions of the type "Could X cause Y?" where X and Y are general phrases without any constraints. Answering such questions will assist with various decision analysis tasks, such as verifying and extending presumed causal associations used for decision making. Our goal is to analyze the ability of an AI agent built using state-of-the-art unsupervised methods in answering causal questions derived from collections of cause-effect pairs from human experts. We focus only on unsupervised and weakly supervised methods due to the difficulty of creating a large...

10.24963/ijcai.2019/695 article EN 2019-07-28

Benchmarking declarative approximate selection predicates

Amit Singh Chandel, Oktie Hassanzadeh, Nick Koudas, Mohammad Sadoghi, Divesh Srivastava

Declarative data quality has been an active research topic. The fundamental principle behind a declarative approach to data quality is the use of declarative statements to realize data quality primitives on top of any relational data source. A primary advantage of such an approach is the ease of use and integration with existing applications. Over the last few years, several similarity predicates have been proposed for common quality primitives (approximate selections, joins, etc.) and have been fully expressed using declarative SQL statements. In this paper we propose new similarity predicates along with their declarative realization, based on notions...

10.1145/1247480.1247521 article EN 2007-06-11

Creating probabilistic databases from duplicated data

Oktie Hassanzadeh, Renée J. Miller

10.1007/s00778-009-0161-2 article EN The VLDB Journal 2009-08-19

LinkedCT: A Linked Data Space for Clinical Trials

Oktie Hassanzadeh, Anastasios Kementsietsidis, Lipyeow Lim, Renée J. Miller, Min Wang

The Linked Clinical Trials (LinkedCT) project aims at publishing the first open semantic web data source for clinical trials data. The database exposed by LinkedCT is generated by (1) transforming existing data sources of clinical trials into RDF, and (2) discovering semantic links between the records in the trials data and several other data sources. In this paper, we discuss the challenges involved in these two steps and present the methodology used in LinkedCT to overcome these challenges. Our approach to semantic link discovery involves using state-of-the-art approximate string matching...

10.48550/arxiv.0908.0567 preprint EN other-oa arXiv (Cornell University) 2009-01-01

A framework for semantic link discovery over relational data

Oktie Hassanzadeh, Anastasios Kementsietsidis, Lipyeow Lim, Renée J. Miller, Min Wang

Discovering links between different data items in a single data source or across different data sources is a challenging problem faced by many information systems today. In particular, the recent Linking Open Data (LOD) community project has highlighted the paramount importance of establishing semantic links among web data sources. Currently, LOD sources provide billions of RDF triples, but only millions of links between data sources. Many of these data sources are published using tools that operate over relational data stored in a standard RDBMS. In this paper, we present a framework for discovery...

10.1145/1645953.1646084 article EN 2009-11-02

Discovering linkage points over web data

Oktie Hassanzadeh, Ken Q. Pu, Soheil Hassas Yeganeh, Renée J. Miller, Lucian Popa, and 2 more

A basic step in data integration is the identification of linkage points, i.e., finding attributes that are shared (or related) between data sources, and that can be used to match records or entities across sources. This is usually performed using an operator that associates the attributes of one database with those of another. However, the massive growth in the amount and variety of unstructured and semi-structured data on the Web has created new challenges for this task. Such data sources often do not have a fixed pre-defined schema and contain large numbers of diverse attributes....

10.14778/2536336.2536345 article EN Proceedings of the VLDB Endowment 2013-04-01

A declarative framework for semantic link discovery over relational data

Oktie Hassanzadeh, Lipyeow Lim, Anastasios Kementsietsidis, Min Wang

In this paper, we present a framework for online discovery of semantic links from relational data. Our framework is based on a declarative specification of the linkage requirements by the user, which allows matching data items in many real-world scenarios. These requirements are translated to queries that can run over the relational data source, potentially using semantic knowledge to enhance the accuracy of link discovery. Our framework lets data publishers easily find and publish high-quality links to other data sources, and therefore could significantly enhance the value of the data in the next generation of the web.

10.1145/1526709.1526876 article EN 2009-04-20

An analysis of one-to-one matching algorithms for entity resolution

George Papadakis, Vasilis Efthymiou, Emmanouil Thanos, Oktie Hassanzadeh, Peter Christen

Entity resolution (ER) is the task of finding records that refer to the same real-world entities. A common scenario, which we refer to as Clean-Clean ER, is to resolve records across two clean sources (i.e., they are duplicate-free and contain one record per entity). Matching algorithms for Clean-Clean ER yield bipartite graphs, which are further processed by clustering algorithms to produce the end result. In this paper, we perform an extensive empirical evaluation of eight bipartite graph matching algorithms that take as input a bipartite similarity graph and provide as output a set of matched records. We...

10.1007/s00778-023-00791-3 article EN cc-by The VLDB Journal 2023-04-18
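
To illustrate the kind of one-to-one matching step described in the abstract above, the following is a minimal Python sketch of a greedy heuristic over a bipartite similarity graph. It is an assumption-laden illustration, not one of the eight algorithms evaluated in the paper: the function name `greedy_one_to_one`, the edge format, the record identifiers, and the similarity threshold are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical edge format: (left record id, right record id, similarity score)
Edge = Tuple[str, str, float]


def greedy_one_to_one(edges: List[Edge], threshold: float = 0.5) -> Dict[str, str]:
    """Greedily keep the most similar edges so each record is matched at most once."""
    matches: Dict[str, str] = {}
    used_right = set()
    # Visit candidate pairs from most to least similar.
    for left, right, sim in sorted(edges, key=lambda e: e[2], reverse=True):
        if sim < threshold:
            break  # all remaining edges fall below the cut-off
        if left in matches or right in used_right:
            continue  # one endpoint is already matched
        matches[left] = right
        used_right.add(right)
    return matches


# Illustrative data only: two clean sources with records a* and b*.
edges = [
    ("a1", "b1", 0.9),
    ("a1", "b2", 0.8),
    ("a2", "b1", 0.85),
    ("a2", "b2", 0.4),
    ("a3", "b3", 0.7),
]
matches = greedy_one_to_one(edges, threshold=0.5)
print(matches)
```

Because both sources are assumed duplicate-free, each record appears in at most one matched pair. A greedy pass like this is simple but can miss the globally best assignment, which is one reason an empirical comparison of several matching algorithms is informative.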