- Advanced Database Systems and Queries
- Semantic Web and Ontologies
- Data Management and Algorithms
- Biomedical Text Mining and Ontologies
- Web Data Mining and Analysis
- Distributed Systems and Fault Tolerance
- Complex Network Analysis Techniques
- Distributed and Parallel Computing Systems
- Caching and Content Delivery
- Bioinformatics and Genomic Networks
- Cloud Computing and Resource Management
- Scientific Computing and Data Management
- Data Quality and Management
- Genetics, Bioinformatics, and Biomedical Research
- Service-Oriented Architecture and Web Services
- Peer-to-Peer Network Technologies
- Data Mining Algorithms and Applications
- Stock Market Forecasting Methods
- Algorithms and Data Compression
- Gene Expression and Cancer Classification
- Banking Stability, Regulation, and Efficiency
- Scheduling and Optimization Algorithms
- Advanced Data Storage Technologies
- Advanced Text Analysis Techniques
- Opinion Dynamics and Social Influence
University of Maryland, College Park
2015-2024
Simón Bolívar University
2015
Technion – Israel Institute of Technology
2009
Williams (United States)
2008
Arizona State University
2007
National Center for Supercomputing Applications
2005
Menlo School
2002
SRI International
2002
Research Institute for Advanced Computer Science
2000
Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable sources. Database administrators must deal with incorporating new sources into the model. Database implementers must deal with translating queries between query languages and schemas. The Distributed Information Search COmponent (Disco) addresses these problems. Query processing semantics are developed to process queries over sources which do not return answers. Data modeling techniques manage...
Accessing many data sources aggravates problems for users of heterogeneous distributed databases. Database administrators must deal with fragile mediators, that is, mediators whose schemas and views must be significantly changed to incorporate a new source. When implementing translators of queries from mediators to sources, database implementers must deal with sources that do not support all the functionality required by mediators. Application programmers must deal with graceless failures for unavailable sources: queries simply return failure with no further...
Topic detection over large and noisy data collections such as social media must address both scalability and accuracy challenges. KeyGraph is an efficient method that improves on current solutions by considering keyword co-occurrence. We show that it has similar accuracy when compared to state-of-the-art approaches on small, well-annotated collections, and that it can successfully filter irrelevant documents and identify events in large, noisy collections. An extensive evaluation using Amazon’s Mechanical Turk demonstrated the increased high...
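The keyword co-occurrence idea behind KeyGraph can be illustrated with a minimal sketch (an illustration only, not the authors' implementation; the naive tokenization and the support threshold are assumptions):

```python
from collections import Counter
from itertools import combinations

def cooccurrence_graph(docs, min_count=2):
    """Build a keyword co-occurrence graph: nodes are terms and an edge
    links two terms that appear together in at least min_count documents."""
    pair_counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))  # naive tokenization
        pair_counts.update(combinations(terms, 2))
    # keep only edges with enough support; noisy one-off pairs are dropped
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

docs = [
    "earthquake relief effort",
    "earthquake relief donations",
    "stock market earnings",
]
graph = cooccurrence_graph(docs)
# only the repeated pair ("earthquake", "relief") survives the cutoff
```

Roughly, dense regions of such a graph then act as candidate topics, and documents with few edges into any dense region can be filtered as irrelevant.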
Drug-target interaction studies are important because they can predict drugs' unexpected therapeutic or adverse side effects. In silico predictions of potential interactions are valuable and can focus effort on in vitro experiments. We propose a prediction framework that represents the problem using a bipartite drug-target graph, augmented with drug-drug and target-target similarity measures, and makes predictions using probabilistic soft logic (PSL). Using rules in PSL, we build models based on triad and tetrad structures. We apply (blocking)...
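The triad intuition, that a drug similar to one with a known target is likely to share that target, can be sketched outside of PSL as a simple similarity-propagation score (the function name and the max-aggregation choice are illustrative assumptions, not the paper's PSL model):

```python
def triad_scores(interactions, drug_sim):
    """Triad rule: if drug d2 interacts with target t and drug d1 is
    similar to d2, then (d1, t) receives evidence equal to that
    similarity (aggregated here with max)."""
    scores = {}
    for (d1, d2), s in drug_sim.items():
        for (d, t) in interactions:
            if d == d2:  # d2 interacts with t, so its neighbor d1 may too
                scores[(d1, t)] = max(scores.get((d1, t), 0.0), s)
            if d == d1:  # symmetric case
                scores[(d2, t)] = max(scores.get((d2, t), 0.0), s)
    # report only unobserved candidate interactions
    return {p: s for p, s in scores.items() if p not in interactions}

interactions = {("aspirin", "COX1")}        # known drug-target edge (toy data)
drug_sim = {("aspirin", "ibuprofen"): 0.8}  # drug-drug similarity (toy data)
candidates = triad_scores(interactions, drug_sim)
```

PSL generalizes this by treating each such rule as a weighted soft constraint, so triad and tetrad evidence can be combined and learned jointly rather than aggregated by a fixed max.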
Large scale disasters bring together a diversity of organizations and produce massive amounts of heterogeneous data that must be managed by these organizations. The lack of effective ICT solutions can lead to coordination chaos among organizations as they track victims' needs and respond to the disaster. The result can be a delayed or ineffective response, potential wastage of pledged support, imbalances in aid distribution, and a lack of transparency. Managing this data can potentially improve the efficiency and effectiveness of the response. Sahana is a free and open source...
It has been widely recognized that many future database applications, including engineering processes, manufacturing, and communications, will require some kind of rule based reasoning. In this paper we study methods for storing and manipulating large rule bases using relational database management systems. First, we provide a matching algorithm which can be used to efficiently identify applicable rules. The second contribution of the paper is our proposal of concurrent execution strategies that surpass, in terms of performance,...
There is an increase in the number of data sources that can be queried across the WWW. Such sources typically support HTML forms-based interfaces and search engines that query collections of suitably indexed data, with results displayed via a browser. One drawback to these sources is that there is no standard programming interface suitable for applications to submit queries. Second, the output (the answer to a query) is not well structured: structured objects must be extracted from documents which contain irrelevant data and may be volatile. Third, domain knowledge about...
The Distributed Information Search COmponent (DISCO) is a prototype heterogeneous distributed database that accesses underlying data sources. DISCO currently focuses on three central research problems in the context of these systems. First, since the capabilities of each source are different, transforming queries into subqueries is difficult; we call this the weak problem. Second, each wrapper performs operations in a generally unique way, and the cost of performing an operation may vary radically from one wrapper to another....
We consider an architecture of mediators and wrappers for Internet accessible WebSources with limited query capability. Each call to a source is a WebSource Implementation (WSI), and it is associated with both a capability and a (possibly dynamic) cost. The multiplicity of WSIs with varying costs and capabilities increases the complexity for a traditional optimizer, which must assign a WSI to each remote relation while generating an (optimal) plan. We present a two-phase Web Query Optimizer (WQO). In the pre-optimization phase, the WQO selects one or...
We consider Cooperative Information Systems (CIS) that are multidatabase systems (MDBMS) with a common object-oriented model, based on the ODMG standard, together with local databases that may be relational, object-oriented, or dedicated data servers. The MDBMS interface (or mediator interface) that describes this CIS can be different from the union of the interfaces that describe each database. In particular, it is defined by semantic knowledge that includes views over particular databases, integrity constraints, and knowledge about...
The DARPA Intelligent Integration of Information (I3) effort is based on the assumption that systems can easily exchange data. However, as a consequence of rapid development, research, and prototype implementations in this area, the initial outcome of the program appears to have been to produce a new set of systems. While they perform certain advanced information integration tasks, they cannot communicate with each other. With a view to understanding and solving this problem, there was a group discussion at the Information/Persistent...
Presents a technique for semantic query optimization (SQO) for object databases. We use the ODMG-93 (Object Data Management Group) standard ODL (Object Definition Language) and OQL (Object Query Language) languages. The schema and queries are translated into a DATALOG representation. Semantic knowledge about the object model and the particular application is expressed as integrity constraints; this is an extension of the standard. SQO is performed in the DATALOG representation, and an equivalent logic query is (subsequently) obtained. The technique is based on the residue method of Chakravarthy et al. (1990). We show that our...
The need for supply chain integration (SCI) methodologies has been increasing as a consequence of the globalization of production and sales and the advancement of enabling information technologies. In this paper, we describe our experience with implementing and modeling SCIs. We present an architecture of software components and a prototype implementation. We then discuss a variety of sharing methodologies. Then, within the framework of a multi-echelon process model spanning multiple organizations, we summarize research on the benefits...
Wide area data delivery requires timely propagation of up-to-date information to thousands of clients over a wide area network. Applications include web caching, RSS source monitoring, and email access via mobile devices. Data sources vary widely in their update patterns: they may experience different update rates at different times, or unexpected changes to their patterns. Traditional solutions are either push-based, in which servers push updates to clients, or pull-based, which require clients to check for updates at servers. While push-based solutions ensure timely delivery, they are not always...
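A pull-based monitor of this kind typically adapts its polling interval to the source's observed update rate. A toy heuristic (the scaling rule, parameter names, and bounds are assumptions for illustration, not the paper's algorithm) might look like:

```python
def next_poll_interval(observed_gaps, freshness=0.9, lo=1.0, hi=3600.0):
    """Choose the next pull interval from the average gap between
    observed source updates: fast-changing sources are polled more
    often, subject to [lo, hi] bounds in seconds."""
    if not observed_gaps:
        return hi  # no update history yet: poll lazily
    avg_gap = sum(observed_gaps) / len(observed_gaps)
    interval = avg_gap * (1.0 - freshness)  # tighter target => shorter interval
    return min(max(interval, lo), hi)

# a feed updating roughly every 100 s is polled about every 10 s
interval = next_poll_interval([100.0, 100.0, 100.0])
```

The difficulty the abstract points to is precisely that `observed_gaps` is non-stationary: a fixed schedule derived from history wastes polls when the source slows down and misses updates when it speeds up.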
Authority flow is an effective ranking mechanism for answering queries on a broad class of data. Systems have been developed to apply this principle to the Web (PageRank and topic-sensitive PageRank), to bibliographic databases (ObjectRank), and to biological databases (the Hubs of Knowledge project). However, these systems have the following drawbacks: (a) there is no way to explain to the user why a particular result received its current score; (b) the authority transfer rates, which have been shown to dramatically affect the quality of the results in ObjectRank, must be set...
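Authority flow ranking in the PageRank/ObjectRank family can be sketched as a power iteration in which each node distributes its score along out-edges, weighted by per-edge authority transfer rates (the toy graph, default rates, and parameter choices below are illustrative assumptions):

```python
def authority_flow(out_links, transfer, damping=0.85, iters=100):
    """Power iteration over an authority-flow graph: each node splits
    damping * score across its out-links, weighted by per-edge
    transfer rates (uniform when no rate is given)."""
    nodes = list(out_links)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n, links in out_links.items():
            total = sum(transfer.get((n, m), 1.0) for m in links)
            for m in links:
                share = transfer.get((n, m), 1.0) / total
                nxt[m] += damping * score[n] * share
        score = nxt
    return score

# tiny citation-style graph: both "a" and "b" point to "c"
score = authority_flow({"a": ["c"], "b": ["c"], "c": ["a"]}, transfer={})
```

Drawback (b) in the abstract corresponds to the `transfer` map here: changing a single edge rate reweights how much score flows along that edge, which can reorder the final ranking.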