NFDI4DS | UHH-SEMS - Publication Details

Val Tannen

ORCID: 0009-0008-6847-7274

About

Contact & Profiles

OpenAlex author ID: A5036528289

Research Areas

Advanced Database Systems and Queries
Scientific Computing and Data Management
Semantic Web and Ontologies
Data Management and Algorithms
Distributed and Parallel Computing Systems
Logic, Reasoning, and Knowledge
Research Data Management Practices
Logic, programming, and type systems
Service-Oriented Architecture and Web Services
Data Quality and Management
Distributed systems and fault tolerance
Advanced Data Storage Technologies
Advanced Algebra and Logic
Formal Methods in Verification
Business Process Modeling and Analysis
Data Mining Algorithms and Applications
Bayesian Modeling and Causal Inference
Parallel Computing and Optimization Techniques
Peer-to-Peer Network Technologies
Optimization and Search Problems
Algorithms and Data Compression
Cloud Computing and Resource Management
Genomics and Phylogenetic Studies
Genetics, Bioinformatics, and Biomedical Research
Mobile Agent-Based Network Management

University of Pennsylvania
2015-2024

California University of Pennsylvania
2006-2024

Pennsylvania State University
2024

Philadelphia University
1994-2023

Provenance semirings

OPENALEX - Publications

Todd J. Green Grigoris Karvounarakis Val Tannen

We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.

10.1145/1265530.1265535 article EN 2007-06-11
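
As an illustration of the idea in the abstract above, the following is a minimal sketch (not the authors' implementation) of evaluating one positive relational query over relations whose tuples carry annotations from a semiring. The names `Semiring`, `BAG`, `BOOL`, `join`, `project` and `query` are hypothetical and chosen for this example only; the join is simplified to match the last column of one relation against the first column of the other.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, add, mul, zero, one). Hypothetical helper type."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Bag semantics: annotations are multiplicities in the natural numbers.
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# Set semantics: annotations are booleans (tuple present or absent).
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)

# An annotated relation maps each tuple to its semiring annotation.
Relation = Dict[Tuple, Any]


def join(sr: Semiring, r: Relation, s: Relation) -> Relation:
    """Join r and s where r's last column equals s's first column.

    Annotations of joined tuples are combined with the semiring product.
    """
    out: Relation = {}
    for t, kt in r.items():
        for u, ku in s.items():
            if t[-1] == u[0]:
                key = t + u[1:]
                out[key] = sr.add(out.get(key, sr.zero), sr.mul(kt, ku))
    return out


def project(sr: Semiring, r: Relation, cols: Sequence[int]) -> Relation:
    """Project onto the given columns.

    Tuples that collapse to the same result are combined with the semiring sum.
    """
    out: Relation = {}
    for t, k in r.items():
        key = tuple(t[c] for c in cols)
        out[key] = sr.add(out.get(key, sr.zero), k)
    return out


def query(sr: Semiring, r: Relation, s: Relation) -> Relation:
    """The same query text for every semiring: project(join(r, s)) on columns 0 and 2."""
    return project(sr, join(sr, r, s), (0, 2))


# Example data (invented for illustration).
r_bag = {("a", "b"): 2, ("a", "c"): 1}
s_bag = {("b", "d"): 3, ("c", "d"): 1}
r_set = {t: True for t in r_bag}
s_set = {t: True for t in s_bag}

bag_result = query(BAG, r_bag, s_bag)
set_result = query(BOOL, r_set, s_set)
```

The point of the sketch is that `query` is written once; only the semiring passed in changes, so bag multiplicities and set membership come out of the same evaluation code.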

The iPlant Collaborative: Cyberinfrastructure for Plant Biology

OPENALEX - Publications

Stephen A. Goff Matthew Vaughn Sheldon McKay Eric Lyons Ann E. Stapleton and 60 more

The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting...

10.3389/fpls.2011.00034 article EN cc-by Frontiers in Plant Science 2011-01-01

Principles of programming with complex objects and collection types

OPENALEX - Publications

Peter Buneman Shamim A. Naqvi Val Tannen Limsoon Wong

We present a new principle for the development of database query languages: that the primitive operations should be organized around types. Viewing a relational database as consisting of sets of records, this principle dictates that we should investigate separately the operations for records and the operations for sets. There are two immediate advantages of this approach, which is partly inspired by basic ideas from category theory. First, it provides a language for structures in which record and set types may be freely combined: nested relations or complex objects. Second, the fundamental operations for sets are closely...

10.1016/0304-3975(95)00024-q article EN cc-by-nc-nd Theoretical Computer Science 1995-09-01

Comprehension syntax

OPENALEX - Publications

Peter Buneman Leonid Libkin Dan Suciu Val Tannen Limsoon Wong

The syntax of comprehensions is very close to the syntax of a number of practical database query languages and is, we believe, a better starting point than first-order logic for the development of database languages. We give an informal account of a language based on comprehension syntax that deals uniformly with a variety of collection types; it also includes pattern matching, variant types and function definition. We show, again informally, how comprehension syntax is a natural fragment of structural recursion, a much more powerful programming paradigm for collection types. We also show that a small...

10.1145/181550.181564 article EN ACM SIGMOD Record 1994-03-01

K2/Kleisli and GUS: Experiments in integrated access to genomic data sources

OPENALEX - Publications

Susan B. Davidson Jonathan Crabtree Brian P. Brunk Jonathan Schug Val Tannen and 2 more

The integrated access to heterogeneous data sources is a major challenge for the biomedical community. Several solution strategies have been explored: link-driven federation of databases, view integration, and warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: K2, a view integration implementation, and GUS, a data warehouse. Although the view integration and the warehouse approaches each have advantages, there is no clear "winner." Therefore, in selecting the best strategy...

10.1147/sj.402.0512 article EN IBM Systems Journal 2001-01-01

Querying data provenance

OPENALEX - Publications

Grigoris Karvounarakis Zachary G. Ives Val Tannen

Many advanced data management operations (e.g., incremental maintenance, trust assessment, debugging schema mappings, keyword search over databases, or query answering in probabilistic databases) involve computations that look at how a tuple was produced, e.g., to determine its score or existence. This requires answers to queries such as, "Is this data derivable from trusted tuples?"; "What tuples are derived from this relation?"; or "What score should this answer receive, given initial scores of the base tuples?". Such questions...

10.1145/1807167.1807269 article EN 2010-06-06

Provenance for aggregate queries

OPENALEX - Publications

Yael Amsterdamer Daniel Deutch Val Tannen

We study in this paper provenance information for queries with aggregation. Provenance information was studied in the context of various query languages that do not allow for aggregation, and recent work has suggested to capture provenance by annotating the different database tuples with elements of a commutative semiring and propagating the annotations through query evaluation. We show that aggregate queries pose novel challenges rendering this approach inapplicable. Consequently, we propose a new approach, where we annotate with provenance information not just tuples but also the individual values within tuples,...

10.1145/1989284.1989302 article EN 2011-06-13

Putting lipstick on pig

OPENALEX - Publications

Yael Amsterdamer Susan B. Davidson Daniel Deutch Tova Milo Julia Stoyanovich and 1 more

Workflow provenance typically assumes that each module is a "black-box", so that each output depends on all inputs (coarse-grained dependencies). Furthermore, it does not model the internal state of a module, which can change between repeated executions. In practice, however, an output may depend on only a small subset of the inputs (fine-grained dependencies) as well as on the internal state of the module. We present a novel framework that marries database-style and workflow-style provenance, by using Pig Latin to expose the functionality of modules, thus capturing...

10.14778/2095686.2095693 article EN Proceedings of the VLDB Endowment 2011-12-01

BioKleisli: a digital library for biomedical researchers

OPENALEX - Publications

Susan B. Davidson Caroline Overton Val Tannen Limsoon Wong

10.1007/s007990050003 article EN International Journal on Digital Libraries 1997-04-01

Query reformulation with constraints

OPENALEX - Publications

Alin Deutsch Lucian Popa Val Tannen

Let Σ₁, Σ₂ be two schemas, which may overlap, C be a set of constraints on the joint schema Σ₁ ∪ Σ₂, and q₁ be a Σ₁-query. An (equivalent) reformulation of q₁ in the presence of C is a Σ₂-query, q₂, such that q₂ gives the same answers as q₁ on any Σ₁ ∪ Σ₂-database instance that satisfies C. In general, there may exist multiple such reformulations, and choosing among them may require, for example, a cost model.

10.1145/1121995.1122010 article EN ACM SIGMOD Record 2006-03-01

Annotated XML

OPENALEX - Publications

J. Nathan Foster Todd J. Green Val Tannen

We present a formal framework for capturing the provenance of data appearing in XQuery views of XML. Building on previous work on relations and their (positive) query languages, we decorate unordered XML with annotations from commutative semirings and show that these annotations suffice for a large positive fragment of XQuery applied to this data. In addition to tracking provenance metadata, the framework can be used to represent and process XML with repetitions, incomplete XML, and probabilistic XML, and provides a basis for enforcing access control policies in security applications.

10.1145/1376916.1376954 article EN 2008-06-09

The ORCHESTRA Collaborative Data Sharing System

OPENALEX - Publications

Zachary G. Ives Todd J. Green Grigoris Karvounarakis Nicholas E. Taylor Val Tannen and 3 more

Sharing structured data today requires standardizing upon a single schema, then mapping and cleaning all of the data. This results in a single queriable mediated data instance. However, for settings in which structured data is being collaboratively authored by a large community, e.g., in the sciences, there is often a lack of consensus about how it should be represented, what is correct, and which sources are authoritative. Moreover, such data is seldom static: it is frequently updated, cleaned, and annotated. The ORCHESTRA collaborative data sharing system develops a new...

10.1145/1462571.1462577 article EN ACM SIGMOD Record 2008-09-30

ORCHESTRA

OPENALEX - Publications

Todd J. Green Grigoris Karvounarakis Nicholas E. Taylor Olivier Biton Zachary G. Ives and 1 more

ORCHESTRA: facilitating collaborative data sharing. Authors: Todd J. Green (University of Pennsylvania, Philadelphia, PA), Grigoris Karvounarakis, Nicholas E. Taylor, Olivier Biton, Zachary G. Ives, Val Tannen. SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, June 2007, pages 1131–1133.

10.1145/1247480.1247631 article EN 2007-06-11

On provenance and privacy

OPENALEX - Publications

Susan B. Davidson Sanjeev Khanna Sudeepa Roy Julia Stoyanovich Val Tannen and 1 more

Provenance in scientific workflows is a double-edged sword. On the one hand, recording information about the module executions used to produce a data item, as well as the parameter settings and intermediate data items passed between module executions, enables transparency and reproducibility of results. On the other hand, a scientific workflow often contains private or confidential data and uses proprietary modules. Hence, providing exact answers to provenance queries over all executions of the workflow may reveal private information. In this paper we discuss privacy concerns in scientific workflows -- data,...

10.1145/1938551.1938554 article EN 2011-02-08

The Semiring Framework for Database Provenance

OPENALEX - Publications

Todd J. Green Val Tannen

Imagine a computational process that uses a complex input consisting of multiple "items" (e.g., files, tables, tuples, parameters, configuration rules). The provenance analysis of such a process allows us to understand how the different input items affect the output of the computation. It can be used, for example, to derive confidence in the output (given confidences in the input items), to derive the minimum access clearance for the output (given input items with different classifications), or to minimize the cost of obtaining the output (given a complex input item pricing scheme). It also applies to probabilistic reasoning about an output (given input item distributions), as well...

10.1145/3034786.3056125 article EN 2017-05-09

DBSP: Automatic Incremental View Maintenance for Rich Query Languages

OPENALEX - Publications

Mihai Budiu Tej Chajed Frank McSherry Leonid Ryzhyk Val Tannen

Incremental view maintenance (IVM) has long been a central problem in database theory. Many solutions have been proposed for restricted classes of database languages, such as the relational algebra, or Datalog. These techniques do not naturally generalize to richer languages. In this paper we give a general, heuristic-free solution to this problem in 3 steps: (1) we describe a simple but expressive language called DBSP for describing computations over data streams; (2) we give a new mathematical definition of IVM and a general algorithm for solving IVM...

10.14778/3587136.3587137 article EN Proceedings of the VLDB Endowment 2023-03-01
