- Distributed systems and fault tolerance
- Advanced Data Storage Technologies
- Advanced Database Systems and Queries
- Cloud Computing and Resource Management
- Parallel Computing and Optimization Techniques
- Software System Performance and Reliability
- Service-Oriented Architecture and Web Services
- Optimization and Search Problems
- Teaching and Learning Programming
- Distributed and Parallel Computing Systems
- Real-Time Systems Scheduling
- Advanced Software Engineering Methodologies
- Petri Nets in System Modeling
- Business Process Modeling and Analysis
- Logic, programming, and type systems
- Software Engineering Research
- IoT and Edge/Fog Computing
- Scientific Computing and Data Management
- Data Quality and Management
- Formal Methods in Verification
- Software Reliability and Analysis Research
- Software Testing and Debugging Techniques
- Logic, Reasoning, and Knowledge
- Data Management and Algorithms
- Peer-to-Peer Network Technologies
The University of Sydney
2016-2025
Data61
2005-2015
Information Technology University
2008-2015
UNSW Sydney
2012-2014
Murdoch University
2014
University of Waikato
2014
Swinburne University of Technology
2014
Lite-On Technology Corporation (China)
2014
The University of Queensland
2014
University of Technology Sydney
2014
Replicating data across multiple data centers allows using data closer to the client, reducing latency for applications, and increases availability in the event of a data center failure. MDCC (Multi-Data Center Consistency) is an optimistic commit protocol for geo-replicated transactions that does not require a master or static partitioning, and is strongly consistent at a cost similar to eventually consistent protocols. MDCC takes advantage of Generalized Paxos for transaction processing and exploits commutative updates with value constraints...
Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the common concurrency anomalies, and it has been implemented by Oracle and Microsoft SQL Server (with certain minor variations). SI does not guarantee serializability in all cases, but the TPC-C benchmark application [TPC-C], for example, executes under SI without serialization anomalies. All major database system products are delivered...
The foundation courses in computer science pose particular challenges for teacher and learner alike. This paper describes some of these challenges and how we have designed problem-based learning (PBL) courses to address them. We discuss the problems we were keen to overcome: the purely technical focus of many courses; the problems of individual learning; and the need to establish foundations in a range of areas which are important for computer science graduates. We then outline our course design, showing how we have created problem-based courses. The paper reports our evaluation of the approach. This has two parts: assessment of the trial, with...
To minimize network latency and remain online during server failures and network partitions, many modern distributed data storage systems eschew transactional functionality, which provides strong semantic guarantees for groups of multiple operations over multiple data items. In this work, we consider the problem of providing Highly Available Transactions (HATs): transactional guarantees that do not suffer unavailability during system partitions or incur high network latency. We introduce a taxonomy of highly available systems and analyze existing ACID isolation and distributed data consistency...
Minimizing coordination, or blocking communication between concurrently executing operations, is key to maximizing scalability, availability, and high performance in database systems. However, uninhibited coordination-free execution can compromise application correctness, or consistency. When is coordination necessary for correctness? The classic use of serializable transactions is sufficient to maintain correctness but is not necessary for all applications, sacrificing potential scalability. In this paper, we develop a...
Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that can lead to violations of data consistency by interleaving transactions that would maintain consistency if run serially. Until now, the only way to prevent these anomalies was to modify the applications by introducing explicit locking or artificial update conflicts, following careful analysis of conflicts...
One major benefit claimed for cloud computing is elasticity: the cost to a consumer of computation can grow or shrink with the workload. This paper offers improved ways to quantify the elasticity concept, using data available to the consumer. We define a measure that reflects the financial penalty to a particular consumer, from under-provisioning (leading to unacceptable latency or unmet demand) or over-provisioning (paying more than necessary for the resources needed to support a workload). We have applied several workloads to a public cloud; our...
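The penalty idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's measure: the function name `elasticity_penalty`, the per-interval `demand` and `supply` series, and the two cost weights `over_rate` and `under_rate` are all assumptions introduced here for illustration.

```python
def elasticity_penalty(demand, supply, over_rate, under_rate):
    """Toy penalty: charge for idle capacity and for unmet demand.

    demand, supply: resource units per time interval (hypothetical inputs).
    over_rate: assumed cost per unit of capacity paid for but not needed.
    under_rate: assumed cost per unit of demand that could not be served.
    """
    if len(demand) != len(supply):
        raise ValueError("demand and supply must cover the same intervals")
    # Over-provisioning: capacity above what the workload needed.
    over = sum(max(s - d, 0) for d, s in zip(demand, supply))
    # Under-provisioning: demand above what was provisioned.
    under = sum(max(d - s, 0) for d, s in zip(demand, supply))
    return over * over_rate + under * under_rate


# Three intervals with a fixed supply of 4 units.
penalty = elasticity_penalty([2, 5, 3], [4, 4, 4], over_rate=1, under_rate=10)
```

Weighting unmet demand more heavily than idle capacity is one possible choice; a consumer would pick the rates to match their own costs.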
Databases can provide scalability by partitioning data across several servers. However, multi-partition, multi-operation transactional access is often expensive, employing coordination-intensive locking, validation, or scheduling mechanisms. Accordingly, many real-world systems avoid mechanisms that provide useful semantics for multi-partition operations. This leads to incorrect behavior for a large class of applications including secondary indexing, foreign key enforcement, and materialized view...
Many popular database management systems offer snapshot isolation rather than full serializability. There are well-known anomalies permitted by snapshot isolation that can lead to violations of data consistency by interleaving transactions that individually maintain consistency. Until now, the only way to prevent these anomalies was to modify the applications by introducing artificial locking or update conflicts, following careful analysis of conflicts between all pairs of transactions.
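A well-known anomaly of this kind is write skew. The sketch below is a toy in-memory model, not any real engine: the classes `SnapshotStore` and `Txn`, the `withdraw` helper, and the invariant x + y >= 0 are all hypothetical. It assumes a first-committer-wins rule that only checks write-write conflicts.

```python
class SnapshotStore:
    """Toy multiversion store; each key maps to a list of (commit_ts, value)."""

    def __init__(self, initial):
        self.versions = {k: [(0, v)] for k, v in initial.items()}
        self.clock = 0

    def begin(self):
        return Txn(self, self.clock)

    def read_at(self, key, ts):
        # Latest version committed at or before the snapshot timestamp.
        visible = [v for c, v in self.versions[key] if c <= ts]
        return visible[-1]


class Txn:
    def __init__(self, store, start_ts):
        self.store, self.start_ts, self.writes = store, start_ts, {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: abort only on a write-write conflict.
        for key in self.writes:
            if self.store.versions[key][-1][0] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions[key].append((self.store.clock, value))
        return True


def withdraw(txn, key, amount):
    # Each transaction alone preserves the invariant x + y >= 0.
    if txn.read("x") + txn.read("y") >= amount:
        txn.write(key, txn.read(key) - amount)


store = SnapshotStore({"x": 50, "y": 50})
t1, t2 = store.begin(), store.begin()   # both see the same snapshot
withdraw(t1, "x", 100)
withdraw(t2, "y", 100)
ok1, ok2 = t1.commit(), t2.commit()     # disjoint write sets: both commit
final_sum = store.read_at("x", store.clock) + store.read_at("y", store.clock)
```

Both transactions commit because they write different keys, yet the final sum is negative. Run serially, the second withdrawal would have seen the first and done nothing.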
Several common DBMS engines use the multi-version concurrency control mechanism called Snapshot Isolation, even though application programs can experience non-serializable executions when run concurrently on such a platform. Several proposals exist for modifying the application programs, without changing their semantics, so that they are certain to execute serializably even on an engine that uses SI. We evaluate the performance impact of these proposals, and find that some have limited impact (only a few percent drop in throughput at a given...
Causal consistency is the strongest consistency model that is available in the presence of partitions and provides useful semantics for human-facing distributed services. Here, we expose its serious and inherent scalability limitations due to write propagation requirements and traditional dependency tracking mechanisms. As an alternative to classic potential causality, we advocate the use of explicit causality, or application-defined happens-before relations. Explicit causality, a subset of potential causality, tracks only relevant dependencies and reduces several of the dangers of causal...
Database system benchmarks like TPC-C and TPC-E focus on emulating database applications to compare different DBMS implementations. These benchmarks use carefully constructed queries executed within the context of transactions to exercise specific RDBMS features, and measure the throughput achieved. Cloud services benchmark frameworks like YCSB, on the other hand, are designed for performance evaluation of distributed NoSQL key-value stores, early examples of which did not support transactions, and so the benchmarks use single operations that are not inside...
The rise of data-intensive "Web 2.0" Internet services has led to a range of popular new programming frameworks that collectively embody the latest incarnation of the vision of Object-Relational Mapping (ORM) systems, albeit at unprecedented scale. In this work, we empirically investigate modern ORM-backed applications' use and disuse of database concurrency control mechanisms. Specifically, we focus our study on the common use of feral, or application-level, mechanisms for maintaining database integrity, which, across a range of ORM systems, often...
The traditional architecture for a DBMS engine has the recovery, concurrency control and access method code tightly bound together in a storage engine for records. We propose a different approach, where the storage engine is factored into two layers (each of which might have multiple heterogeneous instances). A Transactional Component (TC) works at a logical level only: it knows about transactions and their "logical" concurrency control and undo/redo recovery, but it does not know about page layout, B-trees etc. A Data Component (DC) knows about the physical storage structure. It supports a record oriented...
Cloud computing has attracted attention as an important platform for software deployment, with perceived benefits such as elasticity to fluctuating load, and reduced operational costs compared to running in enterprise data centers. While some software is written from scratch specially for the cloud, many organizations also wish to migrate existing applications to a cloud platform. Such a migration exercise is not easy: changes need to be made to deal with differences in the environment, such as the programming model and data storage APIs, as well as varying...
Group communication services are becoming accepted as effective building blocks for the construction of fault-tolerant distributed applications. Many specifications for group communication services have been proposed. However, there is still no agreement about what these specifications should say, especially in cases where the services are partitionable, i.e., where communication failures may lead to the simultaneous creation of groups with disjoint memberships, such that each group is unaware of the existence of any other group. In this paper, we present a new, succinct specification...
Server-side component technologies such as Enterprise JavaBeans (EJBs), .NET, and CORBA are commonly used in enterprise applications that have requirements for high performance and scalability. When designing such applications, architects must select a suitable component technology platform and application architecture to provide the required performance. This is challenging as no methods or tools exist to predict application performance without building a significant prototype version for subsequent benchmarking. In this paper, we present an...
A database supporting multiple versions of records may use the versions to support queries of the past or to increase concurrency by enabling reads and writes to be concurrent. We introduce a new concurrency control approach that enables all SQL isolation levels including serializability to utilize versions to increase concurrency while also supporting transaction time database functionality. The key insight is to manage a range of possible timestamps for each transaction that captures the impact of conflicts that have occurred. Using these ranges as constraints often permits concurrent access where lock based concurrency control would...
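The timestamp-range idea described above can be sketched loosely as follows. This is an illustration under assumptions, not the paper's algorithm: the class `RangeTxn`, its method names, and the rule of committing at the lowest admissible timestamp are all hypothetical.

```python
import math


class RangeTxn:
    """Tracks the interval [lo, hi] of commit timestamps still possible."""

    def __init__(self):
        self.lo, self.hi = 0, math.inf

    def must_follow(self, ts):
        # A conflict forces this transaction to serialize after `ts`,
        # e.g. it read a version that was committed at `ts`.
        self.lo = max(self.lo, ts + 1)

    def must_precede(self, ts):
        # A conflict forces this transaction to serialize before `ts`,
        # e.g. a version it read was overwritten by a commit at `ts`.
        self.hi = min(self.hi, ts - 1)

    def viable(self):
        return self.lo <= self.hi

    def choose_commit_ts(self):
        if not self.viable():
            raise RuntimeError("empty timestamp range: transaction must abort")
        return self.lo  # assumed policy: earliest admissible timestamp


t = RangeTxn()
t.must_follow(10)    # range becomes [11, inf)
t.must_precede(15)   # range becomes [11, 14]
commit_ts = t.choose_commit_ts()
```

The transaction proceeds without blocking as long as the range is non-empty; a later conflict that empties the range is what would trigger an abort in this sketch.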
This paper presents a method that allows a replicated database system to provide a global isolation level stronger than the isolation level provided on each individual database replica. We propose a new multi-version concurrency control algorithm called serializable generalized snapshot isolation (SGSI), which targets middleware replicated database systems. Each replica runs snapshot isolation locally and the replication middleware guarantees global one-copy serializability. We introduce novel techniques to provide a stronger global isolation level, namely readset extraction and enhanced certification that prevents read-write and write-write...
Modern implementations of DBMS software are intended to take advantage of high core counts that are becoming common in high-end servers. However, we have observed that several database platforms, including MySQL, Shore-MT, and a commercial system, exhibit throughput collapse as load increases, even for a workload with little or no logical contention for locks. Our analysis of MySQL identifies latch contention within the lock manager as the bottleneck responsible for this collapse.