Principles Of Distributed Database Systems Exercise Solutions Guide

Solving exercises in Distributed Database Systems requires a shift in perspective from local optimization to global system coordination.


Dr. Elara Vance stared at the error log. It wasn't just red; it was a deep, angry crimson that seemed to pulse on her terminal. Twenty-three nodes in her distributed database cluster, spread across three continents, were returning a "referential integrity anomaly." It was 3:00 AM. The CET-SAT simulation, a global test of their distributed financial ledger, had failed catastrophically.

"Not tonight," she whispered, kneading her temples. The exercise was simple in theory: execute a series of atomic transactions that moved virtual currency between accounts while maintaining ACID properties across the network. The solution, the beautiful theoretical proof on her whiteboard, had promised convergence. Reality, as always, had other plans.

The problem was a phantom read. A classic edge case in multi-version concurrency control (MVCC). Node Alpha in London and Node Gamma in Tokyo had both approved a withdrawal from the same phantom account within 50 milliseconds of each other. Their local timestamps had conflicted, and the global consensus protocol—a modified Paxos—had chosen both. Now the ledger was in a superposition of states: both rich and poor.

Elara pulled up her copy of the instructor's manual, Principles of Distributed Database Systems: Exercise Solutions. It wasn't a book she had written; rather, it was the accumulated wisdom of a hundred previous failures, curated by her mentor, Professor Hideo Tanaka. He called it "The Grimoire."

She flipped to Chapter 9: Global Commit Protocols. Exercise 9.4 read:

Problem: Two-phase commit (2PC) is blocking. Describe a scenario where a coordinator failure leads to an indefinite wait for subordinate nodes. Propose a remedy using three-phase commit (3PC) or Paxos.

The solution in the grimoire was clear. But her current problem wasn't just a blocking coordinator. It was a lying coordinator. Node Alpha's leader had crashed after sending "PREPARE" but before logging its decision. Upon recovery, it had no memory of the transaction. The other nodes, waiting for a "GLOBAL-COMMIT," had timed out and unilaterally aborted—except Node Gamma, which had already applied the withdrawal due to a rogue heuristic.

She reached for the physical, dog-eared copy of the Grimoire. Inside, a handwritten note from Professor Tanaka said: "The exercise is never the storm. The exercise is learning how to patch the hull while the storm is still raging."

The official solution to 9.4 was a Paxos-based replacement for 2PC. But Paxos assumes a fair leader. She didn't have a leader. She had anarchy.

So she closed the book. She would not follow the solution. She would extend it.

She opened a new terminal window and began to write a corrective algorithm. She called it the "Phoenix Commit."

Step 1 (Detect): Run a distributed diff on the write-ahead logs of all 23 nodes. Find the anomaly: transaction #A442.

Step 2 (Quarantine): 2PC is blocking. 3PC is non-blocking but assumes no network partitions. Phoenix Commit would assume a byzantine failure—a node that lies about its state. She instructed each node to broadcast not just its vote, but its entire log hash since the last global checkpoint.

Step 3 (Reconcile): Use a quorum of 15 nodes (three more than the strict majority of 12) to rebuild the true sequence of events. The majority spoke: Node Gamma had acted alone. The withdrawal from account #LK-99 was invalid.

Step 4 (Heal): Issue a compensating transaction. Not a rollback (that would violate isolation in their current read-committed snapshot), but a reverse transfer with a zero-value timestamp. A ghost transaction that would cancel the error without ever having existed in the official timeline.

She typed the final command:

EXECUTE PHOENIX_COMMIT ('A442', 'HEAL');

Silence.

Then, one by one, the nodes turned from angry red to calm green. Node London. Node Singapore. Node São Paulo. Finally, Node Tokyo. All 23 nodes reported STATE: CONSISTENT. The ledger re-converged. The virtual accounts balanced. The CET-SAT simulation passed with a score of 99.9999%—the 0.0001% being the ephemeral trace of the ghost transaction, a scar that only Elara would ever know to look for.

She leaned back, exhausted. The principles from the textbook—atomicity, consistency, isolation, durability—weren't commandments. They were constraints. And the exercise solutions weren't recipes. They were starting points.

Professor Tanaka's voice echoed from a memory: "The best solution to a distributed systems problem is the one you don't have to deploy. The second best is the one that survives first contact with the enemy—which is always the network, the clock, or your own hubris."

Elara looked at her whiteboard, at the beautiful theoretical proof. Then she looked at her terminal, at the ugly, elegant, 47-line Phoenix Commit patch.

She saved the patch as exercise_9.4_vance_solution.pdf and added a new note to the Grimoire:

Addendum: The official solution works for 99% of failures. For the other 1%, you must be willing to forget the exercise and solve the principle. The principle is not "don't fail." The principle is "fail in a way you can survive."

Outside, dawn bled over the data center. The distributed database hummed, its 23 hearts beating in silent agreement. And Elara Vance, for the first time that night, smiled.

The storm had passed. The hull was patched. And the ledger was true.

Official exercise solutions for Principles of Distributed Database Systems by M. Tamer Özsu and Patrick Valduriez (3rd and 4th editions) are primarily restricted to instructors. However, students can access several high-quality alternative resources for practice.

1. Official Companion Sites (Instructor Restricted)

The authors provide companion websites for the 3rd and 4th editions, hosted at the University of Waterloo. While these sites host presentation slides and errata for public download, full exercise solutions require instructor registration and evidence of course adoption.

2. Available Public Study Resources

If you are looking for specific problem breakdowns, several academic and community platforms host partial solutions:

Chapter-Specific Solutions: Some platforms host documents covering specific topics, such as Chapter 3: Distributed Database Design (horizontal and vertical fragmentation).

University Course Documents: Some university portals host solution manuals or PDFs uploaded by students for study purposes, such as the Principles Of Distributed Database Systems Solution Manual, which covers key concepts like the CAP theorem and ACID properties.

GitHub Tech Notes: Developers and students often post personal notes and summaries of textbook exercises. For example, tech-notes provides structured summaries of the principles discussed in the text.

3. Alternative Practice Resources

If you are using the book for self-study and cannot access the restricted solutions, consider these similar resources that provide open-access practice problems:

Database System Concepts: This textbook (Silberschatz, Korth, Sudarshan) provides a public Solution to Practice Exercises page, which includes a dedicated section on distributed databases.

Distributed Systems - Principles and Paradigms: The authors of this related text provide a comprehensive open PDF of solutions for concepts like distribution transparency and failure recovery.

Official exercise solutions for Principles of Distributed Database Systems (by M. Tamer Özsu and Patrick Valduriez) are generally restricted to instructors. However, specific chapter solutions and study guides are available through academic platforms.

📖 Accessing Solutions

Official Instructor Site: The authors provide a dedicated portal for the 4th Edition and 3rd Edition. Access typically requires a verified teaching account.

Chapter-Specific Previews: Detailed solutions for Chapter 3 (Distributed Database Design), including fragmentation and join graph exercises, can be found on Studocu.

Academic Repositories: Full solution manuals are sometimes uploaded to student resource sites like Course Hero.

💡 Sample Exercise Solution: Horizontal Fragmentation

Below is a summary of a common exercise from the text regarding Primary Horizontal Fragmentation:

Problem: Derive fragments for the employee assignment relation ASG based on two applications:

Application 1 accesses employees by their role (RESP).

Application 2 accesses employees by their assignment duration (DUR).

Solution Steps:

Define Simple Predicates: Write one simple predicate for each condition on RESP and on DUR that the applications use.

Form Minterm Predicates: Combine the role and duration predicates into conjunctions.

Create Fragments: Each non-empty minterm defines a fragment of the database.

Principles of Distributed Database Systems: Exercise Solutions & Key Concepts

Mastering distributed database systems (DDBS) requires more than just reading theory; it demands a hands-on approach to solving complex architectural puzzles. Whether you are studying for an exam or designing a scalable system, working through exercise solutions is the best way to internalize how data moves across a network.

This guide explores the core principles of DDBS through the lens of common exercise problems and their practical solutions.

1. Data Fragmentation and Allocation

One of the first hurdles in any DDBS course is determining how to split a global relation into pieces (fragmentation) and where to store them (allocation).

Exercise Scenario: You have a global relation Employee(EmpID, Name, Dept, Salary, Location). You need to fragment this based on the query: "Find employees working in New York or London."

Solution Approach:

Horizontal Fragmentation: This involves using a SELECT operation. You define fragments based on the Location attribute.

Vertical Fragmentation: If a query only needs Name and Salary, you would use a PROJECT operation to split columns rather than rows, keeping the key EmpID in every fragment so the relation can be rejoined.
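The two operations above can be tried out on a toy relation. The following is a minimal sketch, not a textbook algorithm: the sample rows, the helper names `select` and `project`, and the fragment names are all made up for illustration. It splits Employee by Location (rows) and by attribute groups (columns), then recombines each set of fragments to confirm that nothing was lost.

```python
# Hypothetical sample rows for Employee(EmpID, Name, Dept, Salary, Location).
employees = [
    {"EmpID": 1, "Name": "Ana", "Dept": "Sales", "Salary": 70000, "Location": "New York"},
    {"EmpID": 2, "Name": "Ben", "Dept": "IT", "Salary": 82000, "Location": "London"},
    {"EmpID": 3, "Name": "Chloe", "Dept": "IT", "Salary": 91000, "Location": "Tokyo"},
]


def select(rows, predicate):
    """Relational SELECT: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attrs):
    """Relational PROJECT: keep only the listed attributes of each row."""
    return [{a: row[a] for a in attrs} for row in rows]


# Horizontal fragments: one per location of interest, plus a catch-all
# fragment so that every row belongs to exactly one fragment.
emp_ny = select(employees, lambda r: r["Location"] == "New York")
emp_london = select(employees, lambda r: r["Location"] == "London")
emp_other = select(employees, lambda r: r["Location"] not in ("New York", "London"))

# Vertical fragments: the key EmpID is repeated in both fragments.
emp_pay = project(employees, ["EmpID", "Name", "Salary"])
emp_org = project(employees, ["EmpID", "Dept", "Location"])

# Horizontal fragments are recombined with a union.
horizontal_union = emp_ny + emp_london + emp_other

# Vertical fragments are recombined with a join on the key.
org_by_id = {row["EmpID"]: row for row in emp_org}
vertical_join = [{**pay, **org_by_id[pay["EmpID"]]} for pay in emp_pay]

print(sorted(horizontal_union, key=lambda r: r["EmpID"]) == employees)
print(vertical_join == employees)
```

The catch-all fragment `emp_other` is a design choice in this sketch: without it, the Tokyo row would be stored nowhere and the union could not rebuild the relation.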

The Correctness Rules: Ensure your solution meets three criteria: Completeness (no data lost), Reconstruction (the fragments can be unioned or joined back into the original relation), and Disjointness (no unnecessary duplication).

2. Distributed Query Optimization

Querying a distributed system is expensive because of "communication costs." Exercises often ask you to calculate the cost of a Join operation across two different sites.

Key Concept: Semijoins

A common solution to reduce data transfer is the Semijoin. Instead of sending an entire table across the network, you send only the joining column, filter the remote table, and send the smaller result back.
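The three moves described above (send the joining column, filter remotely, send the smaller result back) can be sketched as follows. This is an illustrative toy, assuming two in-memory lists stand in for the two sites; the relation contents and the function name `semijoin_reduce` are hypothetical.

```python
def semijoin_reduce(local_rows, remote_rows, attr):
    """Reduce remote_rows to the rows that can join with local_rows on attr.

    Returns the set of join values that would be shipped to the remote
    site and the reduced remote relation that would be shipped back.
    """
    # Step 1: project the joining column at the local site.
    join_values = {row[attr] for row in local_rows}
    # Step 2: at the remote site, keep only rows with a matching value.
    reduced = [row for row in remote_rows if row[attr] in join_values]
    # Step 3: the reduced relation is what travels back over the network.
    return join_values, reduced


# Hypothetical data: departments live at site 1, employees at site 2.
site1_rows = [
    {"DeptID": 10, "DeptName": "Sales"},
    {"DeptID": 20, "DeptName": "IT"},
]
site2_rows = [
    {"EmpID": 1, "DeptID": 10},
    {"EmpID": 2, "DeptID": 30},
    {"EmpID": 3, "DeptID": 20},
    {"EmpID": 4, "DeptID": 40},
]

shipped_values, reduced = semijoin_reduce(site1_rows, site2_rows, "DeptID")

# The final join runs at site 1 on the reduced relation only.
joined = [
    {**left, **right}
    for left in site1_rows
    for right in reduced
    if left["DeptID"] == right["DeptID"]
]

print(len(site2_rows), "remote rows reduced to", len(reduced))
```

Here two of the four remote rows never cross the network, at the price of first shipping two DeptID values.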

Exercise Tip: When asked to find the "optimal execution plan," always compare the total bytes transferred in a standard Join versus a Semijoin, counting both the join column shipped out and the reduced relation shipped back.

3. Distributed Concurrency Control

How do you maintain consistency when multiple users edit the same data on different continents?

Solution: Two-Phase Locking (2PL)

In distributed exercises, you'll often encounter the Centralized 2PL vs. Distributed 2PL debate.

Centralized: One site manages all locks. Simple, but a single point of failure.

Distributed: Each site manages locks for its own data. More resilient, but harder to detect Global Deadlocks.

Wait-Die vs. Wound-Wait: These are common algorithmic solutions for deadlock prevention.

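Wait-Die: An older transaction waits for a lock held by a younger one; a younger transaction that requests a lock held by an older one is aborted ("dies").

Wound-Wait: An older transaction "wounds" (preempts) a younger lock holder; a younger transaction waits for an older one.

4. Reliability and the Two-Phase Commit (2PC)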

Reliability exercises often focus on what happens when a site or a link fails during a transaction.

The 2PC Protocol Steps:

Voting Phase: The coordinator asks all participants if they are ready to commit.

Decision Phase: If all vote "Yes," the coordinator sends a "Global Commit." If any participant votes "No" or times out, it sends a "Global Abort."
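The coordinator's decision rule in these two phases can be written down in a few lines. The sketch below is a simplification under stated assumptions: the votes are already collected into a dictionary, a timeout is represented by `None`, and logging, message passing, and recovery are left out entirely. The names `Vote` and `decide` are hypothetical.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


def decide(votes):
    """Return the coordinator's global decision.

    votes maps a participant name to Vote.YES, Vote.NO, or None,
    where None stands for a participant that timed out.
    """
    # Commit only if every participant answered YES; a NO vote or a
    # missing answer forces a global abort.
    if all(vote is Vote.YES for vote in votes.values()):
        return "GLOBAL-COMMIT"
    return "GLOBAL-ABORT"


print(decide({"site_a": Vote.YES, "site_b": Vote.YES}))
print(decide({"site_a": Vote.YES, "site_b": Vote.NO}))
print(decide({"site_a": Vote.YES, "site_b": None}))
```

Treating a timeout the same as a "No" vote is what keeps the rule safe: the coordinator never commits without proof that every participant is prepared.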

Common Problem: What happens if the coordinator fails after the voting phase?

Solution: This is the "blocking problem" of 2PC. Participants that voted "Yes" may be left in an uncertain state, holding locks indefinitely until the coordinator recovers. This is why modern systems often look toward Three-Phase Commit (3PC) or Paxos/Raft consensus algorithms.

5. Parallelism and Data Replication

Modern exercises often touch on the CAP theorem (Consistency, Availability, Partition Tolerance).

Exercise Question: "Can a system be CA (Consistent and Available) during a network partition?"

Solution: No. During a partition (P), you must choose between Consistency (refusing the update to keep data uniform) and Availability (allowing the update even if other sites don't see it yet).

Summary Checklist for Students

When looking for or writing solutions to distributed database problems, always check for:

Minimization of data transfer: Is there a way to do this with fewer bytes?

Transparency: Does the user feel like they are using a single database?

Site Autonomy: Can a single site function if the others go offline?

By applying these principles to your exercises, you move from theoretical knowledge to architectural expertise.

Principles of Distributed Database Systems

A distributed database system manages a collection of multiple, logically interrelated databases that are connected through a network, allowing users to access and share data across different locations. The main goals of a distributed database system are:

Key Concepts

Types of Distributed Database Systems

Exercise Solutions

Exercise 1: What are the main advantages of a distributed database system?

Solution: The main advantages of a distributed database system are:

Exercise 2: What is fragmentation in a distributed database system?

Solution: Fragmentation is the process of breaking a relation into smaller fragments (by rows, by columns, or both) that can be stored at different sites.

Exercise 3: What is replication in a distributed database system?

Solution: Replication is the process of maintaining multiple copies of data at different sites to improve availability and performance.

Exercise 4: Consider a distributed database system with three sites: A, B, and C. Each site stores a fragment of a relation R. The relation R has the following tuples:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 2 | Jane | 30 |
| 3 | Joe | 35 |

Site A has the following fragment of R:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 2 | Jane | 30 |

Site B has the following fragment of R:

| ID | Name | Age |
| --- | --- | --- |
| 2 | Jane | 30 |
| 3 | Joe | 35 |

Site C has the following fragment of R:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 3 | Joe | 35 |

a. What is the fragmentation of R?

b. What is the replication factor of R?

Solution:

a. The fragmentation of R is:

R = R1 ∪ R2 ∪ R3

where R1, R2, and R3 are the fragments of R at sites A, B, and C, respectively.

b. The replication factor of R is 2: each tuple is stored at exactly two of the three sites (for example, tuple 1 at sites A and C). No single site holds a full copy of R.
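Both parts of this exercise can be checked mechanically on the tuples given in the tables. The short sketch below is only a verification aid: it rebuilds R as the union of the three fragments and counts how many sites store each tuple. The variable names are chosen for illustration.

```python
from collections import Counter

# The relation R and the fragments stored at sites A, B, and C,
# copied from the exercise tables.
R = {(1, "John", 25), (2, "Jane", 30), (3, "Joe", 35)}
fragments = {
    "A": {(1, "John", 25), (2, "Jane", 30)},
    "B": {(2, "Jane", 30), (3, "Joe", 35)},
    "C": {(1, "John", 25), (3, "Joe", 35)},
}

# Part a: the union of the fragments must give back R.
reconstructed = set().union(*fragments.values())
print(reconstructed == R)

# Part b: count the number of sites that store each tuple.
copies = Counter(t for frag in fragments.values() for t in frag)
for tuple_, count in sorted(copies.items()):
    print(tuple_, "stored at", count, "sites")
```

Every tuple is counted twice, so losing any single site still leaves one copy of each tuple reachable.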

Exercise 5: Consider a distributed database system with two sites: A and B. Site A has a relation R1, and site B has a relation R2. The relations R1 and R2 have the following tuples:

R1:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 2 | Jane | 30 |

R2:

| ID | Name | Age |
| --- | --- | --- |
| 3 | Joe | 35 |
| 4 | Sarah | 20 |

Design a distributed query to retrieve all tuples from R1 and R2.

Solution:

The distributed query can be written as:

SELECT * FROM R1 UNION SELECT * FROM R2

This query retrieves all tuples from R1 at site A and R2 at site B, and combines them into a single result set.

Finding reliable exercise solutions for Principles of Distributed Database Systems (by M. Tamer Özsu and Patrick Valduriez) usually involves looking for the official instructor manual or community-verified repositories.

Here is a breakdown of the core principles typically covered in those exercises, along with how to find specific solutions:

Key Principles Covered in Exercises

If you are working through the textbook, most problems focus on these four pillars:

Fragmentation & Distribution Design: Exercises often ask you to perform Horizontal Fragmentation (using predicates) or Vertical Fragmentation (using affinity matrices). The goal is to minimize irrelevant data access.

Query Decomposition & Optimization: You'll likely encounter problems on converting calculus queries to relational algebra and using the Iterative Dynamic Programming algorithms to find the lowest-cost join order across sites.

Distributed Concurrency Control: Solutions here focus on 2-Phase Locking (2PL) and Timestamp Ordering. A common exercise involves detecting "Global Deadlock" using a Distributed Wait-For Graph.

Reliability & 2-Phase Commit (2PC): Problems usually simulate a site or link failure and ask you to determine if the coordinator or participants can reach a decision (Commit/Abort) based on their logs.

Where to Find Solutions

University Course Pages: Many professors (e.g., from Waterloo, ETH Zurich, or Georgia Tech) post "Assignment Solutions" for this specific curriculum. Searching for "CS 448 solutions" or "Distributed Databases syllabus PDF" often yields direct answer keys.

GitHub Repositories: Search GitHub for "Özsu Valduriez solutions." Several graduate students have uploaded their worked-out LaTeX solutions for the 3rd and 4th editions.

Publisher Resources: If you have an instructor's login, the official Pearson or Springer site provides the "Instructor's Guide," which contains the definitive answers to every end-of-chapter problem.

Quick Tip for Solving "Joins"

When solving distributed join exercises, always check if a semijoin is more efficient than a standard join. Reducing the size of the relation before shipping it across the network is almost always the "correct" answer in this textbook's context.

Finding formal exercise solutions for the authoritative textbook Principles of Distributed Database Systems (4th Edition, 2020) by M. Tamer Özsu and Patrick Valduriez can be challenging because the authors primarily restrict full solution manuals to instructors.

However, you can access specific helpful resources and sample solutions through the following official and verified academic channels:

1. Official Textbook Resources

The authors maintain a dedicated site at the University of Waterloo for the 4th edition. While the full manual is restricted, this site is the most reliable source for:

Solutions to Selected Exercises: Links to specific PDFs containing verified answers for core chapters.

Presentation Slides: These often contain "in-class" examples and solved problems that mirror the exercises in the book.

Errata: Crucial for ensuring you aren't trying to solve an exercise with a typo.

2. Verified Solutions for Key Concepts

Common exercises in this field often focus on specific algorithmic problems. You can find high-quality, solved examples for these topics on academic platforms:

Data Fragmentation & Allocation: Step-by-step solutions for vertical and horizontal fragmentation.

Distributed Query Optimization: Look for solutions regarding join ordering and semijoin programs, which are frequently used in distributed systems homework.

Concurrency Control: Solutions involving Two-Phase Commit (2PC) and Paxos consensus algorithms are often provided in university course repositories.

3. Alternative Peer-to-Peer Learning

If official solutions are unavailable for a specific problem, these platforms host student-uploaded solution sets:

CourseHero: Hosts various versions of the "Principles of Distributed Database Systems Exercise Solutions" uploaded by students from institutions like GITAM University and BITS Pilani.

Database System Concepts (Practice Site): While for a different book, the Practice Exercises by Silberschatz et al. provide publicly available solutions for overlapping topics like distributed transactions and deadlock.

Official exercise solutions for the textbook "Principles of Distributed Database Systems" by M. Tamer Özsu and Patrick Valduriez are primarily reserved for instructors who teach courses using the book. However, select resources and examples of specific solutions are available through academic platforms and institutional sites.

Official Instructor Resources

Access to the full, authorized solution manual is typically restricted to educators to maintain the integrity of student assessments:

Official Book Site: The Principles of Distributed Database Systems site notes that solutions are only available to verified instructors.

Requesting Access: If you are an instructor, you can often request these materials directly from the publisher or through the University of Waterloo CS faculty portal.

Publicly Accessible Solution Samples

For students looking for practice or specific problem breakdowns, some chapters and problems have been shared online:

Fragmentation Exercise (Ch 3): A detailed solution for Primary Horizontal Fragmentation (Exercise 3.2) is available, illustrating how to derive minterm predicates for distributed design.

Technical Summaries: Platforms like GitHub host community-generated study notes that summarize key principles like CSMA/CD, network topologies (Bus, Star, Ring, Mesh), and data distribution strategies.

Assignment Banks: Academic sites like Scribd and Course Hero often host student-uploaded assignments and partial solution sets covering query processing and concurrency control.

Key Concepts Covered in Exercises

Most solutions focus on the following foundational distributed principles:

Fragmentation & Allocation: Dividing relations into horizontal or vertical fragments and placing them across nodes.

Transparency: Exercises often ask to define or apply levels of transparency (location, fragmentation, replication).

Distributed Transactions: Implementation of ACID properties (Atomicity, Consistency, Isolation, Durability) across multiple sites.

Concurrency Control: Managing simultaneous data access using distributed locking or timestamp ordering.

Query Optimization: Calculating the cost of moving data versus local processing for global queries.



Problem:
Relation Orders(OrderID, CustID, Amount) at site X (10,000 tuples).
Relation Customers(CustID, Name, City) at site Y (5,000 tuples).
Query: find orders from customers in ‘Paris’. Write a semi-join to reduce transmission.

Solution:
A semi-join shrinks the left operand (Orders) to the tuples that can actually join, before the full join is computed.

Step 1 – Project relevant attributes from right relation:
π_CustID(σ_City=‘Paris’(Customers)). Size: if 10% of customers in Paris → 500 CustIDs.

Step 2 – Send projection to site X:
Transmit 500 CustIDs (approx. 500*4 bytes = small).

Step 3 – Compute semi-join at site X:
Orders ⋉ Customers’ = σ_CustID in (Paris CustIDs)(Orders). Assume each Paris customer has 5 orders → 2500 orders remain.

Step 4 – Send reduced Orders to site Y for final join:
Transmit 2500 tuples instead of 10,000. Savings: 75% fewer Orders tuples shipped, at the added cost of the 500 CustIDs sent in Step 2.

Answer:
Semi-join reduces cost significantly. The semi-join expression:
Orders ⋉ (π_CustID(σ_City=‘Paris’(Customers)))
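The arithmetic in Steps 1 to 4 can be reproduced directly. The sketch below uses the figures from the solution (10% of customers in Paris, 5 orders per Paris customer, 4-byte CustIDs); the 12-byte width of an Orders tuple is an assumption added here only so that the cost of shipping the CustIDs can be compared in the same unit.

```python
# Figures taken from the problem and solution text.
orders_total = 10_000
customers_total = 5_000
paris_percent = 10              # 10% of customers are in Paris
orders_per_paris_customer = 5
custid_bytes = 4

# Assumption for illustration only: an Orders tuple is three 4-byte fields.
order_tuple_bytes = 12

# Steps 1-2: CustIDs of Paris customers shipped from site Y to site X.
paris_custids = customers_total * paris_percent // 100
custid_traffic = paris_custids * custid_bytes

# Step 3: Orders tuples that survive the semi-join at site X.
reduced_orders = paris_custids * orders_per_paris_customer

# Step 4: compare shipping the reduced Orders with shipping all of Orders.
tuple_reduction = (orders_total - reduced_orders) * 100 / orders_total
naive_bytes = orders_total * order_tuple_bytes
semijoin_bytes = custid_traffic + reduced_orders * order_tuple_bytes
byte_reduction = (naive_bytes - semijoin_bytes) * 100 / naive_bytes

print(paris_custids, "CustIDs shipped,", reduced_orders, "orders remain")
print(f"tuple reduction: {tuple_reduction:.1f}%")
print(f"byte reduction including CustID traffic: {byte_reduction:.1f}%")
```

Under the assumed tuple width, the net saving comes out slightly below 75% because the CustIDs shipped in Step 2 are not free.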


You have a data item replicated across 5 sites (S1..S5). A quorum consensus protocol requires a read quorum of R sites and a write quorum of W sites, with R + W > N, where N = 5 is the number of replicas. Given failures or network partitions, determine whether reads and writes succeed.
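One way to work such a problem is to count the sites each side of a partition can reach and compare that count with R and W. The sketch below assumes one vote per site; the quorum sizes and the partition used in the example are hypothetical, since the problem statement does not fix them.

```python
def quorums_intersect(n, r, w):
    """True if every read quorum overlaps every write quorum (R + W > N)."""
    return r + w > n


def partition_outcome(reachable_sites, r, w):
    """Report whether a group of mutually reachable sites can read or write."""
    reachable = len(reachable_sites)
    return {"read": reachable >= r, "write": reachable >= w}


N = 5
# Hypothetical partition: {S1, S2, S3} cannot talk to {S4, S5}.
larger_side = {"S1", "S2", "S3"}
smaller_side = {"S4", "S5"}

# Hypothetical configuration 1: cheap reads, expensive writes.
R, W = 2, 4
print(quorums_intersect(N, R, W))               # True
print(partition_outcome(larger_side, R, W))     # {'read': True, 'write': False}
print(partition_outcome(smaller_side, R, W))    # {'read': True, 'write': False}

# Hypothetical configuration 2: majority quorums for both operations.
R, W = 3, 3
print(quorums_intersect(N, R, W))               # True
print(partition_outcome(larger_side, R, W))     # {'read': True, 'write': True}
print(partition_outcome(smaller_side, R, W))    # {'read': False, 'write': False}
```

With R = 2 and W = 4, both sides can still read but neither can write, so the data simply freezes during the partition; with R = W = 3, only the three-site side keeps working.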