CS542 CS542 Topics Topics in in Distributed Distributed
22 Slides482.50 KB
CS542 CS542 Topics Topics in in Distributed Distributed Systems Systems Diganta Goswami
Communication Communication Modes Modes in in Distributed Distributed System System Unicast (best effort or reliable) Messages are sent from exactly one process to one process. Best effort: if a message is delivered it would be intact; no reliability guarantees. Reliable: guarantees delivery of messages. Broadcast Messages are sent from exactly one process to all processes on the network. Broadcast protocols are not practical. Multicast Messages broadcast within a group of processes. A multicast message is sent from any one process to the group of processes on the network. Reliable multicast can be implemented “above” (i.e., “using”) a reliable unicast. This lecture!
Other Other Examples Examples of of Multicast Multicast Use Use Akamai’s Configuration Management System (called ACMS) uses a core group of 3-5 servers. These servers continuously multicast to each other the latest updates. They use reliable multicast. After an update is reliably multicast within this group, it is then sent out to all the (1000s of) servers Akamai has all over the world. Air Traffic Control System: orders by one ATC need to be ordered (and reliable) multicast out to other ATC’s. Newsgroup servers multicast to each other in a reliable and ordered manner. Facebook servers multicast your updates to each other
What’re What’re we we designing designing in in this this class class Application (at process p) One process p send multicast deliver multicast (upcall) MULTICAST PROTOCOL Incoming messages
Basic Basic Multicast Multicast (B(B Let’s assume the all processes know the group multicast) multicast) membership A straightforward way to implement B-multicast is to use a reliable one-to-one send (unicast) operation: – B-multicast(group g, message m): for each process p in g, send (p,m). – receive(m): B-deliver(m) at p. A “correct” process a “non-faulty” process A basic multicast primitive guarantees a correct process will eventually deliver the message, as long as the sender (multicasting process) does not crash. – Can we provide reliability even when the sender crashes (after it has sent the multicast)?
Reliable Reliable Multicast Multicast Integrity: A correct (i.e., non-faulty) process p delivers a message m at most once. Validity: If a correct process multicasts (sends) message m, then it will eventually deliver m itself. – Guarantees liveness to the sender. Agreement: If some one correct process delivers message m, then all other correct processes in group(m) will eventually deliver m. – Property of “all or nothing.” – Validity and agreement together ensure overall liveness: if some correct process multicasts a message m, then, all correct processes deliver m too.
Reliable Reliable R-Multicast R-Multicast Algorithm Algorithm R-multicast “USES” B-multicast reliable unicast “USES”
Reliable Reliable Multicast Multicast Algorithm Algorithm (R-multicast) (R-multicast) Integrity Agreement Integrity, Validity if some correct process B-multicasts a message m, then, all correct processes R-deliver m too. If no correct process B-multicasts m, then no correct processes R-deliver m.
What What about about Multicast Multicast Ordering? Ordering? FIFO ordering: If a correct process issues multicast(g,m) and then multicast(g,m’), then every correct process that delivers m’ will have already delivered m. Causal ordering: If multicast(g,m) multicast(g,m’) then any correct process that delivers m’ will have already delivered m. Total ordering: If a correct process delivers message m before m’ (independent of the senders), then any other correct process that delivers m’ will have already delivered m.
Total, Total, FIFO FIFO and and Causal Causal Ordering Ordering T1 Totally ordered messages T1 and T2. FIFO-related messages F1 and F2. Causally related messages C1 and C3 Causal ordering implies FIFO ordering (why?) Total ordering does not imply causal ordering. Causal ordering does not imply total ordering. Hybrid mode: causal-total ordering, FIFO-total ordering. T2 F1 F3 F2 Time C1 C2 P1 C3 P2 P3
Display Display From From Newsgroup Newsgroup Newsgroup: os.interesting Item From Subject 23 A.Hanlon Mach 24 G.Joseph Microkernels 25 A.Hanlon Re: Microkernels 26 T.L’Heureux RPC performance 27 M.Walker Re: Mach end What is the most appropriate ordering for this application? (a) FIFO (b) causal (c) total What is the most appropriate ordering for Facebook posts?
Providing Providing Ordering Ordering Guarantees Guarantees (FIFO) (FIFO) Look at messages from each process in the order they were sent: Each process keeps a sequence number for each other process (vector) When a message is received, If Message# is accept as expected (next sequence), higher than expected, buffer in a queue lower than expected, reject
Implementing Implementing FIFO FIFO Ordering Ordering Spg: the number of messages p has sent to g. Rqg: the sequence number of the latest group-g message that p has delivered from q (maintained for all q at p) For p to FO-multicast m to g – p increments Spg by 1. – p “piggy-backs” the value Spg onto the message. – p B-multicasts m to g. At process p, Upon receipt of m from q with sequence number S: – p checks whether S Rqg 1. If so, p FO-delivers m and increments R qg – If S Rqg 1, p places the message in the hold-back queue until the intervening messages have been delivered and S Rqg 1. – If S Rqg 1, reject m
Hold-back Hold-back Queue Queue for for Arrived Arrived Multicast Multicast Messages Messages Message processing deliver Hold-back queue Incoming messages Delivery queue When delivery guarantees are met
Example: Example: FIFO FIFO Multicast Multicast (do NOT confuse with vector timestamps) “Accept” Deliver Physical Time P1 P2 P3 000 000 100 200 1 2 1 Reject: 1 1 1 Accept: 2 1 1 2 210 Accept Reject: 1 01 1 1 000 000 Accept 1 0 1 200 Sequence Vector 100 200 210 Accept: 1 0 1 Buffer 2 0 1 000 210 1 1 200 100 210 Accept Buffer 2 1 1
Total Total Ordering Ordering Using Using aa Sequencer Sequencer Sequencer Leader process
ISIS: ISIS: Total Total ordering ordering without without sequencer sequencer P2 1 Message 3 22 2 1 3 Agreed Seq 1 2 P1 3 P3 eq S d ose p o r P P4
ISIS ISIS algorithm algorithm for for total total ordering ordering 1. The multicast sender multicasts the message to everyone. 2. Recipients add the received message to a special queue called the priority queue, tag the message undeliverable, and reply to the sender with a proposed priority (i.e., proposed sequence number). Further, this proposed priority is 1 more than the latest sequence number heard so far at the recipient, suffixed with the recipient's process ID. The priority queue is always sorted by priority. 3. The sender collects all responses from the recipients, calculates their maximum, and re-multicasts original message with this as the final priority for the message. 4. On receipt of this information, recipients mark the message as deliverable, reorder the priority queue, and deliver the set of lowest priority messages that are marked as deliverable.
Proof Proof of of Total Total Order Order For a message m1, consider the first process p that delivers m1 At p, when message m1 is at head of priority queue Suppose m2 is another message that has not yet been delivered (i.e., is on the same queue or has not been seen yet by p) operation at sender finalpriority(m2) Due to “max”and since proposed priorities by process p only increase proposedpriority(m2) Since queue ordered by increasing priority finalpriority(m1) Suppose there is some other process p’ that delivers m2 before it delivers m1. Then at p’, Due to “max” operation at sender finalpriority(m1) proposedpriority(m1) Since queue ordered by increasing priority finalpriority(m2) a contradiction!
Causal Causal Ordering Ordering using using vector vector timestamps timestamps The number of group-g messages from process j that have been seen at process i so far
Example: Example: Causal Causal Ordering Ordering Multicast Multicast Reject: Accept P1 1,0,0 0,0,0 (1,0,0) P2 0,0,0 1,0,0 1,1,0 1,1,0 (1,1,0) (1,1,0) 1,1,0 (1,0,0) (1,1,0) P3 0,0,0 1,0,0 1,1,0 1,1,0 Accept Accept: Buffer, missing P1(1) Physical Time Accept Buffered message
Summary Summary Multicast is operation of sending one message to multiple processes in a given group Reliable multicast algorithm built using unicast Ordering – FIFO, total, causal