CSx55: Distributed Systems

csu-logo
red-line
[Schedule] [Assignments] [Infospaces] [Grading] [Syllabus]

[Canvas]

[Home]

Schedule

Last updated on Wednesday, April 24, 2024 2:25 PM
Professor Lecture Coordinates
 

Shrideep Pallickara
Office: Room 364, CS Building
Office Hours:
1:00-2:00 pm Friday
[in-person and via Zoom]

E-mail: compsci_csx55 {aT} colostate.edu
(with the obvious change)
Tel: 970.492.4209

 

TTh: 9:30 - 10:45 am
Fridays: 4:00-5:00 pm
[CSB-130]

Graduate Teaching Assistants

Gabriele Maurina

Daniel Rehberg

Ethan Seefried

E-mail: compsci_csx55 {aT} colostate.edu
(with the obvious change)

Readings will be based on the following textbooks.

[TvS] Distributed Systems: Principles and Paradigms. Andrew S. Tanenbaum and Maarten van Steen. 3nd Edition. Createspace, ISBN 9781530281756.
[CDKB]
Distributed Systems: Concepts and Design. George Coulouris, Jean Dollimore, Tim Kindberg, Gordon Blair. 5th Edition. Addison Wesley. ISBN: 978-0132143011
[KS] Distributed Computing: Principles, Algorithms, and Systems. Ajay Kshemkalyani and Mukesh Singhal. 1st edition. Cambridge University Press. ISBN: 0521876346/ 978-0521876346.
[GPB] Java Concurrency in Practice. Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, and Doug Lea. Addison-Wesley Professional. ISBN: 0321349601/978-0321349606.
[OW] Java Threads. Scott Oaks and Henry Wong. . 3rd Edition. O’Reilly Press. ISBN: 0-596-00782-5/978-0-596-00782-9
[TW] Hadoop: The Definitive Guide. Tom White. 3rd Edition. Early Access Release. O’Reilly Press. ISBN: 978-1-449-31152-0.
[KKWZ] Learning Spark: Lightning-Fast Big Data Analysis. 1st Edition. Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia. O'Reilly. 2015. ISBN-13: 978- 1449358624.
[KW] High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark. Holden Karau and Rachel Warren. O'Reilly Media. 2017. ISBN-13: 978-1491943205.
[NL] Distributed Algorithms. Nancy Lynch. 1st edition. Morgan Kaufman. ISBN: 1558603484/978-1558603486.
[GR] Cloud Application Architectures: Building Applications and Infrastructure in the Cloud. George Reese.1st edition. O'Reilly. ISBN: 0596156367/978-0596156367.
[PD] Computer Networks: A Systems Approach. Larry Peterson and Bruce Davie. 4th edition. Morgan Kaufmann. ISBN: 978-0-12-370548-8.
[FS] Practical Cryptography. Niels Ferguson and Bruce Schneier. 1st edition. Wiley Publishing. ISBN: 0-471-22894-X/0-471-22357-3.
[WS] Cryptography and Network Security: Principles and Practice. William Stallings. 5th Edition. Prentice Hall. ISBN: 0136097049/978-0136097044
[RR] Unix Systems Programming. Kay Robbins & Steve Robbins, 2nd edition. Prentice Hall. ISBN: 978-0-13-042411-2.
[SGG] Operating Systems Concepts. Avi Silberschatz, Peter Galvin, Greg Gagne. 8th edition. John Wiley & Sons, Inc. ISBN-13: 978-0-470-12872-5.
   
Introduction [CS455 and CS555] References and HW
This module introduces students to the course, logistics, and the set of topics that are to be covered.

[HW1 released 01/17]
 

Objectives:
[1] Clarify logistics
[2] Clarify expectations for the CS455 and CS555 variants of the course.

 
01/18



Lecture 1
   
Threads, Thread Safety, and Concurrent Programming [CS455 and CS555] Readings

Threads vs processes, thread lifecycle, stacks and heaps, creation and management of threads, data synchronization, race conditions, intrinsic locks and reentracy. Compound actions, sharing objects and confinement, multivariable invariants and thread-safety, making a class thread-safe, multivariable invariants, adding functionality to a thread-safe class, synchronized & concurrent collections, and locking strategies.

[OW] Ch {1, 2,3, 4}
[SGG] Ch {4}
[GPB] Ch {5, 11}

  Objectives:
  1. Design thread safe programs that scale and leverage concurrency
  2. Reason about program correctness in multi-threaded settings
  3. Design scalable server-side programs
 
01/19

01/23

01/25

01/26

01/30

02/01

02/02

02/06


Lecture 2

Lecture 3

Lecture 4

Lecture 5

Lecture 6

Lecture 7

Lecture 8

Lecture 9
CS455 HW2 (02/07)


CS555 HW2 (02/07)
   
Architectures & Topology [CS455 and CS555] Readings
Architectural styles for designing systems including layered, objects, data, and event based. Role of topologies in systems design and their implications on throughput, scaling, fault tolerance and resiliency, and latencies.

[TvS] Ch {6}
[CDKB] Ch {15}
[KS] Ch {9}

[TvS] Ch {2}
  Objectives:
  1. Explain the role topologies play in scaling and failure recovery.
  2. Identify differences between topologies based on random graphs, regular graphs, power law, and small-world networks.

 
02/08

02/09


Lecture 10

Lecture 11

   
Peer-to-Peer Systems & Distributed Hashtables [CS455 and CS555]  

Peer to Peer (P2P) Systems: characteristics, P2P generations, P2P middleware and requirements.
Structured and Unstructured P2P Systems.  Time & Space complexity of P2P algorithms. Exemplar Systems: Pastry, Chord, Tapestry, Napster, Gnutella, and BitTorrent.

[TvS] Chap {5}
[CDKB] Chap {7, 10}
[KS] Chap {18}
[GPB] Chap {1,2,11}
 

Objectives:

  1. Explain structured and unstructured P2P systems
  2. Explain bounded routing in structured P2P systems
  3. Build systems based on structured P2P concepts
 

02/13

02/15

02/16

02/20

02/22

02/23

02/27


Lecture 12

Lecture 13

Lecture 14

Lecture 15

Lecture 16

Lecture 17

Lecture 18

   
Programming models for Cloud Computing: MapReduce [CS455 and CS555]  
Comparison with RDBMS, HPC, Grid computing and volunteeer computing, core architectural framework, pushing computations to the data, Map and Reduce functions, orchestration of tasks, partitioning functions, refinements, and combiner functions.


[MapReduce-Paper]

 

Objectives:

  1. Express programs using MapReduce.
  2. Design systems that push computations to the data to preserve data locality.
  3. Design programs that leverage frameworks to scale.
 
02/29

03/01

Lecture 19


Lecture 20


   
Term Project Pitches and Midterm [CS455 and CS555]  




03/06

03/07

03/08

Lecture 21

Midterm

Lecture 23


 
   
Hadoop MapReduce and HDFS [CS455 and CS555]  
The Hadoop ecosystem, developing MapReduce programs using Hadoop, job configuration and submission, MapReduce data flow, combiner functions and requirements for combiners, tasks and split strategies, YARN and MapReduce Runtimes.


[TW] Ch {1, 2}
 

Objectives:

  1. Express programs using MapReduce.
  2. Design systems that push computations to the data to preserve data locality.
  3. Design programs that leverage frameworks to scale.
  4. Understand the design of the Hadoop Distributed File System
 
03/19

03/21

03/22

03/26

03/28

Lecture 24

Lecture 25

Lecture 26

Lecture 27

Lecture 28



   
Spark [CS455 and CS555]  

Software stack, interactive shells in Spark, core Spark concepts, Resilient Distributed Datasets (RDDs), lazy evaluations, Operations:transformations & actions, Pair RDDs, dependencies and transformations, and partitioning schemes. Narrow and wide transformations.


[KKWZ] Chap {1-4}
 

Objectives:

  1. Understand memory residency and its implications
  2. Express functionality using lambda expressions

 
03/29

04/02

04/04

04/05

Lecture 29

Lecture 30

Lecture 31

Lecture 32


   
Spark Streaming [CS555]  
Architecture and abstractions, execution, stateful and stateless transformations, windowed operations, and performance considerations. [KKWZ] Chap {5,6}
 

Objectives:

  1. Explain differences between traditional batch-style computations and streaming.
  2. Explain microbatching in streaming computations.
  3. Design programs that facilitate incrementatl computation of results.
 
04/09

04/10

Lecture 33

Lecture 33-B [Extra - For reference]

   
Time & Logical Clocks [CS555] [NS] Ch {3}
[JS] Ch {3}
The passage of time and synchronizing off of that is one of the core constructs in modern distributed systems. We we will explore different approaches to clock synchronization in distributed environments. In particular, we will look at (1) Time and Global Positioning Systems. (2) Time synchronization algorithms: Berkley Algorithm, Cristian’s Algorithm, and synchronization in wireless settings. and (3) Lamport's Clocks, Vector and Matrix Clocks
 

Objectives:

  1. Explain the importance of time synchronization in distributed systems.
  2. Contrast capabilties across different types of logical clocks: scalar, vector, and matrix.
 

04/11



Lecture 34

   
   

Replication, Consistency and Coherence [CS555]

[TvS] Chap {7}

[Amazon-Consistency Paper]

Performance and correctness implications of replication and consistency.
(a) Data and client centric consistency, consistent ordering of operations: sequential consistency and causal consistency. Monotonic read, Monotonic write, Read-your-writes, and Writes-follow-read.
(b) Consistency Protocols
(c) Replica placements
(d) Brewer's CAP theorem and design implications

 

Objectives:

  1. Design systems based on the CAP theorem
 

04/16

04/XY


Lecture 35

Lecture 35-B [Not on Final Exam]

   
Extreme Scale Distributed Storage Systems [CS555]  

In this module, we will cover two of the most powerful extreme scale storage systems and the design considerations that underpin them. These frameworks substantially advanced the design of storage systems that scale and they have completely different approaches to the scaling problem.

Amazon Dynamo
: Assumptions & requirements, design choices, systems architecture, partitioning algorithm, replication, versioning, and experiences.


Google File System: Demand pulls, file system interface, chunking, managing file system metadata, records and atomic appends, creating snapshots, replication, consistency, deletion and garbage collection.


[Dynamo-Paper]


[GFS Paper]
 

Objectives:

  1. Contrast different approaches to scaling in distributed systems.
 

04/18

04/23

04/25



Lecture 36

Lecture 37

Lecture 38
   
Distributed Mutual Exclusion [CS455 and CS555]  

Distributed coordination, conditions requirements for distributed mutual exclusion, token-based and permission-based approaches, Central-Server algorithm, Ricarat and Agarwala's algorithm, Maekawa's algorithm and voting sets..

[TvS] Ch {6}
[CDKB] Ch {15}
 

Objectives:

  1. Analyze distributed mutual exclusion algorithms.
  2. Design systems that can enforce mutual exclusion in distributed settings.
 






   
Distributed Election Algorithms [CS455 and CS555]  

Election Algorithms: Requirements, performance of algorithms, ring-based algorithm, failure detectors, Garcia-Molina Bully Algorithm, and elections in wireless environments.


[CDKB] Ch {15}
[KS] Ch {9}
  Objectives:
  1. Analyze distributed election algorithms.
  2. Explain key differences between different types of failure detectors.
 





   
   
Term Project Presentations [CS455 and CS555]  

Each group will present their term projects to the entire class. The format of these term project presentations will be prescriptive so that the core achievements can be highlighted. The term project presentation guidelines will be posted right after Spring Break.

 
 

Objectives:

  1. Present research work coherently and objectively to a peer audience.
  2. Highlight and assess key contributions of a work. In particular, identify which works are innovative and what makes these works so compelling.
  3. Frame questions effectively during research presentations.

 



Term Project Presentation Guidelines

Schedule A

Schedule B


   

Comprehensive Final Exam in CSB-130
Thursday, May 9th, 2:00-4:00 pm


 
   

 

 

 


Department of Computer Science, Colorado State University,
Fort Collins, CO 80523 USA
© 2024 Colorado State University