CS455: Introduction to Distributed Systems


All assignments are due at 5:00 PM on the due date. There is a late penalty of 7.5% per day, for a maximum of 2 days. Each assignment will be posted on this page at least 2 weeks prior to its due date. We will have a mix of written and programming assignments. All assignments should be submitted using the checkin system. Comprehensive instructions for using it are available in this PDF document.

Each assignment in this course is split into two components: a programming component that accounts for 80% of the grade for the assignment and a written component that accounts for the remaining 20%. The written part of the assignment will be posted after the programming component has been submitted. The questions in the written part are intended to be reflective, so that you think a little deeper about your implementation choices, possible extensions to your work, and how you would address inefficiencies in your work. Programming assignments are due on Wednesdays, and written assignments, including term papers/presentations, are due on Fridays.
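As a rough illustration of how the 80/20 split and the late policy could combine, here is a minimal Python sketch. How the penalty is actually applied is not stated on this page; the sketch assumes it is deducted as percentage points from whichever component was late, and the function names are hypothetical.

```python
# Values taken from the policy text on this page.
PROGRAMMING_WEIGHT = 80       # percent of the assignment grade
WRITTEN_WEIGHT = 20           # percent of the assignment grade
LATE_PENALTY_PER_DAY = 7.5    # percentage points per day late
MAX_LATE_DAYS = 2


def late_penalty(days_late):
    """Return the deduction, in percentage points, for a late component."""
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    if days_late > MAX_LATE_DAYS:
        # The page does not say what happens after 2 days; this sketch
        # refuses to guess rather than invent a rule.
        raise ValueError("policy only covers up to 2 days late")
    return LATE_PENALTY_PER_DAY * days_late


def assignment_score(programming, written, pc_days_late=0, wc_days_late=0):
    """Combine component scores (each 0-100) into one assignment score.

    ASSUMPTION: the penalty is subtracted from the late component only.
    """
    pc = max(0.0, programming - late_penalty(pc_days_late))
    wc = max(0.0, written - late_penalty(wc_days_late))
    return (PROGRAMMING_WEIGHT * pc + WRITTEN_WEIGHT * wc) / 100
```

Under these assumptions, `assignment_score(90, 100)` evaluates to 92.0, and the same scores with the programming component two days late give 80.0.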

Assignment HW-Test: Test of the Checkin system

This assignment just makes sure that you are able to use the checkin system. There are no points for this assignment, but it is mandatory. The checkin folder set aside for this submission is HW-TEST. Additional details are available here.

Note: Checkin system will go live on 1/21/2020
Due: 1/30/2019


Assignment 1: Routing Packets Within a Structured Peer-to-Peer (P2P) Network Overlay

The objective of this assignment is to get you familiar with coding in a distributed setting where you need to manage the underlying communications between nodes. Upon completion of this assignment you will have a set of reusable classes that you will be able to draw upon. As part of this assignment you will be implementing routing schemes for packets in a structured peer-to-peer (P2P) overlay system. This assignment requires you to: (1) construct a logical overlay over a distributed set of nodes, and then (2) use partial information about nodes within the overlay to route packets. The assignment demonstrates how partial information about nodes comprising a distributed system can be used to route packets while ensuring correctness and convergence. Additional details are available here.
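To make the idea of routing with partial information concrete, the sketch below simulates one common structured-overlay design in a single process. It is not the assignment specification: the ring layout, the power-of-two routing table, the `id_space` parameter, and all function names are assumptions chosen for illustration. Each node knows only a few other nodes, yet every hop moves strictly closer to the destination, which is what gives convergence.

```python
def build_routing_tables(node_ids, table_size):
    """Give each node entries 1, 2, 4, ... hops ahead in the sorted ring.

    ASSUMPTION: power-of-two spacing, as in Chord-style overlays.
    """
    ring = sorted(node_ids)
    n = len(ring)
    tables = {}
    for i, node in enumerate(ring):
        entries = []
        for k in range(table_size):
            hop = 2 ** k
            if hop >= n:
                break  # a hop this large would wrap past the node itself
            entries.append(ring[(i + hop) % n])
        tables[node] = entries
    return tables


def clockwise_distance(src, dst, id_space):
    """Distance from src to dst when moving clockwise around the ID ring."""
    return (dst - src) % id_space


def next_hop(current, destination, table, id_space):
    """Pick the known node that gets closest without overshooting."""
    remaining = clockwise_distance(current, destination, id_space)
    best, best_dist = None, -1
    for entry in table:
        d = clockwise_distance(current, entry, id_space)
        if best_dist < d <= remaining:
            best, best_dist = entry, d
    return best


def route(source, destination, tables, id_space):
    """Return the list of nodes a packet visits, source and sink included.

    Assumes destination is a node in the overlay and table_size >= 1, so
    the immediate successor is always a valid fallback.
    """
    path = [source]
    current = source
    while current != destination:
        current = next_hop(current, destination, tables[current], id_space)
        path.append(current)
    return path
```

For example, with node IDs `[3, 10, 21, 40, 57, 88, 101, 120]`, an ID space of 128, and three table entries per node, `route(3, 101, tables, 128)` returns `[3, 57, 101]`. In a real deployment each hop would be a message sent over a socket rather than a dictionary lookup.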

Programming Component (HW1-PC) posted 01/21, and due 2/19/2020 checkin-folder: HW1-PC

Written Component posted: 2/19/2020, and due 2/21/2020 checkin-folder: HW1-WC

Assignment 2: Scalable Server Design: Using Thread Pools & Micro Batching to Manage and Load Balance Active Network Connections

As part of this assignment you will be developing a server to handle network traffic by designing and building your own thread pool. This thread pool will have a configurable number of threads and will be used to manage all tasks relating to network communications. This includes:

  1. Managing incoming network connections 
  2. Receiving data over these network connections 
  3. Organizing data into batches to improve performance
  4. Sending data over any of these links 

Unlike the previous assignment where we had a receiver thread associated with each socket, we will be managing a collection of connections using a fixed thread pool. A typical setup for this assignment involves a server with a thread pool size of 10 and 100 active clients that send data over their connections. Additional details are available here.
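The sketch below shows one way a hand-built thread pool and a size-triggered batcher can fit together. It is a simplified illustration, not the required design: network I/O is omitted, the size-only batching rule is an assumption, and the class and method names (`ThreadPool`, `Batcher`, `submit`, `flush`) are hypothetical.

```python
import queue
import threading
import traceback


class ThreadPool:
    """A fixed set of worker threads pulling tasks from one shared queue."""

    def __init__(self, size):
        self._tasks = queue.Queue()
        self._workers = [
            threading.Thread(target=self._run, daemon=True) for _ in range(size)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                if task is None:   # sentinel: this worker should exit
                    return
                task()
            except Exception:
                traceback.print_exc()  # keep the worker alive if a task fails
            finally:
                self._tasks.task_done()

    def submit(self, task):
        """Queue a zero-argument callable for execution by some worker."""
        self._tasks.put(task)

    def shutdown(self):
        """Wait for queued tasks to finish, then stop all workers."""
        self._tasks.join()
        for _ in self._workers:
            self._tasks.put(None)
        for worker in self._workers:
            worker.join()


class Batcher:
    """Collect messages and hand them to the pool one batch at a time.

    ASSUMPTION: a batch is dispatched once it reaches batch_size messages.
    """

    def __init__(self, pool, batch_size, handler):
        self._pool = pool
        self._batch_size = batch_size
        self._handler = handler
        self._pending = []
        self._lock = threading.Lock()

    def add(self, message):
        with self._lock:
            self._pending.append(message)
            if len(self._pending) < self._batch_size:
                return
            batch, self._pending = self._pending, []
        self._pool.submit(lambda: self._handler(batch))

    def flush(self):
        """Dispatch whatever is pending, even if the batch is not full."""
        with self._lock:
            batch, self._pending = self._pending, []
        if batch:
            self._pool.submit(lambda: self._handler(batch))
```

With the typical setup described above, the pool would be created as `ThreadPool(10)` and shared by all 100 client connections, so the number of threads stays fixed no matter how many clients are active.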

Programming Component (HW2-PC) posted 02/05 due 03/11 checkin-folder: HW2-PC

Written Component (HW2-WC) posted 03/11 due 03/13 checkin-folder: HW2-WC

Assignment 3: Analyzing Air Quality Data Collected across the United States using MapReduce

The objective of this assignment is to gain experience in developing MapReduce programs. As part of this assignment, you will be working with data collected from the EPA’s Air Quality System (AQS). You will be developing MapReduce programs that parse and process hourly recordings of temperature and criteria gas levels at various outdoor monitors between 1980 and 2019. You will be using Apache Hadoop (version 3.1.2) to implement this assignment. Additional details are available here.
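For readers new to the model, the following sketch imitates the map, shuffle, and reduce phases in plain Python, without Hadoop. The four-field record layout (`state,date,hour,value`) is a made-up simplification and not the real AQS file format, and the function names are hypothetical; an actual solution would use the Hadoop MapReduce API.

```python
from collections import defaultdict


def map_record(line):
    """Map phase: emit (state, (value, 1)) for each well-formed record.

    ASSUMPTION: records look like "state,date,hour,value".
    """
    fields = line.strip().split(",")
    if len(fields) != 4:
        return  # skip malformed lines
    state, _date, _hour, raw_value = fields
    try:
        value = float(raw_value)
    except ValueError:
        return  # skip records with a missing or non-numeric reading
    yield state, (value, 1)


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_mean(state, values):
    """Reduce phase: combine (sum, count) pairs into a mean per state."""
    total = sum(v for v, _ in values)
    count = sum(c for _, c in values)
    return state, total / count


def run_job(lines):
    """Run map, shuffle, and reduce over an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_record(line))
    return dict(reduce_mean(k, vs) for k, vs in shuffle(pairs).items())
```

Emitting `(value, 1)` pairs rather than bare values keeps the reducer's input mergeable, which is what would allow a combiner to pre-aggregate on each mapper in a real cluster.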

Additional Useful Documents:

[1]. Hadoop Setup Guide

[2]. Running the WordCount example

Programming Component (HW3-PC) posted 03/11 due 04/15 checkin-folder: HW3-PC

Written Component (HW3-WC) will be posted 04/15 due 04/19 checkin-folder: HW3-WC

Term Project & Paper: Using Spark for Scalable Analytics [Group assignment: Teams of 2-3]
CS455 is a capstone course and includes a writing component in the form of a term project and paper. As part of this assignment you will be doing a term project that involves using Apache Spark for performing analytics. You are free to use Spark for processing on-disk files or to use it for processing data streams. Additional details about the Term Project are available here.

Deliverable-0 due Wednesday, 4/1/2020, @ 5:00 pm

Deliverable-1: Term project proposal due 4/10/2020 @ 5:00 pm MT.  The folder set aside for the final submission using checkin is TP-D1

Final Deliverables: Source code (4/29/2020) and report (5/1/2020) are due @ 5:00 pm MT.

There is also a separate presentation component for your term project. All presentations must follow the Term Project Presentation Guidelines. A 15-minute video of your term project presentation must be submitted in lieu of a presentation in class.





Department of Computer Science, Colorado State University,
Fort Collins, CO 80523 USA
© 2020 Colorado State University