Computer Science Department

Introduction to Big Data: schedule CS 435
Fall 2020
| Home | Syllabus | Schedule | Assignments | Grading Policy | Course Policy | Code of Conduct | Canvas |

Note that this schedule is subject to change during the semester. Please make sure to check it every week.

Please retrieve your course materials from the Canvas Modules.

Week 1. (8/24, 8/26)

Topics
Introduction to Big Data
Course Introduction

- Big Data and Analytics: Data collection, Sampling and Preprocessing
- Overview of Big Data Computing Stack
- Introduction to MapReduce


Notes
Colorado State University Academic Calendar [Link]

Week 2. (8/31, 9/2)

Topics
- MapReduce Design Pattern I. Numerical Summarization
- MapReduce Design Pattern II. Filtering Patterns


 
Week 3. (9/9)

Topics
- MapReduce Design Pattern II. Filtering Patterns (continued)


 
Week 4. (9/14, 9/16)

Topics
- MapReduce Design Pattern III. Data Organization Patterns
- MapReduce Design Pattern IV. Join Patterns

 

 
Week 5. (9/21, 9/23)
Topics
- MapReduce Design Pattern V. I/O Patterns
- How MapReduce works

 

 
Week 6. (9/28, 9/30)
Topics
Large-Scale Analytics 1. Web-Scale Link Analysis



 
Week 7. (10/5, 10/7)

Topics
Large-Scale Analytics 1. Web-Scale Link Analysis (continued)
Large-Scale Analytics 2. Clustering: K-Means Clustering using Canopy algorithm
Midterm Review

 

 
Week 8. (10/12, 10/14)

Topics
Midterm
Large-Scale Analytics 3. Predictive Analytics: Linear Regression using Gradient Descent Algorithm
Planning Term Project


 
Week 9. (10/19, 10/21)

Topics
Large-Scale Analytics 4. Recommendation Systems: Collaborative Filtering

Part 2. Data Retrieval and Exchange

Week 10. (10/26, 10/28)

Topics
In-Memory Computing Framework for Scalable Analytics with Apache Spark


 
Week 11. (11/2, 11/4)

Topics
In-Memory Computing Framework for Scalable Analytics with Apache Spark (continued)
Distributed File Systems: Google File System


Week 12. (11/9, 11/11)

Topics
Distributed File Systems: Google File System


Week 13. (11/16, 11/18)

Topics
Distributed File Systems: Google File System (continued)
NoSQL storage system I. Key-Value storage systems (Amazon's Dynamo)


Week 14. (11/23, 11/25): No class

Fall Break: No class




Week 15. (11/30, 12/2)
Topics
NoSQL storage system II. Column Family storage systems (Google's BigTable)
Data Exchange Models
Representational State Transfer (REST)


 
Week 16. (12/7, 12/9)

Term Project Presentations: Schedule (TBA)

Final Exam: Schedule TBA
