Computer Science Department

Big Data: TP Proposal

Fall 2017

Home Syllabus Schedule Assignments Grading Policy Course Policy Code of Conduct Canvas

Proposal (Due on 10/11 by 5:00PM)


The purpose of the term project is for you to learn how to formulate a simple Big Data problem, task, or application and to gain experience in solving it using the algorithms, system designs, and techniques taught in class. You are encouraged to find teammates, but individual projects are also allowed.

Components of the proposal report

Checkpoint 1. Title of your project

This should be concise and self-descriptive.

Checkpoint 2. Team information

You should list your team members' names and email addresses here.

Section 1. Introduction

In this section, the proposal should clearly identify the motivation, background (if needed), and problem of the project. It should include at least one or two carefully crafted paragraphs that state and highlight the problem. The problem formulation should answer the following questions:

  • What is the problem you are solving? This should also include the background for the problem.
  • Why is it interesting as a Big Data problem and who would use it if it were solved?

Section 2. Proposed Strategy to Solve the Problem 

Describe and justify your proposed approach to solving the problem. The description of the strategy should include:

  • The algorithms/techniques/models you plan to use in this project.
  • The framework you plan to use in this project.
  • The dataset you plan to use in this project.

The course project for CS535 requires software as part of the final output. Your strategy should therefore include a plan for software development that answers the following questions:

  • What software components do you plan to build?
  • How will your software interact with your users?
  • What will be the input and output of each function?


Section 3. Description of Dataset

Describe your dataset. This is a critical component of your project. You should provide information about your dataset, including:

  • Overview of dataset
  • Data acquisition
  • Source (e.g. URL)
  • Description of data (e.g. size, attributes, and format)
  • Restrictions on data (if applicable)
  • Relevant research or analysis using this dataset (if applicable)

Section 4. Plan for Testing and Evaluation

Describe how testing and evaluation will be performed. Your software should be tested before you provide the final results and presentation. What is your plan for testing your software?  

  • What will be your test data? (e.g. We will select 10% of the data using a random sampling algorithm for testing and increase the percentage up to 100%)
  • What will be your testing scenario? (e.g. We will use a file stream to imitate real-time streaming data to test the network communication)
  • How will you deploy your software? (e.g. CS120 cluster with XYZ machines, or Amazon's AWS cluster)
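The two example test setups above (random sampling at growing percentages, and a file stream that imitates real-time data) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `sample_fraction` and `replay_lines`, the fixed seed, and the per-record delay are hypothetical choices, not course requirements, and a real project would likely do the sampling inside its chosen Big Data framework.

```python
import random
import time
from typing import Iterable, Iterator, List, Sequence


def sample_fraction(records: Sequence[str], fraction: float, seed: int = 0) -> List[str]:
    """Return a random sample holding roughly `fraction` of the records.

    The seed is fixed (an assumption made here) so that test runs are repeatable.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not records:
        return []
    # Keep at least one record, and never ask for more than we have.
    k = min(len(records), max(1, round(len(records) * fraction)))
    return random.Random(seed).sample(list(records), k)


def replay_lines(lines: Iterable[str], delay_seconds: float = 0.0) -> Iterator[str]:
    """Yield records one at a time, pausing between them to imitate a live stream."""
    for line in lines:
        yield line.rstrip("\n")
        if delay_seconds > 0:
            time.sleep(delay_seconds)
```

For instance, a test run might call `sample_fraction(records, f)` for `f` in `(0.1, 0.25, 0.5, 1.0)` and pass each sample through `replay_lines(sample, delay_seconds=0.01)` into the component under test. An open file object can be passed to `replay_lines` directly, since it iterates line by line.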

The proposal should include an evaluation plan with the metrics you will use to determine whether you have succeeded. If you come up with your own metric, also explain intuitively what the metric captures and why you think it is appropriate.

For example, if your project involves classification, you can list the accuracy measures that will be used and justify them. You should also state the target accuracy for your project.
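As an illustration of the classification case, the sketch below computes a few commonly used measures (accuracy, precision, recall, and F1) for binary labels. The function name `classification_metrics` and the choice of these four measures are assumptions made for this example; your proposal should name and justify whichever metrics fit your own problem.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

A stated target (for example, a minimum accuracy you choose and justify) can then be compared against `classification_metrics(y_true, y_pred)["accuracy"]` on held-out test data. Accuracy alone can be misleading when classes are imbalanced, which is one reason to report precision and recall alongside it.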

Section 5. Project timeline (weekly plan)

You should provide a table with a weekly plan for completing the term project. If you have teammates, the plan should also describe each member's role.

Section 6. Plan for Related Work

Include a plan for the related work section of your final report. What references will you use? All references must be cited in the report. For each reference, provide:

  • The authors' names
  • The titles of the works
  • The names of the publishers
  • The date (or year) the copies were published
  • The page numbers of your sources (if available)

Checkpoint 3. Submission

If it is a team submission, please submit only one copy and list all team members as authors of the document.

This document should be 2,000 to 2,500 words. Do not exceed the limit.


* Presentation

Your presentation should cover the content discussed in your report. Please create 6 slides:

  • Page 1: Title (with the team info)
  • Page 2: Introduction
  • Pages 3-4: Your approach/software
  • Pages 5-6: Plan for software testing and evaluation

The presentation should be no longer than 12 minutes, including the Q&A session (10 minutes for the presentation, 2 minutes for Q&A).
Presentations will be peer-reviewed.

  • All team members should present.
  • Audience members will get 2% of the participation score based on their questions and attendance.
  • Please submit your slides via Canvas at least 2 hours before your presentation.




