Department of Computer Science

CS 480-A1: Principles of Data Management
Spring 2013

Syllabus

Instructor: Sangmi Lee Pallickara
Office: CSB Room 456
Office Hours: W,F 11:00 AM ~ noon and by appointment
Email: sangmi (at cs dot colostate dot edu)
Tel: +1.970.492.4153
Fax: +1.970.491.2466

Lecture Times and Location
Monday, Wednesday, and Friday, 10:00 PM ~ 10:50 PM
Wager, 132

Special Labs
We will provide special lab sessions (5 ~ 6 sessions) covering technical issues involved in the programming assignments.
Time and Location: TBA

Teaching Assistant:
Srijeet Chatterjee
Office hours: M 6-8PM (at CSB120) and by appointment
Email Address:
cs480{at}cs{dot}colostate{dot}edu

Description

Modern scientific instruments and Internet-scale applications generate voluminous data pertaining to vital signs, weather phenomena, social networks that connect millions of users, and the origins of distant planets. Data produced in these settings hold the promise to significantly advance knowledge. CS 480-A1 covers fundamental issues in large-scale data management. The course examines issues related to data organization, representation, access, storage, and processing. This will include topics such as metadata, data storage systems, self-descriptive data representations, semi-structured data models, ontology, semantic web, and large-scale data analysis.

Prerequisite
CS370 (with a C [2.0] or better)

Topics
- Data representation and exchange
- XML, RDF, Ontology and Semantic Web
- Metadata
- Data Exchange
- Internet-scale data storage systems
- Large-scale data analytics: data flow management
- Data cleaning and data cataloging
- Factors for storage technologies
- Data provenance

Textbooks (Optional)
[1] Serge Abiteboul, Peter Buneman, and Dan Suciu, Data on the Web: From Relations to Semistructured Data and XML, Morgan Kaufmann Series in Data Management Systems
[2] Arie Shoshani and Doron Rotem, "Storage Technology", Scientific Data Management: Challenges, Technology, and Deployment, Chapman & Hall/CRC, 2010
[3] Alan Gates, Programming Pig, O'Reilly, 2011
[4] Dean Allemang and Jim Hendler, Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL, Morgan Kaufmann, 2011, ISBN 978-0-12-385965-5
[5] Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2012

Course Structure
This course is divided into three parts. In the first part, we look at data representation, models, and metadata. In the second part, we will discuss technologies for Internet-scale data storage; these are systems designed to handle petabytes of data. This will include topics such as the Google File System and PNUTS. In the last part, we will cover issues related to data analytics, focusing on data flow management. In this part, we will also discuss ontology and Semantic Web technologies.

The course consists of lectures, exams/quizzes, assignments, and a term project. Students will be assessed by two exams (1 mid-semester exam and 1 final exam), quizzes, and assignments. A total of 4 programming assignments are planned for this semester. Note that this is a plan, and thus is subject to change! Lecture slides and assignments will be made available on the web page of CS480-A1. Programming assignments and the term project must be submitted via RamCT, unless otherwise noted on an assignment or by the instructor. Special lab sessions will cover technical issues involved in the assignments. All quizzes will be in class. You can expect at most one quiz per week. Grades will be posted on RamCT.

Important course announcements (e.g., change in assignment due dates) will be posted on the course web page. Students should check the course web page at least twice a week for new announcements. Please see the Professional Conduct section of this web page.

Late and Makeup Policy
Mid-semester and Final Exams: Make-up exams are only given in extraordinary circumstances (e.g., illness, death of a family member). Students must consult with the instructor as soon as possible, preferably before the start of the exam. Course examination dates are listed in the syllabus; be aware of them and plan accordingly.

No make-ups will be given for missed quizzes.

Programming assignments are to be submitted electronically using the checkin system. Always check the assignment page for due dates. Assignments can be submitted up to a maximum of 2 days past the deadline. The penalty is a deduction of 10% of the total score for the assignment per day. For example, if the assignment was due at 5:00 PM on Thursday: (1) if you submit between 5:01 PM Thursday and 5:00 PM Friday, you will lose 10% of the points set aside for this assignment; (2) if you submit between 5:01 PM Friday and 5:00 PM Saturday, you will lose 20% of the points set aside for this assignment; (3) no submissions will be accepted after 5:00 PM on Saturday, and you will be given a 0 for that assignment.
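
The late-penalty arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the grading script used in the course: the function name late_score and its parameters are hypothetical, and it assumes that any fraction of a late day counts as a full day and that the deduction is subtracted from the score you earned.

```python
from datetime import datetime, timedelta

PENALTY_PERCENT_PER_DAY = 10  # 10% of the assignment's total points per late day
MAX_LATE_DAYS = 2             # nothing is accepted more than 2 days late


def late_score(raw_score, total_points, deadline, submitted):
    """Return the score after the late deduction (hypothetical helper)."""
    if submitted <= deadline:
        return raw_score  # on time: no deduction
    lateness = submitted - deadline
    # Ceiling division: any partial day counts as a full late day (assumption).
    days_late = -(-lateness // timedelta(days=1))
    if days_late > MAX_LATE_DAYS:
        return 0.0  # past the late period: the assignment receives a 0
    deduction = total_points * PENALTY_PERCENT_PER_DAY * days_late / 100
    return max(0.0, raw_score - deduction)


if __name__ == "__main__":
    due = datetime(2013, 1, 31, 17, 0)  # an example Thursday, 5:00 PM
    # Submitted Friday at 9:00 AM, i.e. within the first late day.
    print(late_score(85, 100, due, datetime(2013, 2, 1, 9, 0)))
```

Under these assumptions, an assignment worth 100 points that earned 85 and was submitted on Friday morning would be recorded as 75.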

We will try our best to return assignments within 5 working days after the end of the late period.

Grading Information
Please check the grading policy page.

Professional Conduct
All students are expected to conduct themselves professionally. We (the instructor and GTA) assume you are familiar with the policies in the student information sheet for the department. Additionally, you are computing professionals, albeit perhaps just starting. You should be familiar with the code of conduct for the primary professional society, ACM. You can read the ACM Code of Conduct.
We work to maintain an environment supportive of learning in the classroom and laboratory. Towards that end, we require that you be courteous to and respectful of your fellow participants (i.e., classmates, instructor, GTA and any tutors). In particular:
* Please turn off the ringer on your cell phone. If you are expecting an emergency call, sit near the door and slip out discreetly to take it.
* If you plan to use a laptop during class, please sit at the back of the classroom and turn off any sound from the machine. The tap-tap of the keyboard and the images showing on a screen can be distracting to those sitting around you. Also, be aware that if you IM during class, the giggles, snorts, or other reactions to what you are reading can be heard by the class and instructor and may be completely inappropriate to what is going on in the classroom.
* Laptops must be shut during exams and quizzes.

Important Dates: TBA

CSU Academic Calendar