For your practical project for this course, you will be given a task that many of you will face in the workforce: writing code to analyze data. Just as importantly, you will write up the narrative about that data as if you were submitting a report to your employer. Another way to think of the narrative is as an executive summary of the data. If you handed the report to your employer, could they make business decisions based on it?
Please note that this project is worth 8% of your final grade!
50 points will be dedicated to the coding portion, and 100 points towards the written report.
Select a dataset
Insecticide & Bees
Effects of insecticide on bees (usda.gov). Field and Lab data regarding the effects of 4 sublethal concentrations of a neonicotinoid insecticide (Imidacloprid) on honey bees and about a dozen native bee species.
Biodiversity in New York State
Biodiversity by County - Distribution of Animals, Plants and Natural Communities in the State of New York (data.gov). This dataset lists Endangered or Threatened animals and plants that have been seen in New York.
Fort Collins Weather Data
Weather in Fort Collins is great. Except for when it’s not. This dataset contains the temperature and wind speed for every hour between 2008 and 2015, and is provided by the Fort Collins Weather Station and Colorado State University.
Gender Diversity Data
Are you curious about trends in who signs up for which majors at Colorado State University? The diversity data looks at enrollment numbers and genders across the College of Natural Sciences and Engineering.
World Glacier Inventory
"The World Glacier Inventory (WGI) contains information for over 130,000 glaciers. The WGI is based primarily on aerial photographs and maps with most glaciers having one data entry only. Hence, the data set can be viewed as a snapshot of the glacier distribution in the second half of the 20th century. It is based on the original WGI (WGMS 1989) from the World Glacier Monitoring Service (WGMS)." - National Snow and Ice Data Center
New Orleans Restaurants
Locations of restaurants throughout New Orleans, as indicated by occupational licenses (data.gov). This dataset lists restaurants that have been registered in New Orleans, Louisiana.
Based on the dataset you select, you will need to find the corresponding lab in ZyBooks. The lab shows a sample way to analyze the data and lets you download the dataset for use in other programs. Don’t write code right away! Instead:
Look at the information you have on the dataset. What do the columns represent?
Write down a couple of questions you have about the data. What are some things you could learn?
When you write code, do it in steps.
- Write code to read the data, and maybe just print it to the screen
- Write one simple test, focusing on how you convert the data
- Explore the tests that we give you, as a way to get you thinking
- Write additional tests, one at a time, and evaluate the results.
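The steps above can be sketched as follows. This is only a minimal illustration, assuming Python and a CSV-style dataset; your lab may use a different language or file format. The column names (`temperature_f`, `wind_mph`), the sample rows, and the helper names `read_rows`, `to_float`, and `average` are all made up for the example and are not taken from any of the course datasets.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded dataset file.
# The column names and values are invented for illustration only.
SAMPLE_CSV = """date,temperature_f,wind_mph
2008-01-01 00:00,21.5,3.2
2008-01-01 01:00,,4.1
2008-01-01 02:00,19.8,2.7
"""


def read_rows(source):
    """Step 1: read every row into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(source))


def to_float(text):
    """Step 2: convert one cell to a float, or None when the cell is blank."""
    text = text.strip()
    return float(text) if text else None


def average(values):
    """Average the values that are present, skipping None entries."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Read the data and just print it, to see what the columns look like.
rows = read_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row)

# One simple test at a time, focused on how the data is converted.
assert to_float(" 21.5 ") == 21.5
assert to_float("") is None

# Only after the conversion is trusted, compute something from it.
temps = [to_float(row["temperature_f"]) for row in rows]
print("average temperature:", average(temps))
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by something like `open("your_file.csv", newline="")`. The point of the ordering is that a blank or oddly formatted cell is caught by a small conversion test before it can quietly distort a later result.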
Write a report on the data. It is the narrative about the data for those who do not have the information.
Your written report should be a maximum of six pages. There is no minimum, but you should be able to fully express the narrative in the space allowed. Six pages is a common limit for conference proceedings, and the limit does not include a separate title page or bibliography. We expect most reports to be well under this number, a couple of pages at most.
Your report will be turned in via Canvas, and you will find a rubric for the report in the assignment listing. Your TAs and instructor will grade your reports based on the rubric.
What to include?
Detail the narrative of the primary dataset you analyzed. How does this narrative fit with other information you have found online about similar datasets (i.e. other references)? Are there trends to look at? Did you find ways to build graphs? (Note: you can use third-party programs such as Google Sheets to build graphs about the data you analyze.) At the bare minimum, the report needs to include some of the numbers you were asked to find in the coding half of this assignment.
It should have an intro, body and conclusion at a minimum.
Can you include graphs?
Yes, please do! You DO NOT need to write the code to generate the graph. Instead use Google Sheets or Excel or Pages (or other program of your choice).
Do you need references?
Yes. Every dataset is referenced on the dataset page, and you should find outside sources to confirm any info you find.
Why this practical project?
Many of you will continue on to other majors that do not demand much coding. However, nearly every major requires analyzing data in some form, and coding experience means you can write scripts and applications to analyze that data.
As always, reach out to your TAs and make sure you start this project early. To help with formatting and good writing practices, here are a few resources:
- Maddie has written a page on the best practices for CS 150 Essay.
- CSU Writing Center
- How to Quote and Paraphrase
Rubber Duck for Debugging?
Often the best thing you can do is explain what you are attempting to do to a friend. In the process of explaining it, you usually figure out your errors, and it helps give you a direction to go from there.
In companies, this can cost other developers time, hence the introduction of a Rubber Duck for debugging. The duck gives you an object to talk to just as you would a friend. The object doesn’t have to give feedback (just as a friend doesn’t); instead, the process of talking through the code out loud will often provide insights into both development and errors.