Syllabus

Modified

April 12, 2024

INFO 3312/5312 - Data Communication

Instructor

  • Dr. Benjamin Soltoff
  • Office: Gates Hall 216
  • Email: soltoffbc@cornell.edu
  • Office hours: Wednesdays 1-3pm (216 Gates Hall)

Course logistics

  • Meets TuTh 1:25 - 2:40 pm for 28 sessions
  • Discussion sections meet on Fridays at varying times for 15 sessions
  • 4 credits, offered for a letter grade
  • Prerequisites: INFO 2950 or INFO 5001. Prior experience with R and Git/GitHub is required.

Course description

Data scientists often present information to disseminate their findings. This course introduces theories and applications of communicating with data, with an emphasis on visualizations. To support this approach, we will focus on the what, why, and how of data visualization. “What” focuses on specific types of visualizations for a particular purpose, as well as tools for constructing these plots. “How” focuses on the process of generating a data visualization: pre-processing the raw data, mapping attributes of the data to plot aesthetics, strategically defining the visual encoding of the data for maximal accessibility, and finalizing the visualization with attention to visual appeal. “Why” covers the theory tying together the “how” and the “what”, and considers empirical evidence of best practices in data communication.

Course learning objectives

By the end of the semester, you will…

  • Implement principles of designing and creating effective data visualizations.
  • Evaluate, critique, and improve upon your own and others’ data visualizations based on how clearly and correctly each visualization communicates its message.
  • Post-process and refine plots for effective communication.
  • Master using R and a variety of modern data visualization packages to reproducibly create data visualizations.
  • Work reproducibly individually and collaboratively using Git and GitHub.

Office hours

Click here for the instructor and TA office hours and locations.

You are welcome to attend the office hours for any INFO 3312/5312 TA, regardless of section.

Textbooks

All books are freely available online.

  • ggplot2: Elegant Graphics for Data Analysis. Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen. Springer, 3rd edition (in progress). Hard copy available only for the 2nd edition.
  • Fundamentals of Data Visualization. Claus O. Wilke. O’Reilly Media, 2019. Hard copy available.

Course community

We want you to feel like you belong in this class and are respected. Cornell University (as an institution) and we (as human beings and instructors of this course) are committed to full inclusion in education for all persons. If for any reason you feel that we have failed these goals, please either let us know or report it, and we will address the issue.

Services and reasonable accommodations are available to persons with temporary and permanent disabilities, to students with DACA or undocumented status, to students facing mental health or other personal challenges, and to students with other kinds of learning challenges. Please feel free to let me know if there are circumstances affecting your ability to participate in class. Some resources that might be of use include:

Academic accommodations

We want all students to have the opportunity to be successful in this course. Accommodations can help provide some flexibility and equitable classroom access.

Per university policy, this course provides the following accommodations:

Accessibility

If there is any portion of the course that is not accessible to you due to challenges with technology or the course format, please let me know so we can make appropriate accommodations.

Student Disability Services is available to ensure that students are able to engage with their courses and related assignments. Students should be in touch with Student Disability Services to request or update accommodations under these circumstances.

If you have an approved SDS accommodation, please send a copy of your accommodation letter to the instructors at soltoffbc@cornell.edu so we can ensure your accommodations are implemented in this course.

Communication

All lecture notes, assignment instructions, an up-to-date schedule, and other course materials may be found on the course website: info3312.infosci.cornell.edu.

Announcements will be posted through Canvas Announcements periodically. Please check Canvas (or ensure Canvas announcements are forwarded to your email) to ensure you have the latest announcements for the course.

Where to get help

  • If you have a question during lecture or discussion, feel free to ask it! There are likely other students with the same question, so by asking you will create a learning opportunity for everyone.
  • The course staff is here to help you be successful in the course. You are encouraged to attend office hours to ask questions about the course content and assignments. Many questions are most effectively answered as you discuss them with others, so office hours are a valuable resource. Please use them!
  • Outside of class and office hours, any general questions about course content or assignments should be posted on the course discussion forum. There is a chance another student has already asked a similar question, so please check the other posts on GitHub Discussions before adding a new question. If you know the answer to a question posted on the discussion board, I encourage you to respond!

Email

If there is a question that’s not appropriate for the public forum, please email us at soltoffbc@cornell.edu. Barring extenuating circumstances, we will respond to INFO 3312/5312 emails within 48 hours Monday - Friday. Response time may be slower for emails sent Friday evening - Sunday.

Activities & Assessment

The activities and assessments in this course are designed to help you successfully achieve the course learning objectives. They are designed to follow the Prepare, Practice, Perform format.

  • Prepare: Includes reading assignments and lectures to introduce new concepts and ensure a basic comprehension of the material. The goal is to help you prepare for the in-class activities during lecture.

  • Practice: Includes in-class application exercises where you will begin to apply the concepts and methods introduced in the prepare assignment. The activities will be graded for completion, as they are designed for you to gain experience with the visualization and communication techniques before working on graded assignments.

  • Perform: Includes homeworks and projects. These assignments build upon the prepare and practice assignments and are the opportunity for you to demonstrate your understanding of the course material and how it is applied to communicate effectively using real-world data.

Lectures (Prepare)

Part of the class time will be lectures that introduce new concepts or review topics from the preparation materials. Lectures will not repeat everything in the readings. Instead, they will highlight important concepts, especially those known to be complex, and will be supplemented with live coding activities. You are expected to attend every lecture.

Application exercises (Practice)

A majority of the in-class lectures will be dedicated to working on Application Exercises (AEs). These exercises will give you an opportunity to apply the communication techniques introduced in the prepare assignment. These AEs are due within one day of the corresponding lecture period. Specifically, AEs from Tuesday lectures are due Wednesday by 11:59 pm, and AEs from Thursday lectures are due Friday by 11:59 pm.

Because these AEs are for practice, they will be graded based on completion, i.e., a good-faith effort has been made in attempting all parts. Successful on-time completion of at least 85% of AEs will result in full credit for AEs in the final course grade.
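As a rough illustration of the 85% rule above, here is a minimal Python sketch. The function name and the example counts are hypothetical; the syllabus does not state how many AEs there are or how credit is assigned below the threshold, so the sketch only checks whether full credit is earned.

```python
def earns_full_ae_credit(completed_on_time: int, total_aes: int,
                         threshold_percent: int = 85) -> bool:
    """Return True if at least `threshold_percent` of AEs were completed on time.

    Integer arithmetic is used so that a share of exactly 85% counts as passing.
    """
    if total_aes <= 0:
        raise ValueError("total_aes must be positive")
    return completed_on_time * 100 >= threshold_percent * total_aes


# Hypothetical example: 20 AEs in the semester (an assumed count).
print(earns_full_ae_credit(17, 20))  # 17/20 is exactly 85%
print(earns_full_ae_credit(16, 20))  # 16/20 is 80%
```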

Homework (Perform)

In homework, you will apply what you’ve learned during lecture and discussion to complete visualization and communication tasks. You may discuss homework assignments with other students; however, homework should be completed and submitted individually. Homework must be typed up using Quarto and GitHub and submitted as a PDF in Gradescope.

Homework assignments are due 11:59 pm on the indicated due date.

The lowest homework grade will be dropped at the end of the semester.
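To see how dropping the lowest homework grade affects an average, consider this small Python sketch. It assumes every homework is weighted equally and scored out of 100, which the syllabus does not state; the function name and scores are hypothetical.

```python
from typing import Sequence


def homework_average(scores: Sequence[float]) -> float:
    """Average homework scores after dropping the single lowest one.

    Assumes equal weighting of all homework assignments (an assumption,
    not something the syllabus specifies).
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # discard the lowest score
    return sum(kept) / len(kept)


# Hypothetical scores: the 70 is dropped, leaving 80, 90, and 100.
print(homework_average([90, 80, 100, 70]))
```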

Project (Perform)

The purpose of the projects is to apply what you’ve learned throughout the semester to solve some sort of real-world problem. All projects are completed in teams.

  • Project 1: Teams will be given a dataset to visualize. Completed over the first half of the semester.
  • Project 2: Teams will create something related to data visualization/communication. Completed over the second half of the semester.

The deliverables for each project will include a data visualization, a write up of the process and findings, and a presentation. For the second project, you will be encouraged to think beyond a traditional two-dimensional data visualization (e.g. interactive web apps/dashboards, data art, generative art, physical/tangible visualizations, ggplot2 extensions, etc.).

Grading

The final course grade will be calculated as follows:

Category                 Percentage
Homework                 40%
Project 1                20%
Project 2                30%
Application exercises    10%
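The weighting above amounts to a weighted average of the four category scores. The following Python sketch shows the arithmetic, assuming each category score is expressed on a 0 to 100 scale; the dictionary keys, function name, and example scores are hypothetical, and extra credit is not modeled.

```python
from typing import Mapping

# Category weights in percentage points, taken from the grading table.
WEIGHTS = {
    "homework": 40,
    "project_1": 20,
    "project_2": 30,
    "application_exercises": 10,
}


def final_course_grade(scores: Mapping[str, float]) -> float:
    """Combine category scores (each 0-100) into a final course grade."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student: (40*90 + 20*85 + 30*95 + 10*100) / 100
example = {
    "homework": 90,
    "project_1": 85,
    "project_2": 95,
    "application_exercises": 100,
}
print(final_course_grade(example))
```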

The final letter grade will be determined based on the following thresholds:

Letter Grade    Final Course Grade
A+              >= 98
A               93 - 97.99
A-              90 - 92.99
B+              87 - 89.99
B               83 - 86.99
B-              80 - 82.99
C+              77 - 79.99
C               73 - 76.99
C-              70 - 72.99
D+              67 - 69.99
D               63 - 66.99
D-              60 - 62.99
F               < 60
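The thresholds above can be read as a lookup from a numeric grade to a letter. This Python sketch assumes a grade falls into the highest band whose lower bound it meets and applies no rounding; the syllabus does not say how grades are rounded, so treat that as an assumption. The function name is hypothetical.

```python
# Lower bound of each band, highest first, taken from the table above.
THRESHOLDS = [
    (98, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(final_grade: float) -> str:
    """Map a final course grade (0-100) to a letter grade.

    Assumes no rounding: the grade is compared directly to each lower bound.
    """
    for lower_bound, letter in THRESHOLDS:
        if final_grade >= lower_bound:
            return letter
    return "F"


print(letter_grade(91.5))
```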

Teams

You will be assigned to a different team for each of your two projects. You are encouraged to sit with your teammates in lecture and you will also work with them in the discussion sessions. All team members are expected to contribute equally to the completion of each project and you will be asked to evaluate your team members after each assignment is due. Failure to adequately contribute to an assignment will result in a penalty to your mark relative to the team’s overall mark.

You are expected to make use of the provided GitHub repository as your central collaborative platform. Commits to this repository will be used as a metric (one of several) of each team member’s relative contribution for each project.

Graduate requirements for INFO 5312

Students in INFO 5312 have additional expectations in the course:

  • INFO 5312 homework will at times be graded against a more stringent rubric
  • INFO 5312 students will be grouped together for all projects

The final letter grade will be determined using the same thresholds as for INFO 3312.

Course policies

Academic honesty

TL;DR: Don’t cheat!

Please abide by the following as you work on assignments in this course:

  • You may discuss individual homework and lab assignments with other students; however, you may not directly share (or copy) code or write up with other students. For team assignments, you may collaborate freely within your team. You may discuss the assignment with other teams; however, you may not directly share (or copy) code or write up with another team. Unauthorized sharing (or copying) of the code or write up will be considered a violation for all students involved.

  • You may not discuss or otherwise work with others on the exams. Unauthorized collaboration or using unauthorized materials will be considered a violation for all students involved. More details will be given closer to the exam date.

  • Reusing code: Unless explicitly stated otherwise, you may make use of online resources (e.g. StackOverflow) for coding examples on assignments. You may not directly copy and paste from these sources, but instead you need to adapt the code to fit your specific task. You must explicitly cite where you obtained the code using a code comment # immediately near the appearance of the reused code in the file. Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism.

  • Use of generative artificial intelligence (GAI): Cornell’s report on Generative Artificial Intelligence for Education and Pedagogy outlines many of the potential benefits and drawbacks to using GAI in the classroom. In this course, we see the value of coding assistants such as GitHub Copilot and ChatGPT to generate code from text. However, because this is an introductory course, we need to ensure that GAI is not used as a substitute or replacement for student learning. You should not use GAI to replace your own ability to think clearly. If you use GAI, use it to facilitate, rather than hinder, learning.

    • GAI tools for reference purposes: You may make use of the technology as a reference tool, similar to looking up the documentation for a function or Googling your problem. For example, I hate writing regular expressions. Absolutely loathe it. Say I have a dataset where I need to clean a character column to remove all words that are within double asterisk symbols. I might ask ChatGPT

      How do I write a regular expression in R that removes all words enclosed in double asterisks?

    • GAI tools for writing my code/analysis: You may not make use of the technology to complete substantive portions of your assignments for you. For example, you may not upload your data file to a GAI platform and ask it to create charts and statistical models for you.

    • GAI tools for narrative: unless instructed otherwise, you may not use GAI to write narrative on assignments. In general, you may use generative AI as a resource as you complete assignments but not to answer the exercises for you.

    You are ultimately responsible for the work you turn in; it should reflect your understanding of the course content.
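To make the code-citation requirement and the regular-expression scenario above concrete, here is a minimal sketch. It is written in Python rather than R purely for illustration; the function name is hypothetical, and the citation comment uses a placeholder rather than a real source. It assumes "words within double asterisks" means any text between a pair of `**` markers.

```python
import re

# Source: adapted from <URL of the resource you consulted>
# (placeholder; replace with the actual source you adapted code from)
DOUBLE_ASTERISK_SPAN = re.compile(r"\*\*.*?\*\*")


def remove_starred_text(text: str) -> str:
    """Remove every span wrapped in double asterisks, then tidy whitespace."""
    without_spans = DOUBLE_ASTERISK_SPAN.sub("", text)
    # Collapse the leftover runs of spaces into single spaces.
    return " ".join(without_spans.split())


print(remove_starred_text("keep **drop this** and keep"))
```

The comment sits directly above the reused pattern, which is where the policy asks for the citation to appear.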

Any violations of academic honesty standards as outlined in the Cornell University Code of Academic Integrity and those specific to this course will result in a 0 for the assignment (or possibly a more severe penalty) and will be reported to the College of Engineering Academic Integrity Hearing Board.

Extra credit

Students can earn up to a maximum of 1 percentage point towards their final grade through the extra credit assignment. This is the only opportunity for extra credit in the course.

Late work & extensions

The due dates for assignments are there to help you keep up with the course material and to ensure the course staff can provide feedback in a timely manner. We understand that things come up periodically that could make it difficult to submit an assignment by the deadline.

Late work

  • Homework assignments: A slip day allows you to submit an assignment 24 hours after the deadline and still receive credit without a late penalty. You are provided with a total of 3 slip days for the entire semester. Slip days may be used on homework assignments. You can use up to 1 slip day for a given homework assignment. Note that the lowest homework assignment will be dropped at the end of the semester.

    To use your slip days, just submit your assignment late. No need to email telling us you are submitting using your slip days. Check Canvas to see how many of your slip days you have used before submitting an assignment late.

    If you use a slip day, do not submit anything to Gradescope before the submission deadline. We may begin grading before the slip day deadline and we will grade whatever submission we see in Gradescope.

    If you run out of slip days, or fail to submit your assignment prior to the slip day deadline without prior permission, then your assignment will not be accepted.

  • Application exercises: There is no late work accepted for application exercises, since these are designed to help you prepare for homeworks and projects.

  • Projects: Late work is not accepted.
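The slip-day rules for homework above (3 per semester, at most 1 per assignment, each worth 24 hours) can be sketched as a simple check. This is only an illustration: the function names and dates are hypothetical, and approved extensions or waivers are not modeled.

```python
from datetime import datetime, timedelta
from typing import Optional

TOTAL_SLIP_DAYS = 3             # per semester
MAX_SLIP_DAYS_PER_HOMEWORK = 1  # each slip day extends the deadline 24 hours


def slip_days_needed(deadline: datetime, submitted: datetime) -> Optional[int]:
    """Return 0 if on time, 1 if within 24 hours after the deadline, else None."""
    if submitted <= deadline:
        return 0
    if submitted <= deadline + timedelta(days=MAX_SLIP_DAYS_PER_HOMEWORK):
        return 1
    return None  # too late even with a slip day


def is_accepted(deadline: datetime, submitted: datetime,
                slip_days_used: int) -> bool:
    """Check whether a homework submission would be accepted under the policy."""
    needed = slip_days_needed(deadline, submitted)
    if needed is None:
        return False
    return slip_days_used + needed <= TOTAL_SLIP_DAYS


# Hypothetical deadline and submission times.
due = datetime(2024, 2, 7, 23, 59)
print(is_accepted(due, datetime(2024, 2, 8, 10, 0), slip_days_used=0))
print(is_accepted(due, datetime(2024, 2, 8, 10, 0), slip_days_used=3))
```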

Waiver for extenuating circumstances

If you need a bit of extra time, please use your slip days. Slip days are specifically intended for legitimate reasons for needing an extension like disability, religious observance, Title IX, student athletics, medical problems, and military service.

If using your slip days for accommodations is not working for you or if you have an SDS accommodation which includes deadline flexibility, you may request a deadline extension in advance of the deadline. We will work with you to develop reasonable accommodations that align with your individual situation.

To request a deadline extension:

  1. Commit and push the work you have completed up to this point on the assignment.
  2. Email soltoffbc@cornell.edu. In your email, clearly state:
    1. The assignment.
    2. What you have already completed on the assignment.
    3. What you have left to complete.
    4. Your proposed deadline extension (e.g. Monday, February 8th at 11:59 pm).

Regrade requests

  • Homework assignments: Regrade requests must be submitted on Gradescope within a week of when an assignment is returned. Regrade requests will be considered if there was an error in the grade calculation or if you feel a correct answer was mistakenly marked as incorrect. Requests to dispute the number of points deducted for an incorrect response will not be considered. Note that when you submit a regrade request, the entire question will be regraded, which could potentially result in losing points.

  • Projects: Copy the template below into an email. Send that email to soltoffbc@cornell.edu.

    Subject:

    INFO 3312/5312 (REQUEST) Regrade

    Email Message Template:

    NetID: TODO
    Team Name: TODO
    Assignment: TODO

    Directly state the mistake(s) in the grading of your assignment. Be specific and specify the total points that you believe should be returned for each mistake. (1-3 brief and concise bullets):

    TODO

    (optional) If necessary, briefly explain why your approach to this assignment is a good choice (1-3 brief and concise bullets):

    TODO

    Tips:

    • When writing, please be respectful, thoughtful, and professional.
    • Be brief and concise. Bullet points are encouraged. Please do not write a lengthy explanation.
    • Form and ground your argument based on ideas and principles presented in this course. These are the primary criteria we use to evaluate your regrade request.
    • Assume that we made a mistake; avoid accusing us of being unfair or punishing you.