HW 07 - Interpreting and explaining predicted danceability

Homework
Modified: April 26, 2024

Important

This homework is due Friday, May 3 at 11:59pm ET.

Getting started

  • Go to the info3312-sp24 organization on GitHub. Click on the repo with the prefix hw-07. It contains the starter documents you need to complete the homework.

  • Clone the repo and start a new project in RStudio.

Packages

# packages for wrangling data and the original models
library(tidyverse)
library(tidymodels)
library(rcis)
library(patchwork)

# packages for model interpretation/explanation
library(DALEX)
library(DALEXtra)
library(lime)

Guidelines + tips

As we’ve discussed in lecture, your plots should include an informative title, axes should be labeled, and careful consideration should be given to aesthetic choices.

Remember that continuing to develop a sound workflow for reproducible data analysis is important as you complete this homework and other assignments in this course. Periodic reminders throughout this assignment will prompt you to render, commit, and push your changes to GitHub. You should have at least 3 commits with meaningful commit messages by the end of the assignment.

Workflow + formatting

Make sure to

  • Update author name on your document.
  • Label all code chunks informatively and concisely.
  • Follow the Tidyverse code style guidelines.
  • Make at least 3 commits.
  • Resize figures where needed; avoid tiny or huge plots.
  • Turn in an organized, well-formatted document.

Important

Any time you are asked to recreate a visualization, approximate it as closely as possible. We do not care if there are minor differences due to resolution, aspect ratio, font size, etc., as long as your visualization captures the spirit of the original.

Interpreting and explaining predicted danceability

You will interpret the result of a random forest model that predicts the “danceability” of a collection of songs.

The data come from Spotify and contain detailed song-level information for every song in a playlist created by or liked by the instructor.

spotify-model.RData contains the training set, test set, and tidymodels workflow for a fitted random forest model. You do not need to estimate a new machine learning model for this assignment. Instead, you will interpret the random forest model to understand how it makes predictions about the danceability of songs. Likewise, for a handful of songs you will explain the model’s predictions to understand why it predicts a particular danceability score for a given song.

The dataset includes the following variables:

| Column name | Variable description |
|---|---|
| .id | Unique identification number for each song in the dataset. |
| acousticness | A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic. |
| album | Name of the album from which the song originates. |
| artist | The artist who recorded the song. |
| danceability | Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable. |
| energy | Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy. |
| explicit | A logical value which indicates whether or not the song contains explicit lyrics. |
| instrumentalness | Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. |
| key_name | The key the track is in. |
| liveness | Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live. |
| loudness | The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB. |
| mode_name | Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. |
| playlist_name | The (anonymized) name of the Spotify playlist where the song is included. |
| speechiness | Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks. |
| tempo | The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration. |
| time_signature | An estimated time signature. The time signature (meter) is a notational convention to specify how many beats are in each bar (or measure). The time signature ranges from 3 to 7, indicating time signatures of “3/4” to “7/4”. |
| track | Name of the song. |
| valence | A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). |

Warning

The dataset contains several ID variables that were not used to fit the model. These include .id, album, artist, and track. Because of how tidymodels combines the feature engineering recipe and the model specification, these columns are passed automatically to DALEX and lime. If they appear in any of your interpretations/explanations, they should be ignored since they were not actually used to fit the model.

Exercise 1

Load the data files and model, and prepare the data for interpretation. Import the training set, test set, and tidymodels workflow from spotify-model.RData. Create an appropriate explainer object using DALEX.
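As a rough illustration of the two steps in this exercise, the sketch below uses stand-in objects rather than the homework data: a small random forest workflow fit on `mtcars`. All `toy_*` names are hypothetical, and the object names stored inside spotify-model.RData are not assumed here. It shows that `load()` returns the names of the objects it restores, and how a fitted tidymodels workflow can be wrapped with `explain_tidymodels()`.

```r
library(tidymodels)
library(DALEXtra)

# Stand-in data and model (NOT the homework objects): a small random forest
# workflow fit on mtcars, used only to show the shape of the calls.
set.seed(123)
toy_train <- mtcars |> select(mpg, disp, hp, wt, qsec, drat)
toy_wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(rand_forest(mode = "regression", trees = 200) |> set_engine("ranger")) |>
  fit(data = toy_train)

# Round trip through an .RData file to mimic importing saved objects.
rdata_path <- tempfile(fileext = ".RData")
save(toy_train, toy_wf, file = rdata_path)
rm(toy_train, toy_wf)

# load() restores the objects and (invisibly) returns their names, which is
# a convenient way to discover what a .RData file contains.
loaded_names <- load(rdata_path)
print(loaded_names)

# Wrap the fitted workflow in a DALEX explainer:
# data = predictor columns only, y = the numeric outcome for the same rows.
toy_explainer <- explain_tidymodels(
  model = toy_wf,
  data = select(toy_train, -mpg),
  y = toy_train$mpg,
  label = "random forest (toy)",
  verbose = FALSE
)
```

For the homework, the same pattern would apply to whichever objects `load()` reports, with `danceability` as the outcome.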

Now is a good time to render, commit, and push. Make sure that you commit and push all changed documents and your Git pane is completely empty before proceeding.

Exercise 2

What are the most important features in the model? Use permutation-based feature importance to identify the most relevant features in the model. Estimate all feature importance scores using a random sample of 1000 observations from the training set, and report your results as the ratio change in the RMSE. Provide a substantive written interpretation of the results.
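A minimal sketch of permutation-based feature importance with `DALEX::model_parts()` follows, again on a stand-in `mtcars` model rather than the homework objects (all `toy_*` names are hypothetical). It assumes RMSE as the loss via `loss_root_mean_square` and reports the ratio form, where a value above 1 means the loss grew after a feature was permuted.

```r
library(tidymodels)
library(DALEXtra)

# Stand-in model and explainer (NOT the homework objects).
set.seed(123)
toy_train <- mtcars |> select(mpg, disp, hp, wt, qsec, drat)
toy_wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(rand_forest(mode = "regression", trees = 200) |> set_engine("ranger")) |>
  fit(data = toy_train)
toy_explainer <- explain_tidymodels(
  toy_wf,
  data = select(toy_train, -mpg),
  y = toy_train$mpg,
  verbose = FALSE
)

# Permutation importance:
#   loss_function = RMSE
#   type = "ratio" -> permuted loss divided by the full-model loss
#   N = 1000       -> sample up to 1000 rows (all rows if fewer are available)
#   B = 10         -> number of permutation rounds per feature
set.seed(123)
toy_vip <- model_parts(
  toy_explainer,
  loss_function = loss_root_mean_square,
  type = "ratio",
  N = 1000,
  B = 10
)

# Default DALEX plot of the dropout loss per feature.
toy_vip_plot <- plot(toy_vip)
toy_vip_plot
```

Setting a seed before the call keeps the sampled rows and permutations reproducible across renders.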

Now is a good time to render, commit, and push. Make sure that you commit and push all changed documents and your Git pane is completely empty before proceeding.

Exercise 3

Evaluate the marginal effect of the most important features on the predicted danceability. Use the top-5 features from the previous exercise and interpret the marginal effect of each feature on the predicted danceability. Provide a substantive written interpretation of the results.

  • Include the ICE curves for each feature.
  • Choose an appropriate visualization type depending on whether the feature is categorical or continuous.

Tip

Review Tidy Modeling with R for guidance on how to better display the results from the partial dependence calculations.
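The sketch below shows one way to compute a partial dependence profile with `DALEX::model_profile()` and draw it together with the individual ICE curves using ggplot2, in the spirit of the custom plot in Tidy Modeling with R. It uses a stand-in `mtcars` model and a single continuous feature (`wt`); all `toy_*` names are hypothetical, and categorical features would need a different geometry.

```r
library(tidymodels)
library(DALEXtra)

# Stand-in model and explainer (NOT the homework objects).
set.seed(123)
toy_train <- mtcars |> select(mpg, disp, hp, wt, qsec, drat)
toy_wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(rand_forest(mode = "regression", trees = 200) |> set_engine("ranger")) |>
  fit(data = toy_train)
toy_explainer <- explain_tidymodels(
  toy_wf,
  data = select(toy_train, -mpg),
  y = toy_train$mpg,
  verbose = FALSE
)

# Partial dependence for one continuous feature.
# N caps how many observations are profiled (all rows are used if fewer exist).
set.seed(123)
toy_pdp <- model_profile(toy_explainer, variables = "wt", N = 100)

# cp_profiles holds one ICE curve per observation (identified by `_ids_`);
# agr_profiles holds the aggregated partial dependence curve.
toy_pdp_plot <- ggplot() +
  geom_line(
    data = as_tibble(toy_pdp$cp_profiles),
    aes(x = wt, y = `_yhat_`, group = `_ids_`),
    alpha = 0.3, color = "gray50"
  ) +
  geom_line(
    data = as_tibble(toy_pdp$agr_profiles),
    aes(x = `_x_`, y = `_yhat_`),
    linewidth = 1.2, color = "midnightblue"
  ) +
  labs(
    title = "Partial dependence and ICE curves (toy example)",
    x = "wt",
    y = "Predicted mpg"
  )
toy_pdp_plot
```

A quicker alternative is `plot(toy_pdp, geom = "profiles")`, which overlays the ICE curves on DALEX's default plot.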

Now is a good time to render, commit, and push. Make sure that you commit and push all changed documents and your Git pane is completely empty before proceeding.

Exercise 4

Explain the predictions for specific songs in the test set using Shapley values. Explain why the random forest model generated its predicted danceability for these songs found in the test set:

Use Shapley values to generate your explanations. Include a graph for each song that includes the actual prediction from the model and the average prediction for the entire test set, along with the average contributions as determined by your Shapley values. Provide a written interpretation for each song.
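As a hedged sketch of the mechanics, the code below computes Shapley-style contributions for a single observation with `DALEX::predict_parts(type = "shap")`, along with that observation's prediction and the average prediction over the explainer's data. It runs on a stand-in `mtcars` model, not the homework songs, and all `toy_*` names are hypothetical.

```r
library(tidymodels)
library(DALEXtra)

# Stand-in model and explainer (NOT the homework objects).
set.seed(123)
toy_train <- mtcars |> select(mpg, disp, hp, wt, qsec, drat)
toy_wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(rand_forest(mode = "regression", trees = 200) |> set_engine("ranger")) |>
  fit(data = toy_train)
toy_explainer <- explain_tidymodels(
  toy_wf,
  data = select(toy_train, -mpg),
  y = toy_train$mpg,
  verbose = FALSE
)

# One observation to explain (predictor columns only).
toy_obs <- toy_train[1, ] |> select(-mpg)

# Shapley values approximated by averaging contributions over
# B random orderings of the features.
set.seed(123)
toy_shap <- predict_parts(
  toy_explainer,
  new_observation = toy_obs,
  type = "shap",
  B = 20
)

# Model prediction for this observation and the baseline (mean prediction
# over the data stored in the explainer).
toy_pred <- predict(toy_explainer, toy_obs)
toy_avg <- mean(predict(toy_explainer, toy_explainer$data))

# Average contribution per feature across the random orderings.
toy_contrib <- aggregate(
  contribution ~ variable,
  data = as.data.frame(toy_shap),
  FUN = mean
)
print(toy_contrib)

# Default DALEX plot; the two numbers above can be added to the
# title or subtitle with ggplot2::labs().
toy_shap_plot <- plot(toy_shap) +
  labs(subtitle = paste0(
    "Prediction: ", round(toy_pred, 2),
    " | Average prediction: ", round(toy_avg, 2)
  ))
toy_shap_plot
```

Note that the baseline depends on which data set the explainer was built with, so the choice of `data` when creating the explainer determines which average prediction is being compared against.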

Now is a good time to render, commit, and push. Make sure that you commit and push all changed documents and your Git pane is completely empty before proceeding.

Exercise 5

Explain the predictions for specific songs in the test set using LIME. Explain why the random forest model generated its predicted danceability for the same songs as in the previous exercise, this time using the LIME algorithm to generate your explanations. Report the top-5 features for each song that contributed to the model’s prediction. Provide a written interpretation for each song.
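One possible route to LIME explanations for a tidymodels workflow is `DALEXtra::predict_surrogate(type = "lime")`, sketched below on a stand-in `mtcars` model (all `toy_*` names are hypothetical). The two method assignments follow the pattern shown in the DALEXtra documentation so that lime can work with a DALEX explainer; the lime package must be installed. The toy example asks for 3 features only because the stand-in model has few predictors.

```r
library(tidymodels)
library(DALEXtra)

# Stand-in model and explainer (NOT the homework objects).
set.seed(123)
toy_train <- mtcars |> select(mpg, disp, hp, wt, qsec, drat)
toy_wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(rand_forest(mode = "regression", trees = 200) |> set_engine("ranger")) |>
  fit(data = toy_train)
toy_explainer <- explain_tidymodels(
  toy_wf,
  data = select(toy_train, -mpg),
  y = toy_train$mpg,
  verbose = FALSE
)

# Let lime dispatch on DALEX explainers (pattern from the DALEXtra docs).
model_type.dalex_explainer <- DALEXtra::model_type.dalex_explainer
predict_model.dalex_explainer <- DALEXtra::predict_model.dalex_explainer

# One observation to explain (predictor columns only).
toy_obs <- toy_train[1, ] |> select(-mpg)

# Fit a local surrogate model around the observation:
#   n_features     -> how many features the local model keeps
#   n_permutations -> how many perturbed samples are generated
set.seed(123)
toy_lime <- predict_surrogate(
  explainer = toy_explainer,
  new_observation = toy_obs,
  n_features = 3,
  n_permutations = 1000,
  type = "lime"
)

# Feature weights from the local model, largest absolute weight first.
toy_weights <- toy_lime[, c("feature", "feature_weight", "feature_desc")]
toy_weights <- toy_weights[order(-abs(toy_weights$feature_weight)), ]
print(toy_weights)

plot(toy_lime)
```

Because LIME relies on random perturbations, setting a seed before each call keeps the explanations stable between renders.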

Additionally, compare the explanations developed using Shapley values and LIME for each song. Are the explanations consistent between the two methods? Which method do you find preferable for explaining the model’s predictions for individual songs and why?


Render, commit, and push one last time.

Make sure that you commit and push all changed documents and your Git pane is completely empty before proceeding.

Wrap up

Submission

  • Go to http://www.gradescope.com and click Log in in the top right corner.
  • Click School Credentials → Cornell University NetID and log in using your NetID credentials.
  • Click on your INFO 3312 course.
  • Click on the assignment, and you’ll be prompted to submit it.
  • Mark all the pages associated with each exercise. Every page of your homework should be associated with at least one question (i.e., should be “checked”).

Grading

  • Exercise 1: 2 points
  • Exercise 2: 12 points
  • Exercise 3: 12 points
  • Exercise 4: 12 points
  • Exercise 5: 12 points
  • Total: 50 points