Optimizing color spaces

Notes
Modified: May 20, 2026

Note: Learning objectives
  • Identify how color can be effectively used in data visualizations
  • Distinguish types of color scales and their appropriate use cases
  • Generate color scales using the {colorspace} package
  • Implement optimal color palettes

Uses of color

Color in data visualization serves distinct purposes. Using color that matches its intended purpose makes charts easier to read; mismatched color usage introduces confusion.

1. Qualitative: distinguish categories

A qualitative (categorical) palette assigns distinct, unordered colors to nominal categories — region, species, country. The colors should be perceptually equidistant: no single color should appear more important than the others.

Figure: Example of a qualitative color palette, showing distinct hues for unordered categories.

For example, a scatterplot of US state population growth encodes US region with a qualitative palette.

2. Sequential: represent ordered values

A sequential palette encodes magnitude using a gradient from low to high. It is appropriate for ordered or continuous numeric variables where zero (or the minimum) is a meaningful anchor.

Figure: Example of a sequential color palette, showing a gradient from low to high values.

Here we use a tile heatmap to visualize the monthly average temperatures across four locations. Each tile’s fill maps to the mean temperature. Since there is a clear ordering of temperatures from low to high, a sequential palette is appropriate.

3. Diverging: represent centered values

A diverging palette is appropriate when data has a meaningful midpoint — typically zero or the mean — and values on both sides of that midpoint are equally important. The palette should grade from one hue through a neutral midpoint to a contrasting hue.

Figure: Example of a diverging color palette, showing two contrasting hues with a neutral midpoint.

Correlation matrices are a classic application: positive and negative correlations require equally visible emphasis on both sides of zero.

Source: Claus O. Wilke, Fundamentals of Data Visualization

4. Highlight: draw attention to selected elements

A highlight palette uses one saturated color for the data you want to emphasize and muted grays for everything else. This does not encode a continuous variable — it encodes membership in a meaningful subset.

Figure: Example of a highlight color palette, showing one vivid accent color against muted background colors.

The athlete example below highlights one sport (track) against four others:

Source: Claus O. Wilke, Fundamentals of Data Visualization

Choosing the right scale type

The most common mistake with color palettes is mismatching the palette type to the variable type. The rule is straightforward:

| Variable type | Palette type | Example |
|---|---|---|
| Nominal (unordered categories) | Qualitative | Region, species, country |
| Ordered / continuous numeric | Sequential | Temperature, income, proportion |
| Diverging numeric (centered) | Diverging | Correlation, gain/loss, z-score |
| Emphasis on a subset | Highlight | One country vs. all others |
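
As a rough sketch of how the four palette types in the table differ, the snippet below generates one palette of each kind with {colorspace} and draws them as swatches. The palette names ("Dark 3", "Blues 3", "Blue-Red 3") are illustrative choices from the package's built-in library, and the highlight vector reuses the red and gray hex codes from the highlighting example later in these notes.

```r
library(colorspace)

# One palette per purpose, five colors each
pal_qualitative <- qualitative_hcl(5, palette = "Dark 3")    # unordered categories
pal_sequential  <- sequential_hcl(5, palette = "Blues 3")    # low-to-high magnitude
pal_diverging   <- diverging_hcl(5, palette = "Blue-Red 3")  # two hues around a neutral midpoint

# Highlight palettes are assembled by hand: one accent color plus muted grays
pal_highlight <- c("#BD3828", rep("#808080", 4))

# Draw all four palettes as labeled rows of swatches
swatchplot(
  "Qualitative" = pal_qualitative,
  "Sequential"  = pal_sequential,
  "Diverging"   = pal_diverging,
  "Highlight"   = pal_highlight
)
```

Each generator returns a plain character vector of hex codes, so the result can be passed to any function that accepts colors.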

Qualitative scale for nominal variables

Comparing the same nominal variable — country — with sequential (wrong) vs. qualitative (right) palettes makes the issue concrete:

Sequential scale for ordered variables

When the same countries are ranked by total golds — an ordered comparison — sequential shading conveys magnitude:

Quantitative \(\neq\) continuous

A common misconception is that every numeric variable should be encoded with a sequential palette and every categorical variable with a qualitative palette. This is not the case: categorical variables can be ordered. Likert survey responses (Strongly agree → Strongly disagree) are one example: the responses have a direction, so they call for a directional palette (sequential or diverging). A qualitative palette treats the responses as unordered, while a diverging palette anchors on the neutral response and shows the opposition between the favorable and unfavorable ends.
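
A minimal sketch of the Likert case, assuming a five-level ordered response. The survey counts are invented purely for illustration, and "Blue-Red 3" is just one of several diverging palettes that would work here.

```r
library(ggplot2)
library(colorspace)

# Hypothetical survey counts (made up for this example)
likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)
survey <- data.frame(
  response = factor(likert_levels, levels = likert_levels),
  n = c(12, 25, 30, 48, 20)
)

# The factor level order determines how colors are assigned along the
# diverging palette, so the middle level ("Neutral") gets the midpoint color
p_likert <- ggplot(survey, aes(x = n, y = "Q1", fill = response)) +
  geom_col() +
  scale_fill_discrete_diverging(palette = "Blue-Red 3") +
  labs(x = "Respondents", y = NULL, fill = NULL)

p_likert
```

Swapping in scale_fill_discrete_qualitative() on the same plot gives the unordered version for comparison.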

Sequential for continuous spatial data

For choropleth maps encoding a continuous proportion (e.g., percent foreign-born), a sequential palette conveys magnitude clearly. Using a qualitative palette on the same data implies the proportions are unordered categories — which they are not:

Binning continuous variables

For county-level spatial data, a continuous sequential gradient can be hard to read: adjacent low values appear almost identical. Binning the continuous variable into a small number of discrete steps improves readability:
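
One way to sketch binning is with a binned {colorspace} scale. The grid below is random stand-in data, not county data, and the palette and the requested number of breaks are arbitrary choices. ggplot2 treats n.breaks as a request and may adjust it to produce round break values.

```r
library(ggplot2)
library(colorspace)

# Hypothetical 10 x 10 grid of percentages standing in for spatial units
set.seed(1)
grid <- expand.grid(x = 1:10, y = 1:10)
grid$pct <- round(runif(nrow(grid), min = 0, max = 40), 1)

# The binned scale cuts the continuous variable into a few discrete steps
p_binned <- ggplot(grid, aes(x = x, y = y, fill = pct)) +
  geom_tile() +
  scale_fill_binned_sequential(palette = "Blues 3", n.breaks = 5) +
  labs(fill = "Percent")

p_binned
```

Replacing the scale with scale_fill_continuous_sequential() on the same data shows the unbinned gradient for comparison.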

Scale functions in R

Choosing an optimal color palette is challenging in both design and implementation. R offers many ways to generate color scales, but the functions are hard to navigate, and many of the palettes they produce are not optimal.

Built-in {ggplot2} functions

{ggplot2} ships with many scale_* functions for color and fill:

| Function | Aesthetic | Data type | Palette type |
|---|---|---|---|
| scale_color_hue() | color | discrete | qualitative |
| scale_fill_hue() | fill | discrete | qualitative |
| scale_color_gradient() | color | continuous | sequential |
| scale_color_gradient2() | color | continuous | diverging |
| scale_fill_viridis_c() | fill | continuous | sequential |
| scale_fill_viridis_d() | fill | discrete | sequential |
| scale_color_brewer() | color | discrete | qual / div / seq |
| scale_fill_distiller() | fill | continuous | qual / div / seq |

The naming is inconsistent: some use color, some use fill; some use c for continuous and d for discrete, others do not; some accept a palette argument and some do not. Furthermore, many of the default palettes produced by {ggplot2} are inaccessible. In practice, it is easy to reach for the wrong function.

{colorspace} for consistent, optimal palettes

The {colorspace} package provides a consistent naming scheme:

scale_<aesthetic>_<datatype>_<colorscale>()
  • <aesthetic>: color or fill
  • <datatype>: discrete, continuous, or binned
  • <colorscale>: qualitative, sequential, diverging, or divergingx

| Function | Aesthetic | Data type | Palette type |
|---|---|---|---|
| scale_color_discrete_qualitative() | color | discrete | qualitative |
| scale_fill_continuous_sequential() | fill | continuous | sequential |
| scale_fill_binned_sequential() | fill | binned | sequential |
| scale_fill_binned_diverging() | fill | binned | diverging |
| scale_color_continuous_divergingx() | color | continuous | diverging extended |

The naming is predictable: once you know the aesthetic, data type, and palette type you need, the function name follows directly. Additionally, most of the palettes available in the package are optimized for perceptual uniformity and accessibility, and the package provides a large library of palettes to choose from.
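
To illustrate how the function name follows from the three choices, here is a short sketch using datasets that ship with R and {ggplot2} (iris and faithfuld). The palette names are illustrative picks, not recommendations from the source.

```r
library(ggplot2)
library(colorspace)

# color + discrete + qualitative -> scale_color_discrete_qualitative()
p_species <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  scale_color_discrete_qualitative(palette = "Dark 3")

# fill + continuous + sequential -> scale_fill_continuous_sequential()
p_density <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster() +
  scale_fill_continuous_sequential(palette = "Viridis")

p_species
p_density
```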

Temperature heatmap examples

The temperature tile chart makes scale progression easy to compare:

ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_cartesian(ratio = 1, expand = FALSE) +
  labs(fill = "Temp (°F)")

Default fill — arbitrary blue gradient that does not encode temperature intuition.

ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_cartesian(ratio = 1, expand = FALSE) +
  scale_fill_viridis_c() +
  labs(fill = "Temp (°F)")

scale_fill_viridis_c() uses a perceptually uniform sequential palette that reads well in grayscale.

ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_cartesian(ratio = 1, expand = FALSE) +
  scale_fill_viridis_c(option = "B", begin = 0.15) +
  labs(fill = "Temp (°F)")

option = "B" selects the Inferno palette. begin = 0.15 avoids the very dark end, which can be hard to distinguish from black.

ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_cartesian(ratio = 1, expand = FALSE) +
  scale_fill_continuous_sequential(palette = "YlGnBu") +
  labs(fill = "Temp (°F)")

scale_fill_continuous_sequential() from {colorspace} gives access to a much larger palette library. "YlGnBu" is a yellow-green-blue gradient modeled on the ColorBrewer palette of the same name.

Browsing available palettes

{colorspace} provides hcl_palettes() to browse all available palettes by type. Calling it with plot = TRUE renders a visual overview.
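
A brief sketch of browsing the palette library. The type filter shown here ("qualitative") is just one example, and the object names are arbitrary.

```r
library(colorspace)

# List every built-in HCL palette (returns a data frame, one row per palette)
all_pals <- hcl_palettes()

# Restrict the listing to a single palette type
qual_pals <- hcl_palettes(type = "qualitative")
rownames(qual_pals)

# Render a visual overview of the same subset
hcl_palettes(type = "qualitative", plot = TRUE)
```

The names returned here are the values accepted by the palette argument of the {colorspace} scale functions.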

Tip: Use the interactive palette viewer

You can also use the interactive palette viewer at http://hclwizard.org:3000/hclwizard/ to explore palettes for use in R.

For extended diverging palettes (useful when the scale does not have a symmetric range around zero):
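
The extended diverging palettes can be browsed and applied along these lines. This is a sketch on random stand-in data with a deliberately asymmetric range, and "Geyser" is simply one of the available divergingx palettes.

```r
library(ggplot2)
library(colorspace)

# Visual overview of the extended diverging palettes
divergingx_palettes(plot = TRUE)

# Generate colors from one of them directly
geyser <- divergingx_hcl(7, palette = "Geyser")

# Hypothetical data whose range (-10 to 40) is not symmetric around zero
set.seed(2)
grid <- expand.grid(x = 1:8, y = 1:8)
grid$change <- runif(nrow(grid), min = -10, max = 40)

p_divx <- ggplot(grid, aes(x = x, y = y, fill = change)) +
  geom_tile() +
  scale_fill_continuous_divergingx(palette = "Geyser") +
  labs(fill = "Change")

p_divx
```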

When to not use {colorspace}

There are a few situations when you probably do not want to use {colorspace} to set your palette.

Highlighting

Highlighting a single category against a muted background is not a built-in palette type in {colorspace}. Instead, you can use scale_color_manual() or scale_fill_manual() to set one color to a saturated hue and the rest to muted grays.

Consider the track and field athletes example again:

# Define a vector of colors: one saturated red and four muted grays
colors <- c("#BD3828", rep("#808080", 4))
# scales::alpha() (re-exported by ggplot2) sets the transparency of each color
# directly, without a separate alpha aesthetic in the plot; the highlight
# color is more opaque than the muted colors
fills <- c(alpha(colors[1], .815), alpha(colors[2:5], .5))

ggplot(
  data = male_Aus,
  mapping = aes(
    x = height,
    y = pcBfat,
    shape = sport,
    color = sport,
    fill = sport
  )
) +
  geom_point(size = 3) +
  # Shapes 21 to 25 take separate colors for the outline (color) and the
  # interior (fill)
  scale_shape_manual(values = 21:25) +
  labs(
    x = "height (cm)",
    y = "% body fat"
  ) +
  # Apply the custom outline and fill colors
  scale_color_manual(values = colors) +
  scale_fill_manual(values = fills)

Okabe-Ito colorblind-friendly palette

The Okabe-Ito palette is designed to remain distinguishable for viewers with color vision deficiency and is widely used in accessibility-conscious data visualization. However, it is not available in {colorspace}. Use scale_color_colorblind() from the {ggthemes} package, or pass the hex codes directly to scale_color_manual() or scale_fill_manual().

Table: Okabe-Ito colorblind-friendly palette

| Color name | Hex code | R, G, B (0-255) |
|---|---|---|
| Black | #000000 | 0, 0, 0 |
| Orange | #E69F00 | 230, 159, 0 |
| Sky Blue | #56B4E9 | 86, 180, 233 |
| Bluish Green | #009E73 | 0, 158, 115 |
| Yellow | #F0E442 | 240, 228, 66 |
| Blue | #0072B2 | 0, 114, 178 |
| Vermilion | #D55E00 | 213, 94, 0 |
| Reddish Purple | #CC79A7 | 204, 121, 167 |

# use scale_color_colorblind() from {ggthemes}
p_pop +
  scale_color_colorblind()

scale_color_manual() gives full control. Here I avoid the first color (black) because it draws more attention than the others and I want to treat all regions equally.

# use scale_color_manual() with Okabe-Ito hex codes
p_pop +
  scale_color_manual(
    values = c(
      West = "#E69F00",
      South = "#56B4E9",
      Midwest = "#009E73",
      Northeast = "#F0E442"
    )
  )
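
As an alternative to typing the hex codes by hand, base R (version 4.0.0 or later) exposes the same palette through palette.colors(). The sketch below is self-contained, so it uses the built-in state.x77 and state.region datasets rather than the p_pop plot above. The choice of variables is illustrative only.

```r
library(ggplot2)

# Okabe-Ito colors from base R; the first entry is black
okabe_ito <- unname(palette.colors(palette = "Okabe-Ito"))

# Built-in US state data: one row per state, with census region
states <- data.frame(
  region   = state.region,
  income   = state.x77[, "Income"],
  life_exp = state.x77[, "Life Exp"]
)

# Skip black (position 1) so that no region stands out more than the others
p_states <- ggplot(states, aes(x = income, y = life_exp, color = region)) +
  geom_point(size = 2) +
  scale_color_manual(values = okabe_ito[2:5]) +
  labs(x = "Per capita income", y = "Life expectancy", color = "Region")

p_states
```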

Summary

  • Color serves four purposes: qualitative (distinguish categories), sequential (encode magnitude), diverging (encode deviation from a midpoint), and highlight (draw attention to a subset)
  • Match the palette type to the variable type
  • The {colorspace} scale_<aesthetic>_<datatype>_<colorscale>() naming convention makes the right function easy to find
  • Consider a binned fill instead of a continuous one when nearby values are hard to tell apart on a continuous gradient, as in county-level maps

Acknowledgements

Material derived in part from Fundamentals of Data Visualization by Claus O. Wilke and STA 313: Advanced Data Visualization.