Colours and colour-blindness

About 5% of men and a small percentage of women are 'colour-blind', meaning that they cannot distinguish certain colours, usually red from green but more rarely blue from yellow. When we use colours in plots in R, we should make them intelligible to colour-blind viewers too.

I am fortunate in having two colour-blind colleagues who will quickly point out any plots I produce which aren't clear to them.

We talk about colour-blind-friendly plots in our workshops, and here are some of the tips we pass on. We start with the physics and biology of colour vision - and the consequences for design - but I'll skip that here, as plenty of material is available on the web. I particularly like the summary by Okabe and Ito.

Basic guidelines

  • Use a colour-blind-friendly palette
  • Make points larger, lines thicker
  • Provide additional cues: shapes, dashed or dotted lines
  • Don't refer to the data only by colour
  • Provide labels directly on the plot rather than in a legend or key.

Colour-blind-friendly palette

R has a default palette of 8 colours, which is used when you specify the colour by number. For example:

palette("default")  # Use the default palette
barplot(rep(1,10), yaxt="n", col=1:10, names.arg=1:10)

Note that if you specify a colour beyond 8, the values are recycled: 9 -> 1, 10 -> 2, etc.
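The recycling rule can be sketched as simple wrap-around arithmetic. This is only an illustration, assuming the palette has 8 entries as described above; the variable names (n, idx, recycled) are mine.

```r
palette("default")               # reset to R's default palette
n <- length(palette())           # number of colours in the palette
idx <- 1:10                      # colour numbers used in the barplot
recycled <- (idx - 1) %% n + 1   # wrap-around: 9 -> 1, 10 -> 2

# Show which palette entry each colour number ends up using
print(data.frame(index = idx, slot = recycled, colour = palette()[recycled]))
```

So bars 9 and 10 are drawn in the same colours as bars 1 and 2; if you need more than 8 distinct categories, numbers alone will not give you distinct colours.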

We can see what this looks like to red-green colour-blind people with functions in the dichromat package:

library(dichromat)
palette(dichromat(palette()))
barplot(rep(1,10), yaxt="n", col=1:10, names.arg=1:10)

Colours 2 and 3, red and green for most of us, are almost indistinguishable for 5% of men. And if in the text or in my talk I say "The red bar represents xyz...", they will not know which bar I am talking about. So: don't refer to things only by colour.
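The call to dichromat() above relies on the function's default type, "deutan" (green-deficient vision). The function also accepts "protan" (red-deficient) and "tritan" (blue-yellow deficient). Here is a rough sketch comparing how colours 2 and 3 are converted under each type; it assumes the dichromat package is installed, and the names cols, types and sim are just for illustration.

```r
library(dichromat)   # assumes the dichromat package is installed

palette("default")
cols <- palette()[2:3]   # colours 2 and 3 of the default palette

# Simulate each type of colour-blindness for the two colours
types <- c("deutan", "protan", "tritan")
sim <- sapply(types, function(tp) dichromat(cols, type = tp))
rownames(sim) <- cols
print(sim)               # one column of simulated colours per type
```

Passing the columns of sim to barplot() as col gives a quick visual check of which pairs stay distinguishable.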

Here is the palette of colour-blind friendly colours proposed by Okabe and Ito:

.cbColors <- c(black      = "#000000",
               orange     = "#E69F00",
               sky        = "#56B4E9",
               green      = "#009E73",
               yellow     = "#F0E442",
               blue       = "#0072B2",
               vermillion = "#D55E00",
               purple     = "#CC79A7")
grDevices::palette(.cbColors)

I have this in my .Rprofile file (hence the need to specify the grDevices package), so that it's always loaded when I start R. This is what it looks like to normally-sighted people:

palette(.cbColors)  # Use CB-friendly palette
barplot(rep(1,10), yaxt="n", col=1:10, names.arg=1:10)

And as viewed by my colour-blind friend:

library(dichromat)
palette(dichromat(palette()))
barplot(rep(1,10), yaxt="n", col=1:10, names.arg=1:10)

Of course this assumes that you are specifying colours in your plotting code by numbers. If you use col='red' and col='green', you will get exactly the colours you ask for, whatever palette is set.
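If you prefer to ask for colours by name, one option is to index the named palette vector rather than use R's built-in colour names. A minimal sketch, repeating the vector so that the snippet stands alone; the name cb and the bar heights are made up for illustration.

```r
# Okabe-Ito colours, as listed above
cb <- c(black = "#000000", orange = "#E69F00", sky = "#56B4E9",
        green = "#009E73", yellow = "#F0E442", blue = "#0072B2",
        vermillion = "#D55E00", purple = "#CC79A7")

# cb["green"] is the palette's green (#009E73), not R's built-in "green"
barplot(c(3, 5, 2), names.arg = c("A", "B", "C"),
        col = cb[c("orange", "sky", "green")])
```

This keeps the code readable while still drawing from the colour-blind-friendly set.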

Point size and shape, line width and style

It's easier to tell the colour of a wide line or a large point, even with normal colour vision, so don't settle for default sizes.

In the second plot above, we have provided additional cues to identify the lines and points: lines can be solid, dotted, dashed, etc and points can have symbols of different shapes. These will be distinguishable even if the colours are not clear.

These additional cues are invaluable when we need to refer to items on the plot: we can talk about "the dashed line" or "the blue triangles" and everyone will know what we mean.
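As a rough sketch of combining these cues in base graphics, the following draws three series that differ in colour, line type and point symbol at the same time, with thicker lines and larger points than the defaults. The data and the series names (a, b, c) are invented for the example.

```r
# Three colours from the Okabe-Ito palette
cb <- c(orange = "#E69F00", sky = "#56B4E9", green = "#009E73")

# Made-up data: three straight lines
x <- 0:10
y <- cbind(a = x, b = 0.5 * x + 2, c = 8 - 0.6 * x)

matplot(x, y, type = "b",
        col = cb,              # colour
        lty = 1:3,             # solid, dashed, dotted
        pch = c(16, 17, 15),   # circle, triangle, square
        lwd = 3, cex = 1.5,    # thicker lines, larger points
        xlab = "x", ylab = "y")
```

Each series can now be named in three ways, so "the dashed line with triangles" is unambiguous even when the colours are not.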

Colour-matching vs direct labelling

Colour matching is especially hard for colour-blind people, even if the colours can be distinguished - and it's not easy for the rest of us. Colours look different depending on the surroundings: see the optical illusions here. And looking back and forth between the plot and the legend interrupts the flow of the message. For many reasons, directly labelling features on the plot makes the viewer's life much easier, even if it takes more effort on the part of the author.
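Direct labelling need not be much work in base graphics. Here is a minimal sketch that writes each series name next to the end of its line instead of using a legend; the data and the labels "Control" and "Treatment" are hypothetical, and the extra room on the x-axis is simply a guess at what the labels need.

```r
cb <- c(Control = "#0072B2", Treatment = "#D55E00")

# Made-up data: two straight lines
x <- 0:10
y <- cbind(Control = 0.5 * x + 2, Treatment = x)

# Extend xlim to leave space for the labels on the right
matplot(x, y, type = "l", col = cb, lty = c(1, 2), lwd = 3,
        xlim = c(0, 13), xlab = "x", ylab = "y")

# Put each label just to the right of the last point of its line
text(x = max(x), y = y[nrow(y), ], labels = colnames(y),
     pos = 4, col = cb)
```

The label sits beside the line it names, so nobody has to match colours between the plot and a key.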

Updated 5 December 2018 by Mike Meredith