Colour Maps for Satellite Imagery Derivatives

Introduction

Whole textbooks are written on data visualisation. I’ve not read much of any of them, but I’ve spent quite a bit of time thinking about and experimenting with how satellite imagery derivatives such as NDVI are best visualised. There is no one-size-fits-all answer. The aim is to extract as much information as possible with as little effort and time as possible. As an agronomist, you are never looking at just one field; you are looking at multiple fields across the entire season, and often comparing against past seasons. This article offers some ideas I’ve developed on the subject, plus a few tips and gotchas along the way. It’s a bit opinionated and not a textbook chapter.

Background

There are a few ways to process and display satellite imagery, including natural (true) colour, alternative band combinations, spectral indices (which include vegetation indices) and more. In this article I will focus on how spectral indices are visualised.

First, what is a spectral index? A spectral index is a formula that takes individual bands as inputs and generates a derivative dataset intended to measure or depict a certain physical characteristic of the environment on the ground. For example, NDVI feeds two bands, red and near-infrared reflectance, into a formula that outputs a single layer where each pixel has a value between -1 and 1. In a perfect world, dense, healthy vegetation would have a value close to 1, and ground with no vegetation would end up around 0 or a bit below. Everything else falls somewhere in between.
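The NDVI formula itself is (NIR - Red) / (NIR + Red). Here is a minimal sketch in Python for a single pixel, assuming reflectance values between 0 and 1. The sample numbers are made up for illustration, and returning 0 when both bands are zero is an assumption of this sketch, not a standard.

```python
def ndvi(red, nir):
    """Return NDVI for one pixel from red and near-infrared reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: treat a pixel with no signal as 0 instead of dividing by zero.
        return 0.0
    return (nir - red) / total


# Illustrative reflectance values (made up, not measured data).
dense_crop = ndvi(red=0.05, nir=0.45)  # strong NIR, weak red: high NDVI
bare_soil = ndvi(red=0.25, nir=0.30)   # similar red and NIR: NDVI near 0

print(round(dense_crop, 2), round(bare_soil, 2))
```

On a real raster the same arithmetic would be applied to every pixel at once, typically with an array library rather than one value at a time.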

If we just had a map of a paddock divided into a grid of 10 m by 10 m squares, with a number representing the NDVI value in each square, it would be time-consuming to analyse and difficult to grasp the overall trends across the paddock. This is where data visualisation comes in. We can take a colour scale (also known as a colormap) and use it to assign a colour to each square of the grid based on its NDVI value. There are many colour scales, and many ways to apply a colour scale to the data.
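To make the idea concrete, here is a rough sketch of how a colour scale can be applied to a grid: each value is placed between a minimum and maximum, then blended between neighbouring colour stops. The three-stop red, yellow, green ramp, the function names and the grid values are all hypothetical, chosen only to keep the example short.

```python
def lerp_colour(c0, c1, t):
    """Linearly blend two (r, g, b) colours; t=0 gives c0, t=1 gives c1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def value_to_colour(value, vmin, vmax, stops):
    """Map a value onto evenly spaced colour stops between vmin and vmax."""
    t = (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)          # clamp values that fall outside the scale
    pos = t * (len(stops) - 1)         # position along the list of stops
    i = min(int(pos), len(stops) - 2)  # index of the lower of the two stops
    return lerp_colour(stops[i], stops[i + 1], pos - i)


# Hypothetical three-stop ramp: red -> pale yellow -> green.
STOPS = [(215, 48, 39), (255, 255, 191), (26, 152, 80)]

# Made-up NDVI values for a 2x2 grid of squares.
grid = [[0.12, 0.35], [0.58, 0.81]]
coloured = [[value_to_colour(v, 0.0, 1.0, STOPS) for v in row] for row in grid]
```

The choice of `vmin` and `vmax` matters as much as the colours themselves, because it decides which values land on which part of the ramp.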

Simple color scales
MS Excel Conditional Formatting applies color scales to cells in the same way they are applied to vegetation index pixels.

Ask these questions

The person interpreting the map should understand, at least to some degree, how colour has been applied to the numerical values. Data can be misread when a colour scale is poorly applied, or when the user does not understand how to interpret it. Whenever you look at a spectral index visualisation, consider asking these questions:

  • What is the minimum & maximum value in the dataset?
  • What is the minimum & maximum value on the colour scale?
  • Is the colour scale applied evenly across the dataset?
  • Does the colour scale have enough colour changes to represent the variability in the data?
  • Conversely, does the colour scale misrepresent the data with too many colour changes on a dataset with a very tight range between min & max?
  • Will this colour scale transfer directly to the other dates or paddocks you are comparing it to, or are the min & max of the colour scale based on the statistics of the dataset?

…there are many other considerations, but you get the idea – think this through.
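Several of these questions can be answered with a few numbers. The sketch below is one hypothetical way to do it: it reports the dataset's min and max, the share of pixels that fall outside the colour scale, and how much of the scale the data actually occupies. The function name, the report keys and the sample values are made up for illustration.

```python
def scale_check(values, scale_min, scale_max):
    """Summarise how a dataset sits within a colour scale's range."""
    data_min, data_max = min(values), max(values)
    n = len(values)
    # Width of the part of the scale that the data actually covers.
    overlap = min(data_max, scale_max) - max(data_min, scale_min)
    return {
        "data_min": data_min,
        "data_max": data_max,
        "below_scale": sum(v < scale_min for v in values) / n,
        "above_scale": sum(v > scale_max for v in values) / n,
        "scale_used": max(overlap, 0.0) / (scale_max - scale_min),
    }


# Made-up NDVI values with a very tight range.
tight = [0.42, 0.45, 0.47, 0.50, 0.52, 0.55]
report = scale_check(tight, scale_min=0.0, scale_max=1.0)
print(report)
```

Here the data uses only a small slice of a 0 to 1 scale, so most of the colours would never appear on the map.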

Dynamic vs Fixed

The last question above raises the idea of transferring a colour scale across datasets so they can be directly compared. This is called a fixed colour scale. A fixed colour scale always has the same minimum and maximum. The advantage is that you can directly compare different dates or fields side by side, and the same colour will always represent the same value. The brain often associates colours with specific meanings, so having those meanings change makes interpretation more time-consuming.

A dynamic colour scale, on the other hand, is set by the statistics in the dataset. For example, the bottom and top 2% of values may be clipped or excluded as outliers. The min and max values are then identified and applied to the colour scale. The advantage of a dynamic colour scale is that it can extract more variability from the data by adjusting to the dataset's range.
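As a rough sketch of the difference, the code below derives dynamic limits by clipping the bottom and top 2% of values, then compares where the same NDVI value lands on a dynamic scale versus a fixed one. The fixed range of 0 to 1, the linear-interpolation percentile and the sample data are assumptions made for illustration.

```python
def percentile(sorted_vals, p):
    """Percentile (0-100) of pre-sorted values, using linear interpolation."""
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def dynamic_limits(values, clip_percent=2):
    """Scale limits taken from the data after clipping both tails."""
    s = sorted(values)
    return percentile(s, clip_percent), percentile(s, 100 - clip_percent)


def normalise(value, vmin, vmax):
    """Position of a value along the colour scale, clamped to 0..1."""
    t = (value - vmin) / (vmax - vmin)
    return min(max(t, 0.0), 1.0)


FIXED_LIMITS = (0.0, 1.0)  # assumed fixed range, identical for every dataset

# Made-up early-season dataset: 101 values spread from 0.10 to 0.20.
early = [0.10 + 0.001 * i for i in range(101)]
dyn_min, dyn_max = dynamic_limits(early)

on_dynamic = normalise(0.15, dyn_min, dyn_max)  # middle of the colour scale
on_fixed = normalise(0.15, *FIXED_LIMITS)       # near the bottom of the scale
```

The same pixel value sits in the middle of the dynamic scale but near the bottom of the fixed one, which is the trade-off between showing variability and staying comparable.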

My preferred colour scale

My favourite colour scale is called Turbo. You can read about why it works well on the Google AI Blog: Turbo, An Improved Rainbow Colormap for Visualization. As is, this is my go-to when applying a dynamic colour scale. When I need a fixed colour scale, I take the Turbo colormap and add a high-contrast tip at either end, a variant called Turbo Tips.

This approach attempts to get the best of both worlds:

  • A fixed colour scale, so comparing different dates and paddocks is more logical.
  • The ability to visualise changes throughout the growth cycle, from no biomass to high biomass, with a high-contrast set of colours.

The tips make emerging crops easy to detect at the lower end and signal saturation at the upper end.
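One way to picture a fixed scale with tips is as three bands: a low tip, the main ramp, and a high tip. The sketch below only decides which band a value belongs to and, for the ramp, how far along it sits. The thresholds of 0.2 and 0.9, the function name and the idea that each tip is a single band are all assumptions for illustration; the actual Turbo Tips breakpoints and colours are not specified here.

```python
RAMP_MIN = 0.2  # hypothetical NDVI value where the low tip ends
RAMP_MAX = 0.9  # hypothetical NDVI value where the high tip begins


def tips_band(value, ramp_min=RAMP_MIN, ramp_max=RAMP_MAX):
    """Return (band, position) for a value on a fixed scale with tips.

    band is "low_tip", "ramp" or "high_tip"; position is 0..1 along the
    main ramp, or None when the value falls in a tip.
    """
    if value < ramp_min:
        return ("low_tip", None)
    if value > ramp_max:
        return ("high_tip", None)
    position = (value - ramp_min) / (ramp_max - ramp_min)
    return ("ramp", position)


# Made-up values: just emerging, mid-season, and saturating.
for v in (0.10, 0.55, 0.95):
    print(v, tips_band(v))
```

Because the thresholds never change, a pixel that moves out of the low tip between two dates reflects a real change in the index rather than a rescaled legend. Plotting libraries often offer a similar mechanism; Matplotlib colormaps, for example, have `set_under` and `set_over` for out-of-range colours.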

Visuals

These explanations are best understood through visualisation. Look closely at the colour map legend in the Turbo Tips column.

The 17 May capture is particularly interesting as the eastern side of the paddock begins to show signs of wheat emerging.

Comparison of colour scales across dates
Wheat paddock planted 9 May 2022. Three dates, with three ways to apply a colour scale: dynamic, fixed, and fixed with tips.

As you can see from the histograms below, the data distributions differ considerably across the three dates, which makes it challenging to apply a single fixed colour scale to all three.

Histogram data distribution
Histogram. Band 1: 2022-05-17, Band 2: 2022-06-08, Band 3: 2022-08-07
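Histograms like these are only comparable when every date is counted into the same bins. A minimal sketch, assuming a shared range of 0 to 1 split into five bins; the two small datasets are made up and are not the values from the paddock above.

```python
def histogram(values, vmin=0.0, vmax=1.0, bins=5):
    """Count values into equal-width bins shared by every dataset."""
    counts = [0] * bins
    width = (vmax - vmin) / bins
    for v in values:
        if vmin <= v <= vmax:
            # Values equal to vmax go into the last bin.
            i = min(int((v - vmin) / width), bins - 1)
            counts[i] += 1
    return counts


# Made-up NDVI samples for an early and a late capture.
early = [0.12, 0.15, 0.18, 0.22, 0.25]
late = [0.71, 0.78, 0.82, 0.85, 0.88]

early_counts = histogram(early)
late_counts = histogram(late)
print(early_counts, late_counts)
```

When two dates barely share any bins, as in this toy example, a single fixed scale has to stretch across both, and each date on its own uses only a few of its colours.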

Fixed colour scale and data quality

A fixed colour scale applied over time is great for tracking progress and making comparisons, but it only works well with robust data. Thankfully, we now get good-quality, atmospherically corrected data for Landsat 8/9 and Sentinel-2. It's generally safe to compare an image from 10 days ago directly with today's image. You can see or measure physical changes from the increases or decreases in the value of the spectral index.

Unfortunately, this does not apply equally across all remotely sensed imagery. It's essential to understand whether the time-series imagery has been ground-calibrated or processed to be analysis-ready. Drone imagery and even aerial imagery can be particularly problematic if left unchecked. Additionally, Landsat 8/9 and Sentinel-2 data are distributed at different processing levels. For reliable comparisons, use the atmospherically corrected Level-2 data.

Conclusion

There is much more to discuss on this topic, but the basic idea is that data visualisation can be messy and opinionated. However, as we move towards more frequent imagery, being able to visualise it in a way that allows quick and accurate interpretation is important.