The Heat Map Chart

What is a Heat Map Chart?

A heat map chart, closely related to the map chart, visualizes data through color variations in a tabular format. Two variables of interest define the x and y axes, and their values become the columns and rows of a table. Each cell, or box, in that table is colored or shaded according to a third variable.

The boxes within the table are either different colors to represent categorical data, or different shades of the same color to represent numerical data. The data contained within a cell is based on the relationship between the two variables in the connecting row and column. Heat maps are simply cross tables or spreadsheets that use colors instead of numbers.
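The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular charting tool works: the records, the function names `build_grid` and `to_shades`, and the choice of min-max scaling are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (row variable, column variable, numeric value).
# The labels and numbers are made up for illustration.
records = [
    ("Jan", "A", 2), ("Jan", "B", 8),
    ("Feb", "A", 5), ("Feb", "B", 0),
]

def build_grid(records):
    """Pivot (row, column, value) records into a nested dict: grid[row][col]."""
    grid = defaultdict(dict)
    for row, col, value in records:
        grid[row][col] = value
    return dict(grid)

def to_shades(grid):
    """Map each value to a shade in [0, 1] using min-max scaling (0 = lightest)."""
    values = [v for cols in grid.values() for v in cols.values()]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return {
        row: {col: (v - lo) / span for col, v in cols.items()}
        for row, cols in grid.items()
    }

grid = build_grid(records)
shades = to_shades(grid)
print(shades["Jan"]["B"])  # darkest cell: the largest value in the grid
```

A charting tool would then translate each shade into an actual color. For categorical data, the scaling step would be replaced by a lookup from category to color.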

When to Use a Heat Map Chart

Primarily, heat map charts should be used to display a generalized view of multi-dimensional data. This is especially true for larger quantities of data, because colors are often easier to tell apart and interpret than raw numbers. As mentioned previously, the data can be either categorical or numeric; the only difference is how the chart is colored.

While other data visualizations require some basic knowledge to interpret, heat maps are largely self-explanatory: the darker the shade, the greater the quantity (the higher the value, the tighter the dispersion, and so on). This is why heat map charts are so effective at drawing attention to trends and showing variance within variables. The use of color and shading makes it easy to recognize patterns and to detect similarities and relationships between variables.

Let’s take a look at an example where the heat map chart is a great choice of visualization:

Chart made using Chartio

In this example, we’re interested in the number of UFO sightings on the East Coast of the United States in 2018. The columns represent the East Coast states, the rows represent the months of 2018, and the shade of blue represents the number of UFO sightings. Notice the shade scale in the top right corner of the chart: it shows the highest and lowest number of sightings and indicates the relative values of the in-between colors.

From the heat map chart, it’s easy to see that Florida consistently has the highest UFO sighting totals among the states for each month. The highest number of sightings was in Florida in August 2018; we know this because that box has the darkest shade of blue on the whole chart. Other high sighting counts, indicated by dark shading, include Massachusetts in April and New York from January through April. The sightings in New York for those four months form a dark cluster at the top of the chart that draws attention at first glance.

It’s also easy to see which states and months had few, or no, UFO sightings based on the lightly shaded and white boxes. Though we don’t know the exact number of UFO sightings for many of the states and months, the heat map chart gives us a general idea and highlights key points of the data.

When NOT to Use a Heat Map

Determining when to use a heat map chart is usually simple, but there are some areas where a heat map chart is not the ideal visualization choice. First, heat map charts should not be used to look at individual data values. Instead, they rely on color to communicate generalities of data values, and it’s very difficult to extract individual values from the chart unless a value is included in each box or the chart legend is very detailed.
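To make the point about individual values concrete, here is a small text-only sketch that renders the same grid with and without a value in each box. Everything in it is an assumption for illustration: the data is made up, and the five-character `SHADES` palette and the `render` function are hypothetical stand-ins for a real color scale.

```python
# Hypothetical counts, made up for illustration: rows are months, columns are groups.
grid = {
    "Jan": {"A": 2, "B": 8},
    "Feb": {"A": 5, "B": 0},
}
SHADES = ".:+*#"  # lightest to darkest; an arbitrary five-step text palette

def render(grid, annotate=False):
    """Return one text line per row, with each cell shaded by its value."""
    values = [v for cols in grid.values() for v in cols.values()]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    lines = []
    for row, cols in grid.items():
        cells = []
        for value in cols.values():
            idx = int((value - lo) / span * (len(SHADES) - 1))
            cell = SHADES[idx] * 2
            if annotate:
                cell += f"{value:>3}"  # write the exact value next to the shade
            cells.append(cell)
        lines.append(f"{row} " + " ".join(cells))
    return lines

print("\n".join(render(grid)))
print("\n".join(render(grid, annotate=True)))
```

Without annotation, a reader can only rank the cells from lightest to darkest. With annotation, the exact values are recoverable, at the cost of a busier chart.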

Second, heat map charts should not be used to generalize too many variables with similar data values. Though we said previously that heat map charts are great for working with large amounts of data, too much data becomes a problem, especially when many of the variables have the same or similar values. Because colors and shades represent individual values in a heat map chart, too many similar values will result in a chart that is all one color, or one with a few colors that are practically indistinguishable. This makes the chart difficult to interpret and defeats its purpose.

Let’s take a look at an example where the heat map chart is NOT a great choice of visualization:  

Chart made using Chartio

This example is very similar to the previous one: we’re still interested in the number of UFO sightings in 2018, but this time across the entire United States. The columns represent all of the U.S. states, the rows represent the months of 2018, and the shade of blue represents the number of UFO sightings. Again, notice the shade scale in the top right corner of the chart. It still shows the highest and lowest number of sightings, although these differ from the previous example, and it indicates the relative values of the in-between colors.

So why is the heat map chart a poor choice of visualization for this data? Apart from one obvious exception, the majority of the chart appears to be the same light blue; there are simply too many categories (states) with similar data values. Because of this, the only concrete information we can gain from this chart is that California reported the largest number of UFO sightings in April 2018, whereas in the previous example we could identify the three highest sighting counts.

The similarity in color across most of the boxes makes it difficult to distinguish the number of sightings for each state and month, so they all seem like the same value when, most likely, they’re not. This makes it harder to see patterns and similarities between variables and ultimately overgeneralizes the data. If all you want to know is the largest number of UFO sightings, as shown in this chart, other chart types would make that information much easier to find.

It’s important to note here that the number of sightings in California for April 2018 is an outlier that has clearly stretched the shade scale for this chart. If this data point were removed, the scale would cover a narrower range and we would likely see more differentiation in color across the chart. Even though many of the states might still have similar values, the smaller range would produce greater differences in shade. Unfortunately, outlier data points are sometimes unavoidable.
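The effect of a single outlier on the shade scale can be sketched numerically. The counts below are hypothetical, not the UFO data from the chart, and the sketch assumes a simple scheme in which values are min-max scaled and bucketed into five discrete shades; real charting tools may scale colors differently.

```python
def shade_levels(values, n_levels=5):
    """Bucket values into n_levels shades (0 = lightest) using min-max scaling."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return [min(int((v - lo) / span * n_levels), n_levels - 1) for v in values]

# Hypothetical counts; the final value plays the role of the outlier.
counts = [1, 4, 2, 9, 6, 100]

with_outlier = shade_levels(counts)
without_outlier = shade_levels(counts[:-1])

print(with_outlier)     # every non-outlier value falls into the lightest shade
print(without_outlier)  # the same values now spread across several shades
print(len(set(with_outlier)), len(set(without_outlier)))
```

With the outlier present, only two of the five shades are ever used. Dropping it lets the remaining values occupy four shades. Clipping the scale at a chosen maximum is another common option, though it is not something this article covers.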

Comparison of Distribution Chart Types

Simply put, the heat map chart is a data visualization used to show distributions of multiple values. Other visualizations that show the distribution of multiple values are the map chart and the bubble map chart. The use case, pros, and cons of each distribution chart type are listed below:

Heat Map Chart

Use:

  • Visualize data through color variations in a tabular format

Pros:

  • Can display large amounts of information
  • Color variation can clearly depict relationships between data points and help to draw conclusions about trends
  • Simple tabular format

Cons:

  • Gradations in color are not as effective for discerning subtle differences
  • Showing exact values can be difficult; better for relative data

Map Chart

Use:

  • Visualize data through color variations by geographical location

Pros:

  • Can display large amounts of information
  • Color variation can clearly depict relationships between data points and help to draw conclusions about trends
  • Easily shows how data breaks down regionally

Cons:

  • Gradations in color are not as effective for discerning subtle differences
  • Showing exact values can be difficult; better for relative data

Bubble Map Chart

Use:

  • Visualize data through sized circles, or bubbles, by geographical location

Pros:

  • Can display large amounts of information
  • Easily shows how data breaks down regionally
  • Can compare proportions over geographic regions

Cons:

  • Showing exact values can be difficult; better for relative data
  • Can become overcrowded easily

About Bryn Burns

Hi! I'm Bryn Burns. I am a current senior at Virginia Tech pursuing degrees in Statistics and Mathematics. Data science and visualization are two things I'm very passionate about, as well as working with numbers and helping people learn. I'm thrilled to share my knowledge here at The Data School!