How CASE WHEN works

6
min read

CASE WHEN takes in values, checks them against a condition and THEN outputs values into a new column based on if it satisfies the condition.

CASE WHEN in SQL operates very similarly to “if then” statements in other programming languages. Replace the “if” with CASE WHEN and “else if” with WHEN, and the rest matches:


Note: CASE WHEN statements will always output new values to a new column which is different than “if then” which can replace values in the same column.


Now let’s see what the full query would have looked like for that CASE WHEN statement, notice the title of the output column at the end of the CASE WHEN statement.


CODE:https://gist.github.com/fronofro/090f5b91cb43fe31c9e328da59434439.js

Let’s break down each component of the CASE WHEN statement:

  • CASE: indicates a condition loop has been started and that the conditions will follow.
  • WHEN: indicates the start of a condition that should be checked by the query.
  • THEN: executed when the condition is true, determines the output for the true condition. After THEN is executed, CASE will return to the top of the loop and begin checking the next entry. If the condition is false, the next WHEN statement will be evaluated.
  • ELSE: catches all of the entries that were not true for any of the WHEN conditions. If no ELSE statement is present and all WHEN conditions are false, the returned value will be NULL.
  • END: Indicates the end of the CASE loop.
  • AS: Used to set a specific name for the returned CASE column.

NOTE: If you put the column name after THEN or ELSE it will put the value from the original column into the newly created column. In the example we see this done with ELSE City which puts LA unchanged in the new column

Syntax

CODE:https://gist.github.com/fronofro/e818e4121a7a7936232451ccc1a8bd43.js

Note: you do not have to show all of the columns, or any columns besides the CASE column in the output.

CASE WHEN Issues

Query Time

Time to complete a query can be a problem. There is no defined maximum for the number of WHEN conditions you can have within a CASE WHEN statement. CASE WHEN queries can become very slow because the query has to check each condition for every row until it finds a case where it satisfies the condition. This can dramatically increase the query time.

If queries are taking a long time to finish, consider finding ways to optimize your query. More on optimizing queries from Data School here.

Data Types

Make sure that you are comparing the correct data types in your queries. If you try to compare incompatible types SQL will return an “Invalid Input Syntax” error.

Same Data Type

CODE:https://gist.github.com/fronofro/d6735f771d4b8578d8361c6a31bbbbb8.js

The Name field is a VARCHAR and we compare it to the characters “40”.

Different Data Type

CODE:https://gist.github.com/fronofro/0d57c94d692cf5d8731e83ae9c694607.js


The Name field is a VARCHAR and cannot be compared to an INTEGER.

Remember you can cast any of your values in order to make the data types match. More on casting from intersystems here. To determine what data type is used in a column you can use the “\d (Table name)” command and look for the “Type” column:

CODE:https://gist.github.com/fronofro/642d0f4954bfb1d0787b2b89d1191300.js


In Depth Example

This example uses the Chinook database with PostgreSQL 11. The “Track” table in the Chinook database is a large, informational table on many different songs by many different artists.

The “TrackId” in the table below is unique for each entry. However, the “GenreId” column is a reference to another table that links each id number to a genre of music. Let’s say you lost the “Genre” table (indicated with the red X through it). Now you could not JOIN in Genre Names, How could we replace the GenreId in the Track table?


CASE WHEN allows you to assign the genres yourself. Looking at the first few songs, you can see they all have the same GenreId and we know that 1 corresponded to the “Rock” genre. We can use CASE WHEN to check if there is a 1 in the GenreId Column and then put “Rock” into a newly created GenreType column:

CODE:https://gist.github.com/fronofro/5e2376eed83a7340133a01b512b6d9fa.js


There are more than just 1 genre of music in the track table. Order the entries alphabetically to see some examples of songs that do not have a “GenreId” of 1. In the GenreType column null is returned for any values that did not satisfy the conditions of the CASE WHEN statement:

CODE:https://gist.github.com/fronofro/b94adc9babc6277312becbf0c943fa98.js

Note: Null values are sometimes not printed in the output, other times they will be greyed out or italicized like this Null

All of the null entries in GenreType column are not “Rock” songs, so you can use CASE WHEN to label them as “Not Rock”. This can be done using the ELSE keyword:

CODE:https://gist.github.com/fronofro/e0e3b08317c4178f1d74a3ab586622b3.js


Alternatively the WHEN keyword can be used several times to create multiple conditions:

CODE:https://gist.github.com/fronofro/94950ac452ecd0ccc88d2c20a26297ed.js


Summary:

  • CASE WHEN is used to determine conditional statements in SQL
  • CASE declares the start of the conditions
  • WHEN declares a condition
  • THEN declares the return of a true condition
  • ELSE catches any entries that do not trigger a conditional check
  • END declares the end of the CASE checks
  • You can have a multitude of WHEN statements
  • AS can be used to create a header for the return data, otherwise it is given the header “case”

References:

https://www.w3schools.com/sql/sql_case.asp

https://www.postgresql.org/docs/7.4/functions-conditional.html

Give Feedback on our Google Doc

Chapter topics