CASE WHEN takes in values, checks them against a condition and THEN outputs values into a new column based on if it satisfies the condition.
CASE WHEN in SQL operates very similarly to “if then” statements in other programming languages. Replace the “if” with CASE WHEN and “else if” with WHEN, and the rest matches:
Note: CASE WHEN statements will always output new values to a new column which is different than “if then” which can replace values in the same column.
Now let’s see what the full query would have looked like for that CASE WHEN statement, notice the title of the output column at the end of the CASE WHEN statement.
Let’s break down each component of the CASE WHEN statement:
NOTE: If you put the column name after THEN or ELSE it will put the value from the original column into the newly created column. In the example we see this done with ELSE City which puts LA unchanged in the new column
Note: you do not have to show all of the columns, or any columns besides the CASE column in the output.
Time to complete a query can be a problem. There is no defined maximum for the number of WHEN conditions you can have within a CASE WHEN statement. CASE WHEN queries can become very slow because the query has to check each condition for every row until it finds a case where it satisfies the condition. This can dramatically increase the query time.
If queries are taking a long time to finish, consider finding ways to optimize your query. More on optimizing queries from Data School here.
Make sure that you are comparing the correct data types in your queries. If you try to compare incompatible types SQL will return an “Invalid Input Syntax” error.
The Name field is a VARCHAR and we compare it to the characters “40”.
The Name field is a VARCHAR and cannot be compared to an INTEGER.
Remember you can cast any of your values in order to make the data types match. More on casting from intersystems here. To determine what data type is used in a column you can use the “\d (Table name)” command and look for the “Type” column:
This example uses the Chinook database with PostgreSQL 11. The “Track” table in the Chinook database is a large, informational table on many different songs by many different artists.
The “TrackId” in the table below is unique for each entry. However, the “GenreId” column is a reference to another table that links each id number to a genre of music. Let’s say you lost the “Genre” table (indicated with the red X through it). Now you could not JOIN in Genre Names, How could we replace the GenreId in the Track table?
CASE WHEN allows you to assign the genres yourself. Looking at the first few songs, you can see they all have the same GenreId and we know that 1 corresponded to the “Rock” genre. We can use CASE WHEN to check if there is a 1 in the GenreId Column and then put “Rock” into a newly created GenreType column:
There are more than just 1 genre of music in the track table. Order the entries alphabetically to see some examples of songs that do not have a “GenreId” of 1. In the GenreType column null is returned for any values that did not satisfy the conditions of the CASE WHEN statement:
Note: Null values are sometimes not printed in the output, other times they will be greyed out or italicized like this Null
All of the null entries in GenreType column are not “Rock” songs, so you can use CASE WHEN to label them as “Not Rock”. This can be done using the ELSE keyword:
Alternatively the WHEN keyword can be used several times to create multiple conditions: