Close
Data TutorialsPostgreSQL

How to Use the Substring Function in PostgreSQL and Redshift

Posted by Tim Miller

Sometimes a string in your database might contain some information embedded in that string as a substring. This string could be an ID number that might exist in an activity or tracking number for your company’s website.

Here is an example where a company that monitors website traffic activity has a specific naming convention for the Activity Tracking ID. The naming convention is “Activity ID” which is 6 characters and in the first position, “Customer ID” which is 7 characters and appears second, and in the third position “Page ID” which is 3 characters long. These three ID numbers are put together in a single string with no dividing characters. In this example, a tracking number might look like this: WSL194OH08856CAS.

You may want to pull out the Customer ID number, and to do that you can use the following substring function:

SUBSTRING("WSL194OH08856CAS",7,7)

Knowing that the customer ID number is 7 digits long and begins with the 7th position, the formula outlined above performs in this fashion:

Parsing Column Substring

Also, you might want to just have the Activity ID. To get this you will use the same function just in a slightly different way.

SUBSTRING("WSL194OH08856CAS",1,6)

Just like the previous code snippet, we looked at this one will begin with the 1st position, and returns a string that lasts 6 characters.

In our final example, you might be interested in Page ID, which in our case is at the end of the string. For this you can use the function like this:

SUBSTRING("WSL194OH08856CAS",14)

In this example, by omitting the final argument you are commanding the database to return to you everything after and including the character in the 14th position.

Using this function you can now parse out a string in a column based on known positions and lengths of substrings that it might contain. This is a useful technique for splitting a column up into multiple substrings if those substrings can be used in a different way than the entirety of the original string.