Survivorship Bias

Last modified: July 10, 2019

Survivorship bias is the tendency to draw inaccurate conclusions by looking only at the people or things that survived a selection process. There are two main ways people reach erroneous conclusions through survivorship bias.

Inferring a norm

The error: assuming the things that survived a process are the only things that ever existed.

Example:

“Most castles were made of stone” vs “most castles were made of wood but were destroyed by fire or withered away over time.”

[Illustration: in the present day, all we see are the surviving stone castles; back in medieval times, the same stone castles stood alongside far more numerous wooden castles.]

People assume that what they see, or have concrete evidence of, is all that ever existed, when in fact most of what existed in the past is no longer visible to us.
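
To make the point concrete, here is a minimal simulation sketch in Python. The counts and survival rates are made up purely for illustration, under the assumption that wooden castles were far more common but far less likely to survive.

    import random

    random.seed(0)

    # Assumed original population: mostly wood, some stone (made-up numbers).
    castles = ["wood"] * 800 + ["stone"] * 200
    survival_rate = {"wood": 0.02, "stone": 0.60}  # assumed odds of lasting to today

    survivors = [c for c in castles if random.random() < survival_rate[c]]

    print("stone share of original castles:", castles.count("stone") / len(castles))
    print("stone share of survivors       :", survivors.count("stone") / len(survivors))
    # The survivors are overwhelmingly stone even though most castles were wood.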

Inferring causality

The error: assuming that anything that survived a process was shaped by that process.

Example:

“Men get tough fighting in the coliseum” vs “only tough men survive the coliseum.”

The strong survive because they are already strong. They do not get strong from the coliseum.

People assume the competition in the coliseum caused the outcome, but in reality it filtered out the weak and only the strong survived.
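
A similar sketch, again with made-up numbers, shows the causality error: survivors look tougher on average even though the process changed no one, it only removed the weaker fighters.

    import random
    import statistics

    random.seed(0)

    # Toughness is fixed before anyone enters the coliseum (made-up scale).
    fighters = [random.gauss(50, 10) for _ in range(1000)]
    survivors = [t for t in fighters if t > 60]  # only the tough survive

    print("mean toughness, everyone :", round(statistics.mean(fighters), 1))
    print("mean toughness, survivors:", round(statistics.mean(survivors), 1))
    # Survivors average far higher, yet the coliseum made no one tougher;
    # it only filtered out the weaker fighters.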

Survivorship Bias in Business

These biased ways of interpreting events and data are common in business. Let’s imagine you work at a business intelligence (BI) software company that just launched a two-week free trial.

After one week, in the middle of the trial, only a few people are still active. Let’s say everyone who is still engaging with your site is a data analyst, and these users are creating progressively more complicated analyses in your BI tool.

What conclusions could you draw?

  • My BI tool resonates with data analysts
  • My BI tool empowers deep analysis

Why might these conclusions be wrong?

My BI tool resonates with data analysts (inferring a norm)

Without examining the people who stopped using the tool, we do not know whether that group included data analysts as well. Let’s say everyone who started the trial was a data analyst and more people gave up than kept engaging. This challenges us to do more investigative work to find out why some data analysts kept using the BI tool when others didn’t.

Just because all of your engaged users are data analysts does not mean your product resonates with all data analysts.

How to reach a more informed conclusion:

Analyze everyone who started the trial to find the patterns that truly separate the cohorts. Maybe there is something that distinguishes those who stayed engaged from those who did not continue the trial.
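
As a rough sketch of that comparison, assume a hypothetical trial_signups.csv with one row per signup and made-up columns (role, team_size, still_engaged). A pandas comparison of the engaged and churned groups might look like this:

    import pandas as pd

    # Hypothetical export of everyone who started the trial, not just active users.
    signups = pd.read_csv("trial_signups.csv")  # assumed columns: role, team_size, still_engaged

    # Describe both groups on the same attributes instead of only the survivors.
    print(signups.groupby("still_engaged")["role"].value_counts(normalize=True))
    print(signups.groupby("still_engaged")["team_size"].mean())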

This type of analysis can also be done pre-trial. Are there other tools people are already using that resonate with them? If data analysts use multiple tools, there are likely a variety of needs to be met within the data analyst community. Asking why data analysts buy different products can help uncover these differences within the group.

My BI tool empowers deep analysis (inferring causality)

Right now you are not comparing your users’ capabilities against a control. They may be doing the most advanced thing possible in your BI tool, but they might be capable of even more advanced analysis in other BI tools. Without comparing the ease with which these analysts perform these types of analyses in other tools, you cannot determine whether your product is more useful to them than the alternatives. They might simply be very skilled analysts, producing strong work in spite of your design, not because of it.

How to reach a more informed conclusion:

Do a pre-assessment of users’ skill levels to see if your product or features are actually adding value. Run tests of people performing similar tasks with the tools they currently use. These tests can also become a great PR piece for your product or feature if you can cite a positive benefit: “Users complete BI analysis two times faster on our tool than on our leading competitor’s tool.”
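
As a minimal sketch of that timing comparison, with entirely made-up numbers, the same analysts could perform a similar task in their current tool and in yours; a simple two-sample t-test is one quick check that the difference is not just noise:

    from statistics import mean
    from scipy import stats

    # Hypothetical task completion times in minutes for the same analysts.
    current_tool_minutes = [34, 41, 29, 38, 45, 33, 40]
    your_tool_minutes = [21, 25, 18, 27, 24, 19, 26]

    print("mean minutes, current tool:", mean(current_tool_minutes))
    print("mean minutes, your tool   :", mean(your_tool_minutes))

    t_stat, p_value = stats.ttest_ind(your_tool_minutes, current_tool_minutes)
    print("p-value:", round(p_value, 4))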

Summary

  • Analyze the full cohort, not just the users who are still engaging after two weeks
  • Compare user behavior against competitor tools to judge the impact of your product
  • Conduct pre-assessments to judge current skill levels

Written by: Matt David
Reviewed by: Mike Yi, Blake Barnhill, Matthew Layne
