Contributors

Matt David

Editor of the Data School at Chartio,
Contributor at Data School

Matt has over 7 years industry experience using data and is currently the Editor of the Data School @ Chartio. Previously he worked at Udacity as Product Lead for the School of Data Science. Data has become a prerequisite skill set for more and more non-data jobs and he is passionate about making data concepts more easily understood to increase data literacy for everybody.

Matt David's Picture

Social

Author of 47 chapters

Start Modeling Data

From SQL Optimization

Quick introduction on using to dbt and BigQuery to model data. Modeling data can have significance performance impacts on your database.

Data Mart Implementation

From Cloud Data Management

Learn the best practices for building a Data Mart

Data Mart Maintenance

From Cloud Data Management

Learn Best Practices for Maintaining a a Data Mart, such as handling errors and incorporating new data sources.

Defining a Data Governor

From Cloud Data Management

Data Governors maintain the database so that is remains valuable to an organization. This involves security, education, and modeling.

Source Data Connections

From Cloud Data Management

Learn how to configure your database to analyze source data effectively.

Source Data Best Practices

From Cloud Data Management

Learn how to manage queries, model in a BI tool, and use drag and drop query interfaces.

Starting with Source Data

From Cloud Data Management

Learn how to analyze data from applications, production databases, and financial records.

Why Build a Data Lake

From Cloud Data Management

Learn why you should build a data lake to improve analytics at your company.

Why Build Data Marts

From Cloud Data Management

Learn the best reasons to build a data mart on top of your data warehouse

Row vs Column Oriented Databases

From

Column oriented databases have become dominant over row oriented databases in data warehousing.

Build the Metrics

From How to Design a Dashboard

Use SQL to query the database to get the data behind the metrics people want to see. Use our template to quickly build accurate queries. Learn more.

Conclusion

From How to Design a Dashboard

To create a Dashboard you need to define stakeholders and metrics, prototype designs, use SQL and a BI tool, and finally share it with the company.

Dashboard Design Process

From How to Design a Dashboard

See how design thinking principals map to dashboard design. Define, Prototype, Build, and Deploy Dashboards.

Dashboard Prototyping and Feedback

From How to Design a Dashboard

Learn techniques to prototype your dashboard and iterate. See examples of prototyped dashboards and download dashboard templates.

Determine the Metrics to Monitor

From How to Design a Dashboard

Define what metrics actually matter to business professionals. Use customer development techniques to determine the best metrics.

Finding the Data That Builds Metrics

From How to Design a Dashboard

Learn to collaborate with your data team to discover what data can be used within a dashboard. Use SQL to find and assess columns to use.

Identifying Key Roles

From How to Design a Dashboard

Learn the key stakeholders for a dashboard design process. Learn how to work with Designers, Developers, and your audience.

Introduction

From How to Design a Dashboard

Dashboards help organizations make data driven decisions. Learn how to apply design thinking to creating useful dashboards.

Scaling Dashboards

From How to Design a Dashboard

Successful dashboards get used a lot so they need to be built to scale. Learn to share dashboards in a scalable way.

What is a Dashboard?

From How to Design a Dashboard

Learn what dashboards are and why they are important to making decisions. Dashboards have a long history for supporting decision making.

What Makes a Great Dashboard (ACES)

From How to Design a Dashboard

Learn the best practices behind dashboard design to support decision making. Accuracy, Clarity, Empowerment, and being Succinct.

Accessing Data

From How to Teach People SQL

Accessing data requires permission to the database, asking questions in SQL, and knowing how to explore a schema.

Difference between WHERE and ON in SQL

From How to Teach People SQL

Understand how filtering and joining can be done in both the ON and WHERE clauses in SQL. Choose the best strategy for JOINing data in SQL.

Cross Join - Animated

From How to Teach People SQL

Visualize how SQL is joining two tables using a Cross Join. See animated visualizations of the data being CROSS JOINed in SQL. Learn more.

0 Rows Returned

From How to Teach People SQL

Learn the most common reasons you will get 0 rows returned from your SQL query. See how JOINs and over filtering may be the problem.

Syntax Errors

From How to Teach People SQL

Learn the most common reasons for SQL errors due to syntax. Spelling errors, Quotation marks, Capitalization, Data Types, and more.

Full Outer Join - Animated

From How to Teach People SQL

Visualize how SQL is joining two tables using a Full Outer JOIN. See animated visualizations of the data being FULL OUTER joined in SQL. Learn more.

How Aggregations Work

From How to Teach People SQL

Visualize how SQL aggregates data by viewing examples of COUNT, AVG, and SUM. See animations showing each step of the SQL query.

How Subqueries Work

From How to Teach People SQL

Visualize how subqueries work in SQL by seeing the intermediate table that is created during the process. See animations showing each step of the SQL query.

Inner Join - Animated

From How to Teach People SQL

Visualize how SQL is joining two tables using an Inner JOIN. See animated visualizations of the data being INNER joined in SQL. Learn more.

Introduction

From How to Teach People SQL

Introduction about why teaching SQL requires understanding of the audience who is less adept at dealing with abstract ideas

Left & Right Join - Animated

From How to Teach People SQL

Visualize how SQL is joining two tables using a Left JOIN and a Right JOIN. See animated visualizations of the data being LEFT and RIGHT joined in SQL. Learn more.

Union - Animated

From How to Teach People SQL

Visualize how SQL is joining two tables using a Union All join. See animated visualizations of the data being Unioned in SQL. Learn more.

Why Databases?

From How to Teach People SQL

Databases are better for managing data than spreadsheets when you consider size, accuracy, and security

Additional Practice

From Learn SQL

Practice common SQL commands with simple questions.

Confirmation Bias

From Avoid Misrepresenting Data

Confirmation bias negatively affects the accuracy of your analysis. Learn how to detect Confirmation Bias and how to avoid this cognitive bias.

Define Experiment Parameters

From Avoid Misrepresenting Data

Defining Experiment Parameters improves analysis and increases trust in results that are shared in an organization.

Metric vs Metrics

From Avoid Misrepresenting Data

A single Metric can be misleading. Learn how to use multiple Metrics to avoid misleading yourself and others.

Overall vs Groups

From Avoid Misrepresenting Data

Overall statistics in data can be misleading because there may be distinct groups within the data that have very different statistics. Learn to avoid this analysis mistake.

Predicting Outcomes

From Avoid Misrepresenting Data

Predicting Outcomes prevents cognitive biases from affecting how your interpretation of the results of an experiment. Learn more.

Relative vs Absolute Change

From Avoid Misrepresenting Data

Relative and Absolute changes can bias your interpretation of data you are analyzing. Learn to interpret them correctly.

Review Outcomes

From Avoid Misrepresenting Data

Review Outcomes of Feature Releases to evaluate their impact and to create institutional knowledge. Learn how to interpret results accurately.

Selection Bias

From Avoid Misrepresenting Data

Selection bias negatively affects your analysis. Learn how to detect Selection Bias and how to avoid this cognitive bias.

Statistic vs Distribution

From Avoid Misrepresenting Data

Distributions provide much more nuance to the data than a statistic does. Learn how to query to get distributions and then interpret them.

Survivorship Bias

From Avoid Misrepresenting Data

Survivorship bias negatively affects your analysis. Learn how to detect Survivorship Bias and how to avoid this cognitive bias.

Trends

From Avoid Misrepresenting Data

Trend in data can be misleading depending on what time frame you are looking at it. Learn how to accurately interpret trends.

Data Lake Security

From Cloud Data Management

Learn best practices for ensuring data security on a Data Lake database.

Reviewer of 85 chapters

What is a Query Plan

From SQL Optimization

Query Plan's are the code that SQL runs on your database. Learn how the query plan works and how to optimize your SQL queries. Learn more

Keeping your Data Stack Agile as a Startup

From Data Conversations

What considerations you should have while building a modern data stack as a startup.

Taking a DA Hub Approach to Solving Analytics Challenges

From Data Conversations

The Data Analytics Hub details how they foster productive conversations about data within companies

About this Book

From Cloud Data Management

Information about who this book is for, who it's not, how we wrote it, disclaimers, influences and how to contribute.

Acknowledgments & Contributions

From Cloud Data Management

This is a community driven book - with contributions from many different people and organizations. Help keep it relevant and continually improving.

Data Lake Maintenance

From Cloud Data Management

Learn best practices for data lake maintenance. Handle Data Source updates and improve performance.

Data Warehouse Implementation

From Cloud Data Management

Learn how to setup a Data Warehouse. Model and transform data to make it easy to analyze.

Data Warehouse Maintenance

From Cloud Data Management

Learn the best practices to maintain a Data Warehouse. Learn how to add new data, remove deprecated data, and optimize for performance..

Data Warehouse Security

From Cloud Data Management

Learn how to secure sensitive data on your database and BI platform.

Democratized or Centralized - choosing your workflow

From Cloud Data Management

Choosing whether data should be completely run by a centralized team, or efforts should be made into enable others in the organization to work with data is a big choice for companies, with great consequences on both sides.

Doing more with your Data Mart

From Cloud Data Management

Learn why you should use a tool like census to push your data back to your data sources

ETL vs ELT

From Cloud Data Management

Learn why you should use an ELT over a ETL process for your Data Lake

Evaluating Data Stack Technologies

From Cloud Data Management

Learn the various functions a Data Stack needs to perform in order to select the correct data tools to take raw data and turn it into insight.

Extract and Load a Lake

From Cloud Data Management

Learn how to extract and load data sources like SalesForce, Hubspot, Marketo, etc into a single source.

Introduction - The 4 Stages of Data Sophistication

From Cloud Data Management

Learn how Data Governance practices change as the level of data sophistication changes.

Data Warehouse Architecture

From Cloud Data Management

Learn why you should build a Single Source of Truth in your Data Warehouse. Overcome common obstacles and empower your colleagues

What Stage are You at?

From Cloud Data Management

Figure out what level of data sophistication your team is at.

What Engine to Use For a Data Lake

From Cloud Data Management

Learn the pros and cons of the modern database options like Snowflake, Redshift and BigQuery to build your Data Warehouse on.

Why Build a Data Warehouse

From Cloud Data Management

Data inside of Data Lakes is challenging to work with, because it is messy and not optimized for ad hoc querying. Data in a Data Warehouse is clean, simple, and easy to use.

Creating an EC2 Server

From

Learn how to quickly set up an EC2 server on AWS. Follow along with screenshots and instructions.

Running Jupyter Notebook on an EC2 Server

From

Learn how to run a Jupyter Notebook on an AWS EC2 Server and the advantages of this.

Using Jupyter Notebooks

From

This article walks through the basics of using a Jupyter notebook. It covers how to load data, create a notebook, and begin using that notebook.

Choosing Metrics for Product Launch

From Fundamentals of Analysis

Learn how to evaluate which metrics you should select to determine if your product launch was a success

Types of Data Analysis

From Fundamentals of Analysis

There are different types of analysis. Each is suited to different roles within a company, analyst, executive, data scientists, etc.

What is an Outlier?

From Fundamentals of Analysis

Learn how to detect Outliers in different types of data and scenarios.

What is Data and Why Should I Care

From Fundamentals of Analysis

Learn about why data is relevant to all people. Data is the basis to answer any question scientifically

What is the Interquartile Range?

From Fundamentals of Analysis

Learn what the Interquartile Range (IQR) is and how to use it to determine outliers.

Arranging Your Charts as a Dashboard

From How to Design a Dashboard

Maximize understanding of a dashboard by arranging data visualizations intelligently. Learn common design patterns to improve readability.

Design a Dashboard Example

From How to Design a Dashboard

Example of how applying design thinking improves dashboard design

Find the Best Visualizations for Your Metrics

From How to Design a Dashboard

Decide what chart works best for your visualization based on if you want to show composition, relationship, distribution or comparison.

Making Sure Your Dashboard Always Gets Better

From How to Design a Dashboard

Use feedback loops from your audience to keep iterating on dashboards you create. Learn to improve it's usability and performance.

Sharing the Dashboard – Distribution Strategies

From How to Design a Dashboard

Learn the best practices and common pitfalls of sharing out dashboards. Distributing your dashboard is critical to it being useful.

How Regex in SQL Works

From How to Teach People SQL

Learn how Regex works in SQL and how to use it in your queries. See the Regex process visualized in gifs. Learn more.

How to Find Outliers with SQL

From How to Teach People SQL

Find Outliers quickly with SQL. Detect outliers using simple ORDER BY techniques and using inter quartile range. Learn more.

Syntax Conventions

From How to Teach People SQL

Learn the most important syntax conventions and styles to writing SQL

How CASE WHEN Works

From How to Teach People SQL

CASE WHEN is a SQL function that works a lot like IF THEN in other programming languages. Learn to use CASE WHEN in SQL.

How Window Functions Work

From How to Teach People SQL

Learn how window functions work by looking at gifs that show the process in slow motion. See every step in SQL animated in slow motion. Learn more.

SQL Join Types Explained Visually

From How to Teach People SQL

Data within a database exists across multiple tables, JOINs allow you to combine datasets into new tables for analysis. Learn more.

Basic SQL Practice Grounds

From Learn SQL

After you've got the basic SQL fundamentals. Put that knowledge to the test in these interactive SQL practice exercises.

FROM

From Learn SQL

This interactive SQL tutorial will explain the FROM command with syntax and use cases on how to interact with your data in the PostgreSQL database. Learn how to use the SQL FROM command by following the steps in this interactive tutorial.

Quick Introductory SQL Concepts

From Learn SQL

Learn how to use SQL with your PostgreSQL database. In this interactive SQL data tutorial, get an introduction that is designed to get you querying your PostgreSQL database as quickly as possible. Learn SQL basics, tables and SQLbox.

LIMIT and OFFSET

From Learn SQL

Learn how to LIMIT the number of results that are returned you can simply use the LIMIT SQL command at the end of the query to specify. You can use the LIMIT command on your PostgreSQL database. Read the tutorial to learn more.

ORDER BY

From Learn SQL

Learn how to use the SQL ORDER BY command in PostgreSQL. The ORDER BY command is used to sort your result data in ascending or descending order. Follow the interactive SQL tutorial to use cases and best practices for the ORDER BY command.

Browsing the SCHEMA

From Learn SQL

SCHEMA is the collection of tables and relationships in your database. In this interactive SQL tutorial, learn how to browse a SCHEMA with PostgreSQL.

SELECT

From Learn SQL

Learn the SELECT SQL statement with your PostgreSQL database in this interactive SQL data tutorial. The SELECT statement is a basic SQL command to access data.

5 Redshift SQL shortcuts

From Learn SQL

Learn how to use row number, date trunc, mod, split part, and rename to improve your queries.

AND OR Boolean Logic

From Learn SQL

This post goes into detail on how Boolean logic works and how Boolean logic can be molded with truth tables

Conditional Logic

From Learn SQL

Learn how to use CASE WHEN, IIF, and UNION with WHERE to apply conditional logic within a Postgres SQL query

Copying Data Between Tables

From Learn SQL

Learn to copy data from a table or query into another table using the INSERT command in psql.

Exclude a Column

From Learn SQL

Learn how to exclude a column in the SELECT statement of your query

Export to CSV with \copy

From Learn SQL

This article discusses how to export data from psql to a csv file using the copy commands

Find Duplicates

From Learn SQL

Learn how to find duplicate values in your database using SQL.

PostgreSQL Generate_Series

From Learn SQL

This article outlines how to use PostgreSQL's Generate_Series() function

How to Create a Copy of a Database in PostgreSQL

From Learn SQL

Learn how to create a copy of a database in postgres using psql.

How to Export PostgreSQL Data to a CSV or Excel File

From Learn SQL

Learn how to export PostgreSQL data to a CSV or Excel file. Follow these instructions to copy tables, copy a query results, and open CSV files.

How to Replace Nulls with 0s in SQL

From Learn SQL

This article talks about how to use the UPDATE statement to clean data.

How to Start a PostgreSQL Server on Mac OS X

From Learn SQL

This article shows how to start a PostgreSQL server using both Homebrew and the potgres app on a Mac.

Importing Data from CSV in PostgreSQL

From Learn SQL

This article outlines how to use psql to import data from csv files.

Insert multiple rows

From Learn SQL

Insert multiple rows into a table in a single SQL query using Postgres

Meta commands in PSQL

From Learn SQL

This article talks about what meta commands are and provides a list of some common examples.

Outputting Query Results to Files with \o

From Learn SQL

This article discusses how to use the o meta command to output to a text file

How To Generate Random Data in PostgreSQL

From Learn SQL

You can generate random values and sequences in SQL. Learn how with this interactive PostgreSQL tutorial.

SQL Cheat Sheet

From Learn SQL

Reference common commands and operators that are often forgotten

Using ALTER in PostgreSQL

From Learn SQL

Learn the different potential uses of the ALTER command in PSQL. Add and drop columns from tables. Rename and change schemas and more.

What is the difference between UNION and UNION ALL

From Learn SQL

UNION ALL keeps all of the records from each of the original data sets, UNION removes any duplicate records.

Aggregate Functions

From Learn SQL

Learn about the COUNT function in SQL. The COUNT command is used to returns the count of how many rows there are. This interactive SQL tutorial will teach you how to COUNT.

DATE and TIME Functions

From Learn SQL

DATE and TIME values in PostgreSQL have a whole special set of functions and operators for their proper use. Learn how to use dates in PostgreSQL in this interactive SQL tutorial, so you can get the DATE and TIME of your data analysis.

GROUP BY

From Learn SQL

The GROUP BY SQL statement is used to aggregate functions like COUNT, MAX, MIN, SUM and AVG. In this interactive SQL tutorial, learn to use the GROUP BY .

JOIN Relationships and JOINing Tables

From Learn SQL

A JOIN SQL statement combines two data from two tables together. In this interactive SQL tutorial, learn the most common JOIN statements.

Mid Level SQL Practice Grounds

From Learn SQL

After you've mastered mid-level SQL fundamentals, put that knowledge to the test in these interactive SQL practice exercises.

Operators

From Learn SQL

Learn about Operators in SQL and how to use them in Postgres. Operators are used to specify conditions in a SQL statement.

WHERE

From Learn SQL

Learn how to use the WHERE command in PostgreSQL. In this interactive SQL tutorial, learn the best practices of the WHERE command for analysis and exploration.

Increase Ecommerce Sales with Metrics

From Avoid Misrepresenting Data

Learn why using a single metric prevents you from being able to increase ecommerce sales. See how to use multiple metrics to get more sales.

BigQuery Optimization

From SQL Optimization

Learn how BigQuery optimizes your database through specific hardware optimization strategies.

Indexing

From SQL Optimization

Indexing is a critical part of database optimization. Indexing can dramatically increase query speed. See how indexes work and learn to build indexes with SQL.

Multicolumn Indexes

From SQL Optimization

This article walks through what multicolumn indexes are, how to make them, and when to use them.

Optimization with EXPLAIN ANALYZE

From SQL Optimization

The EXPLAIN shows the query plan for SQL queries in Postgres. Learn how to interpret the results from EXPLAIN and use it to optimize your SQL queries.

Optimize your SQL Query

From SQL Optimization

Learn quick tips for how to optimize your SQL queries

Redshift Optimization

From SQL Optimization

Learn how Redshift manages optimization for your cloud data warehouse. Learn when to manually tune the database further for more performance for SQL queries.

Order of a SQL Query

From SQL Optimization

Learn the order of the SQL query to understand where you can optimize a query. Learn to prioritize FROM, JOIN, and WHERE.

Partial Indexes

From SQL Optimization

Partial indexes are an advanced database optimization technique. Learn how to create Partial Indexes and measure their performance.

Scheduling Modeling

From SQL Optimization

Schedule data modeling tasks to make data easy to query. Learn to use dbt Cloud to model data to optimize your data warehouse. Learn more.

Snowflake Optimization

From SQL Optimization

Snowflake is a powerful cloud database. Learn how it optimizes your database automatically, and hot to increase performance manually.

Views

From SQL Optimization

SQL VIEWs allow you to create temporary or permanent references to data. This is a common database optimization technique. Learn more.

Driving Actionable Insights Through Data Engagement

From Data Conversations

Learn how to drive actionable insights by using data engagement within your company.