Matt David
Editor of the Data School at Chartio,
Contributor at Data School
Matt has over 7 years industry experience using data and is currently the Editor of the Data School @ Chartio. Previously he worked at Udacity as Product Lead for the School of Data Science. Data has become a prerequisite skill set for more and more non-data jobs and he is passionate about making data concepts more easily understood to increase data literacy for everybody.
Social
- Website: https://dataschool.com
- LinkedIn: Matt David
- Twitter: https://twitter.com/fronofro
Author of 47 chapters
Start Modeling Data
From SQL OptimizationQuick introduction on using to dbt and BigQuery to model data. Modeling data can have significance performance impacts on your database.
Data Mart Implementation
From Cloud Data ManagementLearn the best practices for building a Data Mart
Data Mart Maintenance
From Cloud Data ManagementLearn Best Practices for Maintaining a a Data Mart, such as handling errors and incorporating new data sources.
Defining a Data Governor
From Cloud Data ManagementData Governors maintain the database so that is remains valuable to an organization. This involves security, education, and modeling.
Source Data Connections
From Cloud Data ManagementLearn how to configure your database to analyze source data effectively.
Source Data Best Practices
From Cloud Data ManagementLearn how to manage queries, model in a BI tool, and use drag and drop query interfaces.
Starting with Source Data
From Cloud Data ManagementLearn how to analyze data from applications, production databases, and financial records.
Why Build a Data Lake
From Cloud Data ManagementLearn why you should build a data lake to improve analytics at your company.
Why Build Data Marts
From Cloud Data ManagementLearn the best reasons to build a data mart on top of your data warehouse
Row vs Column Oriented Databases
FromColumn oriented databases have become dominant over row oriented databases in data warehousing.
Build the Metrics
From How to Design a DashboardUse SQL to query the database to get the data behind the metrics people want to see. Use our template to quickly build accurate queries. Learn more.
Conclusion
From How to Design a DashboardTo create a Dashboard you need to define stakeholders and metrics, prototype designs, use SQL and a BI tool, and finally share it with the company.
Dashboard Design Process
From How to Design a DashboardSee how design thinking principals map to dashboard design. Define, Prototype, Build, and Deploy Dashboards.
Dashboard Prototyping and Feedback
From How to Design a DashboardLearn techniques to prototype your dashboard and iterate. See examples of prototyped dashboards and download dashboard templates.
Determine the Metrics to Monitor
From How to Design a DashboardDefine what metrics actually matter to business professionals. Use customer development techniques to determine the best metrics.
Finding the Data That Builds Metrics
From How to Design a DashboardLearn to collaborate with your data team to discover what data can be used within a dashboard. Use SQL to find and assess columns to use.
Identifying Key Roles
From How to Design a DashboardLearn the key stakeholders for a dashboard design process. Learn how to work with Designers, Developers, and your audience.
Introduction
From How to Design a DashboardDashboards help organizations make data driven decisions. Learn how to apply design thinking to creating useful dashboards.
Scaling Dashboards
From How to Design a DashboardSuccessful dashboards get used a lot so they need to be built to scale. Learn to share dashboards in a scalable way.
What is a Dashboard?
From How to Design a DashboardLearn what dashboards are and why they are important to making decisions. Dashboards have a long history for supporting decision making.
What Makes a Great Dashboard (ACES)
From How to Design a DashboardLearn the best practices behind dashboard design to support decision making. Accuracy, Clarity, Empowerment, and being Succinct.
Accessing Data
From How to Teach People SQLAccessing data requires permission to the database, asking questions in SQL, and knowing how to explore a schema.
Difference between WHERE and ON in SQL
From How to Teach People SQLUnderstand how filtering and joining can be done in both the ON and WHERE clauses in SQL. Choose the best strategy for JOINing data in SQL.
Cross Join - Animated
From How to Teach People SQLVisualize how SQL is joining two tables using a Cross Join. See animated visualizations of the data being CROSS JOINed in SQL. Learn more.
0 Rows Returned
From How to Teach People SQLLearn the most common reasons you will get 0 rows returned from your SQL query. See how JOINs and over filtering may be the problem.
Syntax Errors
From How to Teach People SQLLearn the most common reasons for SQL errors due to syntax. Spelling errors, Quotation marks, Capitalization, Data Types, and more.
Full Outer Join - Animated
From How to Teach People SQLVisualize how SQL is joining two tables using a Full Outer JOIN. See animated visualizations of the data being FULL OUTER joined in SQL. Learn more.
How Aggregations Work
From How to Teach People SQLVisualize how SQL aggregates data by viewing examples of COUNT, AVG, and SUM. See animations showing each step of the SQL query.
How Subqueries Work
From How to Teach People SQLVisualize how subqueries work in SQL by seeing the intermediate table that is created during the process. See animations showing each step of the SQL query.
Inner Join - Animated
From How to Teach People SQLVisualize how SQL is joining two tables using an Inner JOIN. See animated visualizations of the data being INNER joined in SQL. Learn more.
Introduction
From How to Teach People SQLIntroduction about why teaching SQL requires understanding of the audience who is less adept at dealing with abstract ideas
Left & Right Join - Animated
From How to Teach People SQLVisualize how SQL is joining two tables using a Left JOIN and a Right JOIN. See animated visualizations of the data being LEFT and RIGHT joined in SQL. Learn more.
Union - Animated
From How to Teach People SQLVisualize how SQL is joining two tables using a Union All join. See animated visualizations of the data being Unioned in SQL. Learn more.
Why Databases?
From How to Teach People SQLDatabases are better for managing data than spreadsheets when you consider size, accuracy, and security
Confirmation Bias
From Avoid Misrepresenting DataConfirmation bias negatively affects the accuracy of your analysis. Learn how to detect Confirmation Bias and how to avoid this cognitive bias.
Define Experiment Parameters
From Avoid Misrepresenting DataDefining Experiment Parameters improves analysis and increases trust in results that are shared in an organization.
Metric vs Metrics
From Avoid Misrepresenting DataA single Metric can be misleading. Learn how to use multiple Metrics to avoid misleading yourself and others.
Overall vs Groups
From Avoid Misrepresenting DataOverall statistics in data can be misleading because there may be distinct groups within the data that have very different statistics. Learn to avoid this analysis mistake.
Predicting Outcomes
From Avoid Misrepresenting DataPredicting Outcomes prevents cognitive biases from affecting how your interpretation of the results of an experiment. Learn more.
Relative vs Absolute Change
From Avoid Misrepresenting DataRelative and Absolute changes can bias your interpretation of data you are analyzing. Learn to interpret them correctly.
Review Outcomes
From Avoid Misrepresenting DataReview Outcomes of Feature Releases to evaluate their impact and to create institutional knowledge. Learn how to interpret results accurately.
Selection Bias
From Avoid Misrepresenting DataSelection bias negatively affects your analysis. Learn how to detect Selection Bias and how to avoid this cognitive bias.
Statistic vs Distribution
From Avoid Misrepresenting DataDistributions provide much more nuance to the data than a statistic does. Learn how to query to get distributions and then interpret them.
Survivorship Bias
From Avoid Misrepresenting DataSurvivorship bias negatively affects your analysis. Learn how to detect Survivorship Bias and how to avoid this cognitive bias.
Trends
From Avoid Misrepresenting DataTrend in data can be misleading depending on what time frame you are looking at it. Learn how to accurately interpret trends.
Data Lake Security
From Cloud Data ManagementLearn best practices for ensuring data security on a Data Lake database.
Reviewer of 85 chapters
What is a Query Plan
From SQL OptimizationQuery Plan's are the code that SQL runs on your database. Learn how the query plan works and how to optimize your SQL queries. Learn more
Keeping your Data Stack Agile as a Startup
From Data ConversationsWhat considerations you should have while building a modern data stack as a startup.
Taking a DA Hub Approach to Solving Analytics Challenges
From Data ConversationsThe Data Analytics Hub details how they foster productive conversations about data within companies
About this Book
From Cloud Data ManagementInformation about who this book is for, who it's not, how we wrote it, disclaimers, influences and how to contribute.
Acknowledgments & Contributions
From Cloud Data ManagementThis is a community driven book - with contributions from many different people and organizations. Help keep it relevant and continually improving.
Data Lake Maintenance
From Cloud Data ManagementLearn best practices for data lake maintenance. Handle Data Source updates and improve performance.
Data Warehouse Implementation
From Cloud Data ManagementLearn how to setup a Data Warehouse. Model and transform data to make it easy to analyze.
Data Warehouse Maintenance
From Cloud Data ManagementLearn the best practices to maintain a Data Warehouse. Learn how to add new data, remove deprecated data, and optimize for performance..
Data Warehouse Security
From Cloud Data ManagementLearn how to secure sensitive data on your database and BI platform.
Democratized or Centralized - choosing your workflow
From Cloud Data ManagementChoosing whether data should be completely run by a centralized team, or efforts should be made into enable others in the organization to work with data is a big choice for companies, with great consequences on both sides.
Doing more with your Data Mart
From Cloud Data ManagementLearn why you should use a tool like census to push your data back to your data sources
ETL vs ELT
From Cloud Data ManagementLearn why you should use an ELT over a ETL process for your Data Lake
Evaluating Data Stack Technologies
From Cloud Data ManagementLearn the various functions a Data Stack needs to perform in order to select the correct data tools to take raw data and turn it into insight.
Extract and Load a Lake
From Cloud Data ManagementLearn how to extract and load data sources like SalesForce, Hubspot, Marketo, etc into a single source.
Introduction - The 4 Stages of Data Sophistication
From Cloud Data ManagementLearn how Data Governance practices change as the level of data sophistication changes.
Data Warehouse Architecture
From Cloud Data ManagementLearn why you should build a Single Source of Truth in your Data Warehouse. Overcome common obstacles and empower your colleagues
What Stage are You at?
From Cloud Data ManagementFigure out what level of data sophistication your team is at.
What Engine to Use For a Data Lake
From Cloud Data ManagementLearn the pros and cons of the modern database options like Snowflake, Redshift and BigQuery to build your Data Warehouse on.
Why Build a Data Warehouse
From Cloud Data ManagementData inside of Data Lakes is challenging to work with, because it is messy and not optimized for ad hoc querying. Data in a Data Warehouse is clean, simple, and easy to use.
Creating an EC2 Server
FromLearn how to quickly set up an EC2 server on AWS. Follow along with screenshots and instructions.
Running Jupyter Notebook on an EC2 Server
FromLearn how to run a Jupyter Notebook on an AWS EC2 Server and the advantages of this.
Using Jupyter Notebooks
FromThis article walks through the basics of using a Jupyter notebook. It covers how to load data, create a notebook, and begin using that notebook.
Choosing Metrics for Product Launch
From Fundamentals of AnalysisLearn how to evaluate which metrics you should select to determine if your product launch was a success
Types of Data Analysis
From Fundamentals of AnalysisThere are different types of analysis. Each is suited to different roles within a company, analyst, executive, data scientists, etc.
What is an Outlier?
From Fundamentals of AnalysisLearn how to detect Outliers in different types of data and scenarios.
What is Data and Why Should I Care
From Fundamentals of AnalysisLearn about why data is relevant to all people. Data is the basis to answer any question scientifically
What is the Interquartile Range?
From Fundamentals of AnalysisLearn what the Interquartile Range (IQR) is and how to use it to determine outliers.
Arranging Your Charts as a Dashboard
From How to Design a DashboardMaximize understanding of a dashboard by arranging data visualizations intelligently. Learn common design patterns to improve readability.
Design a Dashboard Example
From How to Design a DashboardExample of how applying design thinking improves dashboard design
Find the Best Visualizations for Your Metrics
From How to Design a DashboardDecide what chart works best for your visualization based on if you want to show composition, relationship, distribution or comparison.
Making Sure Your Dashboard Always Gets Better
From How to Design a DashboardUse feedback loops from your audience to keep iterating on dashboards you create. Learn to improve it's usability and performance.
Sharing the Dashboard – Distribution Strategies
From How to Design a DashboardLearn the best practices and common pitfalls of sharing out dashboards. Distributing your dashboard is critical to it being useful.
How Regex in SQL Works
From How to Teach People SQLLearn how Regex works in SQL and how to use it in your queries. See the Regex process visualized in gifs. Learn more.
How to Find Outliers with SQL
From How to Teach People SQLFind Outliers quickly with SQL. Detect outliers using simple ORDER BY techniques and using inter quartile range. Learn more.
Syntax Conventions
From How to Teach People SQLLearn the most important syntax conventions and styles to writing SQL
How CASE WHEN Works
From How to Teach People SQLCASE WHEN is a SQL function that works a lot like IF THEN in other programming languages. Learn to use CASE WHEN in SQL.
How Window Functions Work
From How to Teach People SQLLearn how window functions work by looking at gifs that show the process in slow motion. See every step in SQL animated in slow motion. Learn more.
SQL Join Types Explained Visually
From How to Teach People SQLData within a database exists across multiple tables, JOINs allow you to combine datasets into new tables for analysis. Learn more.
Basic SQL Practice Grounds
From Learn SQLAfter you've got the basic SQL fundamentals. Put that knowledge to the test in these interactive SQL practice exercises.
Quick Introductory SQL Concepts
From Learn SQLLearn how to use SQL with your PostgreSQL database. In this interactive SQL data tutorial, get an introduction that is designed to get you querying your PostgreSQL database as quickly as possible. Learn SQL basics, tables and SQLbox.
LIMIT and OFFSET
From Learn SQLLearn how to LIMIT the number of results that are returned you can simply use the LIMIT SQL command at the end of the query to specify. You can use the LIMIT command on your PostgreSQL database. Read the tutorial to learn more.
Browsing the SCHEMA
From Learn SQLSCHEMA is the collection of tables and relationships in your database. In this interactive SQL tutorial, learn how to browse a SCHEMA with PostgreSQL.
5 Redshift SQL shortcuts
From Learn SQLLearn how to use row number, date trunc, mod, split part, and rename to improve your queries.
AND OR Boolean Logic
From Learn SQLThis post goes into detail on how Boolean logic works and how Boolean logic can be molded with truth tables
Conditional Logic
From Learn SQLLearn how to use CASE WHEN, IIF, and UNION with WHERE to apply conditional logic within a Postgres SQL query
Copying Data Between Tables
From Learn SQLLearn to copy data from a table or query into another table using the INSERT command in psql.
Export to CSV with \copy
From Learn SQLThis article discusses how to export data from psql to a csv file using the copy commands
PostgreSQL Generate_Series
From Learn SQLThis article outlines how to use PostgreSQL's Generate_Series() function
How to Create a Copy of a Database in PostgreSQL
From Learn SQLLearn how to create a copy of a database in postgres using psql.
How to Export PostgreSQL Data to a CSV or Excel File
From Learn SQLLearn how to export PostgreSQL data to a CSV or Excel file. Follow these instructions to copy tables, copy a query results, and open CSV files.
How to Replace Nulls with 0s in SQL
From Learn SQLThis article talks about how to use the UPDATE statement to clean data.
How to Start a PostgreSQL Server on Mac OS X
From Learn SQLThis article shows how to start a PostgreSQL server using both Homebrew and the potgres app on a Mac.
Importing Data from CSV in PostgreSQL
From Learn SQLThis article outlines how to use psql to import data from csv files.
Insert multiple rows
From Learn SQLInsert multiple rows into a table in a single SQL query using Postgres
Meta commands in PSQL
From Learn SQLThis article talks about what meta commands are and provides a list of some common examples.
Outputting Query Results to Files with \o
From Learn SQLThis article discusses how to use the o meta command to output to a text file
How To Generate Random Data in PostgreSQL
From Learn SQLYou can generate random values and sequences in SQL. Learn how with this interactive PostgreSQL tutorial.
Using ALTER in PostgreSQL
From Learn SQLLearn the different potential uses of the ALTER command in PSQL. Add and drop columns from tables. Rename and change schemas and more.
What is the difference between UNION and UNION ALL
From Learn SQLUNION ALL keeps all of the records from each of the original data sets, UNION removes any duplicate records.
Aggregate Functions
From Learn SQLLearn about the COUNT function in SQL. The COUNT command is used to returns the count of how many rows there are. This interactive SQL tutorial will teach you how to COUNT.
DATE and TIME Functions
From Learn SQLDATE and TIME values in PostgreSQL have a whole special set of functions and operators for their proper use. Learn how to use dates in PostgreSQL in this interactive SQL tutorial, so you can get the DATE and TIME of your data analysis.
JOIN Relationships and JOINing Tables
From Learn SQLA JOIN SQL statement combines two data from two tables together. In this interactive SQL tutorial, learn the most common JOIN statements.
Mid Level SQL Practice Grounds
From Learn SQLAfter you've mastered mid-level SQL fundamentals, put that knowledge to the test in these interactive SQL practice exercises.
Increase Ecommerce Sales with Metrics
From Avoid Misrepresenting DataLearn why using a single metric prevents you from being able to increase ecommerce sales. See how to use multiple metrics to get more sales.
BigQuery Optimization
From SQL OptimizationLearn how BigQuery optimizes your database through specific hardware optimization strategies.
Indexing
From SQL OptimizationIndexing is a critical part of database optimization. Indexing can dramatically increase query speed. See how indexes work and learn to build indexes with SQL.
Multicolumn Indexes
From SQL OptimizationThis article walks through what multicolumn indexes are, how to make them, and when to use them.
Optimization with EXPLAIN ANALYZE
From SQL OptimizationThe EXPLAIN shows the query plan for SQL queries in Postgres. Learn how to interpret the results from EXPLAIN and use it to optimize your SQL queries.
Redshift Optimization
From SQL OptimizationLearn how Redshift manages optimization for your cloud data warehouse. Learn when to manually tune the database further for more performance for SQL queries.
Order of a SQL Query
From SQL OptimizationLearn the order of the SQL query to understand where you can optimize a query. Learn to prioritize FROM, JOIN, and WHERE.
Partial Indexes
From SQL OptimizationPartial indexes are an advanced database optimization technique. Learn how to create Partial Indexes and measure their performance.
Scheduling Modeling
From SQL OptimizationSchedule data modeling tasks to make data easy to query. Learn to use dbt Cloud to model data to optimize your data warehouse. Learn more.
Snowflake Optimization
From SQL OptimizationSnowflake is a powerful cloud database. Learn how it optimizes your database automatically, and hot to increase performance manually.
Views
From SQL OptimizationSQL VIEWs allow you to create temporary or permanent references to data. This is a common database optimization technique. Learn more.
Driving Actionable Insights Through Data Engagement
From Data ConversationsLearn how to drive actionable insights by using data engagement within your company.