Difference Between Primary Key and Index

What is a Primary Key?

A primary key is a database column or a combination of columns which uniquely identify all the records of a table. For each record in the table, the primary key acts like a “Social Security Number” or a “Driver’s License Number” where each number is linked only to one person. The key is used to compare, sort, and store records, as well as create relationships between them. Let’s look at an example of a primary key using SQL:

Example: To start, let’s create a table called “Persons” with the following fields :

  1. ID
  2. Last Name
  3. First Name
  4. Age

As people might share common names or be the same age, the last name, first name, and age columns cannot be used as the primary key. The column titled “ID” will be used as the primary key. The “ID” primary key enables us to uniquely identify each row in the table.

CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (ID)
);

What is an Index?

Indexes are used to quickly locate data within a table. It eliminates the need to search every row in the table each time the table is accessed. An index can be created using one or more columns from a database table.

If we are querying using SQL with a WHERE clause, and the table does not have indexes on any column, the SQL server has to walk through the whole table and check every row to see if it matches, this can be a slow operation on big tables. Let’s look at an example of an index using SQL:

Example: Using the table from the first example titled “Persons”, let’s create an index that searches the database by last name.

CREATE INDEX name
ON Persons (LastName);

By doing this, while running a SQL query only the rows containing the index will be loaded into computer memory. Beneficially, matching records can be found much easier in an indexed database.

Now that you have an overview of both a primary key and an index, let’s do a quick comparison of the key features.

Primary Key vs Index

Primary Key

Index

Used to create relationship between recordsUsed to easily locate records
It may not have a NULL valueIt may be NULL
It helps in data integrityIt helps in data distribution
It has to be uniqueIt may be unique or non-unique
It should ideally be immutable (unchanged)Index can be updated whenever required

Conclusion

To summarize, primary key is just a command that makes sure you have unique and not null values in a particular set of columns. It enables consistent storage of data by ensuring we do not end up with duplicate records. On the other hand, index is a separate object in a database that optimizes the storage of table records to help with quick retrieval of records. Hence, there would not be any performance penalty if a table uses an index, but does not use a primary key.

Resources

  1. https://www.w3schools.com/sql/sql_primarykey.asp
  2. https://www.tutorialspoint.com/sql/sql-indexes.htm

 

Rohan Joseph

About Rohan Joseph

Practicing the dark arts of data science. I am currently pursuing Master's in Operations Research at Virginia Tech and working with Chartio to democratize analytics in every organization.