In this Blog we will discuss how index actually work and help improve the performance of the database.
Clustered Index Structure:
EmployeeId is the primary key, so by default a clusterd index on the EmployeeId column is created. This means employee data is sorted by EmployeeId column and physically stored in a series of data pages in a tree like structure that looks like the following.
- The nodes at the bottom of the tree are called data pages or leaf nodes and contain the actual data rows, in our case employee rows.
- These employee rows are sorted by EmployeeId column, because EmployeeId is the primary key and by default a clusterd index on this column is created.
- For our example, let's say in Employees table we have 1200 rows and let's assume in each data page we have 200 rows.
- So, in the first data page we have 1 to 200 rows, in the second 201 to 400, in the third 401 to 600, so on and so forth.
- The node at the top of the tree is called Root Node.
- The nodes between the root node and the leaf nodes are called intermediate levels.
- The root and and the intermediate level nodes contain index rows.
- Each index row contains a key value (in our case Employee Id) and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf node.
- So this tree like structure has a series of pointers that helps the query engine find data quickly.
How SQL Server finds a row by ID:
For example, let's say we want to find Employee row with EmployeeId = 1120
So the database engine starts at the root node and it picks the index node on the right, because the database engine knows it is this node that contains employee IDs from 801 to 1200.
From there, it picks the leaf node that is present on the extreme right, because employee data rows from 1001 to 1200 are present in this leaf node.
The data rows in the leaf node are sorted by Employee ID, so it's easy for the database engine to find the employee row with Id = 1120.
Notice in just 3 operations, SQL Server is able to find the data we are looking for. It's making use of the clustered index we have on the table. Let's look at this in action.
Non-Clustered Index in SQL Server:
This is when we create a non-clustered index on the Name column.
CREATE NONCLUSTERED INDEX IX_Employees_Name
ON [dbo].[Employees] ([Name])
In an non-clusterd index we do not have table data. We have key values and row locators.
We created a non-clustered index on the Name column, so the key values, in this case Employee names are sorted and stored in alphabetical order.
The row locators at the bottom of the tree contain Employee Names and cluster key of the row. In our example, Employee Id is the cluster key.