Wednesday, 30 December 2015

What is the Difference between ROW_NUMBER(), RANK() and DENSE_RANK()?

Introduction

All of these three functions are used to calculate RowID for the result set returned from a query but in a slightly different way.

Row_Number()

This function will assign a unique id to each row returned from the query.
Consider the following query:
DECLARE @Table TABLE (
      Col_Value varchar(2)
)

INSERT INTO @Table (Col_Value)
      VALUES ('A'),('A'),('A'),('B'),('B'),('C'),('C');

SELECT
      Col_Value,
      ROW_NUMBER() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;    
After executing it, we will get:
Col_ValueRowID
A1
A2
A3
B4
B5
C6
C7
As we notice, each and every row has a unique ID.

Rank()

This function will assign a unique number to each distinct row, but it leaves a gap between the groups. Let me explain with a query, we will use the same query we used above with Rank().
SELECT
      Col_Value,
      Rank() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;
This query will return:
Col_ValueRowID
A1
A1
A1
B4
B4
C6
C6
As we can see, rowid is unique for each distinct value, but with a gap. What is this gap?
This gap represents number of occurrence. For example: value ‘a’ is repeated thrice and has rank ‘1’, the next rank will be 1+3=4. Same with the next value 4+2=6 and so on.

Dense_Rank()

This function is similar to Rank with only difference, this will not leave gaps between groups.
So if we use the same query used for Rank, then:
SELECT
      Col_Value,
      DENSE_RANK() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;
We will get:
Col_ValueRowID
A1
A1
A1
B2
B2
C3
C3
So it is clear that it generates a unique id for each group and without repetition.
As we are clear now what these functions do, we can use them in different scenarios based on our need.
For example: Row_Number() can be used if we want to skip and fetch some records. Take a look at this query:
WITH AllRecords
AS (
      SELECT
            Col_Value,
            ROW_NUMBER() OVER (ORDER BY Col_Value) AS 'RowID'
      FROM
            @Table)

SELECT
      Col_Value,
      RowID
FROM
      AllRecords
WHERE
      RowID >= 4
      AND RowID <= 5;
This will return only those rows with RowID 4 and 5. This function is very useful in paging data in SQL instead of in code till SQL SERVER 2012. In SQL Server 2012, Microsoft introduces a new feature Offset Fetch similar to what we did in the above query.
In SQL Server 2012, the above query will be written as:
SELECT
      Col_Value
FROM
      @Table
ORDER BY
      Col_Value OFFSET 3 ROWS FETCH NEXT 2 ROWS ONLY;
This will skip first 3 rows and will fetch next 2.
Similarly, we can use Dense_Rank() when we need to calculate rowid with Select Distinct.
SELECT
DISTINCT
      Col_Value,
      DENSE_RANK() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;

Will return:-
Col_ValueRowID
A1
B2
C3
Or when we need a report where ranks clashes. For example: Two students scored same marks and share same division.
We can use Rank() if we don’t want consecutive numbers.

Difference Between CTE, Temp table and table variable

 Temp Table or Table variable or CTE are commonly used for storing data temporarily in SQL Server. In this article, you will learn the differences among these three.

CTE

CTE stands for Common Table expressions. It was introduced with SQL Server 2005. It is a temporary result set and typically it may be a result of complex sub-query. Unlike temporary table its life is limited to the current query. It is defined by using WITH statement. CTE improves readability and ease in maintenance of complex queries and sub-queries. Always begin CTE with semicolon.

A sub query without CTE is given below :

  1. SELECT * FROM (
  2. SELECT Addr.Address, Emp.Name, Emp.Age From Address Addr
  3. Inner join Employee Emp on Emp.EID = Addr.EID) Temp
  4. WHERE Temp.Age > 50
  5. ORDER BY Temp.NAME

By using CTE above query can be re-written as follows :

  1. ;With CTE1(Address, Name, Age)--Column names for CTE, which are optional
  2. AS
  3. (
  4. SELECT Addr.Address, Emp.Name, Emp.Age from Address Addr
  5. INNER JOIN EMP Emp ON Emp.EID = Addr.EID
  6. )
  7. SELECT * FROM CTE1 --Using CTE
  8. WHERE CTE1.Age > 50
  9. ORDER BY CTE1.NAME

When to use CTE

  1. This is used to store result of a complex sub query for further use.
  2. This is also used to create a recursive query.

Temporary Tables

In SQL Server, temporary tables are created at run-time and you can do all the operations which you can do on a normal table. These tables are created inside Tempdb database. Based on the scope and behavior temporary tables are of two types as given below-
  1. Local Temp Table

    Local temp tables are only available to the SQL Server session or connection (means single user) that created the tables. These are automatically deleted when the session that created the tables has been closed. Local temporary table name is stared with single hash ("#") sign.
    1. CREATE TABLE #LocalTemp
    2. (
    3. UserID int,
    4. Name varchar(50),
    5. Address varchar(150)
    6. )
    7. GO
    8. insert into #LocalTemp values ( 1, 'Shailendra','Noida');
    9. GO
    10. Select * from #LocalTemp
    The scope of Local temp table exist to the current session of current user means to the current query window. If you will close the current query window or open a new query window and will try to find above created temp table, it will give you the error.
  2. Global Temp Table

    Global temp tables are available to all SQL Server sessions or connections (means all the user). These can be created by any SQL Server connection user and these are automatically deleted when all the SQL Server connections have been closed. Global temporary table name is stared with double hash ("##") sign.
    1. CREATE TABLE ##GlobalTemp
    2. (
    3. UserID int,
    4. Name varchar(50),
    5. Address varchar(150)
    6. )
    7. GO
    8. insert into ##GlobalTemp values ( 1, 'Shailendra','Noida');
    9. GO
    10. Select * from ##GlobalTemp
    Global temporary tables are visible to all SQL Server connections while Local temporary tables are visible to only current SQL Server connection.

Table Variable

This acts like a variable and exists for a particular batch of query execution. It gets dropped once it comes out of batch. This is also created in the Tempdb database but not the memory. This also allows you to create primary key, identity at the time of Table variable declaration but not non-clustered index.
  1. GO
  2. DECLARE @TProduct TABLE
  3. (
  4. SNo INT IDENTITY(1,1),
  5. ProductID INT,
  6. Qty INT
  7. )
  8. --Insert data to Table variable @Product
  9. INSERT INTO @TProduct(ProductID,Qty)
  10. SELECT DISTINCT ProductID, Qty FROM ProductsSales ORDER BY ProductID ASC
  11. --Select data
  12. Select * from @TProduct
  13. --Next batch
  14. GO
  15. Select * from @TProduct --gives error in next batch

Note

  1. Temp Tables are physically created in the Tempdb database. These tables act as the normal table and also can have constraints, index like normal tables.
  2. CTE is a named temporary result set which is used to manipulate the complex sub-queries data. This exists for the scope of statement. This is created in memory rather than Tempdb database. You cannot create any index on CTE.
  3. Table Variable acts like a variable and exists for a particular batch of query execution. It gets dropped once it comes out of batch. This is also created in the Tempdb database but not the memory.

    -----------------------------------------------------------------------------------------------------------
    Both Temporary Tables (a.k.a # Tables) and Table Variables (a.k.a @ Tables) in Sql Server provide a mechanism for Temporary holding/storage of the result-set for further processing.
    Below table lists out some of the major difference between Temporary Table and Table Variable. Each of these differences are explained in-detail with extensive list of examples in the next articles in this series which are listed above.
    1. SYNTAX
    Below is the sample example of Creating a Temporary Table, Inserting records into it, retrieving the rows from it and then finally dropping the created Temporary Table.
    -- Create Temporary Table
    CREATE TABLE #Customer
    (Id INT, Name VARCHAR(50))
    --Insert Two records
    INSERT INTO #Customer
    VALUES(1,'Basavaraj')
    INSERT INTO #Customer
    VALUES(2,'Kalpana')
    --Reterive the records
    SELECT * FROM #Customer
    --DROP Temporary Table
    DROP TABLE #Customer
    GO

    Below is the sample example of Declaring a Table Variable, Inserting records into it and retrieving the rows from it.
    -- Create Table Variable
    DECLARE @Customer TABLE
    (
     Id INT,
     Name VARCHAR(50)  
    )
    --Insert Two records
    INSERT INTO @Customer
    VALUES(1,'Basavaraj')
    INSERT INTO @Customer
    VALUES(2,'Kalpana')
    --Reterive the records
    SELECT * FROM @Customer
    GO
    RESULT:
    2. MODIFYING STRUCTURE
    Temporary Table structure can be changed after it’s creation it implies we can use DDL statements ALTER, CREATE, DROP.
    Below script creates a Temporary Table #Customer, adds Address column to it and finally the Temporary Table is dropped.
    --Create Temporary Table
    CREATE TABLE #Customer
    (Id INT, Name VARCHAR(50))
    GO
    --Add Address Column
    ALTER TABLE #Customer
    ADD Address VARCHAR(400)
    GO
    --DROP Temporary Table
    DROP TABLE #Customer
    GO
    Table Variables doesn’t support DDL statements like ALTER, CREATE, DROP etc, implies we can’t modify the structure of Table variable nor we can drop it explicitly.
    3. STORAGE LOCATION
    One of the most common MYTH about Temporary Table & Table Variable is that: Temporary Tables are created in TempDB and Table Variables are created In-Memory. Fact is that both are created in TempDB, below Demos prove this reality.
    4. TRANSACTIONS
    Temporary Tables honor the explicit transactions defined by the user.Table variables doesn’t participate in the explicit transactions defined by the user.
    5. USER DEFINED FUNCTION
    Temporary Tables are not allowed in User Defined Functions.Table Variables can be used in User Defined Functions.
    6. INDEXES
    Temporary table supports adding Indexes explicitly after Temporary Table creation and it can also have the implicit Indexes which are the result of Primary and Unique Key constraint.Table Variables doesn’t allow the explicit addition of Indexes after it’s declaration, the only means is the implicit indexes which are created as a result of the Primary Key or Unique Key constraint defined during Table Variable declaration.
    7. SCOPE
    There are two types of Temporary Tables, one Local Temporary Tables whose name starts with single # sign and other one is Global Temporary Tables whose name starts with two # signs.Scope of the Local Temporary Table is the session in which it is created and they are dropped automatically once the session ends and we can also drop them explicitly. If a Temporary Table is created within a batch, then it can be accessed within the next batch of the same session. Whereas if a Local Temporary Table is created within a stored procedure then it can be accessed in it’s child stored procedures, but it can’t be accessed outside the stored procedure.Scope of Global Temporary Table is not only to the session which created, but they will visible to all other sessions. They can be dropped explicitly or they will get dropped automatically when the session which created it terminates and none of the other sessions are using it.Scope of the Table variable is the Batch or Stored Procedure in which it is declared. And they can’t be dropped explicitly, they are dropped automatically when batch execution completes or the Stored Procedure execution completes.