Thursday, 26 September 2019

Difference between subquery and corelated query in sql server

The difference between correlated and regular subquery is also a frequently asked SQL interview question. Mostly asked on telephonic interview where they cannot ask you to solve query and check the fundamentals and theoretical concepts.

In this article, I am going to compare correlated subquery with the regular one of different parameters e.g. their working, speed, performance, and dependency. I'll also give you a good example of correlated subquery e.g. the Nth highest salary problem and explain how exactly it solves the problem.

So, if interviewer ask you to find the 4th highest salary then there can only be at most 4 salary which are equal to or greater than the 4th highest salary.  This is just an example, you can use correlated subquery to solve many such problems in the world of data and SQL.

In short, here are the main difference between correlated and non-correlated subqueries in SQL

Working - 

A non-correlated subquery is executed only once and its result can be swapped back for a query, on the other hand, a correlated subquery executed multiple times, precisely once for each row returned by the outer query.

For example, following query is an example of non-correlated subquery:

SELECT MAX(Salary) FROM Employee 
WHERE Salary NOT IN ( SELECT MAX(Salary) FROM Employee)


Here the subquery is SELECT MAX(Salary) from Employee, you can execute and substitute the result of that query e.g. if subquery return 10000 then outer query is reduced to

SELECT MAX(Salary) from Employee where Salary NOT IN (10000). 


This is not possible with a correlated subquery, which needs to be executed multiple times as shown below:

SELECT e.Name, e.Salary FROM Employee e
WHERE 2 = (
SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= e.salary)


In this example, the subquery is SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= e.salary, you cannot swap it's value for the outer query because it needs to be executed for each employee.

Let's say the first row of employee has salary 5000, in this case, e.salary will be 500 and subquery will be

SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= 5000

and subquery will find how many salaries are higher than 5000 if count return 2 then it's the second highest salary. This logic needs to be executed for each row outer query will process.

No comments:

Post a Comment