The difference between correlated and regular subquery is also a frequently asked SQL interview question. Mostly asked on telephonic interview where they cannot ask you to solve query and check the fundamentals and theoretical concepts.
In this article, I am going to compare correlated subquery with the regular one of different parameters e.g. their working, speed, performance, and dependency. I'll also give you a good example of correlated subquery e.g. the Nth highest salary problem and explain how exactly it solves the problem.
So, if interviewer ask you to find the 4th highest salary then there can only be at most 4 salary which are equal to or greater than the 4th highest salary. This is just an example, you can use correlated subquery to solve many such problems in the world of data and SQL.
In short, here are the main difference between correlated and non-correlated subqueries in SQL
Working -
A non-correlated subquery is executed only once and its result can be swapped back for a query, on the other hand, a correlated subquery executed multiple times, precisely once for each row returned by the outer query.
For example, following query is an example of non-correlated subquery:
Here the subquery is SELECT MAX(Salary) from Employee, you can execute and substitute the result of that query e.g. if subquery return 10000 then outer query is reduced to
This is not possible with a correlated subquery, which needs to be executed multiple times as shown below:
In this example, the subquery is SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= e.salary, you cannot swap it's value for the outer query because it needs to be executed for each employee.
Let's say the first row of employee has salary 5000, in this case, e.salary will be 500 and subquery will be
and subquery will find how many salaries are higher than 5000 if count return 2 then it's the second highest salary. This logic needs to be executed for each row outer query will process.
In this article, I am going to compare correlated subquery with the regular one of different parameters e.g. their working, speed, performance, and dependency. I'll also give you a good example of correlated subquery e.g. the Nth highest salary problem and explain how exactly it solves the problem.
So, if interviewer ask you to find the 4th highest salary then there can only be at most 4 salary which are equal to or greater than the 4th highest salary. This is just an example, you can use correlated subquery to solve many such problems in the world of data and SQL.
Working -
A non-correlated subquery is executed only once and its result can be swapped back for a query, on the other hand, a correlated subquery executed multiple times, precisely once for each row returned by the outer query.
For example, following query is an example of non-correlated subquery:
SELECT MAX(Salary) FROM Employee WHERE Salary NOT IN ( SELECT MAX(Salary) FROM Employee)
Here the subquery is SELECT MAX(Salary) from Employee, you can execute and substitute the result of that query e.g. if subquery return 10000 then outer query is reduced to
SELECT MAX(Salary) from Employee where Salary NOT IN (10000).
This is not possible with a correlated subquery, which needs to be executed multiple times as shown below:
SELECT e.Name, e.Salary FROM Employee e WHERE 2 = ( SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= e.salary)
In this example, the subquery is SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= e.salary, you cannot swap it's value for the outer query because it needs to be executed for each employee.
Let's say the first row of employee has salary 5000, in this case, e.salary will be 500 and subquery will be
SELECT COUNT(Salary) FROM Employee p WHERE p.salary >= 5000
and subquery will find how many salaries are higher than 5000 if count return 2 then it's the second highest salary. This logic needs to be executed for each row outer query will process.