How to Use the SQL Subquery: A Detailed Guide
When working with SQL, retrieving complex data often requires breaking down queries into smaller, more manageable parts. This is where subqueries come in.
In this guide, we’ll explore subqueries in SQL, covering their various uses, types, applications, and best practices to make the most out of this powerful feature. By the end of this article, we’ll have a solid understanding of how to use the SQL subquery effectively to improve our database operations.
Firstly, let’s discuss what SQL subquery is and how it helps.
What is SQL subquery?
A SQL subquery, otherwise known as a nested SQL query, is a query within another query that helps filter, transform, or aggregate data before passing the results to the main query. We can use subqueries in SELECT
, INSERT
, UPDATE
, and DELETE
statements as well as WHERE
, FROM
, and HAVING
clauses.
Subqueries allow users to break complex queries into smaller, more manageable parts. This makes it easier to retrieve and manipulate data in a structured way. They are particularly useful for performing calculations, filtering data dynamically, and reducing redundancy in SQL queries.
The general syntax for a subquery is:
SELECT column1, column2, ...
FROM table_name
WHERE column_name [operator] (SELECT column_name FROM table_name WHERE condition);
Here, the part inside the parentheses is the subquery, and the part surrounding it is the main query.
For example, suppose there are two tables, employees
and departments
. If we want to find the names of employees who work in the ‘Sales’ department, we can use this query:
SELECT employee_nameFROM employeesWHERE department_id = (SELECT department_id FROM departments WHERE department_name = 'Sales');
In this query:
- The subquery is executed first.
- It searches the
departments
table and returns thedepartment_id
of the ‘Sales’ department. - Next, the outer query is executed.
- It retrieves the
department_id
returned by the subquery and returns all employees whosedepartment_id
matches with the retrieved one.
This is how queries, including subqueries, are executed in SQL.
In the next section, we’ll discuss the several types of subqueries in SQL.
How to Transform Tables with SQL
Practice more SQL in this course that covers how to manipulate and transform data.Try it for freeTypes of subqueries
Subqueries can be divided into different types based on how they return data. Let’s see how some of them work with examples based on this products
table:
product_id | product_name | price | category_id |
---|---|---|---|
1 | Laptop | 1000 | 101 |
2 | Smartphone | 600 | 102 |
3 | Tablet | 300 | 101 |
4 | Headphones | 150 | 103 |
Scalar subqueries
A scalar subquery is a type of subquery that returns a single value (one column and one row). It is commonly used in the SELECT
statement and the WHERE
clause.
For example, if we want to retrieve products that cost more than the average price in the products
table, we can use this query:
SELECT product_nameFROM productsWHERE price > (SELECT AVG(price) FROM products);
In this query, the scalar subquery first calculates the average price of all products, and then the outer query selects products with a price greater than the average price:
product_name |
---|
Laptop |
Smartphone |
Multi-row subqueries
A multi-row subquery returns a single column of values but multiple rows. It is often used with IN
and ANY
operators.
For example, if we want to find products that belong to the same category as ‘Smartphone’ in the products
table, we can use this query:
SELECT product_nameFROM productsWHERE category_id IN (SELECT category_id FROM products WHERE product_name = 'Smartphone');
In this query, the multi-row subquery first retrieves the category ID of ‘Smartphone’, and then the outer query selects products in the same category:
product_name |
---|
Smartphone |
Multi-column subqueries
A multi-column subquery returns multiple columns but a single row. It is often used with comparison operators.
For example, if we want to retrieve the details of the product with the highest price in the products
table, we can use this query:
SELECT *FROM productsWHERE (price, category_id) = (SELECT MAX(price), category_id FROM products);
In this query, the multi-column subquery first determines the maximum price and category ID. Then, the main query selects the product matching the highest price and category:
product_id | product_name | price | category_id |
---|---|---|---|
1 | Laptop | 1000 | 101 |
Correlated subqueries
A correlated subquery relies on the outer query for its execution. The subquery is executed once for each row processed by the outer query. It is often used with EXISTS
and comparison operators.
For example, if we want to retrieve products whose price is higher compared to the average price in their own category in the products
table, we can use this query:
SELECT product_name, price, category_idFROM products p1WHERE price > (SELECT AVG(price) FROM products p2 WHERE p2.category_id = p1.category_id);
The subquery is correlated because p1.category_id
from the main query refers to p2.category_id
in the subquery. For each product in the main query, the subquery computes the average price of products in the same category. Then, the main query retrieves products whose price is higher than the average price in their own category:
product_name | price | category_id |
---|---|---|
Laptop | 1000 | 101 |
Nested subqueries
A nested subquery is executed independently of the outer query. The subquery runs first, produces a result, and then the outer query uses that result. These subqueries are often used with IN
, EXISTS
, ANY
, ALL
, or as part of a WHERE
clause.
For example, if we want to retrieve products from categories that have an average price higher than the overall average price in the products
table, we can use this query:
SELECT product_nameFROM productsWHERE category_id IN (SELECT category_id FROM products GROUP BY category_id HAVING AVG(price) > (SELECT AVG(price) FROM products));
In this query:
- The inner subquery calculates the overall average price.
- Then, the middle subquery calculates the average price for each category and compares it to the overall average price.
- Then, the main query retrieves products from categories that have an average price higher than the overall average price.
Here is the output:
product_name |
---|
Laptop |
Smartphone |
Tablet |
In the next section, we’ll discuss the different ways of using the SQL subquery.
How to use the SQL subquery
In this section, we’ll learn how to use the SQL subquery with SELECT
, FROM
, INSERT
, UPDATE
, and DELETE
.
Firstly, let’s learn how to use the SQL subquery with the SELECT
statement.
Using the SQL subquery with the SELECT
statement
The SELECT
statement in SQL allows us to retrieve data from a database. We can use the SQL subquery with the SELECT
statement to perform operations in SQL.
Consider a table employees
containing the given data:
employee_id | name | department_id | salary |
---|---|---|---|
1 | Alice | 101 | 60000 |
2 | Bob | 102 | 55000 |
3 | Charlie | 101 | 70000 |
4 | David | 103 | 75000 |
5 | Eve | 102 | 50000 |
We’ll use this table for all the upcoming examples involving SQL subqueries.
Firstly, for example, if we want to find the average salary of each department in the employees
table, we can do that by combining the SQL subquery and the SELECT
statement:
SELECT name, salary, (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) AS dept_avg_salaryFROM employees e;
In this query, the subquery first finds the department-wise average salary, and then the main query displays each employee’s name, salary, and their department’s average salary:
name | salary | dept_avg_salary |
---|---|---|
Alice | 60000 | 65000 |
Bob | 55000 | 52500 |
Charlie | 70000 | 65000 |
David | 75000 | 75000 |
Eve | 50000 | 52500 |
Using the SQL subquery with the WHERE
clause
The WHERE
clause in SQL enables us to filter records in a table.
For example, if we want to retrieve employees from departments that have more than one employee in the employees
table, we can do that by combining the SQL subquery and the WHERE
clause:
SELECT nameFROM employeesWHERE department_id IN (SELECT department_id FROM employees GROUP BY department_id HAVING COUNT(*) > 1);
In this query, the subquery first finds the departments with more than one employee and then the main query retrieves the employees in those departments:
name |
---|
Alice |
Bob |
Charlie |
Eve |
Using the SQL subquery with the FROM
clause
The FROM
clause in SQL specifies the table to be manipulated.
For example, if we want to identify employees earning more than their department’s average salary in the employees
table, we can do that by combining the SQL subquery and the FROM
clause:
SELECT e.name, e.salaryFROM (SELECT department_id, AVG(salary) AS avg_salary FROM employees GROUP BY department_id) AS dept_avgJOIN employees e ON e.department_id = dept_avg.department_idWHERE e.salary > dept_avg.avg_salary;
In this query, the subquery first calculates the average salary per department. Then, the main query compares each employee’s salary with their department’s average and retrieves those who earn more:
name | salary |
---|---|
Bob | 55000 |
Charlie | 70000 |
Using the SQL subquery with the INSERT
statement
The INSERT
statement in SQL allows us to insert new data into a table.
For example, if we want to insert a new employee with a salary set to the average salary into the employees
table, we can do that by combining the SQL subquery and the INSERT
statement:
INSERT INTO employees (employee_id, name, department_id, salary) VALUES(6, 'Robert', 103, (SELECT AVG(salary) FROM employees));SELECT * FROM employees;
In this query, the subquery first calculates the average salary in the employees
table, and then the main query inserts a new employee into the table with a salary set to the average salary:
employee_id | name | department_id | salary |
---|---|---|---|
1 | Alice | 101 | 60000 |
2 | Bob | 102 | 55000 |
3 | Charlie | 101 | 70000 |
4 | David | 103 | 75000 |
5 | Eve | 102 | 50000 |
6 | Robert | 103 | 62000 |
Using the SQL subquery with the UPDATE
statement
The UPDATE
statement in SQL allows us to update existing data in a table.
For example, if we want to increase the salaries of employees earning less than 60,000 by 50% in the employees
table, we can do that by combining the SQL subquery and the UPDATE
statement:
UPDATE employeesSET salary = salary * 1.5WHERE employee_id IN (SELECT employee_id FROM employees WHERE salary < 60000);SELECT * FROM employees;
In this query, the subquery finds employees earning less than 60,000, and then the main query increases their salaries by 50%:
employee_id | name | department_id | salary |
---|---|---|---|
1 | Alice | 101 | 60000 |
2 | Bob | 102 | 82500 |
3 | Charlie | 101 | 70000 |
4 | David | 103 | 75000 |
5 | Eve | 102 | 75000 |
6 | Robert | 103 | 62000 |
Using the SQL subquery with the DELETE
statement
The DELETE
statement in SQL allows us to delete existing data from a table.
For example, if we want to delete employees earning less than the average salary from the employees
table, we can do that by combining the SQL subquery and the DELETE
statement:
DELETE FROM employeesWHERE salary < (SELECT AVG(salary) FROM employees);SELECT * FROM employees;
In this query, the subquery calculates the average salary, and then the main query removes employees earning less than that:
employee_id | name | department_id | salary |
---|---|---|---|
2 | Bob | 102 | 82500 |
4 | David | 103 | 75000 |
5 | Eve | 102 | 75000 |
In the next section, we’ll cover some best practices for using SQL subqueries.
Best practices for using SQL subqueries
There are some best practices that we can follow while using SQL subqueries, including:
- Use subqueries selectively: If the same result can be achieved with a
JOIN
, prefer that approach, as it often performs better. - Optimize performance: Index columns used in subqueries to enhance execution speed.
- Avoid correlated subqueries when possible: These execute repeatedly for each row, affecting performance.
- Use table aliases: This improves query readability and maintainability.
- Test subqueries separately: Run the subquery independently before integrating it into the main query to ensure correctness and expected results.
- Keep subqueries readable: Format and indent subqueries properly to enhance readability and maintainability.
Applying these best practices can significantly enhance the efficiency of using subqueries in SQL.
Conclusion
In this article, we learned that subqueries in SQL are a versatile tool that can simplify advanced queries by breaking them into smaller, more manageable parts. They can be used within various SQL clauses, including SELECT
, FROM
, and WHERE
, to filter, transform, or aggregate data dynamically. Understanding their execution flow and different types enables efficient query writing.
If you want to learn more about SQL subqueries, check out the How to Transform Tables with SQL course on Codecademy.
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'
Meet the full teamRelated articles
- Article
How to Use the CASE Statement in SQL (With Examples)
Learn how to use the SQL `CASE` statement to implement conditional logic efficiently. Explore its syntax, use cases, best practices, and performance considerations. - Article
SQL Commands
Glossary of commonly used SQL commands. - Article
Difference Between WHERE and HAVING Clause in SQL
Discover the key differences between SQL's WHERE and HAVING clauses. Learn their syntax and use cases and how to apply them effectively in data filtering.
Learn more on Codecademy
- Course
How to Transform Tables with SQL
Practice more SQL in this course that covers how to manipulate and transform data.With CertificateIntermediate2 hours - Skill path
Analyze Data with SQL
Learn to analyze data with SQL and prepare for technical interviews.Includes 9 CoursesWith CertificateBeginner Friendly17 hours - Free course
Intro to SQL
Use SQL to create, access, and update tables of data in a relational database.Beginner Friendly2 hours