Subqueries and Derived Tables

Subqueries and Derived Tables

Here’s where SQL gets really elegant. Sometimes one query isn’t enough to answer your question. Maybe you need to find employees who earn more than the average salary, or customers who’ve spent more than $1000 this year. That’s where subqueries come in - queries inside other queries that let you solve complex problems step by step.

What’s a Subquery?

A subquery is simply a query nested inside another query. Think of it as asking a question to help answer a bigger question.

1SELECT first_name, salary
2FROM employees
3WHERE salary > (SELECT AVG(salary) FROM employees);

The inner query SELECT AVG(salary) FROM employees runs first and returns one number. Then the outer query uses that number to find employees earning above average.

Simple Subqueries in WHERE Clauses

This is the most common use - filtering based on a calculated value.

Find employees earning more than average:

1SELECT first_name, last_name, salary
2FROM employees
3WHERE salary > (SELECT AVG(salary) FROM employees);

Find products more expensive than the average product in Electronics:

1SELECT product_name, price
2FROM products
3WHERE price > (
4    SELECT AVG(price) 
5    FROM products 
6    WHERE category = 'Electronics'
7);

Find departments with more employees than the IT department:

1SELECT department_name, employee_count
2FROM departments
3WHERE employee_count > (
4    SELECT employee_count 
5    FROM departments 
6    WHERE department_name = 'IT'
7);

Using IN with Subqueries

Sometimes your subquery returns multiple values, and you want to match any of them:

Find employees who work in departments located in New York:

1SELECT first_name, last_name
2FROM employees
3WHERE department_id IN (
4    SELECT department_id
5    FROM departments
6    WHERE city = 'New York'
7);

Find customers who placed orders in 2024:

1SELECT customer_name, email
2FROM customers
3WHERE customer_id IN (
4    SELECT customer_id
5    FROM orders
6    WHERE YEAR(order_date) = 2024
7);

EXISTS: Does This Record Exist?

EXISTS checks whether a subquery returns any rows at all. It’s often more efficient than IN:

Find customers who have placed at least one order:

1SELECT customer_name
2FROM customers c
3WHERE EXISTS (
4    SELECT 1
5    FROM orders o
6    WHERE o.customer_id = c.customer_id
7);

Find products that have never been ordered:

1SELECT product_name
2FROM products p
3WHERE NOT EXISTS (
4    SELECT 1
5    FROM order_items oi
6    WHERE oi.product_id = p.product_id
7);

Notice the SELECT 1 - since EXISTS only cares whether rows exist, it doesn’t matter what you select.

Subqueries in SELECT Clauses

You can use subqueries to add calculated columns:

Show employees with their salary compared to department average:

1SELECT 
2    first_name,
3    salary,
4    (SELECT AVG(salary) 
5     FROM employees e2 
6     WHERE e2.department_id = e1.department_id) as dept_avg_salary
7FROM employees e1;

Show customers with their total order count:

1SELECT 
2    customer_name,
3    (SELECT COUNT(*) 
4     FROM orders o 
5     WHERE o.customer_id = c.customer_id) as total_orders
6FROM customers c;

Correlated vs. Non-Correlated Subqueries

Non-correlated subqueries run once and return a value:

1-- This runs once: SELECT AVG(salary) FROM employees
2SELECT * FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);

Correlated subqueries run once for each row in the outer query:

1-- This runs once per employee to get their department's average
2SELECT first_name, salary
3FROM employees e1
4WHERE salary > (
5    SELECT AVG(salary) 
6    FROM employees e2 
7    WHERE e2.department_id = e1.department_id
8);

Correlated subqueries are more flexible but can be slower.

Derived Tables: Subqueries in FROM

You can use a subquery as if it were a table:

1SELECT dept_name, avg_salary
2FROM (
3    SELECT department_id, AVG(salary) as avg_salary
4    FROM employees
5    GROUP BY department_id
6) salary_by_dept
7JOIN departments d ON salary_by_dept.department_id = d.department_id;

The subquery creates a temporary result set that you can then join with other tables.

Real-World Examples

Find top-performing salespeople (above average sales):

 1SELECT 
 2    salesperson_name,
 3    total_sales
 4FROM (
 5    SELECT 
 6        employee_id,
 7        first_name + ' ' + last_name as salesperson_name,
 8        SUM(sale_amount) as total_sales
 9    FROM employees e
10    JOIN sales s ON e.employee_id = s.employee_id
11    WHERE YEAR(sale_date) = 2024
12    GROUP BY employee_id, first_name, last_name
13) salesperson_totals
14WHERE total_sales > (
15    SELECT AVG(total_sales)
16    FROM (
17        SELECT SUM(sale_amount) as total_sales
18        FROM sales
19        WHERE YEAR(sale_date) = 2024
20        GROUP BY employee_id
21    ) avg_calc
22);

Find customers who spent more this year than last year:

 1SELECT customer_name
 2FROM customers c
 3WHERE (
 4    SELECT SUM(total_amount)
 5    FROM orders o
 6    WHERE o.customer_id = c.customer_id 
 7      AND YEAR(order_date) = 2024
 8) > (
 9    SELECT COALESCE(SUM(total_amount), 0)
10    FROM orders o
11    WHERE o.customer_id = c.customer_id 
12      AND YEAR(order_date) = 2023
13);

Common Patterns

Top N per group using subqueries:

1-- Top 2 highest-paid employees per department
2SELECT first_name, department_id, salary
3FROM employees e1
4WHERE (
5    SELECT COUNT(*)
6    FROM employees e2
7    WHERE e2.department_id = e1.department_id
8      AND e2.salary > e1.salary
9) < 2;

Finding records without matches:

1-- Customers who haven't ordered in 2024
2SELECT customer_name
3FROM customers c
4WHERE NOT EXISTS (
5    SELECT 1
6    FROM orders o
7    WHERE o.customer_id = c.customer_id
8      AND YEAR(order_date) = 2024
9);

When to Use Subqueries vs. JOINs

Use subqueries when:

  • You need to filter based on aggregate values
  • You’re checking for existence/non-existence
  • The logic is clearer as a step-by-step process

Use JOINs when:

  • You need columns from multiple tables in your result
  • Performance is critical (JOINs are often faster)
  • You’re combining data rather than filtering it

Sometimes you can solve the same problem either way:

1-- Subquery approach:
2SELECT customer_name FROM customers
3WHERE customer_id IN (SELECT customer_id FROM orders WHERE YEAR(order_date) = 2024);
4
5-- JOIN approach:
6SELECT DISTINCT c.customer_name
7FROM customers c
8INNER JOIN orders o ON c.customer_id = o.customer_id
9WHERE YEAR(o.order_date) = 2024;

Common Mistakes

Forgetting that subqueries can return multiple rows:

1-- Wrong if subquery returns multiple values:
2WHERE department_id = (SELECT department_id FROM departments WHERE city = 'New York');
3
4-- Right:
5WHERE department_id IN (SELECT department_id FROM departments WHERE city = 'New York');

Correlated subqueries without proper correlation:

1-- Wrong - missing correlation:
2SELECT * FROM employees e1
3WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id = e1.department_id);
4
5-- Right - proper table alias:
6SELECT * FROM employees e1
7WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e2.department_id = e1.department_id);

Performance issues with complex correlated subqueries: Consider whether a JOIN might be more efficient for complex operations.

Why Subqueries Matter

Subqueries let you break complex problems into manageable pieces. Instead of trying to write one massive, complicated query, you can:

  1. Solve part of the problem with an inner query
  2. Use that result to solve the larger problem
  3. Keep your logic clear and readable

They’re particularly powerful for:

  • Comparative analysis (above/below average)
  • Existence checks (has/hasn’t done something)
  • Complex filtering based on calculated values
  • Step-by-step problem solving

Practice Exercises

  1. Find employees who earn more than their department’s average salary
  2. List customers who have placed more than 5 orders
  3. Find products that cost more than the most expensive product in the ‘Books’ category
  4. Show departments that have more employees than the company average
  5. Find customers who haven’t placed an order in the last 6 months

Master subqueries, and you’ll be able to solve almost any data problem. They’re the key to turning complex business questions into elegant SQL solutions.