How To Use CASE WHEN With SUM: A Full SQL Guide

How To Use CASE WHEN With SUM: A Full SQL Guide

Introduction

SQL (Structured Query Language) is one of the most important querying languages in use today to access and transform relational databases. A common SQL problem statement is to perform a database manipulation based on conditional logic.

The CASE WHEN statement provides flexibility in expressing conditional logic as it can handle multiple conditions. To perform conditional aggregation, the SUM function combined with CASE WHEN in SQL will achieve optimized results.

In this article, we’ll briefly touch upon the SUM() and CASE() functions, some common use cases, and how the two functions can be nested. We will also explore some real-life examples for practice and provide a few useful tips to ensure error-free, optimized code.

SQL CASE() Function Review

The CASE function will help you transform data in SQL with conditional logic, much like an IF-ELSE statement.

SQL Case When Keyword

The CASE expression goes through several conditions and returns a value when the first condition is met. So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.

Example

Let’s say you’ve been asked to categorize employees based on their yearly pay.

EmployeeID EmployeeName Salary
1 John Doe 55000
2 Jane Smith 60000
3 Alice Johnson 120000
4 Bob Williams 45000
5 Eva Davis 120000

Here is the query to achieve the given operation.

SELECT
    EmployeeName,
    Salary,
    CASE
        WHEN salary >= 50000 AND salary < 70000 THEN 'Grade A'
        WHEN salary >= 70000 AND salary < 90000 THEN 'Grade B'
        WHEN salary >= 90000 THEN 'Grade C'
        ELSE 'Grade D'
    END AS Grade
FROM
    employees;

Here is the desired result:

EmployeeID EmployeeName Salary Grade
1 John Doe 55000 Grade A
2 Jane Smith 60000 Grade A
3 Alice Johnson 120000 Grade B
4 Bob Williams 45000 Grade D
5 Eva Davis 120000 Grade C

In this example, the CASE statement handles multiple conditions to categorize employees into different pay grades. Each WHEN condition checks a specific range of salaries and assigns the corresponding pay grade. The ELSE statement handles all other conditions that are not included in the WHEN conditions.

SQL SUM() Function Review

The SUM function in SQL is an aggregate function used for calculating the total sum of values in a specific column. It is most often used with the GROUP BY clause to calculate totals within groups. It can also be used along with DISTINCTto calculate the sum of distinct values in a column or with other arithmetic operations such as COUNTto perform complex calculations.

Note that SUM ignores NULL values. If a row contains a NULL value in the column being summed, it is not included in the calculation. This feature can help when handling missing data.

Combining SUM() With CASE in SQL

Using SUM with CASE WHEN allows you to perform conditional summation in SQL. It’s particularly useful when you want to aggregate values based on specific conditions. Consider it the equivalent of “pivoting” data - combining multiple rows by category and an aggregate function.

Syntax:

SUM(CASE WHEN condition THEN value ELSE alternative END)

In this syntax:

  • condition: The criteria that must be met for the THEN clause to execute.
  • value: The value to sum if the condition is met.
  • THEN: Specifies the result if the condition evaluates to true.
  • ELSE alternative: If the condition is not met, this value is used. Often set to 0 so it doesn’t affect the sum.
  • END: Concludes the CASE expression.

Example:

SELECT
  SUM(CASE WHEN column_name > 100 THEN column_name ELSE 0 END) AS conditional_sum
FROM
  table_name;

In this example, we’re summing all the values in column_name where each value is greater than 100. If a value is 100 or less, it does not contribute to the sum.

Benefits of using SUM with CASE-WHEN in SQL

Now for the above illustration, the same result can be achieved more efficiently by excluding the rows with values 100 or less with the WHERE clause. However, combining SUM with CASE-WHEN is necessary when:

  • Weighted calculations: You want to assign different weightage to the values in different rows, according to multiple conditions.
  • Conditional aggregation: Sometimes, you need to aggregate data conditionally in different ways within the same column. The CASE WHEN inside SUM allows you to differentiate the sums based on varying conditions. It enables you to avoid multiple queries or subqueries and allows for easier readability and maintenance of reporting.
  • Preserving row count: Using a WHERE clause will change the count of the rows returned. With SUM and CASE WHEN, you can maintain the integrity of the row count, which is particularly useful in reports where you want to show both the total and the conditional sums.
  • Aggregating partitioned data: When used with GROUP BY or window functions, SUM, and CASE WHEN can produce aggregates over partitions of the data, which isn’t possible with WHERE alone when multiple clauses need to be satisfied.

SUM() With CASE in SQL: Use Cases

Here are some real-life scenarios where these two functions should be nested:

  • Segmenting sales: Say a retail company wants a break-up of revenue according to multiple product statuses. If products have various status types like ‘active’, ‘discontinued’, and ‘back order’, you can aggregate revenue using the SUM and CASE statements.
  • Time-based conditional summation: Let us assume you work with a marketing agency, and they want to compare the impact of several campaigns run on sales. You can calculate the total sales per campaign by applying the conditional logic based on the campaign dates.
  • Tracking performance metrics: In a table tracking the performance of sales representatives, you might want to calculate the total number of sales above a certain amount, which qualifies for a bonus.
  • Customer segmentation: An e-commerce firm may want to find the total number of customers based on their spending, categorizing them as ‘low’, ‘medium’, and ‘high’ spenders to personalize campaigns targeting each group.

SUM() With CASE in SQL: Common Examples

Let us explore some real-world examples on this topic with written code.

1. You’re working with an e-commerce company. You want to calculate the total sales amount for two categories of items: ‘Electronics’ and ‘Clothing’, as well as a total for all other categories. Use SUM with CASE WHEN to solve the problem.

Let’s say your sales table looks like this:

item_id category amount
1 Electronics 200.00
2 Electronics 450.00
3 Clothing 50.00
4 Clothing 75.00
5 Books 20.00
6 Groceries 30.00

This is the code to pivot the data by sales category:

SELECT
  SUM(CASE WHEN category = 'Electronics' THEN amount ELSE 0 END) AS ElectronicsSales,
  SUM(CASE WHEN category = 'Clothing' THEN amount ELSE 0 END) AS ClothingSales,
  SUM(CASE WHEN category NOT IN ('Electronics', 'Clothing') THEN amount ELSE 0 END) AS OtherSales
FROM
  sales;

Here is the expected output:

ElectronicsSales ClothingSales OtherSales
650.00 125.00 50.00

You can practice a similar interview problem here: Monthly Product Sales

2. Let’s look at a slightly more complex problem: SUM with CASE WHEN and the GROUP BY clause. Suppose we have a table of transactions with each transaction belonging to different departments within a company. We want to calculate the total spend for ‘IT’, ‘HR’, and ‘Marketing’, and also have a total for ‘Other’ departments, grouped by quarters.

Let us assume that you are supplied with the following transactions table.

transaction_id department amount transaction_date
1 IT 1000.00 2023-01-15
2 IT 1500.00 2023-02-17
3 HR 750.00 2023-04-25
4 Marketing 1250.00 2023-05-11
5 Marketing 1350.00 2023-07-19
6 Finance 2000.00 2023-08-21
7 HR 800.00 2023-10-05
8 IT 1100.00 2023-11-12

We will group by the quarters of the transaction dates. The quarters are:

  • Q1: January - March
  • Q2: April - June
  • Q3: July - September
  • Q4: October - December

Here is the code to achieve this transformation.

SELECT
  CASE 
    WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 1 AND 3 THEN 'Q1'
    WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 4 AND 6 THEN 'Q2'
    WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 7 AND 9 THEN 'Q3'
    WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 10 AND 12 THEN 'Q4'
  END AS Quarter,
  SUM(CASE WHEN department = 'IT' THEN amount ELSE 0 END) AS ITSales,
  SUM(CASE WHEN department = 'HR' THEN amount ELSE 0 END) AS HRSales,
  SUM(CASE WHEN department = 'Marketing' THEN amount ELSE 0 END) AS MarketingSales,
  SUM(CASE WHEN department NOT IN ('IT', 'HR', 'Marketing') THEN amount ELSE 0 END) AS OtherSales
FROM
  transactions
GROUP BY
  Quarter;

The above query uses the CASE WHEN statement in two ways; one, simply to categorize the transaction dates by quarters, and the other way is by nesting the sum and case statement to aggregate the spend by department. The output from the above query would look like this:

Quarter ITSales HRSales MarketingSales OtherSales
Q1 2500.00 0.00 0.00 0.00
Q2 0.00 750.00 1250.00 0.00
Q3 0.00 0.00 1350.00 2000.00
Q4 1100.00 800.00 0.00 0.00

This table shows the total sales amount for each department, grouped by the fiscal quarter. The ‘OtherSales’ column includes sales from departments not listed specifically (like ‘Finance’ in the transactions table).

3. HR Analytics Case Study: In this example, we’ll look at SUM with CASE WHEN in addition to an arithmetic operation. Let’s consider a scenario where we have an employees table, and we want to calculate the total salary expenditure for different job titles while also considering overtime. We want to sum up the regular salaries and the overtime payments separately and also include a total sum for each job title.

The following employees table is provided to you.

employee_id job_title salary overtime_hours overtime_rate
1 Software Developer 7000.00 5 50.00
2 Software Developer 7200.00 3 50.00
3 Graphic Designer 5000.00 10 30.00
4 Sales Associate 4500.00 0 0.00
5 Sales Associate 4600.00 2 45.00
6 Graphic Designer 5100.00 8 30.00
7 Human Resources 4500.00 0 0.00

While calculating the total compensation per job role, we’ll include the CASE WHEN statement to apply conditional logic to distinguish between job types and compute remuneration accordingly. Here is the code to get the desired result:

SELECT
  job_title,
  SUM(salary) AS TotalSalaries,
  SUM(overtime_hours * overtime_rate) AS TotalOvertimePayments,
  SUM(
    salary + 
    (overtime_hours * overtime_rate) + 
    (CASE 
      WHEN job_title = 'Software Developer' THEN 500 
      WHEN job_title = 'Graphic Designer' THEN 300 
      ELSE 200 
     END)
  ) AS TotalCompensation
FROM
  employees
GROUP BY
  job_title;

The output from the above query will now include the bonus in the total compensation:

job_title TotalSalaries TotalOvertimePayments TotalCompensation
Software Developer 14200.00 400.00 15100.00
Graphic Designer 10100.00 540.00 10940.00
Sales Associate 9100.00 90.00 9390.00
Human Resources 4500.00 0.00 4700.00

The CASE WHEN statement in the query adds a conditional bonus to total compensation by job type based on their job title.

4. We’ll look at one last example - the SUM with CASE WHEN statement when used with a logical operator. An electronics retail company tracks its sales and wants to run a special performance analysis.

They aim to categorize sales into different bonus eligibility groups based on the amount of sales and the region where the sale occurred. The conditions for categorization are:

  • Premium Sales: Any sales that are over $2000, or any sales in the “East” region regardless of the amount, are considered premium. These sales receive a higher commission.
  • Standard Sales: Any sales between $1000 and $2000, except those in the “East” region, are considered standard. They receive a standard commission.
  • Promotional Sales: Sales made in the “West” region or during the promotional period (June to July), regardless of the amount, are considered promotional and are eligible for special discounts or offers in future transactions.

The company wants a report that sums these amounts by region to help with financial planning and bonus allocations.

Here is the sales table:

sale_id employee_id product_id sale_amount sale_date region
1 101 30 1200.00 2023-06-15 East
2 102 22 800.00 2023-06-17 West
3 103 11 2300.00 2023-06-25 East
4 101 45 400.00 2023-07-03 North
5 102 30 1800.00 2023-07-19 East
6 104 25 1500.00 2023-07-21 South
7 103 22 900.00 2023-08-05 West
8 101 11 1600.00 2023-08-12 East

And here is the code to get the required output.

SELECT
  region,
  SUM(sale_amount) AS TotalSales,
  SUM(
    CASE 
      WHEN sale_amount > 2000 OR region = 'East' THEN sale_amount
      ELSE 0 
    END
  ) AS PremiumSales,
  SUM(
    CASE 
      WHEN sale_amount BETWEEN 1000 AND 2000 AND region != 'East' THEN sale_amount
      ELSE 0 
    END
  ) AS StandardSales,
  SUM(
    CASE 
      WHEN region = 'West' OR (sale_date BETWEEN '2023-06-01' AND '2023-07-31') THEN sale_amount
      ELSE 0 
    END
  ) AS PromotionalSales
FROM
  sales
GROUP BY
  region;

The output would group sales according to the company’s specifications:

region TotalSales PremiumSales StandardSales PromotionalSales
East 6000.00 6000.00 0.00 2000.00
West 1700.00 0.00 0.00 1700.00
North 400.00 0.00 0.00 400.00
South 1500.00 1500.00 0.00 0.00

Likewise, you can combine the SUM and CASE statements with other logical operators such as BETWEEN and AND in a similar way to incorporate conditional logic in similar scenarios.

SUM with CASE WHEN in SQL: Syntax Across RDBMSs

The SUM with CASE WHEN construct is quite standard in SQL and is typically supported across most SQL databases with no variation in syntax.

However, in MySQL, you have the option to use the IF() function. It works similarly to programming languages’ IF-ELSE statements and can only be used in SELECT statements, WHERE clauses, and other places where you would use a MySQL expression.

Here’s how you might nest an IF() function with SUM() to perform conditional summation.

SELECT 
  SUM(IF(category IN ('A', 'B'), amount, 0)) AS TotalSales
FROM sales;

While the IF() function has an easier syntax to follow, it is important to remember CASE is part of the SQL standard and is supported by virtually all SQL-compliant RDBMS, whereas IF() is specific to MySQL. If you want to ensure that your skills are transferable, you should prefer CASE. Additionally, CASE can handle multiple conditions in a sequence making it more versatile. So, it is generally recommended to use CASE-WHEN over IF-ELSE even when querying in MySQL.

Tips For Using SUM with CASE WHEN

It is important to remember the following caveats to make your code error-free and efficient:

  • Ordering conditional logic: The ordering of conditions in a CASE expression is significant, as the conditions are evaluated in the order they are written. Once a condition is met, the CASE expression will return the corresponding result and not evaluate any subsequent conditions. This downside is important to bear in mind to ensure that your query produces the correct result.
  • Handling NULLs: The CASE expression within an aggregate function like SUM must be carefully written to handle NULL values appropriately, as SUM disregards rows with NULLs.
  • Efficiency: The CASE statement is an additional operation on each row, which can introduce overhead, especially for a large dataset. Use a WHERE clause to limit the rows whenever possible before implementing CASE logic. This can help the query planner to use an index to filter rows first, thus reducing the number of rows over which the CASE needs to be evaluated.

Conclusion

We hope that this article has helped you understand the different ways to combine the SUM and CASE-WHEN functions to perform powerful transformations in SQL.

For an end-to-end overview of SQL, including complex joins, advanced reporting, and creating multi-table databases, you can explore our specialized learning path.

Related articles:

More SQL guides: