The SQL ** COUNT (CASE WHEN...)** statement is a powerful tool in SQL programming that allows you to perform counts based on specified conditions. By leveraging conditional counting, you can efficiently retrieve, analyze, and aggregate data with a single query. In this article, we will explore syntax, usage, real-world examples, and advanced techniques of SQL’s

`COUNT (CASE WHEN...)`

Before we get started with leveraging the power of ** COUNT (CASE WHEN...)**, let’s first review

`COUNT`

`CASE WHEN`

In SQL, ** COUNT(COLUMN)** is an aggregation function used to calculate all non-null) instances within a specified column. A more inclusive form called

`COUNT(*)`

In SQL, the ** CASE WHEN** statement is a conditional expression that allows you to perform logical evaluations based on specified conditions and return relevant values. This is useful in conducting conditional transformations or aggregations of data.

** COUNT (CASE WHEN...)** is one such “conditional aggregation” we can perform with those simpler functions. Let’s explore how we are able to use it in SQL queries.

** COUNT (CASE WHEN...)** is known as a

`CASE WHEN`

`COUNT`

To understand the usage of SQL’s ** COUNT (CASE WHEN...)** in practical scenarios, let’s take a look at the statement in action. Consider the following code snippet:

```
SELECT
COUNT (
CASE WHEN order_type = 'purchase' then 1
ELSE 0
END
) FROM ORDERS;
```

In this query, we utilize the ** COUNT (CASE WHEN...)** statement to count the number of purchases within the “orders” table. From how we coded the query, we only count a row if it has the order_type of ‘purchase’. This can be expressed by tagging the result as 1. On the other hand, if it is not of type purchase, we can tag it as 0.

Let’s consider the following queries. Which of the following is logically equivalent to the query above? (Focus specifically on: `COUNT(CASE WHEN order_type = 'purchase' then 1 END)`

).

**A:**`COUNT(CASE WHEN order_type = 'purchase' then 0 END)`

**B:**`COUNT(CASE WHEN order_type = 'purchase' then 2 END)`

**C:**`COUNT(CASE WHEN order_type = 'purchase' then NULL END)`

**D:**`COUNT(CASE WHEN order_type = 'purchase' then 1 ELSE 0 END)`

In SQL, the COUNT() function only considers non-null values. The case statement inside the COUNT() function generates a value for each row in the dataset. The COUNT() function then counts the number of rows that do not have a NULL value generated by the case statement.

Given this understanding, the logically equivalent options to ** COUNT(CASE WHEN order_type = 'purchase' then 1 END)** are:

**A:**`COUNT(CASE WHEN order_type = 'purchase' then 0 END)`

**B:**`COUNT(CASE WHEN order_type = 'purchase' then 2 END)`

These queries are equivalent because they also produce non-null values for each row where ** order_type = 'purchase'**, and hence will be counted by the COUNT() function.

The options that are not logically equivalent are:

**C:**`COUNT(CASE WHEN order_type = 'purchase' then NULL END)`

This query is not equivalent because it produces NULL when

, and hence it will not be counted by the COUNT() function.`order_type = 'purchase'`

**D:**`COUNT(CASE WHEN order_type = 'purchase' then 1 ELSE 0 END)`

This query is not equivalent because it produces a non-null value for every row in the dataset, not just those where

, and hence all rows will be counted by the COUNT() function, changing the result from the original query.`order_type = 'purchase'`

Real-world examples are invaluable in helping us understand and apply concepts in practical ways. You may take different approaches with the ** COUNT (CASE WHEN...)** when designing for unique business solutions and business logic, so familiarizing yourself now with patterns of use can make your work go by much quicker in the future.

In your role on the logistics team at Happy Farms Inc., you are responsible for ensuring smooth operations and meeting customer demands. One of our esteemed buyers has expressed interest in purchasing a significant quantity of sheep products. You must ensure that we have sufficient stock to fulfill their order.

Let’s explore different ways to reach the solution:

One of the most basic and commonly used problem scenarios when using ** COUNT (CASE WHEN...)** is to count the number of instances of a classification.

To determine the availability of sheep products, you can swiftly retrieve the information from our comprehensive database through a simple query:

```
SELECT
COUNT(CASE WHEN product = 'sheep' THEN 1)
FROM
products;
```

The use of ** COUNT (CASE WHEN...)** as a part of our query logic might not be needed to perform such a simple task however. It might instead be more efficient, and easier to code, to use the

`WHERE`

```
SELECT
COUNT(*) AS sheep_product_stock
FROM
products
WHERE
product = 'sheep';
```

As things become more complex, there are instances where the use of ** COUNT (CASE WHEN...)** is not only beneficial but also necessary. Let’s see an iterated version of our sheep storage question for these use cases.

When you need to count based on a single classification, the ** WHERE** approach can be effective. However, real-world business challenges often necessitate more intricate queries, where counting based on multiple classifications becomes necessary. In such cases, relying solely on the

`WHERE`

`COUNT (CASE WHEN...)`

Happy Farms has received a new request. One of our esteemed buyers wishes to purchase various products, not just sheep. It is crucial to verify our stock availability to fulfill their order promptly. We will use ** COUNT (CASE WHEN...)** to find the status of all categories of livestock.

```
SELECT
COUNT(CASE WHEN product = 'sheep' AND status = 'pending' THEN 1 END)
AS sheep_pending,
COUNT(CASE WHEN product = 'sheep' AND status = 'shipped' THEN 1 END)
AS sheep_shipped,
COUNT(CASE WHEN product = 'sheep' AND status = 'cancelled' THEN 1 END)
AS sheep_cancelled,
COUNT(CASE WHEN product = 'cow' AND status = 'pending' THEN 1 END)
AS cow_pending,
COUNT(CASE WHEN product = 'cow' AND status = 'shipped' THEN 1 END)
AS cow_shipped,
COUNT(CASE WHEN product = 'cow' AND status = 'cancelled' THEN 1 END)
AS cow_cancelled,
COUNT(CASE WHEN product = 'goat' AND status = 'pending' THEN 1 END)
AS goat_pending,
COUNT(CASE WHEN product = 'goat' AND status = 'shipped' THEN 1 END)
AS goat_shipped,
COUNT(CASE WHEN product = 'goat' AND status = 'cancelled' THEN 1 END)
AS goat_cancelled
FROM orders;
```

This query looks ugly, and there are certainly ways to make it more efficient by utilizing the GROUP BY clause. Let’s take a look at that now.

By applying the GROUP BY clause, we can streamline the query and enhance its performance. The grouping operation allows us to aggregate the counts based on common attributes, resulting in a cleaner and more concise query structure.

Let’s take a look at an improved version of the query using the GROUP BY clause:

```
SELECT
product,
COUNT(CASE WHEN status = 'pending' THEN 1 END) AS count_pending,
COUNT(CASE WHEN status = 'shipped' THEN 1 END) AS count_shipped,
COUNT(CASE WHEN status = 'cancelled' THEN 1 END) AS count_cancelled
FROM orders
GROUP BY product;
```

That’s a significant improvement! Because COUNT is an aggregate function we can use GROUP BY to generate more detailed and informative result sets.

The GROUP BY clause allows us to organize the data based on specific attributes, such as ‘product’ in this case, creating distinct groups within the dataset. Then, with COUNT, we can perform calculations within each group, such as counting the occurrences of different status categories like ‘pending,’ ‘shipped,’ and ‘cancelled.’

This combination of functions enables us to obtain a more granular view of our data, revealing the distribution and quantities of orders across different product categories.

The ** COUNT(CASE WHEN...)** statement may appear simple at first glance, but it offers immense power when used strategically. By employing advanced techniques, you can harness its full potential, especially when tackling interview questions. In this section, we will delve into effective strategies for utilizing

`COUNT(CASE WHEN...)`

Consider a different approach to counting values using the CASE WHEN statement. Instead of counting non-null values, we can count instances of `NULL`

. This technique can be useful in certain scenarios. Let’s explore it through the following interview question:

The support team at your company is trying to analyze ticketing data to improve response times. They have noticed that a significant number of tickets have no assigned agent, making it difficult to track the progress of these tickets.

Write a SQL query to count the total number of tickets, and the number of tickets that are either assigned or unassigned to agents.

Note: Tickets that are not assigned to any agent have ** NULL** in their

`agent_id`

Example:

**Input**:

** tickets** table

** tickets** table

Column | Type |
---|---|

id | INTEGER |

issue | VARCHAR |

agent_id | INTEGER |

created_at | TIMESTAMP |

updated_at | TIMESTAMP |

**Output:**

Column | Type |
---|---|

total_tickets | INTEGER |

tickets_with_agent | INTEGER |

tickets_without_agent | INTEGER |

```
SELECT
COUNT(*) AS total_tickets,
COUNT(agent_id) AS tickets_with_agent,
COUNT(CASE WHEN agent_id IS NULL 1) AS tickets_without_agent
FROM
tickets;
```

Answering this question is relatively easy, and there are several ways to approach it. Nonetheless, one of the simplest techniques to tackle this question is by utilizing the ** COUNT(CASE WHEN...)** method. By employing count CASE WHEN, we can reverse the usual behavior of the count function and instead count the number of NULL values using the conditional

`CASE WHEN agent_id IS NULL 1`

.In some scenarios, you may need to nest ** CASE WHEN** statements within

`COUNT`

Let’s explore this through an example:

You are a data analyst at a videogame production company and you’ve been asked to analyze the player’s behaviors. Specifically, they want to know the count of players who have played a number of games that is less than 10 but more than 5, as well as the count of players who have played 10 or more games.

The ** players** table is as follows:

Column | Type |
---|---|

id | INTEGER |

name | VARCHAR |

games_played | INTEGER |

registered_at | TIMESTAMP |

Write a SQL query to obtain the needed information.

**Output:**

```
SELECT
COUNT(CASE WHEN games_played > 5 AND games_played < 10 THEN 1 END) AS players_more_than_5_to_10_games,
COUNT(CASE WHEN games_played >= 10 THEN 1 END) AS players_10_plus_games
FROM
players;
```

In this query, we use nested ** CASE WHEN** conditions inside the

`COUNT`

`COUNT`

Having learned the basics of ** COUNT (CASE WHEN...)** it’s now time to test yourself with an interview question. The following question has many approaches, but try to solve it with

`COUNT (CASE WHEN...)`

Given an `employees`

and `departments`

table, select the top 3 departments with at least 10 employees and rank them according to the percentage of their employees making over $100,000 in salary.

**Example:**

**Input:**

`employees`

table

Columns | Type |
---|---|

id | INTEGER |

first_name | VARCHAR |

last_name | VARCHAR |

salary | INTEGER |

department_id | INTEGER |

`departments`

table

Columns | Type |
---|---|

id | INTEGER |

name | VARCHAR |

**Output:**

Column | Type |
---|---|

percentage_over_100k | FLOAT |

department_name | VARCHAR |

number_of_employees | INTEGER |