If you’ve ever needed to SQL combine multiple rows into one for reporting or data cleanup, you know how tricky it can get. Joins, aggregations, and deduplication often cause silent errors that break dashboards, mislead stakeholders, or slow production pipelines. String aggregation, duplicate joins, and performance issues are the most common culprits.
In this guide, we’ll show you three practical techniques to solve these problems: how to aggregate strings cleanly, deduplicate after joins, and tune queries for large datasets. Each method comes with a step-by-step explanation, copy-ready SQL, and an interview-style example to sharpen your skills.
For deeper practice, the Interview Query dashboard lets you test these queries on realistic datasets, so you can see how your solutions perform under real conditions.
Combining rows in SQL is not just a formatting trick. It is a core skill for analysts and engineers working with fragmented or relational data. Anytime you query user behavior, transactions, or event logs, you will eventually need to SQL group multiple rows into one or SQL aggregate rows into one record. The goal is always the same: turn scattered details into a clean, usable representation of the entity you care about.
There are three common situations where this comes up:
Many-to-one relationships
A single user may have dozens of actions, purchases, or tags. Leaving those as separate rows makes downstream analysis messy. Instead, you group them into one row per user, often with string or JSON aggregation.
Reporting and dashboards
For business reporting, clarity is non-negotiable. Dashboards, CSV exports, or automated emails should not display every raw event. Each customer, session, or product needs to be summarized into a single row so the report remains readable and actionable.
Interview and technical challenges
SQL interviews frequently test this exact skill. Candidates are asked to “merge” results, like consolidating purchase history, event logs, or review comments into one row per user or item. These problems are designed to check whether you understand how to reduce noisy data into clear summaries.
Ultimately, combining rows is about more than just collapsing data. It is about structuring your query so that the results mirror how the business or system actually thinks about entities.
When you need to SQL concatenate rows into a single row, built-in string aggregation functions like STRING_AGG(), GROUP_CONCAT(), and LISTAGG() are your go-to tools. They let you flatten multiple rows into a single string, grouped by a key column, which is perfect for reporting or exporting clean summaries.
Here’s how the major databases handle it:
- STRING_AGG() – PostgreSQL, SQL Server
- GROUP_CONCAT() – MySQL
- LISTAGG() – Oracle (often searched as listagg SQL)

Suppose you have a comments table like this:
| user_id | comment |
| --- | --- |
| 1 | Great article! |
| 1 | Very informative. |
| 2 | Loved this post. |
| 2 | Helpful resource. |
You can combine each user’s comments into one string per user:
PostgreSQL / SQL Server STRING_AGG():
SELECT
user_id,
STRING_AGG(comment, '; ') AS combined_comments
FROM comments
GROUP BY user_id;
MySQL GROUP_CONCAT():
SELECT
user_id,
GROUP_CONCAT(comment SEPARATOR '; ') AS combined_comments
FROM comments
GROUP BY user_id;
Oracle LISTAGG():
SELECT
user_id,
LISTAGG(comment, '; ') WITHIN GROUP (ORDER BY comment) AS combined_comments
FROM comments
GROUP BY user_id;
This technique is essential when you need to present multiple related values in a compact format. It’s especially helpful in dashboards, user summaries, and exports where rows must be grouped and values combined.
Keep in mind that long aggregations can impact performance, especially if you’re working with thousands of records or joining across large tables.
If you need to turn multiple SQL rows into one row with different values in separate columns, conditional aggregation is the go-to solution. This approach uses the MAX(CASE WHEN … THEN … END) pattern to pivot rows based on specific conditions.
This is especially useful in structured datasets—like schedules, survey responses, or user status logs—where each row represents one piece of a larger unit. Instead of stacking values vertically, this method lets you spread them horizontally across columns.
Take this employee_shifts table as an example:
| employee_id | shift_day | shift_time |
| --- | --- | --- |
| 101 | Monday | Morning |
| 101 | Tuesday | Evening |
| 102 | Monday | Afternoon |
| 102 | Tuesday | Morning |
To display each employee’s shifts by day on one row, use:
SELECT
employee_id,
MAX(CASE WHEN shift_day = 'Monday' THEN shift_time END) AS monday_shift,
MAX(CASE WHEN shift_day = 'Tuesday' THEN shift_time END) AS tuesday_shift
FROM employee_shifts
GROUP BY employee_id;
Result:
| employee_id | monday_shift | tuesday_shift |
| --- | --- | --- |
| 101 | Morning | Evening |
| 102 | Afternoon | Morning |
This pattern is not just about formatting. It enables you to SQL combine two rows into one based on a condition, making insights easier to extract. It’s particularly helpful in interview questions where you need to pivot transactional or time-based data into a flat, comparison-friendly format.
Just be aware: if you’re working with many dynamic values (e.g., 30 days of data), you’ll need dynamic SQL or a lot of repetitive columns, so it scales best when your conditions are known and fixed.
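If you do need a dynamic version, here is a minimal SQL Server sketch that builds the pivot column list at runtime from the employee_shifts table above. Treat it as a starting point rather than a production pattern, and note that STRING_AGG() requires SQL Server 2017 or later:
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build a bracketed, comma-separated column list from the distinct shift_day values
SELECT @cols = STRING_AGG(QUOTENAME(shift_day), ', ')
FROM (SELECT DISTINCT shift_day FROM employee_shifts) AS d;

-- Assemble the pivot query with the generated columns and run it
SET @sql = N'SELECT employee_id, ' + @cols + N'
FROM employee_shifts
PIVOT (MAX(shift_time) FOR shift_day IN (' + @cols + N')) AS p;';

EXEC sp_executesql @sql;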
Sometimes you will not have access to modern string aggregation functions like STRING_AGG() (SQL Server 2017+) or GROUP_CONCAT() (MySQL), or you may need more control over the output. In these cases, XML and JSON functions offer powerful workarounds to SQL combine rows into one string efficiently.
In SQL Server, a classic trick is using FOR XML PATH('') with STUFF(). By concatenating values into an XML string and then stripping out the XML tags, you can build a clean, comma-separated list. This method predates STRING_AGG() and is still useful when working in older environments:
Let’s say you have a messages table like this:
| user_id | message |
| --- | --- |
| 1 | Welcome back! |
| 1 | Your order has shipped. |
| 2 | New login from device. |
| 2 | Password changed successfully. |
SQL Server – FOR XML PATH():
SELECT
user_id,
STUFF((
SELECT ', ' + m2.message
FROM messages m2
WHERE m2.user_id = m1.user_id
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') AS combined_messages
FROM messages m1
GROUP BY user_id;
This method mimics the effect of STRING_AGG() and works on older SQL Server versions. It’s a solid fallback when STRING_AGG() isn’t supported.
PostgreSQL – JSON_AGG() or array_agg():
SELECT
user_id,
array_to_string(array_agg(message), '; ') AS combined_messages
FROM messages
GROUP BY user_id;
Alternatively, if you want to store the messages in JSON format:
SELECT
user_id,
JSON_AGG(message) AS message_array
FROM messages
GROUP BY user_id;
These methods provide more structured output than plain strings, which is ideal for applications or front-end consumption. But for simple flat strings, native functions are often faster and easier to write.
Using XML/JSON techniques to SQL Server concatenate rows gives you more control over formatting and encoding, but it may be overkill for basic aggregation needs.
Tip: For best performance, ensure you’re filtering early and using indexes on your grouping and joining keys. XML and JSON-based operations can be more compute-intensive than native string functions.
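As a small illustration, an index on the grouping/join key (the index name here is just illustrative) can spare the correlated subquery and the aggregation from repeated full-table scans:
-- Index the key used for grouping and joining so lookups do not scan the whole table
CREATE INDEX idx_messages_user_id ON messages (user_id);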
One of the most common sources of unexpected duplicates in SQL is a LEFT JOIN where the right table has multiple matches per key. If the right-side table returns more than one matching row per key, your result set can produce duplicates or inflate aggregates.
You can read more about aggregate functions in the Interview Query article on the topic, which includes examples and best practices.
Let’s start with an example.
users

| user_id | name |
| --- | --- |
| 1 | Alice |
| 2 | Bob |

logins

| user_id | login_time |
| --- | --- |
| 1 | 2023-07-01 |
| 1 | 2023-07-05 |
| 2 | 2023-07-02 |
If you run a simple LEFT JOIN:
SELECT u.user_id, u.name, l.login_time
FROM users u
LEFT JOIN logins l
ON u.user_id = l.user_id;
You’ll get:
| user_id | name | login_time |
| --- | --- | --- |
| 1 | Alice | 2023-07-01 |
| 1 | Alice | 2023-07-05 |
| 2 | Bob | 2023-07-02 |
Notice that Alice appears twice. This is a classic left join multiple matches issue.
You can review the LEFT JOIN function on Interview Query, with detailed explanations and real interview examples.
If you only need one login time per user (e.g., the most recent), use a subquery:
SELECT u.user_id, u.name, l.latest_login
FROM users u
LEFT JOIN (
SELECT user_id, MAX(login_time) AS latest_login
FROM logins
GROUP BY user_id
) l ON u.user_id = l.user_id;
ROW_NUMBER() Filter (Postgres / SQL Server)
If you want more control (e.g., not just the max), use a CTE with ROW_NUMBER():
WITH ranked_logins AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_time DESC) AS rn
FROM logins
)
SELECT u.user_id, u.name, l.login_time
FROM users u
LEFT JOIN ranked_logins l
ON u.user_id = l.user_id AND l.rn = 1;
Both methods solve the problem of a LEFT JOIN where the right table has multiple matches, by reducing or filtering the right-side rows before the join. Choose the one that fits your use case: an aggregated summary vs. specific row logic.
Let’s walk through a real-world interview-style SQL question that requires combining multiple rows into one. This type of task is common in product, marketing, and analyst interviews, where clean and aggregated reporting is essential.
If you want to simulate these challenges in a live setting, try an Interview Query mock interview to practice under real interview conditions. You will get feedback from professionals to improve your skills.
Prompt:
Given a sales table and a products table, write a query to show, for each customer, the list of product names they purchased, separated by commas. Return customer_id and a single string of all product names in alphabetical order.
Tables:
sales

| sale_id | customer_id | product_id |
| --- | --- | --- |
| 1 | 101 | 10 |
| 2 | 101 | 12 |
| 3 | 102 | 10 |
| 4 | 102 | 15 |

products

| product_id | product_name |
| --- | --- |
| 10 | Laptop |
| 12 | Mouse |
| 15 | Monitor |
This is a classic SQL query combine rows into one problem: we want one row per customer with a list of product names.
We’ll use a JOIN to pull in product names, and then aggregate those names using a string function.
PostgreSQL:
SELECT
s.customer_id,
STRING_AGG(p.product_name, ', ' ORDER BY p.product_name) AS product_list
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY s.customer_id;
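SQL Server (2017+) also supports STRING_AGG(), but the ordering goes in a WITHIN GROUP clause rather than inside the function's argument list:
SELECT
s.customer_id,
STRING_AGG(p.product_name, ', ') WITHIN GROUP (ORDER BY p.product_name) AS product_list
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY s.customer_id;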
MySQL:
SELECT
s.customer_id,
GROUP_CONCAT(p.product_name ORDER BY p.product_name SEPARATOR ', ') AS product_list
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY s.customer_id;
Result:
| customer_id | product_list |
| --- | --- |
| 101 | Laptop, Mouse |
| 102 | Laptop, Monitor |
We perform a JOIN to bring in product names, then use STRING_AGG() or GROUP_CONCAT() to combine those names into a single row per customer. This technique is often referred to as a SQL join concatenate multiple rows pattern, where you enrich your dataset before aggregating it.
In interviews, it’s important to explain your reasoning: you’re grouping by the customer, ordering the product names for clean output, and choosing the right aggregation function for the database. Bonus points if you mention edge cases, like how to handle duplicate products or customers with no purchases.
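For example, here is one way to cover both edge cases in PostgreSQL, assuming a separate customers table listing every customer (it is not part of the prompt above): DISTINCT drops duplicate product names, and starting from customers with LEFT JOINs keeps customers who never purchased, whose product_list simply comes back as NULL.
SELECT
c.customer_id,
STRING_AGG(DISTINCT p.product_name, ', ' ORDER BY p.product_name) AS product_list
FROM customers c  -- hypothetical lookup table of all customers
LEFT JOIN sales s ON c.customer_id = s.customer_id
LEFT JOIN products p ON s.product_id = p.product_id
GROUP BY c.customer_id;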
This type of transformation is foundational in real-world analytics, whether you’re summarizing user activity, product tags, or campaign channels.
Here are 6 targeted SQL practice questions that cover real-world scenarios like aggregation, pivoting, and handling joins to sharpen your interview prep.
You can practice many more of these questions on Interview Query, where each problem comes from real interview experiences and is tied to company-specific datasets.
1. Calculate the number of responses grouped by SMS type
Use a GROUP BY clause with a COUNT() to summarize the number of responses per SMS type. You can use filters to isolate specific types or time ranges. This is a great example of how to get multiple rows of data in a single row in SQL using aggregation. It simulates real-world marketing datasets and user engagement summaries.
2. Find the top three age groups by decade based on CTR
Use conditional aggregation to group age ranges into buckets and compute the average click-through rate per group. Then apply ORDER BY and LIMIT to get the top three. This is an excellent exercise in SQL group multiple rows into one by computed ranges. It mirrors audience segmentation tasks in digital analytics (a minimal sketch follows after this list).
3. Write a query to pivot a table of monthly product sales
Apply the MAX(CASE WHEN …) pattern to convert months into columns. This shows how to pivot multiple rows of sales into a single row per product. It’s a classic SQL multiple rows into one row scenario. This type of transformation is common in time-series and financial dashboards.
4. Find employee names who joined before their managers
Use a SELF JOIN and compare hire dates to find employees hired earlier than their direct managers. While not using aggregation directly, it helps reinforce structured joins before applying them to combine or flatten results. This supports conceptual understanding of joins before tackling left join multiple matches problems. It mimics hierarchy-based reporting challenges.
5. Create a subquery for the top 3 ads by click-through rate within each campaign
Combine ROW_NUMBER() with PARTITION BY to rank and filter the top ads. This problem helps you practice controlling data explosion from joins. It's ideal for practicing the SQL join and concatenate multiple rows pattern with rankings. Great for digital ad performance dashboards or competitive intelligence.
You can review the PARTITION BY function on Interview Query to avoid any common mistakes.
6. Find the average number of accepted friend requests per user
Group by user and apply conditional filters to count only accepted requests, then compute an average. This question reinforces group-based aggregation logic. It’s useful for understanding SQL combine rows into one string or metric for user-level behavioral insights.
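As promised above, here is a minimal sketch for question 2, assuming a hypothetical ad_clicks table with age, clicks, and impressions columns (LIMIT is Postgres/MySQL syntax; SQL Server would use TOP 3 instead):
SELECT
FLOOR(age / 10) * 10 AS age_decade,  -- bucket ages into decades (20, 30, 40, ...)
AVG(clicks * 1.0 / NULLIF(impressions, 0)) AS avg_ctr  -- per-row CTR, averaged per bucket; NULLIF avoids division by zero
FROM ad_clicks
GROUP BY FLOOR(age / 10) * 10
ORDER BY avg_ctr DESC
LIMIT 3;  -- top three age groups by CTR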
Even seasoned SQL users run into challenges when trying to SQL collapse rows into one. These pitfalls are especially relevant for analysts, engineers, and data scientists working with production queries on large datasets. Below are the most common issues you’ll encounter, why they happen, and how to avoid them.
You can practice with real professionals through Interview Query mock interviews to get real-time feedback and be better prepared for any future interviews.
Over-running string length limits
Functions like GROUP_CONCAT() in MySQL, as well as older Oracle and SQL Server concatenation methods, impose default length limits (often 1024 or 4000 characters). Without increasing these settings (e.g., SET group_concat_max_len = 100000; in MySQL), results may be silently truncated, which is dangerous in reporting. Always verify your DBMS's default limits and configure them explicitly when working with long strings or free-text fields.
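In MySQL, for instance, you can raise the session limit before aggregating (the ceiling below is arbitrary):
-- Raise the session limit so GROUP_CONCAT() output is not silently truncated at the 1024-character default
SET SESSION group_concat_max_len = 100000;

SELECT user_id, GROUP_CONCAT(comment SEPARATOR '; ') AS combined_comments
FROM comments
GROUP BY user_id;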
Forgetting to order inside aggregates
Aggregates like STRING_AGG() (SQL Server, Postgres) or LISTAGG() (Oracle) do not guarantee order unless you specify it, via WITHIN GROUP (ORDER BY …) in SQL Server and Oracle or an ORDER BY inside the aggregate in Postgres. Skipping this means you might get inconsistent results across runs, which is unacceptable for reporting or customer-facing exports. Always specify an ordering column (like a timestamp or primary key) to produce deterministic output.
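In Postgres, for example, the ordering goes inside the aggregate call (using the comments table from earlier):
-- Order inside the aggregate so repeated runs return the values in the same sequence
SELECT
user_id,
STRING_AGG(comment, '; ' ORDER BY comment) AS combined_comments
FROM comments
GROUP BY user_id;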
Memory and spill warnings on large datasets
Aggregating large strings, JSON objects, or arrays can trigger memory spills, where the database shifts from in-memory to disk processing. This introduces major slowdowns and higher I/O load. To mitigate, consider batching with window functions, pre-aggregating in CTEs, or breaking queries into smaller materialized steps. Monitoring execution plans will help you detect when spills occur.
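As a sketch of the pre-aggregation idea, reusing the users and messages tables from earlier, aggregate first so the join only ever sees one row per user:
-- Aggregate first, then join: the heavy string work happens once per user, before the join
WITH per_user AS (
SELECT user_id, STRING_AGG(message, '; ') AS combined_messages
FROM messages
GROUP BY user_id
)
SELECT u.user_id, u.name, p.combined_messages
FROM users u
LEFT JOIN per_user p ON u.user_id = p.user_id;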
Null handling and delimiters
A subtle but frequent bug is forgetting how NULL values interact with string aggregation. In some systems, NULL is ignored; in others, it propagates and nullifies the result. Similarly, a missing or misplaced delimiter can break downstream parsing. Always test with NULL inputs and use functions like COALESCE() to enforce predictable results.
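A small example with the comments table from earlier (Postgres syntax; the placeholder text is arbitrary):
-- COALESCE() turns NULL comments into a visible placeholder instead of silently dropping or nullifying them
SELECT
user_id,
STRING_AGG(COALESCE(comment, '(no comment)'), '; ') AS combined_comments
FROM comments
GROUP BY user_id;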
Use STRING_AGG(), GROUP_CONCAT(), or LISTAGG() when you need to flatten values into a delimited list for reporting or exports. Reach for conditional aggregation with MAX(CASE WHEN …) when pivoting categorical data into columns. For complex formatting or nested output, XML and JSON functions offer flexible ways to reshape structured data.
And when you are ready to push further and test yourself under real conditions, Interview Query gives you access to mock SQL interviews and personalized coaching. It is the fastest way to see how your queries hold up when the clock is ticking, and to get feedback that turns theory into confidence.