PostgreSQL GROUPING SETS
In PostgreSQL, the GROUPING SETS clause is a subclause of the GROUP BY clause.
The rows selected by the FROM and WHERE clauses are grouped separately by each grouping set specified, aggregate functions are computed for each group just as in a simple GROUP BY clause, and then the combined results are returned.
The GROUPING SETS clause returns the same rows as combining multiple GROUP BY queries with the UNION ALL operator.
SELECT
<column1>,
<column2>
FROM
<table_name>
GROUP BY
GROUPING SETS (
(<column1>, <column2>),
(<column1>),
(<column2>),
()
)
[ORDER BY <column_list>];
Let's see how to use the GROUPING SETS clause with an employee table that has dept_id, gender, and salary columns.
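To try the queries in this section, a small employee table is enough. The sketch below is an assumption, not the original data: only the table name and the dept_id, gender, and salary columns come from the queries in this article, while emp_id, first_name, the column types, and every row are hypothetical sample values.

```sql
-- Hypothetical sample table; only dept_id, gender and salary are used below.
CREATE TABLE employee (
    emp_id     integer PRIMARY KEY,
    first_name text    NOT NULL,
    gender     text    NOT NULL,
    dept_id    integer NOT NULL,
    salary     integer NOT NULL
);

-- Made-up rows: two departments, two gender values.
INSERT INTO employee (emp_id, first_name, gender, dept_id, salary) VALUES
    (1, 'Annie',  'F', 1, 50000),
    (2, 'Kapil',  'M', 1, 65000),
    (3, 'Sachin', 'M', 2, 35000),
    (4, 'Tinu',   'F', 2, 45000),
    (5, 'Usha',   'F', 2, 40000);
```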
The following query uses the GROUP BY clause to return the sum of salary in each department. In other words, it defines a single grouping set on the dept_id column.
SELECT dept_id, SUM(salary)
FROM employee
GROUP BY dept_id;
The following query uses the GROUP BY clause to return the sum of salary for each gender. So it defines a grouping set on the gender column.
SELECT gender, SUM(salary)
FROM employee
GROUP BY gender;
The following query uses the GROUP BY clause to return the sum of salary for each combination of department and gender. So it defines a grouping set on the dept_id and gender columns.
SELECT dept_id, gender, SUM(salary)
FROM employee
GROUP BY dept_id, gender;
Finally, a query with no GROUP BY clause returns the sum of salary of all employees across every department and gender. It corresponds to the empty grouping set, which is denoted by ().
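The grand total for the empty grouping set can be sketched as a plain aggregate with no GROUP BY clause, matching the last branch of the UNION ALL query in this article:

```sql
-- One row: the sum of salary over the whole employee table.
SELECT SUM(salary)
FROM employee;
```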
Now, if you want all of the above result sets from a single query, you can combine them with the UNION ALL operator. Because UNION ALL requires every result set to have the same number of columns with compatible data types, you need to add NULL to the select list of each query in place of the columns it does not group by, as shown below.
SELECT dept_id, gender, SUM(salary) FROM employee GROUP BY dept_id, gender
UNION ALL
SELECT dept_id, NULL, SUM(salary) FROM employee GROUP BY dept_id
UNION ALL
SELECT NULL, gender, SUM(salary) FROM employee GROUP BY gender
UNION ALL
SELECT NULL, NULL, SUM(salary) FROM employee;
Although the above query works as expected, it is lengthy, and it can perform poorly because the employee table is read once for each SELECT.
The same result can be achieved with the GROUPING SETS clause, which is a subclause of the GROUP BY clause. The query is much shorter and returns the same rows, although the row order is not guaranteed without an ORDER BY clause.
SELECT dept_id, gender, SUM(salary) FROM employee
GROUP BY
GROUPING SETS (
(dept_id, gender),
(dept_id),
(gender),
()
);
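In the combined result, subtotal rows show NULL in the columns that were not part of their grouping set, which looks the same as a real NULL stored in the data. PostgreSQL's GROUPING() function returns a bit mask that tells which of its arguments were left out of the grouping for a given row. A minimal sketch, assuming the same employee table (the aliases total_salary and grouping_level are made up for illustration):

```sql
-- grouping_level: 0 = (dept_id, gender), 1 = (dept_id) only,
--                 2 = (gender) only,     3 = () grand total.
SELECT dept_id,
       gender,
       SUM(salary) AS total_salary,
       GROUPING(dept_id, gender) AS grouping_level
FROM employee
GROUP BY
    GROUPING SETS (
        (dept_id, gender),
        (dept_id),
        (gender),
        ()
    )
ORDER BY grouping_level, dept_id, gender;
```

Sorting by grouping_level keeps the detail rows, the per-department subtotals, the per-gender subtotals, and the grand total together in that order.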