PostgreSQL: Group By CUBE
In PostgreSQL, the CUBE clause is a subclause of the GROUP BY clause. It generates multiple grouping sets that include all possible combinations of columns.
With the GROUPING SETS, you can define a set of columns that you want to group in a query. The CUBE clause performs the same operation as GROUPING SETS for all possible combinations of the specified columns.
SELECT
<column1>,
<column2>
FROM <table_name>
GROUP BY
CUBE(<column_list>)
[ORDER BY <column_list<];
The CUBE subclause is a short way to generate grouping sets for all the possible combinations of the specified columns. For example, the following CUBE and GROUPING SETS are equal.
-- the followings are equal
CUBE(c1, c2, c3)
GROUPING SETS (
(c1, c2, c3),
(c1,c2),
(c2,c3),
(c1,c3),
(c1),
(c2),
(c3),
( )
)
Let's understand CUBE better with some examples using the following employee table.
First, let's see the result of the GROUPING SETS clause. The following query returns dept_id
, gender
, and sum(salary)
group by multiple grouping sets.
SELECT dept_id, gender, SUM(salary) FROM employee
GROUP BY
GROUPING SETS (
(dept_id, gender),
(dept_id),
(gender),
()
);
The following displays the result of the above query in pgAdmin.
In the above result, first, it returns the sum of salaries of all employees (1st row), then it returns the sum of salaries grouped by dept_id
and gender
(rows 2,3,4,5). Then, returns the sum of salaries grouped by dept_id
(rows 6 & 7), and finally the sum of salaries grouped by gender
(rows 8 & 9).
Now, the same result can be achieved using CUBE in the shortest way, as shown below.
SELECT dept_id, gender, SUM(salary)
FROM employee
GROUP BY
CUBE(dept_id, gender);
The following is the result of the above query in pgAdmin.
PostgreSQL allows performing partial CUBE operation also to reduce the number of aggregate calculated. For example, if you don’t want the sum of salaries grouped by gender
then use it with the GROUP BY clause but not in the CUBE() clause, as shown below.
SELECT dept_id, gender, SUM(salary)
FROM employee
GROUP BY
dept_id,
CUBE(gender);
In the above result, it doest not calculate salaries for each dept_id
column as it is not included in the CUBE().