SQL - GROUP BY Clause
The GROUP BY clause is used to get the summary data based on one or more groups. The groups can be formed on one or more columns. For example, the GROUP BY query will be used to count the number of employees in each department, or to get the department wise total salaries.
You must use the aggregate functions such as COUNT()
, MAX()
, MIN()
, SUM()
, AVG()
, etc., in the SELECT query.
The result of the GROUP BY clause returns a single row for each value of the GROUP BY column.
Syntax:
SELECT column1, column2,...columnN FROM table_name
[WHERE]
[GROUP BY column1, column2...columnN]
[HAVING]
[ORDER BY]
The SELECT clause can include columns that are used with the GROUP BY clause. So, to include other columns in the SELECT clause, use the aggregate functions like COUNT()
, MAX()
, MIN()
, SUM()
, AVG()
with those columns.
- The GROUP BY clause is used to form the groups of records.
- The GROUP BY clause must come after the WHERE clause if present and before the HAVING clause.
- The GROUP BY clause can include one or more columns to form one or more groups based on that columns.
- Only the GROUP BY columns can be included in the SELECT clause. To use other columns in the SELECT clause, use the aggregate functions with them.
For the demo purpose, we will use the following Employee
and Department
tables in all examples.
EmpId | FirstName | LastName | Salary | DeptId | |
---|---|---|---|---|---|
1 | 'John' | 'King' | '[email protected]' | 33000 | 1 |
2 | 'James' | 'Bond' | 1 | ||
3 | 'Neena' | 'Kochhar' | '[email protected]' | 17000 | 2 |
4 | 'Lex' | 'De Haan' | '[email protected]' | 15000 | 1 |
5 | 'Amit' | 'Patel' | 18000 | 1 | |
6 | 'Abdul' | 'Kalam' | '[email protected]' | 25000 | 2 |
DeptId | Name |
---|---|
1 | 'Finance' |
2 | 'HR' |
Consider the following GROUP BY query.
SELECT DeptId, COUNT(EmpId) as 'Number of Employees'
FROM Employee
GROUP BY DeptId;
--following query will return same data as above
SELECT DeptId, COUNT(*) as 'No of Employees'
FROM Employee
GROUP BY DeptId;
The above query includes the GROUP BY DeptId
clause, so you can include only DeptId
in the SELECT clause. You need to use aggregate functions to include other columns in the SELECT clause, so COUNT(EmpId)
is included because we want to count the number of employees in the same DeptId
.
The 'No of Employees'
is an alias of the COUNT(EmpId)
column. The query will display the following result.
DeptId | No of Employees |
---|---|
1 | 4 |
2 | 2 |
The following query gets the department name instead of DeptId
in the result.
SELECT dept.Name as 'Department', count(emp.empid) as 'No of Employees'
FROM Employee emp, Department dept
WHERE emp.deptid = dept.DeptId
GROUP by dept.Name
Department | No of Employees |
---|---|
Finance | 4 |
HR | 2 |
In the same way, the following query gets the department-wise total salaries.
SELECT dept.Name, sum(emp.salary) as 'Total Salaries'
FROM Employee emp, Department dept
WHERE emp.deptid = dept.DeptId
GROUP by dept.Name
Department | Total Salaries |
---|---|
Finance | 66000 |
HR | 42000 |
The following query would throw an error, because dept.Name
is not included in the GROUP BY clause, or no aggregate function is used.
SELECT dept.Name, sum(emp.salary) as 'Total Salaries'
FROM Employee emp, Department dept
WHERE emp.deptid = dept.DeptId
GROUP by dept.DeptId