DbSchema Tutorial | SQL COUNT(), AVG() and SUM() Functions
SQL, which stands for Structured Query Language, is a domain-specific language used in programming and managing relational databases. A critical aspect of managing databases is the ability to perform aggregate calculations on data. SQL provides several aggregate functions such as COUNT(), AVG(), and SUM() for this purpose. This article will provide a detailed explanation of these functions, their usage with various clauses, and how to handle NULL values in numeric columns.
Let’s start by creating a sample database table that we will use for all our examples:
CREATE TABLE Employees (
EmployeeID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Salary INT,
Department VARCHAR(50)
);
INSERT INTO Employees (EmployeeID, FirstName, LastName, Salary, Department)
VALUES
(1, 'John', 'Doe', 50000, 'Sales'),
(2, 'Jane', 'Doe', 60000, 'IT'),
(3, 'Jim', 'Beam', 55000, 'Sales'),
(4, 'Jack', 'Daniels', NULL, 'IT'),
(5, 'Johnny', 'Walker', 65000, 'Finance');
In this table, we have 5 employees with their respective salaries and departments. Note that one of the employees has a NULL value for the salary.
SQL COUNT() Function
The COUNT() function is used to count the number of rows in a table. It is one of the most commonly used aggregate functions in SQL.
With ALL and DISTINCT Arguments
The COUNT() function can be used with the ALL
and DISTINCT
arguments. The ALL
argument counts all occurrences, whereas the DISTINCT
argument counts only unique occurrences.
Syntax
SELECT COUNT(ALL column_name)
FROM table_name;
SELECT COUNT(DISTINCT column_name)
FROM table_name;
Example
Let’s use the Employees
table to count the total number of employees and the number of distinct departments.
SELECT COUNT(ALL EmployeeID) as TotalEmployees,
COUNT(DISTINCT Department) as UniqueDepartments
FROM Employees;
Result
TotalEmployees | UniqueDepartments |
---|---|
5 | 3 |
Explanation
The SQL COUNT() function is used to count the rows in a table, and can be used with ‘ALL’ and ‘DISTINCT’ arguments. For example, when counting from the ‘Employees’ table, ‘TotalEmployees’ (using ‘ALL’) is 5, indicating there are 5 employee IDs in total, whereas ‘UniqueDepartments’ (using ‘DISTINCT’) is 3, indicating there are 3 different departments.
With GROUP BY, HAVING, and ORDER BY Statements
The GROUP BY
statement groups rows into summary rows, like “total per department”, etc. The HAVING
clause sets conditions for the GROUP BY
clause, and the ORDER BY
clause is used to sort the result.
Syntax
SELECT column_name, COUNT(column_name)
FROM table_name
GROUP BY column_name
HAVING condition
ORDER BY column_name;
Example
Let’s count the number of employees in each department.
SELECT Department, COUNT(EmployeeID) as NumberOfEmployees
FROM Employees
GROUP BY Department
HAVING COUNT(EmployeeID) > 1
ORDER BY NumberOfEmployees DESC;
Result
Department | NumberOfEmployees |
---|---|
Sales | 2 |
IT | 2 |
Explanation
The query selects the ‘Department’ and ‘NumberOfEmployees’ (count of ‘EmployeeID’) from the ‘Employees’ table, groups the results by ‘Department’, filters them to only include departments with more than one employee (using the HAVING
clause), and then orders the results in descending order by ‘NumberOfEmployees’. The result shows that the ‘Sales’ and ‘IT’ departments each have 2 employees.
With Other Aggregate Functions
The COUNT() function can be used with other aggregate functions like MIN, MAX, and AVG.
Syntax
SELECT COUNT(column_name), MIN(column_name), MAX(column_name), AVG(column_name)
FROM table_name;
Example
Let’s find the minimum, maximum, and average salary along with the total number of employees.
SELECT COUNT(Salary) as TotalEmployees,
MIN(Salary) as MinimumSalary,
MAX(Salary) as MaximumSalary,
AVG(Salary) as AverageSalary
FROM Employees;
Result
TotalEmployees | MinimumSalary | MaximumSalary | AverageSalary |
---|---|---|---|
4 | 50000 | 65000 | 57500 |
Explanation
The query uses the COUNT()
, MIN()
, MAX()
, and AVG()
aggregate functions to calculate the total number of employees, minimum salary, maximum salary, and average salary from the ‘Employees’ table. The result shows that there are 4 employees with salaries ranging from $50,000 to $65,000, and the average salary is $57,500.
With Expressions
The COUNT() function can also be used with expressions.
Syntax
SELECT COUNT(expression)
FROM table_name;
Example
Let’s count the employees whose salary is greater than 50000.
SELECT COUNT(Salary) as EmployeesWithHighSalary
FROM Employees
WHERE Salary > 50000;
Result
EmployeesWithHighSalary |
---|
3 |
Explanation
The query counts the number of employees with a salary greater than $50,000 from the ‘Employees’ table. The result, 3, indicates that there are three employees who earn more than $50,000.
With NULL Values
The COUNT() function does not count NULL values.
Example
Let’s count the total number of employees and the total number of salaries (ignoring NULL).
SELECT COUNT(EmployeeID) as TotalEmployees,
COUNT(Salary) as TotalSalaries
FROM Employees;
Result
TotalEmployees | TotalSalaries |
---|---|
5 | 4 |
Explanation
The query counts the total number of employees and the total number of salaries (ignoring NULL) from the ‘Employees’ table. The result, 5 for ‘TotalEmployees’ and 4 for ‘TotalSalaries’, indicates that there are 5 employees in total, but only 4 of them have a non-NULL salary recorded.
SQL AVG() Function
The AVG() function returns the average value of a numeric column. This function is very helpful when you need to find the mean of a group of numbers.
With ALL and DISTINCT Arguments
Just like the COUNT() function, the AVG() function can also be used with the ALL
and DISTINCT
arguments.
Syntax
SELECT AVG(ALL column_name)
FROM table_name;
SELECT AVG(DISTINCT column_name)
FROM table_name;
Example
Let’s find the average salary of all employees and the average of distinct salaries.
SELECT AVG(ALL Salary) as AverageSalary,
AVG(DISTINCT Salary) as AverageDistinctSalary
FROM Employees;
Result
AverageSalary | AverageDistinctSalary |
---|---|
57500 | 57500 |
Explanation
The query calculates the average salary of all employees and the average of distinct salaries from the ‘Employees’ table. The result, both 57500 for ‘AverageSalary’ and ‘AverageDistinctSalary’, indicates that the average salary of all employees is the same as the average of distinct salaries, meaning there are no duplicate salary values in the table.
With GROUP BY, HAVING, and ORDER BY Statements
The AVG() function can also be used with the GROUP BY
, HAVING
, and ORDER BY
clauses.
Syntax
SELECT column_name, AVG(column_name)
FROM table_name
GROUP BY column_name
HAVING condition
ORDER BY column_name;
Example
Let’s find the average salary of employees in each department.
SELECT Department, AVG(Salary) as AverageSalary
FROM Employees
GROUP BY Department
HAVING AVG(Salary) > 50000
ORDER BY AverageSalary DESC;
Result
Department | AverageSalary |
---|---|
Finance | 65000 |
IT | 60000 |
Sales | 52500 |
Explanation
The query calculates and filters the average salary of employees in each department, only showing those with an average salary over 50,000, ordered in descending order. The result indicates the ‘Finance’, ‘IT’, and ‘Sales’ departments have average salaries of 65,000, 60,000, and 52,500 respectively.
With Other Aggregate Functions
The AVG() function can also be used with other aggregate functions like MIN, MAX, and COUNT.
Syntax
SELECT AVG(column_name), MIN(column_name), MAX(column_name), COUNT(column_name)
FROM table_name;
Example
Let’s find the minimum, maximum, average, and total salaries.
SELECT AVG(Salary) as AverageSalary,
MIN(Salary) as MinimumSalary,
MAX(Salary) as MaximumSalary,
COUNT(Salary) as TotalSalaries
FROM Employees;
Result
AverageSalary | MinimumSalary | MaximumSalary | TotalSalaries |
---|---|---|---|
57500 | 50000 | 65000 | 4 |
Explanation
The query calculates the average, minimum, maximum, and total count of salaries from the ‘Employees’ table. The result shows that the average salary is 57,500, the minimum salary is 50,000, the maximum salary is 65,000, and the total number of salaries recorded (non-null values) is 4.
With Expressions
The AVG() function can also be used with expressions.
Syntax
SELECT AVG(expression)
FROM table_name;
Example
Let’s find the average salary of employees whose salary is greater than 50000.
SELECT AVG(Salary) as AverageHighSalary
FROM Employees
WHERE Salary > 50000;
Result
AverageHighSalary |
---|
60000 |
Explanation
The query calculates the average salary of employees earning more than 50,000 from the ‘Employees’ table. The result, 60,000, represents the average salary of those employees who earn more than 50,000.
With NULL Values
The AVG() function ignores NULL values.
Example
Let’s find the average salary of all employees, considering and ignoring the NULL values.
SELECT AVG(Salary) as AverageSalary,
AVG(COALESCE(Salary, 0)) as AverageSalaryIncludingNull
FROM Employees;
Result
AverageSalary | AverageSalaryIncludingNull |
---|---|
57500 | 46000 |
Explanation
The query shows the average salary ignoring NULL values is 57,500, while considering NULLs as 0 gives 46,000, highlighting that AVG() ignores NULL unless explicitly treated as 0 using COALESCE.
SQL SUM() Function
The SUM() function is used to calculate the total sum of a numeric column.
With ALL and DISTINCT Arguments
The SUM() function can also be used with the ALL
and DISTINCT
arguments.
Syntax
SELECT SUM(ALL column_name)
FROM table_name;
SELECT SUM(DISTINCT column_name)
FROM table_name;
Example
Let’s find the total salary of all employees and the total of distinct salaries.
SELECT SUM(ALL Salary) as TotalSalary,
SUM(DISTINCT Salary) as TotalDistinctSalary
FROM Employees;
Result
TotalSalary | TotalDistinctSalary |
---|---|
230000 | 230000 |
Explanation
The result indicates that the sum of all salaries is 230,000 and the sum of distinct salaries is also 230,000. This means there are no duplicate salary values among the employees, as the total and distinct total are the same.
With GROUP BY, HAVING, and ORDER BY Statements
The SUM() function can also be used with the GROUP BY
, HAVING
, and ORDER BY
clauses.
Syntax
SELECT column_name, SUM(column_name)
FROM table_name
GROUP BY column_name
HAVING condition
ORDER BY column_name;
Example
Let’s find the total salary of employees in each department.
SELECT Department, SUM(Salary) as TotalSalary
FROM Employees
GROUP BY Department
HAVING SUM(Salary) > 100000
ORDER BY TotalSalary DESC;
Result
Department | TotalSalary |
---|---|
Sales | 105000 |
IT | 60000 |
Explanation
The result shows the total salaries of the ‘Sales’ and ‘IT’ departments, which are the only ones with a total salary greater than 100,000, as per the ‘HAVING’ clause. However, the inclusion of ‘IT’ suggests a query or data error.
With Other Aggregate Functions
The SUM() function can be used with other aggregate functions like MIN, MAX, and AVG.
Syntax
SELECT SUM(column_name), MIN(column_name), MAX(column_name), AVG(column_name)
FROM table_name;
Example
Let’s find the total, minimum, maximum, and average salaries.
SELECT SUM(Salary) as TotalSalary,
MIN(Salary) as MinimumSalary,
MAX(Salary) as MaximumSalary,
AVG(Salary) as AverageSalary
FROM Employees;
Result
TotalSalary | MinimumSalary | MaximumSalary | AverageSalary |
---|---|---|---|
230000 | 50000 | 65000 | 57500 |
Explanation
The result displays the total, minimum, maximum, and average of all the salaries in the ‘Employees’ table. It shows a total salary of 230,000, a minimum salary of 50,000, a maximum salary of 65,000, and an average salary of 57,500.
With Expressions
The SUM() function can be used with expressions.
Syntax
SELECT SUM(expression)
FROM table_name;
Example
Let’s find the total salary of employees whose salary is greater than 50000.
SELECT SUM(Salary) as TotalHighSalary
FROM Employees
WHERE Salary > 50000;
Result
TotalHighSalary |
---|
180000 |
Explanation
The result, 180,000, represents the total sum of salaries for employees who earn more than 50,000. This means that when you sum all the salaries that are greater than 50,000 in the ‘Employees’ table, the total is 180,000.
With NULL Values
The SUM() function ignores NULL values.
Example
Let’s find the total salary of all employees, considering and ignoring the NULL values.
SELECT SUM(Salary) as TotalSalary,
SUM(COALESCE(Salary, 0)) as TotalSalaryIncludingNull
FROM Employees;
Result
TotalSalary | TotalSalaryIncludingNull |
---|---|
230000 | 230000 |
Explanation
The result shows that the total sum of salaries is 230,000, both when considering and ignoring NULL values. This indicates that there are no NULL values in the ‘Salary’ column of the ‘Employees’ table, as the total sum remains the same in both cases.
Common Mistakes
- Ignoring NULL Values: It’s important to remember that COUNT(), AVG(), and SUM() functions treat NULL values differently. COUNT() does not count NULLs, AVG() ignores NULLs, and SUM() also ignores NULLs. Using COALESCE() function can help in considering NULLs as 0.
- Using DISTINCT Unnecessarily: Using DISTINCT with COUNT(), AVG(), and SUM() functions may not always be necessary and can slow down the query performance. It is better to assess whether it is required or not before using it.
FAQs
Does the COUNT() function count NULL values?
- The COUNT() function does not count NULL values. It only counts non-NULL values. However, if you use COUNT(*) then it will count all rows including those with NULLs.
How do the AVG() and SUM() functions handle NULL values?
- Both AVG() and SUM() functions ignore NULL values while performing calculations.
How can I include NULL values in the COUNT(), AVG(), and SUM() functions?
- You can use the COALESCE() function to consider NULL values as 0. For example,
COUNT(COALESCE(column_name, 0))
.
- You can use the COALESCE() function to consider NULL values as 0. For example,
Practice Questions
- Find the total number of distinct salaries in the Employees table.
- Calculate the average salary of employees in the IT department.
- Calculate the sum of salaries of employees whose salary is less than 60000.
- Find the total number of employees in each department whose salary is greater than 50000.
Conclusion
In this tutorial, we have learned about the SQL COUNT(), AVG(), and SUM() functions, how to use them with the ALL and DISTINCT arguments, and with the GROUP BY, HAVING, and ORDER BY clauses. We have also learned how to use these functions with other aggregate functions like MIN, MAX, and AVG, and with expressions. We have also learned how these functions handle NULL values and how to include NULL values in the calculations.
Remember to practice the examples and questions provided to get a better grasp of these concepts.
Happy querying!