DbSchema Tutorial | SQL SELECT DISTINCT Statement
SQL
, or Structured Query Language, is the cornerstone of many data-related operations. One of its most powerful statements is the SELECT DISTINCT
statement. This tutorial aims to guide fresh students through the nuances
of this statement, ensuring a robust understanding by its conclusion.
1. Introduction to SQL SELECT DISTINCT
The SELECT DISTINCT
statement is pivotal when you aim to retrieve unique values from a table, thereby filtering out duplicates.
Example:
Sample Table (Students):
StudentID | Name |
---|---|
1 | Alice |
2 | Bob |
3 | Alice |
Query:
1 | SELECT DISTINCT Name FROM Students; |
Result:
Name |
---|
Alice |
Bob |
Explanation:
The result showcases unique names. Even though ‘Alice’ appears twice in the table, the DISTINCT
clause ensures it appears only once in the result.
2. SQL DISTINCT on Multiple Columns
This allows for unique
combinations of several columns, rather than unique values from just one column.
Example:
Sample Table (Orders):
OrderID | Product | Color |
---|---|---|
1 | Apple | Red |
2 | Apple | Green |
3 | Orange | Orange |
4 | Apple | Red |
Query:
1 | SELECT DISTINCT Product, Color FROM Orders; |
Result:
Product | Color |
---|---|
Apple | Red |
Apple | Green |
Orange | Orange |
Explanation:
The result provides unique combinations of Product
and Color
. Notice how Apple
+ Red
only appears once, despite being present twice in the original table.
3. SQL DISTINCT on (Multiple) NULL Values
NULL
values are always considered unique from one another in SQL.
Sample Table (Employees):
ID | Name | Address |
---|---|---|
1 | John | NULL |
2 | Michael | 123 Lane St |
3 | Sarah | NULL |
Query:
1 | SELECT DISTINCT Address FROM Employees; |
Result:
Address |
---|
NULL |
123 Lane St |
Explanation:
While there are two NULL values in the Address
column of the table, the result only displays one. This illustrates the principle that all NULLs are distinct.
4. SQL DISTINCT vs GROUP BY
While both provide unique results, GROUP BY
is designed for aggregating data based on specific columns.
Example:
Sample Table (ProductSales):
Product | SaleDate | UnitsSold |
---|---|---|
Apple | 2023-01-01 | 5 |
Apple | 2023-01-01 | 3 |
Orange | 2023-01-02 | 2 |
Apple | 2023-01-03 | 4 |
Using DISTINCT:
To retrieve unique combinations of Product and SaleDate:
1 | SELECT DISTINCT Product, SaleDate FROM ProductSales; |
Result:
Product | SaleDate |
---|---|
Apple | 2023-01-01 |
Orange | 2023-01-02 |
Apple | 2023-01-03 |
Using GROUP BY:
To aggregate the total units sold for each product on each date:
1 | SELECT Product, SaleDate, SUM(UnitsSold) as TotalUnits |
Result:
Product | SaleDate | TotalUnits |
---|---|---|
Apple | 2023-01-01 | 8 |
Orange | 2023-01-02 | 2 |
Apple | 2023-01-03 | 4 |
Explanation:
- Using
DISTINCT
, we can fetch unique combinations of Product and SaleDate. - With
GROUP BY
, not only can we group data based on columns, but we can also perform aggregate functions (like SUM) to understand data at a more granular level. In this case, we aggregated the total units sold for each product on a specific date.
5. SELECT COUNT on DISTINCT
To measure the number of unique values in a column, DISTINCT
can be paired with the COUNT
function.
Query:
1 | SELECT COUNT(DISTINCT Name) FROM Students; |
Result:
COUNT(DISTINCT Name) |
---|
2 |
Explanation:
The result indicates there are two unique student names in the ‘Students’ table.
7. SQL SELECT VS SELECT DISTINCT
Aspect | SELECT Statement | SELECT DISTINCT Statement |
---|---|---|
Purpose | Fetches all data from specified columns. | Fetches unique data, removing duplicates. |
Performance | Generally faster as it retrieves data as-is. | Might be slower on large datasets due to filtering. |
Use Cases | When needing all data, including duplicates. | When distinct values are required. |
Example Result | Could get ‘Apple’, ‘Apple’, ‘Orange’. | Would get ‘Apple’, ‘Orange’. |
8. Common Mistakes
- Applying
DISTINCT
to the wrong columns. - Inefficient performance due to excessive use of
DISTINCT
on large datasets. - Misinterpretation of NULL values with
DISTINCT
.
9. Frequently Asked Questions (FAQs)
Q: Can I combine
DISTINCT
with other SQL functions?A: Yes,
DISTINCT
can be used alongside functions like COUNT, SUM, and others.
10. Practice Questions
- Retrieve unique colors of products in the “Orders” table.
- Determine the count of unique student names in the “Students” table.
- Extract unique combinations of product and color for the product “Apple” from the “Orders” table.
To truly master the SELECT DISTINCT
statement in SQL, regular practice and real-world application are essential. This guide offers a comprehensive overview, but hands-on experience will solidify your understanding.