I sometimes get the question, what is the difference between DISTINCT and GROUP BY?
The answer is, it depends. What I mean is, it depends on what you’re trying to accomplish. If you’re just trying to get rid of duplicate rows then both functions generate the same execution plan and both get rid of duplicate rows. So if all you’re trying to do is get rid of duplicate rows then there is no real difference between the two.
However, GROUP BY has the additional ability to aggregate data using SUM(), COUNT(), MIN(), MAX(), etc functions.
If you have a table of people’s first names, using DISTINCT would allow you to get a list of the unique names in the column. Example: select distinct(FirstName) from People.
Now, if you want to actually aggregate data, like return the name and the number of times is occurs, you could use GROUP BY. Example: select FirstName, count(*) from People group by FirstName
Conclusion: If you only need the distinct values, use DISTINCT. If you need to perform aggregate functions on the data, use GROUP BY.
Happy Tuning!