Click here to Skip to main content
15,889,216 members
Articles / Database Development / SQL Server / SQL Server 2012

Get Ready to Learn SQL: 6. How to Group and Summarize your Results

Rate me:
Please Sign up or sign in to vote.
5.00/5 (2 votes)
23 Oct 2014MIT6 min read 9.3K   7  
How to Group and Summarize your Results in SQL

In today’s lesson, you’re going to learn about grouping results returned from your queries using the GROUP BY clause.

The objectives of today’s lesson are to:

  • Learn to how to group results using GROUP BY
  • Use aggregate functions to perform calculations
  • Understand how to filter groups using the HAVING clause

GROUP BY Clause

The SQL GROUP BY Clause is used to output a row across specified column values. It is typically used in conjunction with aggregate functions such as SUM or Count to summarize values. In SQL, groups are unique combinations of fields. Rather than returning every row in a table, when values are grouped, only the unique combinations are returned.

The GROUP BY Clause is added to the SQL Statement after the WHERE Clause. Here is an example where we are listing OrderID, excluding quantities greater than 100.

SQL
SELECT OrderID
FROM OrderDetails
WHERE Quantity <= 100
GROUP BY OrderID;

There are a couple of things to note. First, the columns we want to summarize are listed, separated by commas, in the GROUP BY clause. Second, this same list of columns must be listed in the select statement; otherwise the statement fails.

When this statement is run, not every filtered row is returned. Only unique combinations of OrderID are included in the result. This statement is very similar to:

SQL
SELECT DISTINCT OrderID
FROM OrderDetails
WHERE Quantity <= 100;

But there is a key difference. The DISTINCT modifier stops at outputting unique combination of rows, whereas with the GROUP BY statement, we can calculate values based on the underlying filtered rows for each unique combination.

In other words, using our example, with the GROUP BY, we can calculate the number or OrderDetails per order as follows:

SQL
SELECT OrderID, COUNT(OrderID) as NumOrderDetails
FROM OrderDetails
GROUP BY OrderID;

COUNT is an example of an aggregate function, these are what really give the GROUP BY statement its special value.

Aggregate Functions

Some functions, such as SUM, are used to perform calculations on a group of rows, these are called aggregate functions. In most cases, these functions operate on a group of values which are defined using the GROUP BY clause. When there isn’t a GROUP BY clause, it is generally understood the aggregate function applies to all filtered results.

Some of the most common aggregate functions include:

AVG(expression) Calculate the average of the expression
COUNT(expression) Count occurrences of non-null values returned by the expression
COUNT(*) Counts all rows in the specified table
MIN(expression) Finds the minimum expression value
MAX(expression) Finds the maximum expression value
SUM(expression) Calculate the sum of the expression

These functions can be used on their own on in conjunction with the GROUP BY clause. On their own, they operate across the entire table; however, when used with GROUP BY, their calculations are “reset” each time the grouping changes. In this manner, they act as subtotals.

General Syntax of an Aggregate Function

When using the aggregate function, you can either compute the result on all values or distinct values. For instance, to count all OrderDetails records, we could use the expression:

SQL
SELECT COUNT(OrderID)
FROM OrderDetails;

To count the distinct of orders making up the details, we would use the following:

SQL
SELECT COUNT(DISTINCT OrderID)
FROM OrderDetails;

Using Aggregate Functions with GROUP BY

AVG and SUM

The SUM function totals up the values returned, in a similar fashion AVG calculates the average.

Let’s see if we can calculate the total order amount from the OrderDetails. From previous lessons, we know how to calculate the total amount for each detail as:

SQL
SELECT OrderID, ProductID, UnitPrice * Quantity as TotalPrice
FROM OrderDetails;

Since we can apply aggregate function to expressions, we can set up a grouping on OrderID to calculate the total price per order as:

SQL
SELECT OrderID, SUM(UnitPrice * Quantity) as TotalPrice
FROM OrderDetails
GROUP BY OrderID;

We can even sort by the total to get the top orders first:

SQL
SELECT OrderID, SUM(UnitPrice * Quantity) as TotalPrice
FROM OrderDetails
GROUP BY OrderID
ORDER BY TotalPrice DESC;

In a similar fashion, we can calculate the average order detail amount as:

SQL
SELECT OrderID, 
       AVG(UnitPrice * Quantity) as AverageOrderAmount
FROM OrderDetails
GROUP BY OrderID;

For the curious, since an average is calculated as the sum of the sample divided by the sample count, then using AVG in the above statement is the same as:

SQL
SELECT OrderID, 
       SUM(UnitPrice * Quantity) / COUNT(OrderID)  as AverageOrderAmount
FROM OrderDetails
GROUP BY OrderID;

We covered a lot in this section. Here are some key points to remember:

  1. An aggregate function can evaluate an expression such as SUM(A + B)
  2. You should alias aggregate functions, so the column names are meaningful
  3. When working with aggregate functions and GROUP BY, it is sometimes easier to think about the details first, that is write a simple SELECT statement, inspect the results, then add in the fancy stuff.

COUNT

The COUNT function is used when you need to know how many records exist in a table or within a group. COUNT(*) will count every record in the grouping; whereas COUNT(expression) counts every record where expression’s result isn’t null. You can also use Distinct with COUNT to find the number of unique values within a group.

To find the number of OrderDetail Lines per order:

SQL
SELECT OrderID, COUNT(OrderDetailID)
FROM OrderDetails
GROUP BY OrderID;

To find the number of unique orders per product:

SQL
SELECT ProductID, COUNT(DISTINCT OrderID)
FROM OrderDetails
GROUP BY ProductID;

MIN and MAX

Use MIN and MAX to find the smallest and largest values, respectively, within a table or group.

For example, to find the smallest and largest product quantities ordered within an order, try:

SQL
SELECT OrderID, 
       MIN(Quantity) as MinQuantity, 
       MAX(Quantity) as MaxQuantity
FROM OrderDetails
GROUP BY OrderID;

You can also find the MIN or MAX value of a calculation. Here, we find the highest product amount ordered within a product:

SQL
SELECT OrderID, 
       MAX(UnitPrice * Quantity) as MaxAmount
FROM OrderDetails
GROUP BY OrderID;

HAVING Clause

The HAVING clause is used to filter groups according to the results of the aggregate functions. This makes it possible to solve problems such as select all orders that have more than two order detail lines.

That example looks like:

SQL
SELECT OrderID, COUNT(OrderDetailID)
FROM OrderDetails
GROUP BY OrderID
HAVING COUNT(OrderDetailID) > 2;

If we wanted to find all orders greater than $1000, we would write:

SQL
SELECT OrderID, 
       SUM(UnitPrice * Quantity) as TotalPrice
FROM OrderDetails
GROUP BY OrderID
HAVING TotalPrice > 1000
ORDER BY TotalPrice DESC;

This query is the same as one from the previous section with the addition of the HAVING clause. We could have written the HAVING clause as:

SQL
HAVING SUM(UnitPrice * Quantity) > 1000

But, since the column was already aliased, we used it instead.

To hammer home HAVING, I want to show one last example. Here you’ll see the HAVING statement includes an aggregate function that isn’t in the SELECT list.

SQL
SELECT OrderID, 
       SUM(UnitPrice * Quantity) as TotalPrice
FROM OrderDetails
GROUP BY OrderID
HAVING AVG(UnitPrice * Quantity) > 500
ORDER BY TotalPrice DESC;

In the above query, we’re getting the total price for orders where the average OrderDetail amount is greater than $500.00.

Final Statement about HAVING

Though they perform a similar function, there is a key distinction between the WHERE clause and HAVING. The WHERE clause filter individual records; whereas, the HAVING clause filters on the groups.

To keep it straight in my head, I like to think of the WHERE clause doing its work before any groupings take place, and then the HAVING clause taking over after the groups are formed.

Exercises

It’s important to practice! Use the sample database to answer these questions.

  1. What is the average quantity ordered in the OrderDetails table?
  2. Display the Min, Max, and Average Quantity ordered for each product in OrderDetails.
  3. Return total sales, by product for all orders, but only include products included on 7 or more OrderDetails.

Answers to Exercises

Congratulations! You just learned how to use the GROUP BY and HAVING clauses to summarize and filter on summarized information. More tutorials are to follow! Remember! I want to remind you all that if you have other questions you want answered, then post a comment or tweet me.

I’m here to help you. What other topics would you like to know more about?

License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Easy Computer Academy, LLC
United States United States
Hello my name is Kris. I’m here because I am passionate about helping non-techie people to overcome their fear of learning SQL.

I know what it is like to not know where to start or whether the time spent learning is worth the effort. That is why I am here to help you to:
- Get started in an easy to follow step-by-step manner.
- Use your time wisely so you focus on what is important to learn to get the most value from your time.
- Answer your questions. Really! Just post a comment and I’ll respond. I’m here to help.

It wasn’t long ago that I was helping a colleague with some reporting. She didn’t know where to start and soon got overwhelmed and lost as she didn’t know SQL.

I felt really bad, as she was under pressure to get some summary information to her boss, the built-in reports were falling short, and to make them better would require her to know SQL. At that time that seemed impossible! It in dawned on me, it doesn’t have to be that way.

Then I discovered a way for anyone with the desire to easily learn SQL. I worked with my co-worker, started to teach her what I learned and soon she was able to write reports and answer her boss’ questions without getting stressed or ploughing hours into manipulating data in Excel.

It hasn’t always been easy. Sometimes the information seems abstract or too conceptual. In this case I’ve found out that a visual explanation is best. I really like to use diagrams or videos to explain hard-to-grasp ideas.

Having video, pictures, and text really help to reinforce the point and enable learning.

And now I want to help you get the same results.

The first step is simple, click here http://www.essentialsql.com/get-started-with-sql-server/

Comments and Discussions

 
-- There are no messages in this forum --