SQL Wizardry Part One - Joins
Dec 21, 2013
9 min read
SQL
SQL-Server
Intermediate
T-SQL

by Christian Graus
Contributor
Introduction
Welcome to this, my first article in many years. I am intending on doing a series of articles on SQL, one a week. I know I am probably not writing anything that’s not already on the site, but my goal is to provide a series of articles that cover things in an organised way, almost like a book that people can work through. I suspect a lot of developers work with databases, and know a little bit of SQL, but really don’t understand how to really get the most out of it. Hopefully I can help change that.Background
This first article is going to start with basics. I assume you know how to write a ‘where’ clause, and I’m going to start by talking about the different ways of connecting tables to select data between them. If you didn’t download the code for this article, please do so now. It contains all the SQL you’ll want to run in Management Studio while reading through this. Download it now. I’ll wait…. OK, back ? The first thing you’ll see is code that looks like this:with listall(id) as
(
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10
),
listodd(id) as
(
select 1 union all
select 3 union all
select 5 union all
select 7 union all
select 9 union all
select 11
),
listeven(id) as
(
select 2 union all
select 4 union all
select 6 union all
select 8 union all
select 10
)
INNER JOIN
I assume that everyone knows what this does:select * from listall
select * from listall la inner join listodd lo on lo.id = la.id
select * from listeven le inner join listodd lo on lo.id = le.id
OUTER JOIN
Now, what does this do ?select * from listall lo left outer join listeven le on lo.id = le.id
select lo.id, isEven = case when le.id is null then 0 else 1 end from listall lo left outer join listeven le on lo.id = le.id
select lo.id, isnull(le.id, -1) from listall la left outer join listeven le on la.id = le.id
select * from listeven le left outer join listall la on la.id = le.id
select * from listeven le right outer join listall la on la.id = le.id
select * from listeven le full outer join listodd lo on lo.id = le.id
select coalesce(le.id, lo.id) as idList, le.id, lo.id from listeven le full outer join listodd lo on lo.id = le.id order by idlist
CROSS JOIN
This is probably the one people know the least. What do you expect this to do ?select * from listeven le cross join listodd lo
select le.id, lo.id as id2 from listeven le cross join listodd lo
select * from listeven, listodd
However, while you may see this used in online examples, it has not been 'valid' SQL for over 20 years, and no database implementation is required to support it, so you should not use it in your own code.
INTERSECT
Although they are not really joins, I am going to cover two other things here. The first is ‘intersect’. It works like this:select * from listall
intersect
select * from listodd
select * from myClientList where clientid in ( select clientid from list1 intersect select clientid from list2)
EXCEPT
I recently had the task of transforming a large data set from one database/format to another. We had a ton of complex business rules, so I ended up with a view that grabbed all the data from the old DB, a lot of views to process different data types, then a view at the top to aggregate all those values, and turn them in to one list again. It was very complex, but at the end, we wanted to make sure we had not missed any records. This is exactly the sort of situation where the ‘EXCEPT’ operation is invaluable.select * from listall
except
select * from listodd
UNION and UNION ALL
I nearly forgot to cover these. What if, given our data set, we wanted a sequence from 1 to 11 ? This will do that:select * from listall
union
select * from listodd
order by id
select * from listall
union all
select * from listodd
order by id
I have seen some truly awful SQL code based on unions. Remember, union means, run these separate SQL queries, then merge them. If you can instead create one query, it’s bound to be faster ( but UNION itself is VERY fast, what I mean is, if you run two queries that are complex, you pay the cost of both queries, instead of just running one ). However, UNION was invaluable in the task I was describing earlier, because I had a base view, and different views for different business rules, which then needed to be pulled together again. It WAS slow, but it was the best possible tool for the task at hand.
Several people suggested I add a visual representation of the different join types. There is already an excellent article on CP that does this, you can see it here. I see no reason to duplicate it, when it is better for people seeking more information to read that article, as well as mine. I will add, the 'excluding joins' the author shows are perfectly valid, for the purposes of the techniques I've shown, I'd use the 'EXCEPT' keyword to get the same result. I am not sure which is faster.
As you can see, there’s a lot more to SQL than just the select statement and inner joins. Next week I’ll set out to explain how to get the most out of CTEs. If you have any thing you’d like me to specifically cover, or even any general SQL questions, please ask. I promise I’ll answer every question, even if I sometimes have to say that I’m not sure ( but I’ll try to find out ).
Part two ( which covers SELECT statements ) can now be read here.
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)