|
Klaus-Werner Konrad wrote: Arguments,
Nope, I call them Oracle joins (I loathe Oracle) because that is the way I was taught in the 90s then I moved to SQL SErver and was introduced to the JOIN and have never used it since.
I will refactor any procedure I see using them and have a short, sharp discussion with the dev that uses it. Use join or quit.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
Mycroft Holmes wrote: Nope, I call them Oracle joins
Yeah - a really qualified Response ...
I ask for arguments, an your resposne is
Mycroft Holmes wrote: I will [...] have a short, sharp discussion with the dev that uses it
THIS is my Intention - to have a DISCUSSION about it !
But you have obvoiously no arguments, but prejudices against it - not the ideal base for discussions.
What would be your arguments in a discussion with your developer - do you even have any arguments ?
Note: I asked politely for arguments for - or against - my preferation, because (as every of us) I want to improve my abilities, and also to give some impressions to SQL newbies to decide what to do, and not to do.
So ...
HAVE VOU ARGUMENTS, or are you just felt to open your mouth ... ?
|
|
|
|
|
Klaus-Werner Konrad wrote: THIS is my Intention - to have a DISCUSSION about it !
So we are supposed to be telepathic so we can identify your INTENTIONS.
You stated that you prefer the JOIN format, then you asked for arguments to that position. Nowhere did you mention discussion.
I stated that I had no argument and qualified why. In other words I was supporting your preference!
I did not down vote your question or response (was not even aware you had been) as the question was valid and the response just means you are a bit pissy this morning.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
Well - the title should be clear enough, and I presented JOIN syntax and WHERE syntax,
expressing my worry about noone seems using WHERE.
THEN I asked for arguments - obviously (I thought so) WHY noone is using WHERE syntax,
and NOT to support my preference.
In other words: Are there any points (other than preference) to not using WHERE syntax ??
|
|
|
|
|
Klause,
theoretically, the first approach is more optimized and can give better performance while working with large data sets.
WHY?
In first approach, the intermediate tables are expressed clearly and the steps to arrive the result is evident.
eg:
step1 -> Create Intermediate table I : O join C on ID
step2 -> Result R : I join F on ID
this is optimum route to reach the result.
How about second approach?
In a simple case, SQL optimizer might arrive at same steps by analyzing the tables and columns used in WHERE clause but not always. Especially when the WHERE clause is relatively complex.
|
|
|
|
|
I was under the impression there was no difference by the time the optimizer had done it's job so I have always based my bigotry opinion on the readability.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
Generally there isn't, but in specific cases there are.
|
|
|
|
|
The downvote you got is stupid because the question is valid.
The problem is though that readability isn't the same as clarity of intention, and there is more to it than you see at the first glance.
Firstly, when using an ANSI Join you specify which TABLES to join.
When using an implicit join you specify which fields to join on.
This is normally not making any difference, but it might in some specific cases.
Consider this query:
SELECT O.Order_No, C.Address, F.Address
FROM Orders O, Address C, Address F
WHERE O.Customer = C.ID
AND O.Forwarder = F.ID(+)
AND F.OtherID = 1 Is it the same as:
SELECT O.Order_No, C.Address, F.Address
FROM Orders O
JOIN Address C
ON O.Customer = C.ID
LEFT OUTER JOIN Address F
ON O.Forwarder = F.ID
AND F.OtherID = 1 or:
SELECT O.Order_No, C.Address, F.Address
FROM Orders O
JOIN Address C
ON O.Customer = C.ID
LEFT OUTER JOIN Address F
ON O.Forwarder = F.ID
WHERE F.OtherID = 1
The difference is crucial as the results differ.
Secondly, when using an implicit join there is no way to force the optimizer to do the joins in a certain order, which normally is a good thing. The optimizer is usually better than many people on that.
But sometime you know better than the optimizer and can force the joining order as such:
SELECT O.Order_No, C.Address, F.Address
FROM (
Orders O JOIN Address C
ON O.Customer = C.ID
) JOIN Address F
ON O.Forwarder = C.ID
My personal pet peeve is that a field should always have the same name everywhere in a database just like the ISO standard says.
|
|
|
|
|
I'm curious, and lazy, which result does the first query match. I no longer remember the relevance of the (+), left outer join I believe whereas the intention may have been to filter the forwarders.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
The second one, to match the first one the query would need to look like this:
SELECT O.Order_No, C.Address, F.Address
FROM Orders O, Address C, Address F
WHERE O.Customer = C.ID
AND O.Forwarder = F.ID(+)
AND F.OtherID(+) = 1
|
|
|
|
|
When it comes to the relevance of the (+), think of it as the side that allows nulls (+ nulls)
|
|
|
|
|
Wrong - it's the OPPOSITE side that can be NULL, as in O.Forwarder = F.ID(+)
Here, O.Forwarder my be NULL, but F.ID can NEVER be NULL (it's usually the primary key)
|
|
|
|
|
Have a look here[^]
Want to change your mind?.
|
|
|
|
|
Klaus,
(+) is deprecated OUTER JOIN syntax. It's not about allowing or not allowing NULL.
O.Forwarder = F.ID(+) means include all rows from O even if there is no match in F.
O.Forwarder(+) = F.ID means include all rows from F even if there is no match in O
|
|
|
|
|
Thanks for your thoughts, Jörgen.
Of course the results differ, and to achive the correct result,
one have to write
AND NVL( F.OtherID, 1 ) = 1
Quote: My personal pet peeve is that a field should always have the same name everywhere in a database just like the ISO standard says
Don't know where you read this, but it cannot be right ...
Even in my original simple example you have two addresses in tho ORDERS table, so there is no chance in having 'the same field always the same name'.
Maybe you meant that a field with the same MEANING should have always the same name, like CUSTOMER, FORWARDER etc. ?
|
|
|
|
|
That would of course depend on which result is the correct one. I would also go very far to avoid a function in the where clause, for performance reasons.
The ISO standard I'm referring to is ISO-11179[^].
Considering the amount of text, I can't be bothered to find the actual paragraph that I'm referring to.
But the gist is that if you have a Customer Table the Surrogate Key should be named CustomerID in all tables using it no matter if it's the primary key or a foreign key.
It shouldn't be CustomerID in one table, CID in another, customer in a third, or just ID in the Customer Table. Same name everywhere to avoid confusion.
This also allows you to use the using construct[^] or even a Natural Join[^]. Personally I avoid both as I don't find that they add any clarity.
There are as always exceptions. An obvious one is when you have a Person table where the person have a work address and a home address, both columns referring to the same address table, they can't both be AddressID.
|
|
|
|
|
Do you know if it exist a document or recommendation to create audit table or metadata elements on a database.
I ask this question because I just saw this :
CREATE TABLE client (
id char(36) NOT NULL,
first_name varchar(255) NOT NULL,
last_name varchar(255) NOT NULL,
creation_date timestamp NOT NULL,
update_date timestamp NOT NULL,
CONSTRAINT client_pk PRIMARY KEY (id)
);
I don't like the idea to have metadata like creation_date and update_date in a business oriented table. Actually these two fields exists in all tables of this database.
What if I create a audit table like this:
CREATE TABLE audit (
table_name char(32) NOT NULL,
fk char(36) NOT NULL,
update_date timestamp NOT NULL
);
I don't need a creation date because the creation date is my first record with table_name and fk in this table. I could add a columns to add insert, update, delete information. With thi table I keep all my update dates and not only the last one.
Is this audit table a god practice ?
Can I also use this audit table to avoid conflict. You know when two users update the same record at the same time. It's possible to use the last update time to warn them there is a conflict.
|
|
|
|
|
B413 wrote: I don't like the idea to have metadata like creation_date and update_date in a business oriented table. It's just a good idea to do so. Makes sorting by creation-order easier, makes it easy to see when the table last was updated - but it's not an audit. It doesn't show who changed what, it merely shows when the object was created, and when it was last modified. You'll find those two attributes on a lot of things.
Fetching the top(update_date) gives a fast indication whether the table has changed since you last loaded it into memory.
B413 wrote: Is this audit table a good practice ? Depends; who's responsible for entering data into the new table? Does it happen automatically (using a trigger), or do you depend on the cooperation of the DAL-programmer?
I'd simply be dumping a copy of the entire record (in a separate table, different access-rights), including the user-name and domain. It'd be wasting less space if you only keep a track of the fields that are actually modified then when one makes a complete copy. OTOH, space is cheap nowadays
B413 wrote: Can I also use this audit table to avoid conflict. You know when two users
update the same record at the same time. It's possible to use the last update
time to warn them there is a conflict.
..and then what? Ask the user to undo the work he's done and type it again? Merge the changes? What if they modified the same field?
It's a bit sweeter to "check out" (or lock) the record once a user starts editing. What sourcesafe does with files works equally well with records.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
|
|
|
|
Personally I don't consider that an audit solution, I say solution because audit is a whole lot of objects. We do a field level audit so we can track any changes to the data. Simply the ID for an inserted record, each field that is changed - only getting the from value and each field from a deleted record.
In theory this allows us to reproduce any action done by the users. Also we do not audit every table as the apps we build are based on other systems so we only track the master table changes.
The audit is initiated by a trigger spit, the ONLY valid use for the dammed things. This is our audit table.
CREATE TABLE [dbo].[AuditLog](
[AuditID] [INT] IDENTITY(1,1) NOT NULL,
[Action] [CHAR](1) NULL,
[TableName] [VARCHAR](128) NULL,
[PrimaryKeyField] [VARCHAR](1000) NULL,
[PrimaryKeyValue] [VARCHAR](1000) NULL,
[FieldName] [VARCHAR](500) NULL,
[OldValue] [VARCHAR](1000) NULL,
[ModifiedDate] [DATETIME] NULL,
[UserName] [VARCHAR](200) NULL
) ON [PRIMARY]
Every table has a modified and modifiedby field. We also have a table of table names that allows us to run a script and apply/remove triggers from the target tables. As I said theres a whole ecosystem for auditing and it all lives in our model database so it is automatically included in every new database created.
In a vast number of years building LOB apps and some of them transactional I have never been hit by the condition of 2 users editing the same record except in 2 cases, contrived conditions to test the effect and in the early 90s using Microsoft Access in a multi user environment. It is a completed furfy IMNSHO.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
I will second Mycroft's design which is almost identical to the audit table I created.
CRUD dates alone will be of little help as they will not give any context to the data.
Audit tends to mean that you can rebuild a row's values at any point in time.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
|
|
|
|
|
Mycroft Holmes wrote: I have never been hit by the condition of 2 users editing the same record except in 2 cases That doesn't mean that the concurrency-issue does not exist. Depending on the amount of operations, once in a million could happen next monday.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
|
|
|
|
Eddy Vluggen wrote: Depending on the amount of operations, once in a million could happen next monday.
No that isn't what that means at all.
If you have a business case where modification of the same record by two users can occur then the number of operations has nothing to do with it. When one has such a business case then one should of course take the necessary precautions, based on business needs (not just implementation hacking) to insure that a the best and most correct result is arrived at.
If there is no business case then the only way that can happen is due to poor design and/or implementation. Or perhaps just flat out wrong design/implementation.
|
|
|
|
|
jschell wrote: based on business needs (not just implementation hacking) to insure that a the
best and most correct result is arrived at. Two of us get called by a customer and update the same issue in the bugtracker. Whose result is the most "best and correct"?
You better make sure it's my edit, not yours.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
|
|
|
|
Eddy Vluggen wrote: Two of us get called by a customer and update the same issue in the bugtracker
I would contend that that is not a reasonable business case, the same bug. Still it MAY be possible and you would have to design around it. However most requirements for this are just not reasonable.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
Eddy Vluggen wrote: Two of us get called by a customer and update the same issue in the bugtracker
Ok.
First exactly what sort of business scenario is going to lead to two different customer employees to call at the same time to have one modify the very same bug?
Second presumably you are referring to that those two different customer employees are acting independently and both wish to resolve the bug, and not just comment on it, in different ways.
So exactly how, in terms of business usage, if those two same people called at different times would you correctly resolve updating the status two completely different ways? Say one calls on tuesday and says they want to close the bug as no longer applicable and a completely different person calls on wednesday and says they want to increase the priority to critical. Exactly how is your software, and only your software, going to resolve that? And just to make it more fun lets say the guy on tuesday is the CTO of the customers company and the guy on wednesday is a junior developer. (Keeping in mind of course that this is your scenario where two completely different people are interacting with your company at the same time.)
Third, "Whose result is the most 'best and correct'?", is exactly the question. That is business decision not a software decision. You cannot write software that is going to answer that question.
|
|
|
|
|