Click here to Skip to main content
15,884,892 members
Please Sign up or sign in to vote.
3.50/5 (2 votes)
See more:
I am interested in knowing how SQL is used to actually generate results.
I understand how C++ code gets translated to assembly code. I understand how assembly code gets translated into machine code. I understand how the CPU processes this machine code, has a pointer counter to the machine code and works with registers to manipulate the code. I understand components like Arithmetic Logic Units and how they actually work. I learned this in Assembly Level Programming using the IBM 360 assembler. I also learned the 6502 assembly language. I understand linking and debugging via a use of instrumentation and WinHook like functionality. I understand parellel pipelining for Intel processors. I even understand interpretive languages (like the original Basic and) Java and the JVM, and how all of the java byte code is processed at run time.
I have a weak understanding of non Von Neuman architectures and dependency based relationships like that used by that used for dataflow processors. I know that some languages (Prolog??) have expressions where you define the dependency relationships between variables and the system finds you solution based on that. For example, in CAD/CAD systems like Pro/E, one writes a series of geometic relationships which must be maintained. When the system is properly constrained, the system generates a model conforming to the constraints.
But now where was there a good discussion of how this happens for SQL. I have read Hernadez's "An Introduction to Databases for Mere Mortals", but really don't understand how this low level functionality is accomplished.
Suppose that we have a table called mytable, with two columns, int x, int y;

SQL
CREATE TABLE [dbo].[Transaction_Customer_Label](
    [index] [int] NOT NULL,
    [value] [nchar](8) NULL)


Suppose we have a statement like
SQL
SELECT * FROM tabel WHERE index=1

I sort of understand that this must be interpreted to something having indicies which matches the col named index with some way of cheching it against 1.
Behind the scenes, is there some code which basically converts this into something like a big search for values in a rectangular grid?

How is the actual parsing of the SQL statement really accomplished?
I probably need links to articles rather than short blurbs, unless this can be explained in something short.
Posted
Comments
loctrice 16-Jan-12 19:20pm    
postgresql,drizzle,etc.. has public source code. You cant beat that for information.
Sergey Alexandrovich Kryukov 16-Jan-12 20:53pm    
Not that the biggest problem is that it's hard to explain. The problem I can see is different: the question looks pretty much pointless. You are asking: "is there some code which basically converts this into something like a big search for values in a rectangular grid". Yes of course, but not as straightforward as "rectangular grid". You should be familiar in principle with declarative languages. Here, you have several level of indirection and abstractions, but ultimately the declaration is converted into the search, which is different for different systems. SQL itself is abstracted from the physical layer of the database and the search algorithms. You cannot expect that someone explains you everything about every single system, but the main principles seems more or less clear. In other word, I fail to understand what miracle can you see in internal functioning of SQL. Why do you think the problem is in parsing. You see, the parsing itself for all languages is more or less the same, the interpretation of the parsed structure is different.

By the way, you were not accurate in your expression of compiled language functioning. Normally, the code is not parsed into Assembly code; I'm not sure if such compiler exist, because it is absolutely not needed. The code is parsed into some data structures which a free from text detail, and it is used to generate the machine code. In a way, Assembly language is relatively far from machine code.

--SA

1 solution

It isn't short at all and I believe you are using "parsing" incorrectly.

The 'bible' for understanding languages, any language, is the Dragon Book which explains how processing languages occurs. That isn't the actual name but referred to like that for years such that you can even search for that on Amazon. The author is Aho.

For your case AFTER parsing occurs then there are other stages which interpret and optimize to actually run the search. That process is based on the fundamental data structures and implementation of the database.

So to answer your question you must do the following.
1. Learn more about language processing (parsing, semantics, code emitting, interpreters, etc.)
2. Learn more about databases and how they are organized/created.

If you wish after you learn some of the basics of the above I would suggest that you write an interpreter which takes a SQL 'select' with a variable where clause and returns a result. (Don't do anything else except select until you have that working.)
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900