Building an SQL Logic Engine

Steve Marsh

4.29/5 (6 votes)

Sep 22, 2008

CPOL

6 min read

51028

960

This article describes how to build a logic engine entirely in SQL. In the context of our example, a logic engine is a program that takes in a set of variables and corresponding values, evaluates a set of predefined conditions and builds a result set based on which conditions were found to be true.

Download project - 96.14 KB

Introduction

A powerful programming language is one which allows you to get a lot done with few lines of code. While not as robust as languages like C++ or Java, SQL serves a very specific role and does so with incredible efficiency. There are few applications in existence that are as optimized for raw speed as a modern RDBMS.

This article describes how to build a logic engine entirely in SQL. In the context of our example, a logic engine is a program that takes in a set of variables and corresponding values, evaluates a set of predefined conditions, and builds a result set based on which conditions were found to be true.

You can test out the working example, or download the Visual Studio 2005 project and MSSQL database above.

Background

I've built several different types of logic engines, most of which require a lot of overhead to load, build and parse text statements. A while ago, a project came up that required a logic engine to build a bill of material (BOM) based on selected options for a product. The BOM would be generated based on predefined rules and while these rules would be very simple, there would be several thousand of them and should theoretically be able to generate a BOM for any possible combination of options.

Based on these requirements, I came up with the idea of storing the rules in a database and having the relational database engine perform all the condition evaluation. Performing the logic evaluation in SQL offers several benefits. First it minimized the overhead between the data and business layers and eliminates the need to retrieve large datasets and parse them into objects. It also allows us to execute the evaluation incredibly quickly.

Database Structure & Algorithm

This diagram shows a simplified version of the database we will use to store our rules. Each rule will be represented by a condition group that has multiple conditions. A rule will be in the form: (v1=1) and (v2=2) and (v3=3). Each statement variable=value is stored as a record in the condition table. The condition group will also be tied to materials which should be included in the final bill if the condition evaluates to true.

This program will take in a set of variables, evaluate predefined conditions and based on which conditions are true, will generate a result set, in our case a bill of material. The algorithm for this program is:

Find all individual conditions that are false
Get a list of the condition groups for those conditions
Get a list of all conditions groups that are not in that list
Retrieve items linked to that list of true condition groups

Step 1: Find All Individual Conditions that are False

Our first step is to find conditions that are false. Our condition statement is made up of individual conditions stored as its own record in the database. Because each additional condition always compounds the statement with an and operator, we can infer that if any one condition in the statement is false, then the entire statement is false. This leads us to our first step which is to query for all individual conditions that are false given a set of variable states.

Which conditions don't match any of the variables / value pairs?

Variable : Value
v1 : 1
v2 : 2
v3 : 3

Each condition has two statements we need to check against our data set, one for the variable and one for the value: (var=v1) and (val=1). For a condition to be true, both parts must match any one of the variables. We can test the statement v1=1 with the expression: If (var=v1) and (val=1) then the condition matches. If there are multiple variables to test then we string them together with or statements.

(var=v1 and val=1) or
(var=v2 and val=2) or
(var=v3 and val=3)

If a condition record matches any one of the parenthesized expressions, then we have found a true condition. Let's simplify this expression by representing the statement var=v1 with a letter which represents a Boolean value T or F:

(a and b) or
(c and d) or
(e and f

This statement gives us conditions that are true. However to retrieve a false condition we must reverse the Boolean logic. A condition that is false can be found with the following statement. What this statement says is that if every one of the parenthesized expressions is true (var!=v1 or val!=1) then it failed to match all statements so the condition is false.

(!a or !b) and 
(!c or !d) and
(!e or !f)

The where clause of our SQL statement looks like this:

WHERE(
    (varname<>'v1' OR val<>'1') AND
    (varname<>'v2' OR val<>'2') AND 
    (varname<>'v3' OR val<>'3')
)

The last piece we have to add is the ability to create a condition using a not equal. This means that in addition to creating conditions like v1='one' and ... we should be able to use the not equal operator as well. Ex.(v1=1) and (v2=2) and (v3!=3).

To accomplish this, we add a not flag on the condition record that is set to true when we want to use not equal. This effectively reverses the evaluation. When the condition is false and the not flag is true, the condition is true. This can be represented by the following truth table:

condition	flag	result
true	true	false
false	false	false
true	false	true
false	true	true

You may recognize that this looks exactly like the truth table for the XOR gate. So to add the not flag into our expression we simply have to modify our previous expression with the XOR statement.

WHERE (
    (varname<>'v1' OR val<>'1') AND
    (varname<>'v2' OR val<>'2') AND
    (varname<>'v3' OR val<>'3')
)
XOR (nflag=1)

If this sounds confusing, don't worry it is. But this is the hardest part of the statement and if you can wrap your head around the Boolean logic, the rest will fall into place. This part of the query drives our entire statement and once we have a list of all false conditions, the rest is simple.

Steps 2, 3 & 4

Using the where clause above, we run a query on the distinct conditionGroupIds which gives us all the condition groups that are false. We use this as a sub query to select all condition group ids that are not in the list we just generated. This gives us all condition groups that are true based on our input. The last part is to do a join on the Materials table to build a list of materials that are linked to our true conditions. Here is our final query:

 SELECT m.component, m.qty, m.uom, m.bubble
 FROM(
    SELECT id  FROM ConditionGroups
    WHERE id NOT IN(
        SELECT DISTINCT conditionGroupId
        FROM Conditions
        WHERE 
        (
            (
            (varname<>'v1' OR val<>'1') AND
            (varname<>'v2' OR val<>'2') AND
            (varname<>'v3' OR val<>'3')
            )
            XOR nflag = 0
        )
    )
) AS cg
INNER JOIN Materials AS m
ON m.conditionGroupId = cg.id

Conclusion

So there you have it, a logic evaluation encapsulated into a reasonably concise SQL statement! This is a highly efficient, scalable and generic rule engine and can be used for many different purposes. The sample link running on a MSSQL database server executes in .00043 seconds. The enterprise version of this running on my company's DB2 server executes in .039 seconds after evaluating 3,294 rules from a cold start. That is about the fastest solutions I can think of. Try the example yourself or download the project.

One of the disadvantages of this solution is that you are limited to very simple conditions using only the and, equal and not equal operators. You also have to generate a query for every set of variables being tested which means it is difficult to run this as a stored procedure. That said, I think this example is a creative approach to a common problem and offers some interesting insight into the power of SQL!

Points of Interest

If your SQL engine does not support the XOR operator, you can modify the query as seen below. This is not as concise but does the exact same thing as our other statement.

WHERE(
    (
        (varname<>'v1' OR val<>'1') AND
        (varname<>'v2' OR val<>'2') AND
        (varname<>'v3' OR val<>'3') AND
        nflag = 0
    ) OR (
        NOT(
        (varname<>'v1' OR val<>'1') AND
        (varname<>'v2' OR val<>'2') AND
        (varname<>'v3' OR val<>'3')
        ) AND nflag=1
    )
)

History

22^nd September, 2008: Initial post