Click here to Skip to main content
14,926,097 members
Articles / Database Development
Article
Posted 10 Jan 2018

Stats

7.3K views
1 bookmarked

SQL object dependencies with simple RegEx and some algo magic

Rate me:
Please Sign up or sign in to vote.
5.00/5 (2 votes)
10 Jan 2018CPOL4 min read
A simple yet powerful tool to analyse SQL object dependencies in quite fast and reliable manner.

Introduction

Databases came to the world in 70. Since that they begame bigger in size, with more complicated logic and are still keep growing. So, there are many tools, SQL parsers, which tokenize by-object script.

But common parsers (ANTLR e.g.) may face with script parsing errors, each SQL dialect has it's own specials, and time issues on huge number of DB objects.

So, I want to show much simplier, a RegEx+some trick way

 

Main Concept

  1. Get one-file SQL script of whole db objects.
  2. Split SQL script onto text blocks by using regex that match DDL statements
  3. Search some text string among these text blocks

 

Implementation

1. Take all SQL objects script

Each RDBMS has an option to generate SQL drop/create scripts. So, first we get single SQL script of all database objects. Or we can take northwind.sql as an example.

2. Get all DDL statements with regular expression

I use this expression

C#
\b(create|alter)\s+(unique)*\s*(nonclustered)*\s*(procedure|proc|table|trigger|view|function|index)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*

FastColoredTextBox gives the great tools to work with regular expressions, so we use 

C#
var result = range.GetRangesByLines(regexStr, RegexOptions.IgnoreCase);

 

First of all i built some useful regex and collect them in one static class

C#
public static class RegexValues
{
    public static string SqlCmdObjects = @"\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
    public static string SqlCmdObjectsShort = @"\$\(([^=<>\[\]\s\']+)\)";

    public static string DdlObjects = @"\b(create)\s+(procedure|proc|table|trigger|view|function)\b\s\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
    public static string DdlObjectsPrepared = @"\b(create)\s+(procedure|proc|table|trigger|view|function)\b\s\[([^=<>\[\]\s\']+)\].(\[[^=<>\s\']+\])";

    public static string DdlObjects_ = @"\b(create)\s+(procedure|proc|table|trigger|view|function)\b\s\$\(([^=<>\[\]\s\']+)\).[^=<>\s\']+";

    public static string DdlObjectsPrepared_ = @"\b(create)\s+(procedure|proc|table|trigger|view|function)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";

    public static string DdlObjectsPreparedProcViewTrigger = @"\b(create)\s+(procedure|proc|trigger|view|function)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";

    public static string DdlObjectsPreparedWithIndex = @"\b(create)\s+(unique)*\s*(nonclustered)*\s*(procedure|proc|table|trigger|view|function|index)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";

    public static string DdlIndexAll = @"\b(create|alter)\s+(procedure|proc|table|trigger|view|function)\b\s\[([^=<>\[\]\s\']+)\].\[[^=<>\s\']+\]";

    public static string Variables = @"\@([^=<>\s\'\)\(\,]+)";
    public static string SqlCmdVariables = @"\:SETVAR\s+([a-zA-Z_]+)\s+([a-zA-Z_]+)";
}
  • SqlCmdObjects - matches for [$(some_var)].[obj_name] or [$(some_var)].[obj_schema].[obj_name]
  • SqlCmdObjectsShort - matches for $(some_var)
  • DdlObjects - same as SqlCmdObjects plus create PROC/TABLE/VIEW/FUNC/TRIGGER statements
  • DdlObjectsPrepared - same as DdlObjects, but removed $ (sqlcmd variable) restriction
  • DdlObjects_ - same as DdlObjects, but removed enclosing [] restriction
  • DdlObjectsPrepared_ - same as DdlObjectsPrepared, but removed enclosing [] restriction
  • DdlObjectsPreparedProcViewTrigger - DDL statements for only proc, views, triggers and functions, i. e. only code objects
  • DdlObjectsPreparedWithIndex - same as DdlObjectsPrepared_, but extended with index statements
  • DdlIndexAll - same as DdlObjectsPrepared_, but extended with alter statements
  • Variables - regex that finds all @variables in script
  • SqlCmdVariables - finds SQL CMD variables (like :SETVAR var1 val1)

 

 

The heart of this software is the following lines of code:

C#
var range = Syncronized(() => fastColoredTextBox1.Selection.Clone());
range.Normalize();

range.Start = new Place(0, 0);
range.End = Syncronized(() => new Place(tb.GetLineLength(tb.LinesCount - 1), tb.LinesCount - 1));

So we just load SQL script into FastColoredTextBox (thanks alot to its authors!!!) and apply some regex to all its contents. As the output we got RESULT variable, wich contains a list of found ranges.

Range is the powerful class (thanks again to FastColoredTextBox author). Range contains line number and column number of found fragment in SQL script. We just sore list of ranges in virtual listbox (common technique) and on SelectedIndexChanged do the following

C#
fastColoredTextBox1.Selection = range;
fastColoredTextBox1.DoSelectionVisible();

by these two lines we instantly navigate to found piece of code (i. e. DDL statement).

Then we build another list of ranges, but instead of regex string we put the name of some object to be found.

The last step is to find occurences of second list of ranges in the first list of ranges. This is done in method

C#
private void FindUsage(string regex)

of Form1.cs

That's all, the rest work is just common .NET coding to build up all together.

 

As a brief to above - the logic is as follows:

1. we find all create/alter statements by using regex. And remember those line numbers.

2. we find all occurences of text (object e.g. table name) by using regex. And also remember line numbers.

3. now join these two datasets, finding, where the text line in between create/alter statements lines

 

As a result we got simple GUI designer, where you can:

  • open a SQL script or whole folder (scripts will be merged into temp file and opened as singel script)
  • Search all DDL statements (CREATE/ALTER) - list will be build on the left pane, with full navigation support.
  • select some piece of text by mouse (e. g. some SQL object name)
  • right click and select "Find Usages" - list will be build on the right pane - these are SQL objects, containing selected text

Another good thing is that you can search not only SQL objects occurence in SQL objects, but almost any piece of code, text, comment etc.

 

Important Notice

As this way of analyzing SQL dependencies  is based not on metadata info, you should always keep in mind that you're joining two datasets of rows/ So there are some limitations, or, better say, "features" :)

Let's say we have a stored procedure

SQL
create proc test

as

declare @somevar int

create table #tmp(

colum_we_search nvarchar(255),
somevar int)

 

If we apply regex matching "CREATE TABLE" statement, we match temp table along with CREATE PROC.

Then, if we try to search "COLUMN_WE_SEARCH" - it will be found inside that temp table #tmp, not in procedure test.

This can be workarounded with more precise initial regex. For such cases i wrote DdlObjectsPreparedProcViewTrigger regex statement (see above).

 

Points of Interest

I want to add MSAGL https://github.com/Microsoft/automatic-graph-layout support to visualize dependencies.

FastColoredTextBox - is the hi-end, the total breakthrough! I even didn't expect that it has so many cool features!

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

ASkaeff
Software Developer myself
Russian Federation Russian Federation
Just do software better

Comments and Discussions

 
NewsNew version & bugfixes Pin
ASkaeff19-Dec-17 22:57
MemberASkaeff19-Dec-17 22:57 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.