Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / VB

Sudoku Algorithm: Generates a Valid Sudoku in 0.018 seconds

4.75/5 (21 votes)
17 Feb 2009CPOL6 min read 1   6.6K  
An article about the simple, yet often annoying to achieve, backtracking algorithm for Sudoku generation.
Image 1

Introduction

When I designed the project this article is based on, it was much less about producing a fast algorithm for Sudoku generation. For all intents and purposes, I had already achieved this in a previous project. No, this project was about testing myself that little bit further to learn new ways of dealing with things and, overall, improving my previous code with the new things I had learnt. Little did I know I'd reduce my all time best 5 second algorithm to a mere 0.07 second algorithm. Now I look at it, it seems so obvious.

Aim

The aim of this project was to create a fast, short and reliable Sudoku algorithm. To do this, I used three main parts; a specialized structure, array lists and generic lists.

The result of this combination and application of the backtracking technique has resulted in exactly what I aimed for. A Sudoku generator that does not make a mistake, is less than 300 lines of code and makes a Sudoku in 0.07 seconds. That was impressive enough but thanks to Keith B., Imperiatus and Spirch, now the generator can now produce a Sudoku at an average of 0.018 seconds.

The Rundown

What is Backtracking

The basic principle of a backtracking algorithm, in regards to Sudoku, is to work forwards, one square at a time to produce a working Sudoku grid. When a problem occurs, the algorithm takes itself back one step and tries a different path. Essentially it's like walking through a maze with some golden thread and going back and forth down dead ends until you find the right way out. It's nearly impossible to produce a valid Sudoku by randomly plotting numbers and trying to make them fit. Likewise, backtracking with a random placement method is equally ineffective (trust me I've tried both). Backtracking best works in a linear method. It is fast, effective and reliable if done correctly. Below is a basic diagram showing the general flow of the algorithm:

AlgorithmTree.jpg

Sudoku

Just in case you have forgotten or are unsure. There is only one rule to Sudoku, every number form 1 to 9 must be placed once, and once only, in every row, column and 3x3 region.

When to Backtrack

Ok, let's use this image as an example. We've been going fine so far for the first 17 numbers, but when it comes to finding a valid fit for the 18th number there are no valid options. every number bar 1 has been used on this line, but 1 has been used above, in the same column and the same region. Therefore; we have to backtrack, in this case back to the 6. Changing the 6 to a 1 would fix this problem.

Example1_2_.jpg

The Key to Backtracking

The most important part of backtracking is, logically, keeping track of what has already been tried and where, or, in this case, what hasn't been tried where. For this example, I used an arraylist of generic lists.

VB.NET
Dim Available(80) as List(Of Integer)

This way, each square of the Sudoku has its own individual list of values, telling the algorithm what hasn't been tried in the square. This way when another number has to be chosen, the only available numbers are the only ones tried.

The Algorithm

You'll notice the algorithm calls a number of other functions, subs and the all important 'Structure', which I haven't yet talked about. For ease of understanding, I have numbered and explained them.

VB.NET
Public Sub GenerateGrid()

    Clear()
    Dim Squares(80) As Square 'an arraylist of squares: see line 86
    Dim Available(80) As List(Of Integer) 'an arraylist of generic lists (nested lists)
    'we use this to keep track of what numbers we can still use in what squares
    Dim c As Integer = 0 'use this to count the square we are up to

    For x As Integer = 0 To Available.Length - 1
        Available(x) = New List(Of Integer)
        For i As Integer = 1 To 9
            Available(x).Add(i)
        Next
    Next

    Do Until c = 81 'we want to fill every square object with values
        If Not Available(c).Count = 0 Then 	'if every number has been tried 
					'and failed then backtrack
            Dim i As Integer = GetRan(0, Available(c).Count - 1)
            Dim z As Integer = Available(c).Item(i)
            If Conflicts(Squares, Item(c, z)) = False Then 	'do a check with the 
							'proposed number
                Squares(c) = Item(c, z) 	'this number works so we add it to the 
					'list of numbers
                Available(c).RemoveAt(i) 'we also remove it from its individual list
                c += 1 'move to the next number
            Else
                Available(c).RemoveAt(i) 	'this number conflicts so we remove it 
					'from its list
            End If
        Else
            For y As Integer = 1 To 9 'forget anything about the current square
                Available(c).Add(y) 'by resetting its available numbers
            Next
            Squares(c - 1) = Nothing 'go back and retry a different number 
            c -= 1 'in the previous square
        End If
    Loop

    Dim j As Integer ' this produces the output list of squares
    For j = 0 To 80
        Sudoku.Add(Squares(j))
    Next

     'This algorithm produces a Sudoku in an average of 0.018 seconds, 
     'tested over 5000 iterations
     End Sub
  1. Clear simply deletes the previously produced Sudoku, if any.
    VB.NET
    Public Sub Clear()
        Sudoku.Clear()
    End Sub
  2. Square is the name I have given to the structure. Each instance of square represents an object that contains information about the value, index and relative position of each square. It contains its region (3x3 area), row, column, index and value.
    VB.NET
    Public Structure Square
        Dim Across As Integer
        Dim Down As Integer
        Dim Region As Integer
        Dim Value As Integer
        Dim Index As Integer
    End Structure
  3. GetRan retrieves a random number between 0 and the last index of the current list, which means it grabs one of the remaining numbers indexes.
    VB.NET
    Private Function GetRan(ByVal lower As Integer, ByVal upper As Integer) _
    	As Integer
        Dim r As New Random
        GetRan = r.Next(lower, upper - 1)
    End Function
  4. Conflicts is, perhaps, the most important function in the overall algorithm. It tells us if the number we are looking at using, is going to work or not. To do this it takes the squares currently produced and compares them with an instance of a not yet produced square. This test square ('the hypothetical') is made in the 'Item' function below.
    VB.NET
    Private Function Conflicts(ByVal CurrentValues As Square(), _
    	ByVal test As Square) As Boolean
          
    For Each s As Square In CurrentValues
        If (s.Across <> 0 And s.Across = test.Across) OrElse _
               (s.Down <> 0 And s.Down = test.Down) OrElse _
               (s.Region <> 0 And s.Region = test.Region) Then
                
            If s.Value = test.Value Then
                Return True
            End If
        End If
    Next
    
    Return False
    End Function
  5. Item takes the given value and the given index and returns a square item containing all relevant information. It does this by calling on 3 other functions to acquire the row, column and region of the square. These other functions use simple math to determine the row, column etc.
    VB.NET
    Private Function Item(ByVal n As Integer, ByVal v As Integer) As Square
        n += 1
        Item.Across = GetAcrossFromNumber(n)
        Item.Down = GetDownFromNumber(n)
        Item.Region = GetRegionFromNumber(n)
        Item.Value = v
        Item.Index = n - 1
    End Function
    
    Private Function GetAcrossFromNumber(ByVal n As Integer) As Integer
        Dim k As Integer
        k = n Mod 9        
        If k = 0 Then Return 9 Else Return k
    End Function
    
    Private Function GetDownFromNumber(ByVal n As Integer) As Integer
        Dim k As Integer
        If GetAcrossFromNumber(n) = 9 Then
            k = n\9   
        Else
            k = n\9 + 1
        End If
        Return k
    End Function
    
    Private Function GetRegionFromNumber(ByVal n As Integer) As Integer
        Dim k As Integer
        Dim a As Integer = GetAcrossFromNumber(n)
        Dim d As Integer = GetDownFromNumber(n)
    
        If 1 <= a And a < 4 And 1 <= d And d < 4 Then
            k = 1
        ElseIf 4 <= a And a < 7 And 1 <= d And d < 4 Then
            k = 2
        ElseIf 7 <= a And a < 10 And 1 <= d And d < 4 Then
            k = 3
        ElseIf 1 <= a And a < 4 And 4 <= d And d < 7 Then
            k = 4
        ElseIf 4 <= a And a < 7 And 4 <= d And d < 7 Then
            k = 5
        ElseIf 7 <= a And a < 10 And 4 <= d And d < 7 Then
            k = 6
        ElseIf 1 <= a And a < 4 And 7 <= d And d < 10 Then
            k = 7
        ElseIf 4 <= a And a < 7 And 7 <= d And d < 10 Then
            k = 8
        ElseIf 7 <= a And a < 10 And 7 <= d And d < 10 Then
            k = 9
        End If
        Return k
    End Function

More About the Structure

If you are completely new to structures like I really was when I started this two day project then here is a brief rundown of how it works. When you create an instance of a structure like I did every time I created a 'square'. The variables defined in the structure itself are created as part of the instance like properties. They can be read and written but are crucially part of the instance, like text to a textbox. In this way, you can create a totally new and extremely useful/versatile object such as the Sudoku square. The beauty of having a list of these custom structures is the ability to access individual items and extract individual values. For instance:

VB.NET
Dim Test as Integer =  SudoGen.Sudoku.Items(0).value

The best way to learn more about them is either by reading or playing around with them a bit, until you get the hang of it. I recommend the latter, nothing beats experience.

Benefit of the List

The most beneficial part about using list (of integer) is the way it automatically adjusts itself when an item is removed. When you take an item out, like a listbox, the next item takes on its index which means when using GetRan, it's not a hit and miss affair, there's always a number available in every item.

Using the Code

For ease of use, the code discussed is compiled in a module called SudoGen. Easy to access, easy to use and easy to incorporate into countless projects.

To generate a Sudoku, it's as easy as:

VB.NET
SudoGen.GenerateGrid

Then it's simple to access the output once the Sudoku has generated. These remain until a new grid is created or the SudoGen.Clear function is called.

VB.NET
SudoGen.Sudoku 'A list(of square) : The output grid

Points of Interest

Removing the components form the module and using them within your main class may be more useful as it will allow you to do different things with the code. For example, you could keep track of the progress of the generator to display to the user.

Nothing really to do with this program, but I believe some Bacteria use backtracking to find the most direct path to a source of food, probing every direction until they find a large enough source to feed the colony, then the unused/unsuccessful parts die off for the greater good of the colony. A bit similar I think. If nature thinks it's the most effective, then methinks it is a wise move to agree.

History

  • Posted: January 26, 2008, 7:04 PM AEDST
  • Updated: January 26, 2008, 8:50 PM AEDST
  • Updated: January 27, 2008, 9:00 AM AEDST
    • Increased Sudoku creation speed from an average of 6-20 seconds, all the way down to 0.07 seconds
  • Updated: February 3, 2008, 11:20 AM AEDST
    • Added suggestions from Keith B. which increased the generator speed from 0.07 to 0.0452
  • Updated: February 18, 2009, 10:59AM AEDST
    • Added suggestions from Imperiatus and Spirch, thank you guys
    • Updated source zip and article code
    • Now runs at an average speed of 0.018

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)