Click here to Skip to main content
15,886,689 members
Articles / Programming Languages / C#
Tip/Trick

Needleman Wunsch Algorithm in C#

Rate me:
Please Sign up or sign in to vote.
5.00/5 (9 votes)
23 Aug 2013CPOL1 min read 64.7K   1.8K   5   20
Sequence Alignment using Needleman Wunsch algorithm in C#

Introduction

I was interested in Needleman Wunsch Algorithm and I was searching for a good example in C#, but I had no luck. So I figured this might help someone.

Background

I wrote this based on this article.

There are three steps:

  1. Initialization step
  2. Matrix Fill step
  3. Traceback step

Let's say we have:

  • sequence #1 GAATTCAGTTA and
  • sequence #2 GGATCGA
  • and scoring scheme is assumed
    • Si,j = 2 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise
    • Si,j = -1 (mismatch score)
    • w = -2 (gap penalty)

Initialization Step

The first row and the first column of the matrix can be initially filled with 0 as below:

and the code would be as follows:

C#
int refSeqCnt = refSeq.Length + 1;
int alineSeqCnt = alignSeq.Length + 1;

int[,] scoringMatrix = new int[alineSeqCnt, refSeqCnt];

//Initialization Step - filled with 0 for the first row and the first column of matrix
for (int i = 0; i < alineSeqCnt; i++)
{
    scoringMatrix[i, 0] = 0;
}

for (int j = 0; j < refSeqCnt; j++)
{
    scoringMatrix[0, j] = 0;
}    

Matrix Fill Step

For each position, Mi,j is defined to be the maximum score at position i,j.

Mi,j = Maximum[

Mi-1,j-1 + Si,j (Match/mismatch in the diagonal),

Mi,j-1 + w (gap in sequence #1)

Mi-1,j + w (gap in sequence #2)

C#
//Matrix Fill Step
for (int i = 1; i < alineSeqCnt; i++)
{
    for (int j = 1; j < refSeqCnt; j++)
    {
        int scroeDiag = 0;
        if (refSeq.Substring(j - 1, 1) == alignSeq.Substring(i - 1, 1))
            scroeDiag = scoringMatrix[i - 1, j - 1] + 2;
        else
            scroeDiag = scoringMatrix[i - 1, j - 1] + -1;

        int scroeLeft = scoringMatrix[i, j - 1] - 2;
        int scroeUp = scoringMatrix[i - 1, j] - 2;

        int maxScore = Math.Max(Math.Max(scroeDiag, scroeLeft), scroeUp);

        scoringMatrix[i, j] = maxScore;
    }
} 

Traceback Step

After the matrix fill step, the maximum global alignment score for the two sequences is 3. The traceback will determine the actual alignment(s) that results in the maximum score.

This gives alignment of:

  • GAATTCAGTTA
  • GGA-TC-G--A
C#
//Traceback Step
char[] alineSeqArray = alignSeq.ToCharArray();
char[] refSeqArray = refSeq.ToCharArray();

string AlignmentA = string.Empty;
string AlignmentB = string.Empty;
int m = alineSeqCnt - 1;
int n = refSeqCnt - 1;
while (m > 0 && n > 0)
{
    int scroeDiag = 0;

    //Remembering that the scoring scheme is +2 for a match, -1 for a mismatch and -2 for a gap
    if (alineSeqArray[m - 1] == refSeqArray[n - 1])
        scroeDiag = 2;
    else
        scroeDiag = -1;

    if (m > 0 && n > 0 && scoringMatrix[m, n] == scoringMatrix[m - 1, n - 1] + scroeDiag)
    {
        AlignmentA = refSeqArray[n - 1] + AlignmentA;
        AlignmentB = alineSeqArray[m - 1] + AlignmentB;
        m = m - 1;
        n = n - 1;
    }
    else if (n > 0 && scoringMatrix[m, n] == scoringMatrix[m, n - 1] - 2)
    {
        AlignmentA = refSeqArray[n - 1] + AlignmentA;
        AlignmentB = "-" + AlignmentB;
        n = n - 1;
    }
    else //if (m > 0 && scoringMatrix[m, n] == scoringMatrix[m - 1, n] + -2)
    {
        AlignmentA = "-" + AlignmentA;
        AlignmentB = alineSeqArray[m - 1] + AlignmentB;
        m = m - 1;
    }
} 

I tested with 14739 bp sequence but it was a bit slow. If you have a better and faster solution, please let me know. I just wanted to share this with someone who might be interested in this algorithm in C#. You are welcome to give me any advice or suggestions.

Now I would like to research the multiple sequence alignment.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States United States
ASP.NET MVC 3/4, C#, jQuery, HTML5, CSS3, javascript, JSP, Java, MS-SQL, Oracle, DBA, Silverlight, C++

Comments and Discussions

 
QuestionConversion to Python Pin
Member 1438227419-May-19 14:34
Member 1438227419-May-19 14:34 
SuggestionGreat Start! Pin
Member 1102072226-Mar-19 15:42
Member 1102072226-Mar-19 15:42 
SuggestionFirst row and first column shouldn't be zero, only element (0,0) is. Pin
kayatao25-May-18 0:49
kayatao25-May-18 0:49 
PraiseWonderful! aligning protein sequences (Antibody Variable Regions) Pin
Member 110207229-Nov-17 12:44
Member 110207229-Nov-17 12:44 
QuestionTraceback step Pin
Member 1305469216-Mar-17 18:27
Member 1305469216-Mar-17 18:27 
Questionsimilarity score Pin
Member 100298287-Mar-15 10:21
Member 100298287-Mar-15 10:21 
Questionquestion Pin
mona mohamed9317-Nov-14 8:21
mona mohamed9317-Nov-14 8:21 
QuestionAffine Gap Pin
Member 112059244-Nov-14 4:14
Member 112059244-Nov-14 4:14 
QuestionTruncated Sequences Pin
Member 1117836826-Oct-14 0:55
Member 1117836826-Oct-14 0:55 
AnswerRe: Truncated Sequences Pin
BlackMirrh29-Oct-14 12:44
BlackMirrh29-Oct-14 12:44 
QuestionFantastic! Pin
Norman Paterson14-Jan-14 15:27
Norman Paterson14-Jan-14 15:27 
AnswerRe: Fantastic! Pin
BlackMirrh15-Jan-14 4:56
BlackMirrh15-Jan-14 4:56 
QuestionUploading a full code would be helpful Pin
Grasshopper.iics16-Aug-13 19:52
Grasshopper.iics16-Aug-13 19:52 
AnswerRe: Uploading a full code would be helpful Pin
BlackMirrh19-Aug-13 3:23
BlackMirrh19-Aug-13 3:23 
GeneralRe: Uploading a full code would be helpful Pin
Grasshopper.iics19-Aug-13 9:41
Grasshopper.iics19-Aug-13 9:41 
GeneralRe: Uploading a full code would be helpful Pin
Grasshopper.iics19-Aug-13 9:44
Grasshopper.iics19-Aug-13 9:44 
GeneralRe: Uploading a full code would be helpful Pin
BlackMirrh19-Aug-13 16:11
BlackMirrh19-Aug-13 16:11 
GeneralRe: Uploading a full code would be helpful Pin
Grasshopper.iics19-Aug-13 18:59
Grasshopper.iics19-Aug-13 18:59 
GeneralRe: Uploading a full code would be helpful Pin
William M. Nelson14-Mar-14 10:36
William M. Nelson14-Mar-14 10:36 
GeneralRe: Uploading a full code would be helpful Pin
BlackMirrh17-Mar-14 5:48
BlackMirrh17-Mar-14 5:48 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.