In this article, I introduce how to using the genetic algorithm for parameter estimates using the code that I construct for my research.
Installing sciBASIC# package via nuget:
PM> Install-Package sciBASIC -Pre
and then add reference to these DLL modules:
- Microsoft.VisualBasic.Data.Csv.dll
- Microsoft.VisualBasic.DataMining.Framework.dll
- Microsoft.VisualBasic.Mathematical.dll
- Microsoft.VisualBasic.Mathematical.ODEsSolver.dll
- Microsoft.VisualBasic.Architecture.Framework_v3.0_22.0.76.201__8da45dcd8060cc9a.dll
- Microsoft.VisualBasic.Data.Bootstrapping.dll
Background
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover and selection.
Recently, I was working on a research of modelling the virus infection dynamic with the details of human IR, the dynamics parameter in the equation was going to estimates from the clinical data of some human patients. As the description in others' previous scientific work paper, the genetics algorithm was chosen for this parameter estimates. In this post, I will introduce how to using the genetic algorithm for the parameter estimates using the code that I construct for my research.
RK4 ODEs Solver in VisualBasic
Before the parameter estimates, I needs a ODEs solver for calculating the dynamics model that I've created in the research, and after searching on Wikipedia, here is the algorithm implementation that I used for solving an ODEs in numeric method.

y(tm+1) =y(tm) + (k1 + 2k2 +2k3 + k4) / 6 * h (1)
k1 =f( y(tm), tm) (2)
k2 =f( y(tm) + k1 / 2 * h, tm + h / 2) (3)
k3 =f( y(tm) + k2 / 2 * h, tm + h / 2) (4)
k4 =f( y(tm) + h * k3, tm + h) (5)
Here is the code that implements the Runge-Kutta method for solving an ODEs in VisualBasic, for more technical details, you can viewing the Runge-Kutta wiki page: [https://en.wikipedia.org/wiki/Runge-Kutta]
Private Sub __rungeKutta(dxn As Double,
ByRef dyn As Vector,
dh As Double,
ByRef dynext As Vector)
Call ODEs(dxn, dyn, K1)
Call ODEs(dxn + dh / 2, dyn + dh / 2 * K1, K2)
Call ODEs(dxn + dh / 2, dyn + dh / 2 * K2, K3)
Call ODEs(dxn + dh, dyn + dh * K3, K4)
dynext = dyn + (K1 + K2 + K3 + K4) * dh / 6.0
End Sub
RK4 Solver Code Usage
Here is the brief introduction of how to using the ODEs solver in VisualBasic:
- Inherits the abstract ODEs model:
Microsoft.VisualBasic.Mathematical.Calculus.ODEs
- Declaring of the y variables:
Dim <y_name> As var
- Other parameter just declared as constant or normal fields:
Const param# = value
- Specific the
y0
by using inline value assign: var = value
- Then at last, you can using
ODEs.Solve(Integer, Double, Double, Boolean) As Microsoft.VisualBasic.Mathematical.Calculus.ODEsOut
to solve your equations.
Represented below is a simple example of construct an ODEs model in VisualBasic:
Public Class Kinetics_of_influenza_A_virus_infection_in_humans : Inherits ODEs
Dim T As var
Dim I As var
Dim V As var
Const p# = 3 * 10 ^ -2
Const c# = 2
Const beta# = 8.8 * 10 ^ -6
Const delta# = 2.6
Protected Overrides Sub func(dx As Double, ByRef dy As Vector)
dy(T) = -beta * T * V
dy(I) = beta * T * V - delta * I
dy(V) = p * I - c * V
End Sub
Protected Overrides Function y0() As var()
Return {
V = 1.4 * 10 ^ -2,
T = 4 * 10 ^ 8,
I = 0
}
End Function
End Class
Running the solver and plotting the numerics result of your equations:
Dim model As New Kinetics_of_influenza_A_virus_infection_in_humans
Dim result As ODEsOut = model.Solve(10000, 0, 10)
Call result.DataFrame("#TIME") _
.Save("./Kinetics_of_influenza_A_virus_infection_in_humans.csv", Encodings.ASCII)
Dim sT = result.y("T").x.SeqIterator.ToArray(Function(i) New PointF(result.x(i), +i))
Dim sI = result.y("I").x.SeqIterator.ToArray(Function(i) New PointF(result.x(i), +i))
Dim sV = result.y("V").x.SeqIterator.ToArray(Function(i) New PointF(result.x(i), +i))
Call {
Scatter.FromPoints(sT, "red", "Susceptible Cells"),
Scatter.FromPoints(sI, "lime", "Infected Cells")
}.Plot(fill:=False) _
.SaveAs("./Kinetics_of_influenza_A_virus_infection_in_humans-TI.png")
Call {
Scatter.FromPoints(sV, "skyblue", "Virus Load")
}.Plot(fill:=False) _
.SaveAs("./Kinetics_of_influenza_A_virus_infection_in_humans-V.png")


Using the Code
GAF Core
As the [Genetic Algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm) described, the genetic algorithm should contains the individual mutation, crossover between the individual chromosomes, and the fitness sort for population selection. Then I concluded the points that may be useful in the coding of your own GAF implementation, you can implement the genetics algorithm by yourself from following these steps:
- Initialized a population with specific population size, and each chromosome in this population is the candidate solution for your problem
- Then you are going to use a
while
loop in your function for the genetic iteration, loop until the best fitness in your solution meets the threshold - As the algorithm described, each iteration, each chromosomes has the chance to mutate itself a bit, and the mutate will create a new clone of the chromosomes itself
- In each iteration, each chromosomes or its mutation have the chance to crossover with other chromosomes (random in the population, crossover means swap some elements in the chromosomes)
- At last, these new mutations join the original population to create a new population.
- Then you can calculate the fitness for each chromosome in this new population and then sort these chromosomes by their fitness.
- By taking the first population size number of chromosomes from the new population after the fitness sort step, you may get a better solution for your problem.
- Loop until the first element (best solution) in the population its fitness meets the threshold.
The core implements of the GA is available at namespace Microsoft.VisualBasic.DataMining.Darwinism.GAF.GeneticAlgorithm(Of C, T)
, for using the genetic algorithm, you must implement these interfaces for your model:
1. The GAF core needs the information to know how to calculate the fitness for your chromesome model.
Public Interface Fitness(Of C As Chromosome(Of C), T As IComparable(Of T))
Function Calculate(chromosome As C) As T
End Interface
2. The GAF core required of information to know how to mutate and crossover itself.
Public Interface Chromosome(Of C As Chromosome(Of C))
Function Crossover(anotherChromosome As C) As IList(Of C)
Function Mutate() As C
End Interface
GAF Parallel Computing
Enable the GAF parallel computing is super easy, just needs specific the Parallel
property its value to TRUE
, And then before the fitness sorts, a parallel Linq will be called to boost the entire ODEs fitness evaluation process.
Population(Of chr).Parallel As Boolean
LQuery = From x As NamedValue(Of chr)
In source.AsParallel
Let fit As T = GA._fitnessFunc.Calculate(x.x)
Select New NamedValue(Of T) With {
.Name = x.Name,
.x = fit
}
Parameter Estimates
Chromosome: ParameterVector
The chromosomes in a population can evolve better fitting to the environment due to the reason of mutation and environment selection. Here are the utility functions that can produce a bit of mutation in a chromosome and the crossover function can provide the elements swap between two chromosomes:
<Extension> Public Sub Mutate(ByRef array#(), rnd As Random)
Dim i% = rnd.Next(array.Length)
Dim n# = Math.Abs(array(i))
Dim power# = Math.Log10(n#) - 1
Dim sign% =
If(rnd.NextBoolean, 1, -1)
n += sign * (rnd.NextDouble * 10 * (10 ^ power))
If n.IsNaNImaginary Then
n = Short.MaxValue
End If
array(i) = n
End Sub
<Extension>
Public Sub Crossover(Of T)(random As Random, ByRef v1 As T(), ByRef v2 As T())
Dim index As Integer = random.Next(v1.Length - 1)
Dim tmp As T
For i As Integer = index To v1.Length - 1
tmp = v1(i)
v1(i) = v2(i)
v2(i) = tmp
Next
End Sub
As you can see on the Mutate
and Crossover
utility functions, the chromosomes were abstracted as an array, so that we need a model to translate our fitted parameter list as an array that can be proceeded by these two utility functions:
Namespace Darwinism.GAF
Public Class ParameterVector
Implements Chromosome(Of ParameterVector), ICloneable
Implements IIndividual
Public Property vars As var()
<ScriptIgnore>
Public ReadOnly Property Vector As Double()
Get
Return vars _
.Select(Function(var) var.value) _
.ToArray
End Get
End Property
Public Function Crossover(anotherChromosome As ParameterVector) _
As IList(Of ParameterVector) Implements Chromosome(Of ParameterVector).Crossover
Dim thisClone As ParameterVector = DirectCast(Clone(), ParameterVector)
Dim otherClone As ParameterVector = _
DirectCast(anotherChromosome.Clone, ParameterVector)
Dim array1#() = thisClone.Vector
Dim array2#() = otherClone.Vector
Call seeds() _
.Crossover(array1, array2)
thisClone.__setValues(array1)
otherClone.__setValues(array2)
Return {thisClone, otherClone}.ToList
End Function
Public Function Mutate() As ParameterVector _
Implements Chromosome(Of ParameterVector).Mutate
Dim m As ParameterVector = DirectCast(Clone(), ParameterVector)
Dim random As Random = seeds()
For i As Integer = 0 To 2
Dim array#() = m.Vector
Call array.Mutate(random)
Call m.__setValues(array)
Next
Return m
End Function
End Class
End Namespace
GAF Fitness
Each variable in the function of ODEs its calculated value was compared with the observation value through RMS value:

And then using the average value of these RMS value as the final fitness of the current parameter value.
Public Function Calculate(chromosome As ParameterVector) _
As Double Implements Fitness(Of ParameterVector, Double).Calculate
Dim vars As Dictionary(Of String, Double) =
chromosome _
.vars _
.ToDictionary(Function(var) var.Name,
Function(var) var.value)
Dim out As ODEsOut =
MonteCarlo.Model.RunTest(Model, y0, vars, n, a, b, ref)
Dim fit As New List(Of Double)
Dim NaN%
For Each y$ In modelVariables _
.Where(Function(v)
Return Array.IndexOf(Ignores, v) = -1
End Function)
Dim a#() = observation.y(y$).x
Dim b#() = out.y(y$).x
If log10Fitness Then
a = a.ToArray(Function(x) log10(x))
b = b.ToArray(Function(x) log10(x))
End If
NaN% = b.Where(AddressOf IsNaNImaginary).Count
fit += Math.Sqrt(FitnessHelper.Calculate(a#, b#))
Next
Dim fitness# = fit.Average
If fitness.IsNaNImaginary Then
fitness = Integer.MaxValue * 100.0R
fitness += NaN% * 10
End If
Return fitness
End Function
Public Shared Function log10(x#) As Double
If x = 0R Then
Return -1000
ElseIf x.IsNaNImaginary Then
Return Double.NaN
Else
Return Math.Sign(x) * Math.Log10(Math.Abs(x))
End If
End Function
Testing
Problem & Goal
A virus infection dynamics model of Influenza was used in this testing for the GAF method demo:



Original definition and fitting parameter values can be found in this scientific paper:
Quote:
Kinetics of Influenza A Virus Infection in Humans.
DOI: 10.1128/JVI.01623-05
ODEs Model
We want to estimate the kinetics parameters (p, c, beta and delta) in the equations using GAF method, so that we just define a ODEs model and leaves the parameter blank or assign any value, wait for the estimates, and here is the code example:
Public Class Kinetics_of_influenza_A_virus_infection_in_humans_Model : Inherits GAF.Model
Dim T As var
Dim I As var
Dim V As var
Dim p As Double = Integer.MaxValue
Dim c As Double = Integer.MaxValue
Dim beta As Double = Integer.MaxValue
Dim delta As Double = Integer.MaxValue
Protected Overrides Sub func(dx As Double, ByRef dy As Vector)
dy(T) = -beta * T * V
dy(I) = beta * T * V - delta * I
dy(V) = p * I - c * V
End Sub
End Class
Observation Data Example
For this testing demo, I using the exists model output as the biological experiment observation reference for the GAF estimates' fitness calculation. And using this method to create a fake experiment data:
Public Sub BuildFakeObservationForTest()
Dim result As ODEsOut = ODEsOut _
.LoadFromDataFrame("./Kinetics_of_influenza_A_virus_infection_in_humans.csv")
Dim sampleSize% = 100
Dim xlabels#() = result.x.Split(sampleSize).ToArray(Function(block) block.Average)
Dim samples As NamedValue(Of Double())() =
LinqAPI.Exec(Of NamedValue(Of Double())) <=
_
From y As NamedValue(Of Double())
In result.y.Values
Let sample As Double() = y.x _
.Split(sampleSize) _
.ToArray(Function(block) block.Average)
Select New NamedValue(Of Double()) With {
.Name = y.Name,
.x = sample
}
Call samples.SaveTo(
path:="./Kinetics_of_influenza_A_virus_infection_in_humans-fake-observation.csv",
xlabels:=xlabels)
End Sub
Interpolation Preprocessing
If the experiments data is too small for the GAF based parameter estimates, then you can using the curve interpolation method for increasing the data points in your sample data. There are two curve Interpolation method that available in VisualBasic: **B-spline** method and **Cubic spline** in namespace ``Microsoft.VisualBasic.Mathematical.Interpolation
``. Here is an example of Cubic spline for the experiment datas' curve Interpolation processing:
Dim samples = "./Kinetics_of_influenza_A_virus_infection_in_humans-fake-observation.csv" _
.LoadData _
.ToDictionary
Dim x As Double() = samples("X").x
Dim observations As NamedValue(Of Double())() =
LinqAPI.Exec(Of NamedValue(Of Double())) <=
_
From sample As NamedValue(Of Double())
In samples.Values
Let raw As PointF() = x _
.SeqIterator _
.ToArray(Function(xi) New PointF(+xi, y:=sample.x(xi)))
Let cubicInterplots = CubicSpline.RecalcSpline(raw, 10).ToArray
Let newData As Double() = cubicInterplots _
.ToArray(Function(pt) CDbl(pt.Y))
Select New NamedValue(Of Double()) With {
.Name = sample.Name,
.x = newData,
.Description = cubicInterplots _
.ToArray(Function(pt) pt.X) _
.GetJson
}
Call observations _
.SaveTo("./Kinetics_of_influenza_A_virus_infection_in_humans-samples.csv")
GAF Estimates
All of the method protocol are available in the namespace: ``Microsoft.VisualBasic.Data.Bootstrapping.Darwinism.GAF.Protocol
``, and using ``Fitting
`` function for invoke this GAF parameter estimates:
Public Shared Function Fitting_
(Of T As Microsoft.VisualBasic.Data.Bootstrapping.MonteCarlo.Model)(_
observation As System.Collections.Generic.IEnumerable_
(Of Microsoft.VisualBasic.ComponentModel.DataSourceModel.NamedValue(Of Double())),
x As Double(),
Optional popSize As Integer = 100,
Optional evolIterations As Integer = 2147483647,
Optional ByRef outPrint As Microsoft.VisualBasic.Language.List_
(Of Microsoft.VisualBasic.DataMining.Darwinism.GAF._
Helper.ListenerHelper.outPrint) = Nothing,
Optional threshold As Double = 0.5,
Optional log10Fit As Boolean = True,
Optional ignores As String() = Nothing,
Optional initOverrides As System.Collections.Generic.Dictionary_
(Of String, Double) = Nothing,
Optional estArgsBase As System.Collections.Generic.Dictionary_
(Of String, Double) = Nothing,
Optional isRefModel As Boolean = False,
Optional randomGenerator As Microsoft.VisualBasic.Mathematical.IRandomSeeds_
= Nothing) As Microsoft.VisualBasic.Mathematical.Calculus.var()
Here is the example code of invoke the GAF method and save the result into a csv file:
Dim prints As List(Of outPrint) = Nothing
Dim estimates As var() = observations _
.Fitting(Of Kinetics_of_influenza_A_virus_infection_in_humans_Model)(
x#:=observations.First.Description.LoadObject(Of Double()),
popSize:=1000,
outPrint:=prints)
Call prints _
.SaveTo("./Kinetics_of_influenza_A_virus_infection_in_humans-iterations.csv")
Dim result = MonteCarlo.Model.RunTest(
GetType(Kinetics_of_influenza_A_virus_infection_in_humans_Model),
observations.y0,
estimates,
10000, 0, 10)
Call result.DataFrame("#TIME") _
.Save("./Kinetics_of_influenza_A_virus_infection_in_humans-GAF_estimates.csv", _
Encodings.ASCII)
Testing on Linux and Super Computer
This demo was build with VisualStudio 2015 and has been tested successfully on a Dell 40 CPU core server (running CentOS 7) and China TianHe 1 Super Computer.

History
- 2nd November, 2016: Initial version