14,969,910 members
Articles / Programming Languages / Java / Java SE
Article
Posted 12 Mar 2017

16.1K views
16 bookmarked

# Implementation of Gauss–Newton Algorithm in Java

Rate me:
Implement Gauss-Newton algorithm in Java to solve non-linear least squares problems; i.e. to find minimum of a function.

## Introduction

Gauss-Newton algorithm is a mathematical model to solve non-linear functions. A simple non-linear function is given below:

where a1 and a2 are the unknown parameters of this function. To find these two parameters, the values of y are measured on different values of `x`; `y` can be the rate of a chemical reaction and `x` is the concentration of a chemical affecting the rate. The results may look like this:

 `x` `0.038` `0.194` `0.425` `0.626` `1.253` `2.500` `3.740` `y` `0.050` `0.127` `0.094` `0.2122` `0.2729` `0.2665` `0.3317`

Gauss-Newton algorithm is used to estimate the values of a1 and a2 using the observed data above.

This sample function and the data are taken from https://en.wikipedia.org/wiki/Gauss-Newton_algorithm.

## Background

Gauss-Newton algorithm in a few steps:

1. Create an array of unknown parameters in the function:

b = (b1, b2,...,bn)

2. Initialise the parameters and for each data point in matrix x, calculate the predicted values (y').
3. Calculate the residuals:

ri = y'i - yi

4. Find the partial differential of residuals with respect to the parameters and generate Jacobian matrix.

5. Following an iterative process, calculate the new values for the parameters using the following equation:

s is the iteration number, J is the Jacobian matrix, JT is the transpose of J.

## Using the Code

1. Given matrix x and y, the optimization starts with an initial guess of b matrix. By default, a value of `1.0` is given to all parameters in non-linear function.
2. Calculate the residuals:
Java
```public double[][] calculateResiduals(double[][] x, double[] y, double[] b) {
double[][] res = new double[y.length][1];

for (int i = 0; i < res.length; i++) {
res[i][0] = findY(x[i][0], b) - y[i];
}
return res;
}```

For each data point in x matrix, function `findY()` is called to calculate the predicted value of y. In the code, you will find this function as `abstract`:

Java
`public abstract double findY(double x, double[] b);`

The user needs to implement this function before using the optimization. For example, for the function in the introduction, the implementation will be as follows:

Java
```GaussNewton gaussNewton = new GaussNewton() {

@Override
public double findY(double x, double[] b) {
// y = (x * a1) / (a2 + x)
return (x * b[0]) / (b[1] + x);
}
};
```
3. Calculate the Jacobian matrix which is the partial derivatives of the residuals with respect to parameters in the function:
Java
```public double[][] jacob(double[] b, double[][] x, int numberOfObservations) {
int numberOfVariables = b.length;
double[][] jc = new double[numberOfObservations][numberOfVariables];

for (int i = 0; i < numberOfObservations; i++) {
for (int j = 0; j < numberOfVariables; j++) {
jc[i][j] = derivative(x[i][0], b, j);
}
}
return jc;
}
```

Function `derivative()` is called to calculate the partial derivative:

Java
```public double derivative(double x, double[] b, int bIndex) {
double[] bCopy = b.clone();
bCopy[bIndex] += alpha;
double y1 = findY(x, bCopy);
bCopy = b.clone();
bCopy[bIndex] -= alpha;
double y2 = findY(x, bCopy);
return (y1 - y2) / (2 * alpha);
}
```

This function gives a good approximate of the partial derivative; i.e., the change in `y` after a small change in the variable.

4. Perform (J JT)-1 JT r` ` operations:
Java
```public double[][] transjacob(double[][] JArray, double[][] res) throws NoSquareException {
Matrix r = new Matrix(res); // r
Matrix J = new Matrix(JArray); // J
Matrix JT = MatrixMathematics.transpose(J); // JT
Matrix JTJ = MatrixMathematics.multiply(JT, J); // JT * J
Matrix JTJ_1 = MatrixMathematics.inverse(JTJ); // (JT * J)^-1
Matrix JTJ_1JT = MatrixMathematics.multiply(JTJ_1, JT); // (JT * J)^-1 * JT
Matrix JTJ_1JTr = MatrixMathematics.multiply(JTJ_1JT, r); // (JT * J)^-1 * JT * r
return JTJ_1JTr.getValues();
}
```
5. Using the results of step 4, the new values of the parameters are calculated:
Java
`IntStream.range(0, values.length).forEach(j -> b2[j] = b2[j] - gamma * values[j][0]);`

b2 is the new b matrix. gamma is a fraction of the values coming from Jacobian. If the initial values for b are far from optimum, there is a possibility of convergence problem. Applying this simple fraction seems to solve this issue. The downside of applying this fraction is the number of iterations which will increase.

6. Having the new b matrix, repeat steps 2-5 in the next iteration. All optimisation steps are given in the function below:
Java
```public double[] optimise(double[][] x, double[] y, double[] b) throws NoSquareException {
int maxIteration = 1000;
double oldError = 100;
double precision = 1e-6;
double[] b2 = b.clone();
double gamma = .01;
for (int i = 0; i < maxIteration; i++) {
double[][] res = calculateResiduals(x, y, b2);
double error = calculateError(res);
System.out.println("Iteration : " + i + ", Error-diff: " +
(Math.abs(oldError - error)) + ", b = "+ Arrays.toString(b2));
if (Math.abs(oldError - error) <= precision) {
break;
}
oldError = error;
double[][] jacobs = jacob(b2, x, y.length);
double[][] values = transjacob(jacobs, res);
IntStream.range(0, values.length).forEach(j -> b2[j] = b2[j] - gamma * values[j][0]);
}
return b2;

}```

### Example

#### Observations

 `x` `0.038` `0.194` `0.425` `0.626` `1.253` `2.500` `3.740` `y` `0.050` `0.127` `0.094` `0.2122` `0.2729` `0.2665` `0.3317`

#### Target

Find a1 and a2 using the above data and Gauss-Newton algorithm:

Java
```@Test
public void optimiseWithInitialValueOf1() throws NoSquareException {
double[][] x = new double[7][1];
x[0][0] = 0.038;
x[1][0] = 0.194;
x[2][0] = 0.425;
x[3][0] = 0.626;
x[4][0] = 1.253;
x[5][0] = 2.500;
x[6][0] = 3.740;
double[] y = new double[] { 0.050, 0.127, 0.094, 0.2122, 0.2729, 0.2665, 0.3317 };
GaussNewton gaussNewton = new GaussNewton() {

@Override
public double findY(double x, double[] b) {
// y = (x * a1) / (a2 + x)
return (x * b[0]) / (b[1] + x);
}
};
double[] b = gaussNewton.optimise(x, y, 2);
Assert.assertArrayEquals(new double[]{0.36, 0.56}, b, 0.01);
}
```

In the above test, the initial values for the parameters are the default value `(1.0)`; however, with starting values of `100`, we will get the same values; see the attached codes. I have also tested the algorithm on a large set of randomly generated data and it works with good time and memory efficiency; see the test codes attached.

## Share

 Software Developer (Senior) Private United Kingdom
I have a PhD in computational chemistry from Newcastle University. I worked for Imperial College London as research scientist for 6.5 years followed by 7 years in banking in the City of London as senior software developer. Currently I do mathematical modelling and software development for a private company and spend some time in research and development in the University of Newcastle.