Simple Software for Optimal Control

Simple software to solve quadratic optimal control problem
This article presents simple software for solving quadratic optimal control problem for linear and nonlinear dynamic systems.


Introduction

In this article, I'd like to present a compact and simple-to-use software tool for optimal control of dynamic systems. Optimal control is a very broad topic. Its foundation was laid by the works of Richard Bellman, Lev Pontryagin, Rudolf Kalman and others in the late 1950s - 60s. Optimal control has numerous practical applications. Here, we will focus on the following classic optimal control problem. Our goal is to transfer a given dynamic system from its initial state to some final state using external control inputs. During this transition, we want the system parameters to follow their prescribed trajectories as closely as possible. This is a fundamental task of optimal control with many practical applications: for example, starting an electrical motor while limiting the initial jump of electric current to an acceptable value, promptly damping undesirable oscillations in electronic and mechanical systems, and countless sophisticated applications ranging from spacecraft control to missile interception.

Dealing with this problem, and even stating it, requires a lot of math. I will confine myself to the minimum necessary to describe the problem and understand how to use the software. Some additional recursive formulas are given in the appendix without proof.

Problem Statement

Let's formalize our problem. The dynamic system is described by the following set of differential equations:

ẋ = f(x, m, t)    (1)

where t is time, x is the state vector consisting of parameters characterizing the system behavior, ẋ is the vector of the first derivatives of x with respect to time, and m is the control vector. Normally, the dimension of the control vector does not exceed the dimension of the state vector. In this statement, vector m is independent of vector x, and may be considered as the control of an open-loop system. The open-loop control system with input control vector m and state vector x is depicted below:

Image 2
Open-Loop Control System

We’d like to obtain a control policy m(t) over the time interval between 0 and the final time tf which minimizes the cost function:

J = ∫0^tf [(r - x)ᵀ⋅Q⋅(r - x) + (u - m)ᵀ⋅Z⋅(u - m)]⋅dt    (2)

where superscript T denotes transposition of a vector or matrix, r is the desirable state vector, u is the desirable control vector, and Q and Z are weight matrices for the deviation between desirable and actual values of the state and control vectors, respectively. In most cases, Q and Z are diagonal matrices (but not necessarily!). Each of their diagonal elements defines the relative weight of the squared difference between the desirable and actual values of a parameter at a given moment of time. The matrices may be time dependent; e.g., it is common practice to increase the relative weight of the differences at the final time to ensure the desirable final state. The problem formulated here is often referred to as the "tracking problem."

Linear Dynamic Systems

Before we come to a solution, let's discuss a special case of dynamic systems. The most well-studied class is linear dynamic systems, i.e., systems whose dynamics are described by a set of linear differential equations. For such a system, equation (1) may be rewritten as follows:

ẋ = A(t)⋅x + B(t)⋅m    (3)

where A(t) and B(t) are matrices. First, we will provide a solution for this kind of system, and then extend it to nonlinear dynamic systems.

Discrete Time

Since we are going to offer a numeric solution, we will deal with the system at discrete moments of time. So we will reformulate our problem for discrete time t = k ⋅ Δt, where Δt is a small time interval (sampling interval) and k is a given time step, 0 <= k < N, tf = (N-1) ⋅ Δt. Now equation (1) may be rewritten for discrete time:

xk+1 = xk + f(xk, mk, k⋅Δt)⋅Δt    (4)

and the linear system is defined as:

xk+1 = Fk⋅xk + Hk⋅mk    (5)

where Fk and Hk are matrices obtained from A(t) and B(t) as follows:
  Fk = exp(A(k⋅Δt)⋅Δt) ≅ I + A(k⋅Δt)⋅Δt,   Hk = A⁻¹⋅[exp(A(k⋅Δt)⋅Δt) - I]⋅B ≅ B(k⋅Δt)⋅Δt,   where I is the identity matrix.
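
To make these formulas concrete, here is a small Python sketch (my own illustration, not part of the attached code; numpy and scipy are assumed, as for the Python version described later) that computes Fk and Hk for constant A and B, both exactly and with the first-order approximation:

Python
import numpy as np
from scipy.linalg import expm

def discretize_exact(A, B, dt):
    # Zero-order-hold discretization via an augmented matrix exponential.
    # This avoids inverting A, which may be singular (as in the double-integrator example below).
    n, m = B.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A
    M[:n, n:] = B
    Phi = expm(M * dt)
    return Phi[:n, :n], Phi[:n, n:]      # F = exp(A*dt), H = (integral of exp(A*s) ds) * B

def discretize_euler(A, B, dt):
    # First-order approximation used above: F ≅ I + A*dt, H ≅ B*dt
    return np.eye(A.shape[0]) + A * dt, B * dt

# Example: double integrator (the "Linear System" example below, M = 1)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
F, H = discretize_exact(A, B, dt=1.0)    # F = [[1, 1], [0, 1]], H = [[0.5], [1]]
Fa, Ha = discretize_euler(A, B, dt=1.0)  # Fa = [[1, 1], [0, 1]], Ha = [[0], [1]]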

Cost function (2) in discrete time takes the form:

J = Σk=0..N-1 [(rk - xk)ᵀ⋅Qk⋅(rk - xk) + (uk - mk)ᵀ⋅Zk⋅(uk - mk)]    (6)

Solution for the Linear Case

A numeric solution of the problem may be obtained based on Bellman's principle of optimality. The resulting formulas are quite bulky and are presented in the appendix. For general understanding, it is essential to know that the control policy mk is calculated in two major runs. First, an inverse calculation run is performed, starting at the final time step N-1 and going down to the initial time 0. The inverse run, using the system matrices F and H, the desirable state vector r, the desirable control vector u and the weight matrices Q and Z, yields a vector ck and a matrix Lk. Then the direct run is carried out from time 0 till N-1, obtaining the vector mk at each time step k as:

Image 8 (7)

Both inverse and direct calculation runs are shown in the picture below:

Image 9
Calculation scheme

Vectors cN-k and matrices LN-k are known from the inverse run and depend on the system matrices F and H, the user-defined desirable vectors and the weights. They also depend on auxiliary matrices PN-k and vectors vN-k, which participate in the inverse run and start from zero initial conditions. At each step of the direct calculation run, we obtain the control vector mk based on the state vector xk at that time, and then calculate the next state xk+1 using equation (5). This procedure is repeated until the last time moment N-1, giving us both the open-loop control policy mk and the states of the system xk vs. time. Statement (7) provides us with closed-loop control as depicted below:

Image 10
Closed-Loop Control System

where vector c constitutes the closed-loop input and matrix L provides feedback from the state vector x. These parameters allow the designer to generate the vector m based on the actual state of the system, thus making the system more stable and robust.
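
To make the two runs and the closed-loop law more tangible, here is a compact Python sketch of one standard discrete-time LQ tracking recursion of this kind. It is only an illustration under my own conventions (value function xᵀ⋅P⋅x - 2⋅vᵀ⋅x, control mk = ck - Lk⋅xk); the exact formulas, signs and indexing used in the attached code and in the appendix may differ:

Python
import numpy as np

def lq_tracking(F, H, Q, Z, r, u, x0):
    # Tracking problem: x[k+1] = F x[k] + H m[k], minimize
    # sum_k (r[k]-x[k])' Q (r[k]-x[k]) + (u[k]-m[k])' Z (u[k]-m[k]),  k = 0..N-1.
    # r has shape (N, n), u has shape (N, p); Z is assumed invertible here.
    # Time-varying F, H, Q, Z would only require indexing them by k as well.
    N = len(r)
    n, p = H.shape
    P = np.zeros((n, n))                  # auxiliary matrix, zero initial condition
    v = np.zeros(n)                       # auxiliary vector, zero initial condition
    L = np.zeros((N, p, n))               # feedback matrices
    c = np.zeros((N, p))                  # closed-loop inputs

    # Inverse run: from k = N-1 down to 0
    for k in range(N - 1, -1, -1):
        W = Z + H.T @ P @ H               # the auxiliary matrix that must be inverted
        L[k] = np.linalg.solve(W, H.T @ P @ F)
        c[k] = np.linalg.solve(W, Z @ u[k] + H.T @ v)
        Fcl = F - H @ L[k]
        v = Q @ r[k] - L[k].T @ Z @ (u[k] - c[k]) + Fcl.T @ (v - P @ H @ c[k])
        P = Q + L[k].T @ Z @ L[k] + Fcl.T @ P @ Fcl

    # Direct run: from k = 0 up to N-1
    x = np.zeros((N, n))
    m = np.zeros((N, p))
    x[0] = x0
    for k in range(N):
        m[k] = c[k] - L[k] @ x[k]         # closed-loop control law
        if k + 1 < N:
            x[k + 1] = F @ x[k] + H @ m[k]
    return x, m, c, L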

Discussion of the Solution for the Linear Case

For linear systems, the control policy mk that minimizes cost function (6) is obtained in just one iteration consisting of one inverse and one direct run. As stated above, this policy depends on the system matrices and on user-chosen parameters, namely the desirable vectors and weight matrices. These parameters are intuitive enough to reduce control policy design to a play with them. The user chooses a set of parameters and calculates the control policy and the corresponding state trajectories. If she/he is not satisfied with the result, the calculation is repeated for another set of desirable vectors and weight matrices.

It is also interesting to observe that if the system matrices F, H and the weight matrix Q are time invariant, then the feedback matrix L will also be time invariant, which considerably simplifies the design of the system's controller. In the linear case, the system matrices F, H are normally time invariant. So, to attain time-invariant feedback, the user should choose a constant weight matrix Q, i.e., control without formally enforcing the desirable final state at the final moment of time. In real life, by a proper choice of Q, we can reach a desirable final state in acceptable time, even before the final time. Although it is convenient to implement the control policy using state vector coordinates, in real life usually not all of those coordinates are available for measurement. Various techniques have been developed to overcome this difficulty: different kinds of state coordinate observers may be used to estimate unmeasured coordinates from the coordinates that are available for measurement.

Nonlinear Systems

The nonlinear system case is much more complex. Unlike linear systems, it has no general solution for all kinds of systems. Here, we can try to reduce the nonlinear optimization problem to a linear one using a quasilinearization approach. To use the same math solution, we have to present the general dynamic system description given by equation (1) in a linearized form similar to equation (5). To achieve this, we consider an iterative calculation process. We can linearize (1) to the form (5) at a particular point xk, mk of the i-th trajectory:

Image 11 (8)

where:

Image 12 (9)

The linearized system matrices F and H are matrices of the appropriate partial derivatives of the functions f at point k on the i-th trajectory. The state and control vectors on the (i+1)-th trajectory may be presented as:

xki+1 = xki + Δxki,   mki+1 = mki + Δmki    (10)

Equation (8) may be reformulated for the increments Δx and Δm. In this case, in cost function (6), the new vectors r and u correspond to:

rki = rk - xki   and   uki = uk - mki    (11)

respectively. This allows us to reduce the nonlinear problem to a sequence of quasilinear iterations, each of which is depicted in the calculation scheme figure above. The algorithm consists of the following steps:

  1. Assuming for iteration i=0 some vectors xk and mk (normally all zeros except the given initial state at k = 0, but the choice is up to the user), calculate all linearized Fki and Hki for the i-th trajectory according to (8).
  2. Calculate new rki and uki for i-th iteration according to (11).
  3. Calculate Δmki and Δxki using inverse and direct runs as for linear case.
  4. Calculate xki+1 and mki+1 for (i+1)-th iteration.
  5. Check the condition to stop iterations. This condition may be based, e.g., on the difference between the cost function value and that of the previous iteration, or may simply be a fixed number of iterations. If the condition is not satisfied, steps 1-5 are repeated until it is satisfied; after that, the policy mk and states xk may be considered the final solution.

Obtaining exact formulas for the partial derivatives of the functions f may be tedious work, and becomes impossible when the functions f are defined numerically, e.g., as a table. In these cases, it is possible to use a simple numeric approximation by giving small increments to the state and control vector coordinates and calculating the resulting values of the f functions, as sketched below. The code sample provides appropriate examples.
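
A minimal sketch of such an approximation in Python (central differences; the actual code sample may use a different scheme and step sizes):

Python
import numpy as np

def numeric_jacobians(f, x, m, delta_x=1e-4, delta_m=1e-4):
    # Approximate A = df/dx and B = df/dm at the point (x, m) by central differences.
    # x and m are numpy arrays; f(x, m) must return the vector of state derivatives.
    # This works even when f is defined only numerically, e.g., by table interpolation.
    n, p = len(x), len(m)
    A = np.zeros((n, n))
    B = np.zeros((n, p))
    for j in range(n):
        dx = np.zeros(n); dx[j] = delta_x
        A[:, j] = (np.asarray(f(x + dx, m)) - np.asarray(f(x - dx, m))) / (2 * delta_x)
    for j in range(p):
        dm = np.zeros(p); dm[j] = delta_m
        B[:, j] = (np.asarray(f(x, m + dm)) - np.asarray(f(x, m - dm))) / (2 * delta_m)
    return A, B

# The linearized discrete matrices then follow the same first-order rule as before:
# F = I + A*dt,  H = B*dt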

Method Limitations

Like any numeric approach, this one has its limitations. One limitation is rather obvious and should be considered for both the linear and nonlinear cases: the time sampling interval should be small enough to guarantee convergence of the calculations. The desirable values r, u and the weight factors Q and Z should be reasonable in order to achieve intuitively optimal system behavior. In the case of nonlinear systems, we should clearly understand that this is a "game without strict rules," and thus there is no guarantee that the technique is applicable to a particular case. The method is sensitive to the accuracy of the calculated linearized system matrices. More complex methods may be used to estimate the applicability of this approach to a particular nonlinear system, but a trial-and-error approach seems more practical for engineers. Again, in some ways dealing with nonlinear systems is still "art rather than science."

Constant Feedback

As in the linear case, control with constant feedback may be considered for nonlinear systems. To obtain constant feedback in the nonlinear case, the weight matrices Q, Z and the initial trajectories for state x and control m should be time invariant (except the initial state x0), and an acceptable result should be achieved in one iteration (since the system matrices F and H calculated according to (9) depend on the previous iteration's trajectory). Normally, engineers adopt a rather informal approach to system optimization, where delivering the minimum of our cost function is not the ultimate goal but a means to attain acceptable system behavior. So it may be wise in some cases to sacrifice further cost function minimization for the sake of feedback simplicity. It is also possible that in some cases constant feedback may to some extent compensate for the nonlinearity. Of course, as with everything in nonlinear systems, this is not possible in all cases. An appropriate numeric example is discussed below.

Usage of Software

.NET

The software is written in plain C# without any additional libraries. In the last update, it has been converted to .NET 5 (the DLLs target .NET Standard 2.0). The .NET demo can be run in Windows and Linux alike with the command

dotnet Test.dll

Project MatrixCalc provides arithmetic operations with vectors and matrices, including transposition and inversion. Project SequentialQuadratic supports the calculations for system optimization and simulation. Project Test uses static methods and interfaces of SequentialQuadratic to solve particular problems. The SequentialQuadratic DLL has several internal types responsible for different optimization problems (linear with constant weights and desirable values, linear with time-varying weights and desirable values, and the corresponding nonlinear ones). Instances of these types are created with overloaded static methods SQ.Create() and exposed to the caller via a set of appropriate interfaces: IBase, IBaseLQExt, ITrackingLQ and INonLinearQ. In order to perform the optimization procedure, the user should call the appropriate method, providing input and getting the required interface. Then the method RunOptimization() should be called to actually carry out the optimization calculations, followed by the method Output2Csv() to write the result in comma-separated format to an Excel-compatible *.csv file. To simulate the system with a given control policy, the method Simulate() is called. A sample of the code for optimization of a nonlinear system (Van der Pol oscillator) is shown below:

C#
public static void VanDerPol()
{
	const string TITLE = "Van der Pol";
	Console.WriteLine($"Begin \"{TITLE}\"");

	var lstFormatting = new List<OutputConfig> 
	{
		new OutputConfig { outputType = OutputType.Control, index = 0, title="m0",
		                      scaleFactor=1 },
		new OutputConfig { outputType = OutputType.State, index = 0, title="x0" },
		new OutputConfig { outputType = OutputType.State, index = 1, title="x1" },
	};

	const int dimensionX = 2;
	const int dimensionM = 1;

	const double dt = 0.1;
	const int N = 51;

	const int maxIterations = 17;

	var r = new Vector(dimensionX) { [0] = 0, [1] = 0 };
	var u = new Vector(dimensionM) { [0] = 0 };
	var Q = new Matrix(dimensionX, dimensionX) { [0, 0] = 1, [1, 1] = 1 };
	var Z = new Matrix(dimensionM, dimensionM) { [0, 0] = 1 };

	var xInit = new Vector(dimensionX) { [0] = 1, [1] = 0 };

	var mu = 1;

	var functions = new CalcDelegate[]
	{
		(k, deltaT, m, x) => x[1],
		(k, deltaT, m, x) => mu * (1 - Sq(x[0])) * x[1] - x[0] + m[0]
	};

	// Exact formulas for gradients calculation
	var gradientsA = new CalcDelegate[dimensionX, dimensionX];
	gradientsA[0, 0] = (k, deltaT, m, x) => 0;
	gradientsA[0, 1] = (k, deltaT, m, x) => 1;
	gradientsA[1, 0] = (k, deltaT, m, x) => -(2 * mu * x[0] * x[1] + 1);
	gradientsA[1, 1] = (k, deltaT, m, x) => mu * (1 - Sq(x[0]));

	var gradientsB = new CalcDelegate[dimensionX, dimensionM];
	gradientsB[0, 0] = (k, deltaT, m, x) => 0;
	gradientsB[1, 0] = (k, deltaT, m, x) => 1;

	(SQ.Create(functions, gradientsA, gradientsB, Q, Z, r, u, xInit, dt, N,
				(currCost, prevCost, iteration) => iteration > maxIterations)
		.RunOptimization()
		.Output2Csv("VanDerPol_ExactGrads", lstFormatting) as INonLinearQ)
		.OutputExecutionDetails(isExactGradient: true);

	// Numeric gradients calculation
	var delta = 0.1;
	(SQ.Create(functions, new Vector(dimensionM, delta), new Vector(dimensionX, delta),
						 Q, Z, r, u, xInit, dt, N,
						 (currCost, prevCost, iteration) => iteration > maxIterations)
		.RunOptimization()
		.Output2Csv("VanDerPol_NumGrads", lstFormatting) as INonLinearQ)
		.OutputExecutionDetails(isExactGradient: false);

	Console.WriteLine($"End \"{TITLE}\"{Environment.NewLine}");
}

Python

I provide a slightly simplified version of the code written in Python 3. Currently, it contains one linear and one nonlinear example. This code runs in both Windows and Linux (I tested it under Ubuntu 20.04). To run the code, you need to install the latest version of Python 3 with its numpy and scipy packages, unzip the source code and start the file Program.py with the

python3 Program.py

command.

 

The Python version supports visualization of the transients. To activate this feature, please install the matplotlib package and remove all the #1 comments in the file Program.py.

Numeric Examples

Linear System

Consider a mechanical system: a mass M moving under a force. According to Newton's second law, the force is equal to the product of the mass and the acceleration, which is the second derivative of the displacement. Let's denote the displacement as x0, the velocity as x1 and the force as m0. The system is described by the following second-order matrix differential equation:

ẋ = Ax + Bm,  where:

A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},   B = \begin{bmatrix} 0 \\ 1/M \end{bmatrix},   M = 1,   xk=0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

Cost function is  ∫0^40 [(10 - x0)² + 3⋅x1²]⋅dt

Q = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix},   Z = \begin{bmatrix} 0 \end{bmatrix},   r = \begin{bmatrix} 10 \\ 0 \end{bmatrix},   u = \begin{bmatrix} 0 \end{bmatrix}

Δt = 1, N = 41.

Image 15
Transients of the linear system
The closed-loop input and feedback factors obtained as a result of the calculations are:
input 4.48,
feedbacks: from x0 -0.448 and from x1 -1.22.
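
As a quick sanity check (my own snippet, not part of the sample code), simulating the Euler-discretized system under this constant control law indeed drives the displacement to the desired value of 10 with zero final velocity:

Python
import numpy as np

# Euler-discretized double integrator (M = 1, Δt = 1) under the reported constant law
# m = 4.48 - 0.448*x0 - 1.22*x1
dt = 1.0
x = np.array([0.0, 0.0])                  # [displacement, velocity]
for k in range(40):
    m = 4.48 - 0.448 * x[0] - 1.22 * x[1]
    x = x + dt * np.array([x[1], m])
print(x)                                  # approaches [10, 0]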

Van der Pol Oscillator

This is a nonlinear system. Our goal is smooth and fast damping of the ongoing oscillations.

0 = x1
1 = μ⋅x1⋅(1 - x02) - x0 + m0,    μ=1

xk=0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}

Cost function is  ∫0^5 (x0² + x1² + m0²)⋅dt

Q = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},   Z = \begin{bmatrix} 1 \end{bmatrix},   r = \begin{bmatrix} 0 \\ 0 \end{bmatrix},   u = \begin{bmatrix} 0 \end{bmatrix}

Δt = 0.1, N = 51.

Image 16
Transients of the controlled Van der Pol oscillator after 17 iterations

In the above case, with the chosen desirable state and control vectors and weight matrices, we observe good convergence of the iterative calculations.
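
For readers using the Python version, the same dynamics and exact gradients as in the C# delegates above can be written as plain functions (illustrative only; the actual Program.py may organize this differently):

Python
import numpy as np

mu = 1.0

def f(x, m):
    # Van der Pol oscillator with scalar control m[0]
    return np.array([x[1],
                     mu * (1.0 - x[0] ** 2) * x[1] - x[0] + m[0]])

def dfdx(x, m):
    # Exact Jacobian of f with respect to the state (mirrors gradientsA in the C# sample)
    return np.array([[0.0, 1.0],
                     [-(2.0 * mu * x[0] * x[1] + 1.0), mu * (1.0 - x[0] ** 2)]])

def dfdm(x, m):
    # Exact Jacobian of f with respect to the control (mirrors gradientsB in the C# sample)
    return np.array([[0.0],
                     [1.0]])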

Rayleigh Equation with Scalar Control

This is another nonlinear oscillating system. Our goal is the same as in the previous case: smooth and fast damping of the ongoing oscillations.

0 = x1
1 = -x0 + (1.4 - 0.14⋅x12)⋅x1 + 4⋅m0

xk=0 = \begin{bmatrix} -5 \\ -5 \end{bmatrix}

Cost function is  ∫0^2.5 (x0² + m0²)⋅dt

Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},   Z = \begin{bmatrix} 1 \end{bmatrix},   r = \begin{bmatrix} 0 \\ 0 \end{bmatrix},   u = \begin{bmatrix} 0 \end{bmatrix}

Δt = 0.025, N = 101.

Image 17
Transients after 4 iterations

In this case, even the first iteration provides an acceptable result. However, the value of the cost function does not decrease steadily for the chosen optimization parameters. The picture above presents the transients after 4 iterations.

Nonlinear System with Constant Feedback

This is a nonlinear system that can be adequately transferred from its given initial state to the zero state using constant feedback.

0 = x1
1 = -x0⋅[π/2 + artan(5⋅x0)] - 5⋅x02 / [2⋅(1 + 25⋅x02)] + 4⋅x1 + 3⋅m0

xk=0 = \begin{bmatrix} 3 \\ -2 \end{bmatrix}

Cost function is  ∫0^10 (5⋅x1² + m0²)⋅dt

Q = \begin{bmatrix} 0 & 0 \\ 0 & 5 \end{bmatrix},   Z = \begin{bmatrix} 1 \end{bmatrix},   r = \begin{bmatrix} 0 \\ 0 \end{bmatrix},   u = \begin{bmatrix} 0 \end{bmatrix}


Δt = 0.1, N = 101.

Image 18
Transients for a single iteration
Image 19
m0   vs.  x1  for a single iteration

The above figures depict the transients calculated with our software for the chosen weight matrices after a single iteration. The last figure shows the relation between the controlled state coordinate x1 and the control m0 for our calculations.

Two-Link Robotic Manipulator

Application of the described technique to optimal control of a two-link robotic manipulator is presented in this post.

More numeric examples are provided in the code sample.

Conclusions

The article briefly explains a numeric solution of the sequential quadratic optimal control problem for linear and nonlinear systems. A quasilinearization approach is used to reduce the nonlinear problem to a sequence of linear-quadratic problems for the linearized system. Compact and simple-to-use software is provided along with examples of its usage.

Appendix

The parameters c and L in recursive equation (7) for the calculation of the control vector m are obtained in the inverse run from the dynamic programming (Bellman) equation (the format of this article does not permit providing the appropriate math transformations) as the following set of recursive operations:

Image 20

where P and v are a recursively calculated auxiliary matrix and vector, respectively; k starts from k = N - 1 with P0 = 0, v0 = 0.

Note that the above calculations require inverting an auxiliary matrix W that depends on the transfer matrix H, the weight matrices and the matrix P.
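
For reference, one standard form of such recursions, obtained from the Bellman equation for problem (5), (6) (written here in forward-time indexing with PN = 0, vN = 0; the notation, signs and indexing of the original formulas may differ), is:

  Wk = Zk + Hkᵀ⋅Pk+1⋅Hk,   Lk = Wk⁻¹⋅Hkᵀ⋅Pk+1⋅Fk,   ck = Wk⁻¹⋅(Zk⋅uk + Hkᵀ⋅vk+1),
  Pk = Qk + Lkᵀ⋅Zk⋅Lk + (Fk - Hk⋅Lk)ᵀ⋅Pk+1⋅(Fk - Hk⋅Lk),
  vk = Qk⋅rk - Lkᵀ⋅Zk⋅(uk - ck) + (Fk - Hk⋅Lk)ᵀ⋅(vk+1 - Pk+1⋅Hk⋅ck),

with the control then given by mk = ck - Lk⋅xk.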

History

  • 9th January, 2015: Initial version
  • 9th June, 2021: Update

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Igor Ladnik
Software Developer (Senior)
Israel Israel


  • Nov 2010: Code Project Contests - Windows Azure Apps - Winner
  • Feb 2011: Code Project Contests - Windows Azure Apps - Grand Prize Winner


