Augmented Lagrange Multiplier Method to Solve Quadratic Programming Problems in Standard Form: A Neural Network Approach
Dr. W. Abdul Hameed, Dr. P. Rajendran*
School of Advanced Sciences, VIT University, Vellore-632 014, Tamil Nadu, India
*Corresponding author's email: hameedvellore@yahoo.co.in, prajendran@vit.ac.in
ABSTRACT:
In this paper, we present a neural network for solving quadratic programming problems in standard form in real time by means of the augmented Lagrange multiplier method. It is shown that the proposed neural network is stable in the sense of Lyapunov and converges to an exact optimal solution of the original problem. The validity and transient behavior of the proposed neural network are demonstrated by simulation results obtained using MATLAB software.
KEYWORDS: Quadratic Programming Problems, augmented Lagrange multiplier method, Neural Network, MATLAB.
INTRODUCTION:
The quadratic programming (QP) problem is a special case of the nonlinear optimization problem in which a quadratic objective function is optimized subject to linear constraints. However, owing to its importance and frequent occurrence in practice, it is regarded as a separate problem. In this paper, we consider a neural network approach to solving QP problems in standard form, defined by linear equality constraints and non-negative variables.
These problems arise in a wide variety of scientific and engineering fields, including regression analysis, function approximation, signal processing, image restoration, parameter estimation, filter design and robot control, and hence a real-time solution is often desired. However, traditional numerical methods may not be efficient, since the computing time required for a solution depends greatly on the dimension and structure of the problem and on the complexity of the algorithm used.
One promising approach to handling optimization problems with high dimension and dense structure is to employ an artificial neural network based on a circuit implementation. The neural network approach can solve optimization problems in running times orders of magnitude faster than conventional optimization algorithms executed on general-purpose digital computers.
The structure of this paper is as follows. In Section 1, the quadratic form and the augmented Lagrange multiplier method are explained. In Section 2, the basic concepts, approach, algorithms and architecture of the neural networks are presented. In Section 3, simulation results using MATLAB software for QP problems in standard form are given. Finally, concluding remarks are drawn in Section 4.
QUADRATIC FORM AND AUGMENTED LAGRANGE MULTIPLIER METHOD
Quadratic form
A quadratic form is the part of the objective function that is optimized in all QP problems. A function defined as
f(x) = Σ (i=1 to n) Σ (j=1 to n) qij xi xj
is called a quadratic form of a vector x Є Rnx1. It can be rewritten in the more compact form
f(x) = xT Q x, where Q Є Rnxn.    (1)
Recognizing that for every x and Q, the product xT Q x is a scalar quantity, we have
xT Q x = (xT Q x)T = xT QT x, and therefore,
xT Q x = (xT Q x + xT QT x) / 2 = xT [(Q + QT) / 2] x = xT Θ x, where Θ = (Q + QT) / 2.
That is, matrix Θ is a symmetric matrix with elements given by (qij + qji) / 2, where qij and qji are elements of Q. This shows that for every matrix Q, an equivalent quadratic form can be formulated with a symmetric coefficient matrix Θ. For that reason, without any loss of generality, we can assume that Q is a real symmetric matrix. A quadratic form is called positive definite if xT Q x > 0 for every nonzero x Є Rnx1. A quadratic form is called positive semi-definite if xT Q x ≥ 0 for all x Є Rnx1 and there exists at least one vector x ≠ 0 for which xT Q x = 0.
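The symmetrization step above is easy to check numerically. The following NumPy sketch (illustrative matrix and vector values only) verifies that Q and Θ = (Q + QT) / 2 produce the same quadratic form:

```python
import numpy as np

# A non-symmetric coefficient matrix (hypothetical example values).
Q = np.array([[1.0, 3.0],
              [1.0, 2.0]])

# Equivalent symmetric matrix: Theta = (Q + Q^T) / 2.
Theta = (Q + Q.T) / 2.0

x = np.array([2.0, -1.0])

# Both matrices yield the same scalar value x^T Q x.
f_Q = x @ Q @ x
f_Theta = x @ Theta @ x
```

Since the two values agree for every x, replacing Q by Θ loses nothing, which is why symmetry of Q can be assumed without loss of generality.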
A fundamental mathematical concept related to nonlinear programming is that of a convex function. A function f(x) defined over a convex set D in Rnx1 is said to be convex if for any two points x1 and x2 in D and for any 0 ≤ λ ≤ 1, we have f[λx1 + (1-λ)x2] ≤ λf(x1) + (1-λ)f(x2). If the function is convex, its value at any point of the segment joining x1 and x2 lies on or below the chord connecting the points (x1, f(x1)) and (x2, f(x2)). The following theorem presents an important property of convex functions.
Theorem (1)
If f(x) is convex on a convex set D, then f(x) has at most one local minimum. If there is such a minimum, it is a global minimum and is attained on the convex set D. In essence, the theorem states that if the energy function is convex, the problem of local minima does not exist and gradient-based minimization is guaranteed to terminate at a point of the global minimum. The quadratic form in (1) is a common part of many energy functions used in quadratic programming.
Theorem (2)
Let x Є Rnx1 and let f(x) = xT Q x be a quadratic form defined on a convex set D ⊂ Rnx1. If matrix Q is positive semi-definite, the quadratic form f(x) is convex on the set D. This theorem shows that a positive semi-definite quadratic form is a convex function. Therefore, according to Theorem (1), it has a unique minimum that can easily be found by using a gradient-based iterative technique. Hence we assume Q to be a symmetric positive semi-definite matrix when solving a quadratic programming problem using neural network techniques. However, if the matrix Q is indefinite, the quadratic form may have multiple local minima. From a practical standpoint, this is a rarely encountered limitation, since the vast majority of quadratic programming problems can be formulated with Q positive semi-definite.
Augmented Lagrange Multiplier Method
The augmented Lagrange multiplier method is one of the most effective general approaches to solving quadratic programming problems. It was derived independently by Hestenes [5] and Powell [8]. Mathematically, the constrained optimization problem can be formulated as follows:
Minimize
f(x) = f (x1,x2, …., xn)
subject to Ax = b and x1,x2, …., xn ≥ 0,
where x Є Rnx1, A Є Rmxn, b Є Rmx1. The ordinary Lagrange multiplier method transforms the above problem into an unconstrained optimization problem. It is formulated by appending the constraints to the objective function with Lagrange multipliers used as searching factors. The new objective function is called the Lagrangian and is of the form
L(x, λ) = f(x) + λT (Ax-b)
where x Є Rnx1 and λ = [λ1, λ2, ….., λm]T Є Rmx1. The augmented Lagrangian is formed by adding an extra penalty term. The most popular form of the augmented Lagrangian, given by Gill, Murray and Wright [3], is
LA(x, λ, K) = f(x) + λT(Ax - b) + (1/2)(Ax - b)T diag(K) (Ax - b)
where λ = [λ1, λ2, ….., λm]T Є Rmx1 denotes the vector of Lagrange multipliers, K = [k1, k2, ……, km]T Є Rmx1 is the vector of positive penalty parameters, and diag(K) is the diagonal matrix with the ki on its diagonal. After this, the optimal solution is found as the unconstrained minimum of LA. This can be done by using any unconstrained optimization technique. In the simplest approach, we can use steepest descent, where the minimization of the Lagrangian is converted to a system of difference equations of the form
x(k+1) = x(k) - μx [∂LA(x, λ, K)/∂x] (2)
λ(k+1) = λ(k) + μλ [∂LA(x, λ, K)/∂λ] (3)
where μx, μλ > 0 are the learning rate parameters. There are several important properties of the augmented Lagrange multiplier method that need to be taken into consideration.
(i) Local-minimum property: The augmented Lagrange multiplier method guarantees convergence to the local minimum of the augmented Lagrangian. The local minimum of the augmented Lagrangian converges to the constrained minimum of the objective function only in the limiting case when the penalty parameter in K is sufficiently large.
(ii) Choice of the penalty parameter: In general, the penalty parameters need to be chosen so that the Hessian matrix of the augmented Lagrangian, defined as H = ∇x² LA(x, λ, K) = ∂²LA/∂x², is positive definite. If the values of the penalty parameters are too small, the algorithm may fail to converge, or it may converge to a value that is a local minimum of the augmented Lagrangian but does not minimize the objective function itself. On the other hand, if the parameters are chosen too large, the algorithm may exhibit oscillatory behavior in the vicinity of the solution.
(iii) Convergence of the Lagrange Multipliers: For the algorithm in (2) and (3), to find the optimal solution it is necessary that both x and λ converge to their optimal values x* and λ*. In some cases the augmented Lagrangian can be very sensitive to the values for the multipliers, and it may take a considerable number of iterations before convergence is achieved.
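For a quadratic objective f(x) = cTx + (1/2)xTQx with equality constraints Ax = b and a single scalar penalty K, the Hessian in property (ii) reduces to H = Q + K ATA. The following NumPy sketch (with hypothetical data chosen so that Q is indefinite) illustrates the point: H is indefinite for K = 0 but becomes positive definite once K is large enough.

```python
import numpy as np

# Hypothetical data: a negative definite Q that the penalty term must fix.
Q = np.array([[-2.0, 0.0],
              [ 0.0, -2.0]])
A = np.array([[4.0, 3.0],
              [3.0, 5.0]])

def hessian(Q, A, K):
    # Hessian of the augmented Lagrangian of a quadratic objective
    # with equality constraints Ax = b and scalar penalty K.
    return Q + K * A.T @ A

# With K = 0 the smallest eigenvalue is negative; with K = 2 all
# eigenvalues are positive, so H is positive definite.
eig_unpenalized = np.linalg.eigvalsh(hessian(Q, A, 0.0))
eig_penalized = np.linalg.eigvalsh(hessian(Q, A, 2.0))
```

Checking the eigenvalues of H in this way is a simple practical test for whether a chosen penalty parameter is large enough.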
BASIC CONCEPTS AND NEURAL NETWORK APPROACH
Basic Concepts
Neural networks represent a computational (e.g., Marks [6]) approach to intelligence, as contrasted with the traditional, more symbolic approach. The idea of such systems is due to the work of the psychologist D. Hebb (e.g., Hebb [4]), after whom a class of learning techniques is referred to as Hebbian. Despite the pioneering early work of McCulloch and Pitts (e.g., McCulloch and Pitts [7]) and Rosenblatt (e.g., Rosenblatt [9]), the field was largely ignored through most of the 1960s and 1970s, with researchers in artificial intelligence (AI) mostly concentrating on symbolic techniques. Reasons for this include the lack of appropriate computational hardware and the work of Minsky and Papert, which showed the limitations of a class of neural networks (single-layer perceptrons) popular at the time. The development of very large-scale integration (VLSI) and parallel computing revived interest in neural networks in the mid-1980s as an alternative mechanism to investigate, understand, and duplicate intelligence. In the past decade, there has been phenomenal growth in this field. Some researchers view neural networks as a mechanism to study intelligence, but most literature in the area sees neural networks as a tool to solve problems in science and engineering. Most of these problems involve optimizing the solutions.
Neural Network Approach
The basic approach for solving problems using neural networks involves four phases. The first phase consists of constructing an appropriate error cost function for the particular type of problem to be solved. The error cost function is based on a defined error variable (or variables), which is typically formulated from a functional network for the particular problem. Thus, the problem, in general, is represented by a structured multilayer neural network (e.g., Wang and Mendel [11]).
The second phase is an optimization step which involves deriving the appropriate learning rule for the structured neural network using the defined error cost function from step (1) above. This typically involves deriving the learning rule in its batch (or vector-matrix) form. Once the vector-matrix form for the learning rule is derived, the scalar form can be formulated in a relatively straightforward manner.
The third phase involves the training of the neural network using the learning rule developed in step (2) above to match some set of desired patterns, that is, input/output signal pairs. Therefore, the network is essentially optimized to minimize the associated error cost function. That is, the training phase involves adjusting the network’s synaptic weights according to the derived learning rule in order to minimize the associated error cost function.
The fourth and final phase is the application phase, in which the appropriate output signals are collected from the structured neural network for a particular set of inputs to solve a specific problem.
Neural Network for the QP Problems in Standard Form
The standard form of the QP problem is as follows:
Minimize f(x) = cTx + (1/2) xTQx
subject to Ax = b and x1, x2, …., xn ≥ 0,
where x Є Rnx1, A Є Rmxn, b Є Rmx1, Q Є Rnxn, m < n, and Q is assumed to be a symmetric positive definite matrix. In order for a neural network approach to be used, a convenient energy function needs to be defined. Using the augmented Lagrange multiplier method, we can define an energy function as (e.g., Cichocki and Unbehauen [1])
E (x, λ) = cTx + (1/2) xTQx + λT(Ax-b) + (K/2) (Ax-b)T (Ax-b) (4)
where λ = [ λ1 , λ2 , ….., λm]T Є R mx1 and K ≥ 0 is the penalty parameter. Applying a gradient method, we can form the network update equations as
x(k+1) = x(k) - μ [∇xE(x, λ)]
λ(k+1) = λ(k) + η [∇λE(x, λ)]
where μ, η > 0 are the learning rate parameters. Taking the partial derivatives of the energy function (4) with respect to x and λ yields
∇xE(x, λ) = (∂/∂x) [E(x, λ)]
= (∂/∂x) [cTx + (1/2)xTQx + λT(Ax - b) + (K/2)(Ax - b)T(Ax - b)]
= (∂/∂x) [cTx + (1/2)xTQx + λTAx - λTb + (K/2)(xTATAx - xTATb - bTAx + bTb)]
= (∂/∂x) [cTx + (1/2)xTQx + (ATλ)Tx - λTb + (K/2)(xTATAx - xTATb - (ATb)Tx + bTb)]
= c + (1/2)(2Qx) + ATλ + (K/2)(2ATAx - ATb - ATb)
= c + Qx + ATλ + K(ATAx - ATb)
= c + Qx + ATλ + K AT(Ax - b)
∇λE(x, λ) = (∂/∂λ) [cTx + (1/2)xTQx + λT(Ax - b) + (K/2)(Ax - b)T(Ax - b)]
= Ax - b
After we determine the gradient, the resulting network update equations are
x(k+1) = x(k) - μ [c + Q x(k) + ATλ(k) + K AT{Ax(k) - b}]
λ(k+1) = λ(k) + η [Ax(k) - b]
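The update equations above translate directly into a short iteration. The following Python/NumPy function is a minimal sketch (the function name, default parameters, and iteration count are our own choices; the paper's simulations use MATLAB):

```python
import numpy as np

def qp_neural_network(Q, c, A, b, mu=0.01, eta=0.01, K=0.0, iters=5000):
    """Discrete-time network updates for the standard-form QP:
    minimize c^T x + (1/2) x^T Q x subject to Ax = b."""
    x = np.zeros(Q.shape[0])       # zero initial conditions for x
    lam = np.zeros(A.shape[0])     # and for the Lagrange multipliers
    for _ in range(iters):
        # x(k+1) = x(k) - mu [c + Q x + A^T lam + K A^T (Ax - b)]
        x = x - mu * (c + Q @ x + A.T @ lam + K * A.T @ (A @ x - b))
        # lam(k+1) = lam(k) + eta (Ax - b)
        lam = lam + eta * (A @ x - b)
    return x, lam
```

Each iteration uses only additions and matrix-vector multiplications, which is exactly the property the concluding section cites as an advantage of the neural algorithm.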
SIMULATION
In order to verify the feasibility and efficiency of the above discrete-time neural networks, two examples are carried out using MATLAB software (e.g., Rudra Pratap [10] ) to solve the QP problems in standard form.
Example-1
Consider the following QP problem in standard form taken from Gass [2]
Maximize f(x) = 4x1 + 6x2 - 2x1² - 2x1x2 - 2x2²
subject to x1 + 2x2 = 2 and x1, x2 ≥ 0.
Converting the maximization into the minimization of -f(x), we have A = [1, 2]; b = [2]; c = [-4 ; -6]; Q = [4 2 ; 2 4]. Zero initial conditions are assumed for x and for the Lagrange multiplier λ. The learning rate parameters are chosen as μ = 0.01 and η = 0.01 and the penalty parameter as K = 0. Figure (1) shows the trajectories of the independent variables. As can be seen, the neural network converges after about 400 iterations. The optimal solution of the QP problem is x = [0.3333, 0.8333] with f(x) = 4.1666, which is the same as the solution obtained by the classical Lagrangian method.
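The simulation in the paper was carried out in MATLAB; an equivalent NumPy sketch of Example 1 (with a generous iteration count in place of the reported 400) is:

```python
import numpy as np

# Data for Example 1. The maximization is converted to a
# minimization, hence c = [-4, -6]; K = 0 as in the text.
Q = np.array([[4.0, 2.0], [2.0, 4.0]])
c = np.array([-4.0, -6.0])
A = np.array([[1.0, 2.0]])
b = np.array([2.0])

mu = eta = 0.01          # learning rates
x = np.zeros(2)          # zero initial conditions
lam = np.zeros(1)

for _ in range(5000):
    x = x - mu * (c + Q @ x + A.T @ lam)   # K = 0: no penalty term
    lam = lam + eta * (A @ x - b)
```

The iterate settles at x ≈ [1/3, 5/6], matching the optimal solution quoted above.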
Example-2
Consider the following QP problem in standard form taken from Gass [2]
Minimize f(x) = 4x1 + 9x2 - x1² - x2²
subject to 4x1 + 3x2 = 15; 3x1 + 5x2 = 14; and x1, x2 ≥ 0.
Here A = [4 3 ; 3 5]; b = [15 ; 14]; c = [4 ; 9]; Q = [-2 0 ; 0 -2]. Zero initial conditions are assumed for x and for the Lagrange multiplier λ. The learning rate parameters are chosen as μ = 0.01 and η = 0.01 and the penalty parameter as K = 2. Figure (2) shows the trajectories of the independent variables. As can be seen, the neural network converges after about 400 iterations. The optimal solution of the QP problem is x = [3, 1] with f(x) = 11, which is the same as the solution obtained by the classical Lagrangian method.
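Example 2 can be reproduced the same way; here the penalty term with K = 2 is essential, because Q itself is negative definite. A NumPy sketch (MATLAB was used in the paper; the iteration count is our own choice):

```python
import numpy as np

# Data for Example 2. Q is negative definite, so the penalty
# term with K = 2 is what makes the iteration stable.
Q = np.array([[-2.0, 0.0], [0.0, -2.0]])
c = np.array([4.0, 9.0])
A = np.array([[4.0, 3.0], [3.0, 5.0]])
b = np.array([15.0, 14.0])

mu = eta = 0.01
K = 2.0
x = np.zeros(2)
lam = np.zeros(2)

for _ in range(5000):
    x = x - mu * (c + Q @ x + A.T @ lam + K * A.T @ (A @ x - b))
    lam = lam + eta * (A @ x - b)
```

The iterate settles at x ≈ [3, 1], the unique point satisfying both equality constraints.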
Figure (1): Trajectories of the independent variables for the QP problem in Example (1).
Figure (2): Trajectories of the independent variables for the QP problem in Example (2).
CONCLUSION:
In this paper, we have shown how the neural network technique is applied to solve quadratic programming problems in standard form. We have presented graphs of the trajectories of the independent variables to illustrate the updating process of the neural network architecture.
The neural algorithm proposed in this paper has several advantages over conventional parallel algorithms. First, the neural algorithms are much simpler: the only algebraic operations required are addition and multiplication; matrix inversion and other complicated operations are not needed. Second, the neural algorithms offer a higher degree of parallelism than conventional parallel algorithms. Third, the neural algorithms are simple to implement.
REFERENCES:
[1] Cichocki A. and Unbehauen R., Neural Networks for Optimization and Signal Processing, Chichester, England: Wiley (1993).
[2] Gass S.I., Linear Programming, Methods and Applications, 5th ed., New York: McGraw-Hill (1984).
[3] Gill P.E., Murray W., and Wright M.H., Practical Optimization, London: Academic (1981).
[4] Hebb D.O., The Organization of Behavior: A Neuropsychological Theory, New York: Wiley (1949).
[5] Hestenes M.R., Multiplier and Gradient Methods, Journal of Optimization Theory and Applications, (1969), Vol. 4, pp. 303-320.
[6] Marks R.J., Intelligence: Computational versus artificial, IEEE Trans. Neural Networks, (1993), vol. 4.
[7] McCulloch W.S., and Pitts W., A logical calculus of ideas immanent in nervous activity, Bull. Math. Biophys., (1943) vol. 5, pp. 115-133.
[8] Powell M.J.D., A Method for Nonlinear Constraints in Minimization Problems, In Optimization, ed. Fletcher R., London: Academic, (1969), pp. 283-298.
[9] Rosenblatt F., Principles of Neuro-dynamics. New York: Spartan (1962).
[10] Rudra Pratap, Getting Started with MATLAB-A Quick Introduction for Scientists and Engineers, Version-6, Oxford University Press, New York, (2003).
[11] Wang L.X., and Mendel J.M., Structured Trainable Networks for Matrix Algebra, Proceedings of the International Joint Conference on Neural Networks, San Diego, CA, (1990) vol. 2, pp. 125-132.
Received on 11.07.2016 Modified on 21.07.2016
Accepted on 30.08.2016 © RJPT All rights reserved
Research J. Pharm. and Tech 2016; 9(10):1727-1731.
DOI: 10.5958/0974-360X.2016.00347.4