
> Subject: Maximizing a quadratic form w/non-linear constraint

> Date: 2 Sep 94 00:55:33 GMT

> Organization: University of California, San Diego

> Lines: 27

> Nntp-Posting-Host: sdcc12.ucsd.edu

> I’m trying to solve for an N-component complex vector x such that

> (1) x maximizes the quadratic form x^H R x, where R is a known,

> *real*, symmetric, positive definite matrix, and where ^H

> denotes the Hermitian transpose;

> and

> (2) each component of x has unit modulus; that is,

> x_k = exp(i phi_k)

> Thus, there are N Lagrange multipliers for this nonlinear

> constraint.

> I don’t know how to handle the constraint, both because it is

> complex-valued and because it is nonlinear.

> I’ve thought of taking advantage of the fact that

> x^H R x = a'Ra + b'Rb, where a and b are the real and imaginary parts

> of x, and where ' denotes transpose. I could then make an order-2N

> quadratic form with the augmented vector [a;b] and augmented matrix

> [ [R,0]; [0,R] ], and the corresponding nonlinear constraint that

> ties the real and imaginary parts together. I don’t know how to

> proceed from here, nor even if this is the best course.

> Any suggestions or references would be appreciated. Thanks.

> rick adair

Ten years ago I worked on this very problem for six months. I used several

viable approaches to solve this optimization problem, which is known as a

quadratic programming problem. It can

be solved as such or it can be converted into a nonlinear system of

equations using Lagrange multipliers and then solved as a nonlinear vector

equation. I will describe both approaches. The first will be very brief,

the second will be lengthy.

Quadratic Programming Problem

-----------------------------

A quadratic programming problem is:

maximize a quadratic function

subject to quadratic constraints

(1) The Harwell library function VF02AD solves just such a problem. A

description of the algorithm used by this library function is

presented in a 1977 paper by M.J.D. Powell. See reference [1].

(2) Chapter 18 of "Matrices and Simplex Algorithms," A.R.G. Heesterman,

D. Reidel Publishing Company, 1983, discusses solution methods for

such problems.

(3) The optimal power flow problem is a quadratic programming problem.

Initially it was believed that the quadratic objective function had

many local minima and that a search for a global minimum would at

best be linearly convergent. But then a quadratically convergent

algorithm was discovered. See reference [2].

Nonlinear Vector Equation

-------------------------

The remainder of this exposition is devoted to using Lagrange multipliers

for solving an optimization problem. Rather than discussing the problem

in its most general form I will treat a specific quadratic programming

problem and will develop the framework for studying this problem from there.

First, an explanation of the notation used herein.

. Scalars, Vectors, Matrices

Upper case letters denote matrices. Lower case letters denote vectors.

Scalar components of a vector are indexed by subscripts. Lower case

letters are also used as scalars and this ambiguity must be resolved

from the context. Iterates of a sequence are indexed by superscripts.

. Functions

Both upper and lower case letters are used to represent functions. For

brevity, a subscript signifies that the function is to be evaluated at

the indicated value of its argument(s). Functions may be scalar-, vector-,

or matrix-valued and real- or complex-valued; these different

possibilities must be resolved from the context.

. Symbols

* complex conjugate transpose (applied to a scalar it simply means

complex conjugate)

T transpose

_ subscript

^ superscript

| | (complex) absolute value or vector norm or matrix norm

The four symbols [ ] , ; are used together to construct vectors and

matrices, with a comma separating entries within a row and a semicolon

terminating rows, according to customary Matlab syntax.

. Real, Complex

A quantity is determined to be real or complex from its context.

This is the statement of the problem:

max f(z) = 1/2 (Cz)^{*} (Cz)

|z_{i}|^2 = 1,  i = 1,…,n

z=[z_{1};...;z_{n}]. Each component z_{i} is a complex scalar and C is a

complex matrix of dimension n by n. Furthermore, C is assumed to be normal

and symmetric.

Let C and z be expressed as C = R + iI, z = x + iy where R, I, x and y

are real. A few calculations give the following useful result:

Since C is symmetric R^{T}=R and I^{T}=I so C^{*} = R^{T} - iI^{T} = R - iI,

and since C is normal C^{*} C = C C^{*}. Therefore,

C^{*} C = (R - iI) (R + iI) = (R^2 + I^2) + i(RI - IR)

C C^{*} = (R + iI) (R - iI) = (R^2 + I^2) + i(IR - RI)

and equating the real and imaginary parts above it is seen that the

imaginary part must be zero (and hence that R and I commute). Now let

A = C^{*} C = R^2 + I^2. A is a real symmetric matrix.

Now proceed with the calculation of f(z). (Cz)^{*} (Cz) = x^{T}Ax + y^{T}Ay

+ i(x^{T}Ay - y^{T}Ax). Since x^{T}Ay is a scalar it equals its transpose.

With this fact and the fact that A is symmetric the transpose is:

(x^{T}Ay)^{T} = y^{T}Ax. Hence, the imaginary part of f(z) is zero (as it

should be). Thus, the objective function, f(z), is equal to (by abuse of

notation f(z) is written as f(x,y))

f(x,y) = (1/2) [ x^{T}, y^{T} ] [ [A,0]; [0,A] ] [ x; y ].
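As a sanity check, this identity is easy to verify numerically. The sketch below is a hypothetical NumPy illustration (not part of the original derivation): it builds a normal symmetric C from commuting real symmetric parts R and I, confirms that A = C^{*}C is real symmetric, and compares f(z) against the real block form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Build commuting real symmetric R and I (shared eigenvectors),
# so that C = R + iI is both symmetric and normal.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
R = Q @ np.diag(rng.standard_normal(n)) @ Q.T
I = Q @ np.diag(rng.standard_normal(n)) @ Q.T
C = R + 1j * I

# A = C^* C should be real (RI - IR = 0) and symmetric.
A = C.conj().T @ C
assert np.allclose(A.imag, 0.0) and np.allclose(A.real, A.real.T)
A = A.real

# Compare f(z) = 1/2 (Cz)^*(Cz) with the real block form in [x; y].
z = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))  # unit-modulus components
x, y = z.real, z.imag
f1 = 0.5 * np.vdot(C @ z, C @ z).real              # vdot conjugates its first argument
f2 = 0.5 * (x @ A @ x + y @ A @ y)
assert np.isclose(f1, f2)
```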

Next, denote the constraint functions by h_{i}:

h_{i}(x,y) = (1/2) (x_{i}^2 + y_{i}^2 - 1)

The quadratic programming problem becomes

max f(x,y)

h_{i}(x,y) = 0

Introduce the Lagrangian function (for this see "Practical Methods of

Optimization," Fletcher, vol. 2, p. 48)

L(x,y,l) = f(x,y) - Sum_{i=1}^{n} l_{i} h_{i}(x,y)

l = (l_{1}, … , l_{n})^{T}, l is the Lagrange multiplier (it is

customarily denoted by lambda).

Let h = (h_{1}, … , h_{n})^{T}. The gradient with respect to x, y and l

of L is

grad(L) = [ [grad(f) - J^{T}(h) l]; [-h] ]

where J is the Jacobian of h. The gradient of f is grad(f) = [Ax; Ay]. The

Jacobian of h is J(h)=[X, Y], where X = diag(x_{1}, … , x_{n}), and

Y = diag(y_{1}, … , y_{n}) are diagonal matrices. The nonlinear vector

equation to be solved is

grad(L) = 0.

The Newton-Raphson method for solving the vector equation f(x)=0 is easily

stated as:

Start with x^{0}, k=0

Solve J(f_{k}) s^{k} = -f_{k} for s^{k}

set x^{k+1} = s^{k} + x^{k}, k=k+1

iterate until a stopping criterion is met.
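For concreteness, here is the iteration applied to a small illustrative system (a hypothetical example, not from the original post): f(x) = [x_1^2 + x_2^2 - 4; x_1 - x_2] = 0, whose positive root is x = (sqrt(2), sqrt(2)).

```python
import numpy as np

def newton(f, jac, x, tol=1e-12, maxit=50):
    """Newton-Raphson: solve J(f_k) s^k = -f_k, set x^{k+1} = x^k + s^k."""
    for _ in range(maxit):
        fx = f(x)
        if np.linalg.norm(fx) < tol:   # stopping criterion
            break
        x = x + np.linalg.solve(jac(x), -fx)
    return x

# Example system: x_1^2 + x_2^2 = 4 and x_1 = x_2.
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])
jac = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
x = newton(f, jac, np.array([2.0, 1.0]))
assert np.allclose(x, np.sqrt(2.0))
```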

Now the Jacobian of the gradient of L is

J(grad(L)) = [ [[[A-D, 0]; [0, A-D]], -[X, Y]^{T}]; [-[X, Y], [0]] ]

where D=diag(l_{1}, … , l_{n}) and 0 is an n by n matrix of zeros. There

were four approaches that I used to solve grad(L) = 0.

(1) IMSL subroutine ZSPOW

(2) NAG subroutine E04WAF

(3) Newton-Raphson method

(4) Newton SQP method (sequential quadratic programming)

All four approaches produced satisfactory results. The details for methods

(3) and (4) are:

Newton-Raphson

Start with [x^{0}; y^{0}; l^{0}], k=0

Solve J(grad(L_{k})) s^{k} = -grad(L_{k})

for s^{k}

set [x^{k+1}; y^{k+1}; l^{k+1}] = s^{k} + [x^{k}; y^{k}; l^{k}], k=k+1

iterate
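Here is a minimal NumPy sketch of this iteration (hypothetical code; the small matrix A and the starting point are chosen only for illustration). One caveat: J(grad(L)) is singular at every solution because f and the constraints are invariant under the global phase change z -> e^{it} z, so the sketch takes the minimum-norm step via a least-squares solve rather than a direct solve.

```python
import numpy as np

def grad_L(A, x, y, l):
    # grad(L) = [A x - l.x ; A y - l.y ; -h],  h_i = (x_i^2 + y_i^2 - 1)/2
    h = 0.5 * (x**2 + y**2 - 1.0)
    return np.concatenate([A @ x - l * x, A @ y - l * y, -h])

def jac_grad_L(A, x, y, l):
    # J(grad(L)) = [[A-D, 0, -X], [0, A-D, -Y], [-X, -Y, 0]]
    n = len(x)
    D, X, Y = np.diag(l), np.diag(x), np.diag(y)
    Z = np.zeros((n, n))
    return np.block([[A - D, Z, -X], [Z, A - D, -Y], [-X, -Y, Z]])

def newton_lagrange(A, x, y, l, tol=1e-10, maxit=100):
    n = len(x)
    for _ in range(maxit):
        g = grad_L(A, x, y, l)
        if np.linalg.norm(g) < tol:
            break
        # Minimum-norm step: the matrix is singular at solutions because
        # of the global phase invariance z -> e^{it} z.
        s = np.linalg.lstsq(jac_grad_L(A, x, y, l), -g, rcond=None)[0]
        x, y, l = x + s[:n], y + s[n:2*n], l + s[2*n:]
    return x, y, l

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # stands in for A = C^* C
x, y, l = newton_lagrange(A, np.array([0.9, 1.1]),
                          np.array([0.1, -0.1]), np.array([2.8, 3.1]))
assert np.linalg.norm(grad_L(A, x, y, l)) < 1e-6   # stationary point found
assert np.allclose(x**2 + y**2, 1.0)               # unit-modulus constraints
```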

Newton SQP

Start with [x^{0}; y^{0}; l^{0}], k=0

Solve J(grad(L_{k})) [p^{k}; l^{k+1}] = -grad(L(x^{k},y^{k},l=0))

for p^{k} and l^{k+1}

set [x^{k+1}; y^{k+1}] = p^{k} + [x^{k}; y^{k}], k=k+1

iterate
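A hypothetical NumPy sketch of the SQP step follows (again with an illustrative A and starting point). Note that a short calculation shows this system is algebraically the same as the Newton-Raphson step, merely written with the multiplier block of the unknown as l^{k+1} itself rather than as an increment.

```python
import numpy as np

def sqp_step(A, x, y, l):
    # Solve J(grad(L_k)) [p; l_new] = -grad(L(x, y, l=0)):
    # the multiplier block of the unknown is l^{k+1}, not an increment.
    n = len(x)
    D, X, Y = np.diag(l), np.diag(x), np.diag(y)
    Z = np.zeros((n, n))
    K = np.block([[A - D, Z, -X], [Z, A - D, -Y], [-X, -Y, Z]])
    h = 0.5 * (x**2 + y**2 - 1.0)
    rhs = -np.concatenate([A @ x, A @ y, -h])      # -grad(L) evaluated at l = 0
    # Minimum-norm solution; K is singular at solutions (phase invariance).
    s = np.linalg.lstsq(K, rhs, rcond=None)[0]
    return x + s[:n], y + s[n:2*n], s[2*n:]

A = np.array([[2.0, 1.0], [1.0, 2.0]])
x, y, l = np.array([0.9, 1.1]), np.array([0.1, -0.1]), np.array([2.8, 3.1])
for _ in range(50):
    x, y, l = sqp_step(A, x, y, l)
g = np.concatenate([A @ x - l * x, A @ y - l * y,
                    -0.5 * (x**2 + y**2 - 1.0)])
assert np.linalg.norm(g) < 1e-6                    # grad(L) driven to zero
```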

The idea of the Newton SQP method, as I have described it, was written

and sent to me by Professor Walter Murray of Stanford University in a

1984 letter. He suggested choosing starting values for x^{0}, y^{0},

l^{0} so that x^{0}, y^{0} satisfy the constraints and so that l^{0} is

the least squares solution of grad(L(x=x^{0},y=y^{0},l))=0. For example:

set x_{i}^{0} = y_{i}^{0} = sqrt(1/2), i=1,…,n and solve J^{T}(h_{0})l =

grad(f_{0}) for l and set l^{0}=l. The least squares solution of

J^{T}(h_{0})l = grad(f_{0}) is l=A [1;...;1]=sum of columns of A.
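This least-squares claim is easy to check numerically. The sketch below (hypothetical NumPy code; A is an arbitrary real symmetric example matrix) forms J^{T}(h_{0}) = [X; Y] at the suggested starting point and compares the least-squares multiplier with A [1;...;1].

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
A = M + M.T                                   # arbitrary real symmetric A

# Feasible start: x_i = y_i = sqrt(1/2), so X = Y = sqrt(1/2) I.
x0 = y0 = np.full(n, np.sqrt(0.5))
JhT = np.vstack([np.diag(x0), np.diag(y0)])   # J^T(h_0) = [X; Y], 2n by n
grad_f = np.concatenate([A @ x0, A @ y0])     # grad(f_0) = [A x0; A y0]

l0 = np.linalg.lstsq(JhT, grad_f, rcond=None)[0]
assert np.allclose(l0, A @ np.ones(n))        # l = A [1;...;1], sum of columns
```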

For further reading see

[1] "A Fast Algorithm for Nonlinearly Constrained Optimization

Calculations," M.J.D. Powell, Springer-Verlag Lecture Notes in

Mathematics, vol. 630, pp. 144-157.

[2] "Quadratically Convergent Optimal Power Flow," Burchett, Happ, Vierath,

IEEE Trans Power Apparatus and Systems, vol. PAS-103, no. 11, Nov 84,

pp. 3267-3275.

---------------------------------------------------------------------------

I also studied a related optimization problem:

max f(z) = 1/2 (Cz)^{*} (Cz)

|z_{i}|^2 = 1,  i = 1,…,p

z=[z_{1};...;z_{p}]. Each z_{i} is a complex vector of dimension m>1

with n=mp and C is a complex normal symmetric matrix of dimension n by n.

For the few cases that I studied none of the methods (1)-(4) produced

satisfactory results. My conjecture is that at the argmax of f(z) the

Jacobian of the gradient of the Lagrangian is zero, J(grad(L))=0. Does

anyone have an insight into this conjecture?

I welcome reports from others who have worked on this or related problems.

In particular, what methods were used and what results did they produce? Also,

how did the given problem arise? The problem I studied arose from a

consideration of the electric field induced inside an inhomogeneous body.

The internal electric field, E, is related quite simply to the incident

electric field Ei by G E = Ei, where G is a complex normal symmetric

matrix. Hence, E = C Ei where C is the inverse of G and is a normal

symmetric matrix, too. The problem was to maximize E^{*}E subject to the

constraint that each component Ei_{i} (a three-dimensional complex vector)

of Ei had a known magnitude, |Ei_{i}| = k_{i}. Due to the symmetry of the

problem, G was a symmetric matrix. But it wasn’t obvious that G should be a

normal matrix. So I calculated |imag(G^{*}G)|/|real(G^{*}G)| and found the

ratio to be on the order of the precision of the arithmetic and thus

considered that imag(G^{*}G)=0 and then from this that G was normal.

Brent Carruth

carr…@mcs.anl.gov