西南大学学报 (自然科学版)  2018, Vol. 40 Issue (11): 74-80.  DOI: 10.13718/j.cnki.xdzk.2018.11.012
0
Article Options
  • PDF
  • Abstract
  • Figures
  • References
  • 扩展功能
    Email Alert
    RSS
    本文作者相关文章
    王祥玲
    左双勇
    朱志斌
    欢迎关注西南大学期刊社
     

  • 修正共轭投影梯度滤子法    [PDF全文]
    王祥玲1, 左双勇1, 朱志斌2     
    1. 宜春幼儿师范高等专科学校 初等教育学院, 江西 宜春 330814;
    2. 桂林电子科技大学 数学与计算科学学院, 广西 桂林 541004
    摘要:利用共轭投影梯度技术,结合滤子算法的思想,通过修正搜索方向,建立了一个新的共轭投影梯度滤子算法.该算法不需要求解二次规划子问题,而且能有效避免常规滤子算法中的恢复算法.在适当的条件下,证明了算法的全局收敛性.
    关键词共轭投影梯度    滤子    非线性规划    全局收敛性    

    本文考虑如下非线性规划问题:

    $ \begin{array}{*{20}{c}} {\mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {\mathbb{R}^n}} f\left( \mathit{\boldsymbol{x}} \right)} \\ {{\text{s}}.\;{\text{t}}.\;\;{\mathit{\boldsymbol{c}}_j}\left( \mathit{\boldsymbol{x}} \right) \leqslant 0,j \in I = \left\{ {1,2, \cdots ,m} \right\}} \end{array} $ (1)

    其中:x${\mathbb{R}}$nf${\mathbb{R}}$n$ \longrightarrow $${\mathbb{R}}$cj(jI):${\mathbb{R}}$n$ \longrightarrow $${\mathbb{R}}$是二次连续可微的.可行集X={x${\mathbb{R}}$n|cj(x)≤0,jI},有效集I(x)={jI|cj(x)=0}.

    非线性规划问题广泛应用于工程和社会生活的各个领域,并在其中起着举足轻重的作用.非线性规划问题的求解主要是最优性和可行性.为了使这两个规则都能满足,算法需要在迭代过程中每一步都保持这两种规则的平衡.目前关于非线性规划问题的求解方法多种多样[1-12].近年来,滤子法引起很多学者的密切关注.滤子法最早由Fletecher和Leyffer提出[1],它能有效避免选择罚函数的困难且具有良好的数值结果,从而近年来被广泛应用于各种优化问题的求解中[2-6].共轭投影梯度算法是解决非线性规划问题的另外一种有效算法[7].文献[8]将SQP方法和广义投影技巧结合,提出一个具有显式主搜索方向的共轭投影梯度算法,有效地简化了算法结构及计算工作量.为此,在文献[8]的基础上,本文提出一个修正的共轭投影梯度算法.该方法保证了每个试探点都不会远离可行域,同时在合适的条件下证明了算法的收敛性.

    1 相关理论知识 1.1 相关符号

    对于问题(1)的近似解xkXσk>0,正定矩阵Bk=B(xk),集合LkI,为了便于讨论给出以下记号:

    $ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{g}}_j^k = {\mathit{\boldsymbol{g}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = \nabla {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right),\mathit{\boldsymbol{H}}_j^k = {\mathit{\boldsymbol{H}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = {\nabla ^2}{\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right)}\\ {\mathit{\boldsymbol{F}}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = \left( {{\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right),j \in {L_k}} \right),{\mathit{\boldsymbol{A}}_k} = \mathit{\boldsymbol{A}}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = \left( {{\mathit{\boldsymbol{g}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right),j \in {L_k}} \right)\;\;\;\;\;\;\;\;j = 1, \cdots ,m} \end{array} $ (2)

    本文算法引用了文献[7]提出的共轭投影梯度技术,并采用文献[7]中定义的以下各量:

    $ {\mathit{\boldsymbol{Q}}_k} = \mathit{\boldsymbol{Q}}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = {\left( {\mathit{\boldsymbol{A}}_k^{\rm{T}}\mathit{\boldsymbol{B}}_k^{ - 1}{\mathit{\boldsymbol{A}}_k}} \right)^{ - 1}}\mathit{\boldsymbol{A}}_k^{\rm{T}}\mathit{\boldsymbol{B}}_k^{ - 1},{\mathit{\boldsymbol{P}}_k} = \mathit{\boldsymbol{P}}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = \mathit{\boldsymbol{B}}_k^{ - 1}\left( {{\mathit{\boldsymbol{E}}_n} - {\mathit{\boldsymbol{A}}_k}{\mathit{\boldsymbol{Q}}_k}} \right) $ (3)
    $ {\mathit{\boldsymbol{\pi }}^k} = \mathit{\boldsymbol{\pi }}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = - {\mathit{\boldsymbol{Q}}_k}\nabla f\left( {{\mathit{\boldsymbol{x}}^k}} \right),\mathit{\boldsymbol{d}}_0^k = {\mathit{\boldsymbol{d}}_0}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = - {\mathit{\boldsymbol{P}}_k}\nabla f\left( {{\mathit{\boldsymbol{x}}^k}} \right) + \mathit{\boldsymbol{Q}}_k^{\rm{T}}{\mathit{\boldsymbol{V}}^k} $ (4)
    $ {\mathit{\boldsymbol{V}}^k} = \mathit{\boldsymbol{V}}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = \left( {\mathit{\boldsymbol{V}}_j^k,j \in {L_k}} \right),\mathit{\boldsymbol{V}}_j^k = \left\{ {\begin{array}{*{20}{c}} { - {f_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right)}&{\mathit{\boldsymbol{\pi }}_j^k > 0}\\ {\mathit{\boldsymbol{\pi }}_j^k}&{\mathit{\boldsymbol{\pi }}_j^k \le 0} \end{array}} \right. $ (5)
    $ \mathit{\boldsymbol{d}}_1^k = - \mathit{\boldsymbol{Q}}_k^{\rm{T}}\left( {{{\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|}^\tau }\mathit{\boldsymbol{e}} + \mathit{\boldsymbol{F}}\left( {{\mathit{\boldsymbol{x}}^k} + \mathit{\boldsymbol{d}}_0^k} \right)} \right),\mathit{\boldsymbol{e}} = \left( {1, \cdots ,1} \right),{\mathit{\boldsymbol{d}}^k} = \mathit{\boldsymbol{d}}_0^k + \mathit{\boldsymbol{d}}_1^k $ (6)
    $ {\mathit{\boldsymbol{\rho }}_k} = - \nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_0^k,\mathit{\boldsymbol{d}}_2^k = \frac{{ - {\mathit{\boldsymbol{\rho }}_k}}}{{1 + 2\left\| {{\mathit{\boldsymbol{e}}^{\rm{T}}}{\mathit{\boldsymbol{\pi }}^k}} \right\|}}\mathit{\boldsymbol{Q}}_k^{\rm{T}}\mathit{\boldsymbol{e}},{\mathit{\boldsymbol{q}}^k} = {\mathit{\boldsymbol{\rho }}_k}\left( {\mathit{\boldsymbol{d}}_0^k + \mathit{\boldsymbol{d}}_2^k} \right) $ (7)

    定义1[7]  如果函数μ(x):${\mathbb{R}}$n$ \longrightarrow $${\mathbb{R}}$m满足以下条件:

    1) μ(x)连续.

    2) 当x*是问题(1)的K-T点时,μ(x*)为相应的K-T乘子.则称μ(x)为乘子函数.

    1.2 滤子技术

    滤子技术能很好地平衡目标函数和约束条件,对于每次迭代,试探点被接受当且仅当约束违反度函数值或目标函数值相对于当前的滤子集中的点有充分的下降.

    本文定义约束违反度函数为

    $ h\left( \mathit{\boldsymbol{x}} \right) = {\left\| {\mathit{\boldsymbol{c}}{{\left( \mathit{\boldsymbol{x}} \right)}^ + }} \right\|_\infty } $ (8)

    其中

    $ \mathit{\boldsymbol{c}}{\left( \mathit{\boldsymbol{x}} \right)^ + } = \max \left\{ {0,{\mathit{\boldsymbol{c}}_j}\left( \mathit{\boldsymbol{x}} \right):j \in I} \right\} $

    易知h(x)=0的充要条件是保证迭代点x是可行点,所以试探点应该使得约束违反度函数值降低或目标函数值减小.为了保证这两项至少有一项满足,参照文献[1]引入滤子的相关定义.

    定义2  称点x1支配点x2当且仅当

    $ h\left( {{\mathit{\boldsymbol{x}}^1}} \right) \le h\left( {{\mathit{\boldsymbol{x}}^2}} \right)\;且\;f\left( {{\mathit{\boldsymbol{x}}^1}} \right) \le f\left( {{\mathit{\boldsymbol{x}}^2}} \right) $ (9)

    定义3  滤子为具有形式为(hifi)的点集Z,使得任意点不能被其他点支配,即h(xi)≤h(xj)或f(xi)≤f(xj)对所有的ij成立.

    在实际计算中,为了获得全局收敛性,需要添加一些条件,如文献[1]中定义了一个“包络”来阻止点任意接近滤子.从而有以下滤子准则:

    定义4  一个点被滤子接受当且仅当对任意的(hjfj)∈Z

    $ h\left( \mathit{\boldsymbol{x}} \right) \le \beta {h_j}\;或\;f\left( \mathit{\boldsymbol{x}} \right) \le {f_j} - \gamma {h_j} $ (10)

    其中:hj=h(xj);fj=f(xj);0<γβ<1且γ→0,β→1.

    在算法的进程中需要将(hjfj)对添加到滤子中,如果一个点xk被滤子集Z接受,则令xk+1=xk+tkdk且滤子按如下的规则更新

    $ {Z_{k + 1}} = {Z_k} \cup \left\{ {\left( {{h_{k + 1}},{f_{k + 1}}} \right)} \right\}\backslash {D_{k + 1}} $ (11)

    其中

    $ {D_{k + 1}} = \left\{ {\left( {{h_j},{f_j}} \right)\left| {{h_j} \ge {h_k}} \right.,{f_j} - \gamma {h_j} \ge {f_k} - \gamma {h_k},\forall \left( {{h_j},{f_j}} \right) \in {Z_k}} \right\} $

    事实上,这里“点x被滤子集Z接受”严格上说是将(hjfj)对添加到滤子集中.如果点xk被滤子集接受或者已在滤子集中,那么满足

    $ h\left( \mathit{\boldsymbol{x}} \right) \le \left( {1 - \gamma } \right){h_k}\;且\;f\left( \mathit{\boldsymbol{x}} \right) \le {f_k} - \gamma {h_k} $ (12)

    的点x也是被滤子接受的.

    2 算法模型

    算法1

    步0 (初始化)  给定x0${\mathbb{R}}$nμ(x0)∈${\mathbb{R}}$mB0是对称正定矩阵,选择σ0ξεv∈(0,1),τ∈(2,3),δ>2且α1α2η∈(0,1),t0=1,初始化滤子集Z0,并令k=0.

    步1 (有效集Lk)

    (a) 令i=0,σki=σ0

    (b) 若det(AkTAk)>σki,令Lk=Lkiik=i,转步2,否则转步1(c),其中:

    $ {L_{k,i}} = \left\{ {j \in I\left| { - {\sigma _{k,i}}\left| {{\mathit{\boldsymbol{\mu }}_j}\left( {{x^k}} \right)} \right|} \right. \le {f_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right) \le 0} \right\},{\mathit{\boldsymbol{A}}_{k,i}} = \left( {{\mathit{\boldsymbol{g}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right),j \in {L_{k,i}}} \right) $

    (c) 令i=i+1,σki=0.5σki-1,返回步1(b).

    步2 (搜索方向)  通过式(2)-(5)获得d0k.若d0k=0,则获得一个K-T点,停止.否则,通过式(6)计算dk.

    $ \nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_0^k \le \min \left\{ { - \xi {{\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|}^\delta }, - \xi {{\left\| {{\mathit{\boldsymbol{d}}^k}} \right\|}^\delta }} \right\} $ (13)

    则转步3.否则,转步4.

    步3 (滤子准则)  若xk+tkdk不被滤子接受,则令xk+1=xktk+1=0.5tkk=k+1,转步4.若xk+tkdk被滤子接受,则令xk+1=xk+dkfk+1=f(xk+dk),hk+1=h(xk+dk),并去除那些被xk+1支配的点,更新滤子集Zk+1,转步5.

    步4 (线搜索)  通过式(7)获得qk,取$ \left\{ {1, \frac{1}{2}, \frac{1}{4}, \cdots } \right\} $的最大值为λk,使其满足

    $ f\left( {{\mathit{\boldsymbol{x}}^k} + {\lambda _k}{\mathit{\boldsymbol{q}}^k}} \right) \le f\left( {{\mathit{\boldsymbol{x}}^k}} \right) + v{\lambda _k}\nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} $ (14)
    $ {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + {\lambda _k}{\mathit{\boldsymbol{q}}^k}} \right) \le 0,j \in I $ (15)

    dk=qktk=λk.

    步5 (迭代更新)  令xk+1=xk+tkdk,更新正定对称矩阵Bk+1k=k+1.返回步1.

    3 算法的可行性和收敛性分析

    算法A需要以下理论的支持.

    假设A

    (A1) 迭代序列{xk}⊆X非空且位于${\mathbb{R}}$n中的有界闭凸子集内;

    (A2) 对任意的x${\mathbb{R}}$n,向量组{▽cj(x),jI(x)}是线性无关的;

    (A3) ∀kBk${\mathbb{R}}$n是一个n阶对称正定矩阵且{Bk}一致有界;

    (A4) 目标函数f(x)和约束违反度函数h(x)都是二次连续可微的.

    (A5) 存在两个常数0<ab,使得ad2dTBkdbd2,∀k,∀d${\mathbb{R}}$n.

    由于集合LkI的有限性,不妨设存在无穷子列K,使得

    $ {\mathit{\boldsymbol{x}}^k} \to {\mathit{\boldsymbol{x}}^ * },{\mathit{\boldsymbol{B}}_k} \to {\mathit{\boldsymbol{B}}_ * },{L_k} \equiv L,k \in K $

    其中L为一个固定指标集.记A*Q*P*π*V*d0*d*q*分别为式(3)-(7)中各式在x*处以B*代替B(x*)的相应各量,则AkA*QkQ*PkP*d0kd0*qkq*kKk→∞.

    引理1  对任意的迭代指标k,算法1步1(b)和步1(c)之间的循环有限步终止.

      利用反证法.

    假设算法在步1(b)和步1(c)之间无限循环,则由算法结构知,

    $ \begin{array}{*{20}{c}} {\det \left( {\mathit{\boldsymbol{A}}_k^{\rm{T}}{\mathit{\boldsymbol{A}}_k}} \right) \le \frac{1}{{{2^i}}}{\sigma _0}}&{i = 1,2, \cdots } \end{array} $

    Lki的定义有

    $ {L_{k,i}} \supseteq {L_{k,i + 1}} $

    又集合I是个有限集,从而对充分大的i

    $ {L_{k,i}} = {L_{k,i + 1}} $

    并定义为Lk*.令i→∞,则

    $ \begin{array}{*{20}{c}} {\det \left( {\mathit{\boldsymbol{A}}_k^{\rm{T}}{\mathit{\boldsymbol{A}}_k}} \right) = 0}&{J_k^ * = I\left( {{\mathit{\boldsymbol{x}}^k}} \right)} \end{array} $

    这与假设(A2)矛盾.

    引理2  若假设A满足且迭代点xkX,则有下列结论成立

    1) xk是问题(1)的K-T点当且仅当d0k=0.

    2) 若xk不是问题(1)的K-T点,则dk仍是问题(1)在点xk处的一个可行下降方向,即有

    $ \nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_0^k < 0,\nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} \le - \frac{1}{2}\mathit{\boldsymbol{\rho }}_k^2 < 0,{\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_0^k \le 0,{\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} < 0,j \in I\left( {{\mathit{\boldsymbol{x}}^k}} \right) $

      参考文献[7]中定理1的证明.

    引理3[7]  对于算法1产生的序列{xk},设x*是它的任意一个聚点.若xk+1=xk+tkdk均由步4和步5产生,且x*不是问题(1)的K-T点,则有

    $ \begin{array}{*{20}{c}} {\nabla f{{\left( {{\mathit{\boldsymbol{x}}^ * }} \right)}^{\rm{T}}}{\mathit{\boldsymbol{q}}^ * } < 0,{\mathit{\boldsymbol{g}}_j}{{\left( {{\mathit{\boldsymbol{x}}^ * }} \right)}^{\rm{T}}}{\mathit{\boldsymbol{q}}^ * } < 0,j \in I\left( {{\mathit{\boldsymbol{x}}^ * }} \right)}\\ {{\lambda _k} \ge {\lambda _ * } = \inf \left\{ {{\lambda _k}} \right\} > 0,k \in K} \end{array} $

    引理4  算法是可执行的.

      只需证明在假设A成立的前提下,当k充分大时,算法不再执行步4.由算法知,只需证明由步4产生的迭代点也一定能被滤子接受.假设算法不能有限步终止,则有tk→0,k→∞,且新的迭代点xk+1=xk+tkqk不能被滤子接受.下面分两种情况讨论.

    1) 若hk=0.由h(x)的定义可得,

    $ {h_{k + 1}} = \max \left\{ {0,{\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right)} \right\} = \max \left\{ {0,{\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right) + {t_k}{{\left( {\mathit{\boldsymbol{g}}_j^k} \right)}^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} + O{{\left\| {{t_k}{\mathit{\boldsymbol{q}}^k}} \right\|}^2}} \right\} $

    由引理1知,

    $ {\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} < 0 $

    结合tk→0,k→∞,则存在一个常数β,使得

    $ {h_{k + 1}} \le \max \left\{ {0,\beta {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right)} \right\} \le \beta {h_k} $ (16)

    再由式(14)和引理2知

    $ f\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) \le f\left( {{\mathit{\boldsymbol{x}}^k}} \right) + v{\lambda _k}\nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} \le f\left( {{\mathit{\boldsymbol{x}}^k}} \right) - v{\lambda _k}\frac{1}{2}\mathit{\boldsymbol{\rho }}_k^2 \le f\left( {{\mathit{\boldsymbol{x}}^k}} \right) - c\mathit{\boldsymbol{\rho }}_k^2,0 < \mathit{\boldsymbol{c}} < 1 $

    $ f\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) \le f\left( {{\mathit{\boldsymbol{x}}^k}} \right) $ (17)

    由式(16)-(17)可知新的迭代点xk+1=xk+tkqk能被滤子和xk接受.矛盾.

    2) 若hk≠0.类似于情况(1),可得hk+1βhk.由于xk被滤子接受,所以有

    $ {h_k} \le \beta {h_j}\;或\;{f_k} \le {f_j} - \gamma {h_j},\forall \left( {{h_j},{f_j}} \right) \in Z $

    由假设迭代点xk+1=xk+tkqk不能被滤子接受,则有

    $ {h_j} \ge {h_k} $ (18)

    $ {f_j} - \gamma {h_j} \ge {f_k} - \gamma {h_k} $ (19)

    对于点xk,若hkβhi成立,则由引理1知,

    $ {\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} < 0 $

    结合tk→0,k→∞,则存在一个常数β,使得

    $ {h_{k + 1}} = \max \left\{ {0,{\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right)} \right\} \le \max \left\{ {0,{\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right)} \right\} \le {h_k} \le \beta {h_j} $

    这与式(18)矛盾.

    对于点xk,若fkfj-βhj成立,则由引理1知,

    $ {\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} < 0 $

    结合tk→0,k→∞,则存在一个常数β,使得

    $ f\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) = f\left( {{\mathit{\boldsymbol{x}}^k}} \right) + {t_k}\nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} + O\left( {{{\left\| {{t_k}{\mathit{\boldsymbol{q}}^k}} \right\|}^2}} \right) \le {f_j} - \beta {h_j} $

    这与式(19)矛盾.

    综上分析,当k充分大时,算法不再执行步4,即算法是有效的.

    引理5[9]  考虑{xk}是进入滤子集的无穷序列,与其对应的h(xk)>0且{f(xk)}有下界,则

    $ \mathop {\lim }\limits_{k \to \infty } h\left( {{\mathit{\boldsymbol{x}}^k}} \right) = 0 $

    引理6[9]  1)如果有无限多个点被添加到滤子集中,则

    $ \mathop {\lim }\limits_{k \to \infty } h\left( {{\mathit{\boldsymbol{x}}^k}} \right) = 0 $

    2) 如果有有限多个点被添加到滤子集中,则存在一个k1,使得当kk1时,有

    $ \mathop {\lim }\limits_{k \to \infty } h\left( {{\mathit{\boldsymbol{x}}^k}} \right) = 0 $

    引理7  按式(6)计算的主方向dk满足‖dk‖~‖d0k‖,‖d1k‖=O(‖d0k2).

      由算法1步4中λk的取法易知,当k充分大时,有λk≡1,此时xk+1=xk+dk.当jI*时有

    $ \begin{array}{l} {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) = {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + {\mathit{\boldsymbol{d}}^k}} \right) = {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + \mathit{\boldsymbol{d}}_0^k + \mathit{\boldsymbol{d}}_1^k} \right) = \\ {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + \mathit{\boldsymbol{d}}_0^k} \right) + {\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_1^k + O\left( {{{\left\| {\mathit{\boldsymbol{d}}_1^k} \right\|}^2}} \right) = {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + \mathit{\boldsymbol{d}}_0^k} \right) + {\left( {\mathit{\boldsymbol{g}}_j^k} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_1^k + O\left( {{{\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|}^3}} \right) \end{array} $

    $ \mathit{\boldsymbol{A}}_k^{\rm{T}}\mathit{\boldsymbol{d}}_1^k = - {\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|^\tau }\mathit{\boldsymbol{e}} - \mathit{\boldsymbol{F}}\left( {{\mathit{\boldsymbol{x}}^k} + \mathit{\boldsymbol{d}}_0^k} \right),故\;{\mathit{\boldsymbol{g}}_j}{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}_1^k = - {\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|^\tau } - {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + \mathit{\boldsymbol{d}}_0^k} \right) $

    所以,

    $ {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) = {\mathit{\boldsymbol{c}}_j}\left( {{\mathit{\boldsymbol{x}}^k} + {\mathit{\boldsymbol{d}}^k}} \right) = - {\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|^\tau } + O\left( {{{\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|}^3}} \right),\tau \in \left( {2,3} \right) $

    从而当k充分大时有,

    $ \left\| {\mathit{\boldsymbol{d}}_1^k} \right\| = O\left( {{{\left\| {\mathit{\boldsymbol{d}}_0^k} \right\|}^2}} \right),\left\| {{\mathit{\boldsymbol{d}}^k}} \right\| \sim \left\| {\mathit{\boldsymbol{d}}_0^k} \right\| $

    定理1  在假设A成立的前提下,算法1或有限次迭代终止于问题(1)的一个K-T点xk,或产生一个无穷迭代序列{xk},使得它的任意一个聚点x*是问题(1)的K-T点.

      若算法1终止于点xk,则d0k=0.故由引理2可知迭代点xk是问题(1)的K-T点.即证明了定理1的前半部分.下面假设x*为无穷迭代序列{xk}的任意一个聚点,不妨设当k→∞时有xkx*kK,其中K为无穷指标集.下面分情况讨论.

    1) 若存在一个无穷序列K1K,使得迭代点xk+1=xk+tkdk均由算法1步2和步3产生.结合引理5、引理6知$ \mathop {\lim }\limits_{k \to \infty } $h(xk)=0.若$ \mathop {\lim }\limits_{k \in {K_1}, k \to \infty } $dk‖=0,则易知x*是一个K-T点.由$ \mathop {\lim }\limits_{k \to \infty } h\left( {{\mathit{\boldsymbol{x}}^k}} \right) = 0 $和假设A中的(A5)可设存在一个k0,使得当kk0kK1,有$ h\left( {{\mathit{\boldsymbol{x}}^k}} \right) \le \frac{{a{\varepsilon ^2}}}{{2M}} \le \frac{{a{{\left\| {{\mathit{\boldsymbol{d}}^k}} \right\|}^2}}}{{2M}} \le \frac{{{{\left( {{\mathit{\boldsymbol{d}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^k}}}{{2M}} $,结合K-T条件,对所有的kk0kK1

    $ \begin{array}{l} \nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^k} = - {\left( {{\mathit{\boldsymbol{d}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^k} - {\left( {{\mathit{\boldsymbol{d}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{A}}_k}{\omega _k} = - {\left( {{\mathit{\boldsymbol{d}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^k} - \omega _k^{\rm{T}}\mathit{\boldsymbol{c}}\left( {{\mathit{\boldsymbol{x}}^k}} \right) = \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; - {\left( {{\mathit{\boldsymbol{d}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^k} + {\left\| {{\omega _k}} \right\|_\infty }h\left( {{\mathit{\boldsymbol{x}}^k}} \right) \le Mh\left( {{\mathit{\boldsymbol{x}}^k}} \right) - {\left( {{\mathit{\boldsymbol{d}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^k} \le \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; - 0.5{\left( {{\mathit{\boldsymbol{d}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^k} \le - 0.5a{\left\| {{\mathit{\boldsymbol{d}}^k}} \right\|^2} \end{array} $

    其中ωk为拉格朗日乘子.从而,

    $ \begin{array}{l} 0 = \mathop {\lim }\limits_{k \in {K_1},k \to \infty } \left( {f\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) - f\left( {{\mathit{\boldsymbol{x}}^k}} \right)} \right) = \\ \;\;\;\;\;\mathop {\lim }\limits_{k \in {K_1},k \to \infty } \left( {{t_k}\nabla f{{\left( {{\mathit{\boldsymbol{x}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^k} + O\left( {{{\left\| {{t_k}{\mathit{\boldsymbol{d}}^k}} \right\|}^2}} \right)} \right. \le \\ \;\;\;\;\;\mathop {\lim }\limits_{k \in {K_1},k \to \infty } {t_k}\nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^k} \le 0 \end{array} $

    因此

    $ \mathop {\lim }\limits_{k \in {K_1},k \to \infty } \left\| {{\mathit{\boldsymbol{d}}^k}} \right\| = 0 $

    再结合引理7可得

    $ \mathop {\lim }\limits_{k \in {K_1},k \to \infty } \left\| {\mathit{\boldsymbol{d}}_0^k} \right\| = 0 $

    $ \mathop {\lim }\limits_{k \in {K_1},k \to \infty } \mathit{\boldsymbol{d}}_0^k = \mathit{\boldsymbol{d}}_0^ * $

    因此,根据引理2的证明,可证x*是问题(1)的一个K-T点.

    2) 若存在一个无穷序列K1K,使得若迭代点xk+1=xk+tkdk均由算法1步4和步5产生,反设x*不是问题(1)的一个K-T点,则由引理3知

    $ \nabla f{\left( {{\mathit{\boldsymbol{x}}^ * }} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^ * } < 0,{\mathit{\boldsymbol{g}}_j}{\left( {{\mathit{\boldsymbol{x}}^ * }} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^ * } < 0,j \in I\left( {{\mathit{\boldsymbol{x}}^ * }} \right),{\lambda _k} \ge {\lambda _ * } = \inf \left\{ {{\lambda _k}} \right\} > 0,k \in K $

    再由式(14)知

    $ 0 = \mathop {\lim }\limits_{k \in {K_1},k \to \infty } \left( {f\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) - f\left( {{\mathit{\boldsymbol{x}}^k}} \right)} \right) \le \mathop {\lim }\limits_{k \in {K_1},k \to \infty } v{\lambda _k}\nabla f{\left( {{\mathit{\boldsymbol{x}}^k}} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^k} \le 0.5v{\lambda _ * }\nabla f{\left( {{\mathit{\boldsymbol{x}}^ * }} \right)^{\rm{T}}}{\mathit{\boldsymbol{q}}^ * } < 0 $

    这是一个矛盾.因此x*是问题(1)的一个K-T点.

    4 数值实验

    在这一部分从数值计算的角度进一步说明本文给出的算法的有效性,算法的编程使用MATLAB软件.算法中涉及的参数取值如下:

    $ {\sigma _0} = 0.01,\xi = 0.01,\varepsilon = 0.05,v = 0.1,\tau = 2.25,\delta = 2.5,\gamma = 0.05,\beta = 0.95,{\mathit{\boldsymbol{B}}_0} = I \in {\mathbb{R}^{n \times n}} $

    Bk的更新选自BFGS公式修正[10]

    $ {\mathit{\boldsymbol{B}}_{k + 1}} = BFGS\left( {{\mathit{\boldsymbol{B}}_k},{\mathit{\boldsymbol{s}}^k},{\mathit{\boldsymbol{y}}^k}} \right) $

    其中:

    $ {\mathit{\boldsymbol{s}}^k} = {\mathit{\boldsymbol{x}}^{k + 1}} - {\mathit{\boldsymbol{x}}^k},{{\mathit{\boldsymbol{\hat y}}}^k} = \nabla f\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) - \nabla f\left( {{\mathit{\boldsymbol{x}}^k}} \right) + \sum\limits_{j = 1}^m {\mathit{\boldsymbol{\mu }}_j^k\left( {{\mathit{\boldsymbol{g}}_j}\left( {{\mathit{\boldsymbol{x}}^{k + 1}}} \right) - {\mathit{\boldsymbol{g}}_j}\left( {{\mathit{\boldsymbol{x}}^k}} \right)} \right)} ,\\{\mathit{\boldsymbol{y}}^k} = \theta {{\mathit{\boldsymbol{\hat y}}}^k} + \left( {1 - \theta } \right){\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{s}}^k} $
    $ \theta = \left\{ {\begin{array}{*{20}{c}} 1&{如果{{\left( {{{\mathit{\boldsymbol{\hat y}}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{s}}^k} \ge 0.2{{\left( {{\mathit{\boldsymbol{s}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{s}}^k}}\\ {\frac{{0.8{{\left( {{\mathit{\boldsymbol{s}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{s}}^k}}}{{{{\left( {{\mathit{\boldsymbol{s}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{s}}^k} - {{\left( {{{\mathit{\boldsymbol{\hat y}}}^k}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{s}}^k}}}}&{否则} \end{array}} \right. $

    而乘子函数μ(x)=-(N(x)TN(x)+D(x))-1N(x)Tf(x),其中N(x)=(gj(x),jI),D(x)=diag(cj2(x),jI).终止准则为‖d0k‖<10-6.

    从文献[11]中选取了7个测试问题,对本文的算法进行测试,测试结果见表 1.

    表 1 本文算法与文献[11]的数值结果

    表 1问题编号即测试问题在文献[11]中的编号,n表示问题中变量的维数,m表示问题中约束的维数,X0表示问题的初始点,NF1NF2分别表示本文算法和文献[11]的目标函数和约束函数的计算次数,NG1NG2分别表示本文算法和文献[11]的梯度的计算次数,NIT1NIT2分别表示本文算法和文献[11]的算法的迭代次数.

    从上述分析可看出,本文给出的修正共轭投影梯度滤子算法是有效可行的.将共轭投影技术引入到滤子算法使得该算法不需要求解二次规划子问题,而且能有效避免常规滤子算法中的恢复算法,简化了计算.

    参考文献
    [1] FLETCHER R, LEYFFER S. Nonlinear Programming Without a Penalty Function[J]. Mathematica Programming, 2002, 91(2): 239-269. DOI:10.1007/s101070100244
    [2] NIE P Y. Composite-Step like Filter Methods for Equality Constriant Problems[J]. Journal of Computational Mathematics, 2003, 21(5): 613-624.
    [3] 苏珂, 刘英. 求解非线性规划的修正滤子信赖域方法[J]. 数学学报(中文版), 2009, 52(6): 1157-1164.
    [4] 王祥玲, 朱志斌, 杨萌. 一种基于步长的SQP滤子法[J]. 应用数学, 2010, 23(3): 670-674.
    [5] WANG X L, ZHU Z B, ZUO S Y, et al. An SQP-Filter Method for Inequality Constrained Optimization and Its Global Convergence[J]. Applied Mathematics and Computation, 2011, 217(24): 10224-10230. DOI:10.1016/j.amc.2011.05.019
    [6] HUANG Q Q, ZHU Z B, WANG X L. A Predictor-Corrector Algorithm Combined Conjugate Gradient with Homotopy Interior Point for General Nonlinear Programming[J]. Applied Mathematics and Computation, 2013, 219(9): 4379-4386. DOI:10.1016/j.amc.2012.10.036
    [7] 朱志斌, 张可村. 一个新的共轭投影梯度算法及其超线性收敛性[J]. 应用数学学报, 2004, 27(1): 149-161. DOI:10.3321/j.issn:0254-3079.2004.01.017
    [8] 王祥玲, 朱志斌, 周志轩. 共轭投影梯度滤子算法及其全局收敛性[J]. 桂林电子科技大学学报, 2012, 32(6): 496-498. DOI:10.3969/j.issn.1673-808X.2012.06.017
    [9] FLETCHER R, LEYFFER S, TOINT P L. On the Global Convergence of a Filter-SQP Algorithm[J]. SIAM Journal on Optimization, 2002, 13(1): 44-59.
    [10] 袁亚湘, 孙文瑜. 最优化理论与方法[M]. 北京: 科学出版社, 1997: 232-238.
    [11] HOCK W, SCHITTKOWSKI K. Test Examples for Nonlinear Programming Codes[J]. Journal of Optimization Theory and Applications, 1980, 30(1): 127-129. DOI:10.1007/BF00934594
    [12] 柳馨. 两个修正的DL共轭梯度法[J]. 重庆工商大学学报(自然科学版), 2017, 34(5): 13-18.
    A Modified Conjugate Projection Gradient Filter Method
    WANG Xiang-ling1, ZUO Shuang-yong1, ZHU Zhi-bin2     
    1. Primary Education College, Yichun Early Childhood Teachers College, Yichun Jiangxi 330814, China;
    2. Department of Mathematics and Computational Science, Guilin University of Electronic Technology, Guilin Guangxi 541004, China
    Abstract: A new conjugate projection gradient filter algorithm is established by modifying the search direction. In this algorithm, conjugate projection gradient technology and filter method are combined. By the introduction of the filter, this algorithm does not need to solve a QP sub-problem. With the idea of the conjugate projection gradient, this method is effective to avoid the restoration algorithm in general filter algorithms. Under some conditions, its global convergence is obtained.
    Key words: conjugate projection gradient    filter    nonlinear programming    global convergence    
    X