MATLAB演示梯度上升寻找极值

梯度
梯度的本意是一个向量(矢量),表示某一函数在该点处的方向导数沿着该方向取得最大值,即函数在该点处沿着该方向(此梯度的方向)变化最快,变化率最大(为该梯度的模)。对于二元函数
    
     
      
       
        f
       
       
        (
       
       
        x
       
       
        ,
       
       
        y
       
       
        )
       
      
      
       f(x,y)
      
     
    f(x,y)的梯度计算公式为:
 
     
      
       
        
         gradf
        
        
         
        
        
         (
        
        
         x
        
        
         ,
        
        
         y
        
        
         )
        
        
         =
        
        
         ∇
        
        
         f
        
        
         (
        
        
         x
        
        
         ,
        
        
         y
        
        
         )
        
        
         =
        
        
         
          {
         
         
          
           
            ∂
           
           
            f
           
          
          
           
            ∂
           
           
            x
           
          
         
         
          ,
         
         
          
           
            ∂
           
           
            f
           
          
          
           
            ∂
           
           
            y
           
          
         
         
          }
         
        
        
         =
        
        
         
          f
         
         
          x
         
        
        
         (
        
        
         x
        
        
         ,
        
        
         y
        
        
         )
        
        
         
          i
         
         
          ˉ
         
        
        
         +
        
        
         
          f
         
         
          y
         
        
        
         (
        
        
         x
        
        
         ,
        
        
         y
        
        
         )
        
        
         
          j
         
         
          ˉ
         
        
       
       
         \operatorname{gradf}(\mathrm{x}, \mathrm{y})=\nabla f(x, y)=\left\{\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right\}=f_{x}(x, y) \bar{i}+f_{y}(x, y) \bar{j} 
       
      
     gradf(x,y)=∇f(x,y)={∂x∂f,∂y∂f}=fx(x,y)iˉ+fy(x,y)jˉ
 利用梯度求解以下函数的最大值,同时用MATLAB画出迭代过程。
 
     
      
       
        
         z
        
        
         =
        
        
         f
        
        
         (
        
        
         x
        
        
         ,
        
        
         y
        
        
         )
        
        
         =
        
        
         
          (
         
         
          
           x
          
          
           2
          
         
         
          −
         
         
          2
         
         
          x
         
         
          )
         
        
        
         
          e
         
         
          
           −
          
          
           
            x
           
           
            2
           
          
          
           −
          
          
           
            y
           
           
            2
           
          
          
           −
          
          
           x
          
          
           y
          
         
        
       
       
         z=f(x, y)=\left(x^{2}-2 x\right) \mathrm{e}^{-x^{2}-y^{2}-x y} 
       
      
     z=f(x,y)=(x2−2x)e−x2−y2−xy
 利用梯度的话,我们需要先求偏导。可以手算,也可以用MATLAB的符号推导功能。
 
     
      
       
        
         
         
          
           
            z
           
          
         
         
          
           
            
            
             =
            
            
             (
            
            
             
              x
             
             
              2
             
            
            
             −
            
            
             2
            
            
             x
            
            
             )
            
            
             ⋅
            
            
             
              e
             
             
              
               −
              
              
               
                x
               
               
                2
               
              
              
               −
              
              
               
                y
               
               
                2
               
              
              
               −
              
              
               x
              
              
               y
              
             
            
           
          
         
         
         
        
        
         
         
          
           
            
             
              ∂
             
             
              z
             
            
            
             
              ∂
             
             
              x
             
            
           
          
         
         
          
           
            
            
             =
            
            
             −
            
            
             2
            
            
             (
            
            
             1
            
            
             +
            
            
             
              x
             
             
              3
             
            
            
             +
            
            
             (
            
            
             
              y
             
             
              2
             
            
            
             −
            
            
             2
            
            
             )
            
            
             
              x
             
             
              2
             
            
            
             +
            
            
             (
            
            
             −
            
            
             y
            
            
             −
            
            
             1
            
            
             )
            
            
             x
            
            
             )
            
            
             
              e
             
             
              
               −
              
              
               
                x
               
               
                2
               
              
              
               −
              
              
               x
              
              
               y
              
              
               −
              
              
               
                y
               
               
                2
               
              
             
            
           
          
         
         
         
        
        
         
         
          
           
            
             
              ∂
             
             
              z
             
            
            
             
              ∂
             
             
              y
             
            
           
          
         
         
          
           
            
            
             =
            
            
             −
            
            
             x
            
            
             (
            
            
             x
            
            
             +
            
            
             2
            
            
             y
            
            
             )
            
            
             (
            
            
             x
            
            
             −
            
            
             2
            
            
             )
            
            
             
              e
             
             
              
               −
              
              
               
                x
               
               
                2
               
              
              
               −
              
              
               x
              
              
               y
              
              
               −
              
              
               
                y
               
               
                2
               
              
             
            
           
          
         
         
         
        
       
       
         \begin{align} z &= (x^{2}-2x)\cdot{\mathrm e}^{-x^{2}-y^{2}-xy} \\ \frac{\partial z}{\partial x} &= -2 (1+x^{3}+(\frac{y}{2}-2) x^{2}+(-y-1) x) {\mathrm e}^{-x^{2}-x y-y^{2}} \\ \frac{\partial z}{\partial y} &= -x(x+2 y)(x-2) {\mathrm e}^{-x^{2}-x y-y^{2}} \end{align} 
       
      
     z∂x∂z∂y∂z=(x2−2x)⋅e−x2−y2−xy=−2(1+x3+(2y−2)x2+(−y−1)x)e−x2−xy−y2=−x(x+2y)(x−2)e−x2−xy−y2
syms x y
z = (x^2-2*x)*exp(-x^2-y^2-x*y);
zx = simplify(diff(z,x));
zy = diff(z,y);
 
朝着梯度的方向函数值的上升最快。先设定一个初始值
    
     
      
       
        
         x
        
        
         0
        
       
       
        =
       
       
        0
       
       
        ,
       
       
        
         y
        
        
         0
        
       
       
        =
       
       
        0
       
      
      
       x_0 = 0,y_0 = 0
      
     
    x0=0,y0=0和步长
    
     
      
       
        a
       
       
        =
       
       
        0.01
       
      
      
       a=0.01
      
     
    a=0.01然后按照梯度的方向开始迭代上升。
 
     
      
       
        
         
         
          
           
            
             
              x
             
             
              n
             
            
            
             =
            
            
             
              x
             
             
              
               n
              
              
               −
              
              
               1
              
             
            
            
             +
            
            
             a
            
            
             
              
               ∂
              
              
               f
              
             
             
              
               ∂
              
              
               
                x
               
               
                
                 n
                
                
                 −
                
                
                 1
                
               
              
             
            
           
          
         
         
         
        
        
         
         
          
           
            
             
              y
             
             
              n
             
            
            
             =
            
            
             
              y
             
             
              
               n
              
              
               −
              
              
               1
              
             
            
            
             +
            
            
             a
            
            
             
              
               ∂
              
              
               f
              
             
             
              
               ∂
              
              
               
                y
               
               
                
                 n
                
                
                 −
                
                
                 1
                
               
              
             
            
           
          
         
         
         
        
       
       
         \begin{align} x_n = x_{n-1} + a\frac{\partial f}{\partial x_{n-1}} \\ y_n = y_{n-1} + a\frac{\partial f}{\partial y_{n-1}} \end{align} 
       
      
     xn=xn−1+a∂xn−1∂fyn=yn−1+a∂yn−1∂f
 MATLAB代码
f = @(x,y) (x.^2 - 2.*x).*exp(-x.^2-y.^2-x.*y);							% 目标函数
fx = @(x,y) -2.*(1+x.^3+(0.5.*y-2).*x.^2-(y+1).*x).*exp(-x.^2-x.*y-y.^2);	% x偏导
fy = @(x,y) -x.*(x+2.*y).*(x-2).*exp(-x.^2-x.*y-y.^2);					  % y偏导
[x0,y0] = meshgrid(-3:.2:3,-3:.2:3);
z0 = f(x0,y0);
figure
surf(x0,y0,z0)
view(30,30)		% 调整视角
hold on
[x,y,a] = deal(0,0,0.01);
z = f(x,y);
[tx,ty,tz] = deal(x,y,z+0.03);	% 纪录过程
h = plot3(tx,ty,tz,'r*-');
set(gcf,'Position',[200 200 1000 800])
for i = 1:100
    x = x + a*fx(x,y);	% x变量梯度上升
    y = y + a*fy(x,y);	% y变量梯度上升
    z = f(x,y);
    if mod(i,5) == 0
        pause(1);
        [tx,ty,tz] = deal([tx x],[ty y],[tz z+0.03]);
        h.XData = tx;
        h.YData = ty;
        h.ZData = tz;
    end
end
                










![[阿里云] 10分钟带你玩转阿里云ECS和云盘 (大数据上云必备)](https://img-blog.csdnimg.cn/fcb82890e5204b3aaf42942643d1ea2d.png)







