精细光流估计

这篇博客是论文:High Accuracy Optical Flow Estimation Based Theory for Warp 的笔记,论文为2004年,在2015年EpicFlow方法中引用。

在稠密光流估计中,最后一步往往是将光流图进行Refine和smooth,来确保光流在空间上连续,大部分区域符合刚体运动假设,同时也会处理遮挡,为遮挡区域提供连续的光流值。

在介绍的论文中,建立了一个能量函数,将光流图的优化和优化能量函数联系起来。论文偏向传统,但是也很经典,接下来一起看下做法。

一、灰度恒常假设

我们假设参与光流计算的两张图,保持了大致恒定的亮度(即亮度相同),如果记 I 为图像矩阵,t和t+1时刻的运动向量用w=(u,v,1)表示,假设平移不改变亮度:

I(x,y,t)=I(x+u,y+u,t+1)

进一步引出光流算法中经常会用到的约束:(下标表示偏导)

I_xu+I_yu+I_t=0

二、梯度恒常假设

灰度恒常性假设有一个很大的缺陷,当光照变化时假设就不再成立,而图像中的运动常常伴随着光照变化。于是我们对上式做出一些改变,不再假设灰度恒定,而假设位移前后的梯度恒定,那么可以给出:

\nabla I(x,y,t)=\nabla I(x+u,y+v,t+1) \\ where \nabla=(\partial_x,\partial_y)^T

该假设较灰度恒常假设更具有一般性。

三、平滑假设

在估计局部光流时,很少考虑周围或者更远的像素,只在局部进行光流估计通常会忽略领域或者全局像素之间的关联。对于刚性的运动,通常相临近的位置光流值的变化是连续的。如果局部光流估计不准确,可能会出现outlier点。

于是平滑假设这样描述,我们假定在物体内部,通常是练成一片的区域内,光流值在空间上连续变化。而在物体边界,光流值会产生突变。

四、多尺度策略

我们通常会使用coast-to-fine的策略,或者多尺度策略来处理光流问题。大运动或者噪声情况下,光流在downsample的小图上处理通常更加有效。

五、灰度恒常和梯度恒常的能量函数

我们定义 gamma 作为权重,灰度恒常性和梯度恒常性的代价函数可以写成如下形式:

E_{Data}(u,v)=\int_\Omega(|I(x+w)-I(x)|^2+\gamma|\nabla I(x+w)-\nabla I(x)|^2)dx

我们考虑到outliner点会带来巨大影响,于是我们增加一个带有正则项的函数psi(s^2),来增加代价函数的鲁棒性

E_{Data}(u,v)=\int_\Omega\Psi(|I(x+w)-I(x)|^2+\gamma|\nabla I(x+w)-\nabla I(x)|^2)dx \\ \Psi(s^2)=\sqrt{s^2+\epsilon^2}

六、平滑性约束的能量函数

平滑性约束,就是对物体内部光流(u,v)的一致性做约束,写成能量函数可以做如下表达

E_{Smooth}(u,v)=\int_\Omega\Psi(|\nabla_3u|^3+|\nabla_3v|^3)dx \\ \nabla_3=(\partial_x,\partial_y,\partial_t)^T

七、总的能量函数

我们的目标是找到u,v,使如下能量函数的值最小(其中alpha>0,调节两项的相对权重)

E(u,v)=E_{Data}+\alpha E_{Smooth}

八、欧拉-拉格朗日方程

根据费马引理,一个连续函数取极值的必要条件是它的导函数等于0。我们将函数这一定义推广到泛函,可以得到泛函的极值条件,也就是欧拉-拉格朗日方程。

前面能量函数 E(u,v) 是高度非线形的,所以方程没有平凡解。

我们用z替换t,考虑时域上是不可导的,所有没有z的偏导数,我们规定如下记号

I_x=\partial_xI(x+w) \\
I_y=\partial_yI(x+w) \\
I_z=I(x+w)-I(x) \\
I_{xx}=\partial_{xx}I(x+w) \\
I_{xy}=\partial_{xy}I(x+w) \\
I_{yy}=\partial_{yy}I(x+w) \\
I_{xz}=\partial_xI(x+w)-\partial_xI(x) \\
I_{yz}=\partial_yI(x+w)-\partial_yI(x) 

能量方程的优化必须完全满足欧拉-拉格朗日方程,于是它x和y的导函数满足:

x:\\ \\
\Psi'(I_z^2+\gamma(I_{xz}^2+I_{yz}^2)) \circ(I_xI_z+\gamma(I_{xx}I_{xz}+I_{xy}I_{yz}))\\-\alpha \ {div} (\Psi'(|\nabla_3u|^2+|\nabla_3v|^2)\nabla_3u)=0
\\ \\
y:
\\ \\
\Psi'(I_z^2+\gamma(I_{xz}^2+I_{yz}^2)) \circ(I_yI_z+\gamma(I_{yy}I_{yz}+I_{xy}I_{xz}))\\-\alpha \ {div} (\Psi'(|\nabla_3u|^2+|\nabla_3v|^2)\nabla_3u)=0

九、解的数值近似

实际求解过程可以采用金字塔coast-to-fine的方式求解,先求解整数部分,再求解小数部分。我们用k表示帧序号,那么上式z用k表达可写作下式。

因为z的偏导数是非线形的,为了移除非线形,我们对与z相关的偏导数进行泰勒一阶展开,我们只取一阶项,解uv的问题就被转换成了线形方程组的求解。

代入上述公式,合并smooth项和data项

我们用 l 表示我们内层迭代中每一步的初值(步数标号),那么式子可以写成。

上面的方程因为已经丢掉z偏导的高阶项,所以是一个线形方程组,可以用 Gauss-Seidel 法或者 SOR (successive over relaxation)迭代求解。

Gauss-Seidel迭代可以参考:线性方程组-迭代法 2:Jacobi迭代和Gauss-Seidel迭代 – 知乎 (zhihu.com)

SOR可以参考:逐次超松弛SOR迭代概述 – 知乎 (zhihu.com)

十、代码

代码可以从epic.cpp的论文连接中获取,这里以epic 代码中的variationc.pp进行解释

首先是variation.cpp的入口,首先是设置参数,然后对参与计算的两张图进行平滑,然后送入compute_one_level进行处理计算。

/* Compute a refinement of the optical flow (wx and wy are modified) between im1 and im2 */
void variational(image_t *wx, image_t *wy, const color_image_t *im1, const color_image_t *im2, variational_params_t *params){
  
    // Check parameters
    if(!params){
        params = (variational_params_t*) malloc(sizeof(variational_params_t));
        if(!params){
          fprintf(stderr,"error: not enough memory\n");
          exit(1);
        }
        variational_params_default(params);
    }

    // initialize global variables
    half_alpha = 0.5f*params->alpha;
    half_gamma_over3 = params->gamma*0.5f/3.0f;
    half_delta_over3 = params->delta*0.5f/3.0f;
    
    float deriv_filter[3] = {0.0f, -8.0f/12.0f, 1.0f/12.0f};
    deriv = convolution_new(2, deriv_filter, 0);
    float deriv_filter_flow[2] = {0.0f, -0.5f};
    deriv_flow = convolution_new(1, deriv_filter_flow, 0);


    // presmooth images
    int width = im1->width, height = im1->height, filter_size;
    color_image_t *smooth_im1 = color_image_new(width, height), *smooth_im2 = color_image_new(width, height);
    float *presmooth_filter = gaussian_filter(params->sigma, &filter_size);
    convolution_t *presmoothing = convolution_new(filter_size, presmooth_filter, 1);
    color_image_convolve_hv(smooth_im1, im1, presmoothing, presmoothing);
    color_image_convolve_hv(smooth_im2, im2, presmoothing, presmoothing); 
    convolution_delete(presmoothing);
    free(presmooth_filter);
    
    compute_one_level(wx, wy, smooth_im1, smooth_im2, params);
  
    // free memory
    color_image_delete(smooth_im1);
    color_image_delete(smooth_im2);
    convolution_delete(deriv);
    convolution_delete(deriv_flow);
}

接下来在 compute one level 函数中,首先compute_dpsis_weight 在梯度图上计算局部smooth权重,然后开始迭代求解光流。

迭代分为内层迭代和外层迭代,内层迭代中,首先按照本次迭代的初始光流值对图像进行warp,本次迭代在此基础上求解残差。然后求解计算x,y,z(t)方向的偏导数。

内层迭代中,就开始计算能量函数的 data item 和 smooth item,然后sor_couple函数求解得到光流值。

/* perform flow computation at one level of the pyramid */
void compute_one_level(image_t *wx, image_t *wy, color_image_t *im1, color_image_t *im2, const variational_params_t *params){ 
    const int width = wx->width, height = wx->height, stride=wx->stride;

    image_t *du = image_new(width,height), *dv = image_new(width,height), // the flow increment
        *mask = image_new(width,height), // mask containing 0 if a point goes outside image boundary, 1 otherwise
        *smooth_horiz = image_new(width,height), *smooth_vert = image_new(width,height), // horiz: (i,j) contains the diffusivity coeff from (i,j) to (i+1,j) 
        *uu = image_new(width,height), *vv = image_new(width,height), // flow plus flow increment
        *a11 = image_new(width,height), *a12 = image_new(width,height), *a22 = image_new(width,height), // system matrix A of Ax=b for each pixel
        *b1 = image_new(width,height), *b2 = image_new(width,height); // system matrix b of Ax=b for each pixel

    color_image_t *w_im2 = color_image_new(width,height), // warped second image
        *Ix = color_image_new(width,height), *Iy = color_image_new(width,height), *Iz = color_image_new(width,height), // first order derivatives
        *Ixx = color_image_new(width,height), *Ixy = color_image_new(width,height), *Iyy = color_image_new(width,height), *Ixz = color_image_new(width,height), *Iyz = color_image_new(width,height); // second order derivatives
  
  
    image_t *dpsis_weight = compute_dpsis_weight(im1, 5.0f, deriv);  
  
    int i_outer_iteration;
    for(i_outer_iteration = 0 ; i_outer_iteration < params->niter_outer ; i_outer_iteration++){
        int i_inner_iteration;
        // warp second image
        image_warp(w_im2, mask, im2, wx, wy);
        // compute derivatives
        get_derivatives(im1, w_im2, deriv, Ix, Iy, Iz, Ixx, Ixy, Iyy, Ixz, Iyz);
        // erase du and dv
        image_erase(du);
        image_erase(dv);
        // initialize uu and vv
        memcpy(uu->data,wx->data,wx->stride*wx->height*sizeof(float));
        memcpy(vv->data,wy->data,wy->stride*wy->height*sizeof(float));
        // inner fixed point iterations
        for(i_inner_iteration = 0 ; i_inner_iteration < params->niter_inner ; i_inner_iteration++){
            //  compute robust function and system
            compute_smoothness(smooth_horiz, smooth_vert, uu, vv, dpsis_weight, deriv_flow, half_alpha );
            compute_data_and_match(a11, a12, a22, b1, b2, mask, du, dv, Ix, Iy, Iz, Ixx, Ixy, Iyy, Ixz, Iyz, half_delta_over3, half_gamma_over3);
            sub_laplacian(b1, wx, smooth_horiz, smooth_vert);
            sub_laplacian(b2, wy, smooth_horiz, smooth_vert);
            // solve system
            sor_coupled(du, dv, a11, a12, a22, b1, b2, smooth_horiz, smooth_vert, params->niter_solver, params->sor_omega);          
            // update flow plus flow increment
            int i;
            v4sf *uup = (v4sf*) uu->data, *vvp = (v4sf*) vv->data, *wxp = (v4sf*) wx->data, *wyp = (v4sf*) wy->data, *dup = (v4sf*) du->data, *dvp = (v4sf*) dv->data;
            for( i=0 ; i<height*stride/4 ; i++){
                (*uup) = (*wxp) + (*dup);
                (*vvp) = (*wyp) + (*dvp);
                uup+=1; vvp+=1; wxp+=1; wyp+=1;dup+=1;dvp+=1;
	        }
        }
        // add flow increment to current flow
        memcpy(wx->data,uu->data,uu->stride*uu->height*sizeof(float));
        memcpy(wy->data,vv->data,vv->stride*vv->height*sizeof(float));
    }   
    // free memory
    image_delete(du); image_delete(dv);
    image_delete(mask);
    image_delete(smooth_horiz); image_delete(smooth_vert);
    image_delete(uu); image_delete(vv);
    image_delete(a11); image_delete(a12); image_delete(a22);
    image_delete(b1); image_delete(b2);
    image_delete(dpsis_weight);
    color_image_delete(w_im2); 
    color_image_delete(Ix); color_image_delete(Iy); color_image_delete(Iz);
    color_image_delete(Ixx); color_image_delete(Ixy); color_image_delete(Iyy); color_image_delete(Ixz); color_image_delete(Iyz);
}

sor求解,如果是单次迭代,会调用到下面的代码:

void sor_coupled_slow_but_readable(image_t *du, image_t *dv, const image_t *a11, const image_t *a12, const image_t *a22, const image_t *b1, const image_t *b2, const image_t *dpsis_horiz, const image_t *dpsis_vert, const int iterations, const float omega){
    int i,j,iter;
    float sigma_u,sigma_v,sum_dpsis,A11,A22,A12,B1,B2,det;
    for(iter = 0 ; iter<iterations ; iter++){
        for(j=0 ; j<du->height ; j++){
	        for(i=0 ; i<du->width ; i++){
	            sigma_u = 0.0f;
	            sigma_v = 0.0f;
	            sum_dpsis = 0.0f;
	            if(j>0){
		            sigma_u -= dpsis_vert->data[(j-1)*du->stride+i]*du->data[(j-1)*du->stride+i];
		            sigma_v -= dpsis_vert->data[(j-1)*du->stride+i]*dv->data[(j-1)*du->stride+i];
		            sum_dpsis += dpsis_vert->data[(j-1)*du->stride+i];
		        }
	            if(i>0){
                    sigma_u -= dpsis_horiz->data[j*du->stride+i-1]*du->data[j*du->stride+i-1];
                    sigma_v -= dpsis_horiz->data[j*du->stride+i-1]*dv->data[j*du->stride+i-1];
                    sum_dpsis += dpsis_horiz->data[j*du->stride+i-1];
		        }
	            if(j<du->height-1){
		            sigma_u -= dpsis_vert->data[j*du->stride+i]*du->data[(j+1)*du->stride+i];
		            sigma_v -= dpsis_vert->data[j*du->stride+i]*dv->data[(j+1)*du->stride+i];
		            sum_dpsis += dpsis_vert->data[j*du->stride+i];
		        }
	            if(i<du->width-1){
		            sigma_u -= dpsis_horiz->data[j*du->stride+i]*du->data[j*du->stride+i+1];
		            sigma_v -= dpsis_horiz->data[j*du->stride+i]*dv->data[j*du->stride+i+1];
		            sum_dpsis += dpsis_horiz->data[j*du->stride+i];
		        }
                A11 = a11->data[j*du->stride+i]+sum_dpsis;
                A12 = a12->data[j*du->stride+i];
                A22 = a22->data[j*du->stride+i]+sum_dpsis;
                det = A11*A22-A12*A12;
                B1 = b1->data[j*du->stride+i]-sigma_u;
                B2 = b2->data[j*du->stride+i]-sigma_v;
                du->data[j*du->stride+i] = (1.0f-omega)*du->data[j*du->stride+i] +omega*( A22*B1-A12*B2)/det;
                dv->data[j*du->stride+i] = (1.0f-omega)*dv->data[j*du->stride+i] +omega*(-A12*B1+A11*B2)/det;
	        }
	    }
    }
}

更细致的过程可以参考原始论文或者epicflow代码。

发表评论